DOI: 10.1145/3474085.3475353

Hybrid Network Compression via Meta-Learning

Published: 17 October 2021

Abstract

Neural network pruning and quantization are two major lines of network compression. This raises a natural question: can we find the optimal compression by considering multiple compression criteria in a unified framework? This paper incorporates both criteria and seeks layer-wise compression by leveraging a meta-learning framework. A regularization loss unifies the constraints on input and output channel numbers and on the bit-widths of network activations and weights, so that the compressed network satisfies a given Bit-OPerations (BOPs) budget. We further propose an iterative compression constraint for optimizing the compression procedure, which achieves a high compression rate while maintaining the original network performance. Extensive experiments on various networks and vision tasks show that the proposed method yields better performance and higher compression rates than recent methods. For instance, it achieves better image classification accuracy and compactness than the recent DJPQ, and matches the performance of the recent DHP on image super-resolution while saving about 50% of the computation.
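To make the BOPs budget concrete, the sketch below illustrates how per-layer channel counts and bit-widths determine a network's BOPs total, and how an iteratively tightened budget could drive a regularization penalty. It is a minimal sketch, not the authors' implementation; the names LayerSpec, bops_of_layer, bops_penalty, and iterative_target are hypothetical and introduced only for illustration.

# Minimal sketch, assuming a conv net described by per-layer channel counts and
# bit-widths. NOT the paper's code; all names here are hypothetical.
from dataclasses import dataclass

@dataclass
class LayerSpec:
    c_in: int    # input channels kept after pruning
    c_out: int   # output channels kept after pruning
    k: int       # square kernel size
    h_out: int   # output feature-map height
    w_out: int   # output feature-map width
    b_w: int     # weight bit-width
    b_a: int     # activation bit-width

def bops_of_layer(layer: LayerSpec) -> float:
    """BOPs of one conv layer: multiply-accumulates scaled by both bit-widths."""
    macs = layer.c_in * layer.c_out * layer.k * layer.k * layer.h_out * layer.w_out
    return macs * layer.b_w * layer.b_a

def bops_penalty(layers: list, budget: float) -> float:
    """Regularization term that grows when the network exceeds the BOPs budget."""
    total = sum(bops_of_layer(l) for l in layers)
    return max(0.0, total / budget - 1.0)

def iterative_target(start: float, final: float, step: int, total_steps: int) -> float:
    """Linearly tighten the BOPs budget over the compression procedure."""
    frac = min(step / max(total_steps, 1), 1.0)
    return start + frac * (final - start)

# Toy usage: two layers with mixed precision, budget tightened from 6 to 2 GBOPs.
layers = [
    LayerSpec(c_in=32, c_out=64, k=3, h_out=56, w_out=56, b_w=8, b_a=8),
    LayerSpec(c_in=64, c_out=128, k=3, h_out=28, w_out=28, b_w=4, b_a=8),
]
for step in (0, 5, 10):
    budget = iterative_target(6e9, 2e9, step, total_steps=10)
    print(step, budget, round(bops_penalty(layers, budget), 3))

In this toy setup the penalty is zero while the network fits the early, loose budget and grows as the budget tightens, which is one plausible way an iterative constraint can gradually push the optimization toward the final compression target.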

References

[1] Eirikur Agustsson and Radu Timofte. 2017. NTIRE 2017 challenge on single image super-resolution: Dataset and study. In CVPRW.
[2] Chaim Baskin, Eli Schwartz, Evgenii Zheltonozhskii, Natan Liss, Raja Giryes, Alexander M. Bronstein, and Avi Mendelson. 2018. UNIQ: Uniform Noise Injection for the Quantization of Neural Networks. CoRR (2018).
[3] Marco Bevilacqua, Aline Roumy, Christine Guillemot, and Marie Line Alberi-Morel. 2012. Low-complexity single-image super-resolution based on nonnegative neighbor embedding. (2012).
[4] Bin Dai, Chen Zhu, Baining Guo, and David Wipf. 2018. Compressing neural networks using the variational information bottleneck. In ICML.
[5] Jia Deng, Wei Dong, Richard Socher, Li-Jia Li, Kai Li, and Li Fei-Fei. 2009. ImageNet: A large-scale hierarchical image database. In CVPR.
[6] Ahmed T Elthakeb, Prannoy Pilligundla, FatemehSadat Mireshghallah, Amir Yazdanbakhsh, Sicun Gao, and Hadi Esmaeilzadeh. 2018. ReLeQ: An automatic reinforcement learning approach for deep quantization of neural networks. arXiv (2018).
[7] Shaopeng Guo, Yujie Wang, Quanquan Li, and Junjie Yan. 2020. DMCP: Differentiable Markov Channel Pruning for Neural Networks. In CVPR.
[8] Philipp Gysel, Jon Pimentel, Mohammad Motamedi, and Soheil Ghiasi. 2018. Ristretto: A framework for empirical study of resource-efficient inference in convolutional neural networks. TNNLS (2018).
[9] Song Han, Huizi Mao, and William J Dally. 2016a. Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding. ICLR (2016).
[10] Song Han, Huizi Mao, and William J Dally. 2016b. Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding. ICLR (2016).
[11] Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. 2016. Deep residual learning for image recognition. In CVPR.
[12] Yang He, Guoliang Kang, Xuanyi Dong, Yanwei Fu, and Yi Yang. 2018a. Soft filter pruning for accelerating deep convolutional neural networks. In IJCAI.
[13] Yihui He, Ji Lin, Zhijian Liu, Hanrui Wang, Li-Jia Li, and Song Han. 2018b. AMC: AutoML for model compression and acceleration on mobile devices. In ECCV.
[14] Yang He, Ping Liu, Ziwei Wang, Zhilan Hu, and Yi Yang. 2019. Filter pruning via geometric median for deep convolutional neural networks acceleration. In CVPR.
[15] Yihui He, Xiangyu Zhang, and Jian Sun. 2017. Channel pruning for accelerating very deep neural networks. In ICCV.
[16] Jia-Bin Huang, Abhishek Singh, and Narendra Ahuja. 2015. Single image super-resolution from transformed self-exemplars. In CVPR.
[17] Itay Hubara, Matthieu Courbariaux, Daniel Soudry, Ran El-Yaniv, and Yoshua Bengio. 2016. Binarized Neural Networks. In NeurIPS.
[18] Alex Krizhevsky. 2009. Learning Multiple Layers of Features from Tiny Images. (2009).
[19] Se Jung Kwon, Dongsoo Lee, Byeongwook Kim, Parichay Kapoor, Baeseong Park, and Gu-Yeon Wei. 2020. Structured Compression by Weight Encryption for Unstructured Pruning and Quantization. In CVPR.
[20] Fengfu Li, Bo Zhang, and Bin Liu. 2016. Ternary weight networks. arXiv (2016).
[21] Yawei Li, Shuhang Gu, Luc Van Gool, and Radu Timofte. 2019. Learning filter basis for convolutional neural network compression. In ICCV.
[22] Yawei Li, Shuhang Gu, Kai Zhang, Luc Van Gool, and Radu Timofte. 2020. DHP: Differentiable Meta Pruning via HyperNetworks. In ECCV.
[23] Yawei Li, Wen Li, Martin Danelljan, Kai Zhang, Shuhang Gu, Luc Van Gool, and Radu Timofte. 2021. The Heterogeneity Hypothesis: Finding Layer-Wise Differentiated Network Architectures. In CVPR.
[24] Bee Lim, Sanghyun Son, Heewon Kim, Seungjun Nah, and Kyoung Mu Lee. 2017. Enhanced deep residual networks for single image super-resolution. In CVPRW.
[25] Zechun Liu, Haoyuan Mu, Xiangyu Zhang, Zichao Guo, Xin Yang, Kwang-Ting Cheng, and Jian Sun. 2019. MetaPruning: Meta learning for automatic neural network channel pruning. In ICCV.
[26] Christos Louizos, Matthias Reisser, Tijmen Blankevoort, Efstratios Gavves, and Max Welling. 2018. Relaxed Quantization for Discretized Neural Networks. In ICLR.
[27] David Martin, Charless Fowlkes, Doron Tal, and Jitendra Malik. 2001. A database of human segmented natural images and its application to evaluating segmentation algorithms and measuring ecological statistics. In ICCV. IEEE.
[28] Pavlo Molchanov, Stephen Tyree, Tero Karras, Timo Aila, and Jan Kautz. 2016. Pruning convolutional neural networks for resource efficient inference. arXiv preprint arXiv:1611.06440 (2016).
[29] Hanyu Peng, Jiaxiang Wu, Shifeng Chen, and Junzhou Huang. 2019. Collaborative channel pruning for deep networks. In ICML.
[30] Mark Sandler, Andrew Howard, Menglong Zhu, Andrey Zhmoginov, and Liang-Chieh Chen. 2018. MobileNetV2: Inverted residuals and linear bottlenecks. In CVPR.
[31] Karen Simonyan and Andrew Zisserman. 2014. Very deep convolutional networks for large-scale image recognition. In ICLR.
[32] Sanghyun Son, Seungjun Nah, and Kyoung Mu Lee. 2018. Clustering convolutional kernels to compress deep neural networks. In ECCV.
[33] Lucas Theis, Iryna Korshunova, Alykhan Tejani, and Ferenc Huszár. 2018. Faster gaze prediction with dense networks and fisher pruning. arXiv preprint arXiv:1801.05787 (2018).
[34] Naftali Tishby, Fernando C Pereira, and William Bialek. 1999. The information bottleneck method. In Annual Allerton Conference on Communications, Control and Computing.
[35] Stefan Uhlich, Lukas Mauch, Fabien Cardinaux, Kazuki Yoshiyama, Javier Alonso Garcia, Stephen Tiedemann, Thomas Kemp, and Akira Nakamura. 2019. Mixed Precision DNNs: All you need is a good parametrization. In ICLR.
[36] Kuan Wang, Zhijian Liu, Yujun Lin, Ji Lin, and Song Han. 2019. HAQ: Hardware-aware automated quantization with mixed precision. In CVPR.
[37] Min Wang, Baoyuan Liu, and Hassan Foroosh. 2017. Factorized convolutional neural networks. In ICCVW.
[38] Ying Wang, Yadong Lu, and Tijmen Blankevoort. 2020. Differentiable joint pruning and quantization for hardware efficiency. In ECCV.
[39] Shuang Wu, Guoqi Li, Feng Chen, and Luping Shi. 2018. Training and inference with integers in deep neural networks. arXiv (2018).
[40] Haichuan Yang, Shupeng Gui, Yuhao Zhu, and Ji Liu. 2020. Automatic Neural Network Compression by Sparsity-Quantization Joint Learning: A Constrained Optimization-Based Approach. In CVPR.
[41] Jianbo Ye, Xin Lu, Zhe Lin, and James Z Wang. 2018. Rethinking the Smaller-Norm-Less-Informative Assumption in Channel Pruning of Convolution Layers. In ICLR.
[42] Roman Zeyde, Michael Elad, and Matan Protter. 2010. On single image scale-up using sparse-representations. In ICCS. Springer.
[43] Shuchang Zhou, Yuxin Wu, Zekun Ni, Xinyu Zhou, He Wen, and Yuheng Zou. 2016. DoReFa-Net: Training low bitwidth convolutional neural networks with low bitwidth gradients. arXiv (2016).


    Published In

    MM '21: Proceedings of the 29th ACM International Conference on Multimedia
    October 2021
    5796 pages
    ISBN:9781450386517
    DOI:10.1145/3474085


    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 17 October 2021


    Author Tags

    1. neural network compression
    2. neural networks
    3. pruning
    4. quantization

    Qualifiers

    • Research-article

    Funding Sources

    • Natural Science Foundation of China
    • The National Key Research and Development Program of China
    • Beijing Natural Science Foundation

    Conference

MM '21: ACM Multimedia Conference
    October 20 - 24, 2021
    Virtual Event, China

    Acceptance Rates

    Overall Acceptance Rate 2,145 of 8,556 submissions, 25%
