DOI: 10.1145/3474085.3475353

Hybrid Network Compression via Meta-Learning

Published: 17 October 2021

Abstract

Neural network pruning and quantization are two major lines of network compression. This raises a natural question: can we find the optimal compression by considering multiple compression criteria in a unified framework? This paper incorporates both criteria and seeks layer-wise compression by leveraging a meta-learning framework. A regularization loss unifies the constraints on input and output channel numbers and on the bit-widths of network activations and weights, so that the compressed network satisfies a given Bit-OPerations (BOPs) budget. We further propose an iterative compression constraint for optimizing the compression procedure, which achieves a high compression rate while maintaining the original network performance. Extensive experiments on various networks and vision tasks show that the proposed method yields better performance and higher compression rates than recent methods. For instance, it achieves better image classification accuracy and compactness than the recent DJPQ, and matches the performance of the recent DHP on image super-resolution while saving about 50% of the computation.
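To make the BOPs budget concrete, the sketch below illustrates how per-layer channel counts and bit-widths determine a network's BOPs total, and how an iteratively tightened budget could drive a regularization penalty. It is a minimal sketch, not the authors' implementation; the names LayerSpec, bops_of_layer, bops_penalty, and iterative_target are hypothetical and introduced only for illustration.

# Minimal sketch, assuming a conv net described by per-layer channel counts and
# bit-widths. NOT the paper's code; all names here are hypothetical.
from dataclasses import dataclass

@dataclass
class LayerSpec:
    c_in: int    # input channels kept after pruning
    c_out: int   # output channels kept after pruning
    k: int       # square kernel size
    h_out: int   # output feature-map height
    w_out: int   # output feature-map width
    b_w: int     # weight bit-width
    b_a: int     # activation bit-width

def bops_of_layer(layer: LayerSpec) -> float:
    """BOPs of one conv layer: multiply-accumulates scaled by both bit-widths."""
    macs = layer.c_in * layer.c_out * layer.k * layer.k * layer.h_out * layer.w_out
    return macs * layer.b_w * layer.b_a

def bops_penalty(layers: list, budget: float) -> float:
    """Regularization term that grows when the network exceeds the BOPs budget."""
    total = sum(bops_of_layer(l) for l in layers)
    return max(0.0, total / budget - 1.0)

def iterative_target(start: float, final: float, step: int, total_steps: int) -> float:
    """Linearly tighten the BOPs budget over the compression procedure."""
    frac = min(step / max(total_steps, 1), 1.0)
    return start + frac * (final - start)

# Toy usage: two layers with mixed precision, budget tightened from 6 to 2 GBOPs.
layers = [
    LayerSpec(c_in=32, c_out=64, k=3, h_out=56, w_out=56, b_w=8, b_a=8),
    LayerSpec(c_in=64, c_out=128, k=3, h_out=28, w_out=28, b_w=4, b_a=8),
]
for step in (0, 5, 10):
    budget = iterative_target(6e9, 2e9, step, total_steps=10)
    print(step, budget, round(bops_penalty(layers, budget), 3))

In this toy setup the penalty is zero while the network fits the early, loose budget and grows as the budget tightens, which is one plausible way an iterative constraint can gradually push the optimization toward the final compression target.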

References

[1] Eirikur Agustsson and Radu Timofte. 2017. NTIRE 2017 challenge on single image super-resolution: Dataset and study. In CVPRW.
[2] Chaim Baskin, Eli Schwartz, Evgenii Zheltonozhskii, Natan Liss, Raja Giryes, Alexander M. Bronstein, and Avi Mendelson. 2018. UNIQ: Uniform Noise Injection for the Quantization of Neural Networks. CoRR (2018).
[3] Marco Bevilacqua, Aline Roumy, Christine Guillemot, and Marie Line Alberi-Morel. 2012. Low-complexity single-image super-resolution based on nonnegative neighbor embedding. (2012).
[4] Bin Dai, Chen Zhu, Baining Guo, and David Wipf. 2018. Compressing neural networks using the variational information bottleneck. In ICML.
[5] Jia Deng, Wei Dong, Richard Socher, Li-Jia Li, Kai Li, and Li Fei-Fei. 2009. ImageNet: A large-scale hierarchical image database. In CVPR.
[6] Ahmed T Elthakeb, Prannoy Pilligundla, FatemehSadat Mireshghallah, Amir Yazdanbakhsh, Sicun Gao, and Hadi Esmaeilzadeh. 2018. ReLeQ: An automatic reinforcement learning approach for deep quantization of neural networks. arXiv (2018).
[7] Shaopeng Guo, Yujie Wang, Quanquan Li, and Junjie Yan. 2020. DMCP: Differentiable Markov Channel Pruning for Neural Networks. In CVPR.
[8] Philipp Gysel, Jon Pimentel, Mohammad Motamedi, and Soheil Ghiasi. 2018. Ristretto: A framework for empirical study of resource-efficient inference in convolutional neural networks. TNNLS (2018).
[9] Song Han, Huizi Mao, and William J Dally. 2016a. Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding. ICLR (2016).
[10] Song Han, Huizi Mao, and William J Dally. 2016b. Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding. ICLR (2016).
[11] Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. 2016. Deep residual learning for image recognition. In CVPR.
[12] Yang He, Guoliang Kang, Xuanyi Dong, Yanwei Fu, and Yi Yang. 2018a. Soft filter pruning for accelerating deep convolutional neural networks. In IJCAI.
[13] Yihui He, Ji Lin, Zhijian Liu, Hanrui Wang, Li-Jia Li, and Song Han. 2018b. AMC: AutoML for model compression and acceleration on mobile devices. In ECCV.
[14] Yang He, Ping Liu, Ziwei Wang, Zhilan Hu, and Yi Yang. 2019. Filter pruning via geometric median for deep convolutional neural networks acceleration. In CVPR.
[15] Yihui He, Xiangyu Zhang, and Jian Sun. 2017. Channel pruning for accelerating very deep neural networks. In ICCV.
[16] Jia-Bin Huang, Abhishek Singh, and Narendra Ahuja. 2015. Single image super-resolution from transformed self-exemplars. In CVPR.
[17] Itay Hubara, Matthieu Courbariaux, Daniel Soudry, Ran El-Yaniv, and Yoshua Bengio. 2016. Binarized Neural Networks. In NeurIPS.
[18] Alex Krizhevsky. 2009. Learning Multiple Layers of Features from Tiny Images. (2009).
[19] Se Jung Kwon, Dongsoo Lee, Byeongwook Kim, Parichay Kapoor, Baeseong Park, and Gu-Yeon Wei. 2020. Structured Compression by Weight Encryption for Unstructured Pruning and Quantization. In CVPR.
[20] Fengfu Li, Bo Zhang, and Bin Liu. 2016. Ternary weight networks. arXiv (2016).
[21] Yawei Li, Shuhang Gu, Luc Van Gool, and Radu Timofte. 2019. Learning filter basis for convolutional neural network compression. In ICCV.
[22] Yawei Li, Shuhang Gu, Kai Zhang, Luc Van Gool, and Radu Timofte. 2020. DHP: Differentiable Meta Pruning via HyperNetworks. In ECCV.
[23] Yawei Li, Wen Li, Martin Danelljan, Kai Zhang, Shuhang Gu, Luc Van Gool, and Radu Timofte. 2021. The Heterogeneity Hypothesis: Finding Layer-Wise Differentiated Network Architectures. In CVPR.
[24] Bee Lim, Sanghyun Son, Heewon Kim, Seungjun Nah, and Kyoung Mu Lee. 2017. Enhanced deep residual networks for single image super-resolution. In CVPRW.
[25] Zechun Liu, Haoyuan Mu, Xiangyu Zhang, Zichao Guo, Xin Yang, Kwang-Ting Cheng, and Jian Sun. 2019. MetaPruning: Meta learning for automatic neural network channel pruning. In ICCV.
[26] Christos Louizos, Matthias Reisser, Tijmen Blankevoort, Efstratios Gavves, and Max Welling. 2018. Relaxed Quantization for Discretized Neural Networks. In ICLR.
[27] David Martin, Charless Fowlkes, Doron Tal, and Jitendra Malik. 2001. A database of human segmented natural images and its application to evaluating segmentation algorithms and measuring ecological statistics. In ICCV. IEEE.
[28] Pavlo Molchanov, Stephen Tyree, Tero Karras, Timo Aila, and Jan Kautz. 2016. Pruning convolutional neural networks for resource efficient inference. arXiv preprint arXiv:1611.06440 (2016).
[29] Hanyu Peng, Jiaxiang Wu, Shifeng Chen, and Junzhou Huang. 2019. Collaborative channel pruning for deep networks. In ICML.
[30] Mark Sandler, Andrew Howard, Menglong Zhu, Andrey Zhmoginov, and Liang-Chieh Chen. 2018. MobileNetV2: Inverted residuals and linear bottlenecks. In CVPR.
[31] Karen Simonyan and Andrew Zisserman. 2014. Very deep convolutional networks for large-scale image recognition. In ICLR.
[32] Sanghyun Son, Seungjun Nah, and Kyoung Mu Lee. 2018. Clustering convolutional kernels to compress deep neural networks. In ECCV.
[33] Lucas Theis, Iryna Korshunova, Alykhan Tejani, and Ferenc Huszár. 2018. Faster gaze prediction with dense networks and fisher pruning. arXiv preprint arXiv:1801.05787 (2018).
[34] Naftali Tishby, Fernando C Pereira, and William Bialek. 1999. The information bottleneck method. In Annual Allerton Conference on Communications, Control and Computing.
[35] Stefan Uhlich, Lukas Mauch, Fabien Cardinaux, Kazuki Yoshiyama, Javier Alonso Garcia, Stephen Tiedemann, Thomas Kemp, and Akira Nakamura. 2019. Mixed Precision DNNs: All you need is a good parametrization. In ICLR.
[36] Kuan Wang, Zhijian Liu, Yujun Lin, Ji Lin, and Song Han. 2019. HAQ: Hardware-aware automated quantization with mixed precision. In CVPR.
[37] Min Wang, Baoyuan Liu, and Hassan Foroosh. 2017. Factorized convolutional neural networks. In ICCVW.
[38] Ying Wang, Yadong Lu, and Tijmen Blankevoort. 2020. Differentiable joint pruning and quantization for hardware efficiency. In ECCV.
[39] Shuang Wu, Guoqi Li, Feng Chen, and Luping Shi. 2018. Training and inference with integers in deep neural networks. arXiv (2018).
[40] Haichuan Yang, Shupeng Gui, Yuhao Zhu, and Ji Liu. 2020. Automatic Neural Network Compression by Sparsity-Quantization Joint Learning: A Constrained Optimization-Based Approach. In CVPR.
[41] Jianbo Ye, Xin Lu, Zhe Lin, and James Z Wang. 2018. Rethinking the Smaller-Norm-Less-Informative Assumption in Channel Pruning of Convolution Layers. In ICLR.
[42] Roman Zeyde, Michael Elad, and Matan Protter. 2010. On single image scale-up using sparse-representations. In ICCS. Springer.
[43] Shuchang Zhou, Yuxin Wu, Zekun Ni, Xinyu Zhou, He Wen, and Yuheng Zou. 2016. DoReFa-Net: Training low bitwidth convolutional neural networks with low bitwidth gradients. arXiv (2016).


    Published In

    MM '21: Proceedings of the 29th ACM International Conference on Multimedia
    October 2021
    5796 pages
    ISBN:9781450386517
    DOI:10.1145/3474085


    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 17 October 2021


    Author Tags

    1. neural network compression
    2. neural networks
    3. pruning
    4. quantization

    Qualifiers

    • Research-article

    Funding Sources

    • Natural Science Foundation of China
    • The National Key Research and Development Program of China
    • Beijing Natural Science Foundation

    Conference

MM '21: ACM Multimedia Conference
    October 20 - 24, 2021
    Virtual Event, China

    Acceptance Rates

    Overall Acceptance Rate 2,145 of 8,556 submissions, 25%
