DOI: 10.1145/3578356.3592583

Best of both, Structured and Unstructured Sparsity in Neural Networks

Published: 08 May 2023

Abstract

Besides quantization, pruning has proven to be one of the most effective methods for reducing the inference time and energy consumption of Deep Neural Networks (DNNs). In this work, we propose a sparsity definition that reflects the number of operations saved by pruned parameters, and use it to guide the pruning process toward saving as many operations as possible. Based on this definition, we show the importance of the baseline model's size and quantify the overhead of unstructured sparsity on a commercial off-the-shelf AI Hardware Accelerator (HWA) in terms of latency reduction. Furthermore, we show that combining structured and unstructured sparsity can mitigate this effect.
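
The abstract does not spell out the proposed definition, so the following is only a minimal sketch, assuming a simple MAC (multiply-accumulate) cost model for convolutions, of how a sparsity metric weighted by saved operations can be computed and how a structured (channel) mask can be combined with an unstructured (magnitude) mask. The names conv_macs and operational_sparsity, the toy tensor shapes, and the pruning ratios are all illustrative assumptions, not the paper's formulation.

import numpy as np

def conv_macs(out_h, out_w, weight_shape):
    # Dense multiply-accumulate (MAC) count of a Conv2d layer with
    # weight_shape = (out_channels, in_channels, kernel_h, kernel_w).
    c_out, c_in, k_h, k_w = weight_shape
    return out_h * out_w * c_out * c_in * k_h * k_w

def operational_sparsity(mask, out_h, out_w):
    # Fraction of MACs removed by a binary pruning mask (True = weight kept).
    # Hypothetical formulation: each pruned weight saves out_h * out_w MACs.
    saved_weights = mask.size - int(mask.sum())
    return saved_weights * out_h * out_w / conv_macs(out_h, out_w, mask.shape)

rng = np.random.default_rng(0)
w = rng.standard_normal((64, 32, 3, 3))   # toy conv weight tensor

# Structured part: drop whole output channels (friendly to a HWA).
channel_keep = np.ones(64, dtype=bool)
channel_keep[48:] = False                 # prune 16 of 64 filters
structured = np.broadcast_to(channel_keep[:, None, None, None], w.shape)

# Unstructured part: magnitude-prune the weights that survive channel pruning.
threshold = np.quantile(np.abs(w[structured]), 0.5)
unstructured = np.abs(w) > threshold

combined = structured & unstructured
print(operational_sparsity(combined, out_h=28, out_w=28))  # ~0.625

Within a single layer this metric coincides with plain weight sparsity, since every weight there costs the same number of MACs; across layers it diverges, because a weight in an early, high-resolution layer accounts for more operations than one in a late, low-resolution layer, which is presumably why an operations-aware definition is needed to steer pruning toward actual latency savings.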


Cited By

  • Neuron Efficiency Index: An Empirical Method for Optimizing Parameters in Deep Learning. 2024 International Joint Conference on Neural Networks (IJCNN), pp. 1-6. DOI: 10.1109/IJCNN60899.2024.10650887. Online publication date: 30 June 2024.
  • Diagnosis of skin cancer using VGG16 and VGG19 based transfer learning models. Multimedia Tools and Applications 83(19), pp. 57495-57510. DOI: 10.1007/s11042-023-17735-2. Online publication date: 14 December 2023.

Published In

EuroMLSys '23: Proceedings of the 3rd Workshop on Machine Learning and Systems
May 2023
176 pages
ISBN: 9798400700842
DOI: 10.1145/3578356

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 08 May 2023


Qualifiers

  • Research-article

Conference

EuroMLSys '23
Acceptance Rates

Overall Acceptance Rate 18 of 26 submissions, 69%

