Abstract
Structured channel pruning has been shown to significantly accelerate inference time for convolutional neural networks (CNNs) on modern hardware, with a relatively minor loss of network accuracy. Recent works permanently zero the pruned channels during training, which we observe to significantly hamper final accuracy, particularly as the fraction of the network being pruned increases. We propose Soft Masking for cost-constrained Channel Pruning (SMCP) to allow pruned channels to adaptively return to the network while simultaneously pruning towards a target cost constraint. By adding a soft mask re-parameterization of the weights and pruning channels from the perspective of removing input channels, we allow gradient updates to previously pruned channels and the opportunity for those channels to later return to the network. We then formulate input channel pruning as a global resource allocation problem. Our method outperforms prior works on both the ImageNet classification and PASCAL VOC detection datasets.
R. Humble—Work performed during an NVIDIA internship.
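To make the soft-masking idea in the abstract concrete, the sketch below re-parameterizes a convolution's weights with a hard 0/1 mask over its input channels that is recomputed at every training step, using a straight-through estimator so that currently pruned channels keep receiving gradient updates and can later return to the network. It is a minimal sketch under simplified assumptions: the class SoftMaskedConv2d, the per-layer keep_ratio, and the magnitude-based importance score are illustrative stand-ins and do not implement the paper's cost-constrained, layer-global channel selection.

```python
# A minimal, hypothetical sketch of soft input-channel masking with a
# straight-through estimator (STE). This is NOT the authors' implementation:
# SMCP selects channels by solving a cost-constrained, latency-aware resource
# allocation problem across all layers, whereas this sketch simply keeps a
# fixed fraction of input channels per layer by weight magnitude.
import torch
import torch.nn as nn
import torch.nn.functional as F


class SoftMaskedConv2d(nn.Module):
    """Convolution whose input channels are softly masked on every forward pass."""

    def __init__(self, in_channels, out_channels, kernel_size, keep_ratio=0.5):
        super().__init__()
        self.conv = nn.Conv2d(in_channels, out_channels, kernel_size,
                              padding=kernel_size // 2, bias=False)
        self.keep_ratio = keep_ratio  # fraction of input channels kept (illustrative)

    def forward(self, x):
        w = self.conv.weight  # shape: (out_channels, in_channels, kH, kW)

        # Recompute the hard 0/1 input-channel mask at every step, so a channel
        # pruned earlier can be selected again later. The importance proxy here
        # is the L2 norm of the weights reading each input channel, a stand-in
        # for the paper's importance/cost-based selection.
        with torch.no_grad():
            score = w.pow(2).sum(dim=(0, 2, 3)).sqrt()          # (in_channels,)
            k = max(1, int(self.keep_ratio * score.numel()))
            mask = torch.zeros_like(score)
            mask[torch.topk(score, k).indices] = 1.0

        # Soft masking via STE: the forward pass uses the masked weights, while
        # the backward pass treats the masking as identity, so weights of
        # currently pruned channels still receive gradient updates.
        w_masked = w * mask.view(1, -1, 1, 1)
        w_ste = w + (w_masked - w).detach()
        return F.conv2d(x, w_ste, padding=self.conv.padding)


# Usage: drop-in replacement for a regular convolution during training.
layer = SoftMaskedConv2d(64, 128, kernel_size=3, keep_ratio=0.5)
y = layer(torch.randn(2, 64, 32, 32))  # -> (2, 128, 32, 32)
```

In SMCP itself, the per-step channel selection is instead framed as a global resource allocation problem over all layers, trading each channel's importance against its contribution to the target cost constraint rather than using a fixed per-layer keep ratio.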
Notes
1. Our code can be accessed at https://github.com/NVlabs/SMCP.
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Humble, R., Shen, M., Latorre, J.A., Darve, E., Alvarez, J. (2022). Soft Masking for Cost-Constrained Channel Pruning. In: Avidan, S., Brostow, G., Cissé, M., Farinella, G.M., Hassner, T. (eds) Computer Vision – ECCV 2022. ECCV 2022. Lecture Notes in Computer Science, vol 13671. Springer, Cham. https://doi.org/10.1007/978-3-031-20083-0_38
DOI: https://doi.org/10.1007/978-3-031-20083-0_38
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-20082-3
Online ISBN: 978-3-031-20083-0