Soft Masking for Cost-Constrained Channel Pruning

  • Conference paper
Computer Vision – ECCV 2022 (ECCV 2022)

Abstract

Structured channel pruning has been shown to significantly accelerate inference for convolutional neural networks (CNNs) on modern hardware, with a relatively minor loss of network accuracy. Recent works permanently zero the pruned channels during training, which we observe significantly hampers final accuracy, particularly as the fraction of the network being pruned increases. We propose Soft Masking for cost-constrained Channel Pruning (SMCP) to allow pruned channels to adaptively return to the network while simultaneously pruning towards a target cost constraint. By adding a soft mask re-parameterization of the weights and pruning channels from the perspective of removing input channels, we allow gradient updates to previously pruned channels and give them the opportunity to later return to the network. We then formulate input channel pruning as a global resource allocation problem. Our method outperforms prior works on both the ImageNet classification and PASCAL VOC detection datasets.

R. Humble: Work performed during an NVIDIA internship.
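
To make the idea in the abstract concrete, here is a minimal PyTorch sketch of soft channel masking, written from the abstract alone rather than from the released SMCP code (see the Notes below for the official repository). The class name SoftMaskedConv2d, the update_mask hook, and the use of a straight-through estimator to pass gradients through the binary mask are illustrative assumptions, not the authors' exact formulation.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class SoftMaskedConv2d(nn.Module):
        """Conv2d whose weight is re-parameterized by a binary mask over its
        input channels (groups=1 assumed). Masked channels are zeroed in the
        forward pass, but a straight-through estimator lets gradients reach
        the underlying dense weights, so a pruned channel can later return to
        the network once the mask is re-derived."""

        def __init__(self, in_channels, out_channels, kernel_size, **kwargs):
            super().__init__()
            self.conv = nn.Conv2d(in_channels, out_channels, kernel_size, **kwargs)
            # 1 = input channel kept, 0 = input channel currently (softly) pruned.
            self.register_buffer("mask", torch.ones(in_channels))

        def forward(self, x):
            w = self.conv.weight                          # (out, in, kH, kW)
            masked_w = w * self.mask.view(1, -1, 1, 1)    # zero pruned input channels
            # Straight-through trick: the forward pass uses masked_w, while the
            # backward pass behaves as if the mask were all ones, so weights of
            # pruned channels keep receiving gradient updates.
            w_ste = w + (masked_w - w).detach()
            return F.conv2d(x, w_ste, self.conv.bias,
                            stride=self.conv.stride, padding=self.conv.padding,
                            dilation=self.conv.dilation, groups=self.conv.groups)

        @torch.no_grad()
        def update_mask(self, keep_idx):
            # Periodically re-derive the mask from some global allocation step
            # (e.g. choosing which input channels to keep under a cost budget);
            # channels outside keep_idx are soft-pruned but not discarded.
            self.mask.zero_()
            self.mask[keep_idx] = 1.0

    # Example: softly prune 16 of 64 input channels, then run a forward pass.
    layer = SoftMaskedConv2d(64, 128, kernel_size=3, padding=1)
    layer.update_mask(torch.arange(48))
    y = layer(torch.randn(2, 64, 32, 32))

In the paper, the kept channels are not chosen per layer in isolation; the abstract frames the choice as a global resource allocation problem across the whole network under the target cost constraint, which the sketch above leaves abstract behind update_mask.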

Notes

  1. Our code can be accessed at https://github.com/NVlabs/SMCP.

Author information

Corresponding author

Correspondence to Ryan Humble.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (pdf 393 KB)

Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Humble, R., Shen, M., Latorre, J.A., Darve, E., Alvarez, J. (2022). Soft Masking for Cost-Constrained Channel Pruning. In: Avidan, S., Brostow, G., Cissé, M., Farinella, G.M., Hassner, T. (eds) Computer Vision – ECCV 2022. ECCV 2022. Lecture Notes in Computer Science, vol 13671. Springer, Cham. https://doi.org/10.1007/978-3-031-20083-0_38

  • DOI: https://doi.org/10.1007/978-3-031-20083-0_38

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-20082-3

  • Online ISBN: 978-3-031-20083-0

  • eBook Packages: Computer Science, Computer Science (R0)
