Abstract
Structured channel pruning has been shown to significantly accelerate inference time for convolutional neural networks (CNNs) on modern hardware, with a relatively minor loss of network accuracy. Recent works permanently zero the pruned channels during training, which we observe to significantly hamper final accuracy, particularly as the fraction of the network being pruned increases. We propose Soft Masking for cost-constrained Channel Pruning (SMCP) to allow pruned channels to adaptively return to the network while simultaneously pruning towards a target cost constraint. By adding a soft mask re-parameterization of the weights and pruning channels from the perspective of removing input channels, we allow gradient updates to previously pruned channels and the opportunity for those channels to later return to the network. We then formulate input channel pruning as a global resource allocation problem. Our method outperforms prior works on both the ImageNet classification and PASCAL VOC detection datasets.
R. Humble—Work performed during an NVIDIA internship.
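To make the soft-masking idea in the abstract concrete, the sketch below re-parameterizes a convolution's weights with a hard 0/1 mask over its input channels that is recomputed at every training step, using a straight-through estimator so that currently pruned channels keep receiving gradient updates and can later return to the network. It is a minimal sketch under simplified assumptions: the class SoftMaskedConv2d, the per-layer keep_ratio, and the magnitude-based importance score are illustrative stand-ins and do not implement the paper's cost-constrained, layer-global channel selection.

```python
# A minimal, hypothetical sketch of soft input-channel masking with a
# straight-through estimator (STE). This is NOT the authors' implementation:
# SMCP selects channels by solving a cost-constrained, latency-aware resource
# allocation problem across all layers, whereas this sketch simply keeps a
# fixed fraction of input channels per layer by weight magnitude.
import torch
import torch.nn as nn
import torch.nn.functional as F


class SoftMaskedConv2d(nn.Module):
    """Convolution whose input channels are softly masked on every forward pass."""

    def __init__(self, in_channels, out_channels, kernel_size, keep_ratio=0.5):
        super().__init__()
        self.conv = nn.Conv2d(in_channels, out_channels, kernel_size,
                              padding=kernel_size // 2, bias=False)
        self.keep_ratio = keep_ratio  # fraction of input channels kept (illustrative)

    def forward(self, x):
        w = self.conv.weight  # shape: (out_channels, in_channels, kH, kW)

        # Recompute the hard 0/1 input-channel mask at every step, so a channel
        # pruned earlier can be selected again later. The importance proxy here
        # is the L2 norm of the weights reading each input channel, a stand-in
        # for the paper's importance/cost-based selection.
        with torch.no_grad():
            score = w.pow(2).sum(dim=(0, 2, 3)).sqrt()          # (in_channels,)
            k = max(1, int(self.keep_ratio * score.numel()))
            mask = torch.zeros_like(score)
            mask[torch.topk(score, k).indices] = 1.0

        # Soft masking via STE: the forward pass uses the masked weights, while
        # the backward pass treats the masking as identity, so weights of
        # currently pruned channels still receive gradient updates.
        w_masked = w * mask.view(1, -1, 1, 1)
        w_ste = w + (w_masked - w).detach()
        return F.conv2d(x, w_ste, padding=self.conv.padding)


# Usage: drop-in replacement for a regular convolution during training.
layer = SoftMaskedConv2d(64, 128, kernel_size=3, keep_ratio=0.5)
y = layer(torch.randn(2, 64, 32, 32))  # -> (2, 128, 32, 32)
```

In SMCP itself, the per-step channel selection is instead framed as a global resource allocation problem over all layers, trading each channel's importance against its contribution to the target cost constraint rather than using a fixed per-layer keep ratio.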
Notes
1. Our code can be accessed at https://github.com/NVlabs/SMCP.
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Humble, R., Shen, M., Latorre, J.A., Darve, E., Alvarez, J. (2022). Soft Masking for Cost-Constrained Channel Pruning. In: Avidan, S., Brostow, G., Cissé, M., Farinella, G.M., Hassner, T. (eds) Computer Vision – ECCV 2022. ECCV 2022. Lecture Notes in Computer Science, vol 13671. Springer, Cham. https://doi.org/10.1007/978-3-031-20083-0_38
DOI: https://doi.org/10.1007/978-3-031-20083-0_38
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-20082-3
Online ISBN: 978-3-031-20083-0