
Learning Extremely Lightweight and Robust Model with Differentiable Constraints on Sparsity and Condition Number

  • Conference paper
Computer Vision – ECCV 2022 (ECCV 2022)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13664)


Abstract

Learning lightweight and robust deep learning models is an enormous challenge for safety-critical devices with limited computing and memory resources, because robustness against adversarial attacks generally scales with network capacity. The community has extensively explored combining adversarial training with model compression techniques such as weight pruning. However, lightweight models obtained by heavily pruning over-parameterized models suffer sharp drops in both robust and natural accuracy. It has been observed that the parameters of such models lie in an ill-conditioned weight space, i.e., the condition numbers of the weight matrices tend to be so large that the model is not robust. In this work, we propose a framework for building extremely lightweight models that combines the tensor product with differentiable constraints for reducing the condition number and promoting sparsity. Moreover, the proposed framework is incorporated into adversarial training under a min-max optimization scheme. We evaluate the proposed approach on VGG-16 and the Vision Transformer. Experimental results on datasets such as ImageNet, SVHN, and CIFAR-10 show that we achieve a substantial advantage at high compression ratios, e.g., 200 times.
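The authors' released code is linked in the Notes below. As a rough illustration only (not the authors' implementation), the sketch below shows one way to express differentiable constraints on condition number and sparsity in PyTorch; the function names `cond_penalty` and `sparsity_penalty`, the log-ratio surrogate, and the penalty weights are assumptions made for this example.

```python
import torch

def cond_penalty(W: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    # torch.linalg.svdvals returns singular values in descending order
    # and is differentiable, so the log-ratio of the largest to the
    # smallest singular value is a usable surrogate for cond(W).
    s = torch.linalg.svdvals(W)
    return torch.log(s[0] / (s[-1] + eps))

def sparsity_penalty(W: torch.Tensor) -> torch.Tensor:
    # The L1 norm is the standard differentiable surrogate for sparsity.
    return W.abs().sum()

# Gradients flow through both penalties, so they can be added to any
# training loss; the 1e-3 / 1e-4 weights are arbitrary for this sketch.
W = torch.randn(64, 64, requires_grad=True)
loss = 1e-3 * cond_penalty(W) + 1e-4 * sparsity_penalty(W)
loss.backward()
```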
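The tensor-product ingredient can be pictured as parameterizing a large weight matrix by a Kronecker product of small factors, which is where the extreme compression ratios come from. The `KronLinear` class below is a simplified stand-in for the paper's construction, assuming PyTorch; the single-factor design is an illustrative choice, not the paper's exact layer.

```python
import torch
import torch.nn as nn

class KronLinear(nn.Module):
    """Linear layer whose weight is the Kronecker product A ⊗ B."""
    def __init__(self, m1: int, n1: int, m2: int, n2: int):
        super().__init__()
        # Stores m1*n1 + m2*n2 parameters instead of (m1*m2)*(n1*n2).
        self.A = nn.Parameter(0.1 * torch.randn(m1, n1))
        self.B = nn.Parameter(0.1 * torch.randn(m2, n2))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # The full weight is rebuilt from the factors on the fly and is
        # never stored as a trained parameter.
        W = torch.kron(self.A, self.B)   # shape (m1*m2, n1*n2)
        return x @ W.T

layer = KronLinear(16, 16, 16, 16)   # behaves like nn.Linear(256, 256)
out = layer(torch.randn(4, 256))     # -> shape (4, 256)
# Parameter count: 2 * 16 * 16 = 512 versus 256 * 256 = 65536 (128x fewer).
```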
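Finally, the min-max scheme alternates an inner maximization that crafts adversarial examples with an outer minimization of the regularized loss. A minimal PGD-based sketch, reusing the two penalties from the first example, might look like the following; `model`, `optimizer`, the regularization weights, and the L-infinity attack budget are placeholders following common CIFAR-10 practice rather than the paper's exact settings.

```python
import torch
import torch.nn.functional as F

def pgd_attack(model, x, y, eps=8/255, alpha=2/255, steps=10):
    # Inner maximization: L_inf PGD around the clean input, assuming
    # inputs live in [0, 1].
    delta = torch.empty_like(x).uniform_(-eps, eps).requires_grad_(True)
    for _ in range(steps):
        loss = F.cross_entropy(model(x + delta), y)
        grad, = torch.autograd.grad(loss, delta)
        delta = (delta + alpha * grad.sign()).clamp(-eps, eps)
        delta = delta.detach().requires_grad_(True)
    return (x + delta).clamp(0, 1).detach()

def train_step(model, optimizer, x, y, lam_cond=1e-3, lam_sparse=1e-4):
    # Outer minimization: adversarial loss plus both differentiable
    # penalties, applied here to every 2-D weight matrix in the model.
    x_adv = pgd_attack(model, x, y)
    reg = sum(lam_cond * cond_penalty(p) + lam_sparse * sparsity_penalty(p)
              for p in model.parameters() if p.dim() == 2)
    loss = F.cross_entropy(model(x_adv), y) + reg
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```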


Notes

  1. https://github.com/MVPR-Group/ARLST.


Acknowledgment

This work was supported by the National Key Research and Development Program of China (No. 2018YFB2101300) and the Natural Science Foundation of China (No. 61872147).

Author information


Corresponding author

Correspondence to Xuan Tang.


Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (PDF 298 KB)


Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Wei, X. et al. (2022). Learning Extremely Lightweight and Robust Model with Differentiable Constraints on Sparsity and Condition Number. In: Avidan, S., Brostow, G., Cissé, M., Farinella, G.M., Hassner, T. (eds) Computer Vision – ECCV 2022. ECCV 2022. Lecture Notes in Computer Science, vol 13664. Springer, Cham. https://doi.org/10.1007/978-3-031-19772-7_40


  • DOI: https://doi.org/10.1007/978-3-031-19772-7_40

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-19771-0

  • Online ISBN: 978-3-031-19772-7

  • eBook Packages: Computer Science, Computer Science (R0)
