
Learning Extremely Lightweight and Robust Model with Differentiable Constraints on Sparsity and Condition Number

  • Conference paper
Computer Vision – ECCV 2022 (ECCV 2022)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13664)


Abstract

Learning lightweight and robust deep learning models is an enormous challenge for safety-critical devices with limited computing and memory resources, because robustness against adversarial attacks generally scales with network capacity. The community has extensively explored combining adversarial training with model compression techniques such as weight pruning. However, lightweight models obtained by heavily pruning over-parameterized models suffer sharp drops in both robust and natural accuracy. It has been observed that the parameters of such models lie in an ill-conditioned weight space, i.e., the condition numbers of the weight matrices tend to be so large that the model is not robust. In this work, we propose a framework for building extremely lightweight models that combines the tensor product with differentiable constraints for reducing the condition number and promoting sparsity. Moreover, the proposed framework is incorporated into adversarial training under a min-max optimization scheme. We evaluate the proposed approach on VGG-16 and the Vision Transformer. Experimental results on datasets such as ImageNet, SVHN, and CIFAR-10 show that we achieve a substantial advantage at high compression ratios, e.g., 200 times.
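The authors' released code is linked in the Notes below. As a rough illustration only (not the authors' implementation), the sketch below shows one way to express differentiable constraints on condition number and sparsity in PyTorch; the function names `cond_penalty` and `sparsity_penalty`, the log-ratio surrogate, and the penalty weights are assumptions made for this example.

```python
import torch

def cond_penalty(W: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    # torch.linalg.svdvals returns singular values in descending order
    # and is differentiable, so the log-ratio of the largest to the
    # smallest singular value is a usable surrogate for cond(W).
    s = torch.linalg.svdvals(W)
    return torch.log(s[0] / (s[-1] + eps))

def sparsity_penalty(W: torch.Tensor) -> torch.Tensor:
    # The L1 norm is the standard differentiable surrogate for sparsity.
    return W.abs().sum()

# Gradients flow through both penalties, so they can be added to any
# training loss; the 1e-3 / 1e-4 weights are arbitrary for this sketch.
W = torch.randn(64, 64, requires_grad=True)
loss = 1e-3 * cond_penalty(W) + 1e-4 * sparsity_penalty(W)
loss.backward()
```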
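The tensor-product ingredient can be pictured as parameterizing a large weight matrix by a Kronecker product of small factors, which is where the extreme compression ratios come from. The `KronLinear` class below is a simplified stand-in for the paper's construction, assuming PyTorch; the single-factor design is an illustrative choice, not the paper's exact layer.

```python
import torch
import torch.nn as nn

class KronLinear(nn.Module):
    """Linear layer whose weight is the Kronecker product A ⊗ B."""
    def __init__(self, m1: int, n1: int, m2: int, n2: int):
        super().__init__()
        # Stores m1*n1 + m2*n2 parameters instead of (m1*m2)*(n1*n2).
        self.A = nn.Parameter(0.1 * torch.randn(m1, n1))
        self.B = nn.Parameter(0.1 * torch.randn(m2, n2))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # The full weight is rebuilt from the factors on the fly and is
        # never stored as a trained parameter.
        W = torch.kron(self.A, self.B)   # shape (m1*m2, n1*n2)
        return x @ W.T

layer = KronLinear(16, 16, 16, 16)   # behaves like nn.Linear(256, 256)
out = layer(torch.randn(4, 256))     # -> shape (4, 256)
# Parameter count: 2 * 16 * 16 = 512 versus 256 * 256 = 65536 (128x fewer).
```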
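Finally, the min-max scheme alternates an inner maximization that crafts adversarial examples with an outer minimization of the regularized loss. A minimal PGD-based sketch, reusing the two penalties from the first example, might look like the following; `model`, `optimizer`, the regularization weights, and the L-infinity attack budget are placeholders following common CIFAR-10 practice rather than the paper's exact settings.

```python
import torch
import torch.nn.functional as F

def pgd_attack(model, x, y, eps=8/255, alpha=2/255, steps=10):
    # Inner maximization: L_inf PGD around the clean input, assuming
    # inputs live in [0, 1].
    delta = torch.empty_like(x).uniform_(-eps, eps).requires_grad_(True)
    for _ in range(steps):
        loss = F.cross_entropy(model(x + delta), y)
        grad, = torch.autograd.grad(loss, delta)
        delta = (delta + alpha * grad.sign()).clamp(-eps, eps)
        delta = delta.detach().requires_grad_(True)
    return (x + delta).clamp(0, 1).detach()

def train_step(model, optimizer, x, y, lam_cond=1e-3, lam_sparse=1e-4):
    # Outer minimization: adversarial loss plus both differentiable
    # penalties, applied here to every 2-D weight matrix in the model.
    x_adv = pgd_attack(model, x, y)
    reg = sum(lam_cond * cond_penalty(p) + lam_sparse * sparsity_penalty(p)
              for p in model.parameters() if p.dim() == 2)
    loss = F.cross_entropy(model(x_adv), y) + reg
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```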


Notes

  1. https://github.com/MVPR-Group/ARLST.


Acknowledgment

This work was supported by the National Key Research and Development Program of China (No. 2018YFB2101300) and the Natural Science Foundation of China (No. 61872147).

Author information


Corresponding author

Correspondence to Xuan Tang.


Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (PDF 298 KB)


Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Wei, X. et al. (2022). Learning Extremely Lightweight and Robust Model with Differentiable Constraints on Sparsity and Condition Number. In: Avidan, S., Brostow, G., Cissé, M., Farinella, G.M., Hassner, T. (eds) Computer Vision – ECCV 2022. ECCV 2022. Lecture Notes in Computer Science, vol 13664. Springer, Cham. https://doi.org/10.1007/978-3-031-19772-7_40


  • DOI: https://doi.org/10.1007/978-3-031-19772-7_40

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-19771-0

  • Online ISBN: 978-3-031-19772-7

  • eBook Packages: Computer Science, Computer Science (R0)
