Abstract
Adversarial examples have exposed the inherent vulnerabilities of deep neural networks. Although adversarial training has emerged as the leading defense strategy, it is frequently hindered by a difficult balance between maintaining accuracy on clean examples and improving model robustness. Recent efforts that decouple network components can effectively reduce the degradation of classification accuracy, but at the cost of unsatisfactory robust accuracy, and they may suffer from robust overfitting. In this paper, we investigate the underlying causes of this compromise and introduce a novel framework, the Regularized Decoupled Adversarial Training mechanism (RDAT), to address both the trade-off and the overfitting. Specifically, RDAT comprises two distinct modules: a regularization module mitigates harmful perturbations by constraining the distance between the data distributions of examples before and after adversarial attacks, and a decoupled training module separates clean and adversarial examples so that each can follow a dedicated optimization strategy, avoiding the suboptimal results of standard adversarial training. With only a marginal compromise in classification accuracy, RDAT achieves markedly better robustness, improving robust accuracy by an average of 4.47% on CIFAR-10 and 3.23% on CIFAR-100 compared with state-of-the-art methods.
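To make the two modules concrete, the following is a minimal PyTorch-style sketch of what one RDAT-like training step could look like. This is an illustration under stated assumptions, not the paper's implementation: the "data distribution distance" is instantiated here as a KL divergence between the model's output distributions on clean and adversarial inputs (a TRADES-style choice), decoupling is rendered as two separate optimizer updates, and all names (pgd_attack, rdat_step, beta) are hypothetical.

    # Illustrative sketch only; the KL regularizer and the two-optimizer
    # decoupling are assumptions, not the paper's exact formulation.
    import torch
    import torch.nn.functional as F

    def pgd_attack(model, x, y, eps=8/255, alpha=2/255, steps=10):
        """Standard PGD attack under an L-infinity constraint."""
        x_adv = (x + torch.empty_like(x).uniform_(-eps, eps)).clamp(0, 1).detach()
        for _ in range(steps):
            x_adv.requires_grad_(True)
            loss = F.cross_entropy(model(x_adv), y)
            grad = torch.autograd.grad(loss, x_adv)[0]
            x_adv = x_adv.detach() + alpha * grad.sign()
            x_adv = torch.min(torch.max(x_adv, x - eps), x + eps).clamp(0, 1)
        return x_adv.detach()

    def rdat_step(model, opt_clean, opt_adv, x, y, beta=6.0):
        """One decoupled step: a clean update, then a regularized robust update."""
        # Decoupled update 1: fit the clean examples with plain cross-entropy.
        opt_clean.zero_grad()
        F.cross_entropy(model(x), y).backward()
        opt_clean.step()

        # Decoupled update 2: pull the adversarial output distribution back
        # toward the clean one, i.e., penalize the distribution distance
        # between examples before and after the attack.
        x_adv = pgd_attack(model, x, y)
        opt_adv.zero_grad()
        kl = F.kl_div(F.log_softmax(model(x_adv), dim=1),
                      F.softmax(model(x).detach(), dim=1),
                      reduction="batchmean")
        (beta * kl).backward()
        opt_adv.step()

Here opt_clean and opt_adv may wrap the same parameters with different hyperparameters (e.g., learning rates or schedules), which is one simple way to give clean and adversarial examples the "special optimization strategies" the abstract describes.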
Acknowledgements
This work was supported by the Dreams Foundation of Jianghuai Advance Technology Center (No. 2023-ZM01Z002) and by the Natural Science Foundation of Hunan Province under Grant 2023JJ30082.
Author information
Contributions
Yishan Li wrote the main manuscript; Yanming Guo and Yulun Wu provided guidance on the writing and advised on revisions. Yuxiang Xie and Mingrui Lao produced the figures and charts. Tianyuan Yu and Yirun Ruan directed the experimental work. All authors reviewed the manuscript.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Li, Y., Guo, Y., Wu, Y. et al. RDAT: an efficient regularized decoupled adversarial training mechanism. Int J Multimed Info Retr 13, 24 (2024). https://doi.org/10.1007/s13735-024-00330-y