DOI: 10.1145/3638530.3664104

Tightening the Approximation Error of Adversarial Risk with Auto Loss Function Search

Published: 01 August 2024
Abstract

    How to accurately evaluate the adversarial robustness of Deep Neural Networks (DNNs) is critical for their deployment in real-world applications. An ideal indicator of robustness is adversarial risk. Unfortunately, since it involves maximizing the 0--1 loss, calculating the true risk is technically intractable. The most common workaround is to compute an approximate risk by replacing the 0--1 loss with a surrogate, such as the Cross-Entropy loss. However, these surrogates are all manually designed and may not be well suited for adversarial robustness evaluation. In this paper, we leverage AutoML to tighten the gap between the true and approximate risks. First, we propose AutoLoss-AR, the first method to search for surrogate losses for adversarial risk. Experimental results on 10 adversarially trained models demonstrate the effectiveness of the proposed method: the risks evaluated using the best-discovered losses are 0.2% to 1.6% better than those evaluated with the baselines. Second, 5 surrogate losses with clean and readable formulas are distilled out and tested on 7 unseen adversarially trained models. These losses outperform the baselines by 0.8% to 2.4%, indicating that they can also be used individually as a form of new knowledge. Our code is publicly available at https://github.com/xpf/Tightening-Approximation-Error.
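
    For context, below is a minimal PyTorch sketch of how such an approximate adversarial risk is typically estimated: the 0--1 loss inside the inner maximization is replaced by a differentiable surrogate (Cross-Entropy by default here) and maximized with a PGD-style attack. The function name, the assumption of inputs in [0, 1], and the eps/alpha/steps settings are illustrative choices, not the paper's exact configuration; a searched surrogate could be passed in place of F.cross_entropy.

```python
import torch
import torch.nn.functional as F


def approximate_adversarial_risk(model, x, y, surrogate=F.cross_entropy,
                                 eps=8 / 255, alpha=2 / 255, steps=10):
    """Estimate adversarial risk by maximizing a surrogate loss with PGD.

    The true adversarial risk maximizes the non-differentiable 0-1 loss;
    the surrogate stands in for it, so the error rate returned here is
    only an approximation (a lower bound) of the true risk.
    """
    model.eval()
    # Random start inside the L-infinity ball of radius eps.
    delta = torch.empty_like(x).uniform_(-eps, eps).requires_grad_(True)
    for _ in range(steps):
        loss = surrogate(model((x + delta).clamp(0, 1)), y)
        grad, = torch.autograd.grad(loss, delta)
        # Ascend on the surrogate, then project back into the eps-ball.
        delta = (delta + alpha * grad.sign()).clamp(-eps, eps).detach().requires_grad_(True)
    with torch.no_grad():
        preds = model((x + delta).clamp(0, 1)).argmax(dim=1)
    # Fraction of inputs whose adversarial counterpart is misclassified.
    return (preds != y).float().mean().item()
```

    A tighter surrogate yields a larger measured error rate, i.e., a smaller gap to the true adversarial risk, which is the quantity the searched losses aim to improve.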



    Published In

    GECCO '24 Companion: Proceedings of the Genetic and Evolutionary Computation Conference Companion
    July 2024
    2187 pages
    ISBN: 9798400704956
    DOI: 10.1145/3638530


    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 01 August 2024


    Author Tags

    1. deep neural networks
    2. adversarial examples
    3. adversarial risk
    4. approximation error
    5. auto loss function search

    Qualifiers

    • Research-article


    Conference

    GECCO '24 Companion

    Acceptance Rates

    Overall Acceptance Rate 1,669 of 4,410 submissions, 38%
