
Adaptive Adversarial Logits Pairing

Published: 18 October 2023

Abstract

Adversarial examples both offer an opportunity and pose a challenge for understanding image classification systems. Analyzing the adversarial training solution Adversarial Logits Pairing (ALP), we make two observations: (1) adversarially robust models tend to base their inference on fewer high-contribution features than vulnerable ones; (2) the training target of ALP fits a noticeable portion of samples poorly, for which the logits pairing loss is overemphasized and obstructs minimizing the classification loss. Motivated by these observations, we design Adaptive Adversarial Logits Pairing (AALP), which modifies both the training process and the training target of ALP. Specifically, AALP consists of an adaptive feature optimization module that uses Guided Dropout to systematically pursue fewer high-contribution features, and an adaptive sample weighting module that sets sample-specific training weights to balance the logits pairing loss against the classification loss. Extensive experiments on multiple datasets demonstrate the superior defense performance of AALP.
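The loss structure the abstract describes can be sketched as follows. This is a minimal NumPy illustration, not the paper's implementation: `alp_loss` shows the standard ALP objective (classification loss on adversarial logits plus a fixed-weight pairing term), while `aalp_loss` adds a hypothetical per-sample weight that shrinks the pairing term when it would dominate the classification loss. The weighting rule `w = ce / (ce + lam * pair)` is an assumption made for illustration; the paper's actual formula may differ.

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax over the class axis.
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def alp_loss(clean_logits, adv_logits, labels, lam=0.5):
    """Standard ALP (Kannan et al., 2018): cross-entropy on adversarial
    logits plus a fixed-weight term pairing clean and adversarial logits."""
    p = softmax(adv_logits)
    ce = -np.log(p[np.arange(len(labels)), labels])        # per-sample CE
    pair = ((clean_logits - adv_logits) ** 2).mean(axis=1) # per-sample pairing
    return ce + lam * pair

def aalp_loss(clean_logits, adv_logits, labels, lam=0.5):
    """Hypothetical sketch of AALP's adaptive sample weighting: samples
    whose pairing term is large relative to their classification loss get
    a smaller pairing weight, so pairing no longer obstructs classification.
    The weighting rule below is assumed, not taken from the paper."""
    p = softmax(adv_logits)
    ce = -np.log(p[np.arange(len(labels)), labels])
    pair = ((clean_logits - adv_logits) ** 2).mean(axis=1)
    w = ce / (ce + lam * pair + 1e-8)  # assumed per-sample balance in (0, 1)
    return ce + w * lam * pair
```

Because the assumed weight lies in (0, 1), the adaptive variant never penalizes a sample's pairing term more than plain ALP does; samples dominated by the pairing loss are down-weighted the most.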


Published In

ACM Transactions on Multimedia Computing, Communications, and Applications, Volume 20, Issue 2
February 2024
548 pages
EISSN:1551-6865
DOI:10.1145/3613570
Editor: Abdulmotaleb El Saddik

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 18 October 2023
Online AM: 21 August 2023
Accepted: 27 July 2023
Revised: 27 February 2023
Received: 21 April 2022
Published in TOMM Volume 20, Issue 2


Author Tags

  1. Adversarial defense
  2. adaptive
  3. dropout

Qualifiers

  • Research-article

Funding Sources

  • Fundamental Research Funds for the Central Universities
  • National Natural Science Foundation of China
  • Beijing Natural Science Foundation
  • CCF-Zhipu AI Large Model Fund

Article Metrics

  • Total Citations: 0
  • Total Downloads: 99
  • Downloads (last 12 months): 84
  • Downloads (last 6 weeks): 8

Reflects downloads up to 04 Oct 2024
