
Adaptive Adversarial Logits Pairing

Published: 18 October 2023

Abstract

Adversarial examples both offer an opportunity and pose a challenge for understanding image classification systems. Analyzing the adversarial training solution Adversarial Logits Pairing (ALP), we make two observations: (1) adversarially robust models tend to base their inference on fewer high-contribution features than vulnerable ones; (2) the training target of ALP fits a noticeable portion of samples poorly, for which the logits pairing loss is overemphasized and obstructs minimizing the classification loss. Motivated by these observations, we design Adaptive Adversarial Logits Pairing (AALP), which modifies both the training process and the training target of ALP. Specifically, AALP consists of an adaptive feature optimization module that uses Guided Dropout to systematically pursue fewer high-contribution features, and an adaptive sample weighting module that sets sample-specific training weights to balance the logits pairing loss against the classification loss. Extensive experiments on multiple datasets demonstrate the superior defense performance of AALP.
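The loss structure the abstract describes can be sketched as follows. This is a minimal NumPy illustration, not the paper's implementation: `alp_loss` shows the standard ALP objective (classification loss on adversarial logits plus a fixed-weight pairing term), while `aalp_loss` adds a hypothetical per-sample weight that shrinks the pairing term when it would dominate the classification loss. The weighting rule `w = ce / (ce + lam * pair)` is an assumption made for illustration; the paper's actual formula may differ.

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax over the class axis.
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def alp_loss(clean_logits, adv_logits, labels, lam=0.5):
    """Standard ALP (Kannan et al., 2018): cross-entropy on adversarial
    logits plus a fixed-weight term pairing clean and adversarial logits."""
    p = softmax(adv_logits)
    ce = -np.log(p[np.arange(len(labels)), labels])        # per-sample CE
    pair = ((clean_logits - adv_logits) ** 2).mean(axis=1) # per-sample pairing
    return ce + lam * pair

def aalp_loss(clean_logits, adv_logits, labels, lam=0.5):
    """Hypothetical sketch of AALP's adaptive sample weighting: samples
    whose pairing term is large relative to their classification loss get
    a smaller pairing weight, so pairing no longer obstructs classification.
    The weighting rule below is assumed, not taken from the paper."""
    p = softmax(adv_logits)
    ce = -np.log(p[np.arange(len(labels)), labels])
    pair = ((clean_logits - adv_logits) ** 2).mean(axis=1)
    w = ce / (ce + lam * pair + 1e-8)  # assumed per-sample balance in (0, 1)
    return ce + w * lam * pair
```

Because the assumed weight lies in (0, 1), the adaptive variant never penalizes a sample's pairing term more than plain ALP does; samples dominated by the pairing loss are down-weighted the most.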


Published In

ACM Transactions on Multimedia Computing, Communications, and Applications, Volume 20, Issue 2
February 2024
548 pages
EISSN:1551-6865
DOI:10.1145/3613570
Editor: Abdulmotaleb El Saddik

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 18 October 2023
Online AM: 21 August 2023
Accepted: 27 July 2023
Revised: 27 February 2023
Received: 21 April 2022
Published in TOMM Volume 20, Issue 2


Author Tags

  1. Adversarial defense
  2. adaptive
  3. dropout

Qualifiers

  • Research-article

Funding Sources

  • Fundamental Research Funds for the Central Universities
  • National Natural Science Foundation of China
  • Beijing Natural Science Foundation
  • CCF-Zhipu AI Large Model Fund

Article Metrics

  • Total Citations: 0
  • Total Downloads: 99
  • Downloads (last 12 months): 84
  • Downloads (last 6 weeks): 8

Reflects downloads up to 04 Oct 2024
