DOI: 10.1145/3485832.3485904
Research Article · Open Access

Two Souls in an Adversarial Image: Towards Universal Adversarial Example Detection using Multi-view Inconsistency

Published: 06 December 2021
Abstract

    In the evasion attacks against deep neural networks (DNN), the attacker generates adversarial instances that are visually indistinguishable from benign samples and sends them to the target DNN to trigger misclassifications. In this paper, we propose a novel multi-view adversarial image detector, namely Argos, based on a novel observation. That is, there exist two “souls” in an adversarial instance, i.e., the visually unchanged content, which corresponds to the true label, and the added invisible perturbation, which corresponds to the misclassified label. Such inconsistencies could be further amplified through an autoregressive generative approach that generates images with seed pixels selected from the original image, a selected label, and pixel distributions learned from the training data. The generated images (i.e., the “views”) will deviate significantly from the original one if the label is adversarial, demonstrating inconsistencies that Argos expects to detect. To this end, Argos first amplifies the discrepancies between the visual content of an image and its misclassified label induced by the attack using a set of regeneration mechanisms and then identifies an image as adversarial if the reproduced views deviate to a preset degree. Our experimental results show that Argos significantly outperforms two representative adversarial detectors in both detection accuracy and robustness against six well-known adversarial attacks. Code is available at: https://github.com/sohaib730/Argos-Adversarial_Detection
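
    The abstract describes the core decision rule: regenerate "views" of an image conditioned on its predicted label and flag the input as adversarial when the views deviate too far from the original. Below is a minimal, hypothetical sketch of that rule only. The generator interface `regenerate_view`, the mean-squared-error deviation measure, the number of views, and the threshold are illustrative assumptions, not the actual Argos implementation (see the linked GitHub repository for the authors' code).

```python
import numpy as np

def detect_adversarial(image, predicted_label, regenerate_view,
                       num_views=4, threshold=0.1):
    """Flag `image` as adversarial if views regenerated under its predicted
    label deviate too much from the original (hypothetical sketch).

    `regenerate_view(image, label, seed)` stands in for a class-conditional
    autoregressive generator (e.g., a conditional PixelCNN) that re-draws the
    image from a few seed pixels under the given label. The interface, the
    MSE deviation measure, and the threshold are illustrative assumptions,
    not the Argos API.
    """
    deviations = []
    for seed in range(num_views):
        view = regenerate_view(image, predicted_label, seed)
        # Mean squared per-pixel distance between the regenerated view and
        # the original image.
        deviations.append(np.mean((view - image) ** 2))
    # Views conditioned on an adversarial (mismatched) label are expected to
    # drift away from the visual content, pushing the mean deviation up.
    return float(np.mean(deviations)) > threshold

# Toy usage with a stand-in generator that only adds small noise, so a benign
# image should not be flagged.
if __name__ == "__main__":
    rng = np.random.default_rng(0)
    img = rng.random((32, 32, 3)).astype(np.float32)

    def fake_generator(x, label, seed):
        noise = np.random.default_rng(seed).standard_normal(x.shape)
        return x + 0.01 * noise.astype(np.float32)

    print(detect_adversarial(img, predicted_label=3,
                             regenerate_view=fake_generator))
```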


    Cited By

    • (2023) Multi-View Multi-Task Campaign Embedding for Cold-Start Conversion Rate Forecasting. IEEE Transactions on Big Data, 9(1), 280-293. DOI: 10.1109/TBDATA.2022.3162150. Online publication date: 1-Feb-2023.
    • (2023) AI-Guardian: Defeating Adversarial Attacks using Backdoors. 2023 IEEE Symposium on Security and Privacy (SP), 701-718. DOI: 10.1109/SP46215.2023.10179473. Online publication date: May-2023.
    • (2022) MR2D: Multiple Random Masking Reconstruction Adversarial Detector. 2022 10th International Conference on Information Systems and Computing Technology (ISCTech), 61-67. DOI: 10.1109/ISCTech58360.2022.00016. Online publication date: Dec-2022.



            Published In

            ACSAC '21: Proceedings of the 37th Annual Computer Security Applications Conference
            December 2021
            1077 pages
            ISBN: 978-1-4503-8579-4
            DOI: 10.1145/3485832

            Publisher

            Association for Computing Machinery

            New York, NY, United States

            Publication History

            Published: 06 December 2021


            Author Tags

            1. Adversarial Attacks
            2. Adversarial Detection
            3. Deep Generative Modeling
            4. Deep Learning
            5. Multi-view Machine Learning

            Qualifiers

            • Research-article
            • Research
            • Refereed limited

            Conference

            ACSAC '21

            Acceptance Rates

            Overall acceptance rate: 104 of 497 submissions (21%)

            Article Metrics

            • Downloads (last 12 months): 188
            • Downloads (last 6 weeks): 30
            Reflects downloads up to 28 Jul 2024

