DOI: 10.1145/3485832.3485904
Research Article · Open Access

Two Souls in an Adversarial Image: Towards Universal Adversarial Example Detection using Multi-view Inconsistency

Published: 06 December 2021
Abstract

    In the evasion attacks against deep neural networks (DNN), the attacker generates adversarial instances that are visually indistinguishable from benign samples and sends them to the target DNN to trigger misclassifications. In this paper, we propose a novel multi-view adversarial image detector, namely Argos, based on a novel observation. That is, there exist two “souls” in an adversarial instance, i.e., the visually unchanged content, which corresponds to the true label, and the added invisible perturbation, which corresponds to the misclassified label. Such inconsistencies could be further amplified through an autoregressive generative approach that generates images with seed pixels selected from the original image, a selected label, and pixel distributions learned from the training data. The generated images (i.e., the “views”) will deviate significantly from the original one if the label is adversarial, demonstrating inconsistencies that Argos expects to detect. To this end, Argos first amplifies the discrepancies between the visual content of an image and its misclassified label induced by the attack using a set of regeneration mechanisms and then identifies an image as adversarial if the reproduced views deviate to a preset degree. Our experimental results show that Argos significantly outperforms two representative adversarial detectors in both detection accuracy and robustness against six well-known adversarial attacks. Code is available at: https://github.com/sohaib730/Argos-Adversarial_Detection
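
    The abstract describes the core decision rule: regenerate "views" of an image conditioned on its predicted label and flag the input as adversarial when the views deviate too far from the original. Below is a minimal, hypothetical sketch of that rule only. The generator interface `regenerate_view`, the mean-squared-error deviation measure, the number of views, and the threshold are illustrative assumptions, not the actual Argos implementation (see the linked GitHub repository for the authors' code).

```python
import numpy as np

def detect_adversarial(image, predicted_label, regenerate_view,
                       num_views=4, threshold=0.1):
    """Flag `image` as adversarial if views regenerated under its predicted
    label deviate too much from the original (hypothetical sketch).

    `regenerate_view(image, label, seed)` stands in for a class-conditional
    autoregressive generator (e.g., a conditional PixelCNN) that re-draws the
    image from a few seed pixels under the given label. The interface, the
    MSE deviation measure, and the threshold are illustrative assumptions,
    not the Argos API.
    """
    deviations = []
    for seed in range(num_views):
        view = regenerate_view(image, predicted_label, seed)
        # Mean squared per-pixel distance between the regenerated view and
        # the original image.
        deviations.append(np.mean((view - image) ** 2))
    # Views conditioned on an adversarial (mismatched) label are expected to
    # drift away from the visual content, pushing the mean deviation up.
    return float(np.mean(deviations)) > threshold

# Toy usage with a stand-in generator that only adds small noise, so a benign
# image should not be flagged.
if __name__ == "__main__":
    rng = np.random.default_rng(0)
    img = rng.random((32, 32, 3)).astype(np.float32)

    def fake_generator(x, label, seed):
        noise = np.random.default_rng(seed).standard_normal(x.shape)
        return x + 0.01 * noise.astype(np.float32)

    print(detect_adversarial(img, predicted_label=3,
                             regenerate_view=fake_generator))
```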


    Cited By

    • (2023) Multi-View Multi-Task Campaign Embedding for Cold-Start Conversion Rate Forecasting. IEEE Transactions on Big Data, 9(1), 280-293. DOI: 10.1109/TBDATA.2022.3162150. Online publication date: 1-Feb-2023.
    • (2023) AI-Guardian: Defeating Adversarial Attacks using Backdoors. 2023 IEEE Symposium on Security and Privacy (SP), 701-718. DOI: 10.1109/SP46215.2023.10179473. Online publication date: May-2023.
    • (2022) MR2D: Multiple Random Masking Reconstruction Adversarial Detector. 2022 10th International Conference on Information Systems and Computing Technology (ISCTech), 61-67. DOI: 10.1109/ISCTech58360.2022.00016. Online publication date: Dec-2022.



            Published In

            ACSAC '21: Proceedings of the 37th Annual Computer Security Applications Conference
            December 2021
            1077 pages
            ISBN: 978-1-4503-8579-4
            DOI: 10.1145/3485832

            Publisher

            Association for Computing Machinery

            New York, NY, United States

            Publication History

            Published: 06 December 2021


            Author Tags

            1. Adversarial Attacks
            2. Adversarial Detection
            3. Deep Generative Modeling
            4. Deep Learning
            5. Multi-view Machine Learning

            Qualifiers

            • Research-article
            • Research
            • Refereed limited

            Conference

            ACSAC '21

            Acceptance Rates

            Overall acceptance rate: 104 of 497 submissions (21%)

            Article Metrics

            • Downloads (last 12 months): 188
            • Downloads (last 6 weeks): 30
            Reflects downloads up to 28 Jul 2024

