DOI: 10.1145/3128572.3140444

Research article | Public Access

Adversarial Examples Are Not Easily Detected: Bypassing Ten Detection Methods

Published: 03 November 2017

Abstract

Neural networks are known to be vulnerable to adversarial examples: inputs that are close to natural inputs but classified incorrectly. In order to better understand the space of adversarial examples, we survey ten recent proposals that are designed for detection and compare their efficacy. We show that all can be defeated by constructing new loss functions. We conclude that adversarial examples are significantly harder to detect than previously appreciated, and the properties believed to be intrinsic to adversarial examples are in fact not. Finally, we propose several simple guidelines for evaluating future proposed defenses.
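
As a concrete illustration of "constructing new loss functions," the sketch below folds a detector into the attacker's objective: the perturbation is optimized so the classifier outputs a chosen target class, the detector's "adversarial" score stays low, and the input stays close to the original. This is a minimal PyTorch-style sketch under assumed placeholder models (classifier, detector); it is not the paper's exact attack, which builds on the Carlini-Wagner formulation.

    # Hedged sketch of a detector-aware adversarial attack (illustrative only).
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    # Placeholder models (assumptions for illustration, not from the paper).
    classifier = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))  # class logits
    detector = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 1))     # logit: "input is adversarial"

    def detector_aware_attack(x, target, steps=100, lr=1e-2, c=1.0):
        """Search for x_adv near x that the classifier assigns to `target`
        while the detector's "adversarial" logit is driven down."""
        delta = torch.zeros_like(x, requires_grad=True)
        opt = torch.optim.Adam([delta], lr=lr)
        for _ in range(steps):
            x_adv = torch.clamp(x + delta, 0.0, 1.0)
            cls_loss = F.cross_entropy(classifier(x_adv), target)  # reach the target class
            det_loss = F.softplus(detector(x_adv)).mean()          # keep the detector's score low
            dist = (delta ** 2).sum()                              # stay close to the original input
            loss = dist + c * (cls_loss + det_loss)
            opt.zero_grad()
            loss.backward()
            opt.step()
        return torch.clamp(x + delta, 0.0, 1.0).detach()

    # Example usage on a random MNIST-shaped input:
    # x = torch.rand(1, 1, 28, 28); target = torch.tensor([3])
    # x_adv = detector_aware_attack(x, target)

The constant c here is an assumed knob that trades off distortion against success at fooling both the classifier and the detector; in practice such constants are typically tuned per input rather than fixed.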




      Published In

      AISec '17: Proceedings of the 10th ACM Workshop on Artificial Intelligence and Security
      November 2017
      140 pages
      ISBN:9781450352024
      DOI:10.1145/3128572
      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      Published: 03 November 2017


      Qualifiers

      • Research-article

      Funding Sources

      • Qualcomm
      • AFOSR
      • Intel
      • Hewlett Foundation

      Conference

      CCS '17

      Acceptance Rates

AISec '17 Paper Acceptance Rate: 11 of 36 submissions, 31%
Overall Acceptance Rate: 94 of 231 submissions, 41%


Article Metrics

• Downloads (Last 12 months): 923
• Downloads (Last 6 weeks): 64
Reflects downloads up to 18 Aug 2024

Cited By
• (2024) Adversarial Example Detection Using Robustness against Multi-Step Gaussian Noise. Proceedings of the 2024 13th International Conference on Software and Computer Applications, 178-184. https://doi.org/10.1145/3651781.3651808. Online publication date: 1-Feb-2024.
• (2024) Trustworthy Distributed AI Systems: Robustness, Privacy, and Governance. ACM Computing Surveys. https://doi.org/10.1145/3645102. Online publication date: 7-Feb-2024.
• (2024) Towards Robust Domain Generation Algorithm Classification. Proceedings of the 19th ACM Asia Conference on Computer and Communications Security, 2-18. https://doi.org/10.1145/3634737.3656287. Online publication date: 1-Jul-2024.
• (2024) Attack as Detection: Using Adversarial Attack Methods to Detect Abnormal Examples. ACM Transactions on Software Engineering and Methodology 33(3), 1-45. https://doi.org/10.1145/3631977. Online publication date: 15-Mar-2024.
• (2024) The Path to Defence: A Roadmap to Characterising Data Poisoning Attacks on Victim Models. ACM Computing Surveys 56(7), 1-39. https://doi.org/10.1145/3627536. Online publication date: 9-Apr-2024.
• (2024) Adversarial Robustness and Explainability of Machine Learning Models. Practice and Experience in Advanced Research Computing 2024: Human Powered Computing, 1-7. https://doi.org/10.1145/3626203.3670522. Online publication date: 17-Jul-2024.
• (2024) Secure and Trustworthy Artificial Intelligence-extended Reality (AI-XR) for Metaverses. ACM Computing Surveys 56(7), 1-38. https://doi.org/10.1145/3614426. Online publication date: 9-Apr-2024.
• (2024) Regulation of Algorithmic Collusion. Proceedings of the Symposium on Computer Science and Law, 98-108. https://doi.org/10.1145/3614407.3643706. Online publication date: 12-Mar-2024.
• (2024) Detection Defenses: An Empty Promise against Adversarial Patch Attacks on Optical Flow. 2024 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), 6475-6484. https://doi.org/10.1109/WACV57701.2024.00636. Online publication date: 3-Jan-2024.
• (2024) Simple Post-Training Robustness using Test Time Augmentations and Random Forest. 2024 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), 3984-3994. https://doi.org/10.1109/WACV57701.2024.00395. Online publication date: 3-Jan-2024.
