Exploiting Excessive Invariance caused by Norm-Bounded Adversarial Robustness

Jacobsen, Jörn-Henrik; Behrmann, Jens; Carlini, Nicholas; Tramèr, Florian; Papernot, Nicolas

arXiv:1903.10484 (cs)

[Submitted on 25 Mar 2019]

Abstract: Adversarial examples are malicious inputs crafted to cause a model to misclassify them. Their most common instantiation, "perturbation-based" adversarial examples, introduces changes to the input that leave its true label unchanged, yet result in a different model prediction. Conversely, "invariance-based" adversarial examples insert changes to the input that leave the model's prediction unaffected despite the underlying input's label having changed.
In this paper, we demonstrate that robustness to perturbation-based adversarial examples is not only insufficient for general robustness, but worse, it can also increase vulnerability of the model to invariance-based adversarial examples. In addition to analytical constructions, we empirically study vision classifiers with state-of-the-art robustness to perturbation-based adversaries constrained by an $\ell_p$ norm. We mount attacks that exploit excessive model invariance in directions relevant to the task, which are able to find adversarial examples within the $\ell_p$ ball. In fact, we find that classifiers trained to be $\ell_p$-norm robust are more vulnerable to invariance-based adversarial examples than their undefended counterparts.
Excessive invariance is not limited to models trained to be robust to perturbation-based $\ell_p$-norm adversaries. In fact, we argue that the term adversarial example is used to capture a series of model limitations, some of which may not have been discovered yet. Accordingly, we call for a set of precise definitions that taxonomize and address each of these shortcomings in learning.
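
The two failure modes defined in the abstract can be illustrated with a small toy sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the 2-D inputs, the hand-written `oracle` standing in for the true label, the two hypothetical models, and the choice of the $\ell_\infty$ norm with `eps = 1.5`. It only shows how the two definitions differ, and how a model that ignores a task-relevant direction can pass the perturbation check while failing the invariance check inside the same norm ball.

```python
# Toy illustration only: the oracle, models, points, and eps are hypothetical.

def oracle(x):
    """Stand-in for the true label: depends on both coordinates."""
    return int(x[0] + x[1] > 0)

def sensitive_model(x):
    """Hypothetical model that over-weights the second coordinate."""
    return int(x[0] + 3 * x[1] > 0)

def invariant_model(x):
    """Hypothetical model that ignores the second coordinate entirely."""
    return int(x[0] > 0)

def linf(a, b):
    """l_inf distance between two points."""
    return max(abs(ai - bi) for ai, bi in zip(a, b))

def is_perturbation_adversarial(model, x, x_adv, eps):
    """True label unchanged, model prediction changed, within the eps ball."""
    return (linf(x, x_adv) <= eps
            and oracle(x_adv) == oracle(x)
            and model(x_adv) != model(x))

def is_invariance_adversarial(model, x, x_adv, eps):
    """True label changed, model prediction unchanged, within the eps ball."""
    return (linf(x, x_adv) <= eps
            and oracle(x_adv) != oracle(x)
            and model(x_adv) == model(x))

eps = 1.5
x = (1.0, 0.0)             # clean input, both models agree with the oracle
x_perturb = (1.0, -0.5)    # oracle label stays the same
x_invariant = (1.0, -1.5)  # oracle label flips

# The over-sensitive model is fooled by the label-preserving change.
print(is_perturbation_adversarial(sensitive_model, x, x_perturb, eps))
# The coordinate-ignoring model is not fooled by that change...
print(is_perturbation_adversarial(invariant_model, x, x_perturb, eps))
# ...but it keeps its prediction when the true label flips.
print(is_invariance_adversarial(invariant_model, x, x_invariant, eps))
```

In this sketch the second model looks robust under the perturbation-based definition only because it is insensitive to a direction the oracle cares about, which is the kind of excessive invariance the abstract describes.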

Comments:	Accepted at the ICLR 2019 SafeML Workshop
Subjects:	Machine Learning (cs.LG); Cryptography and Security (cs.CR); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (stat.ML)
Cite as:	arXiv:1903.10484 [cs.LG]
	(or arXiv:1903.10484v1 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.1903.10484

Submission history

From: Nicholas Carlini
[v1] Mon, 25 Mar 2019 17:29:52 UTC (3,889 KB)
