Uncovering the Limits of Adversarial Training against Norm-Bounded Adversarial Examples

Gowal, Sven; Qin, Chongli; Uesato, Jonathan; Mann, Timothy; Kohli, Pushmeet

arXiv:2010.03593 (stat)

[Submitted on 7 Oct 2020 (v1), last revised 30 Mar 2021 (this version, v3)]

Abstract: Adversarial training and its variants have become de facto standards for learning robust deep neural networks. In this paper, we explore the landscape around adversarial training in a bid to uncover its limits. We systematically study the effect of different training losses, model sizes, activation functions, the addition of unlabeled data (through pseudo-labeling) and other factors on adversarial robustness. We discover that it is possible to train robust models that go well beyond state-of-the-art results by combining larger models, Swish/SiLU activations and model weight averaging. We demonstrate large improvements on CIFAR-10 and CIFAR-100 against $\ell_\infty$ and $\ell_2$ norm-bounded perturbations of size $8/255$ and $128/255$, respectively. In the setting with additional unlabeled data, we obtain an accuracy under attack of 65.88% against $\ell_\infty$ perturbations of size $8/255$ on CIFAR-10 (+6.35% with respect to prior art). Without additional data, we obtain an accuracy under attack of 57.20% (+3.46%). To test the generality of our findings and without any additional modifications, we obtain an accuracy under attack of 80.53% (+7.62%) against $\ell_2$ perturbations of size $128/255$ on CIFAR-10, and of 36.88% (+8.46%) against $\ell_\infty$ perturbations of size $8/255$ on CIFAR-100. All models are available at this https URL.
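
The abstract names three ingredients that are easy to state concretely: norm-bounded perturbations, Swish/SiLU activations, and model weight averaging. The following is a minimal sketch of each in plain Python, not the authors' implementation. The function names are hypothetical, perturbations and weights are represented as flat lists of floats, weight averaging is assumed to take the form of an exponential moving average (the abstract does not say which form is used), and the `decay` default is a placeholder rather than a value from the paper.

```python
import math

EPS_LINF = 8 / 255    # l_inf radius quoted in the abstract
EPS_L2 = 128 / 255    # l_2 radius quoted in the abstract


def project_linf(delta, eps=EPS_LINF):
    """Clip every coordinate of the perturbation into [-eps, eps]."""
    return [max(-eps, min(eps, d)) for d in delta]


def project_l2(delta, eps=EPS_L2):
    """Rescale the perturbation so its Euclidean norm is at most eps."""
    norm = math.sqrt(sum(d * d for d in delta))
    if norm <= eps:
        return list(delta)
    return [d * eps / norm for d in delta]


def silu(x):
    """Swish/SiLU activation, x * sigmoid(x), written to avoid overflow."""
    if x >= 0:
        return x / (1.0 + math.exp(-x))
    e = math.exp(x)
    return x * e / (1.0 + e)


def ema_update(avg_weights, new_weights, decay=0.995):
    """One exponential-moving-average step over a flat list of weights.

    Assumption: EMA is used here as one common form of weight averaging;
    the decay default is illustrative only.
    """
    return [decay * a + (1.0 - decay) * w
            for a, w in zip(avg_weights, new_weights)]
```

In an adversarial training loop, a projection of this kind would typically be applied after each attack step so the perturbation stays inside the allowed norm ball, and the averaged weights (not the raw ones) would be the ones evaluated.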

Comments:	Fixed minor formatting issues and added link to models
Subjects:	Machine Learning (stat.ML); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
Cite as:	arXiv:2010.03593 [stat.ML]
	(or arXiv:2010.03593v3 [stat.ML] for this version)
	https://doi.org/10.48550/arXiv.2010.03593

Submission history

From: Sven Gowal
[v1] Wed, 7 Oct 2020 18:19:09 UTC (267 KB)
[v2] Tue, 27 Oct 2020 16:28:20 UTC (1,452 KB)
[v3] Tue, 30 Mar 2021 08:08:12 UTC (1,509 KB)
