Improving Adversarial Robustness via Joint Classification and Multiple Explicit Detection Classes

Baharlouei, Sina; Sheikholeslami, Fatemeh; Razaviyayn, Meisam; Kolter, Zico

arXiv:2210.14410 (cs)

[Submitted on 26 Oct 2022 (v1), last revised 10 May 2023 (this version, v2)]

Abstract: This work concerns the development of deep networks that are certifiably robust to adversarial attacks. Joint robust classification-detection was recently introduced as a certified defense mechanism, where adversarial examples are either correctly classified or assigned to the "abstain" class. In this work, we show that such a provable framework can benefit by extension to networks with multiple explicit abstain classes, where the adversarial examples are adaptively assigned to those. We show that naively adding multiple abstain classes can lead to "model degeneracy", then we propose a regularization approach and a training method to counter this degeneracy by promoting full use of the multiple abstain classes. Our experiments demonstrate that the proposed approach consistently achieves favorable standard vs. robust verified accuracy tradeoffs, outperforming state-of-the-art algorithms for various choices of number of abstain classes.
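
The decision rule described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the logit layout (real classes first, abstain classes last) and the names `predict_with_abstain`, `is_robustly_handled`, and `ABSTAIN` are assumptions made for illustration. It shows only the inference-time rule for a single input; certification, which requires bounding the network's outputs over every allowed perturbation, and the proposed regularization are not covered.

```python
from typing import Sequence, Union

ABSTAIN = "abstain"  # hypothetical marker for "input flagged as adversarial"


def predict_with_abstain(logits: Sequence[float], num_classes: int) -> Union[int, str]:
    """Return the predicted class index, or ABSTAIN if an abstain class wins.

    Assumed layout: the first `num_classes` logits are real classes and
    every remaining logit is one of the explicit abstain (detection) classes.
    """
    if not 0 < num_classes <= len(logits):
        raise ValueError("num_classes must be between 1 and len(logits)")
    # Index of the largest logit (first index wins on ties).
    top = max(range(len(logits)), key=lambda i: logits[i])
    return top if top < num_classes else ABSTAIN


def is_robustly_handled(logits: Sequence[float], num_classes: int, true_label: int) -> bool:
    """A perturbed input is handled if it is classified correctly or detected."""
    outcome = predict_with_abstain(logits, num_classes)
    return outcome == true_label or outcome == ABSTAIN
```

With three real classes and two abstain classes, `predict_with_abstain([0.1, 0.2, 0.3, 1.5, 0.4], 3)` abstains because the largest logit belongs to an abstain class. Which of the abstain classes wins does not change the outcome; under this reading, the extra classes give the network more ways to reach the same "detected" result.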

Comments:	20 pages, 6 figures
Subjects:	Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
Cite as:	arXiv:2210.14410 [cs.CV]
	(or arXiv:2210.14410v2 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2210.14410
Journal reference:	International Conference on Artificial Intelligence and Statistics, PMLR 2023

Submission history

From: Sina Baharlouei
[v1] Wed, 26 Oct 2022 01:23:33 UTC (2,470 KB)
[v2] Wed, 10 May 2023 22:33:51 UTC (6,525 KB)
