DiG-IN: Diffusion Guidance for Investigating Networks -- Uncovering Classifier Differences Neuron Visualisations and Visual Counterfactual Explanations

Augustin, Maximilian; Neuhaus, Yannic; Hein, Matthias

arXiv:2311.17833 (cs)

[Submitted on 29 Nov 2023 (v1), last revised 12 Jul 2024 (this version, v3)]

Abstract: While deep learning has led to huge progress in complex image classification tasks like ImageNet, unexpected failure modes, e.g. via spurious features, call into question how reliably these classifiers work in the wild. Furthermore, for safety-critical tasks the black-box nature of their decisions is problematic, and explanations or at least methods which make decisions plausible are needed urgently. In this paper, we address these problems by generating images that optimize a classifier-derived objective using a framework for guided image generation. We analyze the decisions of image classifiers by visual counterfactual explanations (VCEs), detection of systematic mistakes by analyzing images where classifiers maximally disagree, and visualization of neurons and spurious features. In this way, we validate existing observations, e.g. the shape bias of adversarially robust models, as well as novel failure modes, e.g. systematic errors of zero-shot CLIP classifiers. Moreover, our VCEs outperform previous work while being more versatile.

Comments:	CVPR 2024
Subjects:	Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
Cite as:	arXiv:2311.17833 [cs.CV]
	(or arXiv:2311.17833v3 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2311.17833

Submission history

From: Maximilian Augustin
[v1] Wed, 29 Nov 2023 17:35:29 UTC (40,549 KB)
[v2] Sun, 2 Jun 2024 19:18:37 UTC (21,735 KB)
[v3] Fri, 12 Jul 2024 06:53:50 UTC (21,735 KB)
