The Clever Hans Effect in Unsupervised Learning

Kauffmann, Jacob; Dippel, Jonas; Ruff, Lukas; Samek, Wojciech; Müller, Klaus-Robert; Montavon, Grégoire

arXiv:2408.08041 (cs)

[Submitted on 15 Aug 2024]

Abstract: Unsupervised learning has become an essential building block of AI systems. The representations it produces, e.g. in foundation models, are critical to a wide variety of downstream applications. It is therefore important to carefully examine unsupervised models to ensure not only that they produce accurate predictions, but also that these predictions are not "right for the wrong reasons", the so-called Clever Hans (CH) effect. Using specially developed Explainable AI techniques, we show for the first time that CH effects are widespread in unsupervised learning. Our empirical findings are enriched by theoretical insights, which interestingly point to inductive biases in the unsupervised learning machine as a primary source of CH effects. Overall, our work sheds light on unexplored risks associated with practical applications of unsupervised learning and suggests ways to make unsupervised learning more robust.

Comments:	12 pages + supplement
Subjects:	Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Machine Learning (stat.ML)
Cite as:	arXiv:2408.08041 [cs.LG]
	(or arXiv:2408.08041v1 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2408.08041

Submission history

From: Grégoire Montavon
[v1] Thu, 15 Aug 2024 09:19:42 UTC (11,827 KB)
