Counterfactuals and Causability in Explainable Artificial Intelligence: Theory, Algorithms, and Applications

Chou, Yu-Liang; Moreira, Catarina; Bruza, Peter; Ouyang, Chun; Jorge, Joaquim

arXiv:2103.04244 (cs)

[Submitted on 7 Mar 2021 (v1), last revised 8 Jun 2021 (this version, v2)]

Abstract: There has been growing interest in model-agnostic methods that can make deep learning models more transparent and explainable to a user. Some researchers have recently argued that, for a machine to achieve a certain degree of human-level explainability, it needs to provide explanations that humans can understand causally, a property known as causability. A specific class of algorithms with the potential to provide causability is counterfactuals. This paper presents an in-depth systematic review of the diverse existing body of literature on counterfactuals and causability for explainable artificial intelligence. We performed an LDA topic modelling analysis under a PRISMA framework to find the most relevant articles. This analysis resulted in a novel taxonomy that considers the grounding theories of the surveyed algorithms, together with their underlying properties and applications to real-world data. This research suggests that current model-agnostic counterfactual algorithms for explainable AI are not grounded in a causal theoretical formalism and, consequently, cannot promote causability to a human decision-maker. Our findings suggest that the explanations derived from major algorithms in the literature provide spurious correlations rather than cause-effect relationships, leading to sub-optimal, erroneous, or even biased explanations. This paper also advances the literature with new directions and challenges for promoting causability in model-agnostic approaches to explainable artificial intelligence.

Subjects: Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
Cite as: arXiv:2103.04244 [cs.AI] (or arXiv:2103.04244v2 [cs.AI] for this version)
DOI: https://doi.org/10.48550/arXiv.2103.04244

Submission history

From: Catarina Moreira
[v1] Sun, 7 Mar 2021 03:11:39 UTC (5,796 KB)
[v2] Tue, 8 Jun 2021 06:50:02 UTC (6,059 KB)