Explaining black-box classifiers: Properties and functions

Published: 01 April 2023

Abstract

Explaining black-box classification models is a hot topic in AI, with the overall goal of improving trust in the decisions such models make. Much work has been devoted to the problem and diverse explanation functions have been proposed, but their formal properties and the links between them have not been sufficiently studied. This paper makes four contributions. The first is an investigation of global explanations of black-box classifiers: we provide a formal and unifying framework in which such explanations are defined from the whole feature space. The framework is based on two concepts, seen as two types of global explanations: arguments in favour of (or pro) predictions and arguments against (or con) predictions. The second contribution defines various types of local explanations (abductive explanations, counterfactuals, contrastive explanations) from the whole feature space, investigates their properties, links, and differences, and shows how they relate to global explanations. The third contribution analyses and defines explanation functions that generate (global, local) abductive explanations from incomplete information, i.e., from a subset of the feature space. We begin by proposing two desirable properties that an explainer should satisfy: success, which ensures that explanations exist, and coherence, which ensures that they are correct. We show that in the incomplete case the two properties cannot be satisfied together. The fourth contribution proposes two functions that generate abductive explanations and satisfy coherence at the expense of success.
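
To make these notions concrete, the following is a minimal sketch on a toy boolean feature space; the classifier, feature names, and function names are assumptions made for illustration, not the paper's formalism. It computes an abductive explanation as a minimal set of the instance's feature-value pairs that is sufficient for its prediction over the whole space, and a counterfactual as a closest instance receiving a different prediction.

# Illustrative sketch only: predict, abductive_explanation, and
# counterfactual are hypothetical names, not the paper's definitions.
from itertools import combinations, product

FEATURES = ["f1", "f2", "f3"]

def predict(x):
    # Toy black-box: predicts 1 iff f1 and f2 both hold.
    return int(x["f1"] and x["f2"])

# The whole (toy) feature space: all 2^3 boolean instances.
SPACE = [dict(zip(FEATURES, bits)) for bits in product([0, 1], repeat=3)]

def abductive_explanation(x):
    # Smallest subset of x's feature-value pairs that forces predict(x)
    # on every instance of the space that agrees with it.
    label = predict(x)
    for k in range(1, len(FEATURES) + 1):
        for subset in combinations(FEATURES, k):
            if all(predict(y) == label
                   for y in SPACE
                   if all(y[f] == x[f] for f in subset)):
                return {f: x[f] for f in subset}

def counterfactual(x):
    # Differently-classified instance changing as few features as possible.
    rivals = [y for y in SPACE if predict(y) != predict(x)]
    return min(rivals, key=lambda y: sum(y[f] != x[f] for f in FEATURES))

x = {"f1": 1, "f2": 1, "f3": 0}
print(abductive_explanation(x))  # {'f1': 1, 'f2': 1}: f3 is irrelevant
print(counterfactual(x))         # flips a single decisive feature

The features on which the counterfactual differs from x give the contrastive part of the explanation. The abstract's impossibility result concerns the setting where only a subset of SPACE is available: there, sufficiency checks like the one above can no longer be guaranteed correct, which is the tension between success and coherence.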



Published In

International Journal of Approximate Reasoning, Volume 155, Issue C, April 2023, 145 pages.

Publisher

Elsevier Science Inc., United States

Author Tags

1. Classification
2. Explainability
3. Arguments
