Explaining black-box classifiers: Properties and functions

Published: 01 April 2023

Abstract

Explaining black-box classification models is a hot topic in AI, with the overall goal of improving trust in the decisions such models make. Much work has been devoted to the problem and diverse explanation functions have been proposed, but their formal properties and the links between them have not been sufficiently studied. This paper makes four contributions. The first is an investigation of global explanations of black-box classifiers: we provide a formal and unifying framework in which such explanations are defined from the whole feature space. The framework is based on two concepts, seen as two types of global explanations: arguments in favour of (or pro) predictions and arguments against (or con) predictions. The second contribution defines various types of local explanations (abductive explanations, counterfactuals, contrastive explanations) from the whole feature space, investigates their properties, links, and differences, and shows how they relate to global explanations. The third contribution analyses and defines explanation functions that generate (global, local) abductive explanations from incomplete information, i.e., from a subset of the feature space. We begin by proposing two desirable properties that an explainer should satisfy: success, which ensures that explanations exist, and coherence, which ensures that they are correct. We show that in the incomplete case the two properties cannot be satisfied together. The fourth contribution proposes two functions that generate abductive explanations and satisfy coherence at the expense of success.
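
To make these notions concrete, the following is a minimal sketch on a toy boolean feature space; the classifier, feature names, and function names are assumptions made for illustration, not the paper's formalism. It computes an abductive explanation as a minimal set of the instance's feature-value pairs that is sufficient for its prediction over the whole space, and a counterfactual as a closest instance receiving a different prediction.

# Illustrative sketch only: predict, abductive_explanation, and
# counterfactual are hypothetical names, not the paper's definitions.
from itertools import combinations, product

FEATURES = ["f1", "f2", "f3"]

def predict(x):
    # Toy black-box: predicts 1 iff f1 and f2 both hold.
    return int(x["f1"] and x["f2"])

# The whole (toy) feature space: all 2^3 boolean instances.
SPACE = [dict(zip(FEATURES, bits)) for bits in product([0, 1], repeat=3)]

def abductive_explanation(x):
    # Smallest subset of x's feature-value pairs that forces predict(x)
    # on every instance of the space that agrees with it.
    label = predict(x)
    for k in range(1, len(FEATURES) + 1):
        for subset in combinations(FEATURES, k):
            if all(predict(y) == label
                   for y in SPACE
                   if all(y[f] == x[f] for f in subset)):
                return {f: x[f] for f in subset}

def counterfactual(x):
    # Differently-classified instance changing as few features as possible.
    rivals = [y for y in SPACE if predict(y) != predict(x)]
    return min(rivals, key=lambda y: sum(y[f] != x[f] for f in FEATURES))

x = {"f1": 1, "f2": 1, "f3": 0}
print(abductive_explanation(x))  # {'f1': 1, 'f2': 1}: f3 is irrelevant
print(counterfactual(x))         # flips a single decisive feature

The features on which the counterfactual differs from x give the contrastive part of the explanation. The abstract's impossibility result concerns the setting where only a subset of SPACE is available: there, sufficiency checks like the one above can no longer be guaranteed correct, which is the tension between success and coherence.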



Published In

International Journal of Approximate Reasoning, Volume 155, Issue C, April 2023, 145 pages.

Publisher

Elsevier Science Inc., United States

Author Tags

1. Classification
2. Explainability
3. Arguments
