
Survey on Explainable AI: Techniques, challenges and open issues

Published: 01 December 2024

Abstract

Artificial Intelligence (AI) has become an important component of many software applications and has reached the point where it informs complex and critical decisions in our lives. However, most successful AI-powered applications rely on black-box approaches (e.g., deep neural networks) that learn models capable of prediction and decision making. While these advanced models can achieve high accuracy, they are generally unable to explain their decisions (e.g., predictions) to users. As a result, there is a pressing need for explainable machine learning systems that can be trusted by governments, organizations, industries, and users. This paper classifies and compares the main findings in the domain of explainable machine learning and deep learning. We also discuss the application of Explainable AI (XAI) in sensitive domains such as cybersecurity. In addition, we characterize each reviewed article by the methods and techniques used to achieve XAI, which allows us to discern the strengths and limitations of existing XAI techniques. We conclude by discussing substantial challenges and future research directions related to XAI.
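
To make the idea of post-hoc, model-agnostic explanation concrete, the sketch below trains an opaque random-forest classifier and then estimates which input features drive its predictions via permutation feature importance. This is an illustrative example of the class of techniques the survey covers, not code from the paper; the dataset, model, and parameters are assumptions, and scikit-learn is assumed to be installed.

    # Minimal sketch: explaining a "black-box" classifier with a
    # model-agnostic, post-hoc technique (permutation feature importance).
    from sklearn.datasets import load_breast_cancer
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.inspection import permutation_importance
    from sklearn.model_selection import train_test_split

    X, y = load_breast_cancer(return_X_y=True, as_frame=True)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

    # Black-box model: accurate, but its internal decision logic is opaque.
    model = RandomForestClassifier(n_estimators=200, random_state=0)
    model.fit(X_train, y_train)

    # Post-hoc explanation: shuffle each feature and measure the drop in
    # held-out accuracy; larger drops indicate more influential features.
    result = permutation_importance(model, X_test, y_test,
                                    n_repeats=10, random_state=0)
    ranked = sorted(zip(X.columns, result.importances_mean),
                    key=lambda pair: -pair[1])
    for name, score in ranked[:5]:
        print(f"{name}: {score:.3f}")

The same pattern applies to any fitted estimator, which is what makes such techniques "model-agnostic" in the sense used throughout the survey.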

Highlights

A new taxonomy for describing and comparing the main recent findings in Explainable AI (XAI).
A detailed comparative analysis of the existing XAI literature.
Identification of open issues and challenges related to XAI.
Guidelines for improving XAI solutions to address current and continuing challenges.


Published In

Expert Systems with Applications: An International Journal, Volume 255, Issue PC
December 2024
1588 pages

Publisher

Pergamon Press, Inc.

United States

Publication History

Published: 01 December 2024

Author Tags

  1. Explainable artificial intelligence
  2. Machine learning
  3. Interpretability
  4. Trusted artificial intelligence

Qualifiers

  • Research-article
