What's inside the black-box?: a genetic programming method for interpreting complex machine learning models

Published: 13 July 2019

Abstract

Interpreting state-of-the-art machine learning algorithms can be difficult. For example, why does a complex ensemble predict a particular class? Existing approaches to interpretable machine learning tend to be local in their explanations, applicable only to a particular algorithm, or overly complex in their global explanations. In this work, we propose a global model extraction method that uses multi-objective genetic programming to construct accurate, simple, and model-agnostic representations of complex black-box estimators. We found that the resulting representations are far simpler than those of existing approaches while providing comparable reconstructive performance. We demonstrate this on a range of datasets by approximating the knowledge of complex black-box models, such as 200-layer neural networks and ensembles of 500 trees, with a single tree.
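
The trade-off between reconstructive performance and simplicity described in the abstract can be illustrated with a small Pareto-selection sketch. This is not the paper's implementation: the abstract does not say how the two objectives are measured, so the example assumes reconstruction error is the fraction of black-box predictions a candidate tree fails to reproduce, and complexity is the candidate's node count. The names `Candidate`, `reconstruction_error`, `dominates`, and `pareto_front`, and all the numbers, are hypothetical.

```python
from typing import List, NamedTuple, Sequence


class Candidate(NamedTuple):
    """A hypothetical candidate tree scored on two objectives (both minimised)."""
    name: str
    error: float  # assumed: share of black-box predictions not reproduced
    size: int     # assumed: number of nodes in the candidate tree


def reconstruction_error(black_box_preds: Sequence, surrogate_preds: Sequence) -> float:
    """Fraction of inputs on which the surrogate disagrees with the black box."""
    if len(black_box_preds) != len(surrogate_preds) or not black_box_preds:
        raise ValueError("prediction sequences must be non-empty and equal length")
    mismatches = sum(b != s for b, s in zip(black_box_preds, surrogate_preds))
    return mismatches / len(black_box_preds)


def dominates(a: Candidate, b: Candidate) -> bool:
    """True if `a` is no worse than `b` on both objectives and better on one."""
    no_worse = a.error <= b.error and a.size <= b.size
    strictly_better = a.error < b.error or a.size < b.size
    return no_worse and strictly_better


def pareto_front(candidates: List[Candidate]) -> List[Candidate]:
    """Keep the candidates that no other candidate dominates."""
    return [c for c in candidates
            if not any(dominates(other, c) for other in candidates)]


# Illustrative, made-up scores for five candidate trees.
population = [
    Candidate("A", error=0.10, size=31),
    Candidate("B", error=0.12, size=9),
    Candidate("C", error=0.12, size=15),  # same error as B but larger
    Candidate("D", error=0.30, size=3),
    Candidate("E", error=0.35, size=5),   # worse than D on both objectives
]
front = pareto_front(population)
front_names = [c.name for c in front]
```

Here the front contains A, B, and D: each gives up some fidelity to the black box in exchange for a smaller tree, and the choice among them is left to the user. A full multi-objective genetic programming run would also need variation operators and a selection scheme, which this sketch leaves out.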

    Published In

    GECCO '19: Proceedings of the Genetic and Evolutionary Computation Conference
    July 2019
    1545 pages
    ISBN:9781450361118
    DOI:10.1145/3321707
    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. evolutionary multi-objective optimisation
    2. explainable artificial intelligence
    3. interpretable machine learning

    Qualifiers

    • Research-article

    Conference

    GECCO '19
    GECCO '19: Genetic and Evolutionary Computation Conference
    July 13 - 17, 2019
    Prague, Czech Republic

    Acceptance Rates

    Overall Acceptance Rate 1,669 of 4,410 submissions, 38%
