Comprehensible classification models: a position paper

Published: 17 March 2014

Abstract

The vast majority of the literature evaluates the performance of classification models using only the criterion of predictive accuracy. This paper reviews the case for also considering the comprehensibility (interpretability) of classification models, and discusses the interpretability of five types of classification models: decision trees, classification rules, decision tables, nearest neighbors, and Bayesian network classifiers. We discuss both interpretability issues specific to each of those model types and two more generic issues: the drawbacks of using model size as the sole criterion for evaluating a model's comprehensibility, and the use of monotonicity constraints to improve the comprehensibility and user acceptance of classification models.
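The monotonicity constraints mentioned above can be made concrete with a small sketch. The code below is a hypothetical illustration (not from the paper): it audits a scoring model against a domain constraint such as "a higher income must never lower a predicted credit score", the kind of constraint the paper argues improves user acceptance of a model.

```python
# Hypothetical sketch of a monotonicity audit: flag example pairs where
# increasing one attribute *decreases* the model's predicted score,
# violating a stated domain constraint.

def monotonicity_violations(predict, examples, attr_index, step=1.0):
    """Return (x, x') pairs where raising attribute attr_index by `step`
    lowers the predicted score."""
    violations = []
    for x in examples:
        x_up = list(x)
        x_up[attr_index] += step
        if predict(x_up) < predict(list(x)):
            violations.append((tuple(x), tuple(x_up)))
    return violations

# Toy linear scorer: attribute 0 (income) raises the score,
# attribute 1 (debt) lowers it.
score = lambda x: 0.5 * x[0] - 0.3 * x[1]
examples = [(10, 2), (5, 1), (8, 4)]

print(monotonicity_violations(score, examples, 0))  # income is monotone -> []
print(monotonicity_violations(score, examples, 1))  # debt: every example flagged
```

A learner that enforces such constraints at training time, as in the monotone classification-tree and nearest-neighbor work the paper surveys, would satisfy the first check by construction rather than needing an audit after the fact.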



Published In

ACM SIGKDD Explorations Newsletter, Volume 15, Issue 1 (June 2013), 50 pages
ISSN: 1931-0145
EISSN: 1931-0153
DOI: 10.1145/2594473

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. Bayesian network classifiers
  2. decision table
  3. decision tree
  4. monotonicity constraint
  5. nearest neighbors
  6. rule induction

Qualifiers

  • Column
