Comprehensible classification models: a position paper

Published: 17 March 2014

Abstract

The vast majority of the literature evaluates the performance of classification models using only the criterion of predictive accuracy. This paper reviews the case for also considering the comprehensibility (interpretability) of classification models, and discusses the interpretability of five types of classification models: decision trees, classification rules, decision tables, nearest neighbors, and Bayesian network classifiers. We discuss both interpretability issues specific to each of those model types and two more generic issues: the drawbacks of using model size as the sole criterion for evaluating a model's comprehensibility, and the use of monotonicity constraints to improve the comprehensibility and user acceptance of classification models.
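The monotonicity constraints mentioned above can be made concrete with a small sketch. The code below is a hypothetical illustration (not from the paper): it audits a scoring model against a domain constraint such as "a higher income must never lower a predicted credit score", the kind of constraint the paper argues improves user acceptance of a model.

```python
# Hypothetical sketch of a monotonicity audit: flag example pairs where
# increasing one attribute *decreases* the model's predicted score,
# violating a stated domain constraint.

def monotonicity_violations(predict, examples, attr_index, step=1.0):
    """Return (x, x') pairs where raising attribute attr_index by `step`
    lowers the predicted score."""
    violations = []
    for x in examples:
        x_up = list(x)
        x_up[attr_index] += step
        if predict(x_up) < predict(list(x)):
            violations.append((tuple(x), tuple(x_up)))
    return violations

# Toy linear scorer: attribute 0 (income) raises the score,
# attribute 1 (debt) lowers it.
score = lambda x: 0.5 * x[0] - 0.3 * x[1]
examples = [(10, 2), (5, 1), (8, 4)]

print(monotonicity_violations(score, examples, 0))  # income is monotone -> []
print(monotonicity_violations(score, examples, 1))  # debt: every example flagged
```

A learner that enforces such constraints at training time, as in the monotone classification-tree and nearest-neighbor work the paper surveys, would satisfy the first check by construction rather than needing an audit after the fact.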



Published In

ACM SIGKDD Explorations Newsletter, Volume 15, Issue 1 (June 2013), 50 pages
ISSN: 1931-0145
EISSN: 1931-0153
DOI: 10.1145/2594473

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. Bayesian network classifiers
  2. decision table
  3. decision tree
  4. monotonicity constraint
  5. nearest neighbors
  6. rule induction

Qualifiers

  • Column
