Conceptual views on tree ensemble classifiers

Published: 01 August 2023

Abstract

Random Forests and related tree-based methods are popular for supervised learning from tabular data. Besides being easy to parallelize, they often achieve superior classification performance. This performance, however, comes at the cost of explainability. Statistical methods are commonly used to compensate for this disadvantage, yet their capacity for local, and in particular global, explanations is limited. In the present work we propose an algebraic method, rooted in lattice theory, for the (global) explanation of tree ensembles. In detail, we introduce two novel conceptual views on tree ensemble classifiers and demonstrate their explanatory capabilities on Random Forests that were trained with standard parameters.
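The lattice-theoretic machinery the abstract refers to is formal concept analysis (cf. the author tags below). As an illustrative aside, and not the paper's actual algorithm, the following sketch computes all formal concepts of a toy formal context via the two derivation operators; the context (`objects`, `attributes`, `incidence`) is an invented example.

```python
from itertools import combinations

# A toy formal context: objects, attributes, and an incidence relation
# stating which object has which attribute. Purely illustrative.
objects = {"g1", "g2", "g3"}
attributes = {"a", "b", "c"}
incidence = {("g1", "a"), ("g1", "b"), ("g2", "b"), ("g2", "c"), ("g3", "b")}

def intent(gs):
    """Derivation operator: attributes shared by all objects in gs."""
    return {m for m in attributes if all((g, m) in incidence for g in gs)}

def extent(ms):
    """Derivation operator: objects possessing all attributes in ms."""
    return {g for g in objects if all((g, m) in incidence for m in ms)}

def is_concept(gs, ms):
    """(gs, ms) is a formal concept iff the two operators map them to each other."""
    return intent(gs) == ms and extent(ms) == gs

# Enumerate all formal concepts by closing every attribute subset.
# (Naive enumeration; fine for toy contexts, exponential in general.)
concepts = set()
for r in range(len(attributes) + 1):
    for ms in combinations(sorted(attributes), r):
        ext = extent(set(ms))
        concepts.add((frozenset(ext), frozenset(intent(ext))))
```

For this context the closure yields four concepts, e.g. ({g1}, {a, b}) and the top concept ({g1, g2, g3}, {b}); ordered by extent inclusion these form the concept lattice on which such conceptual views are built.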



Published In

International Journal of Approximate Reasoning  Volume 159, Issue C
Aug 2023
345 pages

Publisher

Elsevier Science Inc.

United States

Author Tags

  1. Decision tree
  2. Random forest
  3. Ensemble classification
  4. Explanation
  5. Formal concept analysis
  6. Explainable AI

Qualifiers

  • Research-article
