Conceptual views on tree ensemble classifiers

Published: 01 August 2023

Abstract

Random Forests and related tree-based methods are popular for supervised learning from tabular data. Besides being easy to parallelize, they often achieve superior classification performance. This performance, however, comes at the cost of explainability. Statistical methods are commonly used to compensate for this disadvantage, yet their capacity for local, and in particular global, explanations is limited. In the present work we propose an algebraic method, rooted in lattice theory, for the (global) explanation of tree ensembles. In detail, we introduce two novel conceptual views on tree ensemble classifiers and demonstrate their explanatory capabilities on Random Forests that were trained with standard parameters.
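The lattice-theoretic machinery the abstract refers to is formal concept analysis (cf. the author tags below). As an illustrative aside, and not the paper's actual algorithm, the following sketch computes all formal concepts of a toy formal context via the two derivation operators; the context (`objects`, `attributes`, `incidence`) is an invented example.

```python
from itertools import combinations

# A toy formal context: objects, attributes, and an incidence relation
# stating which object has which attribute. Purely illustrative.
objects = {"g1", "g2", "g3"}
attributes = {"a", "b", "c"}
incidence = {("g1", "a"), ("g1", "b"), ("g2", "b"), ("g2", "c"), ("g3", "b")}

def intent(gs):
    """Derivation operator: attributes shared by all objects in gs."""
    return {m for m in attributes if all((g, m) in incidence for g in gs)}

def extent(ms):
    """Derivation operator: objects possessing all attributes in ms."""
    return {g for g in objects if all((g, m) in incidence for m in ms)}

def is_concept(gs, ms):
    """(gs, ms) is a formal concept iff the two operators map them to each other."""
    return intent(gs) == ms and extent(ms) == gs

# Enumerate all formal concepts by closing every attribute subset.
# (Naive enumeration; fine for toy contexts, exponential in general.)
concepts = set()
for r in range(len(attributes) + 1):
    for ms in combinations(sorted(attributes), r):
        ext = extent(set(ms))
        concepts.add((frozenset(ext), frozenset(intent(ext))))
```

For this context the closure yields four concepts, e.g. ({g1}, {a, b}) and the top concept ({g1, g2, g3}, {b}); ordered by extent inclusion these form the concept lattice on which such conceptual views are built.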



Published In

International Journal of Approximate Reasoning  Volume 159, Issue C
Aug 2023
345 pages

Publisher

Elsevier Science Inc.

United States

Author Tags

  1. Decision tree
  2. Random forest
  3. Ensemble classification
  4. Explanation
  5. Formal concept analysis
  6. Explainable AI

Qualifiers

  • Research-article
