Abstract
Decision Trees (DTs) are predictive models in supervised learning, known not only for their utility across a wide range of applications but also for their interpretability and robustness. Research on the subject remains active almost 60 years after its inception, and in the last decade several researchers have tackled key open problems in the field. Although many excellent surveys have been published in the past, none covers the last decade of the field as a whole. This paper reviews the main recent advances in DT research, organized around three major goals of a predictive learner: fitting the training data, generalization, and interpretability. Moreover, by bringing together several topics that have previously been analyzed in isolation, this survey provides an overview of the field, its key concerns, and future trends, serving as an entry point for both researchers and newcomers to the machine learning community.
Data availability
Not applicable.
Code availability
Not applicable.
Acknowledgements
Both authors are grateful to the anonymous reviewers for their valuable suggestions and feedback.
Funding
This work was partially funded by the Brazilian research agencies CNPq—National Council for Scientific and Technological Development (Grant Number 306258/2019-6), FAPERJ—Foundation for Research Support of Rio de Janeiro State (Grant Number E-26/200.840/2021), and a Ph.D. Scholarship from CAPES (PROEX).
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Costa, V.G., Pedreira, C.E. Recent advances in decision trees: an updated survey. Artif Intell Rev 56, 4765–4800 (2023). https://doi.org/10.1007/s10462-022-10275-5