
Incremental Learning of Linear Model Trees

Published: 01 November 2005
    Abstract

    A linear model tree is a decision tree with a linear functional model in each leaf. Previous model tree induction algorithms have been batch techniques that operate on the entire training set. However, there are many situations in which an incremental learner is advantageous. In this article a new batch model tree learner is described, with two alternative splitting rules and a stopping rule. An incremental algorithm is then developed that has many similarities with the batch version but is able to process examples one at a time. An online pruning rule is also developed. The incremental training time for an example is shown to depend only on the height of the tree induced so far, and not on the number of previous examples. The algorithms are evaluated empirically on a number of standard datasets, a simple test function, and three dynamic domains ranging from a simple pendulum to a complex 13-dimensional flight simulator. The new batch algorithm is compared with the most recent batch model tree algorithms and is seen to perform favourably overall. The new incremental model tree learner compares well with an alternative online function approximator. In addition, it can sometimes perform almost as well as the batch model tree algorithms, highlighting the effectiveness of the incremental implementation.
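
    To make the setting concrete, the sketch below shows one minimal way such a learner can be organised: each leaf holds a linear model fitted by recursive least squares (RLS), every example is routed to a single leaf in time proportional to the height of the tree, and a leaf splits once it has seen a fixed number of examples. This is only an illustrative sketch under those assumptions, not the algorithms of the article, which use statistical splitting, stopping, and pruning rules; the names RLSModel, IncrementalModelTree, min_n, and max_depth are hypothetical.

```python
# Minimal sketch of an incremental linear model tree (illustrative only;
# NOT the article's algorithm). Assumptions: RLS leaf models, k-d-tree
# style splits at the running mean after a fixed number of examples.
import numpy as np

class RLSModel:
    """Linear model with intercept, updated one example at a time."""
    def __init__(self, d, prior=1e3):
        self.w = np.zeros(d + 1)          # weights; last entry is the bias
        self.P = np.eye(d + 1) * prior    # running inverse-covariance estimate
        self.n = 0                        # examples seen by this model

    def update(self, x, y):
        z = np.append(x, 1.0)             # augment input with a bias term
        err = y - self.w @ z              # a-priori prediction error
        Pz = self.P @ z
        k = Pz / (1.0 + z @ Pz)           # RLS gain vector
        self.w += k * err
        self.P -= np.outer(k, Pz)
        self.n += 1

    def predict(self, x):
        return self.w @ np.append(x, 1.0)

class Node:
    def __init__(self, d, depth=0):
        self.model = RLSModel(d)
        self.depth = depth
        self.split_dim = None             # None marks a leaf
        self.split_val = 0.0
        self.left = self.right = None
        self.xsum = np.zeros(d)           # running input sums, for split points

class IncrementalModelTree:
    def __init__(self, d, min_n=100, max_depth=4):
        self.d, self.min_n, self.max_depth = d, min_n, max_depth
        self.root = Node(d)

    def _leaf(self, x):
        # Routing cost depends only on the current height of the tree,
        # not on the number of examples processed so far.
        node = self.root
        while node.split_dim is not None:
            node = node.left if x[node.split_dim] <= node.split_val else node.right
        return node

    def update(self, x, y):
        leaf = self._leaf(x)
        leaf.model.update(x, y)
        leaf.xsum += x
        if leaf.model.n >= self.min_n and leaf.depth < self.max_depth:
            j = leaf.depth % self.d                       # round-robin split axis
            leaf.split_dim = j
            leaf.split_val = leaf.xsum[j] / leaf.model.n  # split at the mean
            leaf.left = Node(self.d, leaf.depth + 1)
            leaf.right = Node(self.d, leaf.depth + 1)

    def predict(self, x):
        return self._leaf(x).model.predict(x)

# Stream a piecewise-linear target, y = |x|, one example at a time.
rng = np.random.default_rng(0)
tree = IncrementalModelTree(d=1)
for _ in range(5000):
    x = rng.uniform(-1.0, 1.0, size=1)
    tree.update(x, abs(x[0]) + rng.normal(0.0, 0.01))
print(tree.predict(np.array([0.5])))      # approaches 0.5 once splits isolate x > 0
```

    The fixed split schedule keeps the sketch short; what it does illustrate is the complexity claim from the abstract: the cost of update per example is one root-to-leaf descent plus a constant-size model update, independent of the number of previous examples.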

    References

    [1]
    Alexander, W., & Grimshaw, S. (1996). Treed regression. Journal of Computational and Graphical Statistics, 5, 156-175.
    [2]
    Atkeson, C., Moore, A., & Schaal, S. (1997). Locally weighted learning. Artificial Intelligence Review, 11, 11-73.
    [3]
    Breiman, L., Friedman, J., Olshen, R., & Stone, C. (1984). Classification and regression trees. Wadsworth.
    [4]
    Cestnik, B., & Bratko, I. (1991). On estimating probabilities in tree pruning. In Y. Kodratoff (Ed.), Proceedings of the European Working Session on Learning, vol. 482 of Lecture Notes in Artificial Intelligence (pp. 138-150). Springer.
    [5]
    Chaudhuri, P., Huang, M., Loh, W., & Yao, R. (1994). Piecewise-polynomial regression trees. Statistica Sinica, 4, 143-167.
    [6]
    Chow, G. (1960). Tests of equality between sets of coefficients in two linear regressions. Econometrica, 28:3, 591-605.
    [7]
    Dobra, A., & Gehrke, J. (2002). SECRET: A scalable linear regression tree algorithm. In Proceedings of the 8th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (pp. 481-487). ACM.
    [8]
    Frank, E., Wang, Y., Inglis, S., Holmes, G., & Witten, I. (1998). Using model trees for classification. Machine Learning, 32:1, 63-76.
    [9]
    Gama, J. (2004). Functional trees. Machine Learning, 55:3, 219-250.
    [10]
    Gama, J., Rocha, R., & Medas, P. (2003). Accurate decision trees for mining high-speed data streams. In Proceedings of the 9th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (pp. 523-528). ACM.
    [11]
    Hastie, T., & Loader, C. (1993). Local regression: Automatic kernel carpentry. Statistical Science, 8:2, 120-143.
    [12]
    Haykin, S. (2002). Adaptive Filter Theory. Prentice-Hall.
    [13]
    Isaac, A., & Sammut, C. (2003). Goal-directed learning to fly. In T. Fawcett & N. Mishra (Eds.), Proceedings of the 20th International Conference on Machine Learning (pp. 258-265). AAAI Press.
    [14]
    Karalič, A. (1992). Employing linear regression in regression tree leaves. In B. Neumann (Ed.), Proceedings of the 10th European Conference on Artificial Intelligence (pp. 440-441). Wiley.
    [15]
    Kullback, S., & Rosenblatt, H. (1957). On the analysis of multiple regression in k categories. Biometrika, 44, 67-83.
    [16]
    Last, M. (2002). Online classification of non-stationary data streams. Intelligent Data Analysis, 6, 129-147.
    [17]
    Li, K., Lue, H., & Chen, C. (2000). Interactive tree-structured regression via principal Hessian directions. Journal of the American Statistical Association, 95, 547-560.
    [18]
    Ljung, L. (1987). System Identification: Theory for the User. Prentice-Hall.
    [19]
    Loh, W. (2002). Regression trees with unbiased variable selection and interaction detection. Statistica Sinica, 12, 361-386.
    [20]
    Malerba, D., Esposito, F., Ceci, M., & Appice, A. (2004). Top-down induction of model trees with regression and splitting nodes. IEEE Transactions on Pattern Analysis and Machine Intelligence, 26, 612-625.
    [21]
    Mehta, M., Agrawal, R., & Rissanen, J. (1996). SLIQ: A fast scalable classifier for data mining. In P. Apers, M. Bouzeghoub, & G. Gardarin (Eds.), Proceedings of the 5th International Conference on Extending Database Technology, vol. 1057 of Lecture Notes in Computer Science (pp. 18-32). Springer.
    [22]
    Moore, A., & Atkeson, C. (1995). The parti-game algorithm for variable resolution reinforcement learning in multidimensional state-spaces. Machine Learning, 21, 199-233.
    [23]
    Munos, R., & Moore, A. (2002). Variable resolution discretization in optimal control. Machine Learning, 49, 291-323.
    [24]
    Murray-Smith, R. (1994). A local model network approach to nonlinear modelling. Ph.D. thesis, University of Strathclyde, Strathclyde, UK.
    [25]
    Nakanishi, J., Farrell, J., & Schaal, S. (2004). Learning composite adaptive control for a class of nonlinear systems. In IEEE International Conference on Robotics and Automation (pp. 2647-2652).
    [26]
    Nelles, O. (2001). Nonlinear System Identification. Springer.
    [27]
    Potts, D. (2004a). Fast incremental learning of linear model trees. In J. Carbonell & J. Siekmann (Eds.), Proceedings of the 8th Pacific Rim International Conference on Artificial Intelligence, vol. 3157 of Lecture Notes in Artificial Intelligence (pp. 221-230). Springer.
    [28]
    Potts, D. (2004b). Incremental learning of linear model trees. In R. Greiner & D. Schuurmans (Eds.), Proceedings of the 21st International Conference on Machine Learning (pp. 663-670). ACM.
    [29]
    Quinlan, J. (1993a). C4.5: Programs for Machine Learning. Morgan Kaufmann.
    [30]
    Quinlan, J. (1993b). Combining instance-based and model-based learning. In Proceedings of the 10th International Conference on Machine Learning (pp. 236-243). Morgan Kaufmann.
    [31]
    Robnik-Škonja, M., & Kononenko, I. (1998). Pruning regression trees with MDL. In H. Prade (Ed.), Proceedings of the 13th European Conference on Artificial Intelligence (pp. 455-459). Wiley.
    [32]
    Schaal, S., & Atkeson, C. (1998). Constructive incremental learning from only local information. Neural Computation, 10, 2047-2084.
    [33]
    Schlimmer, J., & Fisher, D. (1986). A case study of incremental concept induction. In Proceedings of the 5th National Conference on Artificial Intelligence (pp. 496-501). AAAI Press.
    [34]
    Siciliano, R., & Mola, F. (1994). Modelling for recursive partitioning and variable selection. In R. Dutter & W. Grossmann (Eds.), Proceedings in Computational Statistics: COMPSTAT '94 (pp. 172-177). Physica Verlag.
    [35]
    Slotine, J., & Li, W. (1991). Applied nonlinear control. Prentice-Hall.
    [36]
    Šuc, D., Vladušič, D., & Bratko, I. (2004). Qualitatively faithful quantitative prediction. Artificial Intelligence, 158:2, 189-214.
    [37]
    Torgo, L. (1997). Functional models for regression tree leaves. In D. Fisher (Ed.), Proceedings of the 14th International Conference on Machine Learning (pp. 385-393). Morgan Kaufmann.
    [38]
    Torgo, L. (2002). Computationally efficient linear regression trees. In K. Jajuga, A. Sokolowski, & H.-H. Bock (Eds.), Classification, Clustering and Data Analysis: Recent Advances and Applications. Springer.
    [39]
    Utgoff, P., Berkman, N., & Clouse, J. (1997). Decision tree induction based on efficient tree restructuring. Machine Learning, 29, 5-44.
    [40]
    Vijayakumar, S., & Schaal, S. (2000). Locally weighted projection regression: Incremental real time learning in high dimensional space. In P. Langley (Ed.), Proceedings of the 17th International Conference on Machine Learning (pp. 1079-1086). Morgan Kaufmann.
    [41]
    Wang, Y., & Witten, I. (1997). Inducing model trees for continuous classes. In Proceedings of Poster Papers, 9th European Conference on Machine Learning.



    Reviews

    Jose Hernandez-Orallo

    Linear models and decision trees are classical machine learning techniques, recognized for their ease of use, their understandability, and their fast learning methods. The combination of the two can be traced back to the origins of decision tree learners, such as Breiman et al.'s classification and regression tree (CART) algorithm [1]. This work presents the first general approach to the incremental learning of linear decision trees. At first, it might seem that this is yet another paper that fills the gap between two existing techniques, incremental decision tree learners and linear decision trees, trying to get the best of both. However, the intention is not solely to fill this gap, but to present a thorough redesign of how linear decision trees must be constructed, how pruning must take place, and how several other issues must be addressed (windows, use of statistics, and so on). The design and writing of the paper leave few areas for criticism. I would have liked to see some comparisons with unrelated techniques, such as a multilayer perceptron, or with a classical decision tree learner, such as J48 in Weka, the package used for the experiments. The work establishes two algorithms that will advance the state of the art in the incremental (and batch) learning of linear decision trees. The complete account of related work and the thorough evaluation make this paper a necessary reference for every future work on linear decision or regression trees.

    Online Computing Reviews Service





    Published In

    Machine Learning, Volume 61, Issue 1-3
    November 2005
    183 pages

    Publisher

    Kluwer Academic Publishers

    United States


    Author Tags

    1. incremental learning
    2. linear regression trees
    3. model trees
    4. online learning

    Qualifiers

    • Article



    Cited By

    • (2023) Data-driven subgroup identification for linear regression. Proceedings of the 40th International Conference on Machine Learning (pp. 14531-14552). DOI: 10.5555/3618408.3619000. Online publication date: 23-Jul-2023.
    • (2023) Expertise trees resolve knowledge limitations in collective decision-making. Proceedings of the 40th International Conference on Machine Learning (pp. 79-90). DOI: 10.5555/3618408.3618413. Online publication date: 23-Jul-2023.
    • (2023) Interventional SHAP values and interaction values for piecewise linear regression trees. Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence and Thirty-Fifth Conference on Innovative Applications of Artificial Intelligence and Thirteenth Symposium on Educational Advances in Artificial Intelligence (pp. 11164-11173). DOI: 10.1609/aaai.v37i9.26322. Online publication date: 7-Feb-2023.
    • (2019) Partitioning structure learning for segmented linear regression trees. Proceedings of the 33rd International Conference on Neural Information Processing Systems (pp. 2222-2231). DOI: 10.5555/3454287.3454486. Online publication date: 8-Dec-2019.
    • (2019) A gradient-based split criterion for highly accurate and transparent model trees. Proceedings of the 28th International Joint Conference on Artificial Intelligence (pp. 2030-2037). DOI: 10.5555/3367243.3367321. Online publication date: 10-Aug-2019.
    • (2017) On computing travel time functions from Trajectory Data Streams. Proceedings of the 8th ACM SIGSPATIAL Workshop on GeoStreaming (pp. 11-20). DOI: 10.1145/3148160.3148162. Online publication date: 7-Nov-2017.
    • (2017) A Machine Learning System for Controlling a Rescue Robot. RoboCup 2017: Robot World Cup XXI (pp. 108-119). DOI: 10.1007/978-3-030-00308-1_9. Online publication date: 27-Jul-2017.
    • (2016) Adaptive Model Rules From High-Speed Data Streams. ACM Transactions on Knowledge Discovery from Data, 10:3, 1-22. DOI: 10.1145/2829955. Online publication date: 29-Jan-2016.
    • (2014) Ensembles of Adaptive model rules from high-speed data streams. Proceedings of the 3rd International Conference on Big Data, Streams and Heterogeneous Source Mining: Algorithms, Systems, Programming Models and Applications - Volume 36 (pp. 198-213). DOI: 10.5555/2999973.2999988. Online publication date: 24-Aug-2014.
    • (2013) Learning model rules from high-speed data streams. Proceedings of the 3rd International Conference on Ubiquitous Data Mining - Volume 1088 (pp. 10-16). DOI: 10.5555/2907177.2907181. Online publication date: 3-Aug-2013.
