Abstract
A global optimization algorithm is designed to find the parameters of a CART regression tree extended with linear predictors at its leaves. To make the optimization mathematically tractable, the internal decisions of the CART tree are made continuous: the crisp decisions at the internal nodes of the tree are replaced with soft ones. The algorithm then adjusts the parameters of the tree in a manner similar to the backpropagation algorithm in multilayer perceptrons. This procedure generates regression trees that are optimized with respect to a global cost function, that give a continuous representation of the unknown function, and whose architecture is fixed automatically by the data. Integrating complementary features of symbolic and connectionist methods in a single decision system improves prediction accuracy in both synthetic and real-world regression problems.
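The soft-split construction summarized above can be illustrated with a minimal sketch. The following Python/NumPy example is not the authors' implementation; the single sigmoid gate, the parameter names (t, beta, wL, bL, ...), and the synthetic data are illustrative assumptions. It replaces the crisp threshold test x < t with a sigmoid gate and trains the split threshold jointly with two linear leaf models by gradient descent on a global squared-error cost, in the spirit of backpropagation:

```python
import numpy as np

# Minimal sketch: a soft regression tree with one internal node and
# linear models at its two leaves, trained on a global squared-error
# cost. The sigmoid gating and all names here are illustrative
# assumptions, not the paper's exact formulation.

rng = np.random.default_rng(0)

# Synthetic 1-D data: a noisy piecewise-linear target.
X = rng.uniform(-1.0, 1.0, size=(200, 1))
y = np.where(X[:, 0] < 0.2, 1.0 + 2.0 * X[:, 0], 0.5 - X[:, 0])
y = y + 0.05 * rng.standard_normal(200)

# Parameters: split threshold t, gate width beta, and two linear
# leaf models (slope w, intercept b).
t, beta = 0.0, 0.5
wL, bL = 0.0, 0.0
wR, bR = 0.0, 0.0

lr = 0.1
for _ in range(2000):
    # Soft decision: sigma -> 1 routes a point to the left leaf,
    # sigma -> 0 to the right leaf; beta controls the softness.
    z = (t - X[:, 0]) / beta
    sigma = 1.0 / (1.0 + np.exp(-z))

    left = wL * X[:, 0] + bL
    right = wR * X[:, 0] + bR
    pred = sigma * left + (1.0 - sigma) * right

    err = pred - y                    # dE/dpred (up to a constant)
    dsig = sigma * (1.0 - sigma)      # sigmoid derivative

    # Backpropagate the global cost through the soft gate and leaves.
    g_t = np.mean(err * (left - right) * dsig / beta)
    g_wL = np.mean(err * sigma * X[:, 0])
    g_bL = np.mean(err * sigma)
    g_wR = np.mean(err * (1.0 - sigma) * X[:, 0])
    g_bR = np.mean(err * (1.0 - sigma))

    t -= lr * g_t
    wL -= lr * g_wL; bL -= lr * g_bL
    wR -= lr * g_wR; bR -= lr * g_bR

# Recompute the prediction with the final parameters.
sigma = 1.0 / (1.0 + np.exp(-(t - X[:, 0]) / beta))
pred = sigma * (wL * X[:, 0] + bL) + (1.0 - sigma) * (wR * X[:, 0] + bR)
print(f"learned split t={t:.2f}, MSE={np.mean((pred - y) ** 2):.4f}")
```

Because the gate is differentiable, the split threshold receives a gradient signal from the global cost, which is exactly what a crisp CART split cannot do; a full tree would nest such gates recursively.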
Cite this paper
Medina-Chico, V., Suárez, A., Lutsko, J.F. (2001). Backpropagation in Decision Trees for Regression. In: De Raedt, L., Flach, P. (eds) Machine Learning: ECML 2001. Lecture Notes in Computer Science, vol 2167. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-44795-4_30