Abstract
In this paper we examine ensemble methods for regression that leverage or “boost” base regressors by iteratively calling them on modified samples. The most successful leveraging algorithm for classification is AdaBoost, an algorithm that requires only modest assumptions on the base learning method for its strong theoretical guarantees. We present several gradient descent leveraging algorithms for regression and prove AdaBoost-style bounds on their sample errors using intuitive assumptions on the base learners. We bound the complexity of the regression functions produced in order to derive PAC-style bounds on their generalization errors. Experiments validate our theoretical results.
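To make the abstract's idea of gradient descent leveraging concrete, here is a minimal, generic sketch of boosting for regression with squared loss: each round fits a base regressor (a simple threshold stump here, purely for illustration) to the current residuals and takes a small step in function space. This is an assumed, simplified illustration of the general approach, not the specific algorithms or base-learner assumptions analyzed in the paper.

```python
# Generic gradient-descent leveraging ("boosting") sketch for regression
# with squared loss. Hypothetical illustration only; not the paper's algorithms.
import numpy as np

def fit_stump(x, r):
    """Fit a one-split regression stump to residuals r (1-D inputs x)."""
    best = None
    for thr in np.unique(x):
        left, right = r[x <= thr], r[x > thr]
        if len(left) == 0 or len(right) == 0:
            continue
        pred = np.where(x <= thr, left.mean(), right.mean())
        sse = np.sum((r - pred) ** 2)
        if best is None or sse < best[0]:
            best = (sse, thr, left.mean(), right.mean())
    _, thr, lo, hi = best
    return lambda z: np.where(z <= thr, lo, hi)

def boost_regression(x, y, rounds=50, step=0.1):
    """Iteratively add scaled base regressors fit to the residuals."""
    ensemble = []
    f = np.zeros_like(y, dtype=float)
    for _ in range(rounds):
        residual = y - f               # negative gradient of squared loss
        h = fit_stump(x, residual)     # call the base learner on the modified sample
        ensemble.append((step, h))
        f += step * h(x)               # gradient-descent step in function space
    return lambda z: sum(a * h(z) for a, h in ensemble)

# Usage: fit a noisy sine curve.
x = np.linspace(0, 6, 200)
y = np.sin(x) + 0.1 * np.random.randn(200)
model = boost_regression(x, y, rounds=200, step=0.1)
print("training MSE:", np.mean((y - model(x)) ** 2))
```

The key design choice mirrored from the abstract is that the base learner is treated as a black box that is repeatedly called on modified samples (here, residuals), with the ensemble built as a weighted sum of its outputs.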