Abstract
Random forests are a combination of tree predictors such that each tree depends on the values of a random vector sampled independently and with the same distribution for all trees in the forest. The generalization error for forests converges a.s. to a limit as the number of trees in the forest becomes large. The generalization error of a forest of tree classifiers depends on the strength of the individual trees in the forest and the correlation between them. Using a random selection of features to split each node yields error rates that compare favorably to Adaboost (Y. Freund & R. Schapire, Machine Learning: Proceedings of the Thirteenth International Conference, ***, 148–156), but are more robust with respect to noise. Internal estimates monitor error, strength, and correlation, and these are used to show the response to increasing the number of features used in the splitting. Internal estimates are also used to measure variable importance. These ideas are also applicable to regression.
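The procedure the abstract describes, growing each tree on an independently sampled random vector (here, a bootstrap sample of the training set), splitting each node on the best of a small random subset of features, and classifying by majority vote over the trees, can be sketched as follows. This is a minimal illustration built on scikit-learn's DecisionTreeClassifier rather than Breiman's original code; the names grow_forest, forest_predict, n_trees, and m_features are assumptions for this example.

```python
# A minimal random-forest sketch in the spirit of the abstract, not Breiman's
# original implementation. Assumes numpy arrays X (n_samples x n_features) and
# integer class labels y in {0, ..., K-1}; names here are illustrative.
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def grow_forest(X, y, n_trees=100, m_features="sqrt", seed=None):
    """Grow n_trees trees, each on an independent bootstrap sample of (X, y)."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    trees = []
    for _ in range(n_trees):
        # The i.i.d. random vector for tree k: a bootstrap sample of the rows,
        # plus the random feature subset drawn at each node (max_features).
        idx = rng.integers(0, n, size=n)
        tree = DecisionTreeClassifier(
            max_features=m_features,                  # random feature selection per split
            random_state=int(rng.integers(1 << 31)),
        )
        tree.fit(X[idx], y[idx])
        trees.append(tree)
    return trees

def forest_predict(trees, X):
    """Classify by majority vote over the trees."""
    votes = np.stack([t.predict(X) for t in trees]).astype(int)
    return np.array([np.bincount(votes[:, i]).argmax() for i in range(X.shape[0])])
```

Consistent with the convergence result stated above, adding more trees to such a vote does not cause overfitting: the generalization error settles toward a limit as n_trees grows.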
References
Amit, Y. & Geman, D. (1997). Shape quantization and recognition with randomized trees. Neural Computation, 9, 1545–1588.
Amit, Y., Blanchard, G., & Wilder, K. (1999). Multiple randomized classifiers: MRCL. Technical Report, Department of Statistics, University of Chicago.
Bauer, E. & Kohavi, R. (1999). An empirical comparison of voting classification algorithms. Machine Learning, 36(1/2), 105–139.
Breiman, L. (1996a). Bagging predictors. Machine Learning, 24(2), 123–140.
Breiman, L. (1996b). Out-of-bag estimation. ftp.stat.berkeley.edu/pub/users/breiman/OOBestimation.ps
Breiman, L. (1998a). Arcing classifiers (discussion paper). Annals of Statistics, 26, 801–824.
Breiman, L. (1998b). Randomizing outputs to increase prediction accuracy. Technical Report 518, May 1, 1998, Statistics Department, UCB (in press, Machine Learning).
Breiman, L. (1999). Using adaptive bagging to debias regressions. Technical Report 547, Statistics Department, UCB.
Breiman, L. (2000). Some infinity theory for predictor ensembles. Technical Report 579, Statistics Department, UCB.
Dietterich, T. (1998). An experimental comparison of three methods for constructing ensembles of decision trees: Bagging, boosting and randomization. Machine Learning, 1–22.
Freund, Y. & Schapire, R. (1996). Experiments with a new boosting algorithm. Machine Learning: Proceedings of the Thirteenth International Conference, 148–156.
Grove, A. & Schuurmans, D. (1998). Boosting in the limit: Maximizing the margin of learned ensembles. In Proceedings of the Fifteenth National Conference on Artificial Intelligence (AAAI-98).
Ho, T. K. (1998). The random subspace method for constructing decision forests. IEEE Trans. on Pattern Analysis and Machine Intelligence, 20(8), 832–844.
Kleinberg, E. (2000). On the algorithmic implementation of stochastic discrimination. IEEE Trans. on Pattern Analysis and Machine Intelligence, 22(5), 473–490.
Schapire, R., Freund, Y., Bartlett, P., & Lee, W. (1998). Boosting the margin: A new explanation for the effectiveness of voting methods. Annals of Statistics, 26(5), 1651–1686.
Tibshirani, R. (1996). Bias, variance, and prediction error for classification rules. Technical Report, Statistics Department, University of Toronto.
Wolpert, D. H. & Macready, W. G. (1997). An efficient method to estimate Bagging's generalization error (in press, Machine Learning).
About this article
Cite this article
Breiman, L. Random Forests. Machine Learning 45, 5–32 (2001). https://doi.org/10.1023/A:1010933404324