Abstract
Many feature selection algorithms are limited in that they attempt to identify relevant feature subsets by examining each feature individually. This paper introduces a technique for determining feature relevance from the average information gain achieved during the construction of decision tree ensembles. The technique introduces a node complexity measure and a statistical method, based on confidence intervals, for updating the feature sampling distribution so as to control the rate of convergence. A feature selection threshold is also derived from the expected performance of an irrelevant feature. Experiments demonstrate the potential of these methods and illustrate the need for both feature weighting and feature selection.
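To make the overall idea concrete, the following is a minimal sketch of scoring features by the average information gain they contribute across a randomized decision tree ensemble, and thresholding against the score of a deliberately irrelevant "probe" feature. It is not the authors' algorithm: the node complexity measure and the confidence-interval update of the feature sampling distribution described in the paper are not reproduced here. The use of scikit-learn's ExtraTreesClassifier, the function name ensemble_relevance_selection, and the noise-probe threshold are illustrative assumptions.

import numpy as np
from sklearn.ensemble import ExtraTreesClassifier

def ensemble_relevance_selection(X, y, n_trees=200, random_state=0):
    rng = np.random.default_rng(random_state)

    # Append a pure-noise probe column: by construction it is irrelevant,
    # so its score estimates the gain an irrelevant feature achieves by chance.
    probe = rng.normal(size=(X.shape[0], 1))
    X_aug = np.hstack([X, probe])

    # Randomized tree ensemble; criterion="entropy" so impurity reduction
    # corresponds to information gain.
    forest = ExtraTreesClassifier(n_estimators=n_trees,
                                  criterion="entropy",
                                  random_state=random_state)
    forest.fit(X_aug, y)

    # feature_importances_ averages each feature's impurity reduction
    # (here, information gain) over all trees in the ensemble.
    scores = forest.feature_importances_
    threshold = scores[-1]      # score of the irrelevant probe feature
    relevance = scores[:-1]     # scores of the real features
    selected = np.flatnonzero(relevance > threshold)
    return relevance, selected

In this sketch, relevance, selected = ensemble_relevance_selection(X, y) returns a per-feature weight that could be used for feature weighting, and the indices of features exceeding the probe threshold for feature selection, mirroring the distinction drawn in the abstract.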
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
Cite this paper
Rogers, J.D., Gunn, S.R. (2005). Ensemble Algorithms for Feature Selection. In: Winkler, J., Niranjan, M., Lawrence, N. (eds.) Deterministic and Statistical Methods in Machine Learning. DSMML 2004. Lecture Notes in Computer Science, vol. 3635. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11559887_11
DOI: https://doi.org/10.1007/11559887_11
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-29073-5
Online ISBN: 978-3-540-31728-9