
Wavelet decompositions of Random Forests: smoothness analysis, sparse approximation and applications

Published: 01 January 2016
Abstract

In this paper we introduce, in the setting of machine learning, a generalization of wavelet analysis, which is a popular approach to low-dimensional structured signal analysis. The wavelet decomposition of a Random Forest provides a sparse approximation of any high-dimensional regression or classification function at various levels of detail, with a concrete ordering of the Random Forest nodes: from 'significant' elements to nodes capturing only 'insignificant' noise. Motivated by function space theory, we use the wavelet decomposition to compute numerically a 'weak-type' smoothness index that captures the complexity of the underlying function. As we show through extensive experimentation, this sparse representation facilitates a variety of applications: improved regression for difficult datasets, a novel approach to feature importance, resilience to noisy or irrelevant features, compression of ensembles, etc.




      Information & Contributors

      Information

      Published In

      cover image The Journal of Machine Learning Research
      The Journal of Machine Learning Research  Volume 17, Issue 1
      January 2016
      8391 pages
      ISSN:1532-4435
      EISSN:1533-7928
      Issue’s Table of Contents

      Publisher

      JMLR.org

      Publication History

      Revised: 01 July 2016
      Published: 01 January 2016
      Published in JMLR Volume 17, Issue 1

Author Tags

1. adaptive approximation
2. Besov spaces
3. feature importance
4. random forest
5. wavelets

Article Metrics

• Downloads (Last 12 months): 28
• Downloads (Last 6 weeks): 9

Cited By

• (2024) Haar-Like Wavelets on Hierarchical Trees. Journal of Scientific Computing 99(1). DOI: 10.1007/s10915-024-02466-9. Online publication date: 22-Feb-2024.
• (2019) UA-CRNN: Uncertainty-Aware Convolutional Recurrent Neural Network for Mortality Risk Prediction. Proceedings of the 28th ACM International Conference on Information and Knowledge Management, 109-118. DOI: 10.1145/3357384.3357884. Online publication date: 3-Nov-2019.
• (2019) Random Decision DAG: An Entropy Based Compression Approach for Random Forest. Database Systems for Advanced Applications, 319-323. DOI: 10.1007/978-3-030-18590-9_37. Online publication date: 22-Apr-2019.
• (2017) Globally Induced Forest. Proceedings of the 34th International Conference on Machine Learning - Volume 70, 420-428. DOI: 10.5555/3305381.3305425. Online publication date: 6-Aug-2017.
