Article · NIPS Conference Proceedings · DOI: 10.5555/2969442.2969547

Efficient and robust automated machine learning

Published: 07 December 2015

Abstract

The success of machine learning in a broad range of applications has led to an ever-growing demand for machine learning systems that can be used off the shelf by non-experts. To be effective in practice, such systems need to automatically choose a good algorithm and feature preprocessing steps for a new dataset at hand, and also set their respective hyperparameters. Recent work has started to tackle this automated machine learning (AutoML) problem with the help of efficient Bayesian optimization methods. Building on this, we introduce a robust new AutoML system based on scikit-learn (using 15 classifiers, 14 feature preprocessing methods, and 4 data preprocessing methods, giving rise to a structured hypothesis space with 110 hyperparameters). This system, which we dub AUTO-SKLEARN, improves on existing AutoML methods by automatically taking into account past performance on similar datasets, and by constructing ensembles from the models evaluated during the optimization. Our system won the first phase of the ongoing ChaLearn AutoML challenge, and our comprehensive analysis on over 100 diverse datasets shows that it substantially outperforms the previous state of the art in AutoML. We also demonstrate the performance gains due to each of our contributions and derive insights into the effectiveness of the individual components of AUTO-SKLEARN.
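The ensemble-construction step mentioned in the abstract builds on the greedy ensemble selection of Caruana et al. [24]: starting from an empty ensemble, repeatedly add (with replacement) the model whose inclusion most improves validation performance. A minimal sketch in Python, assuming hard class-label predictions and accuracy as the metric; the function name and the simplifications are ours, not taken from the paper:

```python
import numpy as np

def ensemble_selection(model_preds, y_true, n_iters=10):
    """Greedy ensemble selection with replacement (in the spirit of [24]).

    model_preds: list of arrays, each (n_samples,) of predicted class labels
    y_true:      array (n_samples,) of true labels
    Returns the list of chosen model indices (repeats allowed); repeats act
    as implicit weights when the ensemble votes.
    """
    classes = np.unique(y_true)

    def one_hot(pred):
        # (n_samples, n_classes) vote matrix for one model's predictions
        return (pred[:, None] == classes[None, :]).astype(float)

    chosen = []          # model indices added so far (with replacement)
    running_sum = None   # running sum of vote matrices for the ensemble

    for _ in range(n_iters):
        best_idx, best_acc = None, -1.0
        for i, pred in enumerate(model_preds):
            # Tentatively add model i and score the resulting ensemble.
            votes = one_hot(pred) if running_sum is None else running_sum + one_hot(pred)
            ens_pred = classes[votes.argmax(axis=1)]
            acc = (ens_pred == y_true).mean()
            if acc > best_acc:
                best_idx, best_acc = i, acc
        chosen.append(best_idx)
        add = one_hot(model_preds[best_idx])
        running_sum = add if running_sum is None else running_sum + add
    return chosen
```

In AUTO-SKLEARN, a procedure of this kind is run over the predictions of the models evaluated during Bayesian optimization; selecting with replacement yields an implicitly weighted vote and, per [24, 25], is less prone to overfitting the validation set than selection without replacement.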

References

[1] I. Guyon, K. Bennett, G. Cawley, H. Escalante, S. Escalera, T. Ho, N. Macià, B. Ray, M. Saeed, A. Statnikov, and E. Viegas. Design of the 2015 ChaLearn AutoML Challenge. In Proc. of IJCNN'15, 2015.
[2] C. Thornton, F. Hutter, H. Hoos, and K. Leyton-Brown. Auto-WEKA: combined selection and hyperparameter optimization of classification algorithms. In Proc. of KDD'13, pages 847-855, 2013.
[3] E. Brochu, V. Cora, and N. de Freitas. A tutorial on Bayesian optimization of expensive cost functions, with application to active user modeling and hierarchical reinforcement learning. CoRR, abs/1012.2599, 2010.
[4] M. Feurer, J. Springenberg, and F. Hutter. Initializing Bayesian hyperparameter optimization via meta-learning. In Proc. of AAAI'15, pages 1128-1135, 2015.
[5] M. Reif, F. Shafait, and A. Dengel. Meta-learning for evolutionary parameter optimization of classifiers. Machine Learning, 87:357-380, 2012.
[6] T. Gomes, R. Prudêncio, C. Soares, A. Rossi, and A. Carvalho. Combining meta-learning and search techniques to select parameters for support vector machines. Neurocomputing, 75(1):3-13, 2012.
[7] F. Pedregosa, G. Varoquaux, A. Gramfort, V. Michel, B. Thirion, O. Grisel, M. Blondel, P. Prettenhofer, R. Weiss, V. Dubourg, J. Vanderplas, A. Passos, D. Cournapeau, M. Brucher, M. Perrot, and E. Duchesnay. Scikit-learn: Machine learning in Python. JMLR, 12:2825-2830, 2011.
[8] M. Hall, E. Frank, G. Holmes, B. Pfahringer, P. Reutemann, and I. Witten. The WEKA data mining software: An update. SIGKDD Explorations, 11(1):10-18, 2009.
[9] F. Hutter, H. Hoos, and K. Leyton-Brown. Sequential model-based optimization for general algorithm configuration. In Proc. of LION'11, pages 507-523, 2011.
[10] J. Bergstra, R. Bardenet, Y. Bengio, and B. Kégl. Algorithms for hyper-parameter optimization. In Proc. of NIPS'11, pages 2546-2554, 2011.
[11] J. Snoek, H. Larochelle, and R. P. Adams. Practical Bayesian optimization of machine learning algorithms. In Proc. of NIPS'12, pages 2960-2968, 2012.
[12] K. Eggensperger, M. Feurer, F. Hutter, J. Bergstra, J. Snoek, H. Hoos, and K. Leyton-Brown. Towards an empirical foundation for assessing Bayesian optimization of hyperparameters. In NIPS Workshop on Bayesian Optimization in Theory and Practice, 2013.
[13] B. Komer, J. Bergstra, and C. Eliasmith. Hyperopt-sklearn: Automatic hyperparameter configuration for scikit-learn. In ICML Workshop on AutoML, 2014.
[14] L. Breiman. Random forests. MLJ, 45:5-32, 2001.
[15] P. Brazdil, C. Giraud-Carrier, C. Soares, and R. Vilalta. Metalearning: Applications to Data Mining. Springer, 2009.
[16] R. Bardenet, M. Brendel, B. Kégl, and M. Sebag. Collaborative hyperparameter tuning. In Proc. of ICML'13 [28], pages 199-207.
[17] D. Yogatama and G. Mann. Efficient transfer learning method for automatic hyperparameter tuning. In Proc. of AISTATS'14, pages 1077-1085, 2014.
[18] J. Vanschoren, J. van Rijn, B. Bischl, and L. Torgo. OpenML: Networked science in machine learning. SIGKDD Explorations, 15(2):49-60, 2013.
[19] D. Michie, D. Spiegelhalter, C. Taylor, and J. Campbell. Machine Learning, Neural and Statistical Classification. Ellis Horwood, 1994.
[20] A. Kalousis. Algorithm Selection via Meta-Learning. PhD thesis, University of Geneva, 2002.
[21] B. Pfahringer, H. Bensusan, and C. Giraud-Carrier. Meta-learning by landmarking various learning algorithms. In Proc. of ICML'00, pages 743-750, 2000.
[22] I. Guyon, A. Saffari, G. Dror, and G. Cawley. Model selection: Beyond the Bayesian/Frequentist divide. JMLR, 11:61-87, 2010.
[23] A. Lacoste, M. Marchand, F. Laviolette, and H. Larochelle. Agnostic Bayesian learning of ensembles. In Proc. of ICML'14, pages 611-619, 2014.
[24] R. Caruana, A. Niculescu-Mizil, G. Crew, and A. Ksikes. Ensemble selection from libraries of models. In Proc. of ICML'04, page 18, 2004.
[25] R. Caruana, A. Munson, and A. Niculescu-Mizil. Getting the most out of ensemble selection. In Proc. of ICDM'06, pages 828-833, 2006.
[26] D. Wolpert. Stacked generalization. Neural Networks, 5:241-259, 1992.
[27] G. Hamerly and C. Elkan. Learning the k in k-means. In Proc. of NIPS'04, pages 281-288, 2004.
[28] Proc. of ICML'13, 2014.


Published In

NIPS'15: Proceedings of the 28th International Conference on Neural Information Processing Systems - Volume 2
December 2015
3626 pages

Publisher

MIT Press

Cambridge, MA, United States


Cited By

  • (2024) Using Bayesian Optimization to Improve Hyperparameter Search in TPOT. Proceedings of the Genetic and Evolutionary Computation Conference, pages 340-348. DOI: 10.1145/3638529.3654061.
  • (2023) Advancing Automation of Design Decisions in Recommender System Pipelines. Proceedings of the 17th ACM Conference on Recommender Systems, pages 1355-1360. DOI: 10.1145/3604915.3608886.
  • (2023) DiffPrep: Differentiable Data Preprocessing Pipeline Search for Learning over Tabular Data. Proceedings of the ACM on Management of Data, 1(2):1-26. DOI: 10.1145/3589328.
  • (2023) MLStar: A System for Synthesis of Machine-Learning Programs. Proceedings of the Companion Conference on Genetic and Evolutionary Computation, pages 1721-1726. DOI: 10.1145/3583133.3596367.
  • (2023) Hybridizing TPOT with Bayesian Optimization. Proceedings of the Genetic and Evolutionary Computation Conference, pages 502-510. DOI: 10.1145/3583131.3590364.
  • (2023) Deep Pipeline Embeddings for AutoML. Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pages 1907-1919. DOI: 10.1145/3580305.3599303.
  • (2022) LIFT. Proceedings of the 36th International Conference on Neural Information Processing Systems, pages 11763-11784. DOI: 10.5555/3600270.3601125.
  • (2022) DivBO. Proceedings of the 36th International Conference on Neural Information Processing Systems, pages 2958-2971. DOI: 10.5555/3600270.3600484.
  • (2022) A scalable AutoML approach based on graph neural networks. Proceedings of the VLDB Endowment, 15(11):2428-2436. DOI: 10.14778/3551793.3551804.
  • (2022) Automating data science. Communications of the ACM, 65(3):76-87. DOI: 10.1145/3495256.
