Article · NIPS Conference Proceedings · DOI: 10.5555/2969442.2969547

Efficient and robust automated machine learning

Published: 07 December 2015

Abstract

The success of machine learning in a broad range of applications has led to an ever-growing demand for machine learning systems that can be used off the shelf by non-experts. To be effective in practice, such systems need to automatically choose a good algorithm and feature preprocessing steps for a new dataset at hand, and also set their respective hyperparameters. Recent work has started to tackle this automated machine learning (AutoML) problem with the help of efficient Bayesian optimization methods. Building on this, we introduce a robust new AutoML system based on scikit-learn (using 15 classifiers, 14 feature preprocessing methods, and 4 data preprocessing methods, giving rise to a structured hypothesis space with 110 hyperparameters). This system, which we dub AUTO-SKLEARN, improves on existing AutoML methods by automatically taking into account past performance on similar datasets, and by constructing ensembles from the models evaluated during the optimization. Our system won the first phase of the ongoing ChaLearn AutoML challenge, and our comprehensive analysis on over 100 diverse datasets shows that it substantially outperforms the previous state of the art in AutoML. We also demonstrate the performance gains due to each of our contributions and derive insights into the effectiveness of the individual components of AUTO-SKLEARN.
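The ensemble-construction step mentioned in the abstract builds on the greedy ensemble selection of Caruana et al. [24]: starting from an empty ensemble, repeatedly add (with replacement) the model whose inclusion most improves validation performance. A minimal sketch in Python, assuming hard class-label predictions and accuracy as the metric; the function name and the simplifications are ours, not taken from the paper:

```python
import numpy as np

def ensemble_selection(model_preds, y_true, n_iters=10):
    """Greedy ensemble selection with replacement (in the spirit of [24]).

    model_preds: list of arrays, each (n_samples,) of predicted class labels
    y_true:      array (n_samples,) of true labels
    Returns the list of chosen model indices (repeats allowed); repeats act
    as implicit weights when the ensemble votes.
    """
    classes = np.unique(y_true)

    def one_hot(pred):
        # (n_samples, n_classes) vote matrix for one model's predictions
        return (pred[:, None] == classes[None, :]).astype(float)

    chosen = []          # model indices added so far (with replacement)
    running_sum = None   # running sum of vote matrices for the ensemble

    for _ in range(n_iters):
        best_idx, best_acc = None, -1.0
        for i, pred in enumerate(model_preds):
            # Tentatively add model i and score the resulting ensemble.
            votes = one_hot(pred) if running_sum is None else running_sum + one_hot(pred)
            ens_pred = classes[votes.argmax(axis=1)]
            acc = (ens_pred == y_true).mean()
            if acc > best_acc:
                best_idx, best_acc = i, acc
        chosen.append(best_idx)
        add = one_hot(model_preds[best_idx])
        running_sum = add if running_sum is None else running_sum + add
    return chosen
```

In AUTO-SKLEARN, a procedure of this kind is run over the predictions of the models evaluated during Bayesian optimization; selecting with replacement yields an implicitly weighted vote and, per [24, 25], is less prone to overfitting the validation set than selection without replacement.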

References

[1] I. Guyon, K. Bennett, G. Cawley, H. Escalante, S. Escalera, T. Ho, N. Macià, B. Ray, M. Saeed, A. Statnikov, and E. Viegas. Design of the 2015 ChaLearn AutoML Challenge. In Proc. of IJCNN'15, 2015.
[2] C. Thornton, F. Hutter, H. Hoos, and K. Leyton-Brown. Auto-WEKA: combined selection and hyperparameter optimization of classification algorithms. In Proc. of KDD'13, pages 847-855, 2013.
[3] E. Brochu, V. Cora, and N. de Freitas. A tutorial on Bayesian optimization of expensive cost functions, with application to active user modeling and hierarchical reinforcement learning. CoRR, abs/1012.2599, 2010.
[4] M. Feurer, J. Springenberg, and F. Hutter. Initializing Bayesian hyperparameter optimization via meta-learning. In Proc. of AAAI'15, pages 1128-1135, 2015.
[5] M. Reif, F. Shafait, and A. Dengel. Meta-learning for evolutionary parameter optimization of classifiers. Machine Learning, 87:357-380, 2012.
[6] T. Gomes, R. Prudêncio, C. Soares, A. Rossi, and A. Carvalho. Combining meta-learning and search techniques to select parameters for support vector machines. Neurocomputing, 75(1):3-13, 2012.
[7] F. Pedregosa, G. Varoquaux, A. Gramfort, V. Michel, B. Thirion, O. Grisel, M. Blondel, P. Prettenhofer, R. Weiss, V. Dubourg, J. Vanderplas, A. Passos, D. Cournapeau, M. Brucher, M. Perrot, and E. Duchesnay. Scikit-learn: Machine learning in Python. JMLR, 12:2825-2830, 2011.
[8] M. Hall, E. Frank, G. Holmes, B. Pfahringer, P. Reutemann, and I. Witten. The WEKA data mining software: An update. SIGKDD Explorations, 11(1):10-18, 2009.
[9] F. Hutter, H. Hoos, and K. Leyton-Brown. Sequential model-based optimization for general algorithm configuration. In Proc. of LION'11, pages 507-523, 2011.
[10] J. Bergstra, R. Bardenet, Y. Bengio, and B. Kégl. Algorithms for hyper-parameter optimization. In Proc. of NIPS'11, pages 2546-2554, 2011.
[11] J. Snoek, H. Larochelle, and R. P. Adams. Practical Bayesian optimization of machine learning algorithms. In Proc. of NIPS'12, pages 2960-2968, 2012.
[12] K. Eggensperger, M. Feurer, F. Hutter, J. Bergstra, J. Snoek, H. Hoos, and K. Leyton-Brown. Towards an empirical foundation for assessing Bayesian optimization of hyperparameters. In NIPS Workshop on Bayesian Optimization in Theory and Practice, 2013.
[13] B. Komer, J. Bergstra, and C. Eliasmith. Hyperopt-sklearn: Automatic hyperparameter configuration for scikit-learn. In ICML Workshop on AutoML, 2014.
[14] L. Breiman. Random forests. MLJ, 45:5-32, 2001.
[15] P. Brazdil, C. Giraud-Carrier, C. Soares, and R. Vilalta. Metalearning: Applications to Data Mining. Springer, 2009.
[16] R. Bardenet, M. Brendel, B. Kégl, and M. Sebag. Collaborative hyperparameter tuning. In Proc. of ICML'13 [28], pages 199-207.
[17] D. Yogatama and G. Mann. Efficient transfer learning method for automatic hyperparameter tuning. In Proc. of AISTATS'14, pages 1077-1085, 2014.
[18] J. Vanschoren, J. van Rijn, B. Bischl, and L. Torgo. OpenML: Networked science in machine learning. SIGKDD Explorations, 15(2):49-60, 2013.
[19] D. Michie, D. Spiegelhalter, C. Taylor, and J. Campbell. Machine Learning, Neural and Statistical Classification. Ellis Horwood, 1994.
[20] A. Kalousis. Algorithm Selection via Meta-Learning. PhD thesis, University of Geneva, 2002.
[21] B. Pfahringer, H. Bensusan, and C. Giraud-Carrier. Meta-learning by landmarking various learning algorithms. In Proc. of ICML'00, pages 743-750, 2000.
[22] I. Guyon, A. Saffari, G. Dror, and G. Cawley. Model selection: Beyond the Bayesian/Frequentist divide. JMLR, 11:61-87, 2010.
[23] A. Lacoste, M. Marchand, F. Laviolette, and H. Larochelle. Agnostic Bayesian learning of ensembles. In Proc. of ICML'14, pages 611-619, 2014.
[24] R. Caruana, A. Niculescu-Mizil, G. Crew, and A. Ksikes. Ensemble selection from libraries of models. In Proc. of ICML'04, page 18, 2004.
[25] R. Caruana, A. Munson, and A. Niculescu-Mizil. Getting the most out of ensemble selection. In Proc. of ICDM'06, pages 828-833, 2006.
[26] D. Wolpert. Stacked generalization. Neural Networks, 5:241-259, 1992.
[27] G. Hamerly and C. Elkan. Learning the k in k-means. In Proc. of NIPS'04, pages 281-288, 2004.
[28] Proc. of ICML'13, 2014.


Published In

NIPS'15: Proceedings of the 28th International Conference on Neural Information Processing Systems - Volume 2
December 2015
3626 pages

Publisher

MIT Press

Cambridge, MA, United States


Cited By

  • (2024) Using Bayesian Optimization to Improve Hyperparameter Search in TPOT. Proceedings of the Genetic and Evolutionary Computation Conference, pages 340-348. DOI: 10.1145/3638529.3654061.
  • (2023) Advancing Automation of Design Decisions in Recommender System Pipelines. Proceedings of the 17th ACM Conference on Recommender Systems, pages 1355-1360. DOI: 10.1145/3604915.3608886.
  • (2023) DiffPrep: Differentiable Data Preprocessing Pipeline Search for Learning over Tabular Data. Proceedings of the ACM on Management of Data, 1(2):1-26. DOI: 10.1145/3589328.
  • (2023) MLStar: A System for Synthesis of Machine-Learning Programs. Proceedings of the Companion Conference on Genetic and Evolutionary Computation, pages 1721-1726. DOI: 10.1145/3583133.3596367.
  • (2023) Hybridizing TPOT with Bayesian Optimization. Proceedings of the Genetic and Evolutionary Computation Conference, pages 502-510. DOI: 10.1145/3583131.3590364.
  • (2023) Deep Pipeline Embeddings for AutoML. Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pages 1907-1919. DOI: 10.1145/3580305.3599303.
  • (2022) LIFT. Proceedings of the 36th International Conference on Neural Information Processing Systems, pages 11763-11784. DOI: 10.5555/3600270.3601125.
  • (2022) DivBO. Proceedings of the 36th International Conference on Neural Information Processing Systems, pages 2958-2971. DOI: 10.5555/3600270.3600484.
  • (2022) A scalable AutoML approach based on graph neural networks. Proceedings of the VLDB Endowment, 15(11):2428-2436. DOI: 10.14778/3551793.3551804.
  • (2022) Automating data science. Communications of the ACM, 65(3):76-87. DOI: 10.1145/3495256.
