Optuna: A Next-generation Hyperparameter Optimization Framework

DOI: 10.1145/3292500.3330701
Published: 25 July 2019

Abstract

This study introduces new design criteria for next-generation hyperparameter optimization software. The criteria we propose are (1) a define-by-run API that allows users to construct the parameter search space dynamically, (2) efficient implementation of both searching and pruning strategies, and (3) an easy-to-setup, versatile architecture that can be deployed for various purposes, ranging from scalable distributed computing to lightweight experiments conducted via an interactive interface. To support these criteria, we introduce Optuna, optimization software that is the culmination of our effort to develop next-generation optimization software. Optuna is the first optimization software of its kind designed with the define-by-run principle. We present the design techniques that became necessary in developing software that meets the above criteria, and demonstrate the power of our new design through experimental results and real-world applications. Our software is available under the MIT license (https://github.com/pfnet/optuna/).
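
To make the first criterion concrete, the following is a minimal sketch of what a define-by-run search space can look like. It is not Optuna's implementation and does not use the Optuna library: the `Trial` class, its `suggest_float` and `suggest_categorical` methods, the `optimize` helper, and the toy objective are all hypothetical stand-ins written with the Python standard library, and plain random search takes the place of a real sampler. The point is only that the set of parameters is decided by ordinary control flow while the objective runs.

```python
import random


class Trial:
    """Hypothetical stand-in for a define-by-run trial object (not Optuna's class)."""

    def __init__(self, rng):
        self._rng = rng
        self.params = {}  # filled in as the objective asks for values

    def suggest_float(self, name, low, high):
        # Uniform sampling; a real framework would plug in a smarter sampler.
        value = self._rng.uniform(low, high)
        self.params[name] = value
        return value

    def suggest_categorical(self, name, choices):
        value = self._rng.choice(choices)
        self.params[name] = value
        return value


def objective(trial):
    # The search space is built dynamically: whether "b" exists at all
    # depends on the value sampled for "model".
    model = trial.suggest_categorical("model", ["linear", "quadratic"])
    a = trial.suggest_float("a", -5.0, 5.0)
    if model == "linear":
        return abs(a - 1.0)
    b = trial.suggest_float("b", -5.0, 5.0)  # only defined on this branch
    return (a - 1.0) ** 2 + (b + 2.0) ** 2


def optimize(objective, n_trials, seed=0):
    """Random search: run n_trials trials, return (best_value, best_params)."""
    rng = random.Random(seed)
    best_value, best_params = float("inf"), None
    for _ in range(n_trials):
        trial = Trial(rng)
        value = objective(trial)
        if value < best_value:
            best_value, best_params = value, dict(trial.params)
    return best_value, best_params


best_value, best_params = optimize(objective, n_trials=200)
print(best_value, best_params)
```

Because each trial records only the parameters the objective actually requested, the best parameter set contains `b` only when the `quadratic` branch was taken. No separate, up-front declaration of the search space is needed in this style.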

Published In

KDD '19: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining
July 2019
3305 pages
ISBN: 9781450362016
DOI: 10.1145/3292500

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. Bayesian optimization
  2. black-box optimization
  3. hyperparameter optimization
  4. machine learning system

Qualifiers

  • Research-article

Conference

KDD '19

Acceptance Rates

KDD '19 Paper Acceptance Rate: 110 of 1,200 submissions, 9%
Overall Acceptance Rate: 1,133 of 8,635 submissions, 13%
