Optuna: A Next-generation Hyperparameter Optimization Framework

Authors:

Toshihiko Yanase,

Masanori Koyama

KDD '19: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining

Pages 2623 - 2631

https://doi.org/10.1145/3292500.3330701

Published: 25 July 2019

Abstract

This study introduces new design criteria for next-generation hyperparameter optimization software. The criteria we propose are (1) a define-by-run API that allows users to construct the parameter search space dynamically, (2) efficient implementation of both searching and pruning strategies, and (3) an easy-to-set-up, versatile architecture that can be deployed for various purposes, ranging from scalable distributed computing to lightweight experiments conducted via an interactive interface. To support our claim, we introduce Optuna, optimization software that is the culmination of our effort to develop next-generation optimization software. As optimization software designed on the define-by-run principle, Optuna is the first of its kind. We present the design techniques that became necessary in developing software that meets the above criteria, and demonstrate the power of our new design through experimental results and real-world applications. Our software is available under the MIT license (https://github.com/pfnet/optuna/).
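To make criterion (1) concrete, the following is a minimal sketch of the define-by-run idea using only the Python standard library. It is not Optuna's implementation: `ToyTrial`, `random_search`, the parameter names, and the scoring formula are all hypothetical, and the `suggest_*` method names are illustrative rather than a statement of Optuna's documented signatures. The point it illustrates is that the search space is never declared up front; it emerges from the calls the objective function makes while it runs.

```python
import random


class ToyTrial:
    """Hypothetical stand-in for a trial object; not Optuna's actual class."""

    def __init__(self, rng):
        self._rng = rng
        self.params = {}  # filled in as the objective asks for values

    def suggest_int(self, name, low, high):
        # Sample an integer in [low, high] and record it under `name`.
        value = self._rng.randint(low, high)
        self.params[name] = value
        return value

    def suggest_float(self, name, low, high):
        # Sample a float in [low, high] and record it under `name`.
        value = self._rng.uniform(low, high)
        self.params[name] = value
        return value


def objective(trial):
    # The search space is built while the function runs: how many
    # `width_i` parameters exist depends on the sampled `n_layers`.
    n_layers = trial.suggest_int("n_layers", 1, 3)
    widths = [trial.suggest_int(f"width_{i}", 4, 64) for i in range(n_layers)]
    lr = trial.suggest_float("lr", 1e-4, 1e-1)
    # Made-up score standing in for a validation loss (assumption).
    return abs(sum(widths) - 80) / 80 + abs(lr - 0.01)


def random_search(objective, n_trials, seed=0):
    # Plain random search; lower objective values are treated as better.
    rng = random.Random(seed)
    best_value, best_params = float("inf"), None
    for _ in range(n_trials):
        trial = ToyTrial(rng)
        value = objective(trial)
        if value < best_value:
            best_value, best_params = value, trial.params
    return best_value, best_params


best_value, best_params = random_search(objective, n_trials=50)
```

Because the conditional structure lives in ordinary Python control flow, adding a branch or a loop to `objective` changes the search space without touching any separate space definition. The sampler here is uniform random search, chosen only to keep the sketch short; it stands in for whatever searching strategy a real framework would use.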

Index Terms

Optuna: A Next-generation Hyperparameter Optimization Framework
1. Computer systems organization
  1. Architectures
    1. Distributed architectures
2. Theory of computation
  1. Design and analysis of algorithms
    1. Mathematical optimization

Published In

KDD '19: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining

July 2019

3305 pages

ISBN:9781450362016

DOI:10.1145/3292500

General Chairs: Ankur Teredesai (KenSci), Vipin Kumar (University of Minnesota)

Program Chairs: Ying Li (EV Analysis Corporation), Rómer Rosales (LinkedIn), Evimaria Terzi (Boston University), George Karypis (University of Minnesota)

Copyright © 2019 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery

New York, NY, United States

Conference

KDD '19: The 25th ACM SIGKDD Conference on Knowledge Discovery and Data Mining

August 4 - 8, 2019

Anchorage, AK, USA

Acceptance Rates

KDD '19 Paper Acceptance Rate 110 of 1,200 submissions, 9%;

Overall Acceptance Rate 1,133 of 8,635 submissions, 13%
