DOI: 10.1145/3377929.3398118

What do you mean? The role of the mean function in Bayesian optimisation

Published: 08 July 2020

Abstract

Bayesian optimisation is a popular approach for optimising expensive black-box functions. The next location to evaluate is selected by maximising an acquisition function that balances exploitation and exploration. Gaussian processes, the surrogate models of choice in Bayesian optimisation, are often used with a constant prior mean function equal to the arithmetic mean of the observed function values. We show that the rate of convergence can depend sensitively on the choice of mean function. We empirically investigate eight mean functions (constant functions equal to the arithmetic mean, minimum, median, and maximum of the observed function evaluations; linear and quadratic polynomials; random forests; and RBF networks) on ten synthetic test problems and two real-world problems, using the Expected Improvement and Upper Confidence Bound acquisition functions.
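The two acquisition functions named above have standard closed forms for a Gaussian posterior. The following is a minimal numpy/scipy sketch for the minimisation setting; the exploration weight β and the candidate values are illustrative assumptions, not settings from the paper.

```python
import numpy as np
from scipy.stats import norm

def expected_improvement(mu, sigma, best):
    """EI for minimisation: E[max(best - f(x), 0)] with f(x) ~ N(mu, sigma^2)."""
    z = (best - mu) / sigma
    return (best - mu) * norm.cdf(z) + sigma * norm.pdf(z)

def lower_confidence_bound(mu, sigma, beta=2.0):
    """UCB acquisition written for minimisation (often called LCB):
    lower values of mu - beta * sigma are preferred. beta is illustrative."""
    return mu - beta * sigma

# three candidate locations: confident-good, uncertain, confident-mediocre
mu = np.array([-0.05, 0.0, 0.1])
sigma = np.array([0.01, 0.30, 0.05])
best = 0.0  # best (lowest) function value observed so far
print("EI :", expected_improvement(mu, sigma, best))
print("LCB:", lower_confidence_bound(mu, sigma))
```

Note how EI rewards both a low predicted mean and high predictive uncertainty, which is the exploitation/exploration balance the abstract refers to.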
We find that, for design dimensions ≥ 5, using a constant mean function equal to the worst observed quality value is consistently the best choice on the synthetic problems considered. We argue that this worst-observed-quality function promotes exploitation, leading to more rapid convergence. However, for the real-world tasks, more complex mean functions capable of modelling the fitness landscape may be effective, although there is no clearly optimal choice.
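The mechanism behind the worst-observed choice can be seen in a toy Gaussian process: far from the data, the posterior mean reverts to the constant prior mean, so setting the prior mean to the worst observed value makes unexplored regions look unpromising and steers the acquisition function towards exploitation near good observations. A minimal numpy sketch under assumed settings (unit-variance RBF kernel, lengthscale 0.3, toy 1-d data; none of these are taken from the paper):

```python
import numpy as np

def rbf_kernel(A, B, lengthscale=0.3):
    """Unit-variance squared-exponential kernel between rows of A and B."""
    d2 = np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :] - 2.0 * A @ B.T
    return np.exp(-0.5 * d2 / lengthscale**2)

def gp_posterior(X, y, Xstar, prior_mean, noise=1e-6):
    """GP posterior mean/std at Xstar with a constant prior mean function."""
    K = rbf_kernel(X, X) + noise * np.eye(len(X))
    Ks = rbf_kernel(X, Xstar)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y - prior_mean))
    mu = prior_mean + Ks.T @ alpha
    v = np.linalg.solve(L, Ks)
    var = np.clip(1.0 - np.sum(v**2, 0), 1e-12, None)
    return mu, np.sqrt(var)

# toy 1-d observations of a function to be minimised
X = np.array([[0.1], [0.4], [0.6], [0.9]])
y = np.sin(6.0 * X[:, 0])
far = np.array([[5.0]])  # a point far outside the sampled region

for name, m in [("arithmetic mean", y.mean()), ("median", np.median(y)),
                ("minimum", y.min()), ("maximum (worst observed)", y.max())]:
    mu, sd = gp_posterior(X, y, far, m)
    # far from the data, mu reverts to the constant prior mean m
    print(f"{name:>25}: posterior mean far from data = {mu[0]: .3f}")
```

With the worst-observed (maximum, for minimisation) prior mean, distant points are predicted to be as bad as the worst evaluation seen, so an acquisition function gains little from exploring them; with the minimum, the reverse holds and exploration is encouraged.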

Supplementary Material

PDF File (p1623_de_ath_suppl.pdf)
Supplemental material.



Published In

GECCO '20: Proceedings of the 2020 Genetic and Evolutionary Computation Conference Companion
July 2020
1982 pages
ISBN:9781450371278
DOI:10.1145/3377929
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].


Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. acquisition function
  2. bayesian optimisation
  3. gaussian process
  4. mean function
  5. surrogate modelling

Qualifiers

  • Research-article


Conference

GECCO '20

Acceptance Rates

Overall Acceptance Rate 1,669 of 4,410 submissions, 38%

Article Metrics

  • Downloads (last 12 months): 33
  • Downloads (last 6 weeks): 6
Reflects downloads up to 20 Feb 2025


Cited By

  • (2024) Vanilla Bayesian optimization performs great in high dimensions. Proceedings of the 41st International Conference on Machine Learning, 20793-20817. DOI: 10.5555/3692070.3692905. Online publication date: 21-Jul-2024.
  • (2024) Robust and conjugate Gaussian process regression. Proceedings of the 41st International Conference on Machine Learning, 1155-1185. DOI: 10.5555/3692070.3692119. Online publication date: 21-Jul-2024.
  • (2024) Inducing clusters deep kernel Gaussian process for longitudinal data. Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 13736-13743. DOI: 10.1609/aaai.v38i12.29279. Online publication date: 20-Feb-2024.
  • (2024) Data-driven optimization of a gas turbine combustor: A Bayesian approach addressing NOx emissions, lean extinction limits, and thermoacoustic stability. Data-Centric Engineering 5. DOI: 10.1017/dce.2024.29. Online publication date: 18-Nov-2024.
  • (2023) Sample-Efficient Hyperparameter Optimization of an Aim Point Controller for Solar Tower Power Plants by Bayesian Optimization. SolarPACES Conference Proceedings 1. DOI: 10.52825/solarpaces.v1i.636. Online publication date: 13-Dec-2023.
  • (2023) On Nonstationary Gaussian Process Model for Solving Data-Driven Optimization Problems. IEEE Transactions on Cybernetics 53(4), 2440-2453. DOI: 10.1109/TCYB.2021.3120188. Online publication date: Apr-2023.
  • (2021) Green machine learning via augmented Gaussian processes and multi-information source optimization. Soft Computing 25(19), 12591-12603. DOI: 10.1007/s00500-021-05684-7. Online publication date: 10-Mar-2021.
  • (2020) Feasibility of Kd-Trees in Gaussian Process Regression to Partition Test Points in High Resolution Input Space. Algorithms 13(12), 327. DOI: 10.3390/a13120327. Online publication date: 5-Dec-2020.
