A Bayesian optimization approach to find Nash equilibria

Picheny, Victor; Binois, Mickael; Habbal, Abderrahmane

doi:10.1007/s10898-018-0688-0

Published: 12 July 2018

Volume 73, pages 171–192 (2019)

Journal of Global Optimization

Abstract

Game theory nowadays finds a broad range of applications in engineering and machine learning. However, in a derivative-free, expensive black-box context, very few algorithmic solutions are available to find game equilibria. Here, we propose a novel Gaussian-process-based approach for solving games in this context. We follow a classical Bayesian optimization framework, with sequential sampling decisions based on acquisition functions. Two strategies are proposed, based either on the probability of achieving equilibrium or on the stepwise uncertainty reduction paradigm. Practical and numerical aspects are discussed in order to enhance scalability and reduce computation time. Our approach is evaluated on several synthetic game problems with varying numbers of players and decision space dimensions. We show that equilibria can be found reliably for a fraction of the cost (in terms of black-box evaluations) compared to classical, derivative-based algorithms. The method is implemented in the R package GPGame, available on CRAN at https://cran.r-project.org/package=GPGame.

References

Adams, R.A., Fournier, J.J.: Sobolev Spaces, vol. 140. Academic Press, Cambridge (2003)
Álvarez, M.A., Rosasco, L., Lawrence, N.D.: Kernels for vector-valued functions: a review. Found. Trend Mach. Learn. 4(3), 195–266 (2011). https://doi.org/10.1561/2200000036
Azzalini, A., Genz, A.: The R package mnormt: the multivariate normal and $t$ distributions (version 1.5–4). http://azzalini.stat.unipd.it/SW/Pkg-mnormt (2016). Accessed 8 Mar 2016
Başar, T.: Relaxation techniques and asynchronous algorithms for on-line computation of noncooperative equilibria. J. Econ. Dyn. Control. 11(4), 531–549 (1987)
Bect, J., Ginsbourger, D., Li, L., Picheny, V., Vazquez, E.: Sequential design of computer experiments for the estimation of a probability of failure. Stat. Comput. 22(3), 773–793 (2012)
Bect, J., Bachoc, F., Ginsbourger, D.: A supermartingale approach to Gaussian process based sequential design of experiments. arXiv preprint arXiv:1608.01118 (2016)
Brown, N., Ganzfried, S., Sandholm, T.: Hierarchical abstraction, distributed equilibrium computation, and post-processing, with application to a champion no-limit Texas hold’em agent. In: Proceedings of the 2015 International Conference on Autonomous Agents and Multiagent Systems, pp. 7–15 (2015)
Chevalier, C., Ginsbourger, D.: Fast computation of the multi-points expected improvement with applications in batch selection. In: Learning and Intelligent Optimization, Springer, pp. 59–69 (2013)
Chevalier, C., Emery, X., Ginsbourger, D.: Fast update of conditional simulation ensembles. Math. Geosci. 47(7), 771–789 (2015)
Cressie, N.: Statistics for spatial data. Terra Nova 4(5), 613–617 (1992)
Dorsch, D., Jongen, H.T., Shikhman, V.: On structure and computation of generalized nash equilibria. SIAM J. Optim. 23(1), 452–474 (2013)
Facchinei, F., Kanzow, C.: Generalized nash equilibrium problems. Annal. Oper. Res. 175(1), 177–211 (2010)
Fleuret, F., Geman, D.: Graded learning for object detection. In: Proceedings of the Workshop on Statistical and Computational Theories of Vision of the IEEE International Conference on Computer Vision and Pattern Recognition (CVPR/SCTV), vol. 2 (1999)
Friedman, A.: Stochastic differential games. J. Differ. Equ. 11(1), 79–108 (1972)
Wei, E., Luke, S.: Lenient learning in independent-learner stochastic cooperative games. J. Mach. Learn. Res. 17, 1–42 (2016)
Garivier, A., Kaufmann, E., Koolen, W. M.: Maximin action identification: a new bandit framework for games. In: 29th Annual Conference on Learning Theory, pp. 1028–1050 (2016)
Genz, A., Bretz, F.: Computation of Multivariate Normal and t Probabilities. Lecture Notes in Statistics. Springer, Heidelberg (2009)
Genz, A., Bretz, F., Miwa, T., Mi, X., Leisch, F., Scheipl, F., Hothorn, T.: mvtnorm: Multivariate normal and t Distributions. http://CRAN.R-project.org/package=mvtnorm, r package version 1.0–5 (2016). Accessed 2 Feb 2016
Gibbons, R.: Game Theory for Applied Economists. Princeton University Press, Princeton (1992)
Ginsbourger, D., Le Riche, R.: Towards Gaussian process-based optimization with finite time horizon. In: mODa9–Advances in Model-Oriented Design and Analysis, Springer, pp. 89–96 (2010)
Gonzalez, J., Osborne, M., Lawrence, N.: Glasses: relieving the myopia of Bayesian optimisation. In: Proceedings of the 19th International Conference on Artificial Intelligence and Statistics, pp. 790–799 (2016)
Gramacy, R.B., Apley, D.W.: Local gaussian process approximation for large computer experiments. J. Comput. Graph. Stat. 24(2), 561–578 (2015)
Gramacy, R.B., Ludkovski, M.: Sequential design for optimal stopping problems. SIAM J. Financ. Math. 6(1), 748–775 (2015)
Habbal, A., Kallel, M.: Neumann–Dirichlet Nash strategies for the solution of elliptic Cauchy problems. SIAM J. Control Optim. 51(5), 4066–4083 (2013). https://doi.org/10.1137/120869808
Habbal, A., Petersson, J., Thellner, M.: Multidisciplinary topology optimization solved as a Nash game. Int. J. Numer. Methods Eng. 61, 949–963 (2004)
Harsanyi, J.C.: Games with randomly disturbed payoffs: a new rationale for mixed-strategy equilibrium points. Int. J. Game Theory 2(1), 1–23 (1973)
Heaton, M. J., Datta, A., Finley, A., Furrer, R., Guhaniyogi, R., Gerber, F., Gramacy, R. B., Hammerling, D., Katzfuss, M., Lindgren, F., et al.: A case study competition among methods for analyzing large spatial data. arXiv preprint arXiv:1710.05013 (2017)
Hecht, F., Pironneau, O., Le Hyaric, A., Ohtsuka, K.: Freefem++ v. 2.11. User's Manual, University of Paris 6 (2010)
Hennig, P., Schuler, C.J.: Entropy search for information-efficient global optimization. J. Mach. Learn. Res. 13, 1809–1837 (2012)
Hernández-Lobato, J.M., Hoffman, M.W., Ghahramani, Z.: Predictive entropy search for efficient global optimization of black-box functions. In: Advances in neural information processing systems, pp. 918–926 (2014)
Hernández-Lobato, J.M., Gelbart, M.A., Adams, R.P., Hoffman, M.W., Ghahramani, Z.: A general framework for constrained bayesian optimization using information-based search. J. Mach. Learn. Res. 17(160), 1–53 (2016)
Hu, J., Wellman, M.P.: Nash q-learning for general-sum stochastic games. J. Mach. Learn. Res. 4, 1039–1069 (2003)
Isaacs, R.: Differential Games. A Mathematical Theory with Applications to Warfare and Pursuit, Control and Optimization. Wiley, New York (1965)
Jala, M., Lévy-Leduc, C., Moulines, É., Conil, E., Wiart, J.: Sequential design of computer experiments for the assessment of fetus exposure to electromagnetic fields. Technometrics 58(1), 30–42 (2016)
Johanson, M., Bowling, M.H.: Data biased robust counter strategies. In: Proceedings of the Twelfth International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 264–271 (2009)
Jones, D.R., Schonlau, M., Welch, W.J.: Efficient global optimization of expensive black-box functions. J. Glob. Optim. 13(4), 455–492 (1998)
Kanzow, C., Steck, D.: Augmented lagrangian methods for the solution of generalized nash equilibrium problems. SIAM J. Optim. 26(4), 2034–2058 (2016)
Lanctot, M., Burch, N., Zinkevich, M., Bowling, M., Gibson, R.G.: No-regret learning in extensive-form games with imperfect recall. In: Proceedings of the 29th International Conference on Machine Learning (ICML-12), pp. 65–72 (2012)
León, E.R., Pape, A.L., Désidéri, J.A., Alfano, D., Costes, M.: Concurrent aerodynamic optimization of rotor blades using a nash game method. J. Am. Helicopter Soc. 61, 1–13 (2014)
Li, S., Başar, T.: Distributed algorithms for the computation of noncooperative equilibria. Autom. J. IFAC 23(4), 523–533 (1987)
Littman, M.L., Stone, P.: A polynomial-time nash equilibrium algorithm for repeated games. Decis. Support Syst. 39(1), 55–66 (2005)
McKay, M.D., Beckman, R.J., Conover, W.J.: Comparison of three methods for selecting values of input variables in the analysis of output from a computer code. Technometrics 21(2), 239–245 (1979)
Mockus, J.: Bayesian Approach to Global Optimization: Theory and Applications. Springer, Berlin (1989)
Neyman, A., Sorin, S.: Stochastic Games and Applications, vol. 570. Springer, Berlin (2003)
Nishimura, R., Hayashi, S., Fukushima, M.: Robust nash equilibria in n-person non-cooperative games: uniqueness and reformulation. Pac. J. Optim. 5(2), 237–259 (2009)
Parr, J. M.: Improvement Criteria for Constraint Handling and Multiobjective Optimization. Ph.D thesis, University of Southampton (2012)
Picheny, V.: A stepwise uncertainty reduction approach to constrained global optimization. In: Proceedings of the 17th International Conference on Artificial Intelligence and Statistics, JMLR W&CP, vol 33, pp. 787–795 (2014)
Picheny, V., Binois, M.: GPGame: solving complex game problems using Gaussian processes. URL http://CRAN.R-project.org/package=GPGame, r package version 0.1.3 (2017)
Plumlee, M.: Fast prediction of deterministic functions using sparse grid experimental designs. J. Am. Stat. Assoc. 109(508), 1581–1591 (2014)
R Core Team (2016) R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna. https://www.R-project.org/. Accessed 15 Mar 2018
Rasmussen, C.E., Williams, C.: Gaussian Processes for Machine Learning. MIT Press. http://www.gaussianprocess.org/gpml/ (2006)
Rosenmüller, J.: On a generalization of the lemke-howson algorithm to noncooperative n-person games. SIAM J. Appl. Math. 21(1), 73–79 (1971)
Roustant, O., Ginsbourger, D., Deville, Y.: DiceKriging, DiceOptim: two R packages for the analysis of computer experiments by kriging-based metamodeling and optimization. J. Stat. Softw. 51(1), 1–55 (2012)
Rullière, D., Durrande, N., Bachoc, F., Chevalier, C.: Nested kriging predictions for datasets with a large number of observations. Stat. Comput. 28, 1–19 (2016)
Scilab Enterprises (2012) Scilab: Free and Open Source Software for Numerical Computation. Scilab Enterprises, Orsay. http://www.scilab.org. Accessed 1 Apr 2015
Shahriari, B., Swersky, K., Wang, Z., Adams, R.P., de Freitas, N.: Taking the human out of the loop: a review of bayesian optimization. Proc. IEEE 104(1), 148–175 (2016)
Shapley, L.S.: Stochastic games. Proc. Natl. Acad. Sci. 39(10), 1095–1100 (1953)
Srinivas, N., Krause, A., Kakade, S.M., Seeger, M.: Information-theoretic regret bounds for gaussian process optimization in the bandit setting. Inf. Theory IEEE Trans. 58(5), 3250–3265 (2012)
Uryas’ev, S., Rubinstein, R.Y.: On relaxation algorithms in computation of noncooperative equilibria. IEEE Trans. Autom. Control 39(6), 1263–1267 (1994)
Villemonteix, J., Vazquez, E., Walter, E.: An informational approach to the global optimization of expensive-to-evaluate functions. J. Glob. Optim. 44(4), 509–534 (2009)
Wagner, T., Emmerich, M., Deutz, A., Ponweiser, W.: On expected-improvement criteria for model-based multi-objective optimization. In: International Conference on Parallel Problem Solving from Nature, Springer, Berlin. pp. 718–727 (2010)
Wang, G., Shan, S.: Review of metamodeling techniques in support of engineering design optimization. J. Mech. Des. 129(4), 370 (2007)
Wilson, A., Nickisch, H.: Kernel interpolation for scalable structured Gaussian processes (kiss-gp). In: International Conference on Machine Learning, pp. 1775–1784 (2015)
Žilinskas, A., Zhigljavsky, A.: Stochastic global optimization: a review on the occasion of 25 years of informatica. Informatica 27(2), 229–256 (2016)

Acknowledgements

The authors acknowledge inspiration from the Lorentz Center Workshop “SAMCO-Surrogate Model Assisted Multicriteria Optimization”, held at Leiden University, Feb 29–March 4, 2016. Mickael Binois is grateful for support from National Science Foundation Grant DMS-1521702.

Author information

Victor Picheny and Mickael Binois contributed equally to this manuscript.

Authors and Affiliations

MIAT, Université de Toulouse, INRA, Castanet-Tolosan, France
Victor Picheny
The University of Chicago Booth School of Business, 5807 S. Woodlawn Ave., Chicago, IL, 60637, USA
Mickael Binois
Inria, CNRS, LJAD, UMR 7351, Université Côte d’Azur, Parc Valrose, Nice, 06108, France
Abderrahmane Habbal

Corresponding author

Correspondence to Victor Picheny.

Appendices

Handling conditional simulations

We detail here how we generate the draws of $\mathbf {Y}| \varvec{\mathcal {F}}_i$ to compute ${\hat{J}}(\mathbf {x})$ in practice. We employ the FOXY (fast update of conditional simulation ensemble) algorithm proposed by Chevalier et al. [9], as detailed below.

Let $\varvec{\mathcal {Y}}_1, \ldots , \varvec{\mathcal {Y}}_M$ be independent draws of $\mathbf {Y}\left( \mathbb {X}\right) $ (each $\varvec{\mathcal {Y}}_i \in \mathbb {R}^{N \times p}$), generated using the posterior Gaussian distribution of Eq. (8), and $\varvec{\mathcal {F}}_1, \ldots , \varvec{\mathcal {F}}_K$ independent (of each other and of the $\varvec{\mathcal {Y}}_i$’s) draws of $\mathbf {Y}(\mathbf {x}) + \varvec{\varepsilon }$ from the posterior Gaussian distribution of Eq. (9). As shown in Chevalier et al. [9], draws of $\mathbf {Y}| \varvec{\mathcal {F}}_i$ can be obtained efficiently from $\varvec{\mathcal {Y}}_1, \ldots , \varvec{\mathcal {Y}}_M$ using:

$$\begin{aligned} \mathcal {Y}_j^{(i)} | \mathcal {F}_k^{(i)} = \mathcal {Y}_j^{(i)} + \varvec{\lambda }^{(i)}(\mathbf {x}) \left( \mathcal {F}_k^{(i)} - \mathcal {Y}_j^{(i)}(\mathbf {x}) \right) , \end{aligned} \tag{25}$$

with $1 \le i \le p$, $1 \le j \le M$, $1 \le k \le K$ and

$$\begin{aligned} \varvec{\lambda }^{(i)}(\mathbf {x}) = \frac{\mathbf {k}_n^{(i)}(\mathbf {x}, \mathbb {X})}{\mathbf {k}_n^{(i)}(\mathbf {x}, \mathbf {x})}. \end{aligned}$$

Notice that $\varvec{\lambda }^{(i)}(\mathbf {x})$ needs to be computed only once and can then be reused for all draws $\mathcal {Y}_j^{(i)}$ and all $\mathcal {F}_k^{(i)}$.
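The update of Eq. (25) can be sketched in a few lines of Python for a single objective $i$. This is a minimal illustration, not the GPGame implementation: the function names (`foxy_weights`, `foxy_update`, `foxy_ensemble`), the plain-list representation of a draw on the grid $\mathbb {X}$, and the argument `x_index` (the position of $\mathbf {x}$ in the grid, used to read $\mathcal {Y}_j^{(i)}(\mathbf {x})$) are all assumptions made for the example.

```python
def foxy_weights(k_xX, k_xx):
    """Weights lambda(x) = k_n(x, X) / k_n(x, x); computed once per candidate x."""
    return [c / k_xx for c in k_xX]


def foxy_update(draw, f_value, weights, x_index):
    """Condition one posterior draw (values on the grid) on the event Y(x) = f_value."""
    residual = f_value - draw[x_index]  # F_k - Y_j(x)
    return [y + w * residual for y, w in zip(draw, weights)]


def foxy_ensemble(draws, f_values, k_xX, k_xx, x_index):
    """Return the M x K ensemble of conditioned draws, reusing the same weights."""
    weights = foxy_weights(k_xX, k_xx)
    return [[foxy_update(d, f, weights, x_index) for f in f_values] for d in draws]
```

Because the weight at the grid position of $\mathbf {x}$ equals one, every conditioned draw takes the value $\mathcal {F}_k^{(i)}$ at $\mathbf {x}$, while the other grid values are shifted in proportion to their posterior covariance with $\mathbf {x}$.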

$C(\mathbf {x})$ formulae

For a given target $T_E \in \mathbb {R}^p$ and $\mathbf {x}\in \mathbb {X}$:

$$\begin{aligned} C_{\text {target}}(\mathbf {x}) = \prod _{i=1}^p \phi \left( \frac{T_{Ei} - \mu _i(\mathbf {x})}{\sigma _i(\mathbf {x})} \right) , \end{aligned} \tag{26}$$

with $\phi $ the probability density function of the standard Gaussian variable.

Let $T_L \in \mathbb {R}^p$ and $T_U \in \mathbb {R}^p$ such that $\forall 1 \le i \le p, T_{Li} < T_{Ui}$ define a box in the objective space. Defining $\varvec{\varPsi } = \left[ \varPsi (\varvec{\mathcal {Y}}_1), \ldots , \varPsi (\varvec{\mathcal {Y}}_M) \right] $ the $p \times M$ matrix of simulated NE, we use:

$$\begin{aligned} \forall 1 \le i \le p \qquad T_{Li} = \min \varvec{\varPsi }_{i, 1 \ldots M} \quad \text { and } \quad T_{Ui} = \max \varvec{\varPsi }_{i, 1 \ldots M}. \end{aligned}$$

Then, the probability of belonging to the box is:

$$\begin{aligned} C_{\text {box}}(\mathbf {x}) = \prod _{i=1}^p \left[ \varPhi \left( \frac{T_{Ui} - \mu _i(\mathbf {x})}{\sigma _i(\mathbf {x})} \right) - \varPhi \left( \frac{T_{Li} - \mu _i(\mathbf {x})}{\sigma _i(\mathbf {x})} \right) \right] , \end{aligned} \tag{27}$$

with $\varPhi $ the cumulative distribution function of the standard Gaussian variable.
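Both criteria can be evaluated with the Python standard library alone. The sketch below is illustrative: the function names are hypothetical, `mu` and `sigma` stand for the posterior means $\mu _i(\mathbf {x})$ and standard deviations $\sigma _i(\mathbf {x})$ of the $p$ objectives at one point $\mathbf {x}$, and the box probability is computed as the usual difference of Gaussian CDFs, with the lower-bound term written as $\varPhi ((T_{Li} - \mu _i)/\sigma _i)$.

```python
import math


def norm_pdf(z):
    """Standard Gaussian probability density function."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def norm_cdf(z):
    """Standard Gaussian cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def c_target(target, mu, sigma):
    """Product over objectives of the standardized density at the target T_E."""
    return math.prod(norm_pdf((t - m) / s) for t, m, s in zip(target, mu, sigma))


def c_box(lower, upper, mu, sigma):
    """Probability that independent Gaussian objectives fall inside [T_L, T_U]."""
    return math.prod(
        norm_cdf((u - m) / s) - norm_cdf((l - m) / s)
        for l, u, m, s in zip(lower, upper, mu, sigma)
    )


def box_from_simulated_ne(psi):
    """Bounds T_L, T_U as row-wise min and max of the p x M matrix of simulated NE."""
    return [min(row) for row in psi], [max(row) for row in psi]
```

A typical use would first call `box_from_simulated_ne` on the simulated equilibria and then evaluate `c_box` at each candidate point with that point's posterior means and standard deviations.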

Solving NEP on GP draws

We detail here a simple algorithm to extract Nash equilibria from GP draws.
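As an illustration of what such an extraction can look like on a discretized strategy space, the following Python sketch enumerates all joint strategies and keeps those from which no player can lower their own objective by deviating unilaterally. It is a generic brute-force best-response check under stated assumptions (each player minimizes, and the joint strategy space is a full-factorial grid); it is not claimed to reproduce the authors' algorithm. The name `pure_nash_equilibria` and the data layout are hypothetical: `costs[i][profile]` would hold the simulated value of objective $i$ at the grid point indexed by `profile`.

```python
from itertools import product


def pure_nash_equilibria(costs, n_actions):
    """Return all pure Nash equilibria of a finite game with minimizing players.

    costs[i] maps a joint action tuple to player i's cost;
    n_actions[i] is the number of actions available to player i.
    """
    equilibria = []
    for profile in product(*(range(n) for n in n_actions)):

        def can_improve(i):
            # True if player i has a unilateral deviation with strictly lower cost.
            current = costs[i][profile]
            return any(
                costs[i][profile[:i] + (a,) + profile[i + 1:]] < current
                for a in range(n_actions[i])
            )

        if not any(can_improve(i) for i in range(len(n_actions))):
            equilibria.append(profile)
    return equilibria
```

On a given draw the returned list may be empty or contain several profiles, so a caller has to decide how to handle both cases.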

Computational time

We report here the computational time required to perform a single iteration of our algorithm for each of the three examples (not including the time required to run the simulation itself). Experiments were run on an Intel® Core™ i7-5600U CPU at 2.60 GHz with 4 × 8 GB of RAM.

Table 3 Average CPU times required for one iteration of the GP-based algorithm on the different test problems

About this article

Cite this article

Picheny, V., Binois, M. & Habbal, A. A Bayesian optimization approach to find Nash equilibria. J Glob Optim 73, 171–192 (2019). https://doi.org/10.1007/s10898-018-0688-0

Received: 19 September 2017
Accepted: 04 July 2018
Published: 12 July 2018
Issue Date: 15 January 2019
DOI: https://doi.org/10.1007/s10898-018-0688-0
