
Learning Density-Based Correlated Equilibria for Markov Games

Published: 30 May 2023

Abstract

Correlated Equilibrium (CE) is a well-established solution concept that captures coordination among agents and enjoys good algorithmic properties. In real-world multi-agent systems, in addition to being in an equilibrium, agents' policies are often expected to meet requirements with respect to safety and fairness. Such additional requirements can often be expressed in terms of the state density, which measures the state-visitation frequencies during the course of a game. However, existing CE notions and CE-finding approaches cannot explicitly specify a CE with particular properties concerning state density; they do so implicitly, either by modifying reward functions or by using value functions as the selection criteria. The resulting CE may thus not fully fulfil the state-density requirements. In this paper, we propose Density-Based Correlated Equilibria (DBCE), a new notion of CE that explicitly takes state density as the selection criterion. Concretely, we instantiate DBCE by specifying different state-density requirements motivated by real-world applications. To compute DBCE, we put forward the Density-Based Correlated Policy Iteration algorithm for the underlying control problem. We perform experiments on various games; the results demonstrate the advantage of our CE-finding approach over existing methods in scenarios with state-density concerns.
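To illustrate the selection idea, below is a minimal sketch (not the paper's algorithm) of choosing a correlated equilibrium by linear programming once a density-style criterion is attached to the standard CE incentive constraints. It uses a one-shot two-player game of chicken rather than a Markov game, and the objective simply penalises probability mass on a designated "unsafe" joint action; the payoffs, the unsafe set, and the use of scipy.optimize.linprog are illustrative assumptions.

```python
# A minimal sketch, not the paper's algorithm: selecting a correlated
# equilibrium (CE) of a one-shot two-player game by linear programming,
# where the LP objective stands in for a density-style selection criterion.
# Payoffs, the "unsafe" joint action, and the use of scipy.optimize.linprog
# are illustrative assumptions.
import numpy as np
from scipy.optimize import linprog

# Payoff matrices for a game of "chicken": action 0 = yield, action 1 = dare.
# A[i, j] is the row player's payoff, B[i, j] the column player's.
A = np.array([[4.0, 1.0],
              [5.0, 0.0]])
B = A.T
n, m = A.shape

def idx(i, j):
    # Position of p[i, j] in the flattened (row-major) probability vector.
    return i * m + j

ineqs = []  # rows of A_ub, encoding "A_ub @ p <= 0"
# Row player: deviating from the recommended action i to i' must not pay off.
for i in range(n):
    for i_alt in range(n):
        if i_alt == i:
            continue
        row = np.zeros(n * m)
        for j in range(m):
            row[idx(i, j)] = A[i_alt, j] - A[i, j]
        ineqs.append(row)
# Column player: symmetric incentive constraints.
for j in range(m):
    for j_alt in range(m):
        if j_alt == j:
            continue
        row = np.zeros(n * m)
        for i in range(n):
            row[idx(i, j)] = B[i, j_alt] - B[i, j]
        ineqs.append(row)

A_ub = np.array(ineqs)
b_ub = np.zeros(len(ineqs))
A_eq = np.ones((1, n * m))  # probabilities sum to one
b_eq = np.array([1.0])

# Density-style selection proxy: minimise the mass placed on the mutually
# unsafe joint action (dare, dare).
c = np.zeros(n * m)
c[idx(1, 1)] = 1.0

res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq,
              bounds=[(0.0, 1.0)] * (n * m))
print(res.x.reshape(n, m))  # a CE with no mass on (dare, dare)
```

For this game the solver returns a joint-action distribution that places no probability on the mutually unsafe (dare, dare) outcome while still satisfying both players' incentive constraints. In the paper's setting the analogous selection operates on state densities of a Markov game rather than on a one-shot joint-action distribution.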



Published In

AAMAS '23: Proceedings of the 2023 International Conference on Autonomous Agents and Multiagent Systems
May 2023
3131 pages
ISBN:9781450394321
  • General Chairs: Noa Agmon, Bo An
  • Program Chairs: Alessandro Ricci, William Yeoh

Publisher

International Foundation for Autonomous Agents and Multiagent Systems

Richland, SC

Publication History

Published: 30 May 2023


Author Tags

  1. Markov games
  2. correlated equilibrium
  3. state density

Qualifiers

  • Research-article

Conference

AAMAS '23

Acceptance Rates

Overall Acceptance Rate 1,155 of 5,036 submissions, 23%

