
Learning Density-Based Correlated Equilibria for Markov Games

Published: 30 May 2023

Abstract

Correlated Equilibrium (CE) is a well-established solution concept that captures coordination among agents and enjoys good algorithmic properties. In real-world multi-agent systems, in addition to being in an equilibrium, agents' policies are often expected to meet requirements with respect to safety and fairness. Such additional requirements can often be expressed in terms of the state density, which measures the state-visitation frequencies during the course of a game. However, existing CE notions and CE-finding approaches cannot explicitly specify a CE with particular properties concerning state density; they do so implicitly, either by modifying reward functions or by using value functions as the selection criteria. The resulting CE may thus not fully fulfil the state-density requirements. In this paper, we propose Density-Based Correlated Equilibria (DBCE), a new notion of CE that explicitly takes state density as the selection criterion. Concretely, we instantiate DBCE by specifying different state-density requirements motivated by real-world applications. To compute DBCE, we put forward the Density-Based Correlated Policy Iteration algorithm for the underlying control problem. We perform experiments on various games; the results demonstrate the advantage of our CE-finding approach over existing methods in scenarios with state-density concerns.
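To illustrate the selection idea, below is a minimal sketch (not the paper's algorithm) of choosing a correlated equilibrium by linear programming once a density-style criterion is attached to the standard CE incentive constraints. It uses a one-shot two-player game of chicken rather than a Markov game, and the objective simply penalises probability mass on a designated "unsafe" joint action; the payoffs, the unsafe set, and the use of scipy.optimize.linprog are illustrative assumptions.

```python
# A minimal sketch, not the paper's algorithm: selecting a correlated
# equilibrium (CE) of a one-shot two-player game by linear programming,
# where the LP objective stands in for a density-style selection criterion.
# Payoffs, the "unsafe" joint action, and the use of scipy.optimize.linprog
# are illustrative assumptions.
import numpy as np
from scipy.optimize import linprog

# Payoff matrices for a game of "chicken": action 0 = yield, action 1 = dare.
# A[i, j] is the row player's payoff, B[i, j] the column player's.
A = np.array([[4.0, 1.0],
              [5.0, 0.0]])
B = A.T
n, m = A.shape

def idx(i, j):
    # Position of p[i, j] in the flattened (row-major) probability vector.
    return i * m + j

ineqs = []  # rows of A_ub, encoding "A_ub @ p <= 0"
# Row player: deviating from the recommended action i to i' must not pay off.
for i in range(n):
    for i_alt in range(n):
        if i_alt == i:
            continue
        row = np.zeros(n * m)
        for j in range(m):
            row[idx(i, j)] = A[i_alt, j] - A[i, j]
        ineqs.append(row)
# Column player: symmetric incentive constraints.
for j in range(m):
    for j_alt in range(m):
        if j_alt == j:
            continue
        row = np.zeros(n * m)
        for i in range(n):
            row[idx(i, j)] = B[i, j_alt] - B[i, j]
        ineqs.append(row)

A_ub = np.array(ineqs)
b_ub = np.zeros(len(ineqs))
A_eq = np.ones((1, n * m))  # probabilities sum to one
b_eq = np.array([1.0])

# Density-style selection proxy: minimise the mass placed on the mutually
# unsafe joint action (dare, dare).
c = np.zeros(n * m)
c[idx(1, 1)] = 1.0

res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq,
              bounds=[(0.0, 1.0)] * (n * m))
print(res.x.reshape(n, m))  # a CE with no mass on (dare, dare)
```

For this game the solver returns a joint-action distribution that places no probability on the mutually unsafe (dare, dare) outcome while still satisfying both players' incentive constraints. In the paper's setting the analogous selection operates on state densities of a Markov game rather than on a one-shot joint-action distribution.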



Published In

AAMAS '23: Proceedings of the 2023 International Conference on Autonomous Agents and Multiagent Systems
May 2023
3131 pages
ISBN:9781450394321
  • General Chairs: Noa Agmon, Bo An
  • Program Chairs: Alessandro Ricci, William Yeoh

Publisher

International Foundation for Autonomous Agents and Multiagent Systems

Richland, SC

Publication History

Published: 30 May 2023


Author Tags

  1. Markov games
  2. correlated equilibrium
  3. state density

Qualifiers

  • Research-article

Conference

AAMAS '23

Acceptance Rates

Overall Acceptance Rate 1,155 of 5,036 submissions, 23%

