
Value-function reinforcement learning in Markov games

Author:

Michael L. Littman

Cognitive Systems Research, Volume 2, Issue 1

Pages 55 - 66

https://doi.org/10.1016/S1389-0417(01)00015-8

Published: 01 April 2001

Abstract

Markov games are a model of multiagent environments that is convenient for studying multiagent reinforcement learning. This paper describes a set of reinforcement-learning algorithms based on estimating value functions and presents convergence theorems for these algorithms. Its main contribution is to present the convergence theorems in a way that makes it easy to reason about the behavior of simultaneous learners in a shared environment.






Published In


Cognitive Systems Research Volume 2, Issue 1

April, 2001

93 pages

ISSN:1389-0417


Copyright © 2001 Elsevier Science B.V.

Publisher

Elsevier Science Publishers B. V.

Netherlands



Bibliometrics

Total citations: 75
Total downloads: 0
Downloads (last 12 months): 0
Downloads (last 6 weeks): 0

Reflects downloads up to 16 Jan 2025

Cited By

Lu P, Zhang L, Liu M, Sridhar K, Sokolsky O, Kong F, Lee I (2024). Recovery from Adversarial Attacks in Cyber-physical Systems: Shallow, Deep, and Exploratory Works. ACM Computing Surveys 56:8 (1-31). Online publication date: 26-Apr-2024.
https://dl.acm.org/doi/10.1145/3653974
Zhao Z, Zhang Y, Wang S, Zhang F, Zhang M, Chen W (2024). QDAP. Knowledge-Based Systems 294:C. Online publication date: 21-Jun-2024.
https://dl.acm.org/doi/10.1016/j.knosys.2024.111719
Qiao W, Huang M, Gao Z, Wang X (2024). Distributed dynamic pricing of multiple perishable products using multi-agent reinforcement learning. Expert Systems with Applications: An International Journal 237:PA. Online publication date: 27-Feb-2024.
https://dl.acm.org/doi/10.1016/j.eswa.2023.121252
Xie Q, Chen Y, Wang Z, Yang Z (2023). Learning Zero-Sum Simultaneous-Move Markov Games Using Function Approximation and Correlated Equilibrium. Mathematics of Operations Research 48:1 (433-462). Online publication date: 1-Feb-2023.
https://dl.acm.org/doi/10.1287/moor.2022.1268
Yu Y, Yang W, Ding W, Zhou J (2023). Reinforcement Learning Solution for Cyber-Physical Systems Security Against Replay Attacks. IEEE Transactions on Information Forensics and Security 18 (2583-2595). Online publication date: 1-Jan-2023.
https://dl.acm.org/doi/10.1109/TIFS.2023.3268532
Amhraoui E, Masrour T (2023). Smooth Q-Learning: An Algorithm for Independent Learners in Stochastic Cooperative Markov Games. Journal of Intelligent and Robotic Systems 108:4. Online publication date: 18-Jul-2023.
https://dl.acm.org/doi/10.1007/s10846-023-01917-z
Qu G, Wierman A, Li N (2022). Scalable Reinforcement Learning for Multiagent Networked Systems. Operations Research 70:6 (3601-3628). Online publication date: 1-Nov-2022.
https://dl.acm.org/doi/10.1287/opre.2021.2226
Chen X, Li Z, Di X (2022). Social Learning In Markov Games: Empowering Autonomous Driving. 2022 IEEE Intelligent Vehicles Symposium (IV) (478-483). Online publication date: 4-Jun-2022.
https://dl.acm.org/doi/10.1109/IV51971.2022.9827289
Schmidt L, Brosig J, Plinge A, Eskofier B, Mutschler C (2022). An Introduction to Multi-Agent Reinforcement Learning and Review of its Application to Autonomous Mobility. 2022 IEEE 25th International Conference on Intelligent Transportation Systems (ITSC) (1342-1349). Online publication date: 8-Oct-2022.
https://dl.acm.org/doi/10.1109/ITSC55140.2022.9922205
Lin Y, Zhang K, Yang Z, Wang Z, Başar T, Sandhu R, Liu J (2022). A Communication-Efficient Multi-Agent Actor-Critic Algorithm for Distributed Reinforcement Learning. 2019 IEEE 58th Conference on Decision and Control (CDC) (5562-5567). Online publication date: 28-Dec-2022.
https://dl.acm.org/doi/10.1109/CDC40024.2019.9029257
