Value-function reinforcement learning in Markov games

Published: 01 April 2001

Abstract

Markov games are a model of multiagent environments that is convenient for studying multiagent reinforcement learning. This paper describes a set of reinforcement-learning algorithms based on estimating value functions and presents convergence theorems for these algorithms. The main contribution of this paper is that it presents the convergence theorems in a way that makes it easy to reason about the behavior of simultaneous learners in a shared environment.
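
As a rough illustration of what "estimating value functions" can look like in a two-player Markov game, the sketch below performs a temporal-difference update on a joint-action Q-table. It is a minimal sketch under stated assumptions, not the paper's algorithms: the function names, the table layout, and the step-size and discount defaults are all hypothetical, and the adversarial backup is restricted to pure strategies (the full minimax value over mixed strategies would require solving a linear program).

```python
from collections import defaultdict
from functools import partial


def friend_value(q_s):
    """Cooperative backup: value of the best joint action in a state.

    Returns 0.0 for a state with no recorded joint actions.
    """
    return max(q_s.values()) if q_s else 0.0


def pure_maximin_value(q_s, actions_a, actions_b):
    """Adversarial backup over pure strategies only (a simplifying assumption).

    Player A picks the action whose worst-case payoff against B is largest.
    """
    return max(
        min(q_s.get((a, b), 0.0) for b in actions_b)
        for a in actions_a
    )


def q_update(Q, s, a, b, r, s_next, value_fn, alpha=0.1, gamma=0.9):
    """One temporal-difference update of Q[s][(a, b)].

    value_fn maps the next state's joint-action values to a scalar state value.
    """
    target = r + gamma * value_fn(Q[s_next])
    Q[s][(a, b)] += alpha * (target - Q[s][(a, b)])
    return Q[s][(a, b)]


# Hypothetical usage: states are strings, each player has actions {0, 1}.
Q = defaultdict(lambda: defaultdict(float))
updated = q_update(Q, "s0", a=0, b=1, r=1.0, s_next="s1", value_fn=friend_value)

# Swapping the backup operator changes the kind of game being modelled.
foe_backup = partial(pure_maximin_value, actions_a=[0, 1], actions_b=[0, 1])
stage_game = {(0, 0): 1.0, (0, 1): -1.0, (1, 0): 0.0, (1, 1): 0.5}
stage_value = foe_backup(stage_game)
```

Keeping the backup operator as a parameter is a design choice made for this sketch: the update rule stays fixed while the notion of "state value" varies with the assumed relationship between the players.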

    Published In

    Cognitive Systems Research, Volume 2, Issue 1
    April 2001
    93 pages

    Publisher

    Elsevier Science Publishers B. V.

    Netherlands

    Author Tags

    1. Game theory
    2. Markov games
    3. Nash equilibria
    4. Q-learning
    5. Reinforcement learning
    6. Temporal difference learning
    7. Value functions

    Qualifiers

    • Article

    Cited By

    • (2024) Recovery from Adversarial Attacks in Cyber-physical Systems: Shallow, Deep, and Exploratory Works. ACM Computing Surveys 56:8 (1-31). DOI: 10.1145/3653974. Online publication date: 26-Apr-2024
    • (2024) QDAP. Knowledge-Based Systems 294:C. DOI: 10.1016/j.knosys.2024.111719. Online publication date: 21-Jun-2024
    • (2024) Distributed dynamic pricing of multiple perishable products using multi-agent reinforcement learning. Expert Systems with Applications: An International Journal 237:PA. DOI: 10.1016/j.eswa.2023.121252. Online publication date: 27-Feb-2024
    • (2023) Learning Zero-Sum Simultaneous-Move Markov Games Using Function Approximation and Correlated Equilibrium. Mathematics of Operations Research 48:1 (433-462). DOI: 10.1287/moor.2022.1268. Online publication date: 1-Feb-2023
    • (2023) Reinforcement Learning Solution for Cyber-Physical Systems Security Against Replay Attacks. IEEE Transactions on Information Forensics and Security 18 (2583-2595). DOI: 10.1109/TIFS.2023.3268532. Online publication date: 1-Jan-2023
    • (2023) Smooth Q-Learning: An Algorithm for Independent Learners in Stochastic Cooperative Markov Games. Journal of Intelligent and Robotic Systems 108:4. DOI: 10.1007/s10846-023-01917-z. Online publication date: 18-Jul-2023
    • (2022) Scalable Reinforcement Learning for Multiagent Networked Systems. Operations Research 70:6 (3601-3628). DOI: 10.1287/opre.2021.2226. Online publication date: 1-Nov-2022
    • (2022) Social Learning In Markov Games: Empowering Autonomous Driving. 2022 IEEE Intelligent Vehicles Symposium (IV) (478-483). DOI: 10.1109/IV51971.2022.9827289. Online publication date: 4-Jun-2022
    • (2022) An Introduction to Multi-Agent Reinforcement Learning and Review of its Application to Autonomous Mobility. 2022 IEEE 25th International Conference on Intelligent Transportation Systems (ITSC) (1342-1349). DOI: 10.1109/ITSC55140.2022.9922205. Online publication date: 8-Oct-2022
    • (2022) A Communication-Efficient Multi-Agent Actor-Critic Algorithm for Distributed Reinforcement Learning*. 2019 IEEE 58th Conference on Decision and Control (CDC) (5562-5567). DOI: 10.1109/CDC40024.2019.9029257. Online publication date: 28-Dec-2022
