
A multiagent reinforcement learning algorithm by dynamically merging Markov decision processes

Authors: Mohammad Ghavamzadeh, Sridhar Mahadevan

AAMAS '02: Proceedings of the first international joint conference on Autonomous agents and multiagent systems: part 2

Pages 845 - 846

https://doi.org/10.1145/544862.544940

Published: 15 July 2002

Abstract

One general strategy for accelerating the learning of cooperative multiagent problems is to reuse good or optimal solutions to the task when each agent is acting alone. In this paper, we formalize this approach as dynamically merging solutions to multiple Markov decision processes (MDPs), each representing an individual agent's solution when acting alone, to obtain solutions to the overall multiagent MDP when all the agents act together. We present a new learning algorithm called MAPLE (MultiAgent Policy LEarning) that uses Q-learning and dynamic merging to efficiently construct global solutions to the overall multiagent problem from solutions to the individual MDPs. We illustrate the efficiency of MAPLE by comparing its performance with standard Q-learning applied to the overall multiagent MDP.
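The abstract does not spell out MAPLE's update rule, so the following is only a rough sketch of the general idea it describes: reuse single-agent solutions to give Q-learning on the joint multiagent MDP a head start. It assumes tabular Q-values and, purely for illustration, seeds each joint Q-value with the sum of the agents' individual Q-values. That seeding heuristic and the names `merged_initial_q` and `q_update` are hypothetical and are not taken from the paper.

```python
from collections import defaultdict
from itertools import product


def merged_initial_q(individual_qs, joint_states, joint_actions):
    """Seed a joint Q-table from per-agent Q-tables.

    individual_qs: one dict per agent mapping (state, action) -> value.
    joint_states / joint_actions: tuples with one entry per agent.
    Assumption (not from the paper): the initial joint value is the
    sum of the agents' single-agent values.
    """
    Q = defaultdict(float)
    for js, ja in product(joint_states, joint_actions):
        Q[(js, ja)] = sum(
            q.get((s, a), 0.0) for q, s, a in zip(individual_qs, js, ja)
        )
    return Q


def q_update(Q, s, a, r, s_next, actions, alpha=0.1, gamma=0.9, terminal=False):
    """Standard tabular Q-learning backup applied to the joint MDP."""
    best_next = 0.0 if terminal else max(Q[(s_next, b)] for b in actions)
    Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])


# Toy example: two agents, one local state each, two local actions.
q_agent1 = {(0, "go"): 1.0, (0, "stay"): 0.5}
q_agent2 = {(0, "go"): 2.0, (0, "stay"): 0.0}
joint_states = [(0, 0)]
joint_actions = list(product(["stay", "go"], repeat=2))

Q = merged_initial_q([q_agent1, q_agent2], joint_states, joint_actions)
# One learning step on the joint MDP, starting from the merged estimate.
q_update(Q, (0, 0), ("stay", "stay"), r=1.0, s_next=(0, 0),
         actions=joint_actions, alpha=0.5, gamma=0.9)
```

The baseline mentioned in the abstract, standard Q-learning on the overall multiagent MDP, would correspond here to starting from an all-zero table and calling only `q_update`.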

Index Terms

A multiagent reinforcement learning algorithm by dynamically merging Markov decision processes
1. Computing methodologies
  1. Artificial intelligence
    1. Distributed artificial intelligence
      1. Multi-agent systems
    2. Philosophical/theoretical foundations of artificial intelligence

Information & Contributors

Information

Published In

AAMAS '02: Proceedings of the first international joint conference on Autonomous agents and multiagent systems: part 2

July 2002

508 pages

ISBN:1581134800

DOI:10.1145/544862

Conference Chairs: Maria Gini (University of Minnesota, USA), Toru Ishida (Kyoto University, Japan)

Program Chairs: Cristiano Castelfranchi (CNR and Università di Siena, Italy), W. Lewis Johnson (University of Southern California, USA)

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Article

Conference

AAMAS02

Sponsor:

AAMAS02: The First International Joint Conference on Autonomous Agents and Multi-Agent Systems (formerly known as Autonomous Agents)

July 15 - 19, 2002

Bologna, Italy

Acceptance Rates

Overall Acceptance Rate 1,155 of 5,036 submissions, 23%

Bibliometrics

Total Citations: 7
Total Downloads: 382
Downloads (Last 12 months): 1
Downloads (Last 6 weeks): 0

Reflects downloads up to 26 Jul 2024
Cited By

Wang S (2021). Taxi scheduling research based on Q-learning. 2021 3rd International Conference on Machine Learning, Big Data and Business Intelligence (MLBDBI), pp. 700-703. Online publication date: Dec-2021.
https://doi.org/10.1109/MLBDBI54094.2021.00138

Han M, Senellart P, Bressan S, Wu H, Mukhopadhyay S, Zhai C, Bertino E, Crestani F, Mostafa J, Tang J, Si L, Zhou X, Chang Y, Li Y, Sondhi P (2016). Routing an Autonomous Taxi with Reinforcement Learning. Proceedings of the 25th ACM International on Conference on Information and Knowledge Management, pp. 2421-2424. Online publication date: 24-Oct-2016.
https://dl.acm.org/doi/10.1145/2983323.2983379

Mousavi S, Ghazanfari B, Mozayani N, Jahed-Motlagh M (2014). Automatic abstraction controller in reinforcement learning agent via automata. Applied Soft Computing 25:C, pp. 118-128. Online publication date: 1-Dec-2014.
https://dl.acm.org/doi/10.1016/j.asoc.2014.08.071

Bahuguna J, Ravindran B, Madhava Krishna K (2009). MDP based active localization for multiple robots. 2009 IEEE International Conference on Automation Science and Engineering, pp. 635-640. Online publication date: Aug-2009.
https://doi.org/10.1109/COASE.2009.5234142

Iwata K, Ikeda K, Sakai H (2006). A Statistical Property of Multiagent Learning Based on Markov Decision Process. IEEE Transactions on Neural Networks 17:4, pp. 829-842. Online publication date: Jul-2006.
https://doi.org/10.1109/TNN.2006.875990

Zhang H, Huang S (2004). Merging Individually Learned Optimal Results to Accelerate Coordination. Advances in Web-Age Information Management, pp. 628-633. Online publication date: 2004.
https://doi.org/10.1007/978-3-540-27772-9_64

Becker R, Zilberstein S, Lesser V, Goldman C, Rosenschein J, Wooldridge M, Sandholm T, Yokoo M (2003). Transition-independent decentralized Markov decision processes. Proceedings of the second international joint conference on Autonomous agents and multiagent systems, pp. 41-48. Online publication date: 14-Jul-2003.
https://dl.acm.org/doi/10.1145/860575.860583
