DOI: 10.1145/375735.376302
Article

Hierarchical multi-agent reinforcement learning

Published: 28 May 2001
    Abstract

    In this paper we investigate the use of hierarchical reinforcement learning to speed up the acquisition of cooperative multi-agent tasks. We extend the MAXQ framework to the multi-agent case: each agent uses the same MAXQ hierarchy to decompose a task into subtasks. Learning is decentralized, with each agent learning three interrelated skills: how to perform subtasks, the order in which to do them, and how to coordinate with other agents. Coordination skills are learned by using joint actions at the highest level(s) of the hierarchy: the Q nodes at those levels are configured to represent the joint task-action space among the agents. In this approach, each agent knows only what the other agents are doing at the level of subtasks and is unaware of their lower-level (primitive) actions. This hierarchical approach allows agents to learn coordination faster by sharing information at the level of subtasks, rather than attempting to learn coordination over primitive joint state-action values. We apply this hierarchical multi-agent reinforcement learning algorithm to a complex AGV (automated guided vehicle) scheduling task and compare its performance and learning speed with other approaches, including flat multi-agent learning, a single agent using MAXQ, selfish multiple agents using MAXQ (where each agent acts independently, without communicating with the others), and several well-known AGV heuristics such as "first come first serve", "highest queue first", and "nearest station first". We also compare the tradeoff between learning speed and performance when joint action values are modeled at multiple levels of the MAXQ hierarchy.
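
    The coordination scheme described above lends itself to a short sketch. The following Python fragment is a minimal illustration, not the authors' implementation: it uses plain tabular Q-learning at each of two levels rather than the full MAXQ value-function decomposition (which maintains separate subtask-value and completion terms), and every name in it (CooperativeMAXQAgent, q_root, q_sub) is invented for this example. What it does capture is the paper's coordination idea: the root-level Q-table is indexed by the subtasks of the other agents, while the subtask-level tables are purely local.

    ```python
    import random
    from collections import defaultdict

    class CooperativeMAXQAgent:
        """Illustrative two-level agent: joint task-action values at the
        root, ordinary single-agent Q-values inside each subtask."""

        def __init__(self, subtasks, actions, alpha=0.1, gamma=0.95, eps=0.1):
            self.subtasks = subtasks      # high-level subtask labels
            self.actions = actions        # primitive actions
            self.alpha, self.gamma, self.eps = alpha, gamma, eps
            # Root level: (state, own subtask, other agents' subtasks) -> value.
            # Conditioning on the other agents' *subtasks*, never their
            # primitive actions, is the coordination mechanism sketched here.
            self.q_root = defaultdict(float)
            # Subtask level: (subtask, state, primitive action) -> value,
            # learned locally with no information about other agents.
            self.q_sub = defaultdict(float)

        def choose_subtask(self, state, others_subtasks):
            """Epsilon-greedy choice over the joint task-action values."""
            if random.random() < self.eps:
                return random.choice(self.subtasks)
            return max(self.subtasks,
                       key=lambda t: self.q_root[(state, t, others_subtasks)])

        def choose_action(self, subtask, state):
            """Epsilon-greedy choice among a subtask's primitive actions."""
            if random.random() < self.eps:
                return random.choice(self.actions)
            return max(self.actions,
                       key=lambda a: self.q_sub[(subtask, state, a)])

        def update_root(self, state, subtask, others, reward,
                        next_state, next_others, n_steps):
            """SMDP-style update: the subtask ran for n_steps primitive
            steps, so the future term is discounted by gamma ** n_steps."""
            best_next = max(self.q_root[(next_state, t, next_others)]
                            for t in self.subtasks)
            key = (state, subtask, others)
            target = reward + (self.gamma ** n_steps) * best_next
            self.q_root[key] += self.alpha * (target - self.q_root[key])

        def update_sub(self, subtask, state, action, reward, next_state):
            """One-step Q-learning update inside a subtask."""
            best_next = max(self.q_sub[(subtask, next_state, a)]
                            for a in self.actions)
            key = (subtask, state, action)
            target = reward + self.gamma * best_next
            self.q_sub[key] += self.alpha * (target - self.q_sub[key])
    ```

    Here others_subtasks would be a tuple of subtask labels exchanged between agents whenever a subtask begins or terminates, so coordination information flows only at subtask boundaries rather than at every primitive step; this is what keeps the joint space at the top of the hierarchy tractable.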



    Information

    Published In

    AGENTS '01: Proceedings of the Fifth International Conference on Autonomous Agents
    May 2001
    662 pages
    ISBN: 158113326X
    DOI: 10.1145/375735
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

    Publisher

    Association for Computing Machinery

    New York, NY, United States


    Qualifiers

    • Article

    Conference

    AGENTS '01: Autonomous Agents 2001
    Montreal, Quebec, Canada

    Acceptance Rates

    AGENTS '01 paper acceptance rate: 66 of 248 submissions (27%).
    Overall acceptance rate: 182 of 599 submissions (30%).

    Article Metrics

    • Downloads (last 12 months): 166
    • Downloads (last 6 weeks): 19

    Reflects downloads up to 26 Jul 2024.

    Cited By

    • (2023) "Feudal Latent Space Exploration for Coordinated Multi-Agent Reinforcement Learning." IEEE Transactions on Neural Networks and Learning Systems, 34(10):7775-7783. DOI: 10.1109/TNNLS.2022.3146201. Online publication date: Oct 2023.
    • (2023) "Reinforcement Learning for Multiaircraft Autonomous Air Combat in Multisensor UCAV Platform." IEEE Sensors Journal, 23(18):20596-20606. DOI: 10.1109/JSEN.2022.3220324. Online publication date: 15 Sep 2023.
    • (2023) "Asynchronous, Option-Based Multi-Agent Policy Gradient: A Conditional Reasoning Approach." 2023 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 7348-7353. DOI: 10.1109/IROS55552.2023.10342281. Online publication date: 1 Oct 2023.
    • (2023) "Deep reinforcement learning in recommender systems: A survey and new perspectives." Knowledge-Based Systems, 264:110335. DOI: 10.1016/j.knosys.2023.110335. Online publication date: Mar 2023.
    • (2023) "Smart mobile robot fleet management based on hierarchical multi-agent deep Q network towards intelligent manufacturing." Engineering Applications of Artificial Intelligence, 124:106534. DOI: 10.1016/j.engappai.2023.106534. Online publication date: Oct 2023.
    • (2023) "Robot Subgoal-guided Navigation in Dynamic Crowded Environments with Hierarchical Deep Reinforcement Learning." International Journal of Control, Automation and Systems, 21(7):2350-2362. DOI: 10.1007/s12555-022-0171-z. Online publication date: 5 May 2023.
    • (2023) "Reinforcement Learning for Multi-Agent Stochastic Resource Collection." Machine Learning and Knowledge Discovery in Databases, 200-215. DOI: 10.1007/978-3-031-26412-2_13. Online publication date: 17 Mar 2023.
    • (2022) "E-MAPP." Proceedings of the 36th International Conference on Neural Information Processing Systems, 12154-12168. DOI: 10.5555/3600270.3601153. Online publication date: 28 Nov 2022.
    • (2022) "An In-Depth Analysis of Cooperative Multi-Robot Hierarchical Reinforcement Learning." Proceedings of the 7th International Conference on Sustainable Information Engineering and Technology, 119-126. DOI: 10.1145/3568231.3568258. Online publication date: 22 Nov 2022.
    • (2022) "Hierarchical Multiagent Reinforcement Learning for Allocating Guaranteed Display Ads." IEEE Transactions on Neural Networks and Learning Systems, 33(10):5361-5373. DOI: 10.1109/TNNLS.2021.3070484. Online publication date: Oct 2022.
