DOI: 10.5555/3463952.3464191
extended-abstract

Approximate Difference Rewards for Scalable Multiagent Reinforcement Learning

Published: 03 May 2021

Abstract

    We address the problem of multiagent credit assignment in a large-scale multiagent system. Difference rewards (DRs) are an effective tool for tackling this problem, but computing them exactly is known to be challenging even for a small number of agents. We propose a scalable method for computing difference rewards from aggregate information in a multiagent system with a large number of agents, exploiting the symmetry present in several practical applications. Empirical evaluation on two multiagent domains, air-traffic control and cooperative navigation, shows better solution quality than previous approaches.
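To make the idea of a difference reward concrete, here is a minimal sketch using the standard definition D_i = G(z) - G(z_{-i}): the team reward minus the team reward with agent i removed. It is not the method proposed in this abstract. The congestion-style `global_reward`, the `capacity` parameter, and the "remove the agent" counterfactual are all assumptions chosen for illustration. The sketch only shows why symmetry helps: when the team reward depends on agent counts per zone, agents in the same zone share the same difference reward, so one counterfactual evaluation per occupied zone is enough.

```python
from collections import Counter


def global_reward(counts, capacity=2):
    """Hypothetical team reward: quadratic penalty for agents above capacity in a zone."""
    return -sum(max(0, n - capacity) ** 2 for n in counts.values())


def difference_rewards(zones, capacity=2):
    """Return D_i = G(z) - G(z_{-i}) for every agent, evaluated once per occupied zone."""
    counts = Counter(zones)
    g = global_reward(counts, capacity)
    dr_by_zone = {}
    for zone, n in counts.items():
        without = dict(counts)
        without[zone] = n - 1  # assumed counterfactual: one agent removed from this zone
        dr_by_zone[zone] = g - global_reward(without, capacity)
    # Agents in the same zone are interchangeable, so they share one value.
    return [dr_by_zone[z] for z in zones]


zones = ["A", "A", "A", "B"]  # zone occupied by each of four agents
rewards = difference_rewards(zones)
print(rewards)  # [-1, -1, -1, 0]
```

In this toy setting the three agents crowding zone A each receive a negative difference reward, while the agent in zone B receives zero, since removing it would not change the team reward.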


    Cited By

    • (2021) Embeddings between state and action labeled probabilistic systems. Proceedings of the 36th Annual ACM Symposium on Applied Computing, 1759-1767. DOI: 10.1145/3412841.3442048. Online publication date: 22-Mar-2021

    Published In

    AAMAS '21: Proceedings of the 20th International Conference on Autonomous Agents and MultiAgent Systems
    May 2021
    1899 pages
    ISBN:9781450383073

    Publisher

    International Foundation for Autonomous Agents and Multiagent Systems

    Richland, SC

    Author Tags

    1. multiagent systems
    2. reinforcement learning

    Qualifiers

    • Extended-abstract

    Conference

    AAMAS '21

    Acceptance Rates

    Overall Acceptance Rate 1,155 of 5,036 submissions, 23%
