DOI: 10.1145/860575.860583

Transition-independent decentralized Markov decision processes

Published: 14 July 2003

    Abstract

    There has been substantial progress with formal models for sequential decision making by individual agents using the Markov decision process (MDP). However, similar treatment of multi-agent systems is lacking. A recent complexity result, showing that solving decentralized MDPs is NEXP-hard, provides a partial explanation. To overcome this complexity barrier, we identify a general class of transition-independent decentralized MDPs that is widely applicable. The class consists of independent collaborating agents that are tied together through a global reward function that depends upon both of their histories. We present a novel algorithm for solving this class of problems and examine its properties. The result is the first effective technique to solve optimally a class of decentralized MDPs. This lays the foundation for further work in this area on both exact and approximate solutions.
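
    The structure described in the abstract can be pictured with a small sketch. This is an illustration under stated assumptions, not the paper's formal model or its algorithm: the two local MDPs, their state and action names, and every probability and reward value below are hypothetical, and the reward here depends only on the joint next state rather than on full histories. It shows the two defining ideas: the joint transition probability factors into a product of each agent's local transition, while a joint reward term couples the agents.

```python
from itertools import product

# Hypothetical local MDPs (all names and numbers are illustrative assumptions).
# P[state][action] maps next_state -> probability for one agent.
P1 = {
    "idle": {"work": {"done": 0.8, "idle": 0.2}, "wait": {"idle": 1.0}},
    "done": {"work": {"done": 1.0}, "wait": {"done": 1.0}},
}
P2 = {
    "idle": {"work": {"done": 0.5, "idle": 0.5}, "wait": {"idle": 1.0}},
    "done": {"work": {"done": 1.0}, "wait": {"done": 1.0}},
}

JOINT_STATES = list(product(P1, P2))


def joint_transition(s, a, s_next):
    """Transition independence: joint probability = product of local ones."""
    (s1, s2), (a1, a2), (n1, n2) = s, a, s_next
    return P1[s1][a1].get(n1, 0.0) * P2[s2][a2].get(n2, 0.0)


def global_reward(s_next):
    """Sum of local rewards plus a joint bonus that ties the agents together."""
    n1, n2 = s_next
    local = (1.0 if n1 == "done" else 0.0) + (1.0 if n2 == "done" else 0.0)
    joint_bonus = 3.0 if (n1 == "done" and n2 == "done") else 0.0  # assumed value
    return local + joint_bonus


def one_step_value(s, a):
    """Expected global reward of taking joint action a in joint state s."""
    return sum(joint_transition(s, a, n) * global_reward(n) for n in JOINT_STATES)


if __name__ == "__main__":
    start = ("idle", "idle")
    for a in product(["work", "wait"], repeat=2):
        print(a, round(one_step_value(start, a), 3))
```

    Because of the joint bonus, the value of one agent's action depends on what the other agent does, even though neither agent can affect the other's state transitions.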

    Published In

    AAMAS '03: Proceedings of the Second International Joint Conference on Autonomous Agents and Multiagent Systems
    July 2003
    1200 pages
    ISBN: 1581136838
    DOI: 10.1145/860575

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. decentralized MDP
    2. decision-theoretic planning

    Qualifiers

    • Article

    Conference

    AAMAS03

    Acceptance Rates

    Overall Acceptance Rate 1,155 of 5,036 submissions, 23%
