Article

Transition-independent decentralized Markov decision processes

Authors:

Shlomo Zilberstein,

Victor Lesser, and

Claudia V. Goldman

AAMAS '03: Proceedings of the second international joint conference on Autonomous agents and multiagent systems

July 2003

Pages 41 - 48

https://doi.org/10.1145/860575.860583

Published: 14 July 2003

Abstract

There has been substantial progress with formal models for sequential decision making by individual agents using the Markov decision process (MDP). However, similar treatment of multi-agent systems is lacking. A recent complexity result, showing that solving decentralized MDPs is NEXP-hard, provides a partial explanation. To overcome this complexity barrier, we identify a general class of transition-independent decentralized MDPs that is widely applicable. The class consists of independent collaborating agents that are tied together through a global reward function that depends upon both of their histories. We present a novel algorithm for solving this class of problems and examine its properties. The result is the first effective technique to solve optimally a class of decentralized MDPs. This lays the foundation for further work in this area on both exact and approximate solutions.
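
To make the problem class concrete, the following is a minimal sketch of a two-agent, transition-independent setting: each agent's next state depends only on its own state and action, while a single global reward ties the two histories together. It is not the algorithm presented in the paper; it is a brute-force enumeration over local policies, feasible only for toy instances. All names and numbers (`SUCCESS`, `WORK_COST`, `JOINT_BONUS`, the two-step horizon) are hypothetical and chosen purely for illustration.

```python
from itertools import product

ACTIONS = ("work", "wait")
HORIZON = 2
SUCCESS = (0.8, 0.5)   # hypothetical per-agent success probabilities
WORK_COST = 1.0        # hypothetical local cost of each "work" action
JOINT_BONUS = 10.0     # hypothetical global reward when both tasks finish


def local_step(agent, state, action):
    """Next-state distribution for one agent; it never looks at the other agent."""
    if state == 1 or action == "wait":
        return {state: 1.0}
    p = SUCCESS[agent]
    return {1: p, 0: 1.0 - p}


def local_histories(agent, policy):
    """Enumerate (final_state, work_count, probability) for one agent alone."""
    outcomes = [(0, 0, 1.0)]  # every agent starts with its task not done
    for t in range(HORIZON):
        expanded = []
        for state, works, prob in outcomes:
            action = policy[(state, t)]
            for nxt, p in local_step(agent, state, action).items():
                expanded.append((nxt, works + (action == "work"), prob * p))
        outcomes = expanded
    return outcomes


def joint_value(policy0, policy1):
    """Expected global reward; joint probabilities factorize as p0 * p1."""
    total = 0.0
    for (s0, w0, p0), (s1, w1, p1) in product(
        local_histories(0, policy0), local_histories(1, policy1)
    ):
        reward = -WORK_COST * (w0 + w1)
        if s0 == 1 and s1 == 1:  # reward depends on both agents' histories
            reward += JOINT_BONUS
        total += p0 * p1 * reward
    return total


def all_policies():
    """All deterministic local policies: (own state, time step) -> action."""
    keys = [(s, t) for s in (0, 1) for t in range(HORIZON)]
    for choice in product(ACTIONS, repeat=len(keys)):
        yield dict(zip(keys, choice))


def best_joint_policy():
    """Exhaustive search over pairs of local policies (exponential; toy only)."""
    candidates = list(all_policies())
    return max(
        ((joint_value(a, b), a, b) for a in candidates for b in candidates),
        key=lambda item: item[0],
    )


if __name__ == "__main__":
    value, pol0, pol1 = best_joint_policy()
    print(value, pol0, pol1)
```

Each local policy conditions only on the agent's own state and the time step, which reflects the decentralized setting; the exhaustive search is a stand-in that shows why a more effective technique is needed as the problem grows.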

Index Terms

Transition-independent decentralized Markov decision processes
1. Computing methodologies
  1. Artificial intelligence
    1. Distributed artificial intelligence
      1. Cooperation and coordination
      2. Multi-agent systems

Published In

AAMAS '03: Proceedings of the second international joint conference on Autonomous agents and multiagent systems

July 2003

1200 pages

ISBN:1581136838

DOI:10.1145/860575

General Chairs: Jeffrey S. Rosenschein (Israel), Michael Wooldridge (UK)

Program Chairs: Tuomas Sandholm (USA), Makoto Yokoo (Japan)

Copyright © 2003 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Conference

AAMAS03: Second International Conference on Autonomous Agents and Multiagent Systems

July 14 - 18, 2003

Melbourne, Australia

Acceptance Rates

Overall Acceptance Rate 1,155 of 5,036 submissions, 23%
