DOI: 10.5555/2484920.2485091

Cooperating with a Markovian ad hoc teammate

Published: 06 May 2013

Abstract

This paper focuses on learning in the presence of a Markovian teammate in ad hoc teams. A Markovian teammate's policy is a function of a set of discrete feature values derived from the joint history of interaction, where the feature values transition in a Markovian fashion on each time step. We introduce a novel algorithm, "Learning to Cooperate with a Markovian teammate" (LCM), that converges to optimal cooperation with any Markovian teammate and achieves safety with any arbitrary teammate. The novel aspect of LCM is the manner in which it satisfies these two goals through efficient exploration and exploitation. The main contribution of this paper is a full specification of LCM and a detailed analysis of its theoretical properties.
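
To make the teammate model concrete, the following is a minimal Python sketch of the setting described above. It is an illustration under assumed names (`MarkovianTeammate`, `policy`, `feature_transition`), not code from the paper: the teammate's action distribution depends only on a discrete feature value, and that feature value is updated from the current joint action on every time step.

```python
import random

class MarkovianTeammate:
    """Toy model of a Markovian teammate: its policy depends only on a
    discrete feature value, and that feature value transitions Markovianly
    from the current joint action each step. Names and tables below are
    hypothetical, chosen only for illustration."""

    def __init__(self, policy, feature_transition, initial_feature):
        self.policy = policy                          # feature -> {action: probability}
        self.feature_transition = feature_transition  # (feature, joint_action) -> next feature
        self.feature = initial_feature                # current discrete feature value

    def act(self):
        """Sample the teammate's action from its policy at the current feature value."""
        actions, probs = zip(*self.policy[self.feature].items())
        return random.choices(actions, weights=probs, k=1)[0]

    def update(self, joint_action):
        """Advance the feature value given the joint action just taken."""
        self.feature = self.feature_transition[(self.feature, joint_action)]


# A two-feature, two-action example: the feature records which action the
# ad hoc agent played last, so the teammate tends to mirror it.
policy = {
    "saw_a": {"a": 0.9, "b": 0.1},
    "saw_b": {"a": 0.1, "b": 0.9},
}
feature_transition = {
    ("saw_a", ("a", "a")): "saw_a", ("saw_a", ("a", "b")): "saw_a",
    ("saw_a", ("b", "a")): "saw_b", ("saw_a", ("b", "b")): "saw_b",
    ("saw_b", ("a", "a")): "saw_a", ("saw_b", ("a", "b")): "saw_a",
    ("saw_b", ("b", "a")): "saw_b", ("saw_b", ("b", "b")): "saw_b",
}

teammate = MarkovianTeammate(policy, feature_transition, "saw_a")
my_action = "b"
their_action = teammate.act()
teammate.update((my_action, their_action))  # joint action = (ad hoc agent, teammate)
```

In this toy instance the single feature simply records the ad hoc agent's previous action, making the teammate a memory-one responder; the paper's LCM algorithm targets optimal cooperation with any teammate of this feature-based form while remaining safe against arbitrary teammates.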


    Published In

    AAMAS '13: Proceedings of the 2013 international conference on Autonomous agents and multi-agent systems
    May 2013
    1500 pages
    ISBN:9781450319935

    Sponsors

    • IFAAMAS


    Publisher

    International Foundation for Autonomous Agents and Multiagent Systems

    Richland, SC

    Publication History

    Published: 06 May 2013


    Author Tags

    1. ad hoc teamwork
    2. learning
3. sample complexity analysis

    Qualifiers

    • Research-article

    Conference

    AAMAS '13
Sponsor: IFAAMAS

    Acceptance Rates

AAMAS '13 Paper Acceptance Rate: 140 of 599 submissions, 23%
Overall Acceptance Rate: 1,155 of 5,036 submissions, 23%

    Article Metrics

• Downloads (last 12 months): 2
• Downloads (last 6 weeks): 0
    Reflects downloads up to 09 Nov 2024

    Cited By

• (2019) ATSIS. Proceedings of the 28th International Joint Conference on Artificial Intelligence, 172-179. DOI: 10.5555/3367032.3367058
• (2017) Allocating training instances to learning agents for team formation. Autonomous Agents and Multi-Agent Systems, 31(4), 905-940. DOI: 10.1007/s10458-016-9355-3
• (2017) Can bounded and self-interested agents be teammates? Application to planning in ad hoc teams. Autonomous Agents and Multi-Agent Systems, 31(4), 821-860. DOI: 10.1007/s10458-016-9354-4
• (2015) To Ask, Sense, or Share. Proceedings of the 2015 International Conference on Autonomous Agents and Multiagent Systems, 367-376. DOI: 10.5555/2772879.2772928
• (2014) Modeling uncertainty in leading ad hoc teams. Proceedings of the 2014 international conference on Autonomous agents and multi-agent systems, 397-404. DOI: 10.5555/2615731.2615797
