DOI: 10.5555/2484920.2485091

Cooperating with a Markovian ad hoc teammate

Published: 06 May 2013

Abstract

This paper focuses on learning in the presence of a Markovian teammate in ad hoc teams. A Markovian teammate's policy is a function of a set of discrete feature values derived from the joint history of interaction, where the feature values transition in a Markovian fashion on each time step. We introduce a novel algorithm, "Learning to Cooperate with a Markovian teammate" (LCM), that converges to optimal cooperation with any Markovian teammate and achieves safety with any arbitrary teammate. The novel aspect of LCM is the manner in which it satisfies these two goals through efficient exploration and exploitation. The main contribution of this paper is a full specification of LCM and a detailed analysis of its theoretical properties.
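
To make the teammate model concrete, the following is a minimal Python sketch of the setting described above. It is an illustration under assumed names (`MarkovianTeammate`, `policy`, `feature_transition`), not code from the paper: the teammate's action distribution depends only on a discrete feature value, and that feature value is updated from the current joint action on every time step.

```python
import random

class MarkovianTeammate:
    """Toy model of a Markovian teammate: its policy depends only on a
    discrete feature value, and that feature value transitions Markovianly
    from the current joint action each step. Names and tables below are
    hypothetical, chosen only for illustration."""

    def __init__(self, policy, feature_transition, initial_feature):
        self.policy = policy                          # feature -> {action: probability}
        self.feature_transition = feature_transition  # (feature, joint_action) -> next feature
        self.feature = initial_feature                # current discrete feature value

    def act(self):
        """Sample the teammate's action from its policy at the current feature value."""
        actions, probs = zip(*self.policy[self.feature].items())
        return random.choices(actions, weights=probs, k=1)[0]

    def update(self, joint_action):
        """Advance the feature value given the joint action just taken."""
        self.feature = self.feature_transition[(self.feature, joint_action)]


# A two-feature, two-action example: the feature records which action the
# ad hoc agent played last, so the teammate tends to mirror it.
policy = {
    "saw_a": {"a": 0.9, "b": 0.1},
    "saw_b": {"a": 0.1, "b": 0.9},
}
feature_transition = {
    ("saw_a", ("a", "a")): "saw_a", ("saw_a", ("a", "b")): "saw_a",
    ("saw_a", ("b", "a")): "saw_b", ("saw_a", ("b", "b")): "saw_b",
    ("saw_b", ("a", "a")): "saw_a", ("saw_b", ("a", "b")): "saw_a",
    ("saw_b", ("b", "a")): "saw_b", ("saw_b", ("b", "b")): "saw_b",
}

teammate = MarkovianTeammate(policy, feature_transition, "saw_a")
my_action = "b"
their_action = teammate.act()
teammate.update((my_action, their_action))  # joint action = (ad hoc agent, teammate)
```

In this toy instance the single feature simply records the ad hoc agent's previous action, making the teammate a memory-one responder; the paper's LCM algorithm targets optimal cooperation with any teammate of this feature-based form while remaining safe against arbitrary teammates.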


    Published In

    AAMAS '13: Proceedings of the 2013 international conference on Autonomous agents and multi-agent systems
    May 2013
    1500 pages
    ISBN:9781450319935

    Sponsors

    • IFAAMAS


    Publisher

    International Foundation for Autonomous Agents and Multiagent Systems

    Richland, SC

    Publication History

    Published: 06 May 2013


    Author Tags

    1. ad hoc teamwork
    2. learning
3. sample complexity analysis

    Qualifiers

    • Research-article

    Conference

    AAMAS '13
Sponsor: IFAAMAS

    Acceptance Rates

AAMAS '13 Paper Acceptance Rate: 140 of 599 submissions, 23%
Overall Acceptance Rate: 1,155 of 5,036 submissions, 23%

    Article Metrics

• Downloads (last 12 months): 2
• Downloads (last 6 weeks): 0
    Reflects downloads up to 09 Nov 2024

    Cited By

• (2019) ATSIS. Proceedings of the 28th International Joint Conference on Artificial Intelligence, 172-179. DOI: 10.5555/3367032.3367058
• (2017) Allocating training instances to learning agents for team formation. Autonomous Agents and Multi-Agent Systems, 31(4), 905-940. DOI: 10.1007/s10458-016-9355-3
• (2017) Can bounded and self-interested agents be teammates? Application to planning in ad hoc teams. Autonomous Agents and Multi-Agent Systems, 31(4), 821-860. DOI: 10.1007/s10458-016-9354-4
• (2015) To Ask, Sense, or Share. Proceedings of the 2015 International Conference on Autonomous Agents and Multiagent Systems, 367-376. DOI: 10.5555/2772879.2772928
• (2014) Modeling uncertainty in leading ad hoc teams. Proceedings of the 2014 international conference on Autonomous agents and multi-agent systems, 397-404. DOI: 10.5555/2615731.2615797
