DOI: 10.5555/2343576.2343626
Research article

Comparative evaluation of MAL algorithms in a diverse set of ad hoc team problems

Published: 04 June 2012

Abstract

This paper is concerned with evaluating different multiagent learning (MAL) algorithms in problems where individual agents may be heterogeneous, in the sense of utilizing different learning strategies, without the opportunity for prior agreements or information regarding coordination. Such a situation arises in ad hoc team problems, a model of many practical multiagent systems applications. Prior work in multiagent learning has often focussed on homogeneous groups of agents, meaning that all agents were identical and a priori aware of this fact. Moreover, algorithms that are specifically designed for ad hoc team problems are typically evaluated in teams of agents with fixed behaviours, as opposed to agents which adapt their behaviours. In this work, we empirically evaluate five MAL algorithms, representing major approaches to multiagent learning but originally developed with the homogeneous setting in mind, to understand their behaviour in a set of ad hoc team problems. All teams consist of agents which are continuously adapting their behaviours. The algorithms are evaluated with respect to a comprehensive characterisation of repeated matrix games, using performance criteria that include considerations such as attainment of equilibrium, social welfare and fairness. Our main conclusion is that there is no clear winner. However, the comparative evaluation also highlights the relative strengths of different algorithms with respect to the type of performance criterion, e.g., social welfare vs. attainment of equilibrium.
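To make the experimental setting concrete, below is a minimal sketch of the kind of evaluation described above: two independently adapting agents repeatedly play a 2x2 matrix game, and the run is scored by social welfare and a simple fairness measure. The epsilon-greedy learners and the Prisoner's Dilemma payoffs here are illustrative stand-ins, not the five MAL algorithms or the full set of games evaluated in the paper.

```python
# A minimal sketch (not the paper's actual setup): two independent
# learners repeatedly play a 2x2 matrix game; the run is scored by
# average social welfare and by the payoff gap between the agents.
import random

# Prisoner's Dilemma payoffs: PAYOFFS[i][j] = (row reward, column reward)
PAYOFFS = [[(3, 3), (0, 5)],
           [(5, 0), (1, 1)]]

class EpsilonGreedyLearner:
    """Illustrative adapting agent: tracks average reward per action."""
    def __init__(self, n_actions=2, epsilon=0.1):
        self.values = [0.0] * n_actions
        self.counts = [0] * n_actions
        self.epsilon = epsilon

    def act(self):
        if random.random() < self.epsilon:
            return random.randrange(len(self.values))
        return max(range(len(self.values)), key=lambda a: self.values[a])

    def update(self, action, reward):
        # Incremental average of the reward observed for this action.
        self.counts[action] += 1
        self.values[action] += (reward - self.values[action]) / self.counts[action]

def evaluate(rounds=10000):
    row, col = EpsilonGreedyLearner(), EpsilonGreedyLearner()
    totals = [0.0, 0.0]
    for _ in range(rounds):
        a, b = row.act(), col.act()
        r_row, r_col = PAYOFFS[a][b]
        row.update(a, r_row)
        col.update(b, r_col)
        totals[0] += r_row
        totals[1] += r_col
    welfare = sum(totals) / rounds                  # average joint payoff per round
    unfairness = abs(totals[0] - totals[1]) / rounds  # 0 = perfectly equal split
    return welfare, unfairness

if __name__ == "__main__":
    welfare, unfairness = evaluate()
    print(f"social welfare: {welfare:.2f}, unfairness: {unfairness:.2f}")
```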




    Published In

    AAMAS '12: Proceedings of the 11th International Conference on Autonomous Agents and Multiagent Systems - Volume 1
    June 2012, 592 pages
    ISBN: 0981738117

    Sponsor

    • The International Foundation for Autonomous Agents and Multiagent Systems

    Publisher

    International Foundation for Autonomous Agents and Multiagent Systems, Richland, SC

    Author Tags

    1. ad hoc teams
    2. agent coordination
    3. multiagent learning

    Acceptance Rates

    Overall Acceptance Rate: 1,155 of 5,036 submissions (23%)


    Cited By

    • (2017) Allocating training instances to learning agents for team formation. Autonomous Agents and Multi-Agent Systems 31(4): 905-940. DOI: 10.1007/s10458-016-9355-3
    • (2016) Belief and truth in hypothesised behaviours. Artificial Intelligence 235: 63-94. DOI: 10.1016/j.artint.2016.02.004
    • (2014) On convergence and optimality of best-response learning with policy types in multiagent systems. Proceedings of the Thirtieth Conference on Uncertainty in Artificial Intelligence, 12-21. DOI: 10.5555/3020751.3020754
    • (2014) Modeling uncertainty in leading ad hoc teams. Proceedings of the 2014 International Conference on Autonomous Agents and Multi-Agent Systems, 397-404. DOI: 10.5555/2615731.2615797
    • (2013) Ad hoc coordination in multiagent systems with applications to human-machine interaction. Proceedings of the 2013 International Conference on Autonomous Agents and Multi-Agent Systems, 1415-1416. DOI: 10.5555/2484920.2485253
    • (2013) A game-theoretic model and best-response learning method for ad hoc coordination in multiagent systems. Proceedings of the 2013 International Conference on Autonomous Agents and Multi-Agent Systems, 1155-1156. DOI: 10.5555/2484920.2485118
