Allocating training instances to learning agents for team formation

Published: 01 July 2017

Abstract

Agents can learn to improve their coordination with their teammates and thereby increase team performance. Training instances are finite, and each one is an opportunity for the learning agents to improve their coordination. In this article, we focus on allocating training instances to learning agent pairs, i.e., pairs of agents that improve coordination with each other, with the goal of team formation. Agents learn at different rates, so the allocation of training instances affects the performance of the team that is formed. We build upon previous work on the Synergy Graph model, which is learned entirely from data and represents agents' capabilities and compatibility in a multi-agent team. We formally define the learning agents team formation problem and compare it with the multi-armed bandit problem. We consider learning agent pairs that improve linearly and geometrically, where in the geometric case the marginal improvement decreases by a constant factor. We contribute algorithms that allocate the training instances and compare them against algorithms from the multi-armed bandit problem. In our simulations, our algorithms perform similarly to the bandit algorithms in the linear case and outperform them in the geometric case. Further, we apply our model and algorithms to a multi-agent foraging problem, demonstrating the efficacy of our algorithms in general multi-agent problems.
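To make the allocation setting concrete, here is a minimal, hypothetical sketch (not the paper's actual algorithm): a greedy allocator that hands each training instance to the agent pair with the largest marginal coordination gain, for pairs that improve either linearly (constant gain per instance) or geometrically (gain shrinking by a constant factor). The function names, pair encodings, and parameter values are all illustrative assumptions.

```python
# Illustrative sketch only; the pair models and the greedy rule are
# assumptions based on the abstract, not the article's algorithms.

def marginal_gain(pair, n):
    """Coordination gain from the (n+1)-th training instance for a pair."""
    kind, params = pair
    if kind == "linear":          # constant improvement per instance
        return params
    else:                         # geometric: gain shrinks by a constant factor
        base, factor = params
        return base * (factor ** n)

def greedy_allocate(pairs, budget):
    """Give each training instance to the pair with the largest marginal gain."""
    counts = [0] * len(pairs)     # instances allocated to each pair so far
    for _ in range(budget):
        best = max(range(len(pairs)),
                   key=lambda i: marginal_gain(pairs[i], counts[i]))
        counts[best] += 1
    return counts

pairs = [("linear", 0.5),             # steady learner
         ("geometric", (2.0, 0.5))]   # fast start, diminishing returns
print(greedy_allocate(pairs, 5))      # prints [3, 2]
```

With a budget of 5, the geometric pair absorbs the first instances while its gains are large (2.0, then 1.0), after which the linear pair's constant 0.5 gain dominates; this is the kind of rate-dependent trade-off the article's allocation algorithms address.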

