Allocating training instances to learning agents for team formation

Published: 01 July 2017

Abstract

Agents can learn to improve their coordination with their teammates and thereby increase team performance. Training instances are finite, and each one is an opportunity for the learning agents to improve their coordination. In this article, we focus on allocating training instances to learning agent pairs, i.e., pairs of agents that improve coordination with each other, with the goal of team formation. Agents learn at different rates, so the allocation of training instances affects the performance of the team that is formed. We build upon previous work on the Synergy Graph model, which is learned entirely from data and represents agents' capabilities and compatibility in a multi-agent team. We formally define the learning agents team formation problem and compare it with the multi-armed bandit problem. We consider learning agent pairs that improve linearly and geometrically, where in the geometric case the marginal improvement decreases by a constant factor. We contribute algorithms that allocate the training instances and compare them against algorithms from the multi-armed bandit problem. In our simulations, our algorithms perform similarly to the bandit algorithms in the linear case and outperform them in the geometric case. Further, we apply our model and algorithms to a multi-agent foraging problem, demonstrating the efficacy of our algorithms in general multi-agent problems.
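To make the allocation setting concrete, here is a minimal, hypothetical sketch (not the paper's actual algorithm): a greedy allocator that hands each training instance to the agent pair with the largest marginal coordination gain, for pairs that improve either linearly (constant gain per instance) or geometrically (gain shrinking by a constant factor). The function names, pair encodings, and parameter values are all illustrative assumptions.

```python
# Illustrative sketch only; the pair models and the greedy rule are
# assumptions based on the abstract, not the article's algorithms.

def marginal_gain(pair, n):
    """Coordination gain from the (n+1)-th training instance for a pair."""
    kind, params = pair
    if kind == "linear":          # constant improvement per instance
        return params
    else:                         # geometric: gain shrinks by a constant factor
        base, factor = params
        return base * (factor ** n)

def greedy_allocate(pairs, budget):
    """Give each training instance to the pair with the largest marginal gain."""
    counts = [0] * len(pairs)     # instances allocated to each pair so far
    for _ in range(budget):
        best = max(range(len(pairs)),
                   key=lambda i: marginal_gain(pairs[i], counts[i]))
        counts[best] += 1
    return counts

pairs = [("linear", 0.5),             # steady learner
         ("geometric", (2.0, 0.5))]   # fast start, diminishing returns
print(greedy_allocate(pairs, 5))      # prints [3, 2]
```

With a budget of 5, the geometric pair absorbs the first instances while its gains are large (2.0, then 1.0), after which the linear pair's constant 0.5 gain dominates; this is the kind of rate-dependent trade-off the article's allocation algorithms address.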

