DOI: 10.5555/3237383.3237501
research-article

Efficient Convention Emergence through Decoupled Reinforcement Social Learning with Teacher-Student Mechanism

Published: 09 July 2018

Abstract

In this paper, we design reinforcement learning based (RL-based) strategies to promote convention emergence in multiagent systems (MASs) with large convention spaces. We apply our approaches to a language coordination problem in which agents need to coordinate on a dominant lexicon for efficient communication. By modeling each lexicon, which maps each concept to a single word, as a Markov strategy representation, the original single-state convention learning problem can be transformed into a multi-state multiagent coordination problem. The dynamics of lexicon evolution during an interaction episode can be modeled as a Markov game, which allows agents to improve the action values of each concept separately and incrementally. Specifically, we propose two learning strategies, multiple-Q and multiple-R, and we also propose incorporating a teacher-student mechanism on top of these learning strategies to accelerate lexicon convergence. Extensive experiments verify that our approaches outperform the state-of-the-art approaches in terms of convergence efficiency, convention quality, and scalability.
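The idea of treating each concept as a separate state with its own action values can be illustrated with a minimal sketch. This is not the paper's multiple-Q or multiple-R algorithm, and it omits the teacher-student mechanism; it is a plain per-concept value update under assumed settings. The names `Agent`, `play_episode`, and `simulate`, the +1/-1 reward for matching or mismatching words, random pairwise matching, and the learning-rate and exploration values are all illustrative assumptions that do not come from the abstract.

```python
import random


class Agent:
    """Hypothetical learner: one row of action values per concept."""

    def __init__(self, n_concepts, n_words, alpha=0.1, epsilon=0.1, rng=None):
        self.rng = rng or random.Random()
        self.alpha = alpha        # learning rate (assumed value)
        self.epsilon = epsilon    # exploration rate (assumed value)
        self.n_words = n_words
        # q[concept][word]: each concept is a state, each word an action.
        self.q = [[0.0] * n_words for _ in range(n_concepts)]

    def choose_word(self, concept):
        # Epsilon-greedy choice among the words for this concept.
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(self.n_words)
        row = self.q[concept]
        best = max(row)
        return self.rng.choice([w for w, v in enumerate(row) if v == best])

    def update(self, concept, word, reward):
        # Incremental update of a single concept's action value.
        self.q[concept][word] += self.alpha * (reward - self.q[concept][word])

    def lexicon(self):
        # Greedy lexicon: the currently best word for every concept.
        return [max(range(self.n_words), key=row.__getitem__) for row in self.q]


def play_episode(a, b, n_concepts):
    """One interaction episode: the pair visits each concept in turn."""
    for concept in range(n_concepts):
        word_a, word_b = a.choose_word(concept), b.choose_word(concept)
        reward = 1.0 if word_a == word_b else -1.0  # assumed payoff
        a.update(concept, word_a, reward)
        b.update(concept, word_b, reward)


def simulate(n_agents=20, n_concepts=3, n_words=4, episodes=2000, seed=0):
    """Randomly pair agents for a number of episodes and return them."""
    rng = random.Random(seed)
    agents = [Agent(n_concepts, n_words, rng=rng) for _ in range(n_agents)]
    for _ in range(episodes):
        a, b = rng.sample(agents, 2)
        play_episode(a, b, n_concepts)
    return agents


if __name__ == "__main__":
    population = simulate()
    # Print a few greedy lexicons to inspect how far they agree.
    for agent in population[:5]:
        print(agent.lexicon())
```

Because each concept has its own row of values, a mismatch on one concept leaves the values learned for the other concepts untouched, which is the decoupling the abstract describes at a high level.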


Cited By

  • (2022) Gist Trace-based Learning: Efficient Convention Emergence from Multilateral Interactions. ACM Transactions on Autonomous and Adaptive Systems 16:1 (1-20). DOI: 10.1145/3502199. Online publication date: 23-Jan-2022.
  • (2021) Towards Sample Efficient Learners in Population based Referential Games through Action Advising. Proceedings of the 20th International Conference on Autonomous Agents and MultiAgent Systems (1689-1691). DOI: 10.5555/3463952.3464202. Online publication date: 3-May-2021.
  • (2019) To be Big Picture Thinker or Detail-Oriented? Proceedings of the 18th International Conference on Autonomous Agents and MultiAgent Systems (2021-2023). DOI: 10.5555/3306127.3331997. Online publication date: 8-May-2019.

Published In

AAMAS '18: Proceedings of the 17th International Conference on Autonomous Agents and MultiAgent Systems
July 2018
2312 pages

Publisher

International Foundation for Autonomous Agents and Multiagent Systems

Richland, SC

Author Tags

  1. convention emergence
  2. multiagent social learning

Qualifiers

  • Research-article

Funding Sources

  • National Natural Science Foundation of China
  • Special Program of Artificial Intelligence of Tianjin Municipal Science and Technology Commission

Conference

AAMAS '18
AAMAS '18: Autonomous Agents and MultiAgent Systems
July 10 - 15, 2018
Stockholm, Sweden

Acceptance Rates

AAMAS '18 Paper Acceptance Rate 149 of 607 submissions, 25%;
Overall Acceptance Rate 1,155 of 5,036 submissions, 23%
