DOI: 10.5555/3306127.3331755

Dynamic Particle Allocation to Solve Interactive POMDP Models for Social Decision Making

Published: 08 May 2019

Abstract

In social dilemma settings, such as repeated Public Goods Games (PGGs), humans repeatedly face the dilemma of whether or not to contribute, based on the past contributions of others. In such settings, the decision taken by an agent/human depends not only on the agent's beliefs about other agents and the environment, but also on its beliefs about others' beliefs. To factor in these aspects, we propose a novel formulation of computational theory of mind (ToM) that models human behavior in a repeated PGG using interactive partially observable Markov decision processes (I-POMDPs). The interactive particle filter (IPF) is a well-known algorithm for approximately solving I-POMDP models so that agents can find their optimal contributions. The number of particles assigned to an agent in IPF translates directly into time and computational resources, and solving I-POMDPs in a time- and memory-efficient manner is largely intractable even for small state spaces. Moreover, maintaining a fixed number of particles per agent over time is highly inefficient in terms of resource utilization. To address this problem, we propose a dynamic particle allocation algorithm that assigns particles to agents based on how well they predict. We validate the proposed algorithm through real experiments involving human agents. Our results suggest that dynamic-particle-allocation-based IPF for I-POMDPs is effective in modelling human behaviour in repeated social dilemma settings while using computational resources efficiently.
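The abstract does not spell out the allocation rule, but the idea it describes (adjusting each modelled agent's particle budget in the interactive particle filter according to how well that agent's particles predict the contributions actually observed) can be illustrated with a minimal sketch. The Python below is an assumption-laden illustration, not the authors' algorithm: the PGG payoff, the predictive-likelihood score, the inverse-proportional reallocation rule, and all function names (pgg_payoff, predictive_likelihood, reallocate_particles) are hypothetical.

    import numpy as np

    # Hypothetical PGG round payoff: each of N players keeps what it does not
    # contribute and receives an equal share of the multiplied public pot.
    def pgg_payoff(contributions, endowment=10.0, multiplier=1.6):
        c = np.asarray(contributions, dtype=float)
        pot_share = multiplier * c.sum() / len(c)
        return endowment - c + pot_share

    # Average likelihood of an agent's newly observed contribution under that
    # agent's weighted particle set; obs_model(particle, obs) returns a likelihood.
    def predictive_likelihood(particles, weights, obs, obs_model):
        likes = np.array([obs_model(p, obs) for p in particles])
        return float(np.dot(weights, likes))

    # Redistribute a fixed total particle budget across modelled agents.
    # Assumed rule: agents whose particle sets predicted the last observation
    # poorly receive proportionally more particles; a floor keeps every agent
    # minimally represented (the floor can push the total slightly above budget).
    def reallocate_particles(likelihoods, total_budget, min_particles=50):
        inv = 1.0 / (np.asarray(likelihoods, dtype=float) + 1e-9)
        shares = inv / inv.sum()
        alloc = np.maximum(min_particles, np.floor(shares * total_budget))
        return alloc.astype(int)

In a repeated PGG, such a reallocation would run after every round's observed contributions, so agents whose behaviour is already well modelled free up particles for those that are harder to predict. Whether the paper gives more particles to well- or poorly-predicting models is not stated in the abstract, so the inverse rule above is only one plausible choice.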


Published In

AAMAS '19: Proceedings of the 18th International Conference on Autonomous Agents and MultiAgent Systems
May 2019
2518 pages
ISBN:9781450363099

Publisher

International Foundation for Autonomous Agents and Multiagent Systems

Richland, SC


Author Tags

  1. Bayesian analysis
  2. interactive partially observable Markov decision processes
  3. interactive particle filter
  4. partially observable Monte Carlo planning
  5. public goods games
  6. theory-of-mind modelling

Qualifiers

  • Research-article

Conference

AAMAS '19

Acceptance Rates

AAMAS '19 paper acceptance rate: 193 of 793 submissions (24%)
Overall acceptance rate: 1,155 of 5,036 submissions (23%)

