Abstract
This paper addresses the problem of automated advice provision in scenarios that involve repeated interactions between people and computer agents. This problem arises in many applications, such as route-selection systems, office assistants, and climate-control systems. To succeed in such settings, agents must reason about how their advice influences people's future actions and decisions over time. This work models such scenarios as a family of repeated bilateral interactions called "choice selection processes", in which humans and computer agents may share certain goals but are essentially self-interested. We propose a social agent for advice provision (SAP) for such environments that generates advice using a social utility function, a weighted sum of the individual utilities of the agent and the human participant. The SAP agent models human choice selection using hyperbolic discounting and samples the model to infer the best weights for its social utility function. We demonstrate the effectiveness of SAP in two separate domains that vary in the complexity of modeling human behavior as well as in the information available to people when they decide whether to accept the agent's advice. In both domains, we evaluated SAP in extensive empirical studies involving hundreds of human subjects, comparing it to agents using alternative models of choice selection processes informed by behavioral economics and psychological models of decision-making. Our results show that in both domains the SAP agent outperformed the alternative models. This work demonstrates the efficacy of combining computational methods with behavioral economics to model how people reason about machine-generated advice, and it presents a general methodology for agent design in such repeated advice settings.
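The two components named in the abstract — hyperbolic discounting of delayed value and a social utility that weighs the agent's and the human's utilities — can be sketched as follows. This is an illustrative sketch only: the function names, the discount parameter `k`, the weight grid, and the simulated-payoff callback are assumptions for exposition, not the paper's actual SAP implementation or fitted parameters.

```python
def hyperbolic_discount(value, delay, k=1.0):
    """Hyperbolically discounted value: v / (1 + k * delay).

    Unlike exponential discounting, the discount factor falls off
    more steeply for near-term delays, a common model of human
    intertemporal choice."""
    return value / (1.0 + k * delay)


def social_utility(agent_utility, human_utility, w):
    """Weighted sum of the agent's and the human's utility (0 <= w <= 1)."""
    return w * agent_utility + (1.0 - w) * human_utility


def best_weight(candidate_ws, simulated_agent_payoff):
    """Pick the weight whose simulated long-run agent payoff is highest.

    `simulated_agent_payoff` stands in for sampling the learned human
    model: it maps a candidate weight w to the agent's expected payoff
    when advice is generated with that weight."""
    return max(candidate_ws, key=simulated_agent_payoff)
```

For example, `best_weight([0.0, 0.5, 1.0], f)` scans a small grid of candidate weights and returns the one `f` scores highest; purely self-interested advice (`w = 1.0`) need not win, since a human who distrusts the advice may stop following it.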
Notes
We use the term “world state” to disambiguate the states of an MDP from those of a selection process.
This method is more common in POMDPs; however, since our state space is very large, we use it here as well.
This model does not require an additional parameter for the actual cost to the receiver (\(c_R(a,v)\)), since \(c_R(a,v)\) is already a linear combination of the comfort level and the energy consumption.
In fact, the exact equivalent of the road-selection domain would be to assume that the user assigns a cost to each possible combination of heat load and power level. However, such an assumption would result in far too many arms, most of which would never be sampled, or sampled only once, and thus would not yield a good human model.
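The receiver cost described in the note above — a linear combination of the comfort level and the energy consumption — can be sketched as a one-line function. The coefficients `alpha` and `beta`, and the use of a discomfort score rather than a raw comfort level, are hypothetical choices for illustration; the paper's actual coefficients and units are not given here.

```python
def receiver_cost(discomfort, energy_kwh, alpha=1.0, beta=0.5):
    """Receiver cost as a linear combination of discomfort and energy use.

    alpha weighs the (dis)comfort term and beta the energy term; both
    are illustrative placeholders, not values from the paper."""
    return alpha * discomfort + beta * energy_kwh
```

Because the cost is linear in its two components, no separate cost parameter is needed per (heat load, power level) pair, which is exactly why the note's bandit-style formulation with one arm per combination would be wastefully large.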
Azaria, A., Gal, Y., Kraus, S. et al. Strategic advice provision in repeated human-agent interactions. Auton Agent Multi-Agent Syst 30, 4–29 (2016). https://doi.org/10.1007/s10458-015-9284-6