Playing repeated Stackelberg games with unknown opponents

Authors:

Janusz Marecki,

Richard Segal

AAMAS '12: Proceedings of the 11th International Conference on Autonomous Agents and Multiagent Systems - Volume 2

Pages 821 - 828

Published: 04 June 2012

Abstract

In Stackelberg games, a "leader" player first chooses a mixed strategy to commit to, then a "follower" player responds based on the observed leader strategy. Notable strides have been made in scaling up the algorithms for such games, but the problem of finding optimal leader strategies spanning multiple rounds of the game, with a Bayesian prior over unknown follower preferences, has been left unaddressed. Towards remedying this shortcoming, we propose a first-of-a-kind tractable method to compute an optimal plan of leader actions in a repeated game against an unknown follower, assuming that the follower plays a myopic best response in every round. Our approach combines Monte Carlo Tree Search, which handles the leader's exploration/exploitation tradeoffs, with a novel technique for identifying and pruning dominated leader strategies. The method provably finds asymptotically optimal solutions and scales up to real-world security games spanning a double-digit number of rounds.
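The setting described in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: it only shows a myopic best-responding follower and how one observed response updates a Bayesian prior over candidate follower payoff matrices. All names (`best_response`, `leader_value`, `update_belief`) and the two-target payoff numbers are assumptions made for illustration; the Monte Carlo Tree Search and pruning steps are not shown.

```python
from typing import List, Sequence

# payoff[i][j]: payoff when the leader plays action i and the follower plays action j
Matrix = Sequence[Sequence[float]]


def best_response(x: Sequence[float], follower_payoff: Matrix) -> int:
    """Follower action with the highest expected payoff against the leader mix x.

    Ties are broken toward the lowest action index (an assumption of this sketch).
    """
    n_follower = len(follower_payoff[0])
    expected = [
        sum(x[i] * follower_payoff[i][j] for i in range(len(x)))
        for j in range(n_follower)
    ]
    return max(range(n_follower), key=lambda j: expected[j])


def leader_value(x: Sequence[float], leader_payoff: Matrix, j: int) -> float:
    """Leader's expected payoff when committing to x and the follower plays j."""
    return sum(x[i] * leader_payoff[i][j] for i in range(len(x)))


def update_belief(prior: Sequence[float], types: Sequence[Matrix],
                  x: Sequence[float], observed: int) -> List[float]:
    """Bayes update: keep only follower types whose best response to x matches."""
    weights = [p if best_response(x, f) == observed else 0.0
               for p, f in zip(prior, types)]
    total = sum(weights)
    if total == 0.0:
        raise ValueError("observation is inconsistent with every candidate type")
    return [w / total for w in weights]


# Hypothetical two-target game: the leader defends a target, the follower attacks one.
# A caught attacker earns 0; otherwise the attacker earns the target's value.
TYPE_A = [[0.0, 1.0], [3.0, 0.0]]   # follower type that values target 0 more
TYPE_B = [[0.0, 3.0], [1.0, 0.0]]   # follower type that values target 1 more
LEADER = [[1.0, -1.0], [-1.0, 1.0]]  # leader gains when the attacked target is defended

x = [0.5, 0.5]                       # leader commits to a uniform mix
posterior = update_belief([0.5, 0.5], [TYPE_A, TYPE_B], x, observed=1)
```

In this toy game, an attack on target 1 under the uniform mix is consistent only with `TYPE_B`, so a single round already resolves the uncertainty. In general a leader may need to choose its commitments over several rounds to trade off learning about the follower against immediate payoff, which is the tradeoff the abstract assigns to the tree search.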

Index Terms

Playing repeated Stackelberg games with unknown opponents
1. Mathematics of computing
  1. Probability and statistics
    1. Probabilistic algorithms
    2. Probabilistic reasoning algorithms
      1. Markov-chain Monte Carlo methods
      2. Sequential Monte Carlo methods


Published In

AAMAS '12: Proceedings of the 11th International Conference on Autonomous Agents and Multiagent Systems - Volume 2

June 2012

601 pages

ISBN:0981738125

Sponsors

The International Foundation for Autonomous Agents and Multiagent Systems

In-Cooperation

SIGAI: ACM Special Interest Group on Artificial Intelligence

Publisher

International Foundation for Autonomous Agents and Multiagent Systems

Richland, SC

Qualifiers

Research-article

Conference

AAMAS 12

AAMAS 12: International Conference on Autonomous Agents and Multiagent Systems

June 4 - 8, 2012

Valencia, Spain

Acceptance Rates

Overall Acceptance Rate 1,155 of 5,036 submissions, 23%
