DOI: 10.1145/1082473.1082485
Article

Theory of moves learners: towards non-myopic equilibria

Published: 25 July 2005

Abstract

In contrast to classical game-theoretic analysis of simultaneous and sequential play in bimatrix games, Steven Brams has proposed an alternative framework, the Theory of Moves (TOM), in which players choose their initial actions and then, in alternating turns, decide whether to shift from their current actions. A backward induction process is used to determine a non-myopic action, and equilibrium is reached when an agent, on its turn to move, decides not to change its current action. Brams claims that the TOM framework captures the dynamics of a wide range of real-life non-cooperative negotiations, spanning political, historical, and religious disputes. We believe that his analysis is weakened by the assumption that a player has perfect knowledge of the opponent's payoffs. We present a learning approach by which TOM players can learn to converge to Non-Myopic Equilibria (NME) without prior knowledge of their opponents' preferences, by inducing those preferences from the opponents' past choices. We present experimental results from all structurally distinct 2-by-2 games without a common preferred outcome, showing convergence of our proposed learning player to NMEs. We also discuss the relation between equilibria in sequential games and NMEs of TOM.
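
The backward induction step described in the abstract can be illustrated with a minimal sketch. It is not the paper's learning algorithm: it assumes both payoffs are known, that play from an initial state follows one cycle of alternating unilateral switches, and that play stops if it would return to the initial state. The names `switch` and `survivor`, the strict-improvement tie rule, and the Prisoner's Dilemma payoffs are illustrative assumptions. Refinements of TOM, such as rules on which player moves first, are left out.

```python
from typing import Dict, List, Tuple

State = Tuple[int, int]                  # (row action, column action), each 0 or 1
Payoffs = Dict[State, Tuple[int, int]]   # state -> (row payoff, column payoff)


def switch(state: State, player: int) -> State:
    """Return the state reached when `player` (0 = row, 1 = column) switches action."""
    row, col = state
    return (1 - row, col) if player == 0 else (row, 1 - col)


def survivor(payoffs: Payoffs, start: State, first_mover: int) -> State:
    """Backward-induct over one move cycle of a 2x2 game.

    Assumption (hypothetical simplification): a player moves only if the
    state that would eventually result is strictly better for that player
    than staying put, and a full cycle back to `start` ends play at `start`.
    """
    # Forward pass: the four states visited if every player always switches.
    path: List[State] = [start]
    movers: List[int] = []
    mover = first_mover
    for _ in range(3):
        movers.append(mover)
        path.append(switch(path[-1], mover))
        mover = 1 - mover
    movers.append(mover)  # the player whose switch would return play to `start`

    # Backward pass: `outcome` is where play ends if the current mover switches.
    outcome = start
    for state, player in zip(reversed(path), reversed(movers)):
        if payoffs[outcome][player] <= payoffs[state][player]:
            outcome = state  # staying is at least as good, so play stops here
    return outcome


# Ordinal Prisoner's Dilemma (4 = best, 1 = worst); action 0 = cooperate, 1 = defect.
PD: Payoffs = {(0, 0): (3, 3), (0, 1): (1, 4), (1, 0): (4, 1), (1, 1): (2, 2)}

if __name__ == "__main__":
    for start in PD:
        print(start, "->", survivor(PD, start, first_mover=0))
```

Under these assumptions, starting the row player at mutual cooperation `(0, 0)` or at mutual defection `(1, 1)` returns the starting state itself: the mover anticipates the opponent's later responses and chooses not to switch, which is the stopping condition the abstract describes.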

Cited By

  • (2013) Discovery, utilization, and analysis of credible threats for 2X2 incomplete information games in TOM. Proceedings of the 2013 international conference on Autonomous agents and multi-agent systems, 1179-1180. DOI: 10.5555/2484920.2485131. Online publication date: 6-May-2013
  • (2009) Adversary aware surveillance systems. IEEE Transactions on Information Forensics and Security, 4:3, 552-563. DOI: 10.1109/TIFS.2009.2026459. Online publication date: 1-Sep-2009

Published In

AAMAS '05: Proceedings of the fourth international joint conference on Autonomous agents and multiagent systems
July 2005
1407 pages
ISBN:1595930930
DOI:10.1145/1082473

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. bimatrix games
  2. non-myopic equilibrium
  3. theory of moves

Qualifiers

  • Article

Conference

AAMAS05

Acceptance Rates

Overall Acceptance Rate 1,155 of 5,036 submissions, 23%
