Finding Nash Equilibria in Two-Player, Zero Sum Games: Western CEDAR
Western CEDAR
Computer Science Graduate and Undergraduate
Computer Science
Student Scholarship
2008
Recommended Citation
Wimpee, Jeffrey, "Finding Nash equilibria in two-player, zero sum games" (2008). Computer Science Graduate and Undergraduate
Student Scholarship. 3.
https://cedar.wwu.edu/computerscience_stupubs/3
This Research Paper is brought to you for free and open access by the Computer Science at Western CEDAR. It has been accepted for inclusion in
Computer Science Graduate and Undergraduate Student Scholarship by an authorized administrator of Western CEDAR. For more information,
please contact westerncedar@wwu.edu.
Finding Nash equilibria in two-player,
zero-sum games
Jeffrey Wimpee
Computer Science
Western Washington University
wimpeej@cc.wwu.edu
Abstract
In many games, it is desirable to find strategies for all players that
simultaneously maximize their respective worst-case payoffs. A set of
strategies satisfying this criterion is called a Nash equilibrium. Because
the search space of possible strategies grows rapidly as the size of the
game increases, specialized algorithms are needed to efficiently find Nash
equilibria. In this paper, current equilibrium-finding methods are presented
and key areas for future work are identified. The first algorithm, due to
Koller, Megiddo, and von Stengel, computes standard Nash equilibria in
two-player, zero-sum games. The second algorithm, due to Miltersen and
Sorensen, extends the method of Koller, Megiddo, and von Stengel to find
proper equilibria. Both algorithms run in polynomial time in the worst
case. The hardness of the equilibrium-finding problem for general-sum
games highlights the need for new approximation methods.
Keywords
Nash equilibrium, game theory, two-player games, zero-sum games
1. Introduction
This paper is a survey of algorithms for finding Nash equilibria and proper
equilibria in two-player games. A Nash equilibrium in this context is a pair of strategies,
one for each player, such that each strategy is a best response to the other. In other words,
no player has incentive to unilaterally deviate from his or her respective strategy. The
related concept of proper equilibrium imposes additional constraints on acceptable
strategies. Roughly speaking, proper equilibrium strategies take into account the
possibility that a player may occasionally make suboptimal choices with small
probability. In any case, the number of strategies increases exponentially with the size of
the game. The objective of the research surveyed here is to present efficient algorithms
for computing strategy pairs that constitute equilibria. An additional goal of this survey is
to define the current frontiers of research in the field of two-player games and show
possible directions for future research.
In the remainder of this paper, basic concepts and terminology of game theory are
introduced. Two algorithms are presented for computing Nash equilibria in two-player,
zero-sum games. Finally, several promising areas for future work are identified.
2. Background
In game theory, a game is a strategic situation involving multiple agents. It
specifies the choices available to the agents and the payoffs associated with each possible
sequence of choices. Each agent attempts to maximize the payoff to himself, but his
actions may depend on both past and future choices of the other players.
A strategy defines the action a player takes at each of his available choices. The
particular strategies we are interested in finding form a Nash equilibrium, which may be
defined as follows. Suppose that each player P adopts a strategy satisfying the following
condition: given that the other players’ strategies remain fixed, P cannot increase his
own payoff by changing his strategy. A set of strategies (one for each player) that
satisfies this condition is called a Nash equilibrium. Strategies may be pure or mixed: a
pure strategy specifies a fixed action for each possible decision; a mixed strategy is a
probability distribution over the set of pure strategies.
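The condition can also be stated symbolically. The notation below is an assumption introduced for illustration and is not taken from the original text: u_i denotes the payoff function of player i, s_i a strategy of player i, and s_{-i} the strategies of all other players. Under that notation, a strategy profile s^* is a Nash equilibrium when

```latex
\[
u_i(s_i^*, s_{-i}^*) \;\ge\; u_i(s_i, s_{-i}^*)
\qquad \text{for every player } i \text{ and every strategy } s_i \text{ of player } i .
\]
```

For mixed strategies, u_i is read as the expected payoff under the players' probability distributions.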
A game may be represented as a set of matrices, one for each player, that specify
the payoff to that player given the strategies of all players. This representation is known
as the normal or strategic form. Each dimension of a matrix corresponds to the complete
set of strategies available to a single player. Accordingly, in the case of a two player
game, we can call one player the row player and the other the column player.
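As a minimal sketch of the normal form, the two payoff matrices can be stored as nested lists and the equilibrium condition checked directly. The function names `expected_payoff` and `is_nash` and the matching-pennies example are assumptions chosen for illustration. The check only compares against pure-strategy deviations, which suffices because a player's expected payoff is linear in his own mixed strategy.

```python
from fractions import Fraction


def expected_payoff(M, x, y):
    """Expected payoff x^T M y for row mix x and column mix y."""
    return sum(x[i] * M[i][j] * y[j]
               for i in range(len(x)) for j in range(len(y)))


def is_nash(A, B, x, y):
    """True if neither player gains by switching to any pure strategy.

    A is the row player's payoff matrix, B the column player's.
    """
    m, n = len(A), len(A[0])
    row_value = expected_payoff(A, x, y)
    col_value = expected_payoff(B, x, y)
    # Best payoff of any pure row against y, and any pure column against x.
    best_row = max(sum(A[i][j] * y[j] for j in range(n)) for i in range(m))
    best_col = max(sum(x[i] * B[i][j] for i in range(m)) for j in range(n))
    return best_row <= row_value and best_col <= col_value


# Matching pennies (illustrative): zero-sum, so B = -A.
A = [[1, -1], [-1, 1]]
B = [[-a for a in row] for row in A]
half = Fraction(1, 2)

uniform_is_equilibrium = is_nash(A, B, [half, half], [half, half])
pure_is_equilibrium = is_nash(A, B, [1, 0], [1, 0])
print(uniform_is_equilibrium, pure_is_equilibrium)
```

In this example the uniform mixed pair passes the check, while the pure pair (first row, first column) fails because the column player would rather switch columns.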
3. Methods
In this section, two algorithms for computing Nash equilibria in two-player games
are described. The first, due to Koller, Megiddo, and von Stengel, finds standard Nash
equilibria. The method of Miltersen and Sorensen finds proper equilibria.
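The linear program to be solved to find the column player’s equilibrium strategy y
in a two-player, zero-sum game is

\[
\begin{aligned}
\min_{y,\,p} \quad & e^T p \\
\text{subject to} \quad & -Ay + E^T p \ge 0, \\
& -Fy = -f, \\
& y \ge 0.
\end{aligned}
\]

The objective function e^T p represents the payoff from the column player to the row
player. For a game in normal form, e and f are the scalar 1, A is the payoff matrix for
Player 1 (the row player), and E and F are 1 × m and 1 × n matrices of ones,
respectively, where m and n are the numbers of pure strategies of the row and column
players.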
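For a normal-form game, this program can be handed to an off-the-shelf LP solver. The sketch below assumes the program has the shape "minimize p subject to Ay ≤ p·1, the entries of y sum to 1, y ≥ 0" (that is, E and F are rows of ones and e = f = 1), and it uses SciPy's `linprog`. The function name `column_player_strategy` and the 2 × 2 payoff matrix are hypothetical choices for illustration.

```python
import numpy as np
from scipy.optimize import linprog


def column_player_strategy(A):
    """Solve  min p  s.t.  A y <= p * 1,  sum(y) = 1,  y >= 0."""
    A = np.asarray(A, dtype=float)
    m, n = A.shape
    # Decision vector z = (y_1, ..., y_n, p); the objective selects p.
    c = np.zeros(n + 1)
    c[-1] = 1.0
    # -A y + E^T p >= 0 rewritten in linprog's "<=" form: A y - p * 1 <= 0.
    A_ub = np.hstack([A, -np.ones((m, 1))])
    b_ub = np.zeros(m)
    # F y = f with F a row of ones and f = 1.
    A_eq = np.hstack([np.ones((1, n)), np.zeros((1, 1))])
    b_eq = np.array([1.0])
    bounds = [(0, None)] * n + [(None, None)]  # y >= 0, p unrestricted
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, bounds=bounds)
    if not res.success:
        raise RuntimeError(res.message)
    return res.x[:n], res.x[-1]


# Illustrative payoff matrix for the row player (not from the paper).
A = [[2, -1], [-1, 1]]
y, value = column_player_strategy(A)
print(y, value)
```

For this matrix the column player is indifferent between the rows' payoffs at y = (0.4, 0.6), so the solver should return approximately that strategy with game value 0.2.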
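To find the corresponding equilibrium strategy x for the row player, we must
solve the dual problem. By the strong duality theorem, the primal and dual programs
have the same optimal value, so a solution to the primal problem determines a solution
to its dual. The dual of the first program is

\[
\begin{aligned}
\max_{x,\,q} \quad & -q^T f \\
\text{subject to} \quad & -A^T x - F^T q \le 0, \\
& Ex = e, \\
& x \ge 0.
\end{aligned}
\]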
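The duality relationship can be checked numerically on a small normal-form game. The sketch below is an illustration under stated assumptions: it solves the column player's program and the row player's program separately with SciPy's `linprog`, writing the row player's objective as a variable v that stands for the dual objective value, and compares the two optimal values. The function name `solve_zero_sum` and the rock-paper-scissors matrix are hypothetical.

```python
import numpy as np
from scipy.optimize import linprog


def solve_zero_sum(A):
    """Return (x, y, primal_value, dual_value) for payoff matrix A."""
    A = np.asarray(A, dtype=float)
    m, n = A.shape
    # Column player: minimise p with A y <= p * 1, sum(y) = 1, y >= 0.
    col = linprog(
        c=np.r_[np.zeros(n), 1.0],
        A_ub=np.hstack([A, -np.ones((m, 1))]), b_ub=np.zeros(m),
        A_eq=np.r_[np.ones(n), 0.0].reshape(1, -1), b_eq=[1.0],
        bounds=[(0, None)] * n + [(None, None)],
    )
    # Row player: maximise v with A^T x >= v * 1, sum(x) = 1, x >= 0.
    # linprog minimises, so the objective is -v.
    row = linprog(
        c=np.r_[np.zeros(m), -1.0],
        A_ub=np.hstack([-A.T, np.ones((n, 1))]), b_ub=np.zeros(n),
        A_eq=np.r_[np.ones(m), 0.0].reshape(1, -1), b_eq=[1.0],
        bounds=[(0, None)] * m + [(None, None)],
    )
    if not (col.success and row.success):
        raise RuntimeError("LP solver failed")
    return row.x[:m], col.x[:n], col.x[-1], row.x[-1]


# Rock-paper-scissors payoffs for the row player (illustrative).
A = [[0, -1, 1], [1, 0, -1], [-1, 1, 0]]
x, y, primal_value, dual_value = solve_zero_sum(A)
print(x, y, primal_value, dual_value)
```

Both optimal values should agree (here at 0, up to solver tolerance), and both strategies should be close to the uniform distribution.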
In an extensive form game, the entries of E are computed as follows. Each column
corresponds to a sequence of choices of the row player, with the first column being the
empty sequence, and each row after the first corresponds to an information set. The
first entry in the first row is 1 and the rest of the entries in that row are 0. In the row
for an information set, the entry is 1 in each column whose sequence ends with a choice
made at that information set, -1 in the column of the sequence leading into the
information set, and 0 elsewhere. In this setting e becomes a vector whose first entry is
1 and whose other entries are 0. The entries of x are the realization weights of the
sequences, that is, the probabilities with which the player makes the choices along each
sequence. Then Ex = e expresses the constraints that the empty sequence has
realization weight 1 and that the realization weight of a sequence leading into an
information set equals the sum of the realization weights of its child sequences. This
ensures that x corresponds to some mixed strategy, allowing us to use the linear
program previously defined. The entries of A are determined as follows: for each leaf,
consider the entry of the matrix (whose rows and columns represent sequences
belonging to the respective players) indexed by the pair of sequences that leads to that
leaf. That entry is the product of all chance probabilities on the path to the leaf times
the payoff associated with the leaf. All other entries of the matrix are zero. The values of
y, F, f, and B (the payoff matrix for Player 2, which equals -A in a zero-sum game) are
defined similarly.
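The construction of E can be illustrated on a toy example. The sketch below assumes a hypothetical player with two information sets: I1, reached by the empty sequence, with choices L and R, and I2, reached after L, with choices a and b. It also assumes that e is the vector (1, 0, ..., 0). The names `constraint_matrix` and `realization_plan` are invented for illustration. The code builds E row by row as described above and checks that the realization weights derived from a behavior strategy satisfy Ex = e.

```python
from fractions import Fraction

# Hypothetical game fragment: sequences are written as strings of choices.
sequences = ["", "L", "R", "La", "Lb"]
# Each information set maps to (sequence leading into it, its child sequences).
info_sets = {"I1": ("", ["L", "R"]), "I2": ("L", ["La", "Lb"])}


def constraint_matrix(sequences, info_sets):
    """Build E and e so that E x = e characterises realization plans."""
    col = {s: k for k, s in enumerate(sequences)}
    # First row: the empty sequence has realization weight 1.
    E = [[1] + [0] * (len(sequences) - 1)]
    e = [1]
    for parent, children in info_sets.values():
        row = [0] * len(sequences)
        row[col[parent]] = -1        # sequence leading into the set
        for child in children:
            row[col[child]] = 1      # sequences ending with a choice here
        E.append(row)
        e.append(0)
    return E, e


def realization_plan(behavior):
    """Multiply the choice probabilities along each sequence."""
    x = {"": Fraction(1)}
    for parent, children in info_sets.values():
        for child in children:
            x[child] = x[parent] * behavior[child]
    return [x[s] for s in sequences]


E, e = constraint_matrix(sequences, info_sets)
# Probability of the last choice of each sequence (illustrative values).
behavior = {"L": Fraction(1, 4), "R": Fraction(3, 4),
            "La": Fraction(1, 2), "Lb": Fraction(1, 2)}
x = realization_plan(behavior)
Ex = [sum(r * v for r, v in zip(row, x)) for row in E]
print(E)
print(Ex == e)
```

The dictionary `info_sets` is listed parent-first so that a sequence's weight is available before its children are computed; a larger example would need to traverse the tree in that order.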
Miltersen and Sorensen have extended the algorithm of Koller, Megiddo and von
Stengel to compute proper equilibria in two-player, zero-sum extensive games [4]. The
core of the algorithm is the repeated solution of the following linear programs:
and
where A, F, e, f, and x are defined as in the Koller, Megiddo and von Stengel method, m
represents actions that are mistakes for Player 2, v(k) is the value to be gained from
exploiting mistakes, and t is the payoff to Player 1. The number of iterations required is
bounded by the number of actions owned by Player 2 in the game tree, and the algorithm
yields a pair of proper equilibrium strategies.
4. Results
Both the method of Koller, Megiddo, and von Stengel and that of Miltersen and
Sorensen rely fundamentally on formulating and solving linear programs. Although the
latter requires the solution of a series of linear programs, the number of such programs is
polynomially bounded. Therefore, since linear programs can be solved in polynomial time
(using Karmarkar’s algorithm, for example), both methods run in polynomial time, which
places the corresponding problems in the complexity class P.
Algorithm                         Output                               Computation Required                 Complexity Class
Koller, Megiddo, von Stengel [3]  Standard Nash equilibrium            Single solution of linear program    P
Miltersen-Sorensen [4]            Normal form proper Nash equilibrium  Iterated solution of linear program  P

Table 1: A comparison of algorithms for computing standard and proper Nash equilibria in terms of the
general method and time complexity.
5. Discussion
The algorithms presented here fulfill complementary roles. Koller, Megiddo, and von
Stengel describe a method for computing standard Nash equilibria that is applicable to
extensive form two-player games, both zero- and general-sum. In the latter case, it may
take exponential time. The algorithm of Miltersen and Sorensen computes proper
equilibria in two-player extensive form games, provided the game is zero-sum.
6. Future work
The main difficulty in this area is that computing equilibria in the general-sum
case is believed to be intractable. It has been shown by Chen and Deng that even in the
two-player case, computing exact Nash equilibria is complete for a complexity class
known as PPAD [1]. Evidence suggests that no efficient algorithms exist for such
problems [5]. Therefore, finding approximate solutions is an important area of current
research. The ε-approximate Nash equilibrium is a promising idea related to this issue.
In an ε-approximate Nash equilibrium, no player can increase his payoff by more than ε
by unilaterally changing strategy. Currently, the best polynomial time
approximation algorithm for two-player, general-sum games achieves ε equal to 0.3393,
where the payoffs have been linearly mapped into the [0, 1] interval [6]. However, note
that in this result, ε is constant. It has been shown that the general problem of
approximating a Nash equilibrium to arbitrary ε is also PPAD-complete. Thus, it appears
that the best that may be hoped for in this endeavor is to find a smaller, but still
constant, value for ε. Ideally, a theoretical bound on the value of ε could be determined.
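To make the role of ε concrete, the sketch below measures how far a given strategy pair is from equilibrium in a two-player, general-sum game. It assumes the usual reading of ε as the largest gain any player can obtain by a unilateral deviation. The function name `epsilon_of` and the example game (matching pennies rescaled into [0, 1]) are hypothetical. This only evaluates a given pair; it is not the approximation algorithm of [6].

```python
from fractions import Fraction


def epsilon_of(A, B, x, y):
    """Largest payoff gain available to either player by deviating alone.

    A and B are the row and column players' payoff matrices;
    x and y are their mixed strategies.
    """
    m, n = len(A), len(A[0])
    row_now = sum(x[i] * A[i][j] * y[j] for i in range(m) for j in range(n))
    col_now = sum(x[i] * B[i][j] * y[j] for i in range(m) for j in range(n))
    # A best deviation can always be taken to be a pure strategy.
    row_best = max(sum(A[i][j] * y[j] for j in range(n)) for i in range(m))
    col_best = max(sum(x[i] * B[i][j] for i in range(m)) for j in range(n))
    return max(row_best - row_now, col_best - col_now)


# Matching pennies with payoffs mapped into [0, 1] (illustrative).
A = [[1, 0], [0, 1]]
B = [[0, 1], [1, 0]]
half = Fraction(1, 2)

eps_exact = epsilon_of(A, B, [half, half], [half, half])
eps_rough = epsilon_of(A, B, [1, 0], [half, half])
print(eps_exact, eps_rough)
```

The uniform pair is an exact equilibrium, so its ε is 0. When the row player commits to the first row while the column player stays uniform, the column player can gain 1/2 by deviating, so that pair is only a 1/2-approximate equilibrium.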
References
[1] X. Chen and X. Deng. Settling the complexity of two-player Nash equilibrium. In
Proc. 47th FOCS, pages 261-272, 2006.
[3] D. Koller, N. Megiddo, and B. von Stengel. Fast algorithms for finding
randomized strategies in game trees. In Proc. 26th STOC, pages 750-759, 1994.
[4] P. Miltersen and T. Sørensen. Fast algorithms for finding proper strategies in
game trees. In Proc. 19th SODA, pages 874-883, 2008.
[5] C. Papadimitriou. On the complexity of the parity argument and other inefficient
proofs of existence. Journal of Computer and System Sciences, pages 498-532,
1994.