
Game Theory II

Definition of Nash Equilibrium


A game has n players. Each player i has a strategy set Si.
These are the player's possible actions.

Each player has a payoff function


pi: S → R, where S = S1 × S2 × … × Sn

A strategy ti ∈ Si is a best response if there is no other strategy in Si that produces a higher payoff, given the opponents' strategies.

Definition of Nash Equilibrium


A strategy profile is a list (s1, s2, …, sn) of the strategies each player is using. If each strategy is a best response given the other strategies in the profile, the profile is a Nash equilibrium. Why is this important?
If we assume players are rational, they will play Nash strategies. Even less-than-rational play will often converge to Nash in repeated settings.
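
As an illustration (not from the slides), here is a minimal Python sketch that checks every cell of a two-player payoff table for this best-response property; the function name and the payoffs[r][c] = (row payoff, column payoff) layout are just conventions chosen here:

# Minimal sketch: find the pure-strategy Nash equilibria of a two-player game.
# payoffs[r][c] = (row player's payoff, column player's payoff).

def pure_nash_equilibria(payoffs):
    n_rows, n_cols = len(payoffs), len(payoffs[0])
    equilibria = []
    for r in range(n_rows):
        for c in range(n_cols):
            row_pay, col_pay = payoffs[r][c]
            # r must be a best response to c ...
            row_best = all(payoffs[r2][c][0] <= row_pay for r2 in range(n_rows))
            # ... and c must be a best response to r.
            col_best = all(payoffs[r][c2][1] <= col_pay for c2 in range(n_cols))
            if row_best and col_best:
                equilibria.append((r, c))
    return equilibria

# The 2x2 example from the next slide, with strategies indexed a=0, b=1:
example = [[(1, 2), (0, 1)],
           [(2, 1), (1, 0)]]
print(pure_nash_equilibria(example))   # -> [(1, 0)], i.e. (b, a)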

An Example of a Nash Equilibrium


             Column
              a      b
Row   a      1,2    0,1
      b      2,1    1,0

(b,a) is a Nash equilibrium. To prove this: Given that column is playing a, row's best response is b. Given that row is playing b, column's best response is a.

Finding Nash Equilibria: Dominated Strategies


What should we do when it's not obvious what the equilibrium is? In some cases, we can eliminate dominated strategies.
These are strategies that are inferior for every opponent action.

In the previous example, row's strategy a is dominated by b.

Example
A 3x3 example:
              Column
              a        b        c
Row   a     73,25    57,42    66,32
      b     80,26    35,12    32,54
      c     28,27    63,31    54,29

Example
A 3x3 example:
              Column
              a        b        c
Row   a     73,25    57,42    66,32
      b     80,26    35,12    32,54
      c     28,27    63,31    54,29

c dominates a for the column player

Example
A 3x3 example:
              Column
              a        b        c
Row   a     73,25    57,42    66,32
      b     80,26    35,12    32,54
      c     28,27    63,31    54,29

b is then dominated by both a and c for the row player.

Example
A 3x3 example:
              Column
              a        b        c
Row   a     73,25    57,42    66,32
      b     80,26    35,12    32,54
      c     28,27    63,31    54,29

Given this, b dominates c for the column player, so the column player will always play b.

Example
A 3x3 example:
              Column
              a        b        c
Row   a     73,25    57,42    66,32
      b     80,26    35,12    32,54
      c     28,27    63,31    54,29

Since column is playing b, row will prefer c.

Example
              Column
              a        b        c
Row   a     73,25    57,42    66,32
      b     80,26    35,12    32,54
      c     28,27    63,31    54,29

We verify that (c,b) is a Nash Equilibrium by observation: If row plays c, b is the best response for column. If column plays b, c is the best response by row.
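
The same elimination procedure can be sketched in a few lines of Python (illustrative only; eliminate_dominated is a name chosen here, and the matrix is the 3x3 game above):

# Sketch: iterated elimination of strictly dominated strategies.
# payoffs[r][c] = (row player's payoff, column player's payoff).

def eliminate_dominated(payoffs):
    rows = list(range(len(payoffs)))
    cols = list(range(len(payoffs[0])))
    changed = True
    while changed:
        changed = False
        # Drop any row strictly dominated by another surviving row.
        for r in rows[:]:
            if any(all(payoffs[r2][c][0] > payoffs[r][c][0] for c in cols)
                   for r2 in rows if r2 != r):
                rows.remove(r)
                changed = True
        # Drop any column strictly dominated by another surviving column.
        for c in cols[:]:
            if any(all(payoffs[r][c2][1] > payoffs[r][c][1] for r in rows)
                   for c2 in cols if c2 != c):
                cols.remove(c)
                changed = True
    return rows, cols

game = [[(73, 25), (57, 42), (66, 32)],   # row a
        [(80, 26), (35, 12), (32, 54)],   # row b
        [(28, 27), (63, 31), (54, 29)]]   # row c
print(eliminate_dominated(game))          # -> ([2], [1]), i.e. only (c, b) survives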

Example #2
You try this one:
             Column
              a      b      c
Row   a      2,2    1,1    4,0
      b      1,2    4,1    3,5

Coordination Games
Consider the following problem:
A supplier and a buyer need to decide whether to adopt a new purchasing system.
                 Buyer
                 new      old
Supplier  new   20,20     0,0
          old    0,0      5,5

No dominated strategies!

Coordination Games
                 Buyer
                 new      old
Supplier  new   20,20     0,0
          old    0,0      5,5

This game has two Nash equilibria: (new, new) and (old, old).
Real-life examples: Beta vs VHS, Mac vs Windows vs Linux, others?
Each player wants to do what the other does, which may be different from what they say they'll do.
How to choose a strategy? Nothing is dominated.

Solving Coordination Games


Coordination games turn out to be an important real-life problem
Technology/policy/strategy adoption, delegation of authority, synchronization

Human agents tend to use focal points


Solutions that seem to make natural sense
e.g. pick a number between 1 and 10

Social norms/rules are also used


Driving on the right/left side of the road

These strategies change the structure of the game

Finding Nash Equilibria: Simultaneous Equations


We can also express a game as a set of equations. Demand for corn is governed by the following equation:
Quantity = 100,000 * (10 - p)

Government price supports say that p must be at least $0.25 (and it can't be more than $10). Three farmers can each choose to sell 0 to 600,000 lbs of corn. What are the Nash equilibria?

Setup
Quantity: q = q1 + q2 + q3
Price: p = a - bq (a downward-sloping demand line)
Farmer 1 is trying to decide what quantity to sell, maximizing profit = price * quantity.
Maximize: p*q1 = (a - bq) * q1
Profit = (a - b(q1 + q2 + q3)) * q1 = a*q1 - b*q1^2 - b*q1*q2 - b*q1*q3
Differentiate: dProfit/dq1 = a - 2bq1 - bq2 - bq3
To maximize, set this equal to zero.

Setup
So solutions must satisfy
a - b(q2 + q3) - 2bq1 = 0

So what if q1 = q2 = q3 (everyone ships the same amount?)


Since the game is symmetric, this should be a solution: a - 4bq1 = 0, so a = 4bq1 and q1 = a/4b. Then q = 3a/4b and p = a/4. Each farmer earns a^2/16b. In this problem a = 10 and b = 1/100,000, so price = $2.50, q1 = 250,000, and profit = $625,000. Thus q1 = q2 = q3 = 250,000 is a solution. Price supports are not used in this solution.
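
A quick numeric check of these formulas (a minimal sketch in Python; the names a, b, and q1 mirror the symbols above):

# Symmetric Cournot solution for the corn example.
a, b = 10, 1 / 100_000          # inverse demand: p = a - b*q

q1 = a / (4 * b)                # each farmer's quantity, from a - 4*b*q1 = 0
q_total = 3 * q1
price = a - b * q_total
profit = price * q1
print(q1, price, profit)        # -> 250000.0 2.5 625000.0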

Setup
What if farmers 2 & 3 send everything they have?
q2 + q3 = 1,200,000

If farmer 1 then shipped nothing, price would be:


10 - 1,200,000/100,000 = -2.
But prices can't fall below $0.25, so they'd be capped there. Adding quantity would normally reduce the price, but the supports prevent that.

So farmer 1 should sell all his corn at $0.25, earning 600,000 * $0.25 = $150,000.

So everyone selling everything at the lowest price (q1 = q2 =q3 = 600,000) is also a Nash equilibrium.
These are the only pure strategy Nash equilibria.
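
A small check of the price-support case (again a sketch; the three trial quantities are just illustrative):

# If the other two farmers ship everything (q2 + q3 = 1,200,000), the demand
# price would be negative, so the $0.25 support binds no matter what farmer 1
# does; revenue is then simply $0.25 per pound sold.
a, b, floor = 10, 1 / 100_000, 0.25

def price(q_total):
    return max(a - b * q_total, floor)   # demand price, but never below the support

others = 1_200_000                       # q2 + q3
for q1 in (0, 300_000, 600_000):
    print(q1, price(q1 + others) * q1)   # 0 -> 0.0, 300000 -> 75000.0, 600000 -> 150000.0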

Price-matching Example
Two sellers are offering the same book for sale. This book costs each seller $25. The lowest price gets all the customers; if they match, profits are split. What is the Nash Equilibrium strategy?

Price-matching Example
Suppose the monopoly price of the book is $30.
(price that maximizes profit w/o competition)

Each seller offers a rebate: if you find the book cheaper somewhere else, we'll sell it to you with double the difference subtracted.
E.g., $30 at store 1 and $24 at store 2: you can get it for $18 from store 1.

Now what is each seller's Nash strategy?

Price-matching example
Observation 1: sellers want to have the same price.
Each suffers from giving the rebate.

Profit = p1 - 2(p1 - p2) = 2p2 - p1. dProfit/dp1 = -1.


There is no local maximum: raising p1 above p2 only reduces profit. So the sellers match prices and, to maximize profits, set that common price as high as possible.

At that point, the rebate 2(p1 - p2) is 0, and p1 is as high as possible (the monopoly price of $30).


The 2 makes up for sharing the market.
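
A toy check of this argument, under assumptions that are not in the slides (100 customers split evenly between the two stores, a $25 cost per book, and the doubled rebate applied at the pricier store):

# Seller 1's profit when seller 2 holds at the monopoly price of $30.
COST, CUSTOMERS = 25, 100

def profit(p_own, p_rival):
    # With the rebate, a customer at the pricier store pays p_own - 2*(p_own - p_rival).
    paid = p_own if p_own <= p_rival else p_own - 2 * (p_own - p_rival)
    return (paid - COST) * CUSTOMERS / 2      # each store serves half the market

rival = 30
best = max(range(25, 41), key=lambda p: profit(p, rival))
print(best, profit(best, rival))              # -> 30 250.0: matching at $30 is best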

Cooperative Games and Coalitions


When a group of agents decides to cooperate to improve their payoffs (for example, by adopting a technology), we call them a coalition.
Side payments, bribes, and intimidation may be used to set up a coalition. Example: A, B, and C are running for class president. The president receives $10, everyone else $0. Each player's strategy is to vote for themselves. A offers B $5 to vote for her; now both A and B are happier and have formed a coalition.

Efficiency
We say that a coalition is efficient if there's no choice of action that can improve one person's profit without decreasing another's.
Same reasoning as Nash equilibria and market equilibria. If someone could change their strategy, improving their payoff without hurting anyone, it's not efficient: money is left on the table.

Example: cake-cutting.

Mixed strategies
Unfortunately, not every game has a pure strategy equilibrium.
Rock-paper-scissors

However, every finite game has a mixed-strategy Nash equilibrium. Each action is assigned a probability of play, and each player is indifferent among the actions in their mix, given these probabilities.

Mixed Strategies
In many games (such as coordination games) a player might not have a pure strategy. Instead, optimizing payoff might require a randomized strategy (also called a mixed strategy)
                      Wife
                  football   shopping
Husband football     2,1       0,0
        shopping     0,0       1,2

Strategy Selection
                      Wife
                  football   shopping
Husband football     2,1       0,0
        shopping     0,0       1,2

If we limit to pure strategies (each guessing the other's choice 50/50):
Husband: U(football) = 0.5 * 2 + 0.5 * 0 = 1, U(shopping) = 0.5 * 0 + 0.5 * 1 = 0.5
Wife: U(shopping) = 1, U(football) = 0.5
Problem: this won't lead to coordination!

Mixed strategy
Instead, each player selects a probability associated with each action
Goal: the utility of each action is equal, so players are indifferent among their choices at these probabilities.

a = probability husband chooses football, b = probability wife chooses shopping.
Since the payoffs must be equal, for the husband:
b*1 = (1-b)*2, so b = 2/3

For wife:
a*1 = (1-a)*2, so a = 2/3

In each case, expected payoff is 2/3


2/9 of the time they both go to football, 2/9 of the time both go shopping, and 5/9 of the time they miscoordinate.

If they could synchronize ahead of time they could do better.
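
A short sketch that just recomputes these numbers:

# Mixed-strategy equilibrium of the husband/wife game above.
# a = P(husband picks football), b = P(wife picks shopping).
a = 2 / 3                          # from the wife's indifference: a*1 = (1-a)*2
b = 2 / 3                          # from the husband's indifference: b*1 = (1-b)*2

p_football = a * (1 - b)           # both end up at football
p_shopping = (1 - a) * b           # both end up shopping
p_miss = 1 - p_football - p_shopping
print(p_football, p_shopping, p_miss)    # -> 2/9, 2/9, 5/9
print(2 * p_football + 1 * p_shopping)   # husband's expected payoff: 2/3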

Example: Rock paper scissors


                  Column
                  rock     paper    scissors
Row   rock         0,0     -1,1      1,-1
      paper        1,-1     0,0     -1,1
      scissors    -1,1      1,-1     0,0

Setup
Player 1 plays rock with probability pr, scissors with probability ps, and paper with probability 1 - pr - ps.
P2: Utility(rock) = 0*pr + 1*ps - 1*(1 - pr - ps) = 2ps + pr - 1
P2: Utility(scissors) = 0*ps + 1*(1 - pr - ps) - 1*pr = 1 - 2pr - ps
P2: Utility(paper) = 0*(1 - pr - ps) + 1*pr - 1*ps = pr - ps
In a mixed equilibrium, these expected payoffs must all be the same; otherwise player 2 would simply play whichever strategy pays most.

Setup
Setting them equal: 2ps + pr - 1 = 1 - 2pr - ps = pr - ps. It turns out (after some algebra) that the optimal mixed strategy is to play each strategy 1/3 of the time.

Intuition: What if you played rock half the time? Your opponent would then play paper half the time, and you'd lose more often than you won. So you'd decrease the fraction of times you played rock, until your opponent had no edge in guessing what you'll do.
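
A quick check that pr = ps = 1/3 (and hence paper = 1/3) equalizes the three expected utilities above (a minimal sketch):

# Verify player 2's indifference in rock-paper-scissors.
pr = ps = 1 / 3
pp = 1 - pr - ps                       # probability of paper

u_rock     = 0 * pr + 1 * ps - 1 * pp
u_scissors = 0 * ps + 1 * pp - 1 * pr
u_paper    = 0 * pp + 1 * pr - 1 * ps
print(u_rock, u_scissors, u_paper)     # all ~0.0, equal up to rounding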

Repeated games
Many games get played repeatedly. A common strategy for the husband-wife problem is to alternate.
This leads to a payoff stream of 1, 2, 1, 2, ..., averaging 1.5 per week.

Requires initial synchronization, plus trust that partner will go along. Difference in formulation: we are now thinking of the game as a repeated set of interactions, rather than as a one-shot exchange.

Repeated vs Stage Games


There are two types of multiple-action games:
Stage games: players take a number of actions and then receive a payoff.
Checkers, chess, bidding in an ascending auction

Repeated games: Players repeatedly play a shorter game, receiving payoffs along the way.
Poker, blackjack, rock-paper-scissors, etc

Analyzing Stage Games


Analyzing stage games requires backward induction. We start at the last action, determine what should happen there, and work backwards.
Just like solving a game tree in extensive form.

Strange things can happen here:


Centipede game
Players alternate; each can either cooperate and get $1 from nature, or defect and steal $2 from their opponent. The game ends when one player has $100 or when one player defects.
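
The rules above leave some details open; assuming that cooperating adds $1 to the mover's pile and play continues, while defecting grabs $2 from the opponent and ends the game, a minimal backward-induction sketch in Python looks like this:

# Backward induction on the centipede game described above (assumed rules).
from functools import lru_cache

TARGET = 100                             # play also stops once a pile reaches $100

@lru_cache(maxsize=None)
def value(m0, m1, mover):
    """Return (payoff to player 0, payoff to player 1, mover's action)."""
    if m0 >= TARGET or m1 >= TARGET:
        return m0, m1, None              # terminal: someone reached $100
    piles = [m0, m1]
    # Defect: take $2 from the opponent and end the game.
    d = piles[:]
    d[mover] += 2
    d[1 - mover] -= 2
    defect = (d[0], d[1], "defect")
    # Cooperate: gain $1 from nature, then the opponent moves.
    c = piles[:]
    c[mover] += 1
    p0, p1, _ = value(c[0], c[1], 1 - mover)
    coop = (p0, p1, "cooperate")
    # The mover picks whichever continuation pays it more.
    return max(defect, coop, key=lambda outcome: outcome[mover])

print(value(0, 0, 0))   # -> (2, -2, 'defect'): induction says defect immediately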

Analyzing Repeated Games


Analyzing repeated games requires us to examine the expected utility of different actions. Assumption: game is played infinitely often
Weird endgame effects go away.

Prisoner's Dilemma again:


In this case, tit-for-tat outperforms defection.
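
A small simulation illustrating the claim (the payoff numbers below are standard textbook Prisoner's Dilemma values, assumed here rather than taken from the slides):

# Repeated Prisoner's Dilemma: 'C' = cooperate, 'D' = defect.
# PAYOFF maps (my move, their move) -> my payoff.
PAYOFF = {('C', 'C'): 3, ('C', 'D'): 0, ('D', 'C'): 5, ('D', 'D'): 1}

def tit_for_tat(opponent_history):           # copy the opponent's last move
    return opponent_history[-1] if opponent_history else 'C'

def always_defect(opponent_history):
    return 'D'

def play(strat_a, strat_b, rounds=200):
    hist_a, hist_b, score_a, score_b = [], [], 0, 0
    for _ in range(rounds):
        a, b = strat_a(hist_b), strat_b(hist_a)    # each sees the other's history
        score_a += PAYOFF[(a, b)]
        score_b += PAYOFF[(b, a)]
        hist_a.append(a)
        hist_b.append(b)
    return score_a, score_b

print(play(tit_for_tat, tit_for_tat))        # -> (600, 600): sustained cooperation
print(play(always_defect, always_defect))    # -> (200, 200): mutual defection
print(play(tit_for_tat, always_defect))      # -> (199, 204): defection gains little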

Collusion can also be explained this way.


The short-term gain from undercutting is less than the long-run gain from avoiding competition.
