21CSC206T Unit3
19/12/23 1
Adversarial search Methods-Game
playing-Important concepts
Game playing and knowledge structure-
Elements of Game Playing search
• To play a game, we use a game tree to enumerate all the possible choices and pick the best one. A game-playing problem has the following elements:
• S0: The initial state from which the game begins.
• For example, a game such as chess or tic-tac-toe has up to three possible outcomes: win, lose, or draw, with values +1, -1, or 0.
• Game Tree for Tic-Tac-Toe
• INITIAL STATE (S0): The top node of the game tree represents the initial state, from which all the possible first moves branch out.
• PLAYER (s): Specifies which player has the move in state s. There are two players, MAX and MIN. MAX begins the game by picking the best move and placing an X in an empty square.
• ACTIONS (s): The set of legal moves in state s. The players take turns making a move in an empty square.
• RESULT (s, a): The state that results from making move a in state s. The moves made by MIN and MAX decide the outcome of the game.
• TERMINAL-TEST (s): True when the game is over. In tic-tac-toe, the game ends when one player completes a line or all the empty squares are filled.
• UTILITY: At the end, we know who wins, MAX or MIN, and the payoff is assigned accordingly.
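The elements above can be sketched in code. The following is a minimal illustration for tic-tac-toe, assuming a board stored as a 9-tuple of cells and MAX playing X; the names `S0`, `player`, `actions`, `result`, `terminal_test` and `utility` mirror the slide, while `winner` and `LINES` are helper names introduced here.

```python
from typing import List, Optional, Tuple

State = Tuple[str, ...]  # 9 cells, each "X", "O", or " " (assumed encoding)

# All eight winning lines on a 3x3 board, as index triples.
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
         (0, 3, 6), (1, 4, 7), (2, 5, 8),
         (0, 4, 8), (2, 4, 6)]

S0: State = (" ",) * 9  # initial state: the empty board


def player(s: State) -> str:
    """MAX plays X and moves first; MIN plays O."""
    return "X" if s.count("X") == s.count("O") else "O"


def actions(s: State) -> List[int]:
    """Legal moves are the indices of the empty squares."""
    return [i for i, cell in enumerate(s) if cell == " "]


def result(s: State, a: int) -> State:
    """Transition model: the board after the player to move marks square a."""
    board = list(s)
    board[a] = player(s)
    return tuple(board)


def winner(s: State) -> Optional[str]:
    """Return "X" or "O" if that player has completed a line, else None."""
    for i, j, k in LINES:
        if s[i] != " " and s[i] == s[j] == s[k]:
            return s[i]
    return None


def terminal_test(s: State) -> bool:
    """The game is over on a win or when no empty square is left."""
    return winner(s) is not None or not actions(s)


def utility(s: State) -> int:
    """Payoff from MAX's point of view: +1 win, -1 loss, 0 draw."""
    w = winner(s)
    return 1 if w == "X" else -1 if w == "O" else 0
```

For example, `result(S0, 4)` places an X in the centre square, after which `player` reports that it is O's turn.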
Game as a search problem
Game Playing vs. Search
Game Playing
[Figure: game tree from Max’s perspective]
Minimax Algorithm
• Minimax algorithm
• Perfect play for deterministic, two-player games
• Max tries to maximize its score
• Min tries to minimize Max’s score
• Goal: move to the position with the highest minimax value
→ Identify the best achievable payoff against best play
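A minimal sketch of the minimax rule, assuming the game tree is given explicitly as nested Python lists whose leaves are integer payoffs for Max. The function names `minimax` and `best_move` are illustrative.

```python
from typing import List, Union

Tree = Union[int, List["Tree"]]  # a leaf payoff or a list of subtrees


def minimax(node: Tree, maximizing: bool) -> int:
    """Back up leaf payoffs: Max takes the largest child value, Min the smallest."""
    if isinstance(node, int):
        return node
    values = [minimax(child, not maximizing) for child in node]
    return max(values) if maximizing else min(values)


def best_move(children: List[Tree]) -> int:
    """Index of the root move with the highest minimax value (Max to move)."""
    values = [minimax(child, maximizing=False) for child in children]
    return values.index(max(values))


# Two-ply example: Max chooses between two Min nodes with leaves (2, 7) and (1, 8).
tree = [[2, 7], [1, 8]]
print(minimax(tree, maximizing=True))  # 2
print(best_move(tree))                 # 0 (the left move)
```

Min would answer the left move with 2 and the right move with 1, so Max's best achievable payoff against best play is 2.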
Minimax Rule
Minimax Search
[Figure: two-ply minimax tree. The MAX root has value 2; the two MIN nodes have values 2 and 1; the leaves are 2, 7, 1, 8.]
Minimax Algorithm (cont’d)
[Figure: MAX root over three MIN nodes with leaves (3, 9), (0, 7) and (2, 6). The MIN nodes back up the values 3, 0 and 2, so the value at the root is 3.]
• Limitations
• Not always feasible to traverse entire tree
• Time limitations
• Key Improvement
• Use evaluation function instead of utility
• Evaluation function provides estimate of utility at given position
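A minimal sketch of the improvement described above: minimax with a depth cutoff that falls back on an evaluation function. The tree, the true utilities and the heuristic estimates below are all made-up illustration values, and `h_minimax` is a name chosen here.

```python
from typing import Dict, List

# Hypothetical game tree: A is the root, D..G are terminal positions.
CHILDREN: Dict[str, List[str]] = {"A": ["B", "C"], "B": ["D", "E"], "C": ["F", "G"]}
UTILITY: Dict[str, int] = {"D": 3, "E": 9, "F": 0, "G": 7}   # true payoffs at terminals
ESTIMATE: Dict[str, int] = {"A": 4, "B": 4, "C": 5}          # assumed evaluation function


def h_minimax(state: str, depth: int, maximizing: bool) -> int:
    """Minimax that stops at `depth` and uses the estimate instead of searching on."""
    if state in UTILITY:          # terminal test: use the real utility
        return UTILITY[state]
    if depth == 0:                # cutoff: use the estimated utility
        return ESTIMATE[state]
    values = [h_minimax(c, depth - 1, not maximizing) for c in CHILDREN[state]]
    return max(values) if maximizing else min(values)


print(h_minimax("A", depth=2, maximizing=True))  # 3, the full search
print(h_minimax("A", depth=1, maximizing=True))  # 5, based on estimates only
```

The shallow search is cheaper but its answer is only as good as the evaluation function: here it prefers C, while the full search shows that B is the better move.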
Unit 2 List of Topics
• Searching techniques – Uninformed search – General search Algorithm
• Uninformed search Methods – Breadth First Search
• Uninformed search Methods – Depth First Search
• Informed search – A* Algorithm
• AO* search
• Local search Algorithms – Hill Climbing, Simulated Annealing
• Local Beam Search
• Genetic Algorithms
Alpha Beta Pruning
Working of Alpha-Beta Pruning:
• Let's take an example two-player search tree to understand how alpha-beta pruning works.
• Step 1: The Max player makes the first move from node A, where α = -∞ and β = +∞. These values of alpha and beta are passed down to node B, where again α = -∞ and β = +∞, and node B passes the same values to its child D.
Step 2: At node D, the value of α is calculated, as it is Max's turn. The value of α is compared first with 2 and then with 3, so max(2, 3) = 3 becomes the value of α at node D, and the node value is also 3.
Step 3: The algorithm now backtracks to node B, where the value of β changes, as this is Min's turn. β = +∞ is compared with the value of the available successor node, i.e. min(∞, 3) = 3, hence at node B now α = -∞ and β = 3.
In the next step, the algorithm traverses the next successor of node B, which is node E, and the values α = -∞ and β = 3 are passed down as well.
Step 4: At node E, Max takes its turn, and the value of alpha changes. The current value of alpha is compared with 5, so max(-∞, 5) = 5. Hence at node E α = 5 and β = 3, where α ≥ β, so the right successor of E is pruned and the algorithm does not traverse it. The value at node E is 5.
Step 5: Next, the algorithm backtracks again, from node B to node A. At node A, the value of alpha changes to the maximum available value, 3, as max(-∞, 3) = 3, while β = +∞. These two values are now passed to the right successor of A, which is node C.
At node C, α = 3 and β = +∞, and the same values are passed on to node F.
Step 6: At node F, the value of α is again compared with the left child, which is 0, and max(3, 0) = 3, and then with the right child, which is 1, and max(3, 1) = 3. So α remains 3, but the node value of F becomes 1.
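Step 7: Node F returns its value 1 to node C. At C it is Min's turn, so β = min(+∞, 1) = 1 while α = 3. Now α ≥ β, so the remaining successor of C is pruned and the algorithm does not traverse it.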
Step 8: C now returns the value 1 to A, where the best value is max(3, 1) = 3. The final game tree shows which nodes were computed and which nodes were never computed. Hence the optimal value for the maximizer in this example is 3.
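The walkthrough above can be reproduced with a short sketch. The tree is encoded as nested lists; the leaf values 2, 3, 5, 0 and 1 come from the steps, while the pruned leaves (9 under E and 7, 5 under the right child of C) are assumed values that the search never looks at. The `visited` list is an addition for illustration.

```python
import math
from typing import List, Union

Tree = Union[int, List["Tree"]]


def alphabeta(node: Tree, alpha: float, beta: float,
              maximizing: bool, visited: List[int]) -> float:
    """Minimax with alpha-beta pruning; records every leaf actually evaluated."""
    if isinstance(node, int):
        visited.append(node)
        return node
    if maximizing:
        value = -math.inf
        for child in node:
            value = max(value, alphabeta(child, alpha, beta, False, visited))
            alpha = max(alpha, value)
            if alpha >= beta:   # Min above will never allow this branch
                break
        return value
    value = math.inf
    for child in node:
        value = min(value, alphabeta(child, alpha, beta, True, visited))
        beta = min(beta, value)
        if alpha >= beta:       # Max above already has something better
            break
    return value


# A = [B, C]; B = [D, E]; C = [F, G]; D = [2, 3]; E = [5, 9]; F = [0, 1]; G = [7, 5]
tree = [[[2, 3], [5, 9]], [[0, 1], [7, 5]]]
visited: List[int] = []
print(alphabeta(tree, -math.inf, math.inf, True, visited))  # 3
print(visited)                                              # [2, 3, 5, 0, 1]
```

Only five of the eight leaves are evaluated, and the root value 3 matches Step 8.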
What is Game Theory?
A Further Definition
Game theory: assumptions
Rules, Strategies, Payoffs, and
Equilibrium
Game Outcomes
Minimax Criterion
Cut cake as evenly as possible      Half the cake minus a crumb      Half the cake plus a crumb
⚫ If the upper and lower values are the same, the number is called the
value of the game and an equilibrium or saddle point condition exists
⚫ The value of a game is the average or expected game outcome if the game is
played an infinite number of times
⚫ A saddle point indicates that each player has a pure strategy i.e., the strategy
is followed no matter what the opponent does
Saddle Point
Pure Strategy - Minimax Criterion
                              Y1     Y2     Row minimum
Player X’s strategies   X1    10      6        6
                        X2   -12      2      -12
Maximum Column Number         10      6
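A minimal sketch of the minimax criterion applied to the table above, assuming the payoffs are those of the row player X. The function name `find_saddle_point` is illustrative.

```python
from typing import List, Optional, Tuple


def find_saddle_point(payoff: List[List[int]]) -> Optional[Tuple[int, int, int]]:
    """Return (row, column, value) of a saddle point, or None if there is none.

    The row player maximizes the row minimum (lower value of the game);
    the column player minimizes the column maximum (upper value).
    """
    row_minima = [min(row) for row in payoff]
    col_maxima = [max(col) for col in zip(*payoff)]
    lower, upper = max(row_minima), min(col_maxima)
    if lower != upper:
        return None  # no pure-strategy solution
    return row_minima.index(lower), col_maxima.index(upper), lower


game = [[10, 6],
        [-12, 2]]
print(find_saddle_point(game))  # (0, 1, 6): X plays X1, Y plays Y2, value 6
```

The row minima are 6 and -12 and the column maxima are 10 and 6, so the lower and upper values coincide at 6 and the game has a saddle point.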
Mixed Strategy Game
⚫ When there is no saddle point, players will play each strategy for a
certain percentage of the time
⚫ The most common way to solve a mixed strategy is to use the
expected gain or loss approach
⚫ A player plays each strategy a particular percentage of the time so that
the expected value of the game does not depend upon what the
opponent does
                  Y1 (P)           Y2 (1-P)          Expected Gain
X1 (Q)            4                2                 4P + 2(1-P)
X2 (1-Q)          1                10                1P + 10(1-P)
Expected Gain     4Q + 1(1-Q)      2Q + 10(1-Q)
Mixed Strategy Game: Solving for P & Q
4P + 2(1-P) = 1P + 10(1-P)
or: P = 8/11 and 1-P = 3/11
Expected payoff:
1P + 10(1-P)
= 1(8/11) + 10(3/11)
EPX = 38/11 ≈ 3.45
4Q + 1(1-Q) = 2Q + 10(1-Q)
or: Q = 9/11 and 1-Q = 2/11
Expected payoff:
EPY = 4(9/11) + 1(2/11) = 38/11 ≈ 3.45
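The expected gain approach for a 2 x 2 game can be checked with exact fractions. This is a sketch that assumes the game has no saddle point (otherwise the formulas below are not meaningful); the function name `solve_2x2_mixed` is chosen here.

```python
from fractions import Fraction
from typing import Tuple


def solve_2x2_mixed(a: int, b: int, c: int, d: int) -> Tuple[Fraction, Fraction, Fraction]:
    """Solve the game [[a, b], [c, d]] by equalizing expected gains.

    P is the probability that Y plays Y1, from aP + b(1-P) = cP + d(1-P).
    Q is the probability that X plays X1, from aQ + c(1-Q) = bQ + d(1-Q).
    Assumes there is no saddle point, so both probabilities lie in (0, 1).
    """
    denom = Fraction(a - b - c + d)
    if denom == 0:
        raise ValueError("expected gains cannot be equalized for this matrix")
    p = Fraction(d - b) / denom
    q = Fraction(d - c) / denom
    value = a * p + b * (1 - p)  # expected payoff of the game
    return p, q, value


p, q, value = solve_2x2_mixed(4, 2, 1, 10)
print(p, q, value)             # 8/11 9/11 38/11
print(round(float(value), 2))  # 3.45
```

Both players obtain the same expected payoff, 38/11, which is the value of the game.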
Mixed Strategy Game : Example
• Using the solution procedure for a mixed strategy game, solve the
following game
Mixed Strategy Game
Example
• This game can be solved by setting up the mixed strategy
table and developing the appropriate equations:
Mixed Strategy Game: Example
Two-Person Zero-Sum and
Constant-Sum Games
Two-person zero-sum and constant-sum games are played according to the following
basic assumption:
Each player chooses a strategy that enables him/her to do the best he/she can, given
that his/her opponent knows the strategy he/she is following.
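A game has a saddle point if
max over all rows (row minimum) = min over all columns (column maximum)   (1)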
Two-Person Zero-Sum and
Constant-Sum Games (Cont)
If a two-person zero-sum or constant-sum game has a saddle point, the row player should choose any strategy (row) attaining the maximum on the left side of (1). The column player should choose any strategy (column) attaining the minimum on the right side of (1).
In general, we may use the following method to find the optimal strategies and value of
two-person zero-sum or constant-sum game:
Step 1 Check for a saddle point. If the game has none, go on to step 2.
Two-Person Zero-Sum and
Constant-Sum Games (Cont)
Step 2 Eliminate any of the row player’s dominated strategies. Looking at the reduced
matrix (dominated rows crossed out), eliminate any of the column player’s dominated
strategies and then those of the row player. Continue until no more dominated strategies
can be found. Then proceed to step 3.
Step 3 If the game matrix is now 2 x 2, solve the game graphically. Otherwise, solve by
using a linear programming method.
Zero Sum Games
Let’s take the following example: Two TV channels (1 and 2) are competing for an
audience of 100 viewers. The rule of the game is to simultaneously announce the type of
show the channels will broadcast. Given the payoff matrix below, what type of show should
channel 1 air?
Two-person zero-sum game –
Dominance property
Two-person zero-sum game –
Dominance property- To do problem!
                  Player B
                  B1   B2   B3   B4
Player A    A1     3    5    4    2
            A2     5    6    2    4
            A3     2    1    4    0
            A4     3    3    5    2
Solution
                  Player B
                  B3   B4
Player A    A2     2    4
            A4     5    2
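The reduction shown in the solution can be reproduced by repeatedly deleting dominated strategies. This is a sketch under the assumption that the entries are payoffs to Player A (the maximizer) and that weak dominance is enough to delete a strategy; `eliminate_dominated` is a name chosen here.

```python
from typing import List, Tuple


def eliminate_dominated(matrix: List[List[int]], row_names: List[str],
                        col_names: List[str]) -> Tuple[List[str], List[str]]:
    """Iteratively remove dominated rows and columns of a zero-sum payoff matrix."""
    rows = list(range(len(matrix)))
    cols = list(range(len(matrix[0])))
    changed = True
    while changed:
        changed = False
        # A row is dominated if another row is at least as good in every column.
        for r in rows:
            if any(s != r and all(matrix[s][c] >= matrix[r][c] for c in cols)
                   for s in rows):
                rows.remove(r)
                changed = True
                break
        if changed:
            continue
        # A column is dominated if another column costs B no more in every row.
        for c in cols:
            if any(k != c and all(matrix[r][k] <= matrix[r][c] for r in rows)
                   for k in cols):
                cols.remove(c)
                changed = True
                break
    return [row_names[r] for r in rows], [col_names[c] for c in cols]


payoff = [[3, 5, 4, 2],
          [5, 6, 2, 4],
          [2, 1, 4, 0],
          [3, 3, 5, 2]]
print(eliminate_dominated(payoff, ["A1", "A2", "A3", "A4"], ["B1", "B2", "B3", "B4"]))
# (['A2', 'A4'], ['B3', 'B4'])
```

The surviving 2 x 2 game has no saddle point, so it would then be solved with mixed strategies.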
The Prisoner’s Dilemma
• The prisoner’s dilemma is a universal concept. Theorists now realize that prisoner’s dilemmas occur in biology, psychology, sociology, economics, and law.
• The prisoner’s dilemma is apt to turn up anywhere a conflict of interests exists -- and the conflict need not be among sentient beings.
• Study of the prisoner’s dilemma has great power for explaining why animal and human
societies are organized as they are. It is one of the great ideas of the twentieth century, simple
enough for anyone to grasp and of fundamental importance (...).
• The prisoner’s dilemma has become one of the premier philosophical and scientific issues of
our time. It is tied to our very survival (W. Poundstone,1992, p. 9).
Prisoner’s Dilemma
– If only one prisoner turns state’s evidence and testifies against his partner, he will go free while the other will receive a 3-year sentence
– Each prisoner knows the other has the same offer
– The catch is that if both turn state’s evidence, they each receive a 2-year sentence
– If both refuse, each will be imprisoned for 1 year on the lesser charge
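The offer above can be written as a payoff table and checked for a dominant strategy. This sketch assumes each prisoner's payoff is the negative of the years served; the names `PAYOFF`, `payoff_for` and `dominant_strategy` are illustrative.

```python
from typing import Dict, Optional, Tuple

STRATEGIES = ("confess", "refuse")

# (choice of prisoner 0, choice of prisoner 1) -> (payoff 0, payoff 1),
# where payoff = minus the years in prison (assumed encoding).
PAYOFF: Dict[Tuple[str, str], Tuple[int, int]] = {
    ("confess", "confess"): (-2, -2),
    ("confess", "refuse"): (0, -3),
    ("refuse", "confess"): (-3, 0),
    ("refuse", "refuse"): (-1, -1),
}


def payoff_for(player: int, own: str, other: str) -> int:
    """Payoff of `player` when he plays `own` and his partner plays `other`."""
    profile = (own, other) if player == 0 else (other, own)
    return PAYOFF[profile][player]


def dominant_strategy(player: int) -> Optional[str]:
    """A strategy that is strictly better whatever the partner does, if any."""
    for s in STRATEGIES:
        alternatives = [t for t in STRATEGIES if t != s]
        if all(payoff_for(player, s, o) > payoff_for(player, t, o)
               for t in alternatives for o in STRATEGIES):
            return s
    return None


print(dominant_strategy(0), dominant_strategy(1))  # confess confess
print(PAYOFF[("confess", "confess")], PAYOFF[("refuse", "refuse")])  # (-2, -2) (-1, -1)
```

Confessing is better for each prisoner whatever the other does, yet both would serve less time if both refused.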
A game is described by
Game Theory Definition
Payoff matrix
Normal- or strategic form
Player B
Top 3, 0 0, -4
Bottom 2, 4 -1, 3
Game Playing
Nash equilibrium
• If Player A’s choice is optimal given Player B’s choice, and B’s choice is optimal given A’s choice, the pair of strategies is a Nash equilibrium.
• Once the other player’s choice is revealed, neither player would like to change her behavior.
• If a set of strategies are best responses to each other, the strategy set is a Nash equilibrium.
Payoff matrix
Normal- or strategic form
Player B
Top 1, 1 2, 3*
Bottom 2, 3* 1, 2
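The best-response definition can be checked cell by cell. This sketch assumes the first number in each cell is the row player's payoff and the second the column player's (Player B), and that the columns are in the order shown; `pure_nash_equilibria` is a name chosen here.

```python
from typing import List, Tuple

Cell = Tuple[int, int]  # (row player's payoff, column player's payoff)


def pure_nash_equilibria(game: List[List[Cell]]) -> List[Tuple[int, int]]:
    """Return every (row, column) where neither player gains by deviating alone."""
    n_rows, n_cols = len(game), len(game[0])
    equilibria = []
    for i in range(n_rows):
        for j in range(n_cols):
            row_payoff, col_payoff = game[i][j]
            row_is_best = all(game[k][j][0] <= row_payoff for k in range(n_rows))
            col_is_best = all(game[i][k][1] <= col_payoff for k in range(n_cols))
            if row_is_best and col_is_best:
                equilibria.append((i, j))
    return equilibria


game = [[(1, 1), (2, 3)],   # Top
        [(2, 3), (1, 2)]]   # Bottom
print(pure_nash_equilibria(game))  # [(0, 1), (1, 0)]
```

The two equilibria found are exactly the cells marked with an asterisk in the table.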
Solution
Problems
Payoff matrix
Normal- or strategic form
Player B
Top       1, -1     -1, 1
Bottom   -1, 1       1, -1
Nash equilibrium in mixed
strategies
The prisoner’s dilemma
Prisoner’s dilemma
Normal- or strategic form
Prisoner B
Confess -2, -2 0, -4
Solution
Confess is a dominant strategy for both prisoners. If both denied, they would both be better off. This is the dilemma.
Nash Equilibrium – To do Problems!
               COKE
               L        R
PEPSI    U     6,8*     4,7
         D     7,6      3,7

               B
               L        R
A        U     7,6*     5,5
         D     4,5      6,4
GAME PLAYING & MECHANISM DESIGN
Example 1: Mechanism Design – Fair Division of a Cake
[Figure: the mother acts as social planner and mechanism designer; Kid 1 and Kid 2 are rational and intelligent players]
GAME PLAYING & MECHANISM DESIGN
[Figure: Tenali Rama (Birbal) acts as mechanism designer, with the baby between Mother 1 and Mother 2, each a rational and intelligent player]
Simple reflex agents
A Simple Reflex Agent in Nature
Percepts: (size, motion)
RULES:
(1) If small moving object, then activate SNAP
(2) If large moving object, then activate AVOID and inhibit SNAP
ELSE (not moving) then NOOP (needed for completeness)
Action: SNAP or AVOID or NOOP
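The rules above map the current percept directly to an action, with no memory of earlier percepts. A minimal sketch, assuming the percept is given as a size label and a motion flag; the function name and the string encoding are illustrative.

```python
def reflex_agent(size: str, moving: bool) -> str:
    """Condition-action rules of the simple reflex agent: percept in, action out."""
    if moving and size == "small":
        return "SNAP"    # rule (1)
    if moving and size == "large":
        return "AVOID"   # rule (2): AVOID wins, SNAP is inhibited
    return "NOOP"        # ELSE rule, so that every percept has an action


for percept in [("small", True), ("large", True), ("small", False)]:
    print(percept, "->", reflex_agent(*percept))
```

The final `return` plays the role of the ELSE rule: without it, a motionless object would leave the agent without a defined action.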
Model-based Reflex Agents
Example Table Agent
With Internal State
IF                                                                   THEN
Saw an object ahead, and turned right, and it’s now clear ahead      Go straight
Saw an object ahead, turned right, and object ahead again            Halt
See no objects ahead                                                 Go straight
Goal-based agents
• Conclusion
● Goal-based agents are less efficient but more flexible
● The same agent can be given different goals to carry out different tasks
● Search and planning are two sub-fields of AI devoted to finding the action sequences that achieve the agent's goal
Utility-based agents
Learning Agents
Constraint Satisfaction Problem
Solving Constraint Satisfaction Problem
CSP
► Types of Constraints
► Unary Constraints - Single variable
► Binary Constraints - Two Variables
► Higher Order Constraints - More than two variables
► A CSP can be represented as a search problem
► The initial state is the empty assignment, and the successor function assigns a non-conflicting value to an unassigned variable
► The goal test checks whether the current assignment is complete, and the path cost is the cost of the path to reach the goal state
► A CSP solution is a final, complete assignment that violates no constraint
Cryptarithmetic puzzles
► Cryptarithmetic puzzles can also be represented as CSPs
► Example: MIKE + JACK = JOHN
► Replace every letter in the puzzle with a single digit (no digit may be used for two different letters)
► The domain is { 0, 1, ... , 9 }
Cryptarithmetic puzzles
► M * 1000 + I * 100 + K * 10 + E + J * 1000 + A * 100 + C * 10 + K
= J * 1000 + O * 100 + H * 10 + N
► A constraint domain is represented by a five-tuple,
• D = {var, f, O, dv, rg}
► var stands for the set of variables, f is the set of functions, O stands for the set of legitimate operators to be used, dv is the domain variable and rg is the range of the functions in the constraint
► A constraint without a conjunction is referred to as a primitive constraint (e.g., x < 9)
► A constraint with a conjunction is called a non-primitive constraint or a generic constraint (e.g., x < 9 and x > 2)
Cryptarithmetic puzzles – Solved Example
    T O
+   G O
-------
  O U T

Var   Value
T     2
O     1
G     8
U     0

    2 1
+   8 1
-------
  1 0 2

2 + G = U + 10, so G = 8 or 9
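The solved example can be verified by brute force. This is a sketch that simply tries every assignment of distinct digits to the letters, assuming (as is usual for these puzzles) that no number starts with 0; it is not the constraint-based reasoning of the slide, and the function names are chosen here.

```python
from itertools import permutations
from typing import Dict, List


def word_value(word: str, assign: Dict[str, int]) -> int:
    """Read a word as a decimal number under the given letter-to-digit mapping."""
    number = 0
    for letter in word:
        number = number * 10 + assign[letter]
    return number


def solve_cryptarithm(addends: List[str], total: str) -> List[Dict[str, int]]:
    """Return every assignment of distinct digits that makes the sum correct."""
    words = addends + [total]
    letters = sorted(set("".join(words)))
    if len(letters) > 10:
        raise ValueError("more than ten distinct letters")
    leading = {w[0] for w in words}  # assumed rule: no leading zeros
    solutions = []
    for digits in permutations(range(10), len(letters)):
        assign = dict(zip(letters, digits))
        if any(assign[ch] == 0 for ch in leading):
            continue
        if sum(word_value(w, assign) for w in addends) == word_value(total, assign):
            solutions.append(assign)
    return solutions


print(solve_cryptarithm(["TO", "GO"], "OUT"))
# [{'G': 8, 'O': 1, 'T': 2, 'U': 0}]
```

The same function accepts larger puzzles, but the number of permutations grows quickly with the number of distinct letters, which is why constraint reasoning on the carries is preferred in practice.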
Cryptarithmetic puzzles
• SEND + MORE = MONEY
   c4 c3 c2 c1
    S  E  N  D
+   M  O  R  E
---------------
 M  O  N  E  Y

    9 5 6 7
+   1 0 8 5
-----------
  1 0 6 5 2
CSP- Room Coloring Problem
-CSP as a Search Problem
► Let K stand for Kitchen, D for Dining Room, H for Hall, B2 and B3 for bedrooms 2 and 3, MB1 for the master bedroom, SR for the Store Room, GR for the Guest Room and Lib for the Library
► Constraints
► Not all bedrooms may be colored red; only one can be
► No two adjacent rooms can have the same color
► The colors available are red, blue, green and violet
► Kitchen should not be colored green
► Recommended to color the kitchen blue
► Dining room should not be colored violet
Room Coloring Problem – Representation as a
Search Tree
Backtracking Search for CSP
Example: Map-Coloring
Backtracking example
Algorithm for Backtracking
Pick initial state
R = set of all possible states
Select state with var assignment
Add to search space
Check for constraints
If Satisfied
    Continue
Else
    Go to last Decision Point (DP)
    Prune the search sub-space from DP
    Continue with next decision option
If state = Goal State
    Return Solution
Else
    Continue
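The procedure above can be sketched as a recursive search. The example below applies it to map coloring of Australia with three colors; the adjacency list and the fixed variable order are assumptions made for this illustration, and `backtracking_search` is a name chosen here.

```python
from typing import Dict, List, Optional

# Assumed adjacency of the Australian states and territories.
NEIGHBORS: Dict[str, List[str]] = {
    "WA": ["NT", "SA"], "NT": ["WA", "SA", "Q"], "SA": ["WA", "NT", "Q", "NSW", "V"],
    "Q": ["NT", "SA", "NSW"], "NSW": ["Q", "SA", "V"], "V": ["SA", "NSW"], "T": [],
}


def backtracking_search(colors: List[str],
                        assignment: Optional[Dict[str, str]] = None
                        ) -> Optional[Dict[str, str]]:
    """Assign colors one variable at a time; undo the last choice on a dead end."""
    if assignment is None:
        assignment = {}
    if len(assignment) == len(NEIGHBORS):        # goal test: complete assignment
        return assignment
    var = next(v for v in NEIGHBORS if v not in assignment)
    for color in colors:                         # decision point
        if all(assignment.get(n) != color for n in NEIGHBORS[var]):
            assignment[var] = color              # constraints satisfied: continue
            solution = backtracking_search(colors, assignment)
            if solution is not None:
                return solution
            del assignment[var]                  # go back to the decision point
    return None                                  # every option failed: backtrack


print(backtracking_search(["red", "green", "blue"]))
print(backtracking_search(["red", "green"]))  # None: two colors are not enough
```

Returning `None` from a call corresponds to pruning the search sub-space below the last decision point and trying its next option.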
CSP-Backtracking, Role of heuristic
► Backtracking returns to the previous decision-making node to eliminate the part of the search space that is invalid with respect to the constraints
► Heuristics play a very important role here
► If we are in a position to determine which variable should be assigned next, backtracking can be improved
► Heuristics help in deciding the initial state as well as the subsequently selected states
► Selecting the variable with the minimum number of possible values can help simplify the search
► This is called the Minimum Remaining Values (MRV) heuristic or the Most Constrained Variable heuristic
► It restricts the search the most, cutting off branches that would keep failing on the same variable (which would make the backtracking ineffective)
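A minimal sketch of MRV variable selection, assuming the remaining legal values of each variable are kept in a dictionary of sets. The domains below are hypothetical (a map-coloring state after two assignments), and `select_mrv_variable` is a name chosen here.

```python
from typing import Dict, Set


def select_mrv_variable(domains: Dict[str, Set[str]], assignment: Dict[str, str]) -> str:
    """Pick the unassigned variable with the fewest remaining legal values."""
    unassigned = [v for v in domains if v not in assignment]
    return min(unassigned, key=lambda v: len(domains[v]))


# Hypothetical state: WA=red and NT=green are already assigned.
assignment = {"WA": "red", "NT": "green"}
domains = {
    "WA": {"red"}, "NT": {"green"},
    "SA": {"blue"},                        # only one value left
    "Q": {"red", "blue"},
    "NSW": {"red", "green", "blue"},
    "V": {"red", "green", "blue"},
    "T": {"red", "green", "blue"},
}
print(select_mrv_variable(domains, assignment))  # SA
```

Choosing SA first means that, if its single remaining value fails, the dead end is discovered immediately.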
Heuristic
Heuristic – Most Constraining Variable
Heuristic- Minimum Remaining Values
Forward Checking
► To understand forward checking, consider the 4-Queens problem
► If placing queen x on the board leaves no valid position for queen x+1, forward checking ensures that queen x is not placed at the selected position and a new position is sought
► In the left sub-tree, Q1 and Q2 are placed in rows 1 and 2, so the search is halted, since no positions are left for Q3 and Q4
► Forward checking keeps track of the next moves that are available for the unassigned variables
► The search is terminated when there is no legal move available for the unassigned variables
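A minimal sketch of forward checking on N-Queens, assuming queen c is placed in column c and the search chooses its row (rows and columns numbered from 0). The names `prune` and `solve` are illustrative.

```python
from typing import Dict, List, Optional, Set


def prune(domains: Dict[int, Set[int]], col: int, row: int) -> Optional[Dict[int, Set[int]]]:
    """Place queen `col` on `row` and delete attacked rows from later queens.

    Returns the reduced domains, or None if some later queen has no row left.
    """
    pruned: Dict[int, Set[int]] = {}
    for other, rows in domains.items():
        if other == col:
            pruned[other] = {row}
        elif other < col:
            pruned[other] = rows                  # already placed
        else:
            dist = other - col
            remaining = rows - {row, row - dist, row + dist}
            if not remaining:
                return None                       # no legal move left: stop here
            pruned[other] = remaining
    return pruned


def solve(n: int, col: int = 0,
          domains: Optional[Dict[int, Set[int]]] = None) -> Optional[List[int]]:
    """Return the row of each queen, or None if no placement exists."""
    if domains is None:
        domains = {c: set(range(n)) for c in range(n)}
    if col == n:
        return [next(iter(domains[c])) for c in range(n)]
    for row in sorted(domains[col]):
        pruned = prune(domains, col, row)
        if pruned is None:
            continue
        solution = solve(n, col + 1, pruned)
        if solution is not None:
            return solution
    return None


print(solve(4))  # [1, 3, 0, 2]
```

With the first queen in row 0, every choice for the second queen empties the domain of a later queen, so that whole sub-tree is abandoned without placing the remaining queens.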
Forward Checking – Room Coloring Problem
► For the room coloring problem, taking all the constraints into account, the assignment can proceed as follows:
► At first, B2 is assigned Red (R). Accordingly, R is deleted from the adjacent nodes
► Kitchen is assigned Blue (B). So, B is deleted from the adjacent nodes
► Furthermore, once MB1 is assigned green, no color is left for D.
Constraint Propagation
► Step 2 shows the consistency propagated from D to B2
► Since D can have only the value G and B2 is adjacent to it, the arc is drawn
► It is mapped as D → B2 or, mathematically,
• A → B is consistent ↔ ∀ legal value a ∈ A, ∃ non-conflicting value b ∈ B
► Failure detection can take place at an early stage
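The consistency condition above can be sketched for the "different colors" constraint between two adjacent rooms. The domains used below are hypothetical, and `is_arc_consistent` and `revise` are names chosen here.

```python
from typing import Dict, Set


def is_arc_consistent(domains: Dict[str, Set[str]], a: str, b: str) -> bool:
    """A -> B is consistent if every value of A has some different value in B."""
    return all(any(x != y for y in domains[b]) for x in domains[a])


def revise(domains: Dict[str, Set[str]], a: str, b: str) -> bool:
    """Delete the values of A that have no non-conflicting value in B.

    Returns True if the domain of A changed.
    """
    unsupported = {x for x in domains[a] if all(x == y for y in domains[b])}
    domains[a] -= unsupported
    return bool(unsupported)


# Hypothetical domains: D can only be green, B2 may still be red or green.
domains = {"D": {"G"}, "B2": {"R", "G"}}
print(is_arc_consistent(domains, "D", "B2"))  # True: G in D is supported by R in B2
print(is_arc_consistent(domains, "B2", "D"))  # False: G in B2 conflicts with D
print(revise(domains, "B2", "D"), domains["B2"])  # True {'R'}
```

If `revise` ever empties a domain, the current partial assignment cannot be extended, which is the early failure detection mentioned above.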
Algorithm for Arc Assignment
Forward checking
• Idea:
• Keep track of remaining legal values for unassigned variables
• Terminate search when any variable has no legal values
Variable   Remaining values
WA         R G B
NT         G B
SA         B
Q          R
NSW        G
V          R
T          R G B
Constraint propagation
Arc consistency
Intelligent Backtracking
► Chronological backtracking: The BACKTRACKING-SEARCH policy in which, when a branch of the search fails, we back up to the preceding variable and try a different value for it. (The most recent decision point is revisited.)
• e.g.: Suppose we have generated the partial assignment {Q=red, NSW=green, V=blue, T=red}.
• When we try the next variable SA, we see that every value violates a constraint.
• We back up to T and try a new color, but this cannot resolve the problem.
► Intelligent backtracking: Backtrack to a variable that was responsible for making one of the possible values of the next variable (e.g. SA) impossible.
Conflict set for a variable: A set of assignments that are in conflict with some value for that variable.
(e.g. The set {Q=red, NSW=green, V=blue} is the conflict set for SA.)
Backjumping method: Backtracks to the most recent assignment in the conflict set.
(e.g. backjumping would jump over T and try a new value for V.)
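The example can be traced in a few lines. This sketch makes a simplification: in map coloring every assigned neighbor rules out its own color, so the conflict set of a variable is taken to be its already assigned neighbors. The adjacency list is an assumption, and the function names are chosen here.

```python
from typing import Dict, List, Optional

# Assumed neighbors of SA on the map of Australia.
NEIGHBORS_OF: Dict[str, List[str]] = {"SA": ["WA", "NT", "Q", "NSW", "V"]}


def conflict_set(var: str, assignment: Dict[str, str]) -> List[str]:
    """Assigned neighbors of `var`, in the order in which they were assigned."""
    return [v for v in assignment if v in NEIGHBORS_OF[var]]


def backjump_target(var: str, assignment: Dict[str, str]) -> Optional[str]:
    """Most recently assigned variable in the conflict set, if there is one."""
    conflicts = conflict_set(var, assignment)
    return conflicts[-1] if conflicts else None


# Partial assignment from the example, in assignment order.
assignment = {"Q": "red", "NSW": "green", "V": "blue", "T": "red"}
print(conflict_set("SA", assignment))     # ['Q', 'NSW', 'V']
print(list(assignment)[-1])               # T: where chronological backtracking goes
print(backjump_target("SA", assignment))  # V: where backjumping goes
```

T is not adjacent to SA, so changing its color cannot help, and backjumping skips it.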