
Local Search Algorithms

 HILL CLIMBING
 SIMULATED ANNEALING SEARCH
 LOCAL BEAM SEARCH
 GENETIC ALGORITHM

CHAPTER FOUR FROM THE BOOK


Local Search Algorithms

• When a goal is found, the path to that goal also constitutes a solution to the problem.
• In many optimization problems, the path to the goal is
irrelevant; the goal state itself is the solution.
• In the 8-queens problem, what matters is the final configuration of queens, not the order in which they are added.
• The same general property holds for many important
applications such as integrated-circuit design, factory-floor
layout, network optimization and so on.
• State space = set of "complete" configurations
• Find configuration satisfying constraints, e.g., n-queens
• In such cases, we can use local search algorithms
• Keep a single "current" state, try to improve it.
Local search algorithms

 Local search algorithms operate using a single current node (rather than multiple paths) and generally move only to neighbors of that node.
 Paths followed by the search are not retained.
 They are not systematic, but they have two key advantages:
 (1) they use very little memory—usually a constant
amount; and
 (2) they can often find reasonable solutions in large
or infinite (continuous) state spaces for which
systematic algorithms are unsuitable
 To understand local search, we find it useful to
consider the state-space landscape. A landscape has
both “location” (defined by the state) and “elevation”
(defined by the value of the heuristic cost function or
objective function).
 If elevation corresponds to cost, then the aim is to
find the lowest valley—a global minimum;
 if elevation corresponds to an objective function,
then the aim is to find the highest peak—a global
maximum.
 A complete local search algorithm always finds a
goal if one exists; an optimal algorithm always finds
a global minimum/maximum.
State Space Landscape
Hill climbing

 Hill climbing is sometimes called greedy local search because it grabs a good neighbor state without thinking ahead about where to go next.
 Greedy algorithms often perform quite well.
 Hill climbing often makes rapid progress toward a solution because it is usually quite easy to improve a bad state.
 For example, from a state with h = 17 it takes just five steps to reach a state with h = 1, which is very nearly a solution.
 Unfortunately, hill climbing often gets stuck.
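
To make the idea concrete, here is a minimal Python sketch of steepest-ascent hill climbing. The neighbors() and value() helpers are assumed, problem-specific functions (e.g., for 8-queens, all single-queen moves and the negated number of attacking pairs); they are hypothetical names, not the book's pseudocode:

```python
# Minimal steepest-ascent hill-climbing sketch. `neighbors(state)` and
# `value(state)` are assumed problem-specific helpers (hypothetical names).

def hill_climbing(state, neighbors, value):
    """Move to the best neighbor until no neighbor improves on the current state."""
    while True:
        best = max(neighbors(state), key=value, default=None)
        if best is None or value(best) <= value(state):
            return state  # stuck: local maximum, plateau, or the goal itself
        state = best
```

Because the loop keeps only the current state, memory use is constant, matching advantage (1) above.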
Problems of Hill climbing

 A local maximum is a peak that is higher than each of its neighboring states but lower than the global maximum.
 Hill-climbing algorithms that reach the vicinity of a local
maximum will be drawn upward toward the peak but will
then be stuck with nowhere else to go.
 E.g., every move of a single queen makes the situation worse.
 Ridges result in a sequence of local maxima that is very
difficult for greedy algorithms to navigate.
 Plateaux: a plateau is a flat area of the state-space
landscape. It can be a flat local maximum, from which no
uphill exit exists, or a shoulder, from which progress is
possible.
Sideways move – on plateau

 Might it not be a good idea to keep going, allowing a sideways move in the hope that the plateau is really a shoulder?
 The answer is usually yes, but we must take care.
 If we always allow sideways moves when there are no
uphill moves, an infinite loop will occur whenever
the algorithm reaches a flat local maximum that is
not a shoulder.
 One common solution is to put a limit on the number
of consecutive sideways moves allowed.
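
As a sketch, that limit is a small change to the hill-climbing loop above; the default cap of 100 matches the book's 8-queens experiment, and neighbors()/value() are again assumed helpers:

```python
# Hill climbing that allows a bounded number of consecutive sideways moves.
# `neighbors` and `value` are assumed problem-specific helpers (hypothetical).

def hill_climbing_sideways(state, neighbors, value, max_sideways=100):
    budget = max_sideways
    while True:
        best = max(neighbors(state), key=value, default=None)
        if best is None or value(best) < value(state):
            return state                  # strict local maximum: give up
        if value(best) == value(state):
            if budget == 0:
                return state              # plateau: sideways budget exhausted
            budget -= 1                   # take a sideways move
        else:
            budget = max_sideways         # an uphill move resets the budget
        state = best
```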
Deterministic games in practice

 Checkers: Chinook ended the 40-year reign of human world champion Marion Tinsley in 1994. It used a precomputed endgame database defining perfect play for all positions involving 8 or fewer pieces on the board, a total of 444 billion positions.

 Chess: Deep Blue defeated human world champion Garry Kasparov in a six-game match in 1997. Deep Blue searched 200 million positions per second, used a very sophisticated evaluation function, and used undisclosed methods for extending some lines of search up to 40 ply.
 Othello: human champions refuse to compete against computers, who
are too good.
 Go: human champions refuse to compete against computers, who are
too bad. In go, b > 300, so most programs use pattern knowledge bases
to suggest plausible moves.
Adversarial Search

CHAPTER FIVE IN THE BOOK


Games

 Competitive environments, in which the agents' goals are in conflict, give rise to adversarial search problems, often known as games.
 Mathematical game theory, a branch of economics,
views any multiagent environment as a game,
provided that the impact of each agent on the others
is “significant,” regardless of whether the agents are
cooperative or competitive.
 Games are of a rather specialized kind: what game theorists call deterministic, turn-taking, two-player, zero-sum games of perfect information.
 Games, unlike most of the toy problems, are
interesting because they are too hard to solve.
 E.g., chess has an average branching factor of about 35, and games often go to 50 moves by each player, so the search tree has about 35^100, or 10^154, nodes.
 Games, like the real world, therefore require the
ability to make some decision even when calculating
the optimal decision is infeasible.
 Game-playing research has spawned a number of
interesting ideas on how to make the best possible
use of time.
 We look at techniques for choosing a good move
when time is limited.
 Pruning allows us to ignore portions of the search
tree that make no difference to the final choice, and
heuristic evaluation functions allow us to
approximate the true utility of a state without doing
a complete search.
Two player games

 Consider games with two players, whom we call MAX and MIN.
 MAX moves first, and then they take turns moving
until the game is over.
 At the end of the game, points are awarded to the
winning player and penalties are given to the loser.
 A game can be formally defined as a kind of search
problem with the following elements:
Elements of the game

 S0: The initial state, which specifies how the game is set up at the start.
 PLAYER(s): Defines which player has the move in a state.
 ACTIONS(s): Returns the set of legal moves in a state.
 RESULT(s, a): The transition model, which defines the result of a move.
 TERMINAL-TEST(s): A terminal test, which is true when the game is over and false otherwise.
 States where the game has ended are called terminal states.
 UTILITY(s, p): A utility function (also called an objective function or payoff function) defines the final numeric value for a game that ends in terminal state s for a player p. In chess, the outcome is a win, loss, or draw, with values +1, 0, or ½.
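
These elements translate directly into an abstract interface. The following Python sketch is hypothetical (the names simply mirror the slide); a concrete game such as tic-tac-toe would subclass it and fill in the rules:

```python
# Hypothetical interface mirroring the slide's game elements; any concrete
# game (tic-tac-toe, chess, ...) would subclass this and implement the rules.
from abc import ABC, abstractmethod

class Game(ABC):
    @abstractmethod
    def initial_state(self):        # S0: how the game is set up at the start
        ...

    @abstractmethod
    def player(self, s):            # PLAYER(s): which player has the move in s
        ...

    @abstractmethod
    def actions(self, s):           # ACTIONS(s): the set of legal moves in s
        ...

    @abstractmethod
    def result(self, s, a):         # RESULT(s, a): transition model for move a
        ...

    @abstractmethod
    def terminal_test(self, s):     # TERMINAL-TEST(s): True iff the game is over
        ...

    @abstractmethod
    def utility(self, s, p):        # UTILITY(s, p): e.g. 1, 0, or 0.5 in chess
        ...
```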
Zero Sum Game

 A zero-sum game is (confusingly) defined as one where the total payoff to all players is the same for every instance of the game.
 Chess is zero-sum because every game has a payoff of either 0 + 1, 1 + 0, or ½ + ½.
 The initial state, ACTIONS function, and RESULT
function define the game tree for the game—a tree
where the nodes are game states and the edges are
moves.
Sample game – tic tac toe

 Play a sample game of Tic tac toe.

MAX-MIN

 Part of the game tree for tic-tac-toe.


 From the initial state, MAX has nine possible moves.
 Play alternates between MAX’s placing an X and
MIN’s placing an O until we reach leaf nodes
corresponding to terminal states such that one player
has three in a row or all the squares are filled.
 The number on each leaf node indicates the utility
value of the terminal state from the point of view of
MAX; high values are assumed to be good for MAX
and bad for MIN.
 For tic-tac-toe the game tree is relatively small—
fewer than 9! = 362,880 terminal nodes.
 But for chess there are over 10^40 nodes, so the game tree is best thought of as a theoretical construct that we cannot realize in the physical world.
 In a normal search problem, the optimal solution
would be a sequence of actions leading to a goal
state—a terminal state that is a win.
 In adversarial search, MIN has something to say
about it. MAX therefore must find a contingent
strategy, which specifies MAX’s move in the initial
state, then MAX’s moves in the states resulting from
every possible response by MIN, then MAX’s moves
in the states resulting from every possible response
by MIN to those moves, and so on.
 Roughly speaking, an optimal strategy leads to
outcomes at least as good as any other strategy when
one is playing an infallible opponent.
 Even a simple game like tic-tac-toe is too complex for us to draw the entire game tree, so we switch to a trivial game.
Sample Game

 The possible moves for MAX at the root node are labeled a1, a2, and a3. The possible replies to a1 for MIN are b1, b2, b3, and so on.
 This particular game ends after one move each by
MAX and MIN.
 In game parlance, we say that this tree is one move
deep, consisting of two half-moves, each of which is
called a ply.
 The utilities of the terminal states in this game range
from 2 to 14.
MINIMAX(n)

 Given a game tree, the optimal strategy can be determined from the minimax value of each node, which we write as MINIMAX(n).
 The minimax value of a node is the utility (for MAX)
of being in the corresponding state, assuming that
both players play optimally from there to the end of
the game.
 Obviously, the minimax value of a terminal state is
just its utility. Furthermore, given a choice, MAX
prefers to move to a state of maximum value,
whereas MIN prefers a state of minimum value.
Minimax function
 The terminal nodes on the bottom level get their
utility values from the game’s UTILITY function.
 The first MIN node, labeled B, has three successor
states with values 3, 12, and 8, so its minimax value
is 3.
 Similarly, the other two MIN nodes have minimax
value 2.
 The root node is a MAX node; its successor states
have minimax values 3, 2, and 2; so it has a minimax
value of 3.
 The minimax decision at the root: action a1 is the
optimal choice for MAX because it leads to the state
with the highest minimax value.
 This definition of optimal play for MAX assumes that
MIN also plays optimally—it maximizes the worst-
case outcome for MAX. What if MIN does not play
optimally? Then it is easy to show that MAX will do
even better.
 The minimax algorithm computes the minimax
decision from the current state.
 It uses a simple recursive computation of the
minimax values of each successor state, directly
implementing the defining equations.
 The recursion proceeds all the way down to the
leaves of the tree, and then the minimax values are
backed up through the tree as the recursion unwinds.
 For example, the algorithm first recurses down to the
three bottom left nodes and uses the UTILITY
function on them to discover that their values are 3,
12, and 8, respectively. Then it takes the minimum of
these values, 3, and returns it as the backed up value
of node B. A similar process gives the backed-up
values of 2 for C and 2 for D.
 Finally, we take the maximum of 3, 2, and 2 to get
the backed-up value of 3 for the root node.
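
The recursion just described is only a few lines of code. Below is a Python sketch over the two-ply example, with each MIN node written as a list of terminal utilities; B's leaves (3, 12, 8) are from the slides, while the individual leaves shown for C and D are illustrative values consistent with their backed-up value of 2:

```python
# Minimax over a nested-list game tree: inner lists are nodes, numbers are
# terminal utilities (for MAX). Levels alternate between MAX and MIN.

def minimax_value(node, maximizing):
    if isinstance(node, (int, float)):          # terminal state: its utility
        return node
    values = [minimax_value(child, not maximizing) for child in node]
    return max(values) if maximizing else min(values)

tree = [[3, 12, 8],   # B: backed-up value min(3, 12, 8) = 3
        [2, 4, 6],    # C: backed-up value 2 (illustrative leaves)
        [14, 5, 2]]   # D: backed-up value 2 (illustrative leaves)

best = max(range(len(tree)),
           key=lambda i: minimax_value(tree[i], maximizing=False))
print(minimax_value(tree, maximizing=True), f"a{best + 1}")   # -> 3 a1
```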
Time and Space Complexity

 The minimax algorithm performs a complete depth-first exploration of the game tree.
 If the maximum depth of the tree is m and there are
b legal moves at each point,
 then the time complexity of the minimax algorithm is O(b^m).
 The space complexity is O(bm) for an algorithm that generates all actions at once, or O(m) for an algorithm that generates actions one at a time.
Properties of minimax

 Complete? Yes (if tree is finite)
 Optimal? Yes (against an optimal opponent)
 Time complexity? O(b^m)
 Space complexity? O(bm) (depth-first exploration)
 For chess, b ≈ 35, m ≈ 100 for "reasonable" games
  exact solution completely infeasible
α-β pruning

 The problem with minimax search is that the number of game states it has to examine is exponential in the depth of the tree.
 We can't eliminate the exponent, but we can effectively cut it in half.
 The trick is that it is possible to compute the correct minimax decision without looking at every node in the game tree.
 We borrow the idea of pruning to eliminate large parts of the tree from consideration.
 The particular technique is called alpha–beta pruning.
 When applied to a standard minimax tree, it returns the same
move as minimax would, but prunes away branches that
cannot possibly influence the final decision.
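
A Python sketch of alpha-beta, in the same nested-list style as the minimax sketch above (the tree representation is an assumption for illustration, not the book's pseudocode):

```python
# Alpha-beta pruning sketch. alpha = best value MAX is guaranteed so far;
# beta = best value MIN is guaranteed so far. Prune when alpha >= beta.

def alphabeta(node, maximizing, alpha=float("-inf"), beta=float("inf")):
    if isinstance(node, (int, float)):             # terminal: return utility
        return node
    if maximizing:
        value = float("-inf")
        for child in node:
            value = max(value, alphabeta(child, False, alpha, beta))
            alpha = max(alpha, value)
            if alpha >= beta:                      # MIN would never allow this
                break                              # prune remaining children
        return value
    value = float("inf")
    for child in node:
        value = min(value, alphabeta(child, True, alpha, beta))
        beta = min(beta, value)
        if alpha >= beta:                          # MAX already has better
            break                                  # prune remaining children
    return value

tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
print(alphabeta(tree, maximizing=True))            # -> 3, same as minimax
```

On this tree the later leaves of node C are never examined, which is exactly the pruning illustrated in the example slides that follow.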
α-β pruning example
 Another way to look at this is as a simplification of the formula for MINIMAX. Let the two unevaluated successors of node C have values x and y.
 Then the value of the root node is given by

MINIMAX(root) = max( min(3, 12, 8), min(2, x, y), min(14, 5, 2) )
             = max( 3, min(2, x, y), 2 )
             = max( 3, z, 2 )   where z = min(2, x, y) ≤ 2
             = 3

 In other words, the value of the root, and hence the minimax decision, is independent of the values of the pruned leaves x and y.
Properties of α-β

 Pruning does not affect the final result.
 Good move ordering improves the effectiveness of pruning.
 With "perfect ordering," time complexity = O(b^(m/2))
  doubles the depth of search
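
To see why this doubles the depth: O(b^(m/2)) = O((√b)^m), so perfect ordering gives an effective branching factor of √b. For chess that is √35 ≈ 6, meaning alpha-beta can look ahead roughly twice as far as plain minimax in the same amount of time.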
Why is it called α-β?

 α is the value of the best (i.e., highest-value) choice found so far at any choice point along the path for MAX.
 If v is worse than α, MAX will avoid it
  prune that branch
 Define β similarly for MIN.
The α-β algorithm
