Python for AI
https://aipython.org https://artint.info
©David L Poole and Alan K Mackworth 2017-2024.
All code is licensed under a Creative Commons Attribution-NonCommercial-
ShareAlike 4.0 International License. See: https://creativecommons.org/licenses/
by-nc-sa/4.0/deed.en
This document and all the code can be downloaded from
https://artint.info/AIPython/ or from https://aipython.org
The authors and publisher of this book have used their best efforts in preparing this book. These efforts include the development, research and testing of the programs to determine their effectiveness. The authors and publisher make no warranty of any kind, expressed or implied, with regard to these programs or the documentation contained in this book. The authors and publisher shall not be liable in any event for incidental or consequential damages in connection with, or arising out of, the furnishing, performance, or use of these programs.
Contents
2.2.3 Plotting . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
2.3 Hierarchical Controller . . . . . . . . . . . . . . . . . . . . . . . 31
2.3.1 World . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
2.3.2 Body . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
2.3.3 Middle Layer . . . . . . . . . . . . . . . . . . . . . . . . . 34
2.3.4 Top Layer . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
2.3.5 Plotting . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
11 Causality 271
11.1 Do Questions . . . . . . . . . . . . . . . . . . . . . . . . . . . . 271
11.2 Counterfactual Reasoning . . . . . . . . . . . . . . . . . . . . . 274
11.2.1 Choosing Deterministic System . . . . . . . . . . . . . . . 274
11.2.2 Firing Squad Example . . . . . . . . . . . . . . . . . . . . 278
Bibliography 399
Index 401
AIPython contains runnable code for the book Artificial Intelligence: Foundations of Computational Agents, 3rd Edition [Poole and Mackworth, 2023]. It has the following design goals:
1. Python for Artificial Intelligence
1.4 Pitfalls
It is important to know when side effects occur. Often AI programs consider what would/might happen given certain conditions. In many such cases, we don't want side effects. When an agent acts in the world, side effects are appropriate.
In Python, you need to be careful to understand side effects. For example, the inexpensive way to add an element to a list, namely append, changes the list in place. In a functional language like Haskell or Lisp, adding a new element to a list without changing the original list is a cheap operation. In Python, if x is a list containing n elements, adding an extra element to the list (using append) is fast, but it has the side effect of changing the list x. To construct a new list that contains the elements of x plus a new element, without changing the value of x, entails copying the list, or using a different representation for lists. In the searching code, we will use a different representation for lists for this reason.
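For example (a small illustration, not from the book's code files):

```python
x = [1, 2, 3]
y = x            # y is another name for the same list, not a copy
x.append(4)      # fast, but mutates the shared list as a side effect
print(y)         # [1, 2, 3, 4] -- y changed too

x2 = [1, 2, 3]
z = x2 + [4]     # builds a new list; x2 is unchanged, at the cost of copying
print(x2, z)     # [1, 2, 3] [1, 2, 3, 4]
```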
1.5.3 Generators
Python has generators, which can be used for a form of lazy evaluation – only computing values when needed.
A comprehension in round parentheses gives a generator that can generate the elements as needed. The result can go in a list, be used in another comprehension, or be stepped through directly using next. The procedure next takes an iterator and returns the next element (advancing the iterator); it raises a StopIteration exception if there is no next element. The following shows a simple example, where user input is prefixed with >>>
>>> a = (e*e for e in range(20) if e%2==0)
>>> next(a)
0
>>> next(a)
4
>>> next(a)
16
>>> list(a)
[36, 64, 100, 144, 196, 256, 324]
>>> next(a)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
StopIteration
Notice how list(a) continued the enumeration, and got to the end of it.
To make a procedure into a generator, the yield command returns a value
that is obtained with next. It is typically used to enumerate the values for a for
loop or in generators. (The yield command can also be used for coroutines,
but AIPython only uses it for generators.)
A version of the built-in range, with 2 or 3 arguments (and positive steps)
can be implemented as:1
pythonDemo.py — Some tricky examples
11 def myrange(start, stop, step=1):
12 """enumerates the values from start in steps of size step that are
13 less than stop.
14 """
15 assert step>0, f"only positive steps implemented in myrange: {step}"
16 i = start
17 while i<stop:
18 yield i
19 i += step
20
21 print("list(myrange(2,30,3)):",list(myrange(2,30,3)))
1 Numbered lines are Python code available in the code-directory, aipython. The name of
the file is given in the gray text above the listing. The numbers correspond to the line numbers
in that file.
Exercise 1.1 Implement a version of myrange that acts like the built-in version
when there is a single argument. (Hint: make the second argument have a default
value that can be recognized in the function.) There is no need to make it work
with indexing.
Yield can be used to generate the same sequence of values as in the example
above.
pythonDemo.py — (continued)
23 def ga(n):
24 """generates square of even nonnegative integers less than n"""
25 for e in range(n):
26 if e%2==0:
27 yield e*e
28 a = ga(20)
The sequence of next(a) and list(a) calls gives exactly the same results as the comprehension at the start of this section.
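This equivalence can be checked directly (restating ga so the snippet runs on its own):

```python
def ga(n):
    """generates square of even nonnegative integers less than n"""
    for e in range(n):
        if e % 2 == 0:
            yield e * e

# the generator and the comprehension produce the same sequence
assert list(ga(20)) == [e*e for e in range(20) if e % 2 == 0]
print(list(ga(20)))  # [0, 4, 16, 36, 64, 100, 144, 196, 256, 324]
```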
It is straightforward to write a version of the built-in enumerate called myenumerate:
pythonDemo.py — (continued)
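One way to write it, as a sketch that mirrors the style of myrange above (the file's own numbered listing may differ in detail), is:

```python
def myenumerate(enum):
    """a generator version of the built-in enumerate:
    yields (index, element) pairs for the elements of enum"""
    i = 0
    for e in enum:
        yield (i, e)
        i += 1
```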
pythonDemo.py — (continued)
36 fun_list1 = []
37 for i in range(5):
38 def fun1(e):
39 return e+i
40 fun_list1.append(fun1)
41
42 fun_list2 = []
43 for i in range(5):
44 def fun2(e,iv=i):
45 return e+iv
46 fun_list2.append(fun2)
47
48 fun_list3 = [lambda e: e+i for i in range(5)]
49
50 fun_list4 = [lambda e,iv=i: e+iv for i in range(5)]
51
52 i=56
Try to predict, and then test, the output of the following calls, remembering that a function uses the latest value of any variable that is not bound in the function call:
pythonDemo.py — (continued)
54 # in Shell do
55 ## ipython -i pythonDemo.py
56 # Try these (copy text after the comment symbol and paste in the Python prompt):
57 # print([f(10) for f in fun_list1])
58 # print([f(10) for f in fun_list2])
59 # print([f(10) for f in fun_list3])
60 # print([f(10) for f in fun_list4])
In the first for-loop, the function fun1 uses i, whose value is the last value it was assigned. In the second loop, the function fun2 uses iv. There is a separate iv variable for each function, and its value is the value of i when the function was defined. Thus fun1 uses late binding, and fun2 uses early binding. fun_list3 and fun_list4 are the lambda equivalents of the first two, except that each comprehension has its own i variable, which is not affected by later assignments to the module-level i.
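To make the binding distinction concrete, here is a condensed, self-contained version of the first two lists together with the outputs they produce:

```python
fun_list1, fun_list2 = [], []
for i in range(5):
    def fun1(e):
        return e + i          # late binding: uses i's value at call time
    fun_list1.append(fun1)
    def fun2(e, iv=i):
        return e + iv         # early binding: iv is fixed at definition time
    fun_list2.append(fun2)
i = 56

print([f(10) for f in fun_list1])  # [66, 66, 66, 66, 66]
print([f(10) for f in fun_list2])  # [10, 11, 12, 13, 14]
```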
One of the advantages of using the embedded definitions (as in fun1 and fun2 above) over the lambda is that it is possible to add a __doc__ string, which is the standard for documenting functions in Python, to the embedded definitions.
[Figure: the plot produced by myplot and the commands below, with axes labeled "The x axis" and "The y axis" and the annotation "ellipse?".]
64 def myplot(minv,maxv,step,fun1,fun2):
65 plt.ion() # make it interactive
66 plt.xlabel("The x axis")
67 plt.ylabel("The y axis")
68 plt.xscale('linear') # Makes a 'log' or 'linear' scale
69 xvalues = range(minv,maxv,step)
70 plt.plot(xvalues,[fun1(x) for x in xvalues],
71 label="The first fun")
72 plt.plot(xvalues,[fun2(x) for x in xvalues], linestyle='--',color='k',
73 label=fun2.__doc__) # use the doc string of the function
74 plt.legend(loc="upper right") # display the legend
75
76 def slin(x):
77 """y=2x+7"""
78 return 2*x+7
79 def sqfun(x):
80 """y=(x-40)^2/10-20"""
81 return (x-40)**2/10-20
82
83 # Try the following:
84 # from pythonDemo import myplot, slin, sqfun
85 # import matplotlib.pyplot as plt
86 # myplot(0,100,1,slin,sqfun)
87 # plt.legend(loc="best")
88 # import math
89 # plt.plot([41+40*math.cos(th/10) for th in range(50)],
90 # [100+100*math.sin(th/10) for th in range(50)])
91 # plt.text(40,100,"ellipse?")
92 # plt.xscale('log')
At the end of the code are some commented-out commands you should try in interactive mode. Cut from the file and paste into Python (and remember to remove the comment symbol and leading space).
1.7 Utilities
1.7.1 Display
To keep things simple, using only standard Python, AIPython code is written using text-oriented tracing.
The method self.display is used to trace the program. Any call
self.display(level, to_print . . . )
where the level is less than or equal to the value for max_display_level will be
printed. The to_print . . . can be anything that is accepted by the built-in print
(including any keyword arguments).
The definition of display is:
display.py — A simple way to trace the intermediate steps of algorithms.
11 class Displayable(object):
12 """Class that uses 'display'.
13 The amount of detail is controlled by max_display_level
14 """
15 max_display_level = 1 # can be overridden in subclasses or instances
16
17 def display(self,level,*args,**nargs):
18 """print the arguments if level is less than or equal to the
19 current max_display_level.
20 level is an integer.
21 the other arguments are whatever arguments print can take.
22 """
23 if level <= self.max_display_level:
24 print(*args, **nargs) ## if error, you are using Python 2, not Python 3
In this code, args gets a tuple of the positional arguments, and nargs gets a
dictionary of the keyword arguments. This will not work in Python 2, and will
give an error.
Any class that wants to use display can be made a subclass of Displayable.
To change the maximum display level to 3 for a class do:
Classname.max_display_level = 3
which will make calls to display in that class print when the value of level is
less-than-or-equal to 3. The default display level is 1. It can also be changed for
individual objects (the object value overrides the class value).
0 display nothing
2 also display the values as they change (little detail through a loop)
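As an illustration (restating a minimal Displayable here so the snippet is self-contained; Solver is a hypothetical subclass):

```python
class Displayable(object):
    """minimal restatement of the class above, for illustration"""
    max_display_level = 1
    def display(self, level, *args, **nargs):
        if level <= self.max_display_level:
            print(*args, **nargs)

class Solver(Displayable):       # hypothetical subclass
    def step(self, i):
        self.display(1, "step", i)
        self.display(2, "  detail for step", i)

s = Solver()
s.step(0)                        # prints only "step 0"
Solver.max_display_level = 2     # class-level change: more detail for all Solvers
s.step(1)                        # prints "step 1" and the detail line
quiet = Solver()
quiet.max_display_level = 1      # instance value overrides the class value
quiet.step(2)                    # back to printing only "step 2"
```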
1.7.2 Argmax
Python has a built-in max function that takes a generator (or a list or set) and returns the maximum value. The argmaxall method takes a generator of (element, value) pairs, as for example is generated by the built-in enumerate(list) for lists or dict.items() for dictionaries. It returns a list of all elements with maximum value; argmaxe returns one of these elements at random. The argmax method takes a list and returns the index of an element with maximum value, chosen at random if there are ties. argmaxd takes a dictionary and returns a key with maximum value.
utilities.py — AIPython useful utilities
11 import random
12 import math
13
14 def argmaxall(gen):
15 """gen is a generator of (element,value) pairs, where value is a real.
16 argmaxall returns a list of all of the elements with maximal value.
17 """
18 maxv = -math.inf # negative infinity
19 maxvals = [] # list of maximal elements
20 for (e,v) in gen:
21 if v > maxv:
22 maxvals, maxv = [e], v
23 elif v == maxv:
24 maxvals.append(e)
25 return maxvals
26
27 def argmaxe(gen):
28 """gen is a generator of (element,value) pairs, where value is a real.
29 argmaxe returns an element with maximal value.
30 If there are multiple elements with the max value, one is returned at random.
31 """
32 return random.choice(argmaxall(gen))
33
34 def argmax(lst):
35 """returns maximum index in a list"""
36 return argmaxe(enumerate(lst))
37 # Try:
38 # argmax([1,6,3,77,3,55,23])
39
40 def argmaxd(dct):
41 """returns the arg max of a dictionary dct"""
42 return argmaxe(dct.items())
43 # Try:
44 # argmaxd({2:5,5:9,7:7})
Exercise 1.2 Change argmaxe to have an optional argument that specifies whether
you want the “first”, “last” or a “random” index of the maximum value returned.
If you want the first or the last, you don’t need to keep a list of the maximum
elements. Enable the other methods to have this optional argument, if appropriate.
1.7.3 Probability
For many of the simulations, we want to make a variable True with some prob-
ability. flip(p) returns True with probability p, and otherwise returns False.
utilities.py — (continued)
45 def flip(prob):
46 """return true with probability prob"""
47 return random.random() < prob
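For example (restating flip so the snippet runs on its own):

```python
import random

def flip(prob):
    """return True with probability prob"""
    return random.random() < prob

random.seed(0)               # fixed seed so the check is repeatable
n = 10000
frac = sum(flip(0.7) for _ in range(n)) / n
print(frac)                  # close to 0.7
```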
utilities.py — (continued)
49 def select_from_dist(item_prob_dist):
50 """ returns a value from a distribution.
51 item_prob_dist is an item:probability dictionary, where the
52 probabilities sum to 1.
53 returns an item chosen in proportion to its probability
54 """
55 ranreal = random.random()
56 for (it,prob) in item_prob_dist.items():
57 if ranreal < prob:
58 return it
59 else:
60 ranreal -= prob
61 raise RuntimeError(f"{item_prob_dist} is not a probability distribution")
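A quick empirical check (restating select_from_dist from the listing above):

```python
import random
from collections import Counter

def select_from_dist(item_prob_dist):
    """returns an item chosen in proportion to its probability"""
    ranreal = random.random()
    for (it, prob) in item_prob_dist.items():
        if ranreal < prob:
            return it
        else:
            ranreal -= prob
    raise RuntimeError(f"{item_prob_dist} is not a probability distribution")

random.seed(0)
counts = Counter(select_from_dist({'a': 0.5, 'b': 0.3, 'c': 0.2})
                 for _ in range(10000))
print(counts)   # roughly 5000 'a', 3000 'b', 2000 'c'
```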
utilities.py — (continued)
63 def test():
64 """Test part of utilities"""
65 assert argmax([1,6,55,3,55,23]) in [2,4]
66 print("Passed unit test in utilities")
67 print("run test_aipython() to test (almost) everything")
68
69 if __name__ == "__main__":
70 test()
The following imports all of the Python code and does a simple check of all of AIPython that has automatic checks. If you develop new algorithms or tests, add them here!
utilities.py — (continued)
72 def test_aipython():
73 import pythonDemo, display
74 # Agents: currently no tests
75 import agents, agentBuying, agentEnv, agentMiddle, agentTop, agentFollowTarget
76 # Search:
77 print("***** testing Search *****")
78 import searchGeneric, searchBranchAndBound, searchExample, searchTest
79 searchGeneric.test(searchGeneric.AStarSearcher)
80 searchBranchAndBound.test(searchBranchAndBound.DF_branch_and_bound)
81 searchTest.run(searchExample.problem1,"Problem 1")
82 import searchGUI, searchMPP, searchGrid
83 # CSP
84 print("\n***** testing CSP *****")
85 import cspExamples, cspDFS, cspSearch, cspConsistency, cspSLS
86 cspExamples.test_csp(cspDFS.dfs_solve1)
87 cspExamples.test_csp(cspSearch.solver_from_searcher)
88 cspExamples.test_csp(cspConsistency.ac_solver)
89 cspExamples.test_csp(cspConsistency.ac_search_solver)
90 cspExamples.test_csp(cspSLS.sls_solver)
91 cspExamples.test_csp(cspSLS.any_conflict_solver)
92 import cspConsistencyGUI, cspSoft
93 # Propositions
94 print("\n***** testing Propositional Logic *****")
2. Agent Architectures and Hierarchical Control
The state of the agent and the state of the environment are represented us-
ing standard Python variables, which are updated as the state changes. The
percept and the actions are represented as variable-value dictionaries.
Agent and Environment are subclasses of Displayable so that they can use
the display method described in Section 1.7.1. raise NotImplementedError()
is a way to specify an abstract method that needs to be overridden in any im-
plemented agent or environment.
agents.py — Agent and Controllers
11 from display import Displayable
12
13 class Agent(Displayable):
14
15 def initial_action(self, percept):
16 """return the initial action."""
17 return self.select_action(percept) # same as select_action
18
19 def select_action(self, percept):
20 """return the next action (and update internal state) given percept
21 percept is variable:value dictionary
22 """
23 raise NotImplementedError("go") # abstract method
The environment implements a do(action) method where action is a variable-
value dictionary. This returns a percept, which is also a variable-value dictio-
nary. The use of dictionaries allows for structured actions and percepts.
agents.py — (continued)
25 class Environment(Displayable):
26 def initial_percept(self):
27 """returns the initial percept for the agent"""
28 raise NotImplementedError("initial_percept") # abstract method
29
30 def do(self, action):
31 """does the action in the environment
32 returns the next percept """
33 raise NotImplementedError("Environment.do") # abstract method
The simulator is initialized with initial_percept and then the agent and
the environment take turns in updating their states and returning the action
and the percept. This simulator runs for n steps. A slightly more sophisticated
simulator could run until some stopping condition.
agents.py — (continued)
35 class Simulate(Displayable):
36 """simulate the interaction between the agent and the environment
37 for n time steps.
38 """
39 def __init__(self,agent, environment):
40 self.agent = agent
41 self.env = environment
42 self.percept = self.env.initial_percept()
43 self.percept_history = [self.percept]
44 self.action_history = []
45
46 def go(self, n):
47 for i in range(n):
48 action = self.agent.select_action(self.percept)
49 self.display(2,f"i={i} action={action}")
50 self.percept = self.env.do(action)
51 self.display(2,f" percept={self.percept}")
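The interaction loop that Simulate.go performs can be exercised with minimal stand-ins for the agent and environment (hypothetical toy classes, not from the AIPython files):

```python
class EchoEnv:
    """toy environment whose percept reports the last value set"""
    def initial_percept(self):
        return {'value': 0}
    def do(self, action):
        return {'value': action['set']}

class IncAgent:
    """toy agent that asks the environment to increment the value"""
    def select_action(self, percept):
        return {'set': percept['value'] + 1}

env, ag = EchoEnv(), IncAgent()
percept = env.initial_percept()
for i in range(3):                     # the same loop as Simulate.go
    action = ag.select_action(percept)
    percept = env.do(action)
print(percept)   # {'value': 3}
```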
14
15 class TP_env(Environment):
16 price_delta = [0, 0, 0, 21, 0, 20, 0, -64, 0, 0, 23, 0, 0, 0, -35,
17 0, 76, 0, -41, 0, 0, 0, 21, 0, 5, 0, 5, 0, 0, 0, 5, 0, -15, 0, 5,
18 0, 5, 0, -115, 0, 115, 0, 5, 0, -15, 0, 5, 0, 5, 0, 0, 0, 5, 0,
19 -59, 0, 44, 0, 5, 0, 5, 0, 0, 0, 5, 0, -65, 50, 0, 5, 0, 5, 0, 0,
20 0, 5, 0]
21 sd = 5 # noise standard deviation
22
23 def __init__(self):
24 """paper buying agent"""
25 self.time=0
26 self.stock=20
27 self.stock_history = [] # memory of the stock history
28 self.price_history = [] # memory of the price history
29
30 def initial_percept(self):
31 """return initial percept"""
32 self.stock_history.append(self.stock)
33 self.price = round(234+self.sd*random.gauss(0,1))
34 self.price_history.append(self.price)
35 return {'price': self.price,
36 'instock': self.stock}
37
38 def do(self, action):
39 """does action (buy) and returns percept consisting of price and instock"""
40 used = select_from_dist({6:0.1, 5:0.1, 4:0.1, 3:0.3, 2:0.2, 1:0.2})
41 # used = select_from_dist({7:0.1, 6:0.2, 5:0.2, 4:0.3, 3:0.1, 2:0.1}) # uses more paper
42 bought = action['buy']
43 self.stock = self.stock+bought-used
44 self.stock_history.append(self.stock)
45 self.time += 1
46 self.price = round(self.price
47 + self.price_delta[self.time%len(self.price_delta)] # repeating pattern
48 + self.sd*random.gauss(0,1)) # plus randomness
49 self.price_history.append(self.price)
50 return {'price': self.price,
51 'instock': self.stock}
53 class TP_agent(Agent):
54 def __init__(self):
55 self.spent = 0
56 percept = env.initial_percept()
57 self.ave = self.last_price = percept['price']
58 self.instock = percept['instock']
59 self.buy_history = []
60
61 def select_action(self, percept):
62 """return next action to carry out
63 """
64 self.last_price = percept['price']
65 self.ave = self.ave+(self.last_price-self.ave)*0.05
66 self.instock = percept['instock']
67 if self.last_price < 0.9*self.ave and self.instock < 60:
68 tobuy = 48
69 elif self.instock < 12:
70 tobuy = 12
71 else:
72 tobuy = 0
73 self.spent += tobuy*self.last_price
74 self.buy_history.append(tobuy)
75 return {'buy': tobuy}
Set up an environment and an agent. Uncomment the last lines to run the agent
for 90 steps, and determine the average amount spent.
agentBuying.py — (continued)
77 env = TP_env()
78 ag = TP_agent()
79 sim = Simulate(ag,env)
80 #sim.go(90)
81 #ag.spent/env.time ## average spent per time period
2.2.3 Plotting
The following plots the price and number in stock history:
agentBuying.py — (continued)
Figure 2.1: Percept and command traces for the paper-buying agent. The plot shows the price, the number in stock, and the amount bought against time.
94
95 def plot_env_hist(self):
96 """plot history of price and instock"""
97 num = len(env.stock_history)
98 plt.plot(range(num),env.price_history,label="Price")
99 plt.plot(range(num),env.stock_history,label="In stock")
100 plt.legend()
101 #plt.draw()
102
103 def plot_agent_hist(self):
104 """plot history of buying"""
105 num = len(ag.buy_history)
106 plt.bar(range(1,num+1), ag.buy_history, label="Bought")
107 plt.legend()
108 #plt.draw()
109
110 # sim.go(100); print(f"agent spent ${ag.spent/100}")
111 # pl = Plot_history(ag,env); pl.plot_env_hist(); pl.plot_agent_hist()
Figure 2.1 shows the result of the plotting in the previous code.
Exercise 2.1 Design a better controller for a paper-buying agent.
• Give a controller that can work for many different price histories. An agent
can use other local state variables, but does not have access to the environ-
ment model.
• Is it worthwhile trying to infer the amount of paper that the home uses?
(Try your controller with the different paper consumption commented out
in TP_env.do.)
In this implementation, each layer, including the top layer, implements the environment class, because each layer is seen as an environment from the layer above.
The robot controller is decomposed as follows. The world defines the walls. The body describes the robot's position, and its physical abilities such as whether its whisker sensor is on. The body can be told to steer left or right or to go straight. The middle layer can be told to go to x-y positions, avoiding walls. The top layer knows about named locations, such as the storage room and location o103, and their x-y positions. It can be told a sequence of locations, and tells the middle layer to go to the positions of the locations in turn.
2.3.1 World
The world defines the walls. This is not implemented as an environment as
it does not change. If the agent could move walls, it should be made into an
environment.
2.3.2 Body
Rob_body defines everything about the agent body: its position and orientation, and whether its whisker sensor is on. It implements the Environment class:
21 import math
22 from agents import Environment
23 import matplotlib.pyplot as plt
24 import time
25
26 class Rob_body(Environment):
27 def __init__(self, world, init_pos=(0,0,90)):
28 """ world is the current world
29 init_pos is a triple of (x-position, y-position, direction)
30 direction is in degrees; 0 is to right, 90 is straight-up, etc
31 """
32 self.world = world
33 self.rob_x, self.rob_y, self.rob_dir = init_pos
34 self.turning_angle = 18 # degrees that a left makes
35 self.whisker_length = 6 # length of the whisker
36 self.whisker_angle = 30 # angle of whisker relative to robot
37 self.crashed = False
38 # The following control how it is plotted
39 self.plotting = True # whether the trace is being plotted
40 self.sleep_time = 0.05 # time between actions (for real-time plotting)
41 # The following are data structures maintained:
42 self.history = [(self.rob_x, self.rob_y)] # history of (x,y) positions
43 self.wall_history = [] # history of hitting the wall
44
45 def percept(self):
46 return {'rob_x_pos':self.rob_x, 'rob_y_pos':self.rob_y,
47 'rob_dir':self.rob_dir, 'whisker':self.whisker(), 'crashed':self.crashed}
48 initial_percept = percept # use percept function for initial percept too
49
50 def do(self,action):
51 """ action is {'steer':direction}
52 direction is 'left', 'right' or 'straight'.
53 Returns current percept.
54 """
55 if self.crashed:
56 return self.percept()
57 direction = action['steer']
58 compass_deriv = {'left':1,'straight':0,'right':-1}[direction]*self.turning_angle
59 self.rob_dir = (self.rob_dir + compass_deriv +360)%360 # make in range [0,360)
60 rob_x_new = self.rob_x + math.cos(self.rob_dir*math.pi/180)
61 rob_y_new = self.rob_y + math.sin(self.rob_dir*math.pi/180)
62 path = ((self.rob_x,self.rob_y),(rob_x_new,rob_y_new))
The Boolean whisker method returns True when the robot's whisker sensor intersects with a wall.
agentEnv.py — (continued)
77 def whisker(self):
78 """returns true whenever the whisker sensor intersects with a wall
79 """
80 whisk_ang_world = (self.rob_dir-self.whisker_angle)*math.pi/180
81 # angle in radians in world coordinates
82 wx = self.rob_x + self.whisker_length * math.cos(whisk_ang_world)
83 wy = self.rob_y + self.whisker_length * math.sin(whisk_ang_world)
84 whisker_line = ((self.rob_x,self.rob_y),(wx,wy))
85 hit = any(line_segments_intersect(whisker_line,wall)
86 for wall in self.world.walls)
87 if hit:
88 self.wall_history.append((self.rob_x, self.rob_y))
89 if self.plotting:
90 plt.plot([self.rob_x],[self.rob_y],"ro")
91 plt.draw()
92 return hit
93
94 def line_segments_intersect(linea, lineb):
95 """returns true if the line segments, linea and lineb intersect.
96 A line segment is represented as a pair of points.
97 A point is represented as a (x,y) pair.
98 """
99 ((x0a,y0a),(x1a,y1a)) = linea
100 ((x0b,y0b),(x1b,y1b)) = lineb
101 da, db = x1a-x0a, x1b-x0b
102 ea, eb = y1a-y0a, y1b-y0b
103 denom = db*ea-eb*da
104 if denom==0: # line segments are parallel
105 return False
106 cb = (da*(y0b-y0a)-ea*(x0b-x0a))/denom # intersect along line b
107 if cb<0 or cb>1:
108 return False # intersect is outside line segment b
42 remaining -= 1
43 arrived = self.close_enough(target_pos)
44 return {'arrived':arrived}
The following method determines how to steer depending on whether the goal
is to the right or the left of where the robot is facing.
agentMiddle.py — (continued)
2.3.5 Plotting
The following is used to plot the locations, the walls and (eventually) the movement of the robot. It can either plot the movement of the robot as it is going (with the default env.plotting = True), or not plot it as it is going (setting env.plotting = False; in this case the trace can be plotted using pl.plot_run()).
agentTop.py — (continued)
Figure 2.2: A trace of the trajectory of the agent. Red dots correspond to the whisker sensor being on; the green dot to the whisker sensor being off. The agent starts at position (0, 0) facing up.
50 def redraw(self):
51 plt.clf()
52 for wall in self.body.world.walls:
53 ((x0,y0),(x1,y1)) = wall
54 plt.plot([x0,x1],[y0,y1],"-k",linewidth=3)
55 for loc in self.top.locations:
56 (x,y) = self.top.locations[loc]
57 plt.plot([x],[y],"k<")
58 plt.text(x+1.0,y+0.5,loc) # print the label above and to the right
59 plt.plot([self.body.rob_x],[self.body.rob_y],"go")
60 plt.gca().figure.canvas.draw()
61 if self.body.history or self.body.wall_history:
62 self.plot_run()
63
64 def plot_run(self):
65 """plots the history after the agent has finished.
66 This is typically only used if body.plotting==False
67 """
68 if self.body.history:
69 xs,ys = zip(*self.body.history)
70 plt.plot(xs,ys,"go")
71 if self.body.wall_history:
72 wxs,wys = zip(*self.body.wall_history)
73 plt.plot(wxs,wys,"ro")
The following code plots the agent as it acts in the world. Figure 2.2 shows the result of the top.do call.
agentTop.py — (continued)
[Figure: a plot of the world, showing the walls and a goal position.]
76
77 world = Rob_world({((20,0),(30,20)), ((70,-5),(70,25))})
78 body = Rob_body(world)
79 middle = Rob_middle_layer(body)
80 top = Rob_top_layer(middle)
81
82 # try:
83 # pl=Plot_env(body,top)
84 # top.do({'visit':['o109','storage','o109','o103']})
85 # You can directly control the middle layer:
86 # middle.do({'go_to':(30,-10), 'timeout':200})
87 # Can you make it crash?
88
89 if __name__ == "__main__":
90 print("Try: Plot_env(body,top); top.do({'visit':['o109','storage','o109','o103']})")
Exercise 2.2 The following code implements a robot trap (Figure 2.3). It is called
a trap because, once it has hit the wall, it needs to follow the wall, but local features
are not enough for it to know when it can head to the goal. Write a controller that
can escape the “trap” and get to the goal. See Exercise 2.4 in the textbook for hints.
agentTop.py — (continued)
45 self.pressloc = None
46 self.pressevent = None
47
48 def on_move(self, event):
49 if self.pressloc is not None: # and event.inaxes == self.pressevent.inaxes:
50 self.display(2,'-',end="")
51 self.top.locations[self.pressloc] = (event.xdata, event.ydata)
52 self.redraw()
53 else:
54 self.display(2,'.',end="")
55
56 # try:
57 # pl=Plot_follow(body,top)
58 # top.do({'visit':['o109','storage','o109','o103']})
59
60 if __name__ == "__main__":
61 print("Try: Plot_follow(body,top); top.do({'visit':['o109','storage','o109','o103']})")
3. Searching for Solutions
17 * a start node
18 * a neighbors function that gives the neighbors of a node
19 * a specification of a goal
20 * a (optional) heuristic function.
21 The methods must be overridden to define a search problem."""
22
23 def start_node(self):
24 """returns start node"""
25 raise NotImplementedError("start_node") # abstract method
26
27 def is_goal(self,node):
28 """is True if node is a goal"""
29 raise NotImplementedError("is_goal") # abstract method
30
31 def neighbors(self,node):
32 """returns a list (or enumeration) of the arcs for the neighbors of node"""
33 raise NotImplementedError("neighbors") # abstract method
34
35 def heuristic(self,n):
36 """Gives the heuristic value of node n.
37 Returns 0 if not overridden."""
38 return 0
searchProblem.py — (continued)
40 class Arc(object):
41 """An arc consists of
42 a from_node and a to_node node
43 a (non-negative) cost
44 an (optional) action
45 """
46 def __init__(self, from_node, to_node, cost=1, action=None):
47 self.from_node = from_node
48 self.to_node = to_node
49 self.cost = cost
50 assert cost >= 0, (f"Cost cannot be negative: {self}, cost={cost}")
51 self.action = action
52
53 def __repr__(self):
54 """string representation of an arc"""
55 if self.action:
56 return f"{self.from_node} --{self.action}--> {self.to_node}"
57 else:
58 return f"{self.from_node} --> {self.to_node}"
To define a search problem, you need to define the start node, the goal predi-
cate, the neighbors function and, for some algorithms, a heuristic function.
searchProblem.py — (continued)
60 class Search_problem_from_explicit_graph(Search_problem):
61 """A search problem from an explicit graph.
62 """
63
64 def __init__(self, title, nodes, arcs, start=None, goals=set(), hmap={},
65 positions=None):
66 """ A search problem consists of:
67 * list or set of nodes
68 * list or set of arcs
69 * start node
70 * list or set of goal nodes
71 * hmap: dictionary that maps each node into its heuristic value.
72 * positions: dictionary that maps each node into its (x,y) position
73 """
74 self.title = title
75 self.neighs = {}
76 self.nodes = nodes
77 for node in nodes:
78 self.neighs[node]=[]
79 self.arcs = arcs
80 for arc in arcs:
81 self.neighs[arc.from_node].append(arc)
82 self.start = start
83 self.goals = goals
84 self.hmap = hmap
85 if positions is None:
129 bbox = dict(boxstyle="round4,pad=1.0,rounding_size=0.5",facecolor=node_color)
130 for arc in self.arcs:
131 self.show_arc(ax, arc)
132 for node in self.nodes:
133 self.show_node(ax, node, node_color = node_color)
134
135 def show_node(self, ax, node, node_color):
136 x,y = self.positions[node]
137 ax.text(x,y,node,bbox=dict(boxstyle="round4,pad=1.0,rounding_size=0.5",
138 facecolor=node_color),
139 ha='center',va='center', fontsize=self.fontsize)
140
141 def show_arc(self, ax, arc, arc_color='black', node_color='white'):
142 from_pos = self.positions[arc.from_node]
143 to_pos = self.positions[arc.to_node]
144 ax.annotate(arc.to_node, from_pos, xytext=to_pos,
145 arrowprops={'arrowstyle':'<|-', 'linewidth': 2,
146 'color':arc_color},
147 bbox=dict(boxstyle="round4,pad=1.0,rounding_size=0.5",
148 facecolor=node_color),
149 ha='center',va='center',
150 fontsize=self.fontsize)
151 # Add costs to middle of arcs:
152 if self.show_costs:
153 ax.text((from_pos[0]+to_pos[0])/2, (from_pos[1]+to_pos[1])/2,
154 arc.cost, bbox=dict(pad=1,fc='w',ec='w'),
155 ha='center',va='center',fontsize=self.fontsize)
3.1.2 Paths
A searcher will return a path from the start node to a goal node. A Python list
is not a suitable representation for a path, as many search algorithms consider
multiple paths at once, and these paths should share initial parts of the path.
If we wanted to do this with Python lists, we would need to keep copying the
list, which can be expensive if the list is long. An alternative representation is
used here in terms of a recursive data structure that can share subparts.
A path is either:
• a node (a path of length 0), or
• an initial path, and an arc at the end, where the from_node of the arc is the node at the end of the initial path.
These cases are distinguished in the following code by having arc=None if the
path has length 0, in which case initial is the node of the path. Note that
we only use the most basic form of Python’s yield for enumerations (Section
1.5.3).
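The recursive representation described above can be sketched as follows. This is a minimal, hypothetical version for illustration; the book's searchProblem.py defines its own, richer Arc and Path classes.

```python
class Arc(object):
    """an arc from from_node to to_node with a cost (minimal sketch)"""
    def __init__(self, from_node, to_node, cost=1):
        self.from_node, self.to_node, self.cost = from_node, to_node, cost

class Path(object):
    """a path is a node (arc is None) or a path `initial` followed by an arc"""
    def __init__(self, initial, arc=None):
        self.initial = initial
        self.arc = arc
        # a length-0 path costs 0; otherwise add the arc cost to the initial path
        self.cost = 0 if arc is None else initial.cost + arc.cost

    def end(self):
        """the last node of the path"""
        return self.initial if self.arc is None else self.arc.to_node

# two paths that share their initial part, with no copying:
ab = Path(Path('A'), Arc('A', 'B', 1))
abc = Path(ab, Arc('B', 'C', 3))
abd = Path(ab, Arc('B', 'D', 2))
```

Here abc and abd both reference the same ab object, which is what makes this representation cheap when many paths are considered at once.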
searchProblem.py — (continued)
[Figure: problem1 — a search graph with nodes A, B, C, D, G and arc costs between 1 and 3.]
The second search problem is one with 8 nodes where many paths do not lead
to the goal. See Figure 3.2.
searchExample.py — (continued)
[Figure 3.2: problem2 — a search graph with nodes A, B, C, D, E, G, H, J and arc costs.]
The third search problem is a disconnected graph (contains no arcs), where the
start node is a goal node. This is a boundary case to make sure that weird cases
work.
searchExample.py — (continued)
searchExample.py — (continued)
[Figure: the simple delivery graph (simp_delivery_graph), with nodes A–H and J and arc costs.]
[Figure 3.4: the cyclic delivery graph (cyclic_simp_delivery_graph), with nodes A–H and J and arc costs.]
73 )
cyclic_simp_delivery_graph is the graph shown in Figure 3.4. This is the
graph of Figure 3.10 of [Poole and Mackworth, 2023]. The heuristic values are
the same as in simp_delivery_graph.
searchExample.py — (continued)
74 cyclic_simp_delivery_graph = Search_problem_from_explicit_graph("Cyclic Delivery Graph",
75 {'A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'J'},
76 [ Arc('A', 'B', 2),
77 Arc('A', 'C', 3),
78 Arc('A', 'D', 4),
79 Arc('B', 'E', 2),
80 Arc('B', 'F', 3),
81 Arc('C', 'A', 3),
82 Arc('C', 'J', 6),
83 Arc('D', 'A', 4),
84 Arc('D', 'H', 4),
85 Arc('F', 'B', 3),
86 Arc('F', 'D', 2),
87 Arc('G', 'H', 3),
88 Arc('G', 'J', 4),
89 Arc('H', 'D', 4),
90 Arc('H', 'G', 3),
91 Arc('J', 'C', 6),
92 Arc('J', 'G', 4)],
93 start = 'A',
94 goals = {'G'},
95 hmap = {
96 'A': 7,
97 'B': 5,
98 'C': 9,
99 'D': 6,
100 'E': 3,
101 'F': 5,
102 'G': 0,
103 'H': 3,
104 'J': 4,
105 },
106 positions = {
107 'A': (0.4,0.1),
108 'B': (0.4,0.4),
109 'C': (0.1,0.1),
110 'D': (0.7,0.1),
111 'E': (0.6,0.7),
112 'F': (0.7,0.4),
113 'G': (0.7,0.9),
114 'H': (0.9,0.6),
115 'J': (0.3,0.9)
116 })
The next problem is the tree graph shown in Figure 3.5, and is Figure 3.15
in Poole and Mackworth [2023].
searchExample.py — (continued)
[Figure 3.5: the tree graph — a tree rooted at A, with children B and C, down to leaves FF through KK.]
3.2.1 Searcher
A Searcher for a problem can be asked repeatedly for the next path. To solve a
search problem, construct a Searcher object for the problem and then repeatedly
ask for the next path using search. If there are no more paths, None is returned.
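As a self-contained illustration of this use pattern, here is a minimal depth-first searcher over an explicit graph. The names are hypothetical, and plain lists are used for paths for clarity (not the shared representation of Section 3.1.2); the book's Searcher in searchGeneric.py works on Search_problem objects.

```python
class SimpleSearcher(object):
    """a minimal depth-first searcher over an adjacency-dict graph (sketch)"""
    def __init__(self, graph, start, goals):
        self.graph = graph              # dict mapping a node to its neighbors
        self.goals = goals
        self.frontier = [[start]]       # frontier of paths, as plain lists

    def search(self):
        """returns the next path from the start to a goal, or None if none remain"""
        while self.frontier:
            path = self.frontier.pop()          # expand the most recently added path
            if path[-1] in self.goals:
                return path
            for nbr in self.graph.get(path[-1], []):
                self.frontier.append(path + [nbr])
        return None

graph = {'A': ['B', 'C'], 'B': ['D'], 'C': ['D'], 'D': []}
searcher = SimpleSearcher(graph, 'A', {'D'})
first = searcher.search()     # one path from A to D
second = searcher.search()    # a different path to D; later calls return None
```

Repeated calls to search enumerate the remaining paths, returning None once the frontier is exhausted, as described above.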
[Figure: SearcherGUI after expanding A → B on the delivery graph; legend: red = selected, blue = neighbors, green = frontier, yellow = goal.]
The frontier contains paths to C and D; it used to also contain A → B, and now will
contain A → B → E and A → B → F.
SearcherGUI takes a search class and a problem, and lets one explore the
search space after calling go(). A GUI can only be used for one search; at the
end of the search the loop ends and the buttons no longer work.
This is implemented by redefining display. The search algorithms don’t
need to be modified. If you modify them (or create your own), you just have to
be careful to use the appropriate number for the display. The first argument to
display has the following meanings:
3. (shown with “fine step” but not with “step”) the frontier and the path
selected
4. (shown with “fine step” but not with “step”) the frontier.
59 raise ExitToPython()
60 if level <= self.click: #step
61 print(*args, **nargs)
62 self.ax.set_title(f"Expanding: {self.searcher.path}",
63 fontsize=self.problem.fontsize)
64 if level == 1:
65 self.show_frontier(self.colors['frontier'])
66 self.show_path(self.colors['selected'])
67 self.ax.set_title(f"Solution Found: {self.searcher.path}",
68 fontsize=self.problem.fontsize)
69 elif level == 2: # what should be shown if node in multiple?
70 self.show_frontier(self.colors['frontier'])
71 self.show_path(self.colors['selected'])
72 self.show_neighbors(self.colors['neighbors'])
73 elif level == 3:
74 self.show_frontier(self.colors['frontier'])
75 self.show_path(self.colors['selected'])
76 elif level == 4:
77 self.show_frontier(self.colors['frontier'])
78
79
80 # wait for a button click
81 self.click = 0
82 plt.draw()
83 while self.click == 0 and not self.quitting:
84 plt.pause(0.1)
85 if self.quitting:
86 raise ExitToPython()
87 # undo coloring:
88 self.ax.set_title("")
89 self.show_frontier('white')
90 self.show_neighbors('white')
91 path_show = self.searcher.path
92 while path_show.arc:
93 self.problem.show_arc(self.ax, path_show.arc, 'black')
94 self.problem.show_node(self.ax, path_show.end(), 'white')
95 path_show = path_show.initial
96 self.problem.show_node(self.ax, path_show.end(), 'white')
97 if self.problem.is_goal(self.searcher.path.end()):
98 self.problem.show_node(self.ax, self.searcher.path.end(),
99 self.colors['goal'])
100 plt.draw()
101
102 def show_frontier(self, color):
103 for path in self.searcher.frontier:
104 self.problem.show_node(self.ax, path.end(), color)
105
106 def show_path(self, color):
107 """color selected path"""
108 path_show = self.searcher.path
searchGUI.py — (continued)
searchGeneric.py — (continued)
100 """
101 (_,_,path) = heapq.heappop(self.frontierpq)
102 return path
The following methods are used for finding and printing information about
the frontier.
searchGeneric.py — (continued)
3.2.4 A∗ Search
For an A∗ Search the frontier is implemented using the FrontierPQ class.
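A minimal sketch of such a heap-based frontier, consistent with the pop method shown above, is given below. This is a hypothetical reconstruction; the book's searchGeneric.py defines the actual FrontierPQ class.

```python
import heapq

class FrontierPQ(object):
    """a frontier ordered by value (e.g., path cost plus heuristic); minimal sketch"""
    def __init__(self):
        self.frontier_index = 0   # the number of paths ever added
        self.frontierpq = []      # heap of (value, -index, path) triples

    def empty(self):
        return self.frontierpq == []

    def add(self, path, value):
        """add path to the frontier with the given value"""
        self.frontier_index += 1
        # the middle element breaks ties between equal values
        # and stops heapq from ever comparing two paths (see Exercise 3.4)
        heapq.heappush(self.frontierpq, (value, -self.frontier_index, path))

    def pop(self):
        """return and remove the path with minimum value"""
        (_, _, path) = heapq.heappop(self.frontierpq)
        return path

fpq = FrontierPQ()
fpq.add("p1", 3)
fpq.add("p2", 1)
fpq.add("p3", 3)
```

Popping returns the lowest-valued path first; among equal values, the most recently added path comes off first.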
searchGeneric.py — (continued)
searchGeneric.py — (continued)
Exercise 3.2 Change the code so that it implements (i) best-first search and (ii)
lowest-cost-first search. For each of these methods compare it to A∗ in terms of the
number of paths expanded, and the path found.
Exercise 3.3 The searcher acts like a Python iterator, in that it returns one value
(here a path) and then returns other values (paths) on demand, but does not imple-
ment the iterator interface. Change the code so it implements the iterator interface.
What does this enable us to do?
46 self.num_expanded,"paths expanded.")
47
48 from searchGeneric import test
49 if __name__ == "__main__":
50 test(SearcherMPP)
51
52 import searchExample
53 # searcherMPPcdp = SearcherMPP(searchExample.cyclic_simp_delivery_graph)
54 # searcherMPPcdp.search() # find first path
55
56 # To use the GUI for SearcherMPP do
57 # python -i searchGUI.py
58 # import searchMPP
59 # SearcherGUI(searchMPP.SearcherMPP, searchExample.cyclic_simp_delivery_graph)
Exercise 3.4 Chris was very puzzled as to why there was a minus (“−”) in the
second element of the tuple added to the heap in the add method in FrontierPQ in
searchGeneric.py.
Sam suggested the following example would demonstrate the importance of
the minus. Consider an infinite integer grid, where the states are pairs of integers,
the start is (0,0), and the goal is (10,10). The neighbors of (i, j) are (i + 1, j) and (i, j +
1). Consider the heuristic function h((i, j)) = |10 − i| + |10 − j|. Sam suggested you
compare how many paths are expanded with the minus and without the minus.
searchGrid is a representation of Sam’s graph. If something takes too long, you
might consider changing the size.
33 return abs(x-self.size)+abs(y-self.size)
34
35 class GridProblemNH(GridProblem):
36 """Grid problem with a heuristic of 0"""
37 def heuristic(self,node):
38 return 0
39
40 from searchGeneric import Searcher, AStarSearcher
41 from searchMPP import SearcherMPP
42 from searchBranchAndBound import DF_branch_and_bound
43
44 def testGrid(size = 10):
45 print("\nWith MPP")
46 gridsearchermpp = SearcherMPP(GridProblem(size))
47 print(gridsearchermpp.search())
48 print("\nWithout MPP")
49 gridsearchera = AStarSearcher(GridProblem(size))
50 print(gridsearchera.search())
51 print("\nWith MPP and a heuristic = 0 (Dijkstra's algorithm)")
52 gridsearchermppnh = SearcherMPP(GridProblemNH(size))
53 print(gridsearchermppnh.search())
Explain to Chris what the minus does and why it is there. Give evidence for your
claims. It might be useful to refer to other search strategies in your explanation.
As part of your explanation, explain what is special about Sam’s example.
Exercise 3.5 Implement a searcher that implements cycle pruning instead of
multiple-path pruning. You need to decide whether to check for cycles when paths
are added to the frontier or when they are removed. (Hint: either method can be
implemented by only changing one or two lines in SearcherMPP. Hint: there is
a cycle if path.end() in path.initial_nodes() ) Compare no pruning, multiple
path pruning and cycle pruning for the cyclic delivery problem. Which works
better in terms of number of paths expanded, computational time or space?
Depth-first search methods do not need a priority queue, but can use a list
as a stack. In this implementation of branch-and-bound search, we call search
to find an optimal solution with cost less than bound. This uses depth-first
search to find a path to a goal that extends path with cost less than the bound.
Once a path to a goal has been found, that path is remembered as the best_path,
the bound is reduced, and the search continues.
searchBranchAndBound.py — Branch and Bound Search
11 from searchProblem import Path
Exercise 3.6 In searcherb2, in the code above, what happens if the bound is
smaller, say 10? What if it is larger, say 1000?
Exercise 3.7 Implement a branch-and-bound search using recursion. Hint: you
don’t need an explicit frontier, but can do a recursive call for the children.
Exercise 3.8 Add loop detection to branch-and-bound search.
Exercise 3.9 After the branch-and-bound search found a solution, Sam ran search
again, and noticed a different count. Sam hypothesized that this count was related
to the number of nodes that an A∗ search would use (either expand or be added to
the frontier). Or maybe, Sam thought, the count for a number of nodes when the
bound is slightly above the optimal path case is related to how A∗ would work. Is
there a relationship between these counts? Are there different things that it could
count so they are related? Try to find the most specific statement that is true, and
explain why it is true.
To test the hypothesis, Sam wrote the following code, but isn’t sure it is helpful:
4. Reasoning with Constraints
32 return self.name
33
34 def __repr__(self):
35 return self.name # f"Variable({self.name})"
4.1.2 Constraints
A constraint consists of:
• A tuple (or list) of variables called the scope.
• A condition, a Boolean function that takes the same number of argu-
ments as there are variables in the scope.
• A name (for displaying).
• An optional (x, y) position. If not specified, the mean of the positions of
the variables in the scope is used.
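A minimal sketch of a constraint along these lines is shown below. This is a hypothetical reconstruction for illustration; the book's cspProblem.py defines the actual Constraint class.

```python
import operator

class Constraint(object):
    """a constraint with a scope and a Boolean condition (minimal sketch)"""
    def __init__(self, scope, condition, string=None, position=None):
        self.scope = scope            # tuple or list of variables
        self.condition = condition    # Boolean function on the scope's values
        self.string = string if string is not None else \
            f"{condition.__name__}({self.scope})"
        self.position = position

    def __repr__(self):
        return self.string

    def can_evaluate(self, assignment):
        """assignment gives a value to every variable in the scope"""
        return all(v in assignment for v in self.scope)

    def holds(self, assignment):
        """the condition applied to the values of the scope variables"""
        return self.condition(*tuple(assignment[v] for v in self.scope))

# example: a "less than" constraint over variables named 'X' and 'Y'
lt_xy = Constraint(['X', 'Y'], operator.lt, "X < Y")
```

Here holds can only be called when every variable in the scope is assigned, which is why can_evaluate is checked first in methods such as consistent below.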
4.1.3 CSPs
A constraint satisfaction problem (CSP) requires:
cspProblem.py — (continued)
46 class CSP(object):
47 """A CSP consists of
48 * a title (a string)
49 * variables, a list or set of variables
50 * constraints, a list of constraints
51 * var_to_const, a variable to set of constraints dictionary
52 """
53 def __init__(self, title, variables, constraints):
54 """title is a string
55 variables is set of variables
56 constraints is a list of constraints
57 """
58 self.title = title
59 self.variables = variables
60 self.constraints = constraints
61 self.var_to_const = {var:set() for var in self.variables}
62 for con in constraints:
63 for var in con.scope:
64 self.var_to_const[var].add(con)
65
66 def __str__(self):
67 """string representation of CSP"""
68 return self.title
69
70 def __repr__(self):
71 """more detailed string representation of CSP"""
72 return f"CSP({self.title}, {self.variables}, {([str(c) for c in self.constraints])})"
csp.consistent(assignment) returns true if the assignment is consistent with
each of the constraints in csp (i.e., all of the constraints that can be evaluated
evaluate to true). Unless the assignment assigns to all variables, consistent
does not imply the CSP is consistent or has a solution, because constraints in-
volving variables not in the assignment are ignored.
cspProblem.py — (continued)
74 def consistent(self,assignment):
75 """assignment is a variable:value dictionary
76 returns True if all of the constraints that can be evaluated
77 evaluate to True given assignment.
78 """
79 return all(con.holds(assignment)
80 for con in self.constraints
81 if con.can_evaluate(assignment))
The show method uses matplotlib to show the graphical structure of a con-
straint network. This also includes code used for the consistency GUI (Section
4.4.2).
cspProblem.py — (continued)
136 if domains:
137 node_label = f"{var.name}\n{domains[var]}"
138 else:
139 node_label = var.name
140 node = plt.text(x, y, node_label, bbox=var_bbox, ha='center', va='center',
141 picker=True, fontsize=fontsize)
142 self.nodes[node] = var
143 self.fig.canvas.mpl_connect('pick_event', self.pick_handler)
cspProblem.py — (continued)
4.1.4 Examples
In the following code ne_, when given a number, returns a function that is
true when its argument is not that number. For example, if f=ne_(3), then
f(2) is True and f(3) is False. That is, ne_(x)(y) is true when x ≠ y. Allowing
a function of multiple arguments to use its arguments one at a time is called
currying, after the logician Haskell Curry. Some alternative implementations
are commented out; the uncommented one allows the partial functions to have
names.
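Based on this description, ne_ can be sketched in the same style as the is_ function below. This is a hypothetical reconstruction; cspExamples.py contains the actual definition.

```python
def ne_(val):
    """not equal to val: ne_(val)(x) is true when x != val"""
    # return lambda x: x != val   # alternative that loses the name
    def nev(x):
        return val != x
    nev.__name__ = f"{val} != "   # so constraints using it display nicely
    return nev

f = ne_(3)   # f(2) is True, f(3) is False
```

Naming the partial function (via __name__) is what makes constraints built from it readable when printed.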
23 def is_(val):
24 """is a value"""
25 # return lambda x: x == val # alternative definition
26 # return partial(eq,val) # another alternative definition
27 def isv(x):
28 return val == x
29 isv.__name__ = f"{val} == "
30 return isv
csp0 has variables X, Y and Z, each with domain {1, 2, 3}. The constraints are
X < Y and Y < Z.
cspExamples.py — (continued)
[Figure: the constraint network for csp1 — variables A, B, C with constraints B != 2, A < B, and B < C.]
[Figure: the constraint network for csp2 — variables A, B, C, D, E with constraints A != B, B != 3, A = D, B != D, A != C, E < A, E < B, C < D, E < D, E < C, and C != 2.]
[Figure: the constraint network for csp3 — variables A, B, C, D, E with constraints A != B, A < D, D < C, D != E, and C != E.]
The following example is another scheduling problem (but with multiple an-
swers). This is the same as “scheduling 2” in the original AIspace.org consis-
tency app.
cspExamples.py — (continued)
[Figure: the constraint network for csp4 — variables A, B, C, D with constraints adjacent(A,B), adjacent(B,C), adjacent(C,D), A != C, and B != D.]
75 def adjacent(x,y):
76 """True when x and y are adjacent numbers"""
77 return abs(x-y) == 1
78
79 csp4 = CSP("csp4", {A,B,C,D},
80 [Constraint([A,B], adjacent, "adjacent(A,B)"),
81 Constraint([B,C], adjacent, "adjacent(B,C)"),
82 Constraint([C,D], adjacent, "adjacent(C,D)"),
83 Constraint([A,C], ne, "A != C"),
84 Constraint([B,D], ne, "B != D") ])
The following examples represent the crossword shown in Figure 4.5.
In the first representation, the variables represent words. The constraint
imposed by the crossword is that where two words intersect, the letter at the
intersection must be the same. The method meet_at is used to test whether two
words intersect with the same letter. For example, the constraint meet_at(2,0)
means that the third letter (at position 2) of the first argument is the same as
the first letter of the second argument. This is shown in Figure 4.6.
cspExamples.py — (continued)
86 def meet_at(p1,p2):
87     """returns a function of two words that is true
88     when the words intersect at positions p1, p2.
89     """
90     def meets(w1, w2):
91         return w1[p1] == w2[p2]
92     meets.__name__ = f"meet_at({p1},{p2})"
93     return meets
[Figure 4.5: a crossword puzzle with slots numbered 1 to 4. Words: ant, big, bus, car, has, book, buys, hold, lane, year, ginger, search, symbol, syntax.]
[Figure 4.6: the constraint network for crossword1 — word variables such as three_across, with intersection constraints such as 1a[2]==2d[0] and 3a[0]==1d[2].]
[Figure: the constraint network for crossword1d — letter variables pij with word constraints such as word(p00,p10,p20), word(p00,p01,p02,p03), word(p02,p12,p22,p32), and word(p20,p21,p22,p23,p24,p25).]
123 # pij is the variable representing the letter i from the left and j down (starting from 0)
124 p00 = Variable('p00', letters, position=(0.1,0.85))
125 p10 = Variable('p10', letters, position=(0.3,0.85))
126 p20 = Variable('p20', letters, position=(0.5,0.85))
127 p01 = Variable('p01', letters, position=(0.1,0.7))
128 p21 = Variable('p21', letters, position=(0.5,0.7))
129 p02 = Variable('p02', letters, position=(0.1,0.55))
130 p12 = Variable('p12', letters, position=(0.3,0.55))
131 p22 = Variable('p22', letters, position=(0.5,0.55))
132 p32 = Variable('p32', letters, position=(0.7,0.55))
133 p03 = Variable('p03', letters, position=(0.1,0.4))
134 p23 = Variable('p23', letters, position=(0.5,0.4))
135 p24 = Variable('p24', letters, position=(0.5,0.25))
136 p34 = Variable('p34', letters, position=(0.7,0.25))
137 p44 = Variable('p44', letters, position=(0.9,0.25))
138 p25 = Variable('p25', letters, position=(0.5,0.1))
139
140 crossword1d = CSP("crossword1d",
141 {p00, p10, p20, # first row
142 p01, p21, # second row
Exercise 4.1 How many assignments of a value to each variable are there for
each of the representations of the above crossword? Do you think an exhaustive
enumeration will work for either one?
The queens problem is a puzzle on a chess board, where the idea is to place
a queen on each column so that the queens cannot take each other: no two
queens are on the same row, column or diagonal. The n-queens problem is a
generalization where the board is n × n, and n queens have to be placed.
Here is a representation of the n-queens problem, where the variables are
the columns and the values are the rows in which the queen is placed. The
original queens problem on a standard (8 × 8) chess board is n_queens(8).
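With this representation, the condition between each pair of columns can be sketched as follows (the helper name is hypothetical):

```python
def queens_condition(ci, cj):
    """returns a condition on the rows ri, rj of the queens in columns ci and cj:
    the rows must differ, and the queens must not share a diagonal"""
    def no_take(ri, rj):
        # same diagonal iff the row difference equals the column difference
        return ri != rj and abs(ri - rj) != abs(ci - cj)
    return no_take

ok = queens_condition(0, 2)   # queens in columns 0 and 2
```

A constraint of this form is needed for every pair of distinct columns, which is the count asked about in Exercise 4.2.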
cspExamples.py — (continued)
Exercise 4.2 How many constraints does this representation of the n-queens
problem produce? Can it be done with fewer constraints? Either explain why it
can’t be done with fewer constraints, or give a solution using fewer constraints.
Unit tests
The following defines a unit test for csp solvers, by default using example csp1.
cspExamples.py — (continued)
Exercise 4.3 Modify test so that instead of taking in a list of solutions, it checks
whether the returned solution actually is a solution.
Exercise 4.4 Propose a test that is appropriate for CSPs with no solutions. As-
sume that the test designer knows there are no solutions. Consider what a CSP
solver should return if there are no solutions to the CSP.
Exercise 4.5 Write a unit test that checks whether all solutions (e.g., for the search
algorithms that can return multiple solutions) are correct, and whether all solu-
tions can be found.
Exercise 4.6 Instead of testing all constraints at every node, change it so each
constraint is only tested when all of its variables are assigned. Given an elimina-
tion ordering, it is possible to determine when each constraint needs to be tested.
Implement this. Hint: create a parallel list of sets of constraints, where at each po-
sition i in the list, the constraints at position i can be evaluated when the variable
at position i has been assigned.
Exercise 4.7 Estimate how long dfs_solve_all(crossword1d) will take on your
computer. To do this, reduce the number of variables that need to be assigned,
so that the simplified problem can be solved in a reasonable time (between 0.1
second and 10 seconds). This can be done by reducing the number of variables in
var_order, as the program only splits on these. How much more time will it take
if the number of variables is increased by 1? (Try it!) Then extrapolate to all of the
variables. See Section 1.6.1 for how to time your code. Would making the code 100
times faster or using a computer 100 times faster help?
The next solver constructs a search space that can be solved using the search
methods of the previous chapter. This takes in a CSP problem and an optional
variable ordering, which is a list of the variables in the CSP. In this search space:
• A node is a variable : value dictionary which does not violate any con-
straints (so that dictionaries that violate any constraints are not added).
The neighbors(node) method uses the fact that the length of the node, which
is the number of variables already assigned, is the index of the next variable to
split on. Note that we do not need to check whether there are any more variables
to split on: the nodes are all consistent by construction, so when there are no
more variables to assign we have a solution, and its neighbors are never needed.
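The neighbors logic can be sketched as a self-contained function over a toy CSP (the names here are hypothetical; the actual method in cspSearch.py builds Arc objects for the search space):

```python
def neighbors(node, var_order, domains, consistent):
    """the assignments that extend node with a value for the next variable"""
    var = var_order[len(node)]          # len(node) = number of variables assigned
    return [node | {var: val}           # dict union (Python 3.9+)
            for val in sorted(domains[var])
            if consistent(node | {var: val})]

# a toy CSP with the single constraint X < Y:
domains = {'X': {1, 2, 3}, 'Y': {1, 2, 3}}
def consistent(asst):
    return 'Y' not in asst or asst['X'] < asst['Y']
```

Because each extension is checked for consistency before it is returned, inconsistent dictionaries never enter the search space.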
cspSearch.py — (continued)
48 import cspExamples
49 from searchGeneric import Searcher
50
51 def solver_from_searcher(csp):
52 """depth-first search solver"""
53 path = Searcher(Search_from_CSP(csp)).search()
54 if path is not None:
55 return path.end()
56 else:
57 return None
58
59 if __name__ == "__main__":
60 test_csp(solver_from_searcher)
61
62 ## Test Solving CSPs with Search:
63 searcher1 = Searcher(Search_from_CSP(cspExamples.csp1))
64 #print(searcher1.search()) # get next solution
65 searcher2 = Searcher(Search_from_CSP(cspExamples.csp2))
66 #print(searcher2.search()) # get next solution
67 searcher3 = Searcher(Search_from_CSP(cspExamples.crossword1))
68 #print(searcher3.search()) # get next solution
69 searcher4 = Searcher(Search_from_CSP(cspExamples.crossword1d))
70 #print(searcher4.search()) # get next solution (warning: slow)
Exercise 4.8 What would happen if we constructed the new assignment by as-
signing node[var] = val (with side effects) instead of using dictionary union? Give
an example of where this could give a wrong answer. How could the algorithm be
changed to work with side effects? (Hint: think about what information needs to
be in a node).
Exercise 4.9 Change neighbors so that it returns an iterator of values rather than
a list. (Hint: use yield.)
The following selects an arc. Any element of to_do can be selected. The se-
lected element needs to be removed from to_do. The default implementation
just selects whichever element the pop method for sets returns. The graphical user
interface below allows the user to select an arc. Alternatively, a more sophisti-
cated selection could be employed.
cspConsistency.py — (continued)
The value of new_domain is the subset of the domain of var that is consistent
with the assignment to the other variables. To make it easier to understand, the
following treats unary (with no other variables in the constraint) and binary
(with one other variable in the constraint) constraints as special cases. These
cases are not strictly necessary; the last case covers the first two cases, but is
more difficult to understand without seeing the first two cases. Note that this
case analysis is not in the code distribution, but can replace the assignment to
new_domain above.
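The general case can be sketched as a self-contained function (names are hypothetical; the actual code works on the Con_solver's own data structures):

```python
from itertools import product

def new_domain(var, scope, holds, domains):
    """the values of var supported by some assignment of values to the other
    variables in the constraint's scope; covers the unary case (no other
    variables) and the binary case (one other variable) as well"""
    other_vars = [v for v in scope if v != var]
    return {val for val in domains[var]
            if any(holds({var: val} | dict(zip(other_vars, vals)))  # Python 3.9+
                   for vals in product(*(domains[v] for v in other_vars)))}

# pruning with the constraint X < Y over domains {1,2,3}:
doms = {'X': {1, 2, 3}, 'Y': {1, 2, 3}}
lt = lambda a: a['X'] < a['Y']
```

When the constraint is unary, other_vars is empty and product() yields one empty assignment, so the test reduces to holds({var: val}), matching the first special case described above.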
cspConsistency.py — (continued)
[Figure: the arc-consistency GUI on csp3 — variables with domains {1, 2, 3, 4}, constraints such as A < D, D < C, D != E, C != E, and an Auto AC button.]
Exercise 4.10 Implement solve_all that returns the set of all solutions without
using yield. Hint: it can be like generate_sols but returns a set of solutions; the
recursive calls can be unioned; | is Python’s union.
Exercise 4.11 Implement solve_one that returns one solution if one exists, or False
otherwise, without using yield. Hint: Python's "or" has the behavior that A or B
returns the value of A unless A is falsy (e.g., None or False), in which case the
value of B is returned.
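To illustrate the hint (note that Python's or selects B whenever A is falsy, not only when it is None or False):

```python
assert (None or 5) == 5        # A falsy: the value of B is returned
assert (0 or "x") == "x"       # any falsy A (not only None/False) selects B
assert (False or []) == []     # B is returned even when B is also falsy
assert (3 or 10) == 3          # A truthy: A is returned, B is not evaluated
```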
Unit test:
cspConsistency.py — (continued)
that arc. If the arc selected is not arc consistent, it is made red, the domain is
reduced, and then the arc becomes green. If the arc was already arc consistent
it turns green.
This is implemented by overriding select_arc and select_var to allow the
user to pick the arcs and the variables, and overriding display to allow for the
animation. Note that the first argument of display (the number) in the code
above is interpreted with a special meaning by the GUI and should only be
changed with care.
Clicking AutoAC automates arc selection until the network is arc consistent.
cspConsistencyGUI.py — GUI for consistency-based CSP solving
11 from cspConsistency import Con_solver
12 import matplotlib.pyplot as plt
13
14 class ConsistencyGUI(Con_solver):
15 def __init__(self, csp, fontsize=10, speed=1, **kwargs):
16 """
17 csp is the csp to show
18 fontsize is the size of the text
19 speed is the number of animations per second (controls delay_time)
20 1 (slow) and 4 (fast) seem like good values
21 """
22 self.fontsize = fontsize
23 self.delay_time = 1/speed
24 self.quitting = False
25 Con_solver.__init__(self, csp, **kwargs)
26 csp.show(showAutoAC = True)
27 csp.fig.canvas.mpl_connect('close_event', self.window_closed)
28
29 def go(self):
30 try:
31 res = self.solve_all()
32 self.csp.draw_graph(domains=self.domains,
33 title="No more solutions. GUI finished. ",
34 fontsize=self.fontsize)
35 return res
36 except ExitToPython:
37 print("GUI closed")
38
39 def select_arc(self, to_do):
40 while True:
41 self.csp.draw_graph(domains=self.domains, to_do=to_do,
42 title="click on to_do (blue) arc", fontsize=self.fontsize)
43 self.wait_for_user()
44 if self.csp.autoAC:
45 break
46 picked = self.csp.picked
47 self.csp.picked = None
48 if picked in to_do:
49 to_do.remove(picked)
50 print(f"{picked} picked")
51 return picked
52 else:
53 print(f"{picked} not in to_do. Pick one of {to_do}")
54 if self.csp.autoAC:
55 self.csp.draw_graph(domains=self.domains, to_do=to_do,
56 title="Auto AC", fontsize=self.fontsize)
57 plt.pause(self.delay_time)
58 return to_do.pop()
59
60 def select_var(self, iter_vars):
61 vars = list(iter_vars)
62 while True:
63 self.csp.draw_graph(domains=self.domains,
64 title="Arc consistent. Click node to split",
65 fontsize=self.fontsize)
66 self.csp.autoAC = False
67 self.wait_for_user()
68 picked = self.csp.picked
69 self.csp.picked = None
70 if picked in vars:
71 #print("splitting",picked)
72 return picked
73 else:
74 print(picked,"not in",vars)
75
76 def display(self,n,*args,**nargs):
77 if n <= self.max_display_level: # default display
78 print(*args, **nargs)
79 if n==1: # solution found or no solutions"
80 self.csp.draw_graph(domains=self.domains, to_do=set(),
81 title=' '.join(args)+": click any node or arc to continue",
82 fontsize=self.fontsize)
83 self.csp.autoAC = False
84 self.wait_for_user()
85 self.csp.picked = None
86 elif n==2: # backtracking
87 plt.title("backtracking: click any node or arc to continue")
88 self.csp.autoAC = False
89 self.wait_for_user()
90 self.csp.picked = None
91 elif n==3: # inconsistent arc
92 line = self.csp.thelines[self.arc_selected]
93 line.set_color('red')
94 line.set_linewidth(10)
95 plt.pause(self.delay_time)
96 line.set_color('limegreen')
97 line.set_linewidth(self.csp.linewidth)
98 #elif n==4 and self.add_to_do: # adding to to_do
99 # print("adding to to_do",self.add_to_do) ## highlight these arc
100
101 def wait_for_user(self):
102 while self.csp.picked == None and not self.csp.autoAC and not self.quitting:
103 plt.pause(0.01) # controls reaction time of GUI
104 if self.quitting:
105 raise ExitToPython()
106
107 def window_closed(self, event):
108 self.quitting = True
109
110 class ExitToPython(Exception):
111 pass
112
113 import cspExamples
114 # Try:
115 # ConsistencyGUI(cspExamples.csp1).go()
116 # ConsistencyGUI(cspExamples.csp3).go()
117 # ConsistencyGUI(cspExamples.csp3, speed=4, fontsize=15).go()
118
119 if __name__ == "__main__":
120 print("Try e.g.: ConsistencyGUI(cspExamples.csp3).go()")
cspConsistency.py — (continued)
163
164 def neighbors(self,node):
165 """returns the neighboring nodes of node.
166 """
167 neighs = []
168 var = select(x for x in node if len(node[x])>1)
169 if var:
170 dom1, dom2 = partition_domain(node[var])
171 self.display(2,"Splitting", var, "into", dom1, "and", dom2)
172 to_do = self.cons.new_to_do(var,None)
173 for dom in [dom1,dom2]:
174 newdoms = node | {var:dom}
175 cons_doms = self.cons.make_arc_consistent(newdoms,to_do)
176 if all(len(cons_doms[v])>0 for v in cons_doms):
177 # all domains are non-empty
178 neighs.append(Arc(node,cons_doms))
179 else:
180 self.display(2,"...",var,"in",dom,"has no solution")
181 return neighs
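The partition_domain helper used in the splitting code above can be sketched as follows (a hypothetical reconstruction; cspConsistency.py defines the actual version, and Exercise 4.12 asks about this even-split choice):

```python
def partition_domain(dom):
    """partition the set dom into two disjoint subsets of roughly equal size"""
    split = len(dom) // 2
    dom1 = set(list(dom)[:split])   # an arbitrary half of the elements
    dom2 = dom - dom1               # the remaining elements
    return dom1, dom2

d1, d2 = partition_domain({1, 2, 3, 4})
```

No effort is made to choose which values go in which half; the split is arbitrary, which is exactly what Exercise 4.12 invites you to reconsider.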
Exercise 4.12 When splitting a domain, this code splits the domain into half,
approximately in half (without any effort to make a sensible choice). Does it work
better to split one element from a domain?
Unit test:
cspConsistency.py — (continued)
Testing:
cspConsistency.py — (continued)
195 ## Test Solving CSPs with Arc consistency and domain splitting:
196 #Con_solver.max_display_level = 4 # display details of AC (0 turns off)
197 #Con_solver(cspExamples.csp1).solve_all()
198 #searcher1d = Searcher(Search_with_AC_from_CSP(cspExamples.csp1))
199 #print(searcher1d.search())
200 #Searcher.max_display_level = 2 # display search trace (0 turns off)
201 #searcher2c = Searcher(Search_with_AC_from_CSP(cspExamples.csp2))
202 #print(searcher2c.search())
203 #searcher3c = Searcher(Search_with_AC_from_CSP(cspExamples.crossword1))
204 #print(searcher3c.search())
205 #searcher4c = Searcher(Search_with_AC_from_CSP(cspExamples.crossword1d))
206 #print(searcher4c.search())
The following code implements the two-stage choice (select one of the vari-
ables that are involved in the most constraints that are violated, then a value),
the any-conflict algorithm (select a variable that participates in a violated con-
straint) and a random choice of variable, as well as a probabilistic mix of the
three.
Given a CSP, the stochastic local searcher (SLSearcher) creates the data struc-
tures:
cspSLS.py — (continued)
29 def restart(self):
30 """creates a new total assignment and the conflict set
31 """
32 self.current_assignment = {var:random_choice(var.domain) for
33 var in self.csp.variables}
34 self.display(2,"Initial assignment",self.current_assignment)
35 self.conflicts = set()
36 for con in self.csp.constraints:
37 if not con.holds(self.current_assignment):
38 self.conflicts.add(con)
39 self.display(2,"Number of conflicts",len(self.conflicts))
40 self.variable_pq = None
The search method is the top-level searching algorithm. It can either be used
to start the search or to continue searching. If there is no current assignment,
it must create one. Note that, when counting steps, a restart is counted as one
step, which is not appropriate for CSPs with many variables, as it is a relatively
expensive operation for these cases.
This method selects one of two implementations. The argument prob_best
is the probability of selecting a best variable (one involving the most conflicts).
When the value of prob_best is positive, the algorithm needs to maintain a prior-
ity queue of variables and the number of conflicts (using search_with_var_pq). If
the probability of selecting a best variable is zero, it does not need to maintain
this priority queue (as implemented in search_with_any_conflict).
The argument prob_anycon is the probability that the any-conflict strategy
is used (which selects a variable at random that is in a conflict), assuming that
it is not picking a best variable. Note that for the probability parameters, any
value less that zero acts like probability zero and any value greater than 1 acts
like probability 1. This means that when prob_anycon = 1.0, a best variable is
chosen with probability prob_best, otherwise a variable in any conflict is chosen.
A variable is chosen at random with probability 1 − prob_anycon − prob_best as
long as that is positive.
This returns the number of steps needed to find a solution, or None if no
solution is found. If there is a solution, it is in self.current_assignment.
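The three-way mix described above can be sketched as a small helper (hypothetical name; the actual implementation interleaves this choice with maintaining the conflict set or the priority queue):

```python
import random

def select_strategy(prob_best, prob_anycon):
    """pick which variable-selection strategy to use on this step:
    best with probability prob_best, any-conflict with probability prob_anycon,
    and a random variable with the remaining probability"""
    r = random.random()   # uniform in [0, 1)
    if r < prob_best:
        return "best"             # a variable involved in the most conflicts
    elif r < prob_best + prob_anycon:
        return "any-conflict"     # a random variable in some violated constraint
    else:
        return "random"           # any variable, chosen at random
```

With prob_best + prob_anycon = 1 the purely random choice never fires, matching the description above.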
cspSLS.py — (continued)
Exercise 4.13 This does an initial random assignment but does not do any ran-
dom restarts. Implement a searcher that takes in the maximum number of walk
steps (corresponding to existing max_steps) and the maximum number of restarts,
and returns the total number of steps for the first solution found. (As in search, the
solution found can be extracted from the variable self .current_assignment).
4.5.1 Any-conflict
In the any-conflict heuristic a variable that participates in a violated constraint
is picked at random. The implementation needs to keep track of which variables
are in conflicts. This avoids the priority queue that is needed when the
probability of picking a best variable is greater than zero.
cspSLS.py — (continued)
82 self.current_assignment[var]=val
83 for varcon in self.csp.var_to_const[var]:
84 if varcon.holds(self.current_assignment):
85 if varcon in self.conflicts:
86 self.conflicts.remove(varcon)
87 else:
88 if varcon not in self.conflicts:
89 self.conflicts.add(varcon)
90 self.display(2," Number of conflicts",len(self.conflicts))
91 if not self.conflicts:
92 self.display(1,"Solution found:", self.current_assignment,
93 "in", self.number_of_steps,"steps")
94 return self.number_of_steps
95 self.display(1,"No solution in",self.number_of_steps,"steps",
96 len(self.conflicts),"conflicts remain")
97 return None
Exercise 4.14 This makes no attempt to find the best value for the variable
selected. Modify the code to include an option that selects a value for the selected
variable that reduces the number of conflicts the most. Add a parameter that specifies
the probability that the best value is chosen; otherwise a value is chosen at random.
cspSLS.py — (continued)
Exercise 4.15 These implementations always select a value for the variable se-
lected that is different from its current value (if that is possible). Change the code
so that it does not have this restriction (so it can leave the value the same). Would
you expect this code to be faster? Does it work worse (or better)?
176 http://docs.python.org/3.3/library/heapq.html
177 It could probably be done more efficiently by
178 shuffling the modified element in the heap.
179 """
180 def __init__(self):
181 self.pq = [] # priority queue of [val,rand,elt] triples
182 self.elt_map = {} # map from elt to [val,rand,elt] triple in pq
183 self.REMOVED = "*removed*" # a string that won't be a legal element
184 self.max_size=0
185
186 def add(self,elt,val):
187 """adds elt to the priority queue with priority=val.
188 """
189 assert val <= 0,val
190 assert elt not in self.elt_map, elt
191 new_triple = [val, random.random(),elt]
192 heapq.heappush(self.pq, new_triple)
193 self.elt_map[elt] = new_triple
194
195 def remove(self,elt):
196 """remove the element from the priority queue"""
197 if elt in self.elt_map:
198 self.elt_map[elt][2] = self.REMOVED
199 del self.elt_map[elt]
200
201 def update_each_priority(self,update_dict):
202 """update values in the priority queue by subtracting the values in
203 update_dict from the priority of those elements in priority queue.
204 """
205 for elt,incr in update_dict.items():
206 if incr != 0:
207 newval = self.elt_map.get(elt,[0])[0] - incr
208 assert newval <= 0, f"{elt}:{newval+incr}-{incr}"
209 self.remove(elt)
210 if newval != 0:
211 self.add(elt,newval)
212
213 def pop(self):
214 """Removes and returns the (elt,value) pair with minimal value.
215 If the priority queue is empty, IndexError is raised.
216 """
217 self.max_size = max(self.max_size, len(self.pq)) # keep statistics
218 triple = heapq.heappop(self.pq)
219 while triple[2] == self.REMOVED:
220 triple = heapq.heappop(self.pq)
221 del self.elt_map[triple[2]]
222 return triple[2], triple[0] # elt, value
223
224 def top(self):
225 """Returns the (elt,value) pair with minimal value, without
removing it.
226 If the priority queue is empty, IndexError is raised.
227 """
228 self.max_size = max(self.max_size, len(self.pq)) # keep statistics
229 triple = self.pq[0]
230 while triple[2] == self.REMOVED:
231 heapq.heappop(self.pq)
232 triple = self.pq[0]
233 return triple[2], triple[0] # elt, value
234
235 def empty(self):
236 """returns True iff the priority queue is empty"""
237 return all(triple[2] == self.REMOVED for triple in self.pq)
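The class above relies on lazy deletion: since heapq cannot efficiently remove an arbitrary element, removed entries are marked in place and skipped when popped. The idiom in isolation, as a simplified sketch without the random tie-breaker:

```python
import heapq

REMOVED = "*removed*"   # marker that will never be a legal element
pq, elt_map = [], {}

def add(elt, val):
    pair = [val, elt]
    heapq.heappush(pq, pair)
    elt_map[elt] = pair

def remove(elt):
    elt_map.pop(elt)[1] = REMOVED   # mark it; the heap entry stays behind

def pop():
    while True:
        val, elt = heapq.heappop(pq)
        if elt != REMOVED:          # skip entries that were marked removed
            del elt_map[elt]
            return elt, val

add('A', -3)
add('B', -1)
remove('A')
result = pop()   # the marked entry for 'A' is skipped, giving ('B', -1)
```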
[Figure 4.9: run-time distributions; x-axis: Number of Steps (log scale, 10^0 to 10^3); y-axis: number of runs (0 to 600)]
262 stats.append(num_steps)
263 stats.sort()
264 if prob_best >= 1.0:
265 label = "P(best)=1.0"
266 else:
267 p_ac = min(prob_anycon, 1-prob_best)
268 label = "P(best)=%.2f, P(ac)=%.2f" % (prob_best, p_ac)
269 plt.plot(stats,range(len(stats)),label=label)
270 plt.legend(loc="upper left")
271 SLSearcher.max_display_level= temp_mdl #restore display
Figure 4.9 gives run-time distributions for 3 algorithms. It is also useful to
compare the distributions of different runs of the same algorithms and settings.
4.5.5 Testing
cspSLS.py — (continued)
18 """
19 def __init__(self, scope, function, string=None, position=None):
20 Constraint.__init__(self, scope, function, string, position)
21
22 def value(self,assignment):
23 return self.holds(assignment)
cspSoft.py — (continued)
cspSoft.py — (continued)
Exercise 4.17 What happens if some costs are negative? (Does it still work?)
What if a value is added to all costs: does it change the optimum value, and does
it affect efficiency? Make the algorithm work so that negative costs can be in the
constraints. [Hint: make the smallest value be zero.]
Exercise 4.18 Change the stochastic-local search algorithms to work for soft con-
straints. Hint: Instead of the number of constraints violated, consider how much a
change in a variable affects the objective function. Instead of returning a solution,
return the best assignment found.
27 class Askable(object):
28 """An askable atom"""
29
110 5. Propositions and Inference
30 def __init__(self,atom):
31 """atom that can be asked about"""
32 self.atom=atom
33
34 def __str__(self):
35 """returns the string representation of a clause."""
36 return f"askable {self.atom}."
37
38 def yes(ans):
39 """returns true if the answer is yes in some form"""
40 return ans.lower() in ['yes', 'oui', 'y'] # bilingual
A knowledge base is a list of clauses and askables. To make top-down inference
faster, this creates an atom_to_clause dictionary that maps each atom into the
set of clauses with that atom in the head.
logicProblem.py — (continued)
Here is a trivial example (I think therefore I am) used in the unit tests:
logicProblem.py — (continued)
74 triv_KB = KB([
75 Clause('i_am', ['i_think']),
76 Clause('i_think'),
77 Clause('i_smell', ['i_exist'])
78 ])
Here is a representation of the electrical domain of the textbook:
logicProblem.py — (continued)
80 elect = KB([
81 Clause('light_l1'),
82 Clause('light_l2'),
83 Clause('ok_l1'),
84 Clause('ok_l2'),
85 Clause('ok_cb1'),
86 Clause('ok_cb2'),
87 Clause('live_outside'),
88 Clause('live_l1', ['live_w0']),
89 Clause('live_w0', ['up_s2','live_w1']),
90 Clause('live_w0', ['down_s2','live_w2']),
91 Clause('live_w1', ['up_s1', 'live_w3']),
92 Clause('live_w2', ['down_s1','live_w3' ]),
93 Clause('live_l2', ['live_w4']),
94 Clause('live_w4', ['up_s3','live_w3' ]),
95 Clause('live_p_1', ['live_w3']),
96 Clause('live_w3', ['live_w5', 'ok_cb1']),
97 Clause('live_p_2', ['live_w6']),
98 Clause('live_w6', ['live_w5', 'ok_cb2']),
99 Clause('live_w5', ['live_outside']),
100 Clause('lit_l1', ['light_l1', 'live_l1', 'ok_l1']),
101 Clause('lit_l2', ['light_l2', 'live_l2', 'ok_l2']),
102 Askable('up_s1'),
103 Askable('down_s1'),
104 Askable('up_s2'),
105 Askable('down_s2'),
106 Askable('up_s3'),
107 Askable('down_s3')
108 ])
109
110 # print(kb)
The following knowledge base is false in the intended interpretation. One of
the clauses is wrong; can you see which one? We will show how to debug it.
logicProblem.py — (continued)
115 Clause('ok_cb1'),
116 Clause('ok_cb2'),
117 Clause('live_outside'),
118 Clause('live_p_2', ['live_w6']),
119 Clause('live_w6', ['live_w5', 'ok_cb2']),
120 Clause('light_l1'),
121 Clause('live_w5', ['live_outside']),
122 Clause('lit_l1', ['light_l1', 'live_l1', 'ok_l1']),
123 Clause('lit_l2', ['light_l2', 'live_l2', 'ok_l2']),
124 Clause('live_l1', ['live_w0']),
125 Clause('live_w0', ['up_s2','live_w1']),
126 Clause('live_w0', ['down_s2','live_w2']),
127 Clause('live_w1', ['up_s3', 'live_w3']),
128 Clause('live_w2', ['down_s1','live_w3' ]),
129 Clause('live_l2', ['live_w4']),
130 Clause('live_w4', ['up_s3','live_w3' ]),
131 Clause('live_p_1', ['live_w3']),
132 Clause('live_w3', ['live_w5', 'ok_cb1']),
133 Askable('up_s1'),
134 Askable('down_s1'),
135 Askable('up_s2'),
136 Clause('light_l2'),
137 Clause('ok_l1'),
138 Clause('ok_l2'),
152 Askable('down_s2'),
153 Askable('up_s3'),
154 Askable('down_s3')
155 ])
156
157 # print(kb)
Exercise 5.1 It is not very user-friendly to ask all of the askables up-front. Imple-
ment ask-the-user so that questions are only asked if useful, and are not re-asked.
For example, if there is a clause h ← a ∧ b ∧ c ∧ d ∧ e, where c and e are askable, c
and e only need to be asked if a, b, d are all in fp and they have not been asked be-
fore. Askable e only needs to be asked if the user says “yes” to c. Askable c doesn’t
need to be asked if the user previously replied “no” to e, unless it is needed for
some other clause.
This form of ask-the-user can ask a different set of questions than the top-
down interpreter that asks questions when encountered. Give an example where
they ask different questions (neither set of questions asked is a subset of the other).
Exercise 5.2 This algorithm runs in time O(n²), where n is the number of clauses,
for a bounded number of elements in the body; each iteration goes through each
of the clauses, and in the worst case, it will do an iteration for each clause. It is
possible to implement this in O(n) time by creating an index that maps an
atom to the set of clauses with that atom in the body. Implement this. What is its
complexity as a function of n and b, the maximum number of atoms in the body of
a clause?
Exercise 5.3 It is possible to be more efficient (in terms of the number of elements
in a body) than the method in the previous question by noticing that each element
of the body of a clause only needs to be checked once. For example, the clause
a ← b ∧ c ∧ d, needs only be considered when b is added to fp. Once b is added
to fp, if c is already in fp, we know that a can be added as soon as d is added.
Implement this. What is its complexity as a function of n and b, the maximum
number of atoms in the body of a clause?
The following provides a simple unit test that is hard wired for triv_KB:
logicTopDown.py — (continued)
Exercise 5.4 This code can re-ask a question multiple times. Implement this code
so that it only asks a question once and remembers the answer. Also implement
a function to forget the answers, which is useful if someone gives an incorrect
response.
Exercise 5.5 What search method is this using? Implement the search interface
so that it can use A∗ or other searching methods. Define an admissible heuristic
that is not always 0.
30
31 def prove_body(kb, ans_body, indent=""):
32 """returns proof tree if kb |- ans_body or "fail" if there is no proof
33 ans_body is a list of atoms in a body to be proved
34 """
35 proofs = []
36 for atom in ans_body:
37 proof_at = prove_atom(kb, atom, indent+" ")
38 if proof_at == "fail":
39 return "fail" # fail if any proof fails
40 else:
41 proofs.append(proof_at)
42 return proofs
The following provides a simple unit test that is hard wired for triv_KB:
logicExplain.py — (continued)
logicExplain.py — (continued)
0 : down_s1
1 : live_w3
logicExplain: quit
>>>
Exercise 5.6 The above code only ever explores one proof – the first proof found.
Change the code to enumerate the proof trees (by returning a list of all proof trees,
or, preferably, using yield). Add the command "retry" to the user interface to try
another proof.
5.5 Assumables
Atom a can be made assumable by including Assumable(a) in the knowledge
base. A knowledge base that can include assumables is declared with KBA.
logicAssumables.py — Definite clauses with assumables
11 from logicProblem import Clause, Askable, KB, yes
12
13 class Assumable(object):
14 """An assumable atom"""
15
16 def __init__(self,atom):
17 """atom that can be assumed"""
18 self.atom = atom
19
20 def __str__(self):
21 """returns the string representation of a clause.
22 """
23 return "assumable " + self.atom + "."
24
25 class KBA(KB):
26 """A knowledge base that can include assumables"""
27 def __init__(self,statements):
28 self.assumables = [c.atom for c in statements if isinstance(c,
Assumable)]
29 KB.__init__(self,statements)
The top-down Horn clause interpreter, prove_all_ass returns a list of the sets
of assumables that imply ans_body. This list will contain all of the minimal sets
of assumables, but can also find non-minimal sets, and repeated sets, if they
can be generated with separate proofs. The set assumed is the set of assumables
already assumed.
logicAssumables.py — (continued)
36 """
37 if ans_body:
38 selected = ans_body[0] # select first atom from ans_body
39 if selected in self.askables:
40 if yes(input("Is "+selected+" true? ")):
41 return self.prove_all_ass(ans_body[1:],assumed)
42 else:
43 return [] # no answers
44 elif selected in self.assumables:
45 return self.prove_all_ass(ans_body[1:],assumed|{selected})
46 else:
47 return [ass
48 for cl in self.clauses_for_atom(selected)
49 for ass in
self.prove_all_ass(cl.body+ans_body[1:],assumed)
50 ] # union of answers for each clause with
head=selected
51 else: # empty body
52 return [assumed] # one answer
53
54 def conflicts(self):
55 """returns a list of minimal conflicts"""
56 return minsets(self.prove_all_ass(['false']))
Given a list of sets, minsets returns a list of the minimal sets in the list. For
example, minsets([{2, 3, 4}, {2, 3}, {6, 2, 3}, {2, 3}, {2, 4, 5}]) returns [{2, 3}, {2, 4, 5}].
logicAssumables.py — (continued)
58 def minsets(ls):
59 """ls is a list of sets
60 returns a list of minimal sets in ls
61 """
62 ans = [] # elements known to be minimal
63 for c in ls:
64 if not any(c1<c for c1 in ls) and not any(c1 <= c for c1 in ans):
65 ans.append(c)
66 return ans
67
68 # minsets([{2, 3, 4}, {2, 3}, {6, 2, 3}, {2, 3}, {2, 4, 5}])
Warning: minsets works for a list of sets or for a set of (frozen) sets, but it does
not work for a generator of sets, because ls is traversed more than once and a
generator can only be traversed once.
For example, try to predict and then test:
minsets(e for e in [{2, 3, 4}, {2, 3}, {6, 2, 3}, {2, 3}, {2, 4, 5}])
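Inside minsets, the test any(c1 < c for c1 in ls) re-traverses ls for every element c, and a generator cannot be traversed twice:

```python
# a generator over the same example sets used in the text:
ls = (e for e in [{2, 3, 4}, {2, 3}, {6, 2, 3}])
first = [c for c in ls]    # the first traversal consumes the generator
second = [c for c in ls]   # a second traversal yields nothing
```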
The diagnoses can be constructed from the (minimal) conflicts as follows.
This also works if there are non-minimal conflicts, but is not as efficient.
logicAssumables.py — (continued)
69 def diagnoses(cons):
70 """cons is a list of (minimal) conflicts.
Test cases:
logicAssumables.py — (continued)
80 electa = KBA([
81 Clause('light_l1'),
82 Clause('light_l2'),
83 Assumable('ok_l1'),
84 Assumable('ok_l2'),
85 Assumable('ok_s1'),
86 Assumable('ok_s2'),
87 Assumable('ok_s3'),
88 Assumable('ok_cb1'),
89 Assumable('ok_cb2'),
90 Assumable('live_outside'),
91 Clause('live_l1', ['live_w0']),
92 Clause('live_w0', ['up_s2','ok_s2','live_w1']),
93 Clause('live_w0', ['down_s2','ok_s2','live_w2']),
94 Clause('live_w1', ['up_s1', 'ok_s1', 'live_w3']),
95 Clause('live_w2', ['down_s1', 'ok_s1','live_w3' ]),
96 Clause('live_l2', ['live_w4']),
97 Clause('live_w4', ['up_s3','ok_s3','live_w3' ]),
98 Clause('live_p_1', ['live_w3']),
99 Clause('live_w3', ['live_w5', 'ok_cb1']),
100 Clause('live_p_2', ['live_w6']),
101 Clause('live_w6', ['live_w5', 'ok_cb2']),
102 Clause('live_w5', ['live_outside']),
103 Clause('lit_l1', ['light_l1', 'live_l1', 'ok_l1']),
104 Clause('lit_l2', ['light_l2', 'live_l2', 'ok_l2']),
105 Askable('up_s1'),
106 Askable('down_s1'),
107 Askable('up_s2'),
108 Askable('down_s2'),
109 Askable('up_s3'),
110 Askable('down_s3'),
111 Askable('dark_l1'),
112 Askable('dark_l2'),
113 Clause('false', ['dark_l1', 'lit_l1']),
114 Clause('false', ['dark_l2', 'lit_l2'])
115 ])
116 # electa.prove_all_ass(['false'])
117 # cs=electa.conflicts()
118 # print(cs)
119 # diagnoses(cs) # diagnoses from conflicts
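The listing for diagnoses is elided above. As a hedged, self-contained sketch of the standard construction (not necessarily the book's code): a diagnosis is a minimal hitting set of the conflicts, so one can cross each conflict with the partial diagnoses built so far and keep only the minimal results. The helper minsets is restated locally so the sketch runs on its own:

```python
def minsets(ls):
    """returns a list of the minimal sets in ls (as in logicAssumables.py)"""
    ans = []
    for c in ls:
        if not any(c1 < c for c1 in ls) and not any(c1 <= c for c1 in ans):
            ans.append(c)
    return ans

def diagnoses_sketch(cons):
    """cons is a list of conflicts (sets of assumables).
    Returns the minimal hitting sets: each diagnosis intersects every
    conflict. Pruning non-minimal sets early is safe, because a superset
    stays a superset after every extension."""
    diags = [set()]
    for conf in cons:
        diags = minsets([d | {e} for d in diags for e in conf])
    return diags

# hypothetical conflicts: {2} hits both; {1, 3} hits both; {1} hits only one
result = diagnoses_sketch([{1, 2}, {2, 3}])
```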
5.6 Negation-as-failure
The negation of an atom a is written as Not(a) in a body.
Prove with negation-as-failure (prove_naf) is like prove, but with the extra case
to cover Not:
logicNegation.py — (continued)
37 kb.display(2,indent,f"{selected.atom()} fails so
Not({selected.atom()}) succeeds")
38 return prove_naf(kb, ans_body[1:],indent+" ")
39 if selected in kb.askables:
40 return (yes(input("Is "+selected+" true? "))
41 and prove_naf(kb,ans_body[1:],indent+" "))
42 else:
43 return any(prove_naf(kb,cl.body+ans_body[1:],indent+" ")
44 for cl in kb.clauses_for_atom(selected))
45 else:
46 return True # empty body is true
Test cases:
logicNegation.py — (continued)
48 triv_KB_naf = KB([
49 Clause('i_am', ['i_think']),
50 Clause('i_think'),
51 Clause('i_smell', ['i_am', Not('dead')]),
52 Clause('i_bad', ['i_am', Not('i_think')])
53 ])
54
55 triv_KB_naf.max_display_level = 4
56 def test():
57 a1 = prove_naf(triv_KB_naf,['i_smell'])
58 assert a1, f"triv_KB_naf failed to prove i_smell; gave {a1}"
59 a2 = prove_naf(triv_KB_naf,['i_bad'])
60 assert not a2, f"triv_KB_naf wrongly proved i_bad; gave {a2}"
61 print("Passed unit tests")
62 if __name__ == "__main__":
63 test()
Default reasoning about beaches at resorts (Example 5.28 of Poole and Mack-
worth [2023]):
logicNegation.py — (continued)
65 beach_KB = KB([
66 Clause('away_from_beach', [Not('on_beach')]),
67 Clause('beach_access', ['on_beach', Not('ab_beach_access')]),
68 Clause('swim_at_beach', ['beach_access', Not('ab_swim_at_beach')]),
69 Clause('ab_swim_at_beach', ['enclosed_bay', 'big_city',
Not('ab_no_swimming_near_city')]),
70 Clause('ab_no_swimming_near_city', ['in_BC', Not('ab_BC_beaches')])
71 ])
72
73 # prove_naf(beach_KB, ['away_from_beach'])
74 # prove_naf(beach_KB, ['beach_access'])
75 # beach_KB.add_clause(Clause('on_beach',[]))
76 # prove_naf(beach_KB, ['away_from_beach'])
77 # prove_naf(beach_KB, ['swim_at_beach'])
78 # beach_KB.add_clause(Clause('enclosed_bay',[]))
79 # prove_naf(beach_KB, ['swim_at_beach'])
80 # beach_KB.add_clause(Clause('big_city',[]))
81 # prove_naf(beach_KB, ['swim_at_beach'])
82 # beach_KB.add_clause(Clause('in_BC',[]))
83 # prove_naf(beach_KB, ['swim_at_beach'])
Deterministic Planning
• effects: a dictionary of feature:value pairs that are made true by this action.
In particular, a feature in the dictionary has the corresponding value (and
not its previous value) after the action, and a feature not in the dictionary
keeps its old value.
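The effect of an action, as described by the effects dictionary, amounts to a dictionary update. The helper below is a hypothetical illustration, not part of stripsProblem.py:

```python
def apply_effects(state, effects):
    """Features in effects take their new value; every other feature
    keeps its old value."""
    new_state = dict(state)   # copy, so the old state is unchanged
    new_state.update(effects)
    return new_state

state = {'RLoc': 'cs', 'RHC': False, 'SWC': True}
result = apply_effects(state, {'RHC': True})   # e.g., picking up coffee
# result == {'RLoc': 'cs', 'RHC': True, 'SWC': True}
```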
stripsProblem.py — (continued)
31 class STRIPS_domain(object):
32 def __init__(self, feature_domain_dict, actions):
33 """Problem domain
34 feature_domain_dict is a feature:domain dictionary,
35 mapping each feature to its domain
36 actions
37 """
38 self.feature_domain_dict = feature_domain_dict
39 self.actions = actions
stripsProblem.py — (continued)
41 class Planning_problem(object):
42 def __init__(self, prob_domain, initial_state, goal):
43 """
44 a planning problem consists of
45 * a planning domain
46 * the initial state
47 * a goal
48 """
49 self.prob_domain = prob_domain
50 self.initial_state = initial_state
51 self.goal = goal
[Figure: the delivery robot domain, with locations coffee shop (cs), Sam's office (off), mail room (mr), and lab (lab)]
stripsProblem.py — (continued)
stripsProblem.py — (continued)
[Figure: blocks-world actions; move(b,c,a) moves b from c onto a, and move(b,c,table) moves b from c onto the table]
71 problem0 = Planning_problem(delivery_domain,
72 {'RLoc':'lab', 'MW':True, 'SWC':True, 'RHC':False,
73 'RHM':False},
74 {'RLoc':'off'})
75 problem1 = Planning_problem(delivery_domain,
76 {'RLoc':'lab', 'MW':True, 'SWC':True, 'RHC':False,
77 'RHM':False},
78 {'SWC':False})
79 problem2 = Planning_problem(delivery_domain,
80 {'RLoc':'lab', 'MW':True, 'SWC':True, 'RHC':False,
81 'RHM':False},
82 {'SWC':False, 'MW':False, 'RHM':False})
27 def zero(*args,**nargs):
28 """always returns 0"""
29 return 0
30
31 class Forward_STRIPS(Search_problem):
32 """A search problem from a planning problem where:
33 * a node is a state
34 * the dynamics are specified by the STRIPS representation of actions
35 """
36 def __init__(self, planning_problem, heur=zero):
37 """creates a forward search space from a planning problem.
38 heur(state,goal) is a heuristic function,
39 an underestimate of the cost from state to goal, where
40 both state and goals are feature:value dictionaries.
41 """
42 self.prob_domain = planning_problem.prob_domain
43 self.initial_state = State(planning_problem.initial_state)
44 self.goal = planning_problem.goal
45 self.heur = heur
46
47 def is_goal(self, state):
48 """is True if node is a goal.
49
50 Every goal feature has the same value in the state and the goal."""
51 return all(state.assignment[prop]==self.goal[prop]
stripsForwardPlanner.py — (continued)
21 def h1(state,goal):
22 """ the distance to the goal location, if there is one"""
23 if 'RLoc' in goal:
24 return dist(state['RLoc'], goal['RLoc'])
25 else:
26 return 0
27
28 def h2(state,goal):
29 """ the distance to the coffee shop plus getting coffee and delivering
it
30 if the robot needs to get coffee
31 """
32 if ('SWC' in goal and goal['SWC']==False
33 and state['SWC']==True
34 and state['RHC']==False):
35 return dist(state['RLoc'],'cs')+3
36 else:
37 return 0
The maximum of the values of a set of admissible heuristics is also an admis-
sible heuristic. The function maxh takes a number of heuristic functions as ar-
guments, and returns a new heuristic function that takes the maximum of the
values of the heuristics. For example, h1 and h2 are heuristic functions and so
maxh(h1,h2) is also. maxh can take an arbitrary number of arguments.
stripsHeuristic.py — (continued)
39 def maxh(*heuristics):
40 """Returns a new heuristic function that is the maximum of the
functions in heuristics.
41 heuristics is the list of arguments which must be heuristic functions.
42 """
43 # return lambda state,goal: max(h(state,goal) for h in heuristics)
44 def newh(state,goal):
45 return max(h(state,goal) for h in heuristics)
46 return newh
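A self-contained illustration of the pattern, with two toy heuristics over feature dictionaries. These are hypothetical stand-ins for h1 and h2; whether they are admissible depends on the action costs:

```python
def maxh(*heuristics):
    """same pattern as the code above: the pointwise maximum"""
    def newh(state, goal):
        return max(h(state, goal) for h in heuristics)
    return newh

def coffee_heur(state, goal):
    """3 if Sam wants coffee in the goal and still wants it in the state"""
    return 3 if goal.get('SWC') is False and state.get('SWC') else 0

def mismatch_heur(state, goal):
    """number of goal features whose value differs in the state"""
    return sum(1 for k in goal if state.get(k) != goal[k])

h = maxh(coffee_heur, mismatch_heur)
r1 = h({'SWC': True, 'MW': True}, {'SWC': False})                 # max(3, 1)
r2 = h({'SWC': False, 'MW': True}, {'SWC': False, 'MW': False})   # max(0, 1)
```

Each toy heuristic dominates on one of the two calls, which is exactly the situation where taking the maximum helps.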
The following runs the example with and without the heuristic.
stripsHeuristic.py — (continued)
Exercise 6.4 For more than one start-state/goal combination, test the forward
planner with a heuristic function of just h1, with just h2 and with both. Explain
why each one prunes or doesn’t prune the search space.
Exercise 6.5 Create a better heuristic than maxh(h1,h2). Try it for a number of
different problems. In particular, try and include the following costs:
i) h3 is like h2 but also takes into account the case when RLoc is in goal.
ii) h4 uses the distance to the mail room plus getting mail and delivering it if
the robot needs to deliver mail.
iii) h5 is for getting mail when goal is for the robot to have mail, and then getting
to the goal destination (if there is one).
43 self.heur = heur
44
45 def is_goal(self, subgoal):
46 """if subgoal is true in the initial state, a path has been found"""
47 goal_asst = subgoal.assignment
48 return all(self.initial_state[g]==goal_asst[g]
49 for g in goal_asst)
50
51 def start_node(self):
52 """the start node is the top-level goal"""
53 return self.top_goal
54
55 def neighbors(self,subgoal):
56 """returns a list of the arcs for the neighbors of subgoal in this
problem"""
57 goal_asst = subgoal.assignment
58 return [ Arc(subgoal, self.weakest_precond(act,goal_asst),
act.cost, act)
59 for act in self.prob_domain.actions
60 if self.possible(act,goal_asst)]
61
62 def possible(self,act,goal_asst):
63 """True if act is possible to achieve goal_asst.
64
65 the action achieves an element of the effects and
66 the action doesn't delete something that needs to be achieved and
67 the preconditions are consistent with other subgoals that need to
be achieved
68 """
69 return ( any(goal_asst[prop] == act.effects[prop]
70 for prop in act.effects if prop in goal_asst)
71 and all(goal_asst[prop] == act.effects[prop]
72 for prop in act.effects if prop in goal_asst)
73 and all(goal_asst[prop]== act.preconds[prop]
74 for prop in act.preconds if prop not in act.effects
and prop in goal_asst)
75 )
76
77 def weakest_precond(self,act,goal_asst):
78 """returns the subgoal that must be true so goal_asst holds after
act
79 should be: act.preconds | (goal_asst - act.effects)
80 """
81 new_asst = act.preconds.copy()
82 for g in goal_asst:
83 if g not in act.effects:
84 new_asst[g] = goal_asst[g]
85 return Subgoal(new_asst)
86
87 def heuristic(self,subgoal):
stripsRegressionPlanner.py — (continued)
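To see the set expression act.preconds | (goal_asst - act.effects) concretely, here are hypothetical delivery-robot values: regressing a goal through an action that achieves RHC keeps the unachieved subgoal SWC and adds the action's preconditions.

```python
# hypothetical action: pick up coffee at the coffee shop
preconds  = {'RLoc': 'cs', 'RHC': False}
effects   = {'RHC': True}
goal_asst = {'RHC': True, 'SWC': True}

# same logic as weakest_precond above
new_asst = preconds.copy()
for g in goal_asst:
    if g not in effects:          # keep subgoals the action does not achieve
        new_asst[g] = goal_asst[g]
result = new_asst
# result == {'RLoc': 'cs', 'RHC': False, 'SWC': True}
```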
Exercise 6.7 Multiple path pruning could be used to prune more than the current
node. In particular, if the current node contains more conditions than a previously
visited node, it can be pruned. For example, if {a:True, b:False} has been visited,
then any node that is a superset, e.g., {a:True, b:False, d:True}, need not be
expanded. If the simpler subgoal does not lead to a solution, the more complicated
one will not either. Implement this more severe pruning. (Hint: This may require
modifications to the searcher.)
Exercise 6.8 It is possible to know, from knowledge of the domain, that some
assignments of values to features can never be achieved. For example, the robot
cannot be holding mail when there is mail waiting (assuming it isn't holding
mail initially). An assignment of values to (some of the) features is incompatible
if no possible (reachable) state can include that assignment. For example,
{'MW':True, 'RHM':True} is an incompatible assignment. This information may
be useful for a planner; there is no point in trying to achieve these
together. Define a subclass of STRIPS_domain that can accept a list of incompatible
assignments. Modify the regression planner code to use such a list of incompatible
assignments. Give an example where the search space is smaller.
Exercise 6.9 After completing the previous exercise, design incompatible assign-
ments for the blocks world. (This can result in dramatic search improvements.)
stripsHeuristic.py — (continued)
Exercise 6.10 Try the regression planner with a heuristic function of just h1 and
with just h2 (defined in Section 6.2.1). Explain how each one prunes or doesn’t
prune the search space.
Exercise 6.11 Create a heuristic that is better for regression planning than heuristic_fun
defined in Section 6.2.1.
The CSP planner assumes there is a single action at each step. This creates a
CSP that can use any of the CSP algorithms to solve (e.g., stochastic local search
or arc consistency with domain splitting).
It uses the same action representation as before; it does not consider fac-
tored actions (action features), or implement state constraints.
stripsCSPPlanner.py — CSP planner where actions are represented using STRIPS
11 from cspProblem import Variable, CSP, Constraint
12
13 class CSP_from_STRIPS(CSP):
14 """A CSP where:
15 * CSP variables are constructed for each feature and time, and each
action and time
16 * the dynamics are specified by the STRIPS representation of actions
17 """
18
19 def __init__(self, planning_problem, number_stages=2):
20 prob_domain = planning_problem.prob_domain
21 initial_state = planning_problem.initial_state
22 goal = planning_problem.goal
23 # self.action_vars[t] is the action variable for time t
prob_domain.feature_domain_dict
66 for t in range(number_stages+1)}
67 CSP.__init__(self, "CSP_from_Strips", variables, constraints)
68
69 def extract_plan(self,soln):
70 return [soln[a] for a in self.action_vars]
The following methods return methods which can be applied to the particular
environment.
For example, is_(3) returns a function that when applied to 3, returns True
and when applied to any other value returns False. So is_(3)(3) returns True
and is_(3)(7) returns False.
Note that the underscore (’_’) is part of the name; we use the convention
that a function with name ending in underscore returns a function. Com-
mented out is an alternative style to define is_ and if_; returning a function
defined by lambda is equivalent to returning the embedded function, except
that the embedded function has a name. The embedded function can also be
given a docstring.
stripsCSPPlanner.py — (continued)
72 def is_(val):
73 """returns a function that is true when applied to val.
74 """
75 #return lambda x: x == val
76 def is_fun(x):
77 return x == val
78 is_fun.__name__ = f"value_is_{val}"
79 return is_fun
80
81 def if_(v1,v2):
82 """if the second argument is v2, the first argument must be v1"""
83 #return lambda x1,x2: x1==v1 if x2==v2 else True
84 def if_fun(x1,x2):
85 return x1==v1 if x2==v2 else True
86 if_fun.__name__ = f"if x2 is {v2} then x1 is {v1}"
87 return if_fun
88
89 def eq_if_not_in_(actset):
90 """first and third arguments are equal if action is not in actset"""
91 # return lambda x1, a, x2: x1==x2 if a not in actset else True
92 def eq_if_not_fun(x1, a, x2):
93 return x1==x2 if a not in actset else True
94 eq_if_not_fun.__name__ = f"first and third arguments are equal if
action is not in {actset}"
95 return eq_if_not_fun
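Using is_ as in the text, with its definition restated so the example runs on its own:

```python
def is_(val):
    """same definition as above: returns a function true exactly on val"""
    def is_fun(x):
        return x == val
    is_fun.__name__ = f"value_is_{val}"
    return is_fun

at3 = is_(3)
r1 = at3(3)            # True
r2 = at3(7)            # False
name = at3.__name__    # "value_is_3", useful when displaying constraints
```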
Putting it together, this returns a list of actions that solves the problem for
a given horizon. If you want to do more than just return the list of actions, you
might want to get it to return the solution. Or even enumerate the solutions
(by using Search_with_AC_from_CSP).
stripsCSPPlanner.py — (continued)
97 def con_plan(prob,horizon):
98 """finds a plan for problem prob given horizon.
99 """
100 csp = CSP_from_STRIPS(prob, horizon)
101 sol = Con_solver(csp).solve_one()
102 return csp.extract_plan(sol) if sol else sol
stripsCSPPlanner.py — (continued)
• agenda: a list of (s, a) pairs, where s is a (var, val) pair and a is an action
instance. This means that variable var must have value val before a can
occur.
• causal_links: a set of (a0, g, a1) triples, where a0 and a1 are action instances
and g is a (var, val) pair. This holds when action a0 makes g true for action
a1.
stripsPOP.py — (continued)
28 class POP_node(object):
29 """a (partial) partial-order plan. This is a node in the search
space."""
30 def __init__(self, actions, constraints, agenda, causal_links):
31 """
32 * actions is a set of action instances
33 * constraints a set of (a0,a1) pairs, representing a0<a1,
34 closed under transitivity
35 * agenda list of (subgoal,action) pairs to be achieved, where
36 subgoal is a (variable,value) pair
37 * causal_links is a set of (a0,g,a1) triples,
38 where ai are action instances, and g is a (variable,value) pair
39 """
40 self.actions = actions # a set of action instances
41 self.constraints = constraints # a set of (a0,a1) pairs
42 self.agenda = agenda # list of (subgoal,action) pairs to be
achieved
43 self.causal_links = causal_links # set of (a0,g,a1) triples
44
45 def __str__(self):
46 return ("actions: "+str({str(a) for a in self.actions})+
47 "\nconstraints: "+
48 str({(str(a1),str(a2)) for (a1,a2) in self.constraints})+
49 "\nagenda: "+
50 str([(str(s),str(a)) for (s,a) in self.agenda])+
51 "\ncausal_links:"+
52 str({(str(a0),str(g),str(a2)) for (a0,g,a2) in
self.causal_links}) )
extract_plan constructs a total order of action instances that is consistent
with the partial order.
stripsPOP.py — (continued)
54 def extract_plan(self):
55 """returns a total ordering of the action instances consistent
56 with the constraints.
57 raises IndexError if there is no choice.
58 """
59 sorted_acts = []
60 other_acts = set(self.actions)
61 while other_acts:
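The loop body is elided in this excerpt. As a generic, self-contained sketch (not necessarily the book's code), a total order consistent with a set of (before, after) pairs can be extracted by repeatedly choosing an action none of whose predecessors remain unplaced:

```python
def total_order(acts, constraints):
    """Returns a list of acts consistent with the (before, after) pairs in
    constraints. A generic topological-sort sketch; raises StopIteration
    if the constraints are cyclic."""
    remaining = set(acts)
    order = []
    while remaining:
        # choose an action with no unplaced predecessor
        a = next(x for x in remaining
                 if not any(after == x and before in remaining
                            for (before, after) in constraints))
        order.append(a)
        remaining.remove(a)
    return order

# hypothetical plan: make coffee ('mc') between the start and finish actions
result = total_order({'start', 'mc', 'finish'},
                     {('start', 'mc'), ('mc', 'finish'), ('start', 'finish')})
```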
stripsPOP.py — (continued)
stripsPOP.py — (continued)
101 POP_node(node.actions,consts2,new_agenda,new_cls),
102 cost=0)
103 for a0 in self.planning_problem.prob_domain.actions: #a0 is an
action
104 if self.achieves(a0, subgoal):
105 #a0 achieves subgoal
106 new_a = Action_instance(a0)
107 self.display(2," using new action",new_a)
108 new_actions = node.actions + [new_a]
109 consts1 =
self.add_constraint((self.start,new_a),node.constraints)
110 consts2 = self.add_constraint((new_a,act1),consts1)
111 new_agenda1 = new_agenda + [(pre,new_a) for pre in
a0.preconds.items()]
112 new_clink = (new_a,subgoal,act1)
113 new_cls = node.causal_links + [new_clink]
114 for consts3 in
self.protect_all_cls(node.causal_links,new_a,consts2):
115 for consts4 in
self.protect_cl_for_actions(node.actions,consts3,new_clink):
116 yield Arc(node,
117 POP_node(new_actions,consts4,new_agenda1,new_cls),
118 cost=1)
Given a causal link (a0, subgoal, a1), the following method protects the causal
link from each action in actions. Whenever an action deletes subgoal, the action
needs to be before a0 or after a1. This method enumerates all constraints that
result from protecting the causal link from all actions.
stripsPOP.py — (continued)
137 for e in
self.protect_cl_for_actions(rem_actions,constrs,clink):
yield e
138 else:
139 yield constrs
Given an action act, the following method protects all the causal links in
clinks from act. Whenever act deletes subgoal from some causal link (a0, subgoal, a1),
the action act needs to be before a0 or after a1. This method enumerates all constraints that result from protecting the causal links from act.
stripsPOP.py — (continued)
The following methods check whether an action (or action instance) achieves
or deletes some subgoal.
stripsPOP.py — (continued)
• Features: many of the features come directly from the data. Sometimes it is useful to construct features; e.g., height > 1.9m might be a Boolean feature constructed from the real-valued feature height. The next chapter is about neural networks and how to learn features; the code in this chapter constructs them explicitly, in what is often known as feature engineering.
• Learning with no input features: this is the base case of many methods. What should you predict if you have no input features? This provides the base cases for many algorithms (e.g., the decision tree algorithm) and baselines that more sophisticated algorithms need to beat. It also provides ways to test various predictors.
• Decision tree learning: one of the classic and simplest learning algo-
rithms, which is the basis of many other algorithms.
7. Supervised Machine Learning
• A feature is a function from examples into the range of the feature. Each
feature f also has the following attributes:
Thus, for example, a Boolean feature is a function from the examples into {False, True}. So, if f is a Boolean feature, f.frange == [False, True], and if e is an example, f(e) is either True or False.
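For instance, a hand-built Boolean feature over example tuples might look like this (the choice of e[0] as the height is illustrative):

```python
boolean = [False, True]

def tall(e):
    "height > 1.9"
    return e[0] > 1.9      # e[0] assumed to hold height in metres
tall.frange = boolean      # the range of the feature
tall.ftype = "boolean"     # the type of the feature

# tall(e) is True or False; tall.frange gives the feature's range
```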
learnProblem.py — (continued)
18 class Data_set(Displayable):
19 """ A dataset consists of a list of training data and a list of test
data.
20 """
21
22 def __init__(self, train, test=None, target_index=0, prob_test=0.10,
prob_valid=0.11,
23 header=None, target_type= None, one_hot=False,
seed=None): #12345):
24 """A dataset for learning.
25 train is a list of tuples representing the training examples
26 test is the list of tuples representing the test examples
27 if test is None, a test set is created by selecting each
28 example with probability prob_test
29 target_index is the index of the target.
30 If negative, it counts from right.
31 If target_index is larger than the number of properties,
32 there is no target (for unsupervised learning)
33 prob_valid is the probability of putting a training example in the validation set
34 header is a list of names for the features
35 target_type is either None for automatic detection of target type
36 or one of "numeric", "boolean", "categorical"
37 one_hot=True gives a one-hot encoding of categorical features
38 seed is for the random number generator; None gives a different test set each time
39 """
40 if seed: # given seed makes partition consistent from run-to-run
41 random.seed(seed)
42 if test is None:
43 train,test = partition_data(train, prob_test)
44 self.train, self.valid = partition_data(train, prob_valid)
45 self.test = test
46
47 self.display(1,"Training set has",len(self.train),"examples. Number
of columns: ",{len(e) for e in self.train})
48 self.display(1,"Test set has",len(test),"examples. Number of
columns: ",{len(e) for e in test})
49 self.display(1,"Validation set has",len(self.valid),"examples.
Number of columns: ",{len(e) for e in self.valid})
50 self.prob_test = prob_test
51 self.num_properties = len(self.train[0])
52 if target_index < 0: #allows for -1, -2, etc.
53 self.target_index = self.num_properties + target_index
54 else:
55 self.target_index = target_index
56 self.header = header
57 self.domains = [set() for i in range(self.num_properties)]
58 for example in self.train:
59 for ind,val in enumerate(example):
60 self.domains[ind].add(val)
61 self.conditions_cache = {} # cache for computed conditions
62 self.create_features(one_hot)
63 if target_type:
64 self.target.ftype = target_type
65 self.display(1,"There are",len(self.input_features),"input
features")
66
67 def __str__(self):
68 if self.train and len(self.train)>0:
69 return ("Data: "+str(len(self.train))+" training examples, "
70 +str(len(self.test))+" test examples, "
71 +str(len(self.train[0]))+" features.")
72 else:
73 return ("Data: "+str(len(self.train))+" training examples, "
74 +str(len(self.test))+" test examples.")
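partition_data is not shown in this excerpt; a sketch consistent with how it is called above (an independent coin flip per example, with probability prob_test of going to the second part) is:

```python
import random

def partition_data(data, prob_test=0.30):
    """Partition data into (rest, test), putting each example
    into test with probability prob_test."""
    rest, test = [], []
    for example in data:
        if random.random() < prob_test:
            test.append(example)
        else:
            rest.append(example)
    return rest, test
```

Note that because the split is by independent coin flips, the test set size varies from run to run unless a seed is given.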
A feature is a function that takes an example and returns a value in the range of the feature. Each feature has a frange, which gives the range of the feature, and an ftype that gives the type, one of “boolean”, “numeric” or “categorical”.
learnProblem.py — (continued)
93 feat.frange = boolean
94 feat.ftype = "boolean"
95 self.input_features.append(feat)
96 else:
97 def feat(e,index=i):
98 return e[index]
99 if self.header:
100 feat.__doc__ = self.header[i]
101 else:
102 feat.__doc__ = "e["+str(i)+"]"
103 feat.frange = frange
104 feat.ftype = ftype
105 if i == self.target_index:
106 self.target = feat
107 else:
108 self.input_features.append(feat)
The following tries to infer the type of each feature. Sometimes this can be wrong (e.g., when the numbers are really categorical) and the type may need to be set explicitly.
learnProblem.py — (continued)
• When the range only has two values, one is designated to be the “true”
value.
• When the values are all numeric, assume they are ordered (as opposed
to just being some classes that happen to be labelled with numbers) and
construct Boolean features for splits of the data. That is, the feature is
e[ind] < cut for some value cut. The number of cut values is less than or
equal to max_num_cuts.
• When the values are not all numeric, it creates an indicator function for each value. An indicator function for a value returns true when that value is given and false otherwise. Note that we can't create an indicator function for values that appear in the test set but not in the training set, because we haven't seen the test set. For examples in the test set with a value that doesn't appear in the training set for that feature, the indicator functions all return false.
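The cut construction for numeric features can be sketched as follows (a simplification of what create_features does; the function name is illustrative):

```python
def cut_features(values, index, max_num_cuts=8):
    """Return Boolean features e[index] < cut, with at most
    max_num_cuts cuts placed at quantiles of the training values."""
    feats = []
    sorted_vals = sorted(set(values))
    num_cuts = min(max_num_cuts, len(sorted_vals) - 1)
    for i in range(1, num_cuts + 1):
        cut = sorted_vals[len(sorted_vals) * i // (num_cuts + 1)]
        def feat(e, ind=index, cut=cut):
            return e[ind] < cut
        feat.__doc__ = f"e[{index}]<{cut}"
        feat.frange = [False, True]
        feat.ftype = "boolean"
        feats.append(feat)
    return feats
```

On the integers 100 to 129 with 2 cuts, this gives the features e[index] < 110 and e[index] < 120, splitting the range into three equal parts.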
Exercise 7.1 Change the code so that it splits using e[ind] ≤ cut instead of e[ind] <
cut. Check boundary cases, such as 3 elements with 2 cuts. As a test case, make
sure that when the range is the 30 integers from 100 to 129, and you want 2 cuts,
the resulting Boolean features should be e[ind] ≤ 109 and e[ind] ≤ 119 to make
sure that each of the resulting domains is of equal size.
Exercise 7.2 This splits on whether the feature is less than one of the values in
the training set. Sam suggested it might be better to split between the values in
the training set, and suggested using
Why might Sam have suggested this? Does this work better? (Try it on a few
datasets).
evaluation criteria are implemented: the squared error (the average of the square of the difference between the actual and predicted values), the absolute error (the average of the absolute difference between the actual and predicted values), and the log loss (the average negative log-likelihood, which can be interpreted as the number of bits to describe an example using a code based on the prediction treated as a probability).
learnProblem.py — (continued)
The following evaluation criteria are defined. This is done using a class, Evaluate, but no instances will be created; just use Evaluate.squared_loss, etc. (Please keep the __doc__ strings a consistent length, as they are used in tables.) The prediction is either a real value, a {value : probability} dictionary, or a list. The actual is either a real number or a key of the prediction.
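For a single prediction on a Boolean target, the three criteria reduce to the following (the real Evaluate class also handles dictionary and list predictions):

```python
import math

def squared_loss(prediction, actual):
    "squared loss: (actual - prediction)^2"
    return (actual - prediction) ** 2

def absolute_loss(prediction, actual):
    "absolute loss: |actual - prediction|"
    return abs(actual - prediction)

def log_loss(prediction, actual):
    "negative log-likelihood of the actual value, in bits"
    p = prediction if actual else 1 - prediction
    return -math.log2(p)
```

For example, predicting 0.5 for a Boolean target gives a log loss of exactly 1 bit whatever the actual value is.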
learnProblem.py — (continued)
The following class is used for datasets where the training and test data are in different files.
learnProblem.py — (continued)
342 res.append(e.strip())
343 return res
374 if f1 != f2:
375 self.input_features.append(b(f1,f2))
The following are useful unary feature constructors and binary feature combiners.
learnProblem.py — (continued)
Exercise 7.3 For symmetric properties, such as product, we don't need both f1 ∗ f2 and f2 ∗ f1 as extra properties. Allow the user to declare feature constructors as symmetric (by associating a Boolean attribute with them). Change construct_features so that it does not create both versions for symmetric combiners.
learnProblem.py — (continued)
• a point prediction, where we are only allowed to predict one of the values of the feature. For example, if the values of the feature are {0, 1} we are only allowed to predict 0 or 1, or if the values are ratings in {1, 2, 3, 4, 5}, we can only predict one of these integers.
• a point prediction, where we are allowed to predict any value. For example, if the values of the feature are {0, 1} we may be allowed to predict 0.3, 1, or even 1.7. For all of the criteria defined, there is no point in predicting a value greater than 1 or less than zero (but that doesn't mean you can't), but it is often useful to predict a value between 0 and 1. If the values are ratings in {1, 2, 3, 4, 5}, we may want to predict 3.4.
• a probability distribution over the values of the feature. For each value v, we predict a non-negative number pv, such that the sum over all predictions is 1.
26 for e in data:
27 counts[e] += 1
28 s = sum(counts.values())
29 return {k:v/s for (k,v) in counts.items()}
30
31 def bounded_empirical(data, domain=[0,1], bound=0.01):
32 "bounded empirical"
33 return {k:min(max(v,bound),1-bound) for (k,v) in Predict.empirical(data, domain).items()}
34
35 def laplace(data, domain=[0,1]):
36 "Laplace " # for categorical data
37 return Predict.empirical(data, domain, icount=1)
38
39 def cmode(data, domain=[0,1]):
40 "mode " # for categorical data
41 md = statistics.mode(data)
42 return {v: 1 if v==md else 0 for v in domain}
43
44 def cmedian(data, domain=[0,1]):
45 "median " # for categorical data
46 md = statistics.median_low(data) # always returns one of the values
47 return {v: 1 if v==md else 0 for v in domain}
48
49 ### The following return a single prediction (for regression). domain is ignored.
50
51 def mean(data, domain=[0,1]):
52 "mean "
53 # returns a real number
54 return statistics.mean(data)
55
56 def rmean(data, domain=[0,1], mean0=0, pseudo_count=1):
57 "regularized mean"
58 # returns a real number.
59 # mean0 is the mean to be used for 0 data points
60 # With mean0=0.5, pseudo_count=2, same as laplace for [0,1] data
61 # this works for enumerations as well as lists
62 sum = mean0 * pseudo_count
63 count = pseudo_count
64 for e in data:
65 sum += e
66 count += 1
67 return sum/count
68
69 def mode(data, domain=[0,1]):
70 "mode "
71 return statistics.mode(data)
72
73 def median(data, domain=[0,1]):
74 "median "
75 return statistics.median(data)
76
77 all = [empirical, mean, rmean, bounded_empirical, laplace, cmode, mode, median, cmedian]
78
79 # The following suggests appropriate predictions as a function of the target type
80 select = {"boolean": [empirical, bounded_empirical, laplace, cmode, cmedian],
81 "categorical": [empirical, bounded_empirical, laplace, cmode, cmedian],
82 "numeric": [mean, rmean, mode, median]}
7.3.1 Evaluation
To evaluate a point prediction, let's first generate some data, with values 0 and 1 for the target feature. Given the ground truth prob, a number in the range [0, 1], the following code generates training and test data where prob is the probability of each example being 1. To generate a 1 with probability prob, it generates a random number in the range [0,1] and returns 1 if that number is less than prob. A prediction is computed by applying the predictor to the training data, and is evaluated on the test set. This is repeated num_samples times.
Let's evaluate the predictions of the possible selections according to the different evaluation criteria, for various training sizes.
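The generation and evaluation loop described above can be sketched as follows (the function names, the default test_size, and taking a bare function as the predictor are illustrative):

```python
import random

def generate_data(prob, n):
    """n Boolean samples, each 1 with probability prob"""
    return [1 if random.random() < prob else 0 for _ in range(n)]

def eval_predictor(predictor, prob, train_size, test_size=1000,
                   loss=lambda p, a: (a - p) ** 2):
    """train predictor on generated data; return average loss on test data"""
    train = generate_data(prob, train_size)
    test = generate_data(prob, test_size)
    pred = predictor(train)      # e.g., the mean of the training data
    return sum(loss(pred, a) for a in test) / test_size
```

Repeating this num_samples times and averaging, as the text describes, smooths out the randomness of any single train/test draw.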
learnNoInputs.py — (continued)
Exercise 7.4 Which predictor works best for low counts when the error is
(a) Squared error
(b) Absolute error
(c) Log loss
You may need to try this a few times to make sure your answer is supported by the evidence. Does the difference from the other methods get larger or smaller as the number of examples grows?
Exercise 7.5 Suggest some other predictions that only take the training data. Does your method do better than the given methods? A simple way to get other predictors is to vary the bound of the bounded empirical method, or to change the pseudo-counts of the Laplace method (use other numbers instead of 1 and 2).
The decision tree algorithm does binary splits, and assumes that all input features are binary functions of the examples. It stops splitting if there are no input features, the number of examples is less than a specified minimum, or all of the examples agree on the target feature.
learnDT.py — Learning a binary decision tree
11 from learnProblem import Learner, Evaluate
12 from learnNoInputs import Predict
13 import math
14
15 class DT_learner(Learner):
16 def __init__(self,
17 dataset,
If it splits, it selects the best split according to the evaluation criterion (assuming that is the only split it gets to do), and returns the condition to split on (in the variable split) and the corresponding partition of the examples.
learnDT.py — (continued)
Test cases:
learnDT.py — (continued)
Note that different runs may provide different values as they split the training and test sets differently. So if you have a hypothesis about what works better, make sure it is true for different runs.
Exercise 7.6 The current algorithm does not have a very sophisticated stopping
criterion. What is the current stopping criterion? (Hint: you need to look at both
learn_tree and select_split.)
Exercise 7.7 Extend the current algorithm to include in the stopping criterion
(a) A minimum child size; don't use a split if one of the children has fewer elements than this.
(b) A depth-bound on the depth of the tree.
(c) An improvement bound such that a split is only carried out if error with the
split is better than the error without the split by at least the improvement
bound.
Which values for these parameters make the prediction errors on the test set the
smallest? Try it on more than one dataset.
Exercise 7.8 Without any input features, it is often better to include a pseudo-
count that is added to the counts from the training data. Modify the code so that
it includes a pseudo-count for the predictions. When evaluating a split, including pseudo-counts can make the split worse than no split. Does pruning with an improvement bound and pseudo-counts make the algorithm work better than with an improvement bound by itself?
Exercise 7.9 Some people have suggested using information gain (which is equivalent to greedy optimization of log loss) as the measure of improvement when building the tree, even if they want non-probabilistic predictions in the final tree. Does this work better than myopically choosing the split that is best for the evaluation criteria used to judge the final prediction?
The above decision tree overfits the data. One way to determine whether
the prediction is overfitting is by cross validation. The code below implements
k-fold cross validation, which can be used to choose the value of parameters
to best fit the training data. If we want to use parameter tuning to improve
predictions on a particular dataset, we can only use the training data (and not
the test data) to tune the parameter.
In k-fold cross validation, we partition the training set into k approximately
equal-sized folds (each fold is an enumeration of examples). For each fold, we
train on the other examples, and determine the error of the prediction on that
fold. For example, if there are 10 folds, we train on 90% of the data, and then test on the remaining 10%. We do this 10 times, so that each example gets used as a test set once, and in a training set 9 times.
The code below creates one copy of the data, and multiple views of the data.
For each fold, fold enumerates the examples in the fold, and fold_complement
enumerates the examples not in the fold.
learnCrossValidation.py — Cross Validation for Parameter Tuning
11 from learnProblem import Data_set, Data_from_file, Evaluate
12 from learnNoInputs import Predict
13 from learnDT import DT_learner
14 import matplotlib.pyplot as plt
15 import random
16
17 class K_fold_dataset(object):
18 def __init__(self, training_set, num_folds):
19 self.data = training_set.train.copy()
20 self.target = training_set.target
21 self.input_features = training_set.input_features
22 self.num_folds = num_folds
23 self.conditions = training_set.conditions
24
25 random.shuffle(self.data)
26 self.fold_boundaries = [(len(self.data)*i)//num_folds
27 for i in range(0,num_folds+1)]
28
29 def fold(self, fold_num):
30 for i in range(self.fold_boundaries[fold_num],
31 self.fold_boundaries[fold_num+1]):
32 yield self.data[i]
33
34 def fold_complement(self, fold_num):
35 for i in range(0,self.fold_boundaries[fold_num]):
36 yield self.data[i]
37 for i in range(self.fold_boundaries[fold_num+1],len(self.data)):
38 yield self.data[i]
The validation error is the average error for each example, where we test on
each fold, and learn on the other folds.
learnCrossValidation.py — (continued)
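The elided validation_error has this shape: train on each fold's complement and average the per-example error over the folds (the learner_class and error signatures here are assumptions, as is taking e[-1] as the target):

```python
def validation_error(kfold, learner_class, error, **parms):
    """average error over all examples, where each fold is predicted
    by a learner trained on the other folds"""
    total_error = 0
    total_count = 0
    for fold_num in range(kfold.num_folds):
        training = list(kfold.fold_complement(fold_num))
        predictor = learner_class(training, **parms).learn()
        for example in kfold.fold(fold_num):
            total_error += error(predictor(example), example[-1])
            total_count += 1
    return total_error / total_count
```

Every example contributes exactly once to the total, so this is the average error over the whole training set, each example predicted by a model that never saw it.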
The plot_error method plots the average error as a function of the minimum number of examples in the decision-tree search, both for the validation set and for the test set. The error on the validation set can be used to tune the parameter: choose the value of the parameter that minimizes the error. The error on the test set cannot be used to tune the parameters; if it were used this way, it could not be used to test how well the method works on unseen examples.
learnCrossValidation.py — (continued)
Figure 7.2 shows the average squared loss on the validation and test sets as a function of min_child_weight in the decision-tree learning algorithm (SPECT data with seed 12345, followed by plot_error(data)). Different seeds will produce different graphs. The assumption behind cross validation is that the parameter that minimizes the loss on the validation set will be a good parameter for the test set.
Note that different runs for the same data will have the same test error, but different validation errors. If you rerun Data_from_file with a different seed, you will get new test and training sets, and so the graph will change.
Exercise 7.10 Change the error plot so that it can evaluate the stopping criteria
[Figure 7.2: average squared loss (0.14–0.20) on the validation and test sets as a function of min_child_weight (0–80)]
of the exercise of Section 7.6. Which criterion makes the most difference?
predictor predicts the value of an example from the current parameter settings. predictor_string gives a string representation of the predictor.
learnLinear.py — (continued)
41
42 def predictor(self,e):
43 """returns the prediction of the learner on example e"""
44 linpred = sum(w*f(e) for f,w in self.weights.items())
45 if self.squashed:
46 return sigmoid(linpred)
47 else:
48 return linpred
49
50 def predictor_string(self, sig_dig=3):
51 """returns the doc string for the current prediction function
52 sig_dig is the number of significant digits in the numbers"""
53 doc = "+".join(str(round(val,sig_dig))+"*"+feat.__doc__
54 for feat,val in self.weights.items())
55 if self.squashed:
56 return "sigmoid("+ doc+")"
57 else:
58 return doc
learn is the main algorithm of the learner. It does num_iter steps of stochastic gradient descent. Only the number of iterations is specified; the other parameters it gets from the class.
learnLinear.py — (continued)
67 update = self.learning_rate*error
68 for feat in self.weights:
69 d[feat] += update*feat(e)
70 for feat in self.weights:
71 self.weights[feat] -= d[feat]
72 d[feat]=0
73 return self.predictor
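One such step, for a linear predictor on squared loss and ignoring the batching that the fragment above performs, can be written standalone (the function name is illustrative):

```python
def sgd_step(weights, example, target, learning_rate=0.01):
    """One stochastic gradient descent step on squared loss.
    weights is a {feature_fn: weight} dictionary; each feature
    function is applied to the example."""
    prediction = sum(w * f(example) for f, w in weights.items())
    error = prediction - target
    for f in weights:
        # gradient of (prediction - target)^2 / 2 w.r.t. weights[f]
        weights[f] -= learning_rate * error * f(example)
    return weights
```

Repeating this step over the training examples drives the weights toward the least-squares solution, as in the class's learn method.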
one is a function that always returns 1. It is used for one of the input properties.
learnLinear.py — (continued)
75 def one(e):
76 "1"
77 return 1
sigmoid(x) is the function
1/(1 + exp(−x))
The inverse of sigmoid is the logit function
learnLinear.py — (continued)
79 def sigmoid(x):
80 return 1/(1+math.exp(-x))
81
82 def logit(x):
83 return -math.log(1/x-1)
softmax([x0, x1, . . . ]) returns [v0, v1, . . . ] where
vi = exp(xi) / ∑j exp(xj)
learnLinear.py — (continued)
The following tests the learner on a dataset. Uncomment the other datasets for different examples.
learnLinear.py — (continued)
The following plots the errors on the training and test sets as a function of
the number of steps of gradient descent.
learnLinear.py — (continued)
[Figure: average log loss in bits (0.4–1.1) on the training and test sets, as a function of the number of gradient descent steps (1 to 1000, log scale)]
194 learner.learn(100)
195 learner.learning_rate=0.0001
196 learner.learn(1000)
197 learner.learning_rate=0.00001
198 learner.learn(10000)
199 learner.display(1,"function learned is", learner.predictor_string(),
200 "error=",data.evaluate_dataset(data.train, learner.predictor,
Evaluate.squared_loss))
201 plt.plot([e[0] for e in data.train],[e[-1] for e in
data.train],"bo",label="data")
202 plt.plot(list(arange(minx,maxx,step_size)),[learner.predictor([x])
203 for x in
arange(minx,maxx,step_size)],
204 label=label)
205 plt.legend()
206 plt.draw()
learnLinear.py — (continued)
linestyle=ls, color=col,
237 label="degree="+str(degree))
238 plt.legend(loc='upper left')
239 plt.draw()
240
241 # Try:
242 # data0 = Data_from_file('data/simp_regr.csv', prob_test=0, prob_valid=0, one_hot=False, target_index=-1)
243 # plot_prediction(data0)
244 # plot_polynomials(data0)
245 # What if the step size was bigger?
246 #datam = Data_from_file('data/mail_reading.csv', target_index=-1)
247 #plot_prediction(datam)
Exercise 7.13 For each of the polynomial functions learned: What is the prediction as x gets larger (x → ∞)? What is the prediction as x gets more negative (x → −∞)?
7.7 Boosting
The following code implements functional gradient boosting for regression.
A Boosted dataset is created from a base dataset by subtracting the prediction of the offset function from each example. This does not save the new dataset, but generates it as needed. The amount of space used is constant, independent of the size of the dataset.
learnBoosting.py — Functional Gradient Boosting
11 from learnProblem import Data_set, Learner, Evaluate
12 from learnNoInputs import Predict
13 from learnLinear import sigmoid
14 import statistics
15 import random
16
17 class Boosted_dataset(Data_set):
18 def __init__(self, base_dataset, offset_fun, subsample=1.0):
19 """new dataset which is like base_dataset,
20 but offset_fun(e) is subtracted from the target of each example e
21 """
22 self.base_dataset = base_dataset
23 self.offset_fun = offset_fun
24 self.train = random.sample(base_dataset.train, int(subsample*len(base_dataset.train)))
25 self.test = base_dataset.test
26 #Data_set.__init__(self, base_dataset.train, base_dataset.test,
27 # base_dataset.prob_test, base_dataset.target_index)
28
29 #def create_features(self):
30 """creates new features - called at end of Data_set.init()
31 defines a new target
32 """
33 self.input_features = self.base_dataset.input_features
34 def newout(e):
35 return self.base_dataset.target(e) - self.offset_fun(e)
36 newout.frange = self.base_dataset.target.frange
37 newout.ftype = self.infer_type(newout.frange)
38 self.target = newout
39
40 def conditions(self, *args, colsample_bytree=0.5, **nargs):
41 conds = self.base_dataset.conditions(*args, **nargs)
42 return random.sample(conds, int(colsample_bytree*len(conds)))
A boosting learner takes in a dataset and a base learner, and returns a new predictor. The base learner takes a dataset and returns a Learner object.
learnBoosting.py — (continued)
44 class Boosting_learner(Learner):
45 def __init__(self, dataset, base_learner_class, subsample=0.8):
46 self.dataset = dataset
47 self.base_learner_class = base_learner_class
48 self.subsample = subsample
49 mean = sum(self.dataset.target(e)
50 for e in self.dataset.train)/len(self.dataset.train)
51 self.predictor = lambda e: mean # function that returns mean for each example
52 self.predictor.__doc__ = "lambda e:"+str(mean)
53 self.offsets = [self.predictor] # list of base learners
54 self.predictors = [self.predictor] # list of predictors
55 self.errors = [data.evaluate_dataset(data.test, self.predictor, Evaluate.squared_loss)]
56 self.display(1,"Predict mean test set mean squared loss=", self.errors[0] )
57
58
59 def learn(self, num_ensembles=10):
60 """adds num_ensemble learners to the ensemble.
61 returns a new predictor.
62 """
63 for i in range(num_ensembles):
64 train_subset = Boosted_dataset(self.dataset, self.predictor, subsample=self.subsample)
65 learner = self.base_learner_class(train_subset)
66 new_offset = learner.learn()
67 self.offsets.append(new_offset)
68 def new_pred(e, old_pred=self.predictor, off=new_offset):
69 return old_pred(e)+off(e)
70 self.predictor = new_pred
71 self.predictors.append(new_pred)
72 self.errors.append(data.evaluate_dataset(data.test, self.predictor, Evaluate.squared_loss))
73 self.display(1,f"Iteration {len(self.offsets)-1},treesize =
76 # Testing
77
78 from learnDT import DT_learner
79 from learnProblem import Data_set, Data_from_file
80
81 def sp_DT_learner(split_to_optimize=Evaluate.squared_loss,
82 leaf_prediction=Predict.mean,**nargs):
83 """Creates a learner with different default arguments replaced by
**nargs
84 """
85 def new_learner(dataset):
86 return DT_learner(dataset,split_to_optimize=split_to_optimize,
87 leaf_prediction=leaf_prediction, **nargs)
88 return new_learner
89
90 #data = Data_from_file('data/car.csv', target_index=-1) regression
91 data = Data_from_file('data/student/student-mat-nq.csv', separator=';',has_header=True,target_index=-1,seed=13,include_only=list(range(30))+[32]) #2.0537973790924946
92 #data = Data_from_file('data/SPECT.csv', target_index=0, seed=62) #123)
93 #data = Data_from_file('data/mail_reading.csv', target_index=-1)
94 #data = Data_from_file('data/holiday.csv', has_header=True, num_train=19, target_index=-1)
95 #learner10 = Boosting_learner(data, sp_DT_learner(split_to_optimize=Evaluate.squared_loss, leaf_prediction=Predict.mean, min_child_weight=10))
96 #learner7 = Boosting_learner(data, sp_DT_learner(0.7))
97 #learner5 = Boosting_learner(data, sp_DT_learner(0.5))
98 #predictor9 =learner9.learn(10)
99 #for i in learner9.offsets: print(i.__doc__)
100 import matplotlib.pyplot as plt
101
102 def plot_boosting_trees(data, steps=10, mcws=[30,20,20,10], gammas=[100,200,300,500]):
103 # to reduce clutter uncomment one of following two lines
104 #mcws=[10]
105 #gammas=[200]
106 learners = [(mcw, gamma, Boosting_learner(data, sp_DT_learner(min_child_weight=mcw, gamma=gamma)))
107 for gamma in gammas for mcw in mcws
108 ]
109 plt.ion()
Exercise 7.14 For a particular dataset, suggest good values for min_child_weight
and gamma. How stable are these to different random choices that are made (e.g.,
in the training-test split)? Try to explain why these are good settings.
136 self.trees.append(tree)
137 self.display(1,f"""Iteration {i} treesize = {tree.num_leaves}
train logloss={
138 self.dataset.evaluate_dataset(self.dataset.train,
self.gtb_predictor, Evaluate.log_loss)
139 } test logloss={
140 self.dataset.evaluate_dataset(self.dataset.test,
self.gtb_predictor, Evaluate.log_loss)}""")
141 return self.gtb_predictor
142
143 def gtb_predictor(self, example, extra=0):
144 """prediction for example,
145 extras is an extra contribution for this example being considered
146 """
147 return sigmoid(sum(t(example) for t in self.trees)+extra)
148
149 def leaf_value(self, egs, domain=[0,1]):
150 """value at the leaves for examples egs
151 domain argument is ignored"""
152 pred_acts = [(self.gtb_predictor(e),self.target(e)) for e in egs]
153 return sum(a-p for (p,a) in pred_acts) / (sum(p*(1-p) for (p,a) in pred_acts)+self.lambda_reg)
154
155
156 def sum_losses(self, data_subset):
157 """returns sum of losses for dataset (assuming a leaf is formed with no more splits)
158 """
159 leaf_val = self.leaf_value(data_subset)
160 error = sum(Evaluate.log_loss(self.gtb_predictor(e,leaf_val), self.target(e))
161 for e in data_subset) + self.gamma
162 return error
Testing
learnBoosting.py — (continued)
Exercise 7.15 Find better hyperparameter settings than the default ones. Com-
pare prediction error with other methods for Boolean datasets.
8.1 Layers
A neural network is built from layers. In AIPython, unlike Keras and PyTorch,
activation functions are treated as separate layers, which makes them more
modular and the code more readable.
This provides a modular implementation of layers. Layers can easily be stacked in many configurations. A layer needs to implement a method to compute the output values from the inputs, a method to back-propagate the error, and a method to update its parameters (if it has any) for a batch.
learnNN.py — Neural Network Learning
11 from display import Displayable
12 from learnProblem import Learner, Data_set, Data_from_file, Data_from_files, Evaluate
13 from learnLinear import sigmoid, one, softmax, indicator
14 import random, math, time
15
8. Neural Networks and Deep Learning
16 class Layer(Displayable):
17 def __init__(self, nn, num_outputs=None):
18 """Abstract layer class, must be overridden.
19 nn is the neural network this layer is part of
20 num_outputs is the number of outputs for this layer.
21 """
22 self.nn = nn
23 self.num_inputs = nn.num_outputs # nn output is layer's input
24 if num_outputs:
25 self.num_outputs = num_outputs
26 else:
27 self.num_outputs = self.num_inputs # same as the inputs
28 self.outputs= [0]*self.num_outputs
29 self.input_errors = [0]*self.num_inputs
30 self.weights = []
31
32 def output_values(self, input_values, training=False):
33 """Return the outputs for this layer for the given input values.
34 input_values is a list (of length self.num_inputs) of the inputs
35 returns a list of length self.num_outputs.
36 It can act differently when training and when predicting.
37 """
38 raise NotImplementedError("output_values") # abstract method
39
40 def backprop(self, out_errors):
41 """Backpropagate the errors on the outputs
42 errors is a list of output errors (of length self.num_outputs).
43 Returns list of input errors (of length self.num_inputs).
44
45 This is only called after corresponding output_values(),
46 which should remember relevant information
47 """
48 raise NotImplementedError("backprop") # abstract method
49
50 class Optimizer(Displayable):
51 def update(self, layer):
52 """updates parameters after a batch.
53 """
54 pass
learnNN.py — (continued)
56 class Linear_complete_layer(Layer):
57 """a completely connected layer"""
58 def __init__(self, nn, num_outputs, limit=None):
59 """A completely connected linear layer.
60 nn is a neural network that the inputs come from
61 num_outputs is the number of outputs
62 the random initialization of parameters is in range [-limit,limit]
63 """
64 Layer.__init__(self, nn, num_outputs)
65 if limit is None:
66 limit =math.sqrt(6/(self.num_inputs+self.num_outputs))
67 # self.weights[i][o] is the weight between input i and output o
68 self.weights = [[random.uniform(-limit, limit)
69 if i < self.num_inputs else 0
70 for o in range(self.num_outputs)]
71 for i in range(self.num_inputs+1)]
72 # self.delta[i][o] is the accumulated change for a batch.
73 self.delta = [[0 for o in range(self.num_outputs)]
74 for i in range(self.num_inputs+1)]
75
76 def output_values(self, inputs, training=False):
77 """Returns the outputs for the input values.
78 It remembers the values for the backprop.
79 """
80 self.display(3,f"Linear layer inputs: {inputs}")
81 self.inputs = inputs
82 for out in range(self.num_outputs):
83 self.outputs[out] = (sum(self.weights[inp][out]*self.inputs[inp]
84 for inp in range(self.num_inputs))
85 + self.weights[self.num_inputs][out])
86 self.display(3,f"Linear layer inputs: {inputs}")
87 return self.outputs
88
89 def backprop(self, errors):
90 """Backpropagate errors, update weights, return input errors.
91 errors is a list of size self.num_outputs
92 Returns errors for the layer's inputs, of size self.num_inputs
93 """
94 self.display(3,f"Linear Backprop. input: {self.inputs} output errors: {errors}")
95 self.input_errors = [0]*self.num_inputs # error for each input, summed over outputs
96 for out in range(self.num_outputs):
97 for inp in range(self.num_inputs):
98 self.input_errors[inp] += self.weights[inp][out] * errors[out]
99 self.delta[inp][out] += self.inputs[inp] * errors[out]
100 self.delta[self.num_inputs][out] += errors[out]
101 self.display(3,f"Linear layer backprop input errors: {self.input_errors}")
102 return self.input_errors
175 self.layers = []
176 self.bn = 0 # number of batches run
177 self.printed_heading = False
178
179 def add_layer(self,layer):
180 """add a layer to the network.
181 Each layer gets number of inputs from the previous layers outputs.
182 """
183 self.layers.append(layer)
184 #if hasattr(layer, 'weights'):
185 layer.optimizer = self.optimizer(layer, self.parms)
186 self.num_outputs = layer.num_outputs
187
188 def predictor(self,ex):
189 """Predicts the value of the first output for example ex.
190 """
191 values = [f(ex) for f in self.input_features]
192 for layer in self.layers:
193 values = layer.output_values(values)
194 return sigmoid(values[0]) if self.output_type == "boolean" \
195 else softmax(values, self.dataset.target.frange) if self.output_type == "categorical" \
196 else values[0]
learnNN.py — (continued)
8.3.1 Momentum
learnNN.py — (continued)
250 class Momentum(Optimizer):
251 """gradient descent with momentum"""
252 def __init__(self, layer, parms={'lr':0.01, 'momentum':0.9}):
253 """
254 lr is the learning rate
255 momentum is the momentum parameter of PyTorch or Keras
256
257 """
258 self.lr = parms['lr'] if 'lr' in parms else 0.01
259 self.momentum = parms['momentum'] if 'momentum' in parms else 0.9
260 layer.velocity = [[0 for _ in range(len(layer.weights[0]))]
261 for _ in range(len(layer.weights))]
262
263
264 def update(self, layer):
265 """updates parameters after a batch with momentum"""
266 for inp in range(len(layer.weights)):
267 for out in range(len(layer.weights[0])):
268 layer.velocity[inp][out] = self.momentum*layer.velocity[inp][out] - self.lr*layer.delta[inp][out]
269 layer.weights[inp][out] += layer.velocity[inp][out]
270 layer.delta[inp][out] = 0
8.3.2 RMS-Prop
learnNN.py — (continued)
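The RMS-Prop update divides the step for each weight by a running average of the magnitudes of recent gradients for that weight. The following is a standalone sketch of such an optimizer, matching the `update(layer)` interface above; the class name and the `rho` and `epsilon` parameter names are assumptions, not necessarily the book's:

```python
import math

class RMS_Prop_sketch:
    """RMS-Prop: scale the learning rate for each weight by a running
    average of the squared gradients for that weight."""
    def __init__(self, layer, parms={'lr':0.01, 'rho':0.9, 'epsilon':1e-7}):
        self.lr = parms.get('lr', 0.01)
        self.rho = parms.get('rho', 0.9)           # decay of the running average
        self.epsilon = parms.get('epsilon', 1e-7)  # avoids division by zero
        # one running average per weight, stored on the layer like velocity above
        layer.ms = [[0 for _ in layer.weights[0]] for _ in layer.weights]

    def update(self, layer):
        """updates parameters after a batch"""
        for inp in range(len(layer.weights)):
            for out in range(len(layer.weights[0])):
                g = layer.delta[inp][out]
                layer.ms[inp][out] = self.rho*layer.ms[inp][out] + (1-self.rho)*g*g
                layer.weights[inp][out] -= self.lr*g/(math.sqrt(layer.ms[inp][out])+self.epsilon)
                layer.delta[inp][out] = 0
```

As with Momentum above, `delta` holds the accumulated gradient for the batch and is reset after the update.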
8.4 Dropout
Dropout is implemented as a layer.
learnNN.py — (continued)
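As a sketch of the idea (not the book's implementation): an inverted-dropout layer zeroes each input with probability `rate` during training, scales the survivors by 1/(1-rate) so expected values are unchanged, and passes inputs through untouched when predicting:

```python
import random

class Dropout_sketch:
    """A dropout layer (inverted dropout, as in Keras/PyTorch).
    Assumes num_inputs == num_outputs; hypothetical names, not the book's."""
    def __init__(self, num_inputs, rate=0.5):
        self.num_inputs = self.num_outputs = num_inputs
        self.rate = rate

    def output_values(self, inputs, training=False):
        if training:
            scale = 1/(1-self.rate)
            # remember which units were kept, for backprop
            self.mask = [0 if random.random() < self.rate else 1 for _ in inputs]
            return [x*m*scale for x, m in zip(inputs, self.mask)]
        else:
            return list(inputs)   # no dropout when predicting

    def backprop(self, errors):
        """errors flow back only through the units kept in the forward pass"""
        scale = 1/(1-self.rate)
        return [e*m*scale for e, m in zip(errors, self.mask)]
```

There are no weights to update, so such a layer needs no optimizer state.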
8.5 Examples
The following constructs some neural networks (most with one hidden layer).
The output is assumed to be Boolean or Real. If it is categorical, the final layer
should have the same number of outputs as the number of categories (so it can
use a softmax).
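For example, a three-category target needs a final linear layer with three outputs, which a softmax turns into a probability distribution. A minimal, numerically stable version (the book's softmax in learnLinear.py takes an extra domain argument and may differ):

```python
import math

def softmax_sketch(values):
    """maps a list of real values to a probability distribution,
    subtracting the max first for numerical stability"""
    m = max(values)
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e/total for e in exps]
```

Larger outputs get larger probabilities, and the results always sum to 1.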
learnNN.py — (continued)
[Figure: average log loss (bits) on the y-axis against step (0 to 2000) on the x-axis; training and validation curves for each of the architectures [], [3], and [3, 3], all with SGD(lr=0.01)]
Figure 8.1: Plotting train and validation log loss for various architectures on
SPECT dataset. Generated by
plot_algs(archs=[[],[3],[3,3]], opts=[SGD],lrs=[0.01],num_steps=2000)
383
384 def plot_algs(archs=[[3]], opts=[SGD],lrs=[0.1, 0.01,0.001,0.0001],
385 data=data, criterion=crit, num_steps=1000):
386 args = []
387 for arch in archs:
388 for opt in opts:
389 for lr in lrs:
390 args.append((arch,opt,{'lr':lr}))
391 plot_algs_opts(args,data, criterion, num_steps)
392
393 def plot_algs_opts(args, data=data, criterion=crit, num_steps=1000):
394 """args is a list of (architecture, optimizer, parameters)
395 for each of the corresponding triples it plots the learning rate"""
396 for (arch, opt, parms) in args:
397 nn = create_nn(data, arch, opt, parms)
398 parms_string = ','.join(f"{p}={v}" for p,v in parms.items())
399 plot_steps(learner = nn, data = data, criterion=criterion, num_steps=num_steps,
The following tests are on the MNIST digit dataset. The original files are
from http://yann.lecun.com/exdb/mnist/. This code assumes you use the csv
files from Joseph Redmon (https://pjreddie.com/projects/mnist-in-csv/ or
https://github.com/pjreddie/mnist-csv-png or https://www.kaggle.com/datasets/
oddrationale/mnist-in-csv) and put them in the directory ../MNIST/. Note
that this is very inefficient; you would be better off using Keras or PyTorch. There
are 28 ∗ 28 = 784 input units and 512 hidden units, which makes 784 ∗ 512 = 401,408
parameters for the lowest linear layer. So don't be surprised if it takes many hours
in AIPython (even if it only takes a few seconds in Keras).
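The parameter count above can be checked directly; 401,408 counts the weight matrix alone, and the 512 bias terms bring the layer to 401,920 parameters:

```python
def linear_layer_parameters(num_inputs, num_outputs):
    """number of parameters in a completely connected linear layer:
    one weight per input-output pair, plus one bias per output"""
    return num_inputs*num_outputs + num_outputs

# the lowest MNIST layer: 28*28 = 784 inputs, 512 hidden units
print(784*512)                            # weights only, as counted above
print(linear_layer_parameters(784, 512))  # including the biases
```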
learnNN.py — (continued)
Exercise 8.3 In the definition of nn3 above, for each of the following, first hypothesize what will happen, then test your hypothesis, then explain whether your testing confirms your hypothesis or not. Test it on more than one dataset, and use more than one run for each dataset.
(a) Which fits the data better, having a sigmoid layer or a ReLU layer after the
first linear layer?
(b) Which is faster to learn, having a sigmoid layer or a ReLU layer after the first
linear layer? (Hint: Plot error as a function of steps).
(c) What happens if you have a sigmoid layer followed by a ReLU layer after the first linear layer and before the second linear layer?
(d) What happens if you have a ReLU layer then a sigmoid layer after the first
linear layer and before the second linear layer?
(e) What happens if you have neither the sigmoid layer nor a ReLU layer after
the first linear layer?
Exercise 8.4 For each optimizer, use the validation set to choose the best setting
for the hyperparameters, including when to stop, and the parameters of the opti-
mizer (including the learning rate). For the architecture chosen, which optimizer
works best? Suggest another architecture which you conjecture would be better
on the test set (after hyperparameter optimization). Is it better?
9. Reasoning with Uncertainty
38 def __str__(self):
39 """returns a string representing a summary of the factor"""
40 return f"{self.name}({','.join(str(var) for var in self.variables)})"
41
42 def to_table(self, variables=None, given={}):
43 """returns a string representation of the factor.
44 Allows for an arbitrary variable ordering.
45 variables is a list of the variables in the factor
46 (can contain other variables)"""
47 if variables is None:
48 variables = [v for v in self.variables if v not in given]
49 else: #enforce ordering and allow for extra variables in ordering
50 variables = [v for v in variables if v in self.variables and v not in given]
51 head = "\t".join(str(v) for v in variables)+"\t"+self.name
52 return head+"\n"+self.ass_to_str(variables, given, variables)
53
54 def ass_to_str(self, vars, asst, allvars):
55 #print(f"ass_to_str({vars}, {asst}, {allvars})")
56 if vars:
57 return "\n".join(self.ass_to_str(vars[1:], {**asst, vars[0]:val}, allvars)
58 for val in vars[0].domain)
59 else:
60 val = self.get_value(asst)
61 val_st = "{:.6f}".format(val) if isinstance(val,float) else str(val)
62 return ("\t".join(str(asst[var]) for var in allvars)
63 + "\t"+val_st)
64
65 __repr__ = __str__
probFactors.py — (continued)
67 class CPD(Factor):
68 def __init__(self, child, parents):
69 """represents P(variable | parents)
70 """
71 self.parents = parents
72 self.child = child
73 Factor.__init__(self, parents+[child], name="Probability")
74
75 def __str__(self):
76 """A brief description of a factor, used in tracing"""
77 if self.parents:
78 return f"P({self.child}|{','.join(str(p) for p in self.parents)})"
79 else:
80 return f"P({self.child})"
81
82 __repr__ = __str__
A constant CPD has no parents, and has probability 1 when the variable has
the value specified, and 0 when the variable has a different value.
probFactors.py — (continued)
84 class ConstantCPD(CPD):
85 def __init__(self, variable, value):
86 CPD.__init__(self, variable, [])
87 self.value = value
88 def get_value(self, assignment):
89 return 1 if self.value==assignment[self.child] else 0
P(X=true | Y1 . . . Yk) = sigmoid(w0 + ∑i wi Yi)
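The conditional probability defined by this equation can be computed directly. The following is a sketch (the book's LogisticRegression class stores the weights and evaluates assignments itself), using the weights of p_cough_lr defined later in this chapter, whose parents are Cold, Flu, and Covid:

```python
import math

def sigmoid(x):
    return 1/(1+math.exp(-x))

def logistic_cpd_prob(weights, parent_values):
    """P(X=true | parents) = sigmoid(w0 + sum_i wi*Yi),
    where weights = [w0, w1, ..., wk] and parent_values are 0/1"""
    return sigmoid(weights[0] + sum(w*y for w, y in zip(weights[1:], parent_values)))

print(logistic_cpd_prob([-2.2, 1.67, 1.26, 3.19], [0, 0, 0]))  # P(cough) with no disease
print(logistic_cpd_prob([-2.2, 1.67, 1.26, 3.19], [0, 0, 1]))  # P(cough | covid alone)
```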
9.3.2 Noisy-or
A noisy-or, for Boolean variable X with Boolean parents Y1 . . . Yk, is parametrized
by k + 1 parameters p0, p1, . . . , pk, where each 0 ≤ pi ≤ 1. The semantics is defined
as though there are k + 1 hidden variables Z0, Z1 . . . Zk, where P(Z0) = p0
and P(Zi | Yi) = pi for i ≥ 1, and where X is true if and only if Z0 ∨ Z1 ∨ · · · ∨ Zk
(where ∨ is "or"). Thus X is false only if all of the Zi are false. Intuitively, p0 is the
probability of X when all the Yi are false, each pi is a noisy (probabilistic) measure
that Yi makes X true, and only one Zi needs to be true to make X true.
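Summing out the hidden Zi gives a closed form: X is false exactly when Z0 and the Zi for the true parents are all false. A standalone sketch of the formula (the book's NoisyOR class computes the same quantity from its stored parameters):

```python
import math

def noisy_or_prob(p, parent_values):
    """P(X=true | parents) for a noisy-or with parameters p = [p0, p1, ..., pk]
    and 0/1 parent_values: P(X=true) = 1 - (1-p0) * prod over the true
    parents Yi of (1-pi)"""
    prob_all_false = (1-p[0]) * math.prod(1-pi for pi, y in zip(p[1:], parent_values) if y)
    return 1 - prob_all_false

print(noisy_or_prob([0.1, 0.3, 0.5], [0, 0]))  # only the leak p0 contributes
print(noisy_or_prob([0.1, 0.3, 0.5], [1, 1]))  # both parents true
```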
probFactors.py — (continued)
Note that not all parents need to be assigned to evaluate the decision tree; it
only needs a branch down the tree that gives the distribution.
probFactors.py — (continued)
The following shows a decision-tree representation of Example 9.18 of Poole and
Mackworth [2023]. When the Action is to go out, the probability is a function
of rain; otherwise it is a function of full.
probFactors.py — (continued)
tion (that all factors are conditional probabilities), and builds some useful data
structures.
probGraphicalModels.py — (continued)
28 class BeliefNetwork(GraphicalModel):
29 """The class of belief networks."""
30
31 def __init__(self, title, variables, factors):
32 """variables is a set of variables
33 factors is a set of factors. All of the factors are instances of CPD (e.g., Prob).
34 """
35 GraphicalModel.__init__(self, title, variables, factors)
36 assert all(isinstance(f,CPD) for f in factors), factors
37 self.var2cpt = {f.child:f for f in factors}
38 self.var2parents = {f.child:f.parents for f in factors}
39 self.children = {n:[] for n in self.variables}
40 for v in self.var2parents:
41 for par in self.var2parents[v]:
42 self.children[par].append(v)
43 self.topological_sort_saved = None
The following creates a topological sort of the nodes, where the parents of
a node come before the node in the resulting order. This is based on Kahn’s
algorithm from 1962.
probGraphicalModels.py — (continued)
45 def topological_sort(self):
46 """creates a topological ordering of variables such that the parents of
47 a node are before the node.
48 """
49 if self.topological_sort_saved:
50 return self.topological_sort_saved
51 next_vars = {n for n in self.var2parents if not self.var2parents[n]}
52 self.display(3,'topological_sort: next_vars',next_vars)
53 top_order=[]
54 while next_vars:
55 var = next_vars.pop()
56 self.display(3,'select variable',var)
57 top_order.append(var)
58 next_vars |= {ch for ch in self.children[var]
59 if all(p in top_order for p in self.var2parents[ch])}
60 self.display(3,'var_with_no_parents_left',next_vars)
61 self.display(3,"top_order",top_order)
62 assert set(top_order)==set(self.var2parents), (top_order,self.var2parents)
63 self.topological_sort_saved = top_order
64 return top_order
[Figure: the 4-chain belief network, with variables A, B, C, and D]
probGraphicalModels.py — (continued)
[Figure 9.2: The report-of-leaving belief network, with variables Tamper, Fire, Alarm, Smoke, Leaving, and Report]
Report-of-Leaving Example
The second belief network, bn_report, is Example 9.13 of Poole and Mack-
worth [2023] (http://artint.info). The output of bn_report.show() is shown
in Figure 9.2 of this document.
probExamples.py — Example belief networks
11 from variable import Variable
12 from probFactors import CPD, Prob, LogisticRegression, NoisyOR, ConstantCPD
13 from probGraphicalModels import BeliefNetwork
14
[Figure: the simple diagnosis belief network, with variables Influenza, Smokes, Coughing, and Wheezing]
probExamples.py — (continued)
Sprinkler Example
The third belief network is the sprinkler example from Pearl [2009]. The output
of bn_sprinkler.show() is shown in Figure 9.4 of this document.
probExamples.py — (continued)
[Figure 9.4: Pearl's sprinkler belief network]
74
75 bn_sprinkler = BeliefNetwork("Pearl's Sprinkler Example",
76 {Season, Sprinkler, Rained, Grass_wet, Grass_shiny, Shoes_wet},
77 {f_season, f_sprinkler, f_rained, f_wet, f_shiny, f_shoes})
probExamples.py — (continued)
107
108 p_cold_lr = Prob(Cold,[],[0.9,0.1])
109 p_flu_lr = Prob(Flu,[],[0.95,0.05])
110 p_covid_lr = Prob(Covid,[],[0.99,0.01])
111
112 p_cough_lr = LogisticRegression(Cough, [Cold,Flu,Covid], [-2.2, 1.67, 1.26, 3.19])
113 p_fever_lr = LogisticRegression(Fever, [Flu,Covid], [-4.6, 5.02, 5.46])
114 p_sneeze_lr = LogisticRegression(Sneeze, [Cold,Flu], [-2.94, 3.04, 1.79])
115
116 bn_lr1 = BeliefNetwork("Bipartite Diagnostic Network - logistic regression",
117 {Cough, Fever, Sneeze, Cold, Flu, Covid},
118 {p_cold_lr, p_flu_lr, p_covid_lr, p_cough_lr, p_fever_lr, p_sneeze_lr})
119
120 # to see the conditional probability of the logistic regression do:
121 # print(p_cough_lr.to_table())
122
123 # example from box "Noisy-or compared to logistic regression"
124 # from learnLinear import sigmoid, logit
125 # w0=logit(0.01)
126 # X = Variable("X",boolean)
127 # print(LogisticRegression(X,[A,B,C,D],[w0, logit(0.05)-w0, logit(0.1)-w0, logit(0.2)-w0, logit(0.2)-w0]).to_table(given={X:True}))
128 # try to predict what would happen (and then test) if we had
129 # w0=logit(0.01)
probGraphicalModels.py — (continued)
[Figure: posterior distributions given Report=True:
Tamper: False 0.601, True 0.399    Fire: False 0.769, True 0.231
Alarm: False 0.372, True 0.628     Smoke: False 0.785, True 0.215
Leaving: False 0.347, True 0.653]
vations on other variables. See Figure 9.9 of Poole and Mackworth [2023].
probRC.py — Search-based Inference for Graphical Models
11 import math
12 from probGraphicalModels import GraphicalModel, InferenceMethod
13 from probFactors import Factor
14
15 class ProbSearch(InferenceMethod):
16 """The class that queries graphical models using search
17
18 gm is graphical model to query
19 """
20 method_name = "naive search"
21
22 def __init__(self,gm=None):
23 InferenceMethod.__init__(self, gm)
24 ## self.max_display_level = 3
25
26 def query(self, qvar, obs={}, split_order=None):
27 """computes P(qvar | obs) where
28 qvar is the query variable
29 obs is a variable:value dictionary
30 split_order is a list of the non-observed non-query variables in gm
31 """
32 if qvar in obs:
33 return {val:(1 if val == obs[qvar] else 0)
34 for val in qvar.domain}
35 else:
36 if split_order is None:
37 split_order = [v for v in self.gm.variables
38 if (v not in obs) and v != qvar]
39 unnorm = [self.prob_search({qvar:val}|obs, self.gm.factors, split_order)
40 for val in qvar.domain]
41 p_obs = sum(unnorm)
42 return {val:pr/p_obs for val,pr in zip(qvar.domain, unnorm)}
The following is the naive search-based algorithm. It is exponential in the
number of variables, so is not very useful. However, it is simple, and helpful
to understand before looking at the more complicated algorithm used in the
subclass.
probRC.py — (continued)
51 if not factors:
52 return 1
53 elif to_eval := {fac for fac in factors
54 if fac.can_evaluate(context)}:
55 # evaluate factors when all variables are assigned
56 self.display(3,"prob_search evaluating factors",to_eval)
57 val = math.prod(fac.get_value(context) for fac in to_eval)
58 return val * self.prob_search(context, factors-to_eval, split_order)
59 else:
60 total = 0
61 var = split_order[0]
62 self.display(3, "prob_search branching on", var)
63 for val in var.domain:
64 total += self.prob_search({var:val}|context, factors, split_order[1:])
65 self.display(3, "prob_search branching on", var, "returning", total)
66 return total
probRC.py — (continued)
68 class ProbRC(ProbSearch):
69 method_name = "recursive conditioning"
70
71 def __init__(self,gm=None):
72 self.cache = {(frozenset(), frozenset()):1}
73 ProbSearch.__init__(self,gm)
74
75 def prob_search(self, context, factors, split_order):
76 """ returns sum_{split_order} prod_{factors} given assignment in context
77 context is a variable:value dictionary
78 factors is a set of factors
79 split_order: list of variables in factors that are not in context
80 """
81 self.display(3,"calling rc,",(context,factors))
82 ce = (frozenset(context.items()), frozenset(factors)) # key for the cache entry
83 if ce in self.cache:
84 self.display(3,"rc cache lookup",(context,factors))
85 return self.cache[ce]
86 elif vars_not_in_factors := {var for var in context
87 if not any(var in fac.variables
88 for fac in factors)}:
89 # forget variables not in any factor
90 self.display(3,"rc forgetting variables", vars_not_in_factors)
91 return self.prob_search({key:val for (key,val) in context.items()
92 if key not in vars_not_in_factors},
93 factors, split_order)
94 elif to_eval := {fac for fac in factors
95 if fac.can_evaluate(context)}:
96 # evaluate factors when all variables are assigned
97 self.display(3,"rc evaluating factors",to_eval)
98 val = math.prod(fac.get_value(context) for fac in to_eval)
99 if val == 0:
100 return 0
101 else:
102 return val * self.prob_search(context,
103 {fac for fac in factors
104 if fac not in to_eval},
105 split_order)
106 elif len(comp := connected_components(context, factors, split_order)) > 1:
107 # there are disconnected components
108 self.display(3,"splitting into connected components",comp,"in context",context)
109 return math.prod(self.prob_search(context,f,eo) for (f,eo) in comp)
110 else:
111 assert split_order, "split_order should not be empty to get here"
112 total = 0
113 var = split_order[0]
114 self.display(3, "rc branching on", var)
115 for val in var.domain:
116 total += self.prob_search({var:val}|context, factors, split_order[1:])
117 self.cache[ce] = total
118 self.display(2, "rc branching on", var,"returning", total)
119 return total
connected_components returns a list of connected components, where a connected
component is a set of factors and a set of variables, such that the graph that
connects variables and the factors that involve them is connected. The connected
components are built one at a time, maintaining a current connected component. At
all times factors is partitioned into 3 disjoint sets:
• component_factors containing factors in the current connected component where all factors that share a variable are already in the component
• other_factors the other factors that are not (yet) in the connected component
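As a rough sketch of the interface (a hypothetical reimplementation, not the book's incremental algorithm), connected_components can be computed by repeatedly merging factors that share an unassigned variable, returning (factors, split_order) pairs of the form used by prob_search above:

```python
def connected_components_sketch(context, factors, split_order):
    """returns a list of (factors, split_order) pairs, one per connected
    component of the factor graph restricted to unassigned variables"""
    components = []          # list of (set of factors, set of variables)
    for fac in factors:
        vars_ = {v for v in fac.variables if v not in context}
        # merge with every existing component that shares a variable
        merged_facs, merged_vars = {fac}, set(vars_)
        rest = []
        for (cf, cv) in components:
            if cv & merged_vars:
                merged_facs |= cf
                merged_vars |= cv
            else:
                rest.append((cf, cv))
        components = rest + [(merged_facs, merged_vars)]
    return [(cf, [v for v in split_order if v in cv])
            for (cf, cv) in components]
```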
probRC.py — (continued)
Testing:
probRC.py — (continued)
153 ## bn_4chv.query(A,{D:True},[C,B])
154 ## bn_4chv.query(B,{A:True,D:False})
155
156 from probExamples import bn_report,Alarm,Fire,Leaving,Report,Smoke,Tamper
157 bn_reportRC = ProbRC(bn_report) # answers queries using recursive conditioning
158 ## bn_reportRC.query(Tamper,{})
159 ## InferenceMethod.max_display_level = 0 # show no detail in displaying
160 ## bn_reportRC.query(Leaving,{})
161 ## bn_reportRC.query(Tamper,{}, split_order=[Smoke,Fire,Alarm,Leaving,Report])
162 ## bn_reportRC.query(Tamper,{Report:True})
163 ## bn_reportRC.query(Tamper,{Report:True,Smoke:False})
164
165 ## To display resulting posteriors try:
166 # bn_reportRC.show_post({})
167 # bn_reportRC.show_post({Smoke:False})
168 # bn_reportRC.show_post({Report:True})
169 # bn_reportRC.show_post({Report:True, Smoke:False})
170
171 ## Note what happens to the cache when these are called in turn:
172 ## bn_reportRC.query(Tamper,{Report:True}, split_order=[Smoke,Fire,Alarm,Leaving])
173 ## bn_reportRC.query(Smoke,{Report:True}, split_order=[Tamper,Fire,Alarm,Leaving])
174
175 from probExamples import bn_sprinkler, Season, Sprinkler, Rained, Grass_wet, Grass_shiny, Shoes_wet
176 bn_sprinklerv = ProbRC(bn_sprinkler)
177 ## bn_sprinklerv.query(Shoes_wet,{})
178 ## bn_sprinklerv.query(Shoes_wet,{Rained:True})
179 ## bn_sprinklerv.query(Shoes_wet,{Grass_shiny:True})
180 ## bn_sprinklerv.query(Shoes_wet,{Grass_shiny:False,Rained:True})
181
182 from probExamples import bn_no1, bn_lr1, Cough, Fever, Sneeze, Cold, Flu, Covid
183 bn_no1v = ProbRC(bn_no1)
184 bn_lr1v = ProbRC(bn_lr1)
185 ## bn_no1v.query(Flu, {Fever:1, Sneeze:1})
186 ## bn_lr1v.query(Flu, {Fever:1, Sneeze:1})
187 ## bn_lr1v.query(Cough,{})
188 ## bn_lr1v.query(Cold,{Cough:1,Sneeze:0,Fever:1})
189 ## bn_lr1v.query(Flu,{Cough:0,Sneeze:1,Fever:1})
190 ## bn_lr1v.query(Covid,{Cough:1,Sneeze:0,Fever:1})
191 ## bn_lr1v.query(Covid,{Cough:1,Sneeze:0,Fever:1,Flu:0})
192 ## bn_lr1v.query(Covid,{Cough:1,Sneeze:0,Fever:1,Flu:1})
193
194 if __name__ == "__main__":
195 InferenceMethod.testIM(ProbSearch)
196 InferenceMethod.testIM(ProbRC)
The following example uses the decision tree representation of Section 9.3.4
(page 208).
probRC.py — (continued)
198 from probFactors import Prob, action, rain, full, wet, p_wet
199 from probGraphicalModels import BeliefNetwork
200 p_action = Prob(action,[],{'go_out':0.3, 'get_coffee':0.7})
201 p_rain = Prob(rain,[],[0.4,0.6])
202 p_full = Prob(full,[],[0.1,0.9])
203
204 wetBN = BeliefNetwork("Wet (decision tree CPD)", {action, rain, full, wet},
205 {p_action, p_rain, p_full, p_wet})
206 wetRC = ProbRC(wetBN)
207 # wetRC.query(wet, {action:'go_out', rain:True})
208 # wetRC.show_post({action:'go_out', rain:True})
209 # wetRC.show_post({action:'go_out', wet:True})
Exercise 9.1 Does recursive conditioning split on variable full for the query
commented out above? Does it need to? Fix the code so that decision tree repre-
sentations of conditional probabilities can be evaluated as soon as possible.
Exercise 9.2 This code adds to the cache only after splitting. Implement a variant
that caches after forgetting. (What can the cache start with?) Which version works
better? Compare some measure of the search tree and the space used. Try other
alternatives of what to cache; which method works best?
∑_{var} ∏_{f∈factors} f(var).
We store the values lazily: if a value has already been computed, we use the
stored value; if not, we compute it and store it.
probFactors.py — (continued)
255 # vars.append(v)
256 Factor.__init__(self,vars)
257 self.values = {}
258
259 def get_value(self,assignment):
260 """lazy implementation: if not saved, compute it. Return saved value"""
261 asst = frozenset(assignment.items())
262 if asst in self.values:
263 return self.values[asst]
264 else:
265 total = 0
266 new_asst = assignment.copy()
267 for val in self.var_summed_out.domain:
268 new_asst[self.var_summed_out] = val
269 total += math.prod(fac.get_value(new_asst) for fac in self.factors)
270 self.values[asst] = total
271 return total
The method factor_times multiplies a set of factors that are all factors on the
same variable (or on no variables). This is the last step in variable elimination
before normalizing. It returns an array giving the product for each value of
variable.
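A minimal sketch of what such a function computes (the book's factor_times may differ in details), assuming each factor supports get_value on a {variable: value} assignment:

```python
def factor_times_sketch(variable, factors):
    """returns a list, one entry per value in variable's domain, giving the
    product of the factors evaluated at that value. Assumes each factor is
    a function of variable alone (or of no variables)."""
    prods = []
    for val in variable.domain:
        asst = {variable: val}
        prod = 1
        for fac in factors:
            prod *= fac.get_value(asst)
        prods.append(prod)
    return prods
```

Normalizing the resulting list then gives the posterior distribution over the query variable.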
probFactors.py — (continued)
43 def project_observations(self,factor,obs):
44 """Returns the resulting factor after observing obs
45
46 obs is a dictionary of {variable:value} pairs.
47 """
48 if any((var in obs) for var in factor.variables):
49 # a variable in factor is observed
50 return FactorObserved(factor,obs)
51 else:
52 return factor
53
54 def eliminate_var(self,factors,var):
55 """Eliminate a variable var from a list of factors.
56 Returns a new set of factors that has var summed out.
57 """
58 self.display(2,"eliminating ",str(var))
59 contains_var = []
60 not_contains_var = []
61 for fac in factors:
62 if var in fac.variables:
63 contains_var.append(fac)
64 else:
65 not_contains_var.append(fac)
66 if contains_var == []:
67 return factors
68 else:
69 newFactor = FactorSum(var,contains_var)
70 self.display(2,"Multiplying:",[str(f) for f in contains_var])
71 self.display(2,"Creating factor:", newFactor)
72 self.display(3, newFactor.to_table()) # factor in detail
73 not_contains_var.append(newFactor)
74 return not_contains_var
75
76 from probGraphicalModels import bn_4ch, A,B,C,D
77 bn_4chv = VE(bn_4ch)
78 ## bn_4chv.query(A,{})
79 ## bn_4chv.query(D,{})
80 ## InferenceMethod.max_display_level = 3 # show more detail in displaying
81 ## InferenceMethod.max_display_level = 1 # show less detail in displaying
82 ## bn_4chv.query(A,{D:True})
83 ## bn_4chv.query(B,{A:True,D:False})
84
85 from probExamples import bn_report,Alarm,Fire,Leaving,Report,Smoke,Tamper
86 bn_reportv = VE(bn_report) # answers queries using variable elimination
87 ## bn_reportv.query(Tamper,{})
88 ## InferenceMethod.max_display_level = 0 # show no detail in displaying
89 ## bn_reportv.query(Leaving,{})
90 ## bn_reportv.query(Tamper,{},elim_order=[Smoke,Report,Leaving,Alarm,Fire])
91 ## bn_reportv.query(Tamper,{Report:True})
92 ## bn_reportv.query(Tamper,{Report:True,Smoke:False})
93
94 from probExamples import bn_sprinkler, Season, Sprinkler, Rained, Grass_wet, Grass_shiny, Shoes_wet
95 bn_sprinklerv = VE(bn_sprinkler)
96 ## bn_sprinklerv.query(Shoes_wet,{})
97 ## bn_sprinklerv.query(Shoes_wet,{Rained:True})
98 ## bn_sprinklerv.query(Shoes_wet,{Grass_shiny:True})
99 ## bn_sprinklerv.query(Shoes_wet,{Grass_shiny:False,Rained:True})
100
101 from probExamples import bn_lr1, Cough, Fever, Sneeze, Cold, Flu, Covid
Exercise 9.3 What is the time and space complexity of the following 4 methods to generate n samples, where m is the length of dist:
The test_sampling method can be used to generate the statistics from a num-
ber of samples. It is useful to see the variability as a function of the number of
samples. Try it for a few samples and also for many samples.
probStochSim.py — (continued)
53 class SamplingInferenceMethod(InferenceMethod):
54 """The abstract class of sampling-based belief network inference methods"""
55
56 def __init__(self,gm=None):
57 InferenceMethod.__init__(self, gm)
58
59 def query(self,qvar,obs={},number_samples=1000,sample_order=None):
60 raise NotImplementedError("SamplingInferenceMethod query") # abstract
probStochSim.py — (continued)
62 class RejectionSampling(SamplingInferenceMethod):
63 """The class that queries Graphical Models using Rejection Sampling.
64
65 gm is a belief network to query
66 """
67 method_name = "rejection sampling"
68
69 def __init__(self, gm=None):
70 SamplingInferenceMethod.__init__(self, gm)
71
72 def query(self, qvar, obs={}, number_samples=1000, sample_order=None):
73 """computes P(qvar | obs) where
74 qvar is a variable.
75 obs is a {variable:value} dictionary.
76 sample_order is a list of variables where the parents
77 come before the variable.
78 """
79 if sample_order is None:
80 sample_order = self.gm.topological_sort()
81 self.display(2,*sample_order,sep="\t")
82 counts = {val:0 for val in qvar.domain}
83 for i in range(number_samples):
84 rejected = False
85 sample = {}
86 for nvar in sample_order:
87 fac = self.gm.var2cpt[nvar] #factor with nvar as child
Exercise 9.4 Change this algorithm so that it does importance sampling using a
proposal distribution that may differ from the prior. It needs sample_one to use
a different distribution and then to adjust the weight of the current sample accordingly.
For testing, use a proposal distribution that only differs from the prior for a subset of
the variables. For which variables does a different proposal distribution make
the most difference?
Resampling
Resample is based on sample_multiple but works with an array of particles.
(Aside: Python doesn't let us use sample_multiple directly, as it uses a dictionary,
and particles, represented as dictionaries, can't be keys of dictionaries.)
probStochSim.py — (continued)
204 result.append(particles[index])
205 return result
9.9.6 Examples
probStochSim.py — (continued)
Exercise 9.5 Change the code so that it can have multiple query variables. Make
the list of query variables an input to the algorithm, with the default value being
the list of all non-observed variables.
Exercise 9.6 In this algorithm, explain where it computes the probability of a
variable given its Markov blanket. Instead of returning the average of the samples
for the query variable, it is possible to return the average estimate of the probabil-
ity of the query variable given its Markov blanket. Does this converge to the same
answer as the given code? Does it converge faster, slower, or the same?
Gaussian) may not hold for cases where the predictions are bounded and often
skewed.
It is more appropriate to plot the distribution of predictions over multiple
runs. The plot_stats method plots the prediction of a particular variable (or of the
partition function) for a number of runs of the same algorithm. On the x-axis
is the prediction of the algorithm; on the y-axis is the number of runs
with a prediction less than or equal to the x-value. This is thus like a cumulative
distribution over the predictions, but with counts on the y-axis.
Note that for runs where there are no samples that are consistent with the
observations (as can happen with rejection sampling), the predicted probability
is 1.0 (as a convention for 0/0).
The parameter what specifies the query variable or, if what is "prob_ev", that the
probability of the evidence should be plotted.
Figure 9.7 shows the distributions for various methods. The figure is generated
using the first plot_mult example below. Recursive conditioning gives
the exact answer, and so is a vertical line. The others show the cumulative
prediction over 1000 runs of each method. The graph shows that, for this network
and query, likelihood weighting is closest to the exact answer.
probStochSim.py — (continued)
360 # plot_mult(methods,bn_report,Tamper,True,{Report:True,Smoke:False}, number_samples=100, number_runs=1000)
361 # plot_mult(methods,bn_report,Tamper,True,{Report:False,Smoke:True}, number_samples=100, number_runs=1000)
362
363 # Sprinkler Example:
364 # plot_stats(bn_sprinklerr,Shoes_wet,True,{Grass_shiny:True,Rained:True}, number_samples=1000)
365 # plot_stats(bn_sprinklerL,Shoes_wet,True,{Grass_shiny:True,Rained:True}, number_samples=1000)
probHMM.py — (continued)
29 # state
30 # 0=middle, 1,2,3 are corners
31 states1 = {'middle', 'c1', 'c2', 'c3'} # states
32 obs1 = {'m1','m2','m3'} # microphones
The observation model is as follows. If the animal is in a corner, it will
be detected by the microphone at that corner with probability 0.6, and will be
independently detected by each of the other microphones with a probability of
0.1. If the animal is in the middle, it will be detected by each microphone with
a probability of 0.4.
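The description above can be encoded as a table of P(microphone detects | state); the dictionary-of-dictionaries layout here is an assumption about the representation, which probHMM.py may encode differently:

```python
# P(microphone detects the animal | state), as {microphone: {state: probability}}
pobs_sketch = {
    'm1': {'middle': 0.4, 'c1': 0.6, 'c2': 0.1, 'c3': 0.1},
    'm2': {'middle': 0.4, 'c1': 0.1, 'c2': 0.6, 'c3': 0.1},
    'm3': {'middle': 0.4, 'c1': 0.1, 'c2': 0.1, 'c3': 0.6},
}
```

Each microphone detects independently given the state, so the probability of any observation vector is a product over the microphones.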
probHMM.py — (continued)
96 hmm1f1 = HMMVEfilter(hmm1)
97 # hmm1f1.filter([{'m1':0, 'm2':1, 'm3':1}, {'m1':1, 'm2':0, 'm3':1}])
98 ## HMMVEfilter.max_display_level = 2 # show more detail in displaying
99 # hmm1f2 = HMMVEfilter(hmm1)
100 # hmm1f2.filter([{'m1':1, 'm2':0, 'm3':0}, {'m1':0, 'm2':1, 'm3':0}, {'m1':1, 'm2':0, 'm3':0},
101 #     {'m1':0, 'm2':0, 'm3':0}, {'m1':0, 'm2':0, 'm3':0}, {'m1':0, 'm2':0, 'm3':0},
102 #     {'m1':0, 'm2':0, 'm3':0}, {'m1':0, 'm2':0, 'm3':1}, {'m1':0, 'm2':0, 'm3':1},
103 #     {'m1':0, 'm2':0, 'm3':1}])
104 # hmm1f3 = HMMVEfilter(hmm1)
105 # hmm1f3.filter([{'m1':1, 'm2':0, 'm3':0}, {'m1':0, 'm2':0, 'm3':0}, {'m1':1, 'm2':0, 'm3':0}, {'m1':1, 'm2':0, 'm3':1}])
106
107 # How do the following differ in the resulting state distribution?
108 # Note they start the same, but have different initial observations.
109 ## HMMVEfilter.max_display_level = 1 # show less detail in displaying
110 # for i in range(100): hmm1f1.advance()
111 # hmm1f1.state_dist
112 # for i in range(100): hmm1f3.advance()
113 # hmm1f3.state_dist
Exercise 9.7 The representation assumes that there is a list of Boolean observations.
Extend the representation so that each observation variable can have
multiple discrete values. You need to choose a representation for the model and to
change the algorithm.
9.10.2 Localization
The localization example in the book is a controlled HMM, where there is a
given action at each time and the transition depends on the action.
probLocalization.py — Controlled HMM and Localization example
11 from probHMM import HMMVEfilter, HMM
12 from display import Displayable
13 import matplotlib.pyplot as plt
14 from matplotlib.widgets import Button, CheckButtons
15
16 class HMM_Controlled(HMM):
17 """A controlled HMM, where the transition probability depends on the action.
18 Instead of the transition probability, it has a function act2trans
19 from action to transition probability.
20 Any algorithm needs to select the transition probability according to the action.
21 """
probLocalization.py — (continued)
43 class HMM_Local(HMMVEfilter):
44 """VE filter for controlled HMMs
45 """
46 def __init__(self, hmm):
47 HMMVEfilter.__init__(self, hmm)
48
49 def go(self, action):
50 self.hmm.trans = self.hmm.act2trans[action]
51 self.advance()
52
53 loc_filt = HMM_Local(hmm_16pos)
54 # loc_filt.observe({'door':True}); loc_filt.go("right"); loc_filt.observe({'door':False}); loc_filt.go("right"); loc_filt.observe({'door':True})
55 # loc_filt.state_dist
The following lets us interactively move the agent and provide observations. It shows the distribution over locations. Figure 9.8 shows the GUI obtained by Show_Localization(hmm_16pos) after some interaction.
probLocalization.py — (continued)
57 class Show_Localization(Displayable):
58 def __init__(self, hmm, fontsize=10):
59 self.hmm = hmm
60 self.fontsize = fontsize
Figure 9.8: Localization GUI after observing a door, moving right, observing no
door, moving right, and observing a door.
61 self.loc_filt = HMM_Local(hmm)
62 fig,(self.ax) = plt.subplots()
63 plt.subplots_adjust(bottom=0.2)
64 ## Set up buttons:
65 left_butt = Button(plt.axes([0.05,0.02,0.1,0.05]), "left")
66 left_butt.label.set_fontsize(self.fontsize)
67 left_butt.on_clicked(self.left)
68 right_butt = Button(plt.axes([0.25,0.02,0.1,0.05]), "right")
69 right_butt.label.set_fontsize(self.fontsize)
70 right_butt.on_clicked(self.right)
71 door_butt = Button(plt.axes([0.45,0.02,0.1,0.05]), "door")
72 door_butt.label.set_fontsize(self.fontsize)
73 door_butt.on_clicked(self.door)
74 nodoor_butt = Button(plt.axes([0.65,0.02,0.1,0.05]), "no door")
75 nodoor_butt.label.set_fontsize(self.fontsize)
76 nodoor_butt.on_clicked(self.nodoor)
77 reset_butt = Button(plt.axes([0.85,0.02,0.1,0.05]), "reset")
78 reset_butt.label.set_fontsize(self.fontsize)
79 reset_butt.on_clicked(self.reset)
80 ## draw the distribution
81 plt.subplot(1, 1, 1)
82 self.draw_dist()
83 plt.show()
84
85 def draw_dist(self):
86 self.ax.clear()
87 plt.ylim(0,1)
88 plt.ylabel("Probability", fontsize=self.fontsize)
89 plt.xlabel("Location", fontsize=self.fontsize)
90         plt.title("Location Probability Distribution", fontsize=self.fontsize)
91 plt.xticks(self.hmm.states,fontsize=self.fontsize)
92 plt.yticks(fontsize=self.fontsize)
93 vals = [self.loc_filt.state_dist[i] for i in self.hmm.states]
94 self.bars = self.ax.bar(self.hmm.states, vals, color='black')
95 self.ax.bar_label(self.bars,["{v:.2f}".format(v=v) for v in vals],
padding = 1, fontsize=self.fontsize)
96 plt.draw()
97
98 def left(self,event):
99 self.loc_filt.go("left")
100 self.draw_dist()
101 def right(self,event):
102 self.loc_filt.go("right")
103 self.draw_dist()
104 def door(self,event):
105 self.loc_filt.observe({'door':True})
106 self.draw_dist()
107 def nodoor(self,event):
108 self.loc_filt.observe({'door':False})
109 self.draw_dist()
110 def reset(self,event):
111 self.loc_filt.state_dist = {i:1/16 for i in range(16)}
112 self.draw_dist()
113
114 # Show_Localization(hmm_16pos)
115 # Show_Localization(hmm_16pos, fontsize=15) # for demos - enlarge window
116
117 if __name__ == "__main__":
118 print("Try: Show_Localization(hmm_16pos)")
probHMM.py — (continued)
116
117 class HMMparticleFilter(Displayable):
118 def __init__(self,hmm,number_particles=1000):
119 self.hmm = hmm
120 self.particles = [sample_one(hmm.indist)
121 for i in range(number_particles)]
122 self.weights = [1 for i in range(number_particles)]
123
124 def filter(self, obsseq):
125 """returns the state distribution following the sequence of
126 observations in obsseq using particle filtering.
127
128 Note that it first advances time.
129 This is what is required if it is called after previous filtering.
130 If that is not what is wanted initially, do an observe first.
131 """
132 for obs in obsseq:
133 self.advance() # advance time
134 self.observe(obs) # observe
135 self.resample_particles()
136 self.display(2,"After observing", str(obs),
137 "state distribution:",
self.histogram(self.particles))
138 self.display(1,"Final state distribution:",
self.histogram(self.particles))
139 return self.histogram(self.particles)
140
141 def advance(self):
142 """advance to the next time.
143 This assumes that all of the weights are 1."""
144 self.particles = [sample_one(self.hmm.trans[st])
145 for st in self.particles]
146
147 def observe(self, obs):
148 """reweighs the particles to incorporate observations obs"""
149 for i in range(len(self.particles)):
150 for obv in obs:
151 if obs[obv]:
152 self.weights[i] *= self.hmm.pobs[obv][self.particles[i]]
153 else:
154                     self.weights[i] *= 1-self.hmm.pobs[obv][self.particles[i]]
155
156 def histogram(self, particles):
157 """returns list of the probability of each state as represented by
158 the particles"""
159 tot=0
160 hist = {st: 0.0 for st in self.hmm.states}
161 for (st,wt) in zip(self.particles,self.weights):
162 hist[st]+=wt
163 tot += wt
164 return {st:hist[st]/tot for st in hist}
165
166 def resample_particles(self):
167 """resamples to give a new set of particles."""
168 self.particles = resample(self.particles, self.weights,
len(self.particles))
169 self.weights = [1] * len(self.particles)
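The particle filter above can be illustrated with a stand-alone sketch on a two-state HMM (the transition and observation probabilities are illustrative assumptions), using random.choices for the advance and resample steps:

```python
import random

random.seed(0)

# Two hidden states with an assumed noisy sensor.
trans = {0: {0: 0.9, 1: 0.1}, 1: {0: 0.2, 1: 0.8}}
pobs = {0: 0.1, 1: 0.9}   # P(observation True | state)

def sample_one(dist):
    """Sample a single item from an {item: prob} dictionary."""
    items = list(dist)
    return random.choices(items, weights=[dist[i] for i in items])[0]

n = 2000
particles = [sample_one({0: 0.5, 1: 0.5}) for _ in range(n)]

for obs in [True, True, True]:   # three positive sensor readings
    # advance: sample each particle's successor state
    particles = [sample_one(trans[st]) for st in particles]
    # observe: weight particles by the observation likelihood
    weights = [pobs[st] if obs else 1 - pobs[st] for st in particles]
    # resample: draw a fresh unweighted particle set
    particles = random.choices(particles, weights=weights, k=n)

# the histogram of particles approximates the filtering distribution
est = {st: particles.count(st) / n for st in (0, 1)}
```

After repeated positive readings, most of the particles end up in state 1, as the exact filter would predict.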
The following are some queries for hmm1.
probHMM.py — (continued)
189 stateseq=[]
190 for time in range(horizon):
191 stateseq.append(state)
192         newobs = {obs:sample_one({0:1-hmm.pobs[obs][state],1:hmm.pobs[obs][state]})
193                   for obs in hmm.obsvars}
194 obsseq.append(newobs)
195 state = sample_one(hmm.trans[state])
196 return stateseq,obsseq
197
198 def simobs(hmm,stateseq):
199 """returns observation sequence for the state sequence"""
200 obsseq=[]
201 for state in stateseq:
202         newobs = {obs:sample_one({0:1-hmm.pobs[obs][state],1:hmm.pobs[obs][state]})
203                   for obs in hmm.obsvars}
204 obsseq.append(newobs)
205 return obsseq
206
207 def create_eg(hmm,n):
208 """Create an annotated example for horizon n"""
209 seq,obs = simulate(hmm,n)
210 print("True state sequence:",seq)
211 print("Sequence of observations:\n",obs)
212 hmmfilter = HMMVEfilter(hmm)
213 dist = hmmfilter.filter(obs)
214 print("Resulting distribution over states:\n",dist)
• Rolling out the DBN for some time period, and using standard belief network inference. The latest time that needs to be in the rolled out network is the time of the latest observation or the time of a query (whichever is later). This allows us to observe any variables at any time and query any variables at any time. This is covered in Section 9.11.2.
• An unrolled belief network may be very large, and we might only be interested in asking about “now”. In this case we can just represent the variables “now”. In this approach we can observe and query the current variables. We can then move to the next time. This does not allow for arbitrary historical queries (about the past or the future), but can be much simpler. This is covered in Section 9.11.3.
• An initial distribution over the features “now” (time 1). This is a belief
network with all variables being time 1 variables.
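The second approach — representing only the “now” variables — can be sketched on a single Boolean feature (the persistence and observation probabilities below are illustrative assumptions, not the book's dbn1):

```python
# Keep only a distribution over the current variable and roll it forward.
p_init = {True: 0.9, False: 0.1}   # time-1 distribution
p_persist = 0.8                    # P(X_now = x | X_prev = x)
p_obs = {True: 0.7, False: 0.2}    # P(observe True | X_now)

def advance(dist):
    """Marginalize out the previous variable to get the next 'now'."""
    p_true = dist[True] * p_persist + dist[False] * (1 - p_persist)
    return {True: p_true, False: 1 - p_true}

def observe(dist, obs):
    """Condition the current distribution on an observation."""
    new = {x: dist[x] * (p_obs[x] if obs else 1 - p_obs[x]) for x in dist}
    tot = sum(new.values())
    return {x: p / tot for x, p in new.items()}

dist = p_init
dist = advance(dist)          # move to the next time step
dist = observe(dist, False)   # observe, then query the current variable
```

Only the current distribution is ever stored, so the memory use does not grow with time, at the cost of losing the ability to answer historical queries.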
42 (x,y) = position
43 position = (x-0.3, y)
44 var_prev = DBNvariable(name, domain, index='prev', position=position)
45 var_now.previous = var_prev
46 return var_prev, var_now
probDBN.py — (continued)
48 class FactorRename(Factor):
49 def __init__(self,fac,renaming):
50 """A renamed factor.
51 fac is a factor
52         renaming is a dictionary of the form {new:old} where old and new are variables,
53         where the variables in fac appear exactly once in the renaming
54 """
55 Factor.__init__(self,[n for (n,o) in renaming.items() if o in
fac.variables])
56 self.orig_fac = fac
57 self.renaming = renaming
58
59 def get_value(self,assignment):
60 return self.orig_fac.get_value({self.renaming[var]:val
61 for (var,val) in assignment.items()
62 if var in self.variables})
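Renaming lets the same factor be reused at every time step. The following stand-alone sketch shows the idea with a simplified Factor that stores an explicit table (an illustrative interface, not the book's Factor class): one transition table is reused for the time-1 to time-2 step under new variable names.

```python
# Simplified factor: a list of variables plus a {frozenset(assignment): value} table.
class Factor:
    def __init__(self, variables, table):
        self.variables = variables
        self.table = table

    def get_value(self, assignment):
        key = frozenset((v, assignment[v]) for v in self.variables)
        return self.table[key]

class FactorRename(Factor):
    def __init__(self, fac, renaming):
        """renaming maps each new variable to an old variable of fac"""
        new_vars = [n for (n, o) in renaming.items() if o in fac.variables]
        Factor.__init__(self, new_vars, None)
        self.orig_fac = fac
        self.renaming = renaming

    def get_value(self, assignment):
        # translate the new names back to the original factor's names
        return self.orig_fac.get_value(
            {self.renaming[var]: val for (var, val) in assignment.items()
             if var in self.variables})

# P(B_now | B_prev) as a flat table over ("B_prev", "B_now")
p_b = Factor(["B_prev", "B_now"], {
    frozenset({("B_prev", True), ("B_now", True)}): 0.8,
    frozenset({("B_prev", True), ("B_now", False)}): 0.2,
    frozenset({("B_prev", False), ("B_now", True)}): 0.3,
    frozenset({("B_prev", False), ("B_now", False)}): 0.7,
})
# Reuse it for the time-1 -> time-2 transition by renaming variables:
p_b12 = FactorRename(p_b, {"B_1": "B_prev", "B_2": "B_now"})
```

Looking up p_b12 under the new names delegates to the original table, so no probabilities are copied.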
probDBN.py — (continued)
71 class DBN(Displayable):
72 """The class of stationary Dynamic Belief networks.
73 * name is the DBN name
74 * vars_now is a list of current variables (each must have
75 previous variable).
76 * transition_factors is a list of factors for P(X|parents) where X
(Figures: the “Simple DBN” with variables A, B, C at times 0 and 1, and the “Animal DBN” with Position and Mic1, Mic2, Mic3 at times 0 and 1.)
103 pa = Prob(A1,[A0,B0],[[[0.1,0.9],[0.65,0.35]],[[0.3,0.7],[0.8,0.2]]])
104
105 # initial distribution
106 pa0 = Prob(A1,[],[0.9,0.1])
107 pb0 = Prob(B1,[A1],[[0.3,0.7],[0.8,0.2]])
108 pc0 = Prob(C1,[],[0.2,0.8])
109
110 dbn1 = DBN("Simple DBN",[A1,B1,C1],[pa,pb,pc],[pa0,pb0,pc0])
probDBN.py — (continued)
112 from probHMM import closeMic, farMic, midMic, sm, mmc, sc, mcm, mcc
113
114 Pos_0,Pos_1 = variable_pair("Position", domain=[0,1,2,3],
position=(0.5,0.8))
115 Mic1_0,Mic1_1 = variable_pair("Mic1", position=(0.6,0.6))
116 Mic2_0,Mic2_1 = variable_pair("Mic2", position=(0.6,0.4))
117 Mic3_0,Mic3_1 = variable_pair("Mic3", position=(0.6,0.2))
118
119 # conditional probabilities - see hmm for the values of sm,mmc, etc
120 ppos = Prob(Pos_1, [Pos_0],
121 [[sm, mmc, mmc, mmc], #was in middle
122 [mcm, sc, mcc, mcc], #was in corner 1
123 [mcm, mcc, sc, mcc], #was in corner 2
(Figure 9.11: posterior distributions for B_1, B_2, C_0 and C_2 given B_0=True and C_1=False, as displayed by show_post.)
Here are two examples. You use bn.name2var['B'][2] to get the variable
B2 (B at time 2). Figure 9.11 shows the output of the drc.show_post below:
probDBN.py — (continued)
174 # Try
175 from probRC import ProbRC
176 # bn = BNfromDBN(dbn1,2) # construct belief network
177 # drc = ProbRC(bn) # initialize recursive conditioning
178 # B2 = bn.name2var['B'][2]
179 # drc.query(B2) #P(B2)
180 #
drc.query(bn.name2var['B'][1],{bn.name2var['B'][0]:True,bn.name2var['C'][1]:False})
#P(B1|b0,~c1)
181 # drc.show_post({bn.name2var['B'][0]:True,bn.name2var['C'][1]:False})
182
221
222 def elim_vars(self,factors, vars, obs):
223 for var in vars:
224 if var in obs:
225 factors = [self.project_observations(fac,obs) for fac in
factors]
226 else:
227 factors = self.eliminate_var(factors, var)
228 return factors
Example queries:
probDBN.py — (continued)
10. Learning with Uncertainty
(Figure: belief network for coin tosses with variable P_heads; with Toss#0=heads, Toss#1=heads, Toss#2=tails observed, each remaining toss has P(heads) = 0.599.)
(Figure 10.2: plot of the Beta distribution over P(Heads) for 12 heads; 4 tails, for 3 heads; 1 tails, and for 6 heads; 2 tails.)
Figure 10.2 shows a plot of the Beta distribution (the P_heads variable in the previous belief network) given some sets of observations. This plot is produced by the following interactive tool.
learnBayesian.py — (continued)
96 def tails(self,event):
97 self.num_tails += 1
98 self.dist = [self.dist[i]*(1-self.vals[i]) for i in range(self.num)]
99 self.draw_dist()
100 def save(self,event):
101 self.saves.append((self.num_heads,self.num_tails,self.dist))
102 self.draw_dist()
103 def reset(self,event):
104 self.num_tails = 0
105 self.num_heads = 0
106 self.dist = [1/self.num for i in range(self.num)]
107 self.draw_dist()
108
109 # s1 = Show_Beta(100)
110 # sl = Show_Beta(100, fontsize=15) # for demos - enlarge window
111
112 if __name__ == "__main__":
113 print("Try: Show_Beta(100)")
10.2 K-means
The k-means learner takes in a dataset and a number of classes, and learns a mapping from examples to classes (class_of_eg) and a function that makes predictions for classes (class_prediction).
It maintains two lists that form the sufficient statistics for classifying examples and learning the classification:
• feature_sum is a list such that feature_sum[f][c] is the sum of the values of feature f for members of class c. The average value of feature f in class c is
feature_sum[f][c] / class_counts[c]
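For example, the class means can be recovered from the two lists as follows (toy numbers, with features indexed by name for readability):

```python
# Recovering class means from the k-means sufficient statistics.
class_counts = [3, 2]              # 3 examples in class 0, 2 in class 1
feature_sum = {"x": [6.0, 10.0],   # feature_sum[f][c]
               "y": [0.0, 4.0]}

# mean of feature f in class c is feature_sum[f][c] / class_counts[c]
means = {f: [feature_sum[f][c] / class_counts[c]
             for c in range(len(class_counts))]
         for f in feature_sum}
```

Only these two lists need to be stored; the examples themselves are not needed to make predictions.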
37 def distance(self,cl,eg):
38 """distance of the eg from the mean of the class"""
39 return sum( (self.class_prediction(feat,cl)-feat(eg))**2
40 for feat in self.dataset.input_features)
41
42     def class_prediction(self,feat,cl):
43         """prediction of the class cl on the feature feat"""
44 if self.class_counts[cl] == 0:
45 return 0 # arbitrary prediction
46 else:
47 return self.feature_sum[feat][cl]/self.class_counts[cl]
48
49 def class_of_eg(self,eg):
50 """class to which eg is assigned"""
51 return (min((self.distance(cl,eg),cl)
52 for cl in range(self.num_classes)))[1]
53 # second element of tuple, which is a class with minimum
distance
One step of k-means updates class_counts and feature_sum. It uses the old values to determine the classes, and so the new values for class_counts and feature_sum. At the end it determines whether these values have changed, and then replaces the old ones with the new ones. It returns an indicator of whether the values are stable (have not changed).
learnKMeans.py — (continued)
55 def k_means_step(self):
56 """Updates the model with one step of k-means.
57 Returns whether the assignment is stable.
58 """
59 new_class_counts = [0]*self.num_classes
60         # feature_sum[f][c] is the sum of the values of feature f for class c
61 new_feature_sum = {feat: [0]*self.num_classes
62 for feat in self.dataset.input_features}
63 for eg in self.dataset.train:
64 cl = self.class_of_eg(eg)
65 new_class_counts[cl] += 1
66 for feat in self.dataset.input_features:
67 new_feature_sum[feat][cl] += feat(eg)
68         stable = (new_class_counts == self.class_counts) and (self.feature_sum == new_feature_sum)
69 self.class_counts = new_class_counts
70 self.feature_sum = new_feature_sum
71 self.num_iterations += 1
72 return stable
73
74
75 def learn(self,n=100):
76 """do n steps of k-means, or until convergence"""
77 i=0
78 stable = False
79 while i<n and not stable:
80 stable = self.k_means_step()
81 i += 1
82         self.display(1,"Iteration",self.num_iterations,
83                      "class counts: ",self.class_counts," Stable=",stable)
84 return stable
85
86 def show_classes(self):
87 """sorts the data by the class and prints in order.
88 For visualizing small data sets
89 """
90 class_examples = [[] for i in range(self.num_classes)]
91 for eg in self.dataset.train:
92 class_examples[self.class_of_eg(eg)].append(eg)
93 print("Class","Example",sep='\t')
94 for cl in range(self.num_classes):
95 for eg in class_examples[cl]:
96 print(cl,*eg,sep='\t')
Figure 10.3 shows multiple runs for Example 10.5 in Section 10.3.1 of Poole and Mackworth [2023]. Note that the y-axis is the sum of squares of the differences, which is the square of the Euclidean distance. K-means can stabilize on a different assignment each time it is run. The first run with 2 classes shown in the figure was stable after the first step. The next two runs with 3 classes started with different assignments, but stabilized on the same assignment. (You cannot check whether it is the same assignment from the graph; you need to check the assignment of examples to classes.) The second run with 3 classes took two steps to stabilize, but the other only took one. Note that the algorithm only determines that it is stable with one more run.
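This behavior can be reproduced with a stand-alone sketch on toy one-dimensional data (the data, initialization, and number of classes are illustrative): the assignment stabilizes only when a step leaves the means unchanged, which takes one extra step to detect.

```python
# Stand-alone k-means on toy 1-D data, tracking assignment stability.
data = [1.0, 1.2, 0.8, 5.0, 5.3, 4.9]
k = 2
means = [data[0], data[3]]   # initialize from actual examples

def assign(x):
    """class whose mean is closest to x (squared distance)"""
    return min(range(k), key=lambda c: (x - means[c]) ** 2)

stable, steps = False, 0
while not stable and steps < 100:
    classes = [assign(x) for x in data]
    counts = [classes.count(c) for c in range(k)]
    sums = [sum(x for x, c in zip(data, classes) if c == cl)
            for cl in range(k)]
    new_means = [sums[c] / counts[c] if counts[c] else means[c]
                 for c in range(k)]
    stable = new_means == means   # stable once the means stop changing
    means = new_means
    steps += 1
```

On this data the assignment is already correct after the first step, but a second step is needed before the algorithm can report stability.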
learnKMeans.py — (continued)
109 if self.dataset.test:
110 test_errors.append(
sum(self.distance(self.class_of_eg(eg),eg)
111 for eg in self.dataset.test)
112 /len(self.dataset.test))
113 self.learn(1)
114 plt.plot(range(maxstep), train_errors,
115 label=str(self.num_classes)+" classes. Training set")
116 if self.dataset.test:
117 plt.plot(range(maxstep), test_errors,
118 label=str(self.num_classes)+" classes. Test set")
119 plt.legend()
120 plt.draw()
121
122 # data = Data_from_file('data/emdata1.csv', num_train=10,
target_index=2000) # trivial example
123 data = Data_from_file('data/emdata2.csv', num_train=10, target_index=2000)
124 # data = Data_from_file('data/emdata0.csv', num_train=14,
target_index=2000) # example from textbook
125 kml = K_means_learner(data,2)
126 num_iter=4
127 print("Class assignment after",num_iter,"iterations:")
128 kml.learn(num_iter); kml.show_classes()
129
130 # Plot the error
131 # km2=K_means_learner(data,2); km2.plot_error(10) # 2 classes
132 # km3=K_means_learner(data,3); km3.plot_error(10) # 3 classes
133 # km13=K_means_learner(data,10); km13.plot_error(10) # 10 classes
134
135 # data = Data_from_file('data/carbool.csv', target_index=2000,
one_hot=True)
136 # kml = K_means_learner(data,3)
137 # kml.learn(20); kml.show_classes()
138 # km3=K_means_learner(data,3); km3.plot_error(10) # 3 classes
139 # km3=K_means_learner(data,10); km3.plot_error(10) # 10 classes
Exercise 10.1 If there are many classes, some of the classes can become empty (e.g., try 100 classes with carbool.csv). Implement a way to put some examples into a class, if possible. Two ideas are:
(a) Initialize the classes with actual examples, so that the classes will not start empty. (Do the classes become empty?)
(b) In class_prediction, we test whether the class is empty, and make a prediction of 0 for an empty class. It is possible to make a different prediction to “steal” an example (but you should make sure that a class has a consistent value for each feature in a loop).
Make your own suggestions, and compare them with the original, and with whichever of these you think may work better.
10.3 EM
In the following definition, a class, c, is an integer in range [0, num_classes). i is the index of a feature, so feat[i] is the ith feature, and a feature is a function from tuples to values. val is a value of a feature.
A model consists of 2 lists, which form the sufficient statistics:

class_counts[c] = ∑_{t : class(t)=c} P(t)

feature_counts[i][val][c] = ∑_{t : feat[i](t)=val and class(t)=c} P(t)
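These soft counts can be sketched with a stand-alone E-step for a two-class naive Bayes mixture over Boolean features (the data and parameters are illustrative assumptions): P(class | tuple) is computed from the current model and accumulated into the two lists.

```python
# One E-step of EM for a two-class naive-Bayes mixture over two
# Boolean features; tuples are (feature0, feature1).
data = [(True, True), (True, False), (False, False)]
p_class = [0.5, 0.5]       # current class proportions
p_feat = [[0.9, 0.2],      # class 0: P(f0=True), P(f1=True)
          [0.1, 0.8]]      # class 1

def posterior(tple):
    """P(class | tuple) under the current model."""
    joint = []
    for c in range(2):
        p = p_class[c]
        for i, val in enumerate(tple):
            p *= p_feat[c][i] if val else 1 - p_feat[c][i]
        joint.append(p)
    tot = sum(joint)
    return [p / tot for p in joint]

class_counts = [0.0, 0.0]
feature_counts = [{True: [0.0, 0.0], False: [0.0, 0.0]}
                  for _ in range(2)]   # feature_counts[i][val][c]

for tple in data:
    post = posterior(tple)
    for c in range(2):
        class_counts[c] += post[c]     # class_counts[c] = sum_t P(c|t)
        for i, val in enumerate(tple):
            feature_counts[i][val][c] += post[c]
```

The class counts sum to the number of tuples, and for each feature the counts over its values sum to the class count, mirroring the two equations above.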
learnEM.py — EM Learning
11 from learnProblem import Data_set, Learner, Data_from_file
12 import random
13 import math
14 import matplotlib.pyplot as plt
15
16 class EM_learner(Learner):
17 def __init__(self,dataset, num_classes):
18 self.dataset = dataset
19 self.num_classes = num_classes
20 self.class_counts = None
21 self.feature_counts = None
The function em_step goes through the training examples, and updates these counts. The first time it is run, when there is no model, it uses random distributions.
learnEM.py — (continued)
The last step is because len(self.dataset) is a constant (independent of c). class_counts[c] can be taken out of the product, but needs to be raised to the power of the number of features, and one of them cancels.
learnEM.py — (continued)
51 def learn(self,n):
52 """do n steps of em"""
53 for i in range(n):
54             self.class_counts,self.feature_counts = self.em_step(self.class_counts,
55                                                                  self.feature_counts)
The following is for visualizing the classes. It prints the dataset ordered by the
probability of class c.
learnEM.py — (continued)
57 def show_class(self,c):
where cc is the class count and fc is the feature count. len(self.dataset) can be distributed out of the sum, and cc[c] can be taken out of the product:

= (1 / len(self.dataset)) ∑_c (1 / cc[c]^{#feats − 1}) ∏_i fc[i][feat_i(tple)][c]
Given the probability of each tuple, we can evaluate the logloss, as the negative
of the log probability:
learnEM.py — (continued)
68 def logloss(self,tple):
69 """returns the logloss of the prediction on tple, which is
-log(P(tple))
70 based on the current class counts and feature counts
71 """
72 feats = self.dataset.input_features
73 res = 0
74 cc = self.class_counts
75 fc = self.feature_counts
76 for c in range(self.num_classes):
77             res += prod(fc[i][feat(tple)][c]
78                         for (i,feat) in enumerate(feats))/(cc[c]**(len(feats)-1))
79 if res>0:
80 return -math.log2(res/len(self.dataset.train))
81 else:
82 return float("inf") #infinity
Figure 10.4 shows the training and test error for various numbers of classes for
the carbool dataset (calls commented out at the end of the code).
learnEM.py — (continued)
Exercise 10.2 For data where there are naturally 2 classes, does EM with 3 classes do better on the training set after a while than with 2 classes? Is it better on a test set? Explain why. Hint: look at what the 3 classes are. Use eml.show_class(i) for each of the classes i ∈ [0, 3).
Exercise 10.3 Write code to plot the logloss as a function of the number of classes (from 1 to, say, 30) for a fixed number of iterations. (From the experience with the existing code, think about how many iterations are appropriate.)
Exercise 10.4 Repeat the previous exercise, but use cross validation to select the
number of iterations as a function of the number of classes and other features of
the dataset.
11. Causality
11.1 Do Questions
A causal model can answer “do” questions.
The intervene function takes a belief network and a {variable: value} dictionary specifying what to “do”, and returns the belief network resulting from intervening to set each variable in the dictionary to its specified value. It replaces the conditional probability distribution, CPD (Section 9.3), of each intervened variable with a constant CPD.
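The effect of intervening can be sketched stand-alone on a tiny sprinkler-style network represented as explicit CPD tables (the structure and numbers are illustrative, not the book's bn_sprinkler): do(Sprinkler = on) replaces the sprinkler's CPD with a constant CPD.

```python
from itertools import product

# Tiny causal network Rained -> GrassWet <- Sprinkler, as explicit CPDs.
p_rained = {True: 0.3, False: 0.7}
p_sprinkler = {True: 0.4, False: 0.6}

def p_wet(rained, sprinkler):
    """P(GrassWet = True | Rained, Sprinkler)."""
    if rained and sprinkler:
        return 0.99
    if rained or sprinkler:
        return 0.9
    return 0.01

def prob_wet(do_sprinkler=None):
    """P(GrassWet = True), optionally under do(Sprinkler = value)."""
    total = 0.0
    for rained, sprinkler in product([True, False], repeat=2):
        if do_sprinkler is None:
            ps = p_sprinkler[sprinkler]
        else:
            # intervening replaces the CPD with a constant CPD
            ps = 1.0 if sprinkler == do_sprinkler else 0.0
        total += p_rained[rained] * ps * p_wet(rained, sprinkler)
    return total
```

The only change for a “do” query is the replaced CPD; the rest of the inference is unchanged, which is exactly what intervene does to the network.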
probDo.py — (continued)
(Figure: posteriors for the sprinkler network under do(Sprinkler = on): Rained True: 0.450; Grass wet True: 0.940.)
Figure 11.2: Does taking marijuana lead to hard drugs: observable variables
43
44 ### Showing posterior distributions:
45 # bn_sprinklerv.show_post({})
46 # bn_sprinklerv.show_post({Sprinkler:"on"})
47 # spon = intervene(bn_sprinkler, do={Sprinkler:"on"})
48 # ProbRC(spon).show_post({})
The following is a representation of a possible model where marijuana is a gateway drug to harder drugs (or not). Before reading the code, try the commented-out queries at the end. Figure 11.2 shows the network with the observable variables, Takes_Marijuana and Takes_Hard_Drugs.
probDo.py — (continued)
P(c) = 0.5
P(b | c) = P(b | ¬c) = 0.7 (the cab companies are equally reliable)
P(a | b) = 0.4, P(a | ¬b) = 0.2.
probCounterfactual.py — (continued)
a ≡ a_b ∨ (¬b ∧ a_0) ∨ (b ∧ a_1)
Thus a_b is the background cause of a, a_0 is the cause used when B = false, and a_1 is the cause used when B = true. Note that this is over-parametrized with respect to the belief network, using three parameters whereas an arbitrary conditional probability can be represented using two parameters.
The running example where P(a | b) = 0.4 and P(a | ¬b) = 0.2 can be represented using
or
41 IFeq(Cprime,True,SameAs(B_1),SameAs(B_0))))
42 p_Aprime = ProbDT(Aprime, [Bprime, A_b, A_0, A_1],
43 IFeq(A_b,True,Dist([0,1]),
44 IFeq(Bprime,True,SameAs(A_1),SameAs(A_0))))
45 p_b_b = Prob(B_b, [], [1,0])
46 p_b_0 = Prob(B_0, [], [0.3,0.7])
47 p_b_1 = Prob(B_1, [], [0.3,0.7])
48
49 p_a_b = Prob(A_b, [], [1,0])
50 p_a_0 = Prob(A_0, [], [0.8,0.2])
51 p_a_1 = Prob(A_1, [], [0.6,0.4])
52
53 p_b_np = Prob(B, [], [0.3,0.7]) # for AB network
54 p_Bprime_np = Prob(Bprime, [], [0.3,0.7]) # for AB network
55 ab_Counter = BeliefNetwork("AB Counterfactual Example",
56 [A,B,Aprime,Bprime, A_b,A_0,A_1],
57 [p_A, p_b_np, p_Aprime, p_Bprime_np, p_a_b, p_a_0,
p_a_1])
58
59 cbaCounter = BeliefNetwork("CBA Counterfactual Example",
60 [A,B,C, Aprime,Bprime,Cprime, B_b,B_0,B_1, A_b,A_0,A_1],
61 [p_A, p_B, p_C, p_Aprime, p_Bprime, p_Cprime,
62 p_b_b, p_b_0, p_b_1, p_a_b, p_a_0, p_a_1])
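A quick stand-alone check that the noise-variable probabilities above (reading the [False, True] value order off p_a_b, p_a_0, and p_a_1) reproduce P(a | b) = 0.4 and P(a | ¬b) = 0.2 under a ≡ a_b ∨ (¬b ∧ a_0) ∨ (b ∧ a_1):

```python
from itertools import product

# Noise-variable probabilities read off the listing above:
# P(a_b)=0 from p_a_b=[1,0], P(a_0)=0.2 from p_a_0=[0.8,0.2],
# P(a_1)=0.4 from p_a_1=[0.6,0.4].
p_noise = {"a_b": 0.0, "a_0": 0.2, "a_1": 0.4}

def p_a_given(b):
    """P(a | B=b) by summing over the noise variables."""
    total = 0.0
    for ab, a0, a1 in product([True, False], repeat=3):
        a = ab or ((not b) and a0) or (b and a1)
        if a:
            pr = 1.0
            for name, val in (("a_b", ab), ("a_0", a0), ("a_1", a1)):
                pr *= p_noise[name] if val else 1 - p_noise[name]
            total += pr
    return total
```

With the background cause a_b set to probability 0, the conditional probability given b is just P(a_1) and given ¬b is just P(a_0), matching the running example.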
Here are some queries you might like to try. The show_post queries might be
most useful if you have the space to show multiple queries.
probCounterfactual.py — (continued)
64 cbaq = ProbRC(cbaCounter)
65 # cbaq.queryDo(Aprime, obs = {C:True, Cprime:False})
66 # cbaq.queryDo(Aprime, obs = {C:False, Cprime:True})
67 # cbaq.queryDo(Aprime, obs = {A:True, C:True, Cprime:False})
68 # cbaq.queryDo(Aprime, obs = {A:False, C:True, Cprime:False})
69 # cbaq.queryDo(Aprime, obs = {A:False, C:True, Cprime:False})
70 # cbaq.queryDo(A_1, obs = {C:True,Aprime:False})
71 # cbaq.queryDo(A_0, obs = {C:True,Aprime:False})
72
73 # cbaq.show_post(obs = {})
74 # cbaq.show_post(obs = {C:True, Cprime:False})
75 # cbaq.show_post(obs = {A:False, C:True, Cprime:False})
76 # cbaq.show_post(obs = {A:True, C:True, Cprime:False})
Exercise 11.1 Consider the scenario “Bob called the first cab (C = true), was late, and Alice agrees to a second date”. What would you expect from the scenario “what if Bob called the other cab?” What does the network predict? Design probabilities for the noise variables that fit the conditional probability and also fit your expectation.
Exercise 11.2 How would you expect the counterfactual conclusion to change
given the following two scenarios that fit the story:
Figure 11.4: Firing squad belief network (figure obtained from fsq.show_post({}); e.g., Dead True: 0.118)
• The cabs are both very reliable and start at the same location (and so face the
same traffic).
• The cabs are each 90% reliable and start from opposite directions.
(a) How would you expect the predictions to differ in these two cases?
(b) How can you fit the conditional probabilities above and represent each of
these by changing the probabilities of the noise variables?
(c) How can these be learned from data? (Hint: consider learning a correlation between the taxi arrivals.) Is your approach always applicable? If not, for which cases is it applicable, and for which is it not?
Exercise 11.3 Choose two assignments of values to each of a_b, a_0 and a_1 using a ≡ a_b ∨ (¬b ∧ a_0) ∨ (b ∧ a_1), and a counterfactual query such that (a) the two assignments cannot be distinguished by observations or by interventions, and (b) the predictions for the query differ by an arbitrarily large amount (differ by 1 − ϵ for a small value of ϵ, such as ϵ = 0.1).
Instead of the tabular representation of the if-then-else structure used for the
A → B → C network above, the following uses the decision tree representation
of conditional probabilities of Section 9.3.4.
probCounterfactual.py — (continued)
Exercise 11.4 Create the network for “what if shooter 2 did or did not shoot”.
Give the probabilities of the following counterfactuals:
(a) The prisoner is dead; what is the probability that the prisoner would be dead
if shooter 2 did not shoot?
(b) Shooter 2 shot; what is the probability that the prisoner would be dead if
shooter 2 did not shoot?
(c) No order was given, but the prisoner is dead; what is the probability that
the prisoner would be dead if shooter 2 did not shoot?
Exercise 11.5 Create the network for “what if the order was or was not given”.
Give the probabilities of the following counterfactuals:
(a) The prisoner is dead; what is the probability that the prisoner would be dead
if the order was not given?
(b) The prisoner is not dead; what is the probability that the prisoner would be
dead if the order was not given? (Is this different from the prior that the
prisoner is dead, or the posterior that the prisoner was dead given the order
was not given).
(c) Shooter 2 shot; what is the probability that the prisoner would be dead if the
order was not given?
(d) Shooter 2 did not shoot; what is the probability that the prisoner would be dead if the order was given? (Is this different from the probability that the prisoner would be dead if the order was given, without the counterfactual observation?)
12. Planning with Uncertainty
A decision variable is like a random variable with a string name, and a domain, which is a list of possible values. The decision variable also includes the parents, a list of the variables whose value will be known when the decision is made. It also includes a position, which is used for plotting.
decnNetworks.py — (continued)
29 class DecisionVariable(Variable):
30 def __init__(self, name, domain, parents, position=None):
31 Variable.__init__(self, name, domain, position)
32 self.parents = parents
33 self.all_vars = set(parents) | {self}
A decision network is a graphical model where the variables can be random
variables or decision variables. Among the factors we assume there is one
utility factor. Note that this is an instance of BeliefNetwork but overrides
__init__.
decnNetworks.py — (continued)
35 class DecisionNetwork(BeliefNetwork):
36 def __init__(self, title, vars, factors):
37 """title is a string
38 vars is a list of variables (random and decision)
39 factors is a list of factors (instances of CPD and Utility)
40 """
41 GraphicalModel.__init__(self, title, vars, factors)
42 # not BeliefNetwork.__init__
43 self.var2parents = ({v : v.parents for v in vars
44 if isinstance(v,DecisionVariable)}
45 | {f.child:f.parents for f in factors
46 if isinstance(f,CPD)})
47 self.children = {n:[] for n in self.variables}
48 for v in self.var2parents:
49 for par in self.var2parents[v]:
50 self.children[par].append(v)
51 self.utility_factor = [f for f in factors
52 if isinstance(f,Utility)][0]
53 self.topological_sort_saved = None
54
55 def __str__(self):
56 return self.title
The split order ensures that the parents of a decision node are split before
the decision node, and no other variables (if that is possible).
decnNetworks.py — (continued)
58 def split_order(self):
59 so = []
60 tops = self.topological_sort()
61 for v in tops:
62 if isinstance(v,DecisionVariable):
63 so += [p for p in v.parents if p not in so]
64 so.append(v)
65 so += [v for v in tops if v not in so]
66 return so
decnNetworks.py — (continued)
(Figures: the umbrella decision network, with Weather, Forecast, Umbrella and Utility; and the fire alarm decision network, with Tamper, Fire, Report and Call among its variables.)
decnNetworks.py — (continued)
decnNetworks.py — (continued)
(Figure: the cheating decision network, with Watched, Punish, Caught1, Caught2, Grade_1, Grade_2 and Fin_Grd.)
Chain of 3 decisions
The following decision network represents a finite-stage fully-observable Markov decision process with a single reward (utility) at the end. It is interesting because the parents do not include all the predecessors. The methods we use will work without change on this, even though the agent does not condition on all of its previous observations and actions. The output of ch3.show() is shown in Figure 12.4.
decnNetworks.py — (continued)
(Figure 12.4: the 3-chain decision network, with states S0–S3, decisions D0–D2, and a single Utility node.)
244
245 # ch3.show()
246 # ch3.show(fontsize=15)
decnNetworks.py — (continued)
300 if isinstance(v,DecisionVariable)}
301 return algorithm({}, self.dn.factors, split_order)
302
303 def show_policy(self):
304 print('\n'.join(df.to_table() for df in self.opt_policy.values()))
The following is the simplest search-based algorithm. It is exponential in the number of variables, so is not very useful. However, it is simple, and helpful to understand before looking at the more complicated algorithm. Note that the above code does not call rc0; you will need to change self.rc to self.rc0 in the above code to use it.
decnNetworks.py — (continued)
376 if isinstance(var,DecisionVariable):
377             assert set(context) <= set(var.parents), f"cannot optimize {var} in context {context}"
378 maxres = -math.inf
379 for val in var.domain:
380 self.display(3,"In rc, branching on",var,"=",val)
381 newres = self.rc({var:val}|context, factors,
split_order[1:])
382 if newres > maxres:
383 maxres = newres
384 theval = val
385 self.opt_policy[var].assign(context,theval)
386 self.cache[ce] = maxres
387 return maxres
388 else:
389 total = 0
390 for val in var.domain:
391 total += self.rc({var:val}|context, factors,
split_order[1:])
392 self.display(3, "rc branching on", var,"returning", total)
393 self.cache[ce] = total
394 return total
Here is how to run the optimizer on the example decision networks:
decnNetworks.py — (continued)
(A decision node can only be maximized if the variables that are not
its parents have already been eliminated.)
decnNetworks.py — (continued)
456 """
457 self.dvar = dvar
458 self.factor = factor
459 vars = [v for v in factor.variables if v is not dvar]
460 Factor.__init__(self,vars)
461 self.values = {}
462 self.decision_fun = DecisionFunction(dvar, dvar.parents)
463
464 def get_value(self,assignment):
465 """lazy implementation: if saved, return saved value, else compute
it"""
466 new_asst = {x:v for (x,v) in assignment.items() if x in
self.variables}
467 asst = frozenset(new_asst.items())
468 if asst in self.values:
469 return self.values[asst]
470 else:
471 max_val = float("-inf") # -infinity
472 for elt in self.dvar.domain:
473 fac_val = self.factor.get_value(assignment|{self.dvar:elt})
474 if fac_val>max_val:
475 max_val = fac_val
476 best_elt = elt
477 self.values[asst] = max_val
478 self.decision_fun.assign(assignment, best_elt)
479 return max_val
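The lazy caching in get_value (compute each maximization once, keyed by a frozenset of the relevant assignment items) can be sketched standalone. The names here (cache, max_over_d, f) are illustrative only, not from decnNetworks.py.

```python
cache = {}

def max_over_d(assignment, f, d_domain):
    """Maximize f over variable 'd', memoizing on the rest of the
    assignment the way FactorMax keys its cache: a frozenset of the
    assignment's items, which is hashable and order-independent."""
    key = frozenset(assignment.items())
    if key not in cache:
        cache[key] = max(f(assignment | {'d': v}) for v in d_domain)
    return cache[key]

f = lambda a: a['x'] + (2 if a['d'] == 'hi' else 1)
v1 = max_over_d({'x': 3}, f, ['lo', 'hi'])  # computes max(3+1, 3+2) = 5
v2 = max_over_d({'x': 3}, f, ['lo', 'hi'])  # same key: answered from cache
```

A frozenset key is used rather than the dictionary itself because dictionaries are unhashable, and because two assignments with the same items in different orders should hit the same cache entry.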
decnNetworks.py — (continued)
Two state partying example (Example 12.29 in Poole and Mackworth [2023]):
class distribution(dict):
    """A distribution is an item:prob dictionary.
    Probabilities are added using add_prob.
    """
    def __init__(self,d):
        dict.__init__(self,d)

    def add_prob(self, item, pr):
        """adds a probability to a distribution.
        Like dictionary assignment, but if the item is already there, the values are summed.
        """
        if item in self:
            self[item] += pr
        else:
            self[item] = pr
        return self
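A brief usage sketch, with the class restated minimally so the snippet runs on its own:

```python
class distribution(dict):
    """An item:prob dictionary (as above); add_prob sums repeated items."""
    def add_prob(self, item, pr):
        self[item] = self.get(item, 0) + pr
        return self

d = distribution({'healthy': 0.5})
d.add_prob('sick', 0.3).add_prob('healthy', 0.2)  # chaining via return self
# d is now {'healthy': 0.7, 'sick': 0.3}
```

Because add_prob returns self, calls can be chained, and adding a probability for an existing item accumulates rather than overwrites.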
(a) the rewards and resulting state can be correlated (e.g., in the grid do-
mains below, crashing into a wall results in both a negative reward and
the agent not moving), and
(b) it represents the expected reward (e.g., a reward of 1 has the same ex-
pected value as a reward of 100 with probability 1/100 and 0 otherwise,
but these are different in a simulation).
A problem domain represents a problem as a function, result, from states
and actions into a distribution over (reward, state) pairs. It can be a subclass of
MDP because it implements R and P. A problem domain also specifies an initial
state and coordinate information used by the graphical user interfaces.
mdpProblem.py — (continued)
class ProblemDomain(MDP):
    """A ProblemDomain implements
    self.result(state, action) -> {(reward, state):probability}.
    Other pairs have probability zero.
    The probabilities must sum to 1.
    """
    def __init__(self, title, states, actions, discount,
                 initial_state=None, x_dim=0, y_dim=0,
                 vinit=0, offsets={}):
        """A problem domain
        * title is the title of the problem
        * states is the list of states
        * actions is the list of actions
        * discount is the discount factor
        * initial_state is the state the agent starts at (for simulation), if known
        * x_dim and y_dim are the dimensions used by the GUI to show the states in 2 dimensions
        * vinit is the initial value
        * offsets is an {action:(x,y)} map that specifies how actions are displayed in the GUI
        """
        MDP.__init__(self, title, states, actions, discount)
        if initial_state is not None:
            self.state = initial_state
        else:
            self.state = random.choice(states)
        self.vinit = vinit # value to reset v,q to
        # The following are for the GUI:
        self.x_dim = x_dim
        self.y_dim = y_dim
        self.offsets = offsets

    def state2pos(self,state):
        """When displaying as a grid, this specifies how the state is mapped to an (x,y) position.
        The default is for domains where the (x,y) position is the state.
        """
        return state

    def state2goal(self,state):
        """When displaying as a grid, this specifies how the state is mapped to a goal position.
        The default is for domains where there is no goal.
        """
        return None

    def pos2state(self,pos):
        """When displaying as a grid, this specifies how an (x,y) position is mapped to a state.
        The default is for domains where the (x,y) position is the state.
        """
        return pos

    def P(self, state, action):
        """Transition probability function.
        Returns a dictionary {s1:p1} such that P(s1 | state,action) = p1.
        Other probabilities are zero.
        """
        res = self.result(state, action)
        acc = 1e-6 # accuracy for test of equality
        assert 1-acc < sum(res.values()) < 1+acc, f"result({state},{action}) not a distribution, sum={sum(res.values())}"
        dist = distribution({})
        for ((r,s),p) in res.items():
            dist.add_prob(s,p)
        return dist

    def R(self, state, action):
        """Reward function R(s,a)
        returns the expected reward for doing a in state s.
        """
        return sum(r*p for ((r,s),p) in self.result(state, action).items())
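The relationship between result, P, and R can be checked by hand on a toy result dictionary (the states and numbers here are illustrative, not from any of the book's domains):

```python
# result(state, action) -> {(reward, next_state): probability}
res = {(10, 'sA'): 0.2, (0, 'sA'): 0.3, (-1, 'sB'): 0.5}

# P marginalizes out the reward, summing probabilities per next state,
# as the loop in ProblemDomain.P does with add_prob:
P = {}
for (r, s), p in res.items():
    P[s] = P.get(s, 0) + p
# P == {'sA': 0.5, 'sB': 0.5}

# R is the expected reward, as in ProblemDomain.R:
R = sum(r * p for (r, s), p in res.items())
# R == 10*0.2 + 0*0.3 + (-1)*0.5 == 1.5
```

Note that the two (reward, 'sA') outcomes collapse into one entry of P but contribute separately to R, which is why result carries more information than the (P, R) pair.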
Tiny Game
The next example is the tiny game from Example 13.1 and Figure 13.1 of Poole
and Mackworth [2023], shown here as Figure 12.5. There are 6 states and 4
actions. The state is represented as (x, y) where x counts from zero from the
left, and y counts from zero upwards, so the state (0, 0) is on the bottom-left.
The actions are upC for up-careful, upR for up-risky, left, and right. Going left
from (0, 2) results in a reward of 10 and ending up in state (0, 0); going left
from (0, 1) results in a reward of −100 and staying there. Up-risky goes up, but
with a chance of going left or right instead. Up-careful goes up, but has a reward of
−1. Left and right are deterministic. Crashing into a wall results in a reward of
−1 and staying still.
Figure 12.5: The tiny game. Six states (0,0)–(1,2) in a 2×3 grid; going left from (0,2) yields +10, and going left from (0,1) yields −100.
(Note that GridDomain means that it can be shown with the MDP GUI of
Section 12.2.3.)
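The deterministic transitions described in the text can be written as a fragment of a result function in the {(reward, state): probability} form above. This is a sketch only; the actual code in mdpExamples.py may differ, and the chance outcomes of upR and upC are omitted.

```python
def result_fragment(state, action):
    """Deterministic cases of the tiny game as stated in the text."""
    if action == 'left':
        if state == (0, 2):
            return {(10, (0, 0)): 1.0}    # collect the prize, reset to bottom-left
        if state == (0, 1):
            return {(-100, (0, 1)): 1.0}  # the trap: big penalty, stay put
        if state[0] == 0:                 # crash into the left wall
            return {(-1, state): 1.0}
        x, y = state
        return {(0, (x-1, y)): 1.0}       # ordinary deterministic move left
    raise NotImplementedError("only 'left' is sketched here")
```

For example, result_fragment((0, 2), 'left') gives {(10, (0, 0)): 1.0}, while left from (0, 0) crashes into the wall for a reward of −1.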
mdpExamples.py — (continued)