Programming Cognitive Robots - Levesque - 2019 Ed
Programming Cognitive Robots - Levesque - 2019 Ed
Programming Cognitive Robots - Levesque - 2019 Ed
Cognitive
Robots
Hector J. Levesque
Acknowledgments 13
Preliminaries
1 Introduction 15
1.1 Declarative programming . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
1.2 Scripting at a high level . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
1.3 Finding the right level of primitive actions . . . . . . . . . . . . . . . . . . . . 17
1.4 What do ERGO programs look like? . . . . . . . . . . . . . . . . . . . . . . . . 18
2 Scheme Review 21
2.1 Language overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
2.2 Basic programming . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
∗ 2.3 Advanced programming . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
2.4 The language primitives . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
2.4.1 Numbers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
2.4.2 Strings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
2.4.3 Symbols . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
2.4.4 Lists . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
2.4.5 Boolean values . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
∗ 2.4.6 Ports and channels . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
2.4.7 Hash-tables, vectors, and arrays . . . . . . . . . . . . . . . . . . . . . . 33
∗ 2.4.8 Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
2.5 Running Scheme and ERGO programs . . . . . . . . . . . . . . . . . . . . . . 35
2.6 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
3
3.2 Fluents and their states . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39
3.2.1 Fluent expressions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
3.3 Actions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
3.3.1 Parameterized actions . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42
∗ 3.3.2 Fluents with functional values . . . . . . . . . . . . . . . . . . . . . . . 43
3.4 Building a basic action theory . . . . . . . . . . . . . . . . . . . . . . . . . . . 44
3.4.1 Using abstract functions and predicates . . . . . . . . . . . . . . . . . 44
3.4.2 Testing a basic action theory . . . . . . . . . . . . . . . . . . . . . . . . 46
3.5 Basic automated planning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
3.5.1 The fox, hen, grain problem . . . . . . . . . . . . . . . . . . . . . . . . 50
∗ 3.5.2 The jealous husband problem . . . . . . . . . . . . . . . . . . . . . . . 51
3.6 Efficiency considerations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54
∗ 3.7 How this all works . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55
3.8 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56
4
5.5 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89
Projects
5
9.3.2 The JoyRide car . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 137
9.3.3 The ERGO program . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 140
9.3.4 Playing the game . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 141
9.4 Extensions and variants . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 141
Advanced Topics
6
12.4.1 When is an FSA plan correct? . . . . . . . . . . . . . . . . . . . . . . . 177
12.4.2 Generating FSA plans . . . . . . . . . . . . . . . . . . . . . . . . . . . . 178
12.4.3 The odd bar problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . 179
12.4.4 The towers of Hanoi problem . . . . . . . . . . . . . . . . . . . . . . . 181
12.4.5 The striped tower problem . . . . . . . . . . . . . . . . . . . . . . . . . 183
12.4.6 How general are the plans? . . . . . . . . . . . . . . . . . . . . . . . . 185
12.5 Offline knowledge-based programs . . . . . . . . . . . . . . . . . . . . . . . . 186
12.6 Bibliographic notes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 189
End Matter
Bibliography 201
7
Preface
Cognitive robotics
A cognitive robot (as the term will be used here) is a hardware or software agent that uses
what it knows about its world to make decisions about what to do. So cognitive robots
are to be contrasted with robots that do what they do because they have been explicitly
programmed to do so, or with robots that do what they do because they have been trained
to do so on massive amounts of data (the current fashion in much of AI).
The term cognitive robotics is due to Ray Reiter who first used it in 1993 to refer to the
study of the knowledge representation and reasoning problems faced by an autonomous
robot in a dynamic and incompletely known world. The emphasis, in other words, is
on designing and implementing robot controllers that make use of explicitly represented
knowledge in deciding what to do. It is not about robot controllers that merely solve a
certain class of problems or happen to work in a class of application domains by learning
or by some other means.
The sort of research pursued by Reiter and his colleagues was primarily theoretical in
nature, and leaned heavily on the mathematics of logic and computation. Nonetheless,
what came out of their work was a very different way of looking at programming. Instead
of thinking of a program as a specification of what should happen inside a computer, a
program could also be thought of as specifying a complex activity in the world that a
cognitive robot might choose to engage in.
The thesis of this book is that, just as it is possible to write useful Python programs
without knowing much about the mathematics of computation (Turing machines and the
like, say), it is possible to program cognitive robots without having to master the mathemat-
ical foundations proposed by Reiter and colleagues. However, the focus on programming
8
means thinking hard, not just about what needs to be computed by a cognitive robot, but
on how the required information should be represented and processed. Simple schemes
that work well enough on small or toy versions of problems may not scale well on larger,
more realistic cases. (Think of what works for calculating the first ten Fibonacci num-
bers, for instance, compared to what is needed for calculating the first hundred.) The
assumption here is that a programmer will want to make deliberate choices about the data
structures and algorithms a cognitive robot should use in keeping track of its world and
deciding what to do, and not simply rely on a one-size-fits-all default choice.
9
Overview of the book
The first two chapters of the book present preliminary material:
1. This chapter introduces the programming language used here, called ERGO, and the
ideas behind its programs. It previews a small but complete ERGO program.
2. This chapter reviews the Scheme programming language. The ERGO system uses
Scheme to do all the necessary work on numbers, strings, arrays, and so on.
The next four chapters show how to program a cognitive robot in ERGO:
3. This chapter presents the declarative part of ERGO programming, which involves
specifying the properties of the world the robot lives in and the actions that affect it.
This chapter also shows how a declarative specification can be used directly for basic
(sequential) planning.
4. This is the first of two chapters on the procedural part of ERGO programming. This
includes the usual deterministic facilities like sequence, iteration, and recursion, but
also nondeterministic facilities involving search and backtracking.
5. This is the second of two chapters on the procedural part of ERGO programming.
This one emphasizes the concurrent aspects along with the idea of performing some
tasks while monitoring and reacting to other conditions in the world. Reacting to
another agent is illustrated using minimax game playing.
6. This chapter looks at an ERGO system in its entirety and how it can control an actual
robot or software agent over TCP or by other means. The idea is that an ERGO
program will generate actions for the robot to perform, but the robot can also report
exogenous (external) actions that have occurred outside the program.
The next three chapters present possible projects in cognitive robotics and are independent
of each other:
8. In this chapter, ERGO is used to control a LEGO Mindstorm robot with motors and
sensors. Here the goal is to wheel around on a floor or table top delivering and
picking up ersatz packages at various locations.
9. This chapter explores the use of ERGO in a real-time video game. In this case, ERGO
is used to control a car that is taking a joy-ride while attempting to elude a patrol car
that is chasing it under the real-time control of the user.
The final three chapters discuss more advanced topics in cognitive robotics and again each
can be read independently:
10
10. This chapter discusses the relationship between ERGO and the mathematical foun-
dations of cognitive robotics in symbolic logic developed by Ray Reiter and his col-
leagues. The execution of a program is recast as a problem in logical reasoning.
11. This chapter and the next deal with an ERGO system that has only incomplete
knowledge of its world. This chapter generalizes programming to the case where
the system has to deal with the numerical uncertainty arising from the noise and
inaccuracies in sensors and effectors.
12. In this chapter, the sequential planning seen in Chapter 3 is generalized to deal with
incomplete knowledge. The generated plans will no longer be sequences of actions,
but more complex graph-like structures that branch and loop.
The book concludes after Chapter 12 with a short epilogue and index material. Chapters 2
through 5 include short exercises. Chapters 10 through 12 have bibliographic notes. Chap-
ters and sections marked with an asterisk (∗) are more advanced and can be skipped on
first reading without loss of continuity. Technical terms are underlined when they are first
used and indexed on Page 200.
11
1. A student will learn yet another way to think about programming, different from
issues in imperative programming, functional programming, logic programming,
concurrent programming, object-oriented programming. Programming a cognitive
robot has some aspects of all of these, but with a very different slant.
2. A student will see AI from a very different perspective than in traditional AI courses.
The book is not based on probability and machine learning, nor on logic and theorem-
proving. While it is possible to make those connections (and first steps can be found
in the advanced chapters of this book), this book takes a more hands-on route to the
idea of a cognitive agent.
3. A student will see robotics in a different light. Early computer programming in the
1950s stayed very close to the computer hardware. It took some time before the ideas
of data and procedural abstraction led to the higher-level programming languages
we now use. Similarly, most work in robotics today stays very close to the robotic
hardware. This book attempts to take a more abstract view of what a robot is and
what a robot can do.
As always, how much students actually get out of the book will depend on how much they
put into it. Our hope is that the book will excite a student into exploring more deeply the
many issues involved.
12
Acknowledgments
This book started a while back as a collaboration between Maurice Pagnucco and me. It
would never have gotten off the ground without him, and the organization and direction of
the book was something we arrived at together. He really should have been a co-author of
the book, except for the fact that he did not get a chance to do any of the writing. Instead,
he was co-opted into taking on a full-time load of administrative work at his university
which, unfortunately, he appears to excel at. My loss was their gain. But I remain totally
indebted to him for all he did contribute.
Next, I must thank my colleague and mentor, the late Ray Reiter, who was the first to
think about cognitive robotics in a systematic way, and who inspired me on this topic as
on so many others throughout my career. I was very fortunate to have been part of his
Cognitive Robotics group at the University of Toronto, and I thank him and all the other
members of that group, students, colleagues and visitors, for the ideas they planted in me.
I want to especially single out Yves Lespérance, Fangzhen Lin, Richard Scherl, Giuseppe
de Giacomo, and Sebastian Sardina for the development of GOLOG, CONGOLOG, and
INDIGOLOG, the ancestors of the ERGO programming language discussed here.
I also want to thank my research friends in Cognitive Robotics, Vaishak Belle, Jim
Delgrande, Sebastian Sardina, and Steven Shapiro, who went slogging through a draft
of the book, hunting for infelicities, even after I warned them that it was intended for
undergraduates and somewhat thin on research ideas. Are these kind folks then to blame
for any errors or confusions that remain? I think not.
The chapters 10, 11, and 12 have bibliographic sections acknowledging the contribu-
tions of others to the ideas there. But many of the earlier examples were also lifted from
the work of others. The fox, hen, grain problem and the jealous husbands problem from
Chapter 3 are classics that date back to the ninth century. The basic elevator from Chap-
ters 1 and 6 was presented in the first GOLOG paper. The reactive elevator from Chapter 5
was discussed in the first CONGOLOG paper. The grocery store example from Chapter 4
is due to Maurice Pagnucco. The squirrel world of Chapter 7 is a variant of the Monty
Karel robot world written by Joseph Bergin and colleagues. The LEGO delivery robot of
Chapter 8 was originally due to Maurice Pagnucco and me in a system called LEGOLOG.
I also want to thank the folks at MIT Press and Cambridge University Press who went
through the book carefully, and tried hard to see how they could publish it, before eventu-
ally deciding that it did not work for them. C’est la vie!
Finally, and as always, I want to thank my family and friends for their love and unwa-
vering support. “Aren’t you supposed to be retired?” they would sometimes ask, but then
were always willing to accept that, after a long career in academia, old habits die hard.
13
Preliminaries
Chapter 1
Introduction
As the title says, this is a book about programming cognitive robots. The book is not
so different from other books on programming you might see online or at your favourite
bookstore. Some of them are about programming in Python, or in Scala, or in Matlab. This
one happens to be about programming in a language called ERGO. In this chapter, we
give a general introduction to programming a cognitive robot and to ERGO. (The word
“ergo” means “therefore” in Latin, emphasizing that a cognitive robot will need to be able
to draw conclusions based on what it knows. But for reasons that will become clearer in
Chapter 10, ERGO also stands for “ERGO Reimplements GOLOG.”)
15
At the other extreme, we will be able to tell the system how we want the world to be
and let it work out what to do, like “make sure everybody in my group has a copy of the
memo,” for instance. Here the system has to analyze the current state of the world, the
desired final state of the world, and plan for itself a program of actions that will get it from
here to there. Or we may not be interested in a single goal to achieve, but in a condition to
be maintained, like “make sure everybody in my group is always made aware of my travel
plans, but only ever send them messages on this during off-hours.”
So at one extreme, the actions to perform are listed explicitly, and at the other extreme,
they are left entirely implicit. As we will see, it is actually between these two extremes
where the most interesting ERGO programming takes place. We will most often end up
specifying what we want the robot to do not as an explicit sequence of actions, nor as a
condition to achieve or to maintain, but in terms of what we call a high-level program.
1. The primitive statements of the program are not fixed in advance; they are the actions
of the BAT, and vary from application to application.
2. The system has to reason with the information provided by the BAT to keep track of
what those actions do, how they change the world or extract information from it.
3. Perhaps most significantly, an ERGO program can leave out many details, details
which will then be filled in by the system during execution.
The idea here is that ERGO will keep track of what it is doing and why, as well as what it
has found out about the world, and then use this information to sort out the details that
have not been made explicit in the program.
For example, at some point, an ERGO program might say (in effect) “If condition Q
is now true in the world, then do action A and otherwise do action B.” It will be up to
the system to know whether Q is true, and select A or B accordingly. This condition Q
typically will not be something internal to the system, like “does variable X have value
N?” or “is the text window currently visible on the screen?” but some external property of
the world that the robot knows about, like “is the door in front of you locked?” So at one
extreme, the ERGO program might have said “Perform actions C, D, then E,” and it will be
up to the system to determine how these actions change the world, including whether or
not they make the condition Q true. At the other extreme, the ERGO program might say at
some point “Now achieve condition P,” and the system will have to figure out how to do
this. (As a special case, the program might say “Now find out whether condition R is true”
and the system might need to figure out how to do that.) Or the ERGO program might
be nondeterministic and say something more like “Now do either action F or action G as
required,” where the system will need to make a reasoned (that is, non-random) choice
based on what has been done so far and what still remains for it to do.
16
So in the end, the essence of programming cognitive robots is this:
Giving the system enough knowledge about the world, its actions, and what needs to
be done for it to be able to calculate the appropriate primitive actions to perform.
Furthermore, this knowledge needs to be represented in such a way that the system can
not only complete the required reasoning, but do so efficiently enough to be practical.
17
In general, there are no hard and fast rules about what the primitive actions of an
ERGO program should be. Part of the skill of programming cognitive robots (that comes
with practice) is finding the right level of actions to use. These can be thought of as “basic
behaviours” that a robot can carry out quickly and reliably, without concern for the higher-
level goals being worked on. If a behaviour cannot be carried out sufficiently quickly,
it might be best to break it down and use something more like a “start the behaviour”
action and one or more “terminate the behaviour” actions (as discussed in Chapter 6). If
a behaviour cannot be carried out sufficiently reliably, it might be best to use a primitive
action more like “attempt to do the behaviour,” and then use sensing information to assess
the outcome (as discussed in Chapters 11 and 12).
Traditional robot programming might come to an end once all the basic behaviours of
the robot are realized in code. In cognitive robot programming, we take these primitive
actions as just the starting point for all the higher-level decision making.
instead of
For those who have never actually programmed in Scheme, but have experience with lan-
guages like Python or ML, the language is not that hard to use (once you get beyond the
parenthesized syntax), and a quick tutorial is presented in the next chapter.
Looking more closely, what does this elevator program in Figure 1.1 actually say? First,
it says that there are only two fluents, that is, two properties of the world that this elevator
needs to worry about: what floor the elevator is on (called floor) and which elevator call
buttons have been pushed (called on-buttons). In addition, the program specifies that at
the outset, the elevator is on floor 7 and buttons 3 and 5 have been pushed. This is what
the elevator knows about its world initially. (This is the simple case where the values of
these two fluents are known at the outset. In a more complex setting, a cognitive robot
may have incomplete knowledge about its world.)
18
Figure 1.1: Program file Examples/basic-elevator.scm
The next three define-action expressions specify the primitive actions that the eleva-
tor is capable of performing: up (going up to a floor), down (going down to a floor), and
turnoff (turning off a call button). Obviously different elevators will have different capa-
bilities and limitations (and a much more intricate one is presented in Chapter 5), but this
one is assumed to be able to carry out these three actions. Furthermore, in the case of going
up to floor n, the action has a prerequisite: the action can only be performed successfully
when the floor the elevator is currently on is below n: (< floor n). Finally, the action has
an effect: after it has been performed successfully, the elevator will be on floor n, that is,
the floor fluent will have value n. The down action has a similar prerequisite and effect,
and the turnoff action has an effect on which buttons are on, but no prerequisite.
The next two define expressions specify ERGO procedures of interest. The first one
go-floor says that getting to a floor means either going up, going down, or just being on
the floor, as appropriate. The second one serve-floors says that (for our purposes) the
elevator service amounts to doing two things in sequence: first, repeatedly selecting any
floor whose call button is on, getting to that floor, and turning off its call button until no call
buttons are on, and second, parking the elevator on the first floor. (Note that this program
19
does not try to minimize elevator time by choosing the floors to serve more carefully, or
deal with other minor complications like doors or passengers.)
The final line of the file says that the main thing to do overall is to find the first sequence
of actions that would constitute an execution of the serve-floor procedure and print it out
as a list. If this file is given to ERGO to execute, the output it produces might be something
like this:
So ERGO is done when it has determined the primitive actions to execute. This list of
actions could then be passed to the robot manager of an elevator (or a simulated one) for
actual execution in the world.
In this example, the ERGO system calculates everything that the elevator needs to do
in advance, a complete sequence of actions to perform. This is called offline execution. (In
some cases of offline execution, a more complex specification than a sequence is needed.)
In online execution, by contrast, ERGO calculates just the next action the robot needs to
do, and then continues only after the robot has performed that action. This has two big
advantages: first, it allows ERGO to work well in cases where it would be impractical to
calculate in advance everything that needs to be done by a large program; second, it allows
sensing information obtained by the execution of one action to help in determining what
the remaining actions should be. (As we will see, online execution requires defining pro-
grammed interfaces between ERGO and the outside world.) These two modes of execution
and others are discussed in Chapter 6.
20
Chapter 2
Scheme Review
This chapter offers a review of Scheme, the language in which ERGO is embedded, and
which provides it with all the usual programming datatypes. The presentation is aimed
at readers already somewhat familiar with Scheme or one of its cousins. The chapter has
five sections. The first presents a quick overview of the language; the second reviews the
programming basics; the third considers more advanced programming; the fourth covers
the Scheme primitives used in the example programs in this book; and the fifth summarizes
how to run Scheme (and ERGO) from the command line, and makes some brief comments
about the running time to be expected.
2. A Scheme variable is any symbol that has been given a binding by a lambda or define
(described below), and it evaluates to that binding. There are many predefined global
variables in Scheme. Most of them are bound to functions, like the variables + and
append, but some are bound to other values, like the variable pi. (The predefined
global variables that will be used in this book are covered in Section 2.4).
21
evaluate to the arguments for the function. The entire form then evaluates to the
result of applying the function to those arguments. For example, (+ (* 2 4) 5) is
a function application, where + is a variable that evaluates to the addition function,
and the forms (* 2 4) and 5 evaluate to 8 and 5 respectively, arguments that are
indeed appropriate for addition. The entire form then evaluates to 13.
4. A special form is a non-empty list whose first element is a symbol, and which is
evaluated in a special way according to that first element. The most common special
forms used in Scheme are the following:
• (quote v), where v is any symbol or list, is a special form that evaluates to
itself. This special form is almost always written as a single-quote character
followed by the v. So ’hello means the same as (quote hello).
• (if e1 e2 e3 ), where the ei are forms, is a special form that is evaluated as
follows: first e1 is evaluated; if the value is #f, then e3 is evaluated and its value
is the value of the entire form; if the value is anything else (including #t, of
course), then e2 is evaluated and its value is the value of the entire form.
• (lambda (x1 . . . xn ) e), where each xi is a distinct symbol and e is a form, is a
special form that evaluates to the function of n arguments which, when given
arguments v1 , . . . vn , returns as its value the result of evaluating e in a context
where the variable xi is bound to vi . (See examples in the next section.)
• (let ((x1 e1 ) . . . (xn en )) e) is an abbreviation for the function application
((lambda (x1 . . . xn ) e) e1 . . . en ), and is the usual way variables are given tem-
porary local bindings in Scheme programs.
• (define x e) where x is a symbol and e is a form, is a special form that is
evaluated for its effect, which is to bind the variable x to the value of the form
e. This is the usual way variables are given persistent global bindings. The
form (define ( f x1 . . . xn ) e) abbreviates (define f (lambda (x1 . . . xn ) e)),
and is the normal way of defining functions in Scheme programs.
22
> (+ 3 5)
8
(The bold here indicates what a user would type in an interactive Scheme sesssion.) So at
its most basic, interactive Scheme behaves like a prefix-notation calculator:
Note that the evaluation of forms is recursive: to get the value of (expt 2 (+ 3 5)), the
Scheme processor must (among other things) get the value of (+ 3 5), and to do this, it
must get the value of the variable +. As a numeric calculator, Scheme places no limits on
the size of integers; the form (expt 2 (expt 2 20)) will gladly fill a screen with digits.
Many Scheme forms are already in reduced form and therefore evaluate to themselves:
numbers, strings, Booleans, and quoted expressions. All the remaining Scheme forms are
either symbols or lists. Those that are symbols are variables (like the pi and the > above),
and those that are lists are either function applications or special forms.
Let us return to the (if e1 e2 e3 ) special form. This looks like a function application
but it is not because the evaluation is done in a special way: only one of the e2 and e3 forms
will be evaluated. Note also that the Boolean constant #f plays the role of falsity; any other
Scheme datum plays the role of truth:
The first form shows that no error is produced despite the (/ 10 0). The second form
illustrates how the first form in a function application need not be a symbol like +; it can
be any form that evaluates to a function.
A related special form is (case e0 [l1 e1 ] [l2 e2 ] . . . [else en ]) where the ei are forms
and the li are lists. This special form is evaluated by evaluating the e0 , finding the first list
li that contains that value as an element, and returning the value of the corresponding ei
as the value of the entire form:
23
> ((lambda (x) (+ x 5)) 6)
11
> ((lambda (x) (+ x 5)) (min 12 6 (abs -18) 24))
11
> ((lambda (x y) (* (+ x 5) y)) 4 2)
18
> ((lambda (x fn) (fn 3 x 4)) 6 +)
13
((lambda (x1 . . . xn ) e) e1 . . . en )
Note that the (expt 2 5) form is evaluated only once because of this use of a local variable.
Of course it is also useful to have global variables whose values persist. These are
introduced using the special form (define x e), where x is a symbol and e is a form:
Note that the define special form does not produce any value; it is evaluated only for its
effect. (It is also possible to reassign a variable to a new value as in traditional imperative
programming languages using a set! special form, but we will not be using this in the
programs here.) As a convenience, there is a shorthand for a define of a lambda:
(define ( f x1 . . . xn ) e)
and is the usual way new functions are defined. For example:
24
> (define (cube x) (* x x x))
> (cube 3)
27
> (define (sumsq x y)
(+ (* x x) (* y y)))
> (sumsq 3 5)
34
When used as above, define produces a global variable; however, it can also be used within
the body of a let or another define to produce local variables. For example,
(define (sumsq x y)
(define xsq (* x x))
(define ysq (* y y))
(+ xsq ysq))
behaves just like
(define (sumsq x y)
(let ((xsq (* x x)))
(let ((ysq (* y y)))
(+ xsq ysq))))
Note how the second let is nested within the first in this case.
The Scheme language uses lexically scoping, meaning that a form that is nested within
another can refer to the variables of the enclosing one. Consider a form like this:
(define (sumsq x y)
(define xsq e1 )
(define ysq e2 )
e)
The form e can use the variables x, y as well as xsq and ysq. However, if there are local
variables introduced within e1 or within e2 , they will not be visible to e. Of course both e1
and e2 will be able to use x and y (and, in addition, e2 will be able to use xsq because of the
nesting of let forms noted above). Furthermore, to allow for recursion, the variable sumsq
can be used just like x and y within e1 , e2 , and e.
This lexical scoping rule allows the recursive factorial function to be written in the
expected way: (Note: a ; indicates a Scheme comment; the rest of the line is ignored.)
> (define (fact n) ; the factorial function
(if (< n 2) 1
(* n (fact (- n 1))) ))
> (fact 8)
40320
The let special form has provisions for recursion as well. The form
(let f ((x1 e1 ) . . . (xn en )) e)
is just like the previous let except that the symbol f can also be used within e as a local
variable whose value is the value of (lambda (x1 . . . xn ) e). So, for example, here is a form
whose value is the sum of the numbers from 3 to 10:
25
> (let loop ((i 3))
(if (> i 10) 0
(+ i (loop (+ i 1))) ))
52
Temporary functions (often with names like loop) are useful even within named functions.
For example, to sum the values of an arbitrary function from a lower bound to an upper
bound, the following can be used:
Notice how the iteration is really only over a single variable and that the other variables
stay fixed. It might be clearer therefore to write something like this:
This is better. However, it still has a drawback: on the last line, the function must call the
loop function, but also remember its current position and the value of (fn i) to be able to
add it to the recursive result. This means that it must use a stack, and when the difference
between hi and low is large, Scheme may run out of memory. Here is a better version:
This one produces the same values, but in this case, the function is tail-recursive, meaning
that when loop is called recursively, that function application is the last thing that needs to
be done. Evaluation in this case does not require a growing stack. (The Scheme processor
treats tail-recursion exactly like iteration in other programming languages: assign the new
values to the variables and then branch to the top of the loop.)
Of course the difference between tail-recursion and non-tail-recursion is even more
pronounced with functions that use recursion more than once. Here is the naive version of
the Fibonacci function:
26
The first version will not be able to calculate even (fibo 100) (since this would require on
the order of 2100 arithmetic operations), while the second version will easily fill the screen
with (fibo 100000). Another way to deal with the problem of calculating the same value
repeatedly (as in the naive version) is to remember (or “memoize”) calculated values in a
data structure so that they are computed only once. This can be done easily for Fibonacci
using a list whose elements will be the Fibonacci numbers in reverse order:
In more complex cases (of what has been called dynamic programming), a more elaborate
data structure than a list might be required.
Of course, lambda expressions can also be used as the final argument to sumup:
The final argument to sumup must be a function of one argument. So we might consider
defining functions that return such functions as their values. Here is an example:
So square-times is defined to return a function of one argument. For example, the form
(square-times 5) evaluates to a function that squares its single argument and multiplies
it by five. The variable st5 evaluates to the same function. Note how the 5 passed as an
argument to the square-times function ends up being “frozen” in the st5 function. (A
function that includes values for some extra variables is sometimes called a closure.)
27
This idea is used extensively in the implementation of ERGO. It also allows Scheme to
support an easy form of object-oriented programming. For example, suppose we want to
define a class of “number pair” objects that support a variety of methods. Here is how a
generic class might be defined:
(define (mypair x y)
(lambda (msg)
(case msg
((sum) (+ x y))
((diff) (abs (- x y)))
((dist) (sqrt (+ (* x x) (* y y))))
((print) (displayln x) (displayln y)))))
Note that the body of a lambda is evaluated only when the function is actually called with
arguments. Among other things, this allows us to create objects that are potentially infinite
streams of data. For example, consider this:
(define all-evens (let loop ((i 0)) (lambda (a) (if a i (loop (+ i 2))))))
Without the lambda, this definition would generate an error attempting to call loop with
ever larger values of i. With the lambda however, we freeze the current value of loop and
i, and delay the evaluation until the function is called:
28
> (define (myarith x) (+ x 7))
> (myarith 9)
16
We might define myarith to take an extra argument to be called on the computed value:
> (define (myarith x fn) (fn (+ x 7)))
> (myarith 9 (lambda (v) v))
16
> (myarith 9 sqrt)
4
> (myarith 9 (lambda (v) ’ok))
’ok
In a sense, this version of the function myarith no longer returns a value; it does its local
computation and then asks the fn argument to handle the result.
This idea allows us to program things like the backtracking needed in ERGO in a very
convenient way. In the implementation, each ERGO primitive takes two extra functions as
arguments, a failure continuation and a success continuation. The primitive never returns
a value. In some cases, it calls the failure or success continuation directly with some
computed value. In other cases, the primitive passes the buck to a second primitive, but
with a modified failure or success continuation that will do additional work if it is ever
called. For example, in nondeterministic choice, the first primitive will compute a first
choice and a new failure continuation; another choice will be computed only if the second
primitive later fails and calls this new continuation. (This is discussed in more detail in
Section 4.6.)
2.4.1 Numbers
Many of the functions involving numbers have already been used: +, *, -, /, expt, max,
min, abs, sqrt. There are many others. The main predicates over numbers are >, <, >=, <=,
and =. The function random returns a pseudo-random number between 0 and 1.
2.4.2 Strings
There are many string-related functions in Scheme, but in this book, strings are used
mainly for displaying messages. The display function can be used to print any datum
including strings and returns no value. If the string contains a \n, a newline is printed:
> (let ((x 34)) (display "The value is ") (display x) (display ".\n"))
The value is 34.
The displayln function is just like display except that it always terminates with a \n.
More convenient is the printf function, modeled after the one in the language C:
29
> (let ((x 34) (y 13)) (printf "The values are ˜a and ˜a.\n" x y))
The values are 34 and 13.
The eprintf function is the same but sends the output to the standard error port.
2.4.3 Symbols
The function symbol? tests whether its argument is a symbol. The only other function
worth noting is the predicate eq? which tests if two symbols are identical. The predicate
equal? is used to test if two arbitrary data elements print the same. The predicate eq?
(and its derivatives, remq, memq, assq, and hasheq described below) can also be used on
Booleans and small integers (less than 2n on an n-bit machine).
2.4.4 Lists
As can be expected from its ancestry in Lisp, Scheme has a number of list primitives. The
following come directly from Lisp: cons, car, cdr, list, append, reverse. One minor
difference with Lisp is that the result of evaluation is displayed as a quoted expression:
As in Lisp, there are also functions cαr, where α is a string of up to four a or d characters.
For example, the function caddr returns the third element of a list. The function list-ref
returns the i-th element of a list for a given i (indexed starting at 0). The function null?
tests if a list is empty. (Note that the symbol nil has no special status in Scheme; the form
’() is used to refer to the empty list.) The function memq tests if a symbol is an element of
a list, while member does the same for any datum. (When these functions do not return #f,
they return the tail of the list containing the element.) The function remq deletes the first
occurrence of a symbol from a list, while remove deletes the first occurrence of any datum.
The functions remq* and remove* delete every element of one list from another. Regarding
lists and numbers, the function take returns the first n elements of a list, drop returns all
but the first n elements, and length returns the length of a list.
It is often useful to pair up symbols with other data. An association list (or alist) is a
list of the form ((s1 d1 ) ( s2 d2 ) . . . (s2 d2 )), where the si are symbols and the di are
any data. The function assq takes a symbol s and an alist as arguments, and returns the
first (si di ) whose si is s. (If the s cannot be found in the alist, assq returns #f.)
The function map applies a function to each element of a list (or lists of the same length)
and returns a list of the results:
30
The functions append-map, sum-map and product-map are similar to map except that instead
of making a list of the results, they are appended, added, or multiplied together. The
functions and-map and or-map are similar but perform Boolean operations on the results,
as explained later. The function for-each is similar but applies the given function for
effect only and returns no value. The function filter is used to produce those elements
for which the given function does not return #f.
Just as let provides a convenient alternative to the use of an inline lambda, there are
similar alternatives for mapping functions. The expression
is equivalent to
The functions for/append, for/sum, for/product, for/and, for/or, for/only and for do
the analogous job for the mapping functions append-map, sum-map, product-map, and-map,
or-map, filter and for-each respectively.
A quoted expression is any symbol or list preceded by a single quote, as in Lisp.
A backquoted expression is similar except that the backquote character ‘ is used: this
evaluates like a quoted expression except that a comma can be used to refer to the value
of a form. A comma followed by an @ symbol says that the form, which must evaluate to
a list, should be spliced into the list. Here are some examples:
31
The functions and-map and or-map provide a form of universal and existential quantifica-
tion respectively over lists. The function and-map applies a given function to each element
of a list (or to lists of the same length) and conjoins the results, stopping if any of the
applications return #f; or-map is analogous but disjoins the results, stopping if any of the
function applications return something other than #f:
As noted earlier, the expressions for/and and for/or provide convenient alternatives:
This example does a single read from the file. Obviously, multiple reads are possible while
the file is open, with data satisfying eof-object? returned at the end of a file.
Ports over TCP connections are slightly more complex. Conceptually, a program will
either be a TCP client, wanting to send requests to another source, or a TCP server, wanting
to receive requests. However, in both cases, the communication may need to be two-way:
for example, a client may send a question to the server and expect to receive an answer.
Therefore both clients and servers will use input and output ports: both open-tcp-client
and open-tcp-server return a list whose first element is an input port and whose second
element is an output port.
For example, assume that there is already a server running somewhere that is willing
to receive requests on TCP port number 8123. The following shows a simple client:
32
The open-tcp-client produces an error if there is no server waiting on port 8123. (The
second argument is the hostname and defaults to "localhost".) Data can be sent to the
server on the output port using display, and whatever answers received on the input port.
A server can be defined in a similar way:
In this case, the open-tcp-server blocks until some client makes a connection. (Again,
the hostname argument can be omitted.) Data can then be received from the client on the
input port using read, and whatever answers sent back on the output port.
Channels are FIFO queues that are similar to ports, but are typically used for syn-
chronization purposes. The form (make-channel) creates a new channel, after which the
form (channel-get chan) behaves like a read, blocking if the queue is empty, and other-
wise returning the entry at the front of the queue (and removing it). Similarly, the form
(channel-put chan datum) behaves like a display, adding the entry datum to the back
of the queue.
The hash-ref function takes an optional third argument which is the value to return if no
mapping can be found.
A new hash-table can be created from an old one: the expression (hash-set h s x) has
as value a hash-table just like h except that symbol s is now mapped to value x.
Vectors provide a similar mapping but from numbers (that is, from non-negative in-
tegers) to arbitrary data. A vector is created using the vector function. The mapping is
accessed using the vector-ref function, where the indexing starts at 0.
33
This is like the list-ref function, but with constant access time. A vector can be created
from an old one with (vector-set v i x), just like with hash-set.
Arrays provide a similar mapping but from lists of numbers to arbitrary data. An array
is created using build-array whose arguments are the dimensions (as a list) and a function
that returns the initial mapping, where indexing again starts at 0. (A one dimensional array
is like a vector.) The mapping is accessed with the array-ref function.
> (define coords (build-array ’(3 5) +))
> (array-ref coords ’(2 4))
6
An array can be created from an old one with (array-set a l x) just like with hash-set.
The functions hash-set, vector-set, and array-set make modified copies of their
first arguments. The functions hash-set*, vector-set* and array-set* behave just like
their unstarred versions, but allow multiple changes with a single copy. For example,
(hash-set* h s1 v1 ... sn vn ) has as value a hash-table that is like h but where each
symbol si is mapped to vi , evaluated in sequence. (There are Scheme functions hash-set!,
vector-set! and array-set! that actually modify existing data structures, but we will not
be using them in the examples in this book.)
∗ 2.4.8 Functions
The usual way to produce a function is with the lambda special form. However, there are
some additional features of lambda worth noting, all of which then carry over to the define
special form.
The general form is actually (lambda p e1 . . . en ) where the ei are forms. The idea is
that the function will evaluate all in the ei in sequence (for effect), returning as value the
value of the final en . (The let and case special forms are similar.)
In addition, the p need not be a list of symbols. If a single symbol is used instead,
it indicates a function that takes an arbitrary number of arguments; the symbol is then a
variable whose value is a list of the actual arguments:
> ((lambda x (append (cdr x) x)) ’a ’b ’c ’d ’e)
’(b c d e a b c d e)
The p can also be a list of symbols that ends with a . and then a symbol. In this case the
final symbol is a variable whose value is a list of all the remaining arguments (if any):
> ((lambda (x y . rest) (cons (* x y) rest)) 3 4 5 6 7 8)
’(12 5 6 7 8)
When arguments have been collected in a list, the function apply can be used to pass them
to functions that require multiple arguments:
> (apply + ’(2 3 4))
9
> ((lambda x (apply * (cddr x))) 2 3 4 5)
20
Finally, a lambda special form can define a function that takes optional arguments. These
are indicated by putting within the p a keyword, that is, a special symbol beginning with
#:, followed by a list [v e], where v is a symbol and e evaluates to its the default value:
34
> (define (test x #:my-optional-arg [y 3]) (+ x y))
> (test 5)
8
> (test 5 #:my-optional-arg 4)
9
There are no provisions for defining special forms as such in Scheme, but there is an
elaborate macro facility which will not be used in the programs in this book. (It is used
extensively in the implementation.)
racket -l ergo -i
would be used. This file would normally contain a number of define expressions to
be used in the interactive session, and perhaps some definitions that use the ERGO primi-
tives discussed in the chapters to follow. The expression (include "my-other-file.scm")
within this file can be used to load additional user files.
To load that same file and then evaluate (my-main-function 3 5) in non-interactive
mode (with standard input and output for read and display), the following is used:
The flag -m can be used as an abbreviation for -e ’(main)’. The function main of no
arguments would then need to be defined somewhere in the file myfile.scm. (This function
can also be defined to take arguments, in which case it will be given the command-line
arguments after -m as strings.) Typically, it would call one of the ERGO top-level functions
like ergo-do or ergo-simplan as defined in later chapters.
In developing Scheme and ERGO programs, a programmer needs to keep in mind what
to expect in terms of running time, to get a sense of when extra care will be required in the
programming. To this end, we present a very rough guideline for those ERGO programs
that perform some sort of search (as many of them will end up doing):
So, for example, searching a binary tree of depth twenty in ERGO will not be a problem,
whereas searching one of depth thirty will require extra work.
As a concrete example, consider the “fox, hen, grain” puzzle presented in the next
chapter. As a planning problem, the solution requires seven actions, and since there are
35
only four possible actions to consider at each step, the total search space is well under a
million. On the other hand, for the “jealous husband” puzzle in that same chapter, because
a solution requires eleven actions and there can be as many as twenty possible actions to
consider at each step, some attention is required. (Sections 3.6 and 4.5 discuss efficiency
considerations in more detail.)
2.6 Exercises
1. Write a tail-recursive version of the factorial function. Compare it to the non-tail-
recursive one given in the text. Is there any reason to prefer one over the other?
2. Write a recursive definition of the map function. Do the same for the sum-map func-
tion. Noting the similarities, write a recursive function that generalizes both. (It is
called foldl in Scheme.) Define append-map in terms of this function, and explain
why and-map cannot be defined in this way.
3. One way to deal with floating-point numbers of arbitrarily large precision in Scheme
is to use arbitrarily large integers: multiply the floating-point inputs by some large
power of 10, do all the calculations over integers, and then reinterpret the final output
as a floating-point number. Use this idea to calculate the first 1000 digits of the square
root of 2 using Newton’s method.
4. Languages like Scheme are often thought of as unsuitable for floating-point calcu-
lations, for various reasons. But some of these number problems lend themselves
nicely to Scheme solutions. Write a function that takes four arguments, an error tol-
erance, a function f of n arguments, a list of of n lower bounds, and a list of n upper
bounds, and computes using the Monte Carlo method the definite integral of f in
the area defined by the given bounds to within the given tolerance.
36
Fluents, Actions, Programs
Chapter 3
What makes cognitive robotics different from robotics more generally is the cognition. A
cognitive robot does not simply act and react to the world around it; rather, it thinks about
that world and uses what it knows to help it decide what actions to perform.
In general, of course, the knowledge of a robot is not expected to be fixed in advance.
A robot might not know what is inside a box, for example, but may be able to find out by
looking inside. For now, however, it will be convenient to make a very strong assumption:
anything that the robot needs to know to do its job will be given to it at the outset. There will be
no need to deal with sensing or perception since there is no additional information for the
robot to acquire as it tracks its changing world. (This assumption of complete knowledge
will be relaxed in later chapters.)
• a primitive action (or action, for short) is an event that changes some fluents; and
The idea is that the world starts in some initial state, like the one depicted in the figure,
where each fluent has a certain value. Then, as actions occur, the state of the world changes,
and the fluents come to have different values.
A specification of all the fluents, actions, and states of a dynamic world is called a
basic action theory, or BAT for short. In the simplest case, this specification is formulated
38
Figure 3.1: A simple robot world
cc
@
@
@
Room4
in ERGO using two functions: define-fluents and define-action. The best way to get
a sense of what is possible is to look at some examples. (As a programming language,
ERGO can be thought of as Scheme with a few extra predefined functions and special
forms. These are indexed on Page 198. A one-page summary of ERGO as a whole is
included with the software distribution and reprinted on Page 199.)
(define-fluents
door1-state ’closed
door2-state ’open
door3-state ’open
box1-location ’room1
box2-location ’room4
robot-location ’room1)
This declaration says that a state of the world can be described using six fluents of the
given names and whose values in the initial state are as given. In general, an ERGO BAT
will contain one or more declarations of the form
where each fluenti should be a symbol, and each inii should be a Scheme form whose value
is the initial value of the fluent. (A single fluent can be defined with define-fluent.)
It is often convenient to use Boolean values or numbers (instead of symbols like open
or closed) to represent a state using something more like this:
39
(define-fluents
door1-open? #f door2-open? #t door3-open? #t
box1-room-num 1 box2-room-num 4 robot-room-num 1)
(It is customary, though not necessary, to place a ? at the end of Boolean-valued items.)
But any Scheme value can be used to represent the value of a fluent. So, for example, the
state of the doors might also be represented by a list of the doors that are open, and the
locations of the objects might be represented by a hash-table:
(define-fluents
open-doors ’(door2 door3)
location (hasheq ’box1 ’room1 ’box2 ’room4 ’rob ’room1))
In this case, a single fluent is used to represent what is known about the location of all the
objects: the value of the fluent location is a hash-table that maps any object (a box or the
robot rob) to the room where it is located. (The value of a fluent can even be a Scheme
function, as discussed below in Section 3.3.2.)
(* box2-room-num 5)
is an fexpr whose value in that state is 5, 10, 15, or 20. If open-doors is a fluent whose
value in a state is the list (door2 door3), then
is an fexpr whose value in that state is #f. If location is the fluent defined as a hash-table
above, then
3.3 Actions
In ERGO, a state of the world is considered to change only as the result of an action. An
action is declared by indicating its prerequisite (or precondition), that is, a condition that
must be true before the action can be performed, and its effects, that is, the fluents changed
by the actions and what their new values will be.
40
As a very simple example, suppose that we have the fluents door1-state, box1-loc,
and rob-loc, described above. Imagine that there is an action push12! that can push box1
from room1 to room2 under the following conditions: door1 must be open, and both box1
and the robot must be located in room1. This action can be defined using the function
define-action:
(define-action push12!
#:prereq (and (eq? door1-state ’open)
(eq? box1-loc ’room1)
(eq? rob-loc ’room1))
box1-loc ’room2
rob-loc ’room2)
(It is customary, though not necessary, to place a ! at the end of the name of an action.)
The general form of a define-action expression is this:
(define-action action-name #:prereq fexpr fluent1 fexpr1 . . . fluentn fexprn )
The idea is that the defined action can only be performed in states where the #:prereq
fexpr is true, and it changes the state of the world in such a way that fluenti takes on the
value of fexpri . The fluents named must be among those defined by the define-fluents
declarations in effect, but the action need not mention all of them. Those that are not
named are considered to be unaffected by the action. For the push12! action above, the
state of the doors and the location of box2 is unchanged by the action.
An ERGO program will typically define many actions. Here is one that opens door1:
(define-action open1!
#:prereq (or (eq? rob-loc ’room1) (eq? rob-loc ’room2))
door1-state ’open)
In this case, the action is considered to be possible only when the robot is located in room1
or room2. (However, note that this version does not require the door to be closed initially.)
Here is a variant that toggles the state of door1:
(define-action toggle1!
#:prereq (or (eq? rob-loc ’room1) (eq? rob-loc ’room2))
door1-state (if (eq? door1-state ’closed) ’open ’closed))
If door states are represented using Boolean values (where true means open), the toggle
action can be represented more concisely like this:
(define-action toggle1!
#:prereq (or (eq? rob-loc ’room1) (eq? rob-loc ’room2))
door1-open? (not door1-open?))
Note that within the fexpr (not door1-open?), the door1-open? refers to the state of the
door prior to the action. So the effect of the toggle1 action is similar to an assignment:
X := not(X)
If the state of the doors is represented by a list of all the open doors, the action to open
door1 can be defined like this:
41
(define-action open1!
#:prereq (or (eq? rob-loc ’room1) (eq? rob-loc ’room2))
open-doors (if (memq ’door1 open-doors)
open-doors
(cons ’door1 open-doors)))
Again the open-doors within the fexpr refers to the state just before the action. This version
checks to see if the door is already open before adding it to the list. The alternative is to
imagine an open action that can only be performed when the door is closed initially:
(define-action open1!
#:prereq (and (not (memq ’door1 open-doors))
(or (eq? rob-loc ’room1) (eq? rob-loc ’room2)))
open-doors (cons ’door1 open-doors))
In this case, the action would fail if attempted in a state where the door was already open.
42
(define-action (open! d)
door1-state (if (eq? d ’door1) ’open door1-state)
door2-state (if (eq? d ’door2) ’open door2-state)
door3-state (if (eq? d ’door3) ’open door3-state))
This says that the open! action affects the value of all three door fluents, but unless the
argument d is the same as the name of the fluent, the new value is the same as the old.
Finally, to see an example of a more complex action, here is a representation of a
generic action that pushes a box b into a room rm, where locations are represented using a
hash-table location, and where door states are represented using a list of open doors:
;; push box b into room rm
(define-action (push-box! b rm)
#:prereq (let ((rloc (hash-ref location ’rob)))
(and (eq? rloc (hash-ref location b))
(not (eq? rloc rm))
(for/or ((d ’(door1 door2 door3)))
(and (memq d open-doors)
(connected? d rloc) (connected? d rm)))))
location (hash-set* location ’rob rm b rm))
The prerequisite for this action checks that the robot and the box b start in the same room,
that this room is not the same as the destination room rm, and that there is an open door
connecting the two rooms. The only fluent affected by the action is the location hash-
table, but the action uses hash-set* to change the location of both the robot and the box.
43
(define-action (open! d)
#:prereq (not (open? d))
open? (lambda (d*) (or (eq? d* d) (open? d*))))
Using an abbreviation like this (as in the BAT of the figure) has the advantage of isolating
the ERGO program from the details of the fluent representation. Changing the represen-
tation then requires changing only the definition of this predicate and those actions that
affect the fluent. Of course, we still have the option of defining an abstract function like
44
Figure 3.2: Program file Examples/house-bat.scm
The main point again is to define abstract predicates and functions as necessary, and to
avoid as much as possible concrete fluents like location-table in ERGO programs.
Once the fluents, the abbreviations, and the three actions are defined, a function called
show-house is defined. For debugging purposes, it is always a good idea to write a function
that displays all the values of the fluents in some convenient form.
45
3.4.2 Testing a basic action theory
There are some special ERGO functions that can be used to test the workings of a BAT
in an interactive Scheme session: legal-action?, change-state, save-state-excursion,
and display-execution. These functions are useful for debugging and are typically not
used in working ERGO programs.
After loading a BAT into Scheme, the first thing to observe is that each fluent and each
primitive action ends up as a global variable. For example, we can define a function that
returns the current location of the robot:
The show-house function defined in the BAT can be used to display all the fluent values:
> (show-house)
The open doors are (door2 door3).
Box1 is in room1. Box2 is in room4. Rob is in room1.
Of course there is more than one state of the world that is of interest. States other than the
initial one are the result of performing actions. The actions that take arguments end up as
globally-defined functions that evaluate their arguments normally:
The value returned is always a list whose first element is the action. Before performing an
action, the ERGO function legal-action? can test if its prerequisite is satisfied in a state:
The value returned by legal-action? is the value of the prerequisite fexpr of the action.
(The list ’(room1 room2) is returned instead of a simple #t here because of the use of memq
in the function connected? defined in the BAT.) To actually change the state of the world
with an action, the ERGO function change-state can be used:
46
> (change-state (goto! ’room1))
> (show-house)
The open doors are (door1 door2 door3).
Box1 is in room2. Box2 is in room4. Rob is in room1.
When testing a BAT, it is often convenient to perform some state changes, and then to
restore the state of the world to what it was at the outset. This can be accomplished using
save-state-excursion. It takes as its arguments some expressions that are evaluated in
sequence for their effect. So, for example, starting in the initial state, we have
> (show-house)
The open doors are (door2 door3).
Box1 is in room1. Box2 is in room4. Rob is in room1.
> (save-state-excursion
(change-state (open! ’door1))
(change-state (push-box! ’box1 ’room2))
(show-house))
The open doors are (door1 door2 door3).
Box1 is in room2. Box2 is in room4. Rob is in room2.
> (show-house)
The open doors are (door2 door3).
Box1 is in room1. Box2 is in room4. Rob is in room1.
It is sometimes useful to track the effects of a sequence of actions on the state of the world.
The ERGO function display-execution takes as its arguments a function of no arguments
(used to display states), and a list of actions to perform. For example, using this function
starting from the initial state, we get the following:
47
Like save-state-excursion, the display-execution function restores the state of the
world to what it was just before it was called.
Once a BAT is found to be behaving properly, it can be included as part of an ERGO
program (as discussed in the next chapter), or used for other purposes, such as for auto-
mated planning, which we now turn to.
48
(define all-actions
(append
(map open! all-doors)
(map goto! all-rooms)
(for/append ((b ’(box1 box2)))
(for/list ((rm all-rooms)) (push-box! b rm)))))
This is a list with the seven actions above and eight push! actions (for two boxes × four
rooms). To define the goal of getting all the objects into Room2, we can use this:
(define (my-goal) (for/and ((o all-objects)) (eq? (location o) ’room2)))
So my-goal is defined as a function of no arguments, as required for a goal. We can use it
as the first argument to ergo-simplan to find a plan to achieve the goal:
> (ergo-simplan my-goal all-actions)
Plan found after 1 ms.
’((open! door1) (push-box! box1 room2) (goto! room4) (push-box! box2 room2))
The four step plan returned is the solution. We can examine how this solution works more
closely using display-execution:
> (display-execution show-house (ergo-simplan my-goal all-actions))
Plan found after 1 ms.
The first state is:
The open doors are (door2 door3).
Box1 is in room1. Box2 is in room4. Rob is in room1.
Executing action (open! door1). The resulting state is:
The open doors are (door1 door2 door3).
Box1 is in room1. Box2 is in room4. Rob is in room1.
Executing action (push-box! box1 room2). The resulting state is:
The open doors are (door1 door2 door3).
Box1 is in room2. Box2 is in room4. Rob is in room2.
Executing action (goto! room4). The resulting state is:
The open doors are (door1 door2 door3).
Box1 is in room2. Box2 is in room4. Rob is in room4.
Executing action (push-box! box2 room2). The resulting state is:
The open doors are (door1 door2 door3).
Box1 is in room2. Box2 is in room2. Rob is in room2.
That was the final state.
We can generalize my-goal to one of getting all the objects into an arbitrary room:
> (define (all-in rm) (for/and ((o all-objects)) (eq? (location o) rm)))
> (ergo-simplan (lambda () (all-in ’room3)) all-actions)
Plan found after 3 ms.
’((open! door1) (push-box! box1 room2) (goto! room4) (push-box! box2 room2)
(push-box! box1 room3) (goto! room2) (push-box! box2 room3))
49
Figure 3.3: Program file Examples/PlanningExamples/farmer.scm
Note that ergo-simplan requires a function of no arguments as its first argument. This is
why the lambda is needed in this case, but was not needed with my-goal above.
A farmer has a fox, a hen, and a bushel of grain that he wants to transfer from
one side of a river to the other using a boat that can carry at most one of them
(plus himself) at a time. If the farmer leaves the fox alone with the hen, it will
eat the hen; if the hen is left alone with the grain, it will eat the grain.
50
Figure 3.4: The solution to the fox, hen, grain problem
passenger are on opposite sides of the river. The effect of the action is to move both the
farmer and the passenger to the other side (using hash-set* on the locs fluent).
One requirement in this puzzle is that the hen must never be left alone with either
the fox or the grain. One way to deal with this would be to change the prerequisite of
cross-with! to ensure that only crossings that are safe for both the grain and the hen
are considered possible. A cleaner way (used in the BAT of Figure 3.3) is to make use of
the optional #:prune argument of ergo-simplan, whose value should be a function of no
arguments. The idea is that ergo-simplan rejects any plan that attempts to pass through
a state where the prune function returns true. So here we need only define unsafe? to
formalize the safety requirement and use it as the prune argument.
The main procedure of the program produces the output shown in Figure 3.4. As can
be seen, the farmer first brings the hen to the right side, then returns alone to get the fox,
then brings the hen back to the left side, then brings the grain to the right side, and then
returns alone to get the hen for the final crossing.
51
Figure 3.5: Program file Examples/PlanningExamples/jealous-bat.scm
A basic action theory for this is shown in Figure 3.5. The fluents and actions in this BAT
are similar to those for the fox, hen, and grain problem, where the location of the boat is
kept separate from the location of the six people. The goal will be the same: get all the
objects from the left bank to the right bank. A complete program that loads this BAT and
uses ergo-simplan to find a plan is shown in Figure 3.6. (To keep the files small, two files
are used, one for the BAT, and one for the main program.) The jealousy constraint plays a
role like the safety constraint from before. In this case, a state is considered “bad” if some
woman is not on the same bank as her husband but is on the same bank as another man.
Much of the difficulty in actually solving this problem is due to the large number of
actions to consider. The global variable all-double-crossings is defined as the list of
all cross-double! actions with two people such that can-boat? is true and where the
second person p2 in the boat occurs later in the list of people than the first person p1. (The
can-boat? predicate is used to ensure that we do not consider a boating trip involving a
52
Figure 3.6: Program file Examples/PlanningExamples/jealous-main.scm
woman and a man who is not her husband. The memq is used to ensure that we do not
consider pairs of people twice.)
Even with these restrictions, the space of potential plans is still huge. It will take
ergo-simplan several seconds to find a plan even on a fast computer. A big part of the
problem is that the planner will consider actions that undo previous actions, for example,
a person crossing the river alone and then immediately returning.
To deal with this, the #:loop? optional argument to ergo-simplan can be used. When
this flag is true, the planner rejects any plan that would pass through the same state twice.
With this, we quickly get the following output, showing the required eleven actions:
Plan found after 72 ms.
’((cross-double! m1 w1) (cross-single! m1) (cross-double! w2 w3)
53
(cross-single! w1) (cross-double! m2 m3) (cross-double! m2 w2)
(cross-double! m1 m2) (cross-single! w3) (cross-double! w1 w2)
(cross-single! m3) (cross-double! m3 w3))
(Not to take this example too seriously or anything, but we might prefer a solution where
the jealous husbands all end up on one side of the river without a boat, and the women
have all gone off to find better husbands!)
54
part of the overall state. The disadvantage is that the representation for parameterized
actions can be cumbersome.
At the other extreme, there might be a single fluent that represents all aspects of the
changing world. For example, we might use a single hash-table state that maps doors
to their open/closed status and objects to their locations. The advantage is that generic
actions are easily defined to change any aspect of the state. In fact, there can be a single
action change! that updates any aspect of the state according to its arguments. The dis-
advantage is that the entire state would need to be copied when anything changes. (This
copying can be reduced by using a more regressive representation, but that would lead to
troubles of its own.)
Between these two extremes, there are representations like the one presented above,
where all the door states are represented by one fluent and all the object locations are
represented by a second fluent. This is a progressive representation, and should it lead to
too much copying after an action (because there are too many doors or too many boxes),
the solution would be to partition the boxes or the doors in some way and use more than
one fluent for each.
The function legal-action? works by looking for this function in the prereq-table for
the given action and calling it.
The state change produced by an action is handled in a similar way. In the simplest
case, (define-action name fluent fexpr) causes the following:
So the function associated with the action changes the state to a new one where some of
the fluents have new values according to the given fexprs. The function change-state
works by looking for this function in the effect-table for the given action and calling it.
Turning now to the planner, the code for the ergo-simplan procedure is shown in
Figure 3.7. It works by what is called iterative deepening. The idea is that ergo-simplan
55
Figure 3.7: A simple iterative deepening planner
first searches for a plan of length 0; failing this, it searches through all plans of length 1;
failing this, it searches through all plans of length 2, and so on up to a bound (which is a
third optional argument to ergo-simplan). It does this by calling a local function simp*
repeatedly within a loop with an increasing value of n. Although it may seem wasteful on
failing to find a plan of length n, to search from scratch for a plan of length (n + 1), it has
been found in practice that the time spent looking for plans of length n or less is actually
quite small compared to the time required to go through the plans of length (n + 1).
The simp* procedure within ergo-simplan does all the work. It receives as arguments
the number of actions n to consider in a plan, the history h of actions performed so far,
and a list l of the states encountered so far (in case the #:loop? parameter is used). If there
are no actions to consider (that is, n = 0), it simply checks if the goal condition is currently
satisfied and if so, returns h in reverse order. Otherwise, after checking the #:loop? and
#:prune conditions, it tries to find (using for/or) an action in the given list of actions
whose prerequisite is currently satisfied (using legal-action?) and such that simp* is
satisfied in the state that results from performing that action (using change-state), with
(n − 1) actions left to consider.
3.8 Exercises
1. Use ergo-simplan in the house-bat domain to find the shortest sequence of robot
actions that gets two boxes into the same room.
2. Write a solution to the fox, hen, grain problem that does not use the #:prune optional
argument, but instead uses a modified prerequisite for the cross-with! action as
suggested in the text.
3. An early test domain for planning programs was the blocks world, where a robot
would manipulate blocks on a tabletop with actions like picking up a block (off the
table or off another block) and putting a block down (on the table or onto another
block). A planning problem was specified in terms of an initial and a final layout of
blocks, looking for minimal sequences of actions that transformed one to the other.
Read up on this domain and formalize some parts of it in ERGO.
56
4. Formalize the Towers of Hanoi problem as a planning problem in ERGO. You may
assume that the number of disks n is provided at the outset (unlike the version of
the problem considered in Chapter 12). The shortest solution will take 2n − 1 moving
actions. The well-known recursive program for this problem will find and print the
actions in about 2n time units (that is, its running time will scale linearly in the size
of the plan). Investigate how long ergo-simplan takes to find a plan.
57
Chapter 4
In this chapter, we turn our attention to ERGO programs. Running an ERGO program is
not so different from the planning seen in Chapter 3. In both cases, the system is given an
initial state of the world and some objective to be achieved. For planning, the objective is
expressed as a goal condition to be made true using actions taken from a given list:
(ergo-do program)
In both cases, the desired output will be the same: a sequence of actions that achieves the
objective (or #f when no such sequence exists). The difference is that with a program, we
get to consider not only what is to be achieved, but also how we expect it to be achieved.
In ERGO, this can range from an explicit list of actions to perform, all the way to some
guidelines about what should or should not be done.
This chapter explores how basic sequential programs can be defined, that is, programs
where only one thing is happening at a time. It elaborates further on why it is useful to go
beyond planning to programming. The more advanced concurrent programs, where more
than one thing is happening at a time, are considered in Chapter 5.
(:act e)
58
is an expression whose value is an ERGO program, which can then be used as an argument
to ergo-do. Returning to the example BAT of Figure 3.2, the following interaction can be
observed:
In this case, ergo-do is asked to find a sequence of actions that constitutes a successful
execution of the given :act program, and it finds the obvious list with one element. (After
execution, ergo-do restores the state to what it was at the outset.) On the other hand, the
following program fails:
In this case, no sequence of actions can be found by ergo-do (since the door to Room2 is
closed in the initial state).
To obtain a sequence of actions, the :begin primitive is used. If p1 , p2 , . . . pn are
expressions that evaluate to ERGO programs, then
(:begin p1 . . . pn )
is an expression that evaluates to the ERGO program that executes each subprogram in
order:
As can be seen in the second example above, a :begin program fails (and ergo-do returns
#f) if any of its subprograms fail. In this case, the robot cannot go directly from Room1
to Room3. The (:begin) that appears in the third example above is a program that does
nothing. (The primitive :nil can be also used for this purpose.)
As with ergo-simplan, the sequence of actions returned by ergo-do can be used as an
argument to the display-execution function, as shown in Figure 4.1.
59
Figure 4.1: Executing a sequence of actions
The error reported here is that :begin was given something other than an ERGO program
as its first argument.
> (let ((pgm (:begin (:act (open! ’door1)) (:act (goto! ’room2)))))
(ergo-do pgm))
’((open! door1) (goto! room2))
This means that the Scheme define can also be used. For example, to define a new ERGO
procedure where the robot goes to some room and then back to Room1, we can use this:
(define (goback rm) (:begin (:act (goto! rm)) (:act (goto! ’room1)) ))
So goback is a function that returns an ERGO program as its value. We then get this:
> (ergo-do (:begin (:act (open! ’door1)) (goback ’room2) (goback ’room2)))
’((open! door1) (goto! room2) (goto! room1) (goto! room2) (goto! room1))
60
(:if e p q)
evaluates to a program that executes p if the fexpr e is true, and q otherwise.
> (ergo-do (:if (open? ’door1) (:act (goto! ’room2)) (:act (open! ’door1))))
’((open door1))
(See below for the difference between the ERGO :if and the Scheme if.) There are some
additional primitives that expand to versions of :if: the primitive
(:when e p1 . . . pn )
behaves just like (:if e (:begin p1 . . . pn ) :nil), and the primitive
(:unless e . . .)
is an abbreviation for (:when (not e) . . .).
The primitive
(:test e)
succeeds or fails according to whether or not the fexpr e evaluates to true. In sequential
programming, it is equivalent to (:if e :nil :fail), where :fail is the (rarely useful)
ERGO program that always fails.
61
∗ 4.1.6 Evaluation versus execution
Expressions such as (:begin p1 . . . pn ) or (:act a) evaluate to ERGO programs, but do
not actually execute those programs until they are passed to ergo-do. This distinction be-
tween when an expression is evaluated and when an ERGO program is executed accounts
for the difference between the ERGO :if and the Scheme if (and between :let and let).
The expression (:if e q r) evaluates to a conditional program which, when executed,
chooses between q and r based on the value of e at the time of execution. The expression
(if e q r), on the other hand, evaluates to q or to r based on the value of e at the time
of evaluation. In many cases, the two expressions behave the same and can be used in-
terchangeably. However, there can be a difference between the two when the e refers to
fluents whose values will be changed in the execution.
Suppose, for example, that Door1 is closed initially. Then the expression
(:begin (:act (open! ’door1)) (:if (open? ’door1) q r))
evaluates to a program that opens the door and then performs q (which is what would be
expected in normal sequential execution). On the other hand, the expression
(:begin (:act (open! ’door1)) (if (open? ’door1) q r))
evaluates to a program that opens the door and then performs r (which is almost certainly
not what was intended). The problem is that (open? ’door1) will be false when the
:begin expression is evaluated, but true after the first action is executed.
The moral is that when the normal Scheme evaluation requires the value of a fluent,
we need to be careful about when the fluent will be evaluated. In almost all cases, we will
want to use :if and :let in ERGO programs instead of their Scheme counterparts. Instead
of a program like (:begin p (my-function f )) (where f is a fluent), we will want to use
something more like (:begin p (:let ((x f )) (my-function x))) in cases where the
ERGO program p can change the value of f .
62
(:choose p1 . . . pn ),
is executed by executing one of the three actions. By default, ergo-do simply returns the
first successful execution it can find:
This mode of execution is most useful for deterministic programs. By using the optional
#:mode argument with the value ’offline, all three executions can be seen:
In this ’offline mode, ergo-do displays a sequence of actions and then asks for confirma-
tion from the user. If the user enters y, that sequence is returned as the value of the entire
expression; if the user enters n, ergo-do searches for another execution of the program and
then asks again. A value of #f is returned only when no further executions can be found.
(The default behaviour of ergo-do when no mode is specified is called ’first mode.
There is also a third ’count mode that returns the total number of successful executions.
For example,
63
(:begin
(:choose (:act a) (:act b) (:act c))
(:act d) )
has just one execution: the sequence consisting of the action b followed by the action d.
The other two executions, involving the actions a and c, must be discarded since they do
not allow the final d action to be executed successfully. Similarly,
(:begin
(:choose (:act a) (:act b) (:act c))
(:choose (:act a) (:act b) (:act c))
(:act d) )
has exactly five executions: the four executions without a b action must be discarded since
they do not allow the rest of the program to complete successfully.
64
(:begin
(:<< (printf "Out of options\n"))
(:>> (printf "Starting\n"))
(:choose :nil :nil)
(:>> (printf "So far so good\n"))
(:<< (printf "Failing\n"))
:fail)
Starting
So far so good
Failing
So far so good
Failing
Out of options
ERGO also provides two additional primitives, ::act and ::test which behave just like
:act and :test except that they use :>> and :<< to display information on success and on
failure for debugging purposes.
(:for-some v e p1 . . . pn ),
where v is a symbol, e is an fexpr that evaluates to a list, and the pi evaluate to sub-
programs. Conceptually, the :for-some is executed by setting a variable v to one of the
elements of the list denoted by e, and then executing all the subprograms in sequence.
The element of the list is chosen nondeterministically and depends on what else needs
to be done. As with :for-all, the e can also evaluate to a number n, in which case the
subprograms must succeed for one of the values 0, 1, . . . , n − 1.
The :for-some primitive can be thought of as the dual of the :for-all primitive:
whereas a :for-all program can be executed successfully when the subprograms can be
executed successfully for all values of its variable, a :for-some program can be executed
successfully when the subprograms can be executed successfully for some (at least one)
value of its variable. (Similarly, the :choose primitive can be thought of as the dual of
:begin.) So, for example, the program
with three possible executions (assuming each door can be opened). Similarly,
65
(:for-some a act-list (:act a) (:act a) (:act a) (:act a))
will have as many executions as there are actions in the list act-list that can be legally
performed four times. (This idea of nondeterministically choosing an action from a list of
actions will allow a convenient form of planning, as discussed in Section 4.4.)
(:star p1 . . . pn ),
picks up some number of objects that are on the floor. (The :test is not necessary if the
pickup! action has a prerequisite that the object be on the floor.) It is left unspecified by
this program which objects are to be chosen or even how many are to be picked up. In
terms of backtracking, :star first attempts to do its subprograms zero times, and if that
later fails, then once, and if that fails, then twice, and so on.
The :star primitive can be duplicated using recursion and other features of ERGO:
This is perhaps not too surprising, since :star is like a :while loop with no termination
condition, and a :while loop can also be simulated using recursion. (With no termination
condition, a program like (:begin (:star p) :fail) will simply run forever.)
66
Figure 4.2: A search through the rooms
5 @
Checkout
4
cc
3
2
...
0
Aisle 0 Aisle 1 Aisle n
the robot needs to go through that door to the adjoining room, which is calculated using
the connecting-room function.
The backtracking search needed to execute this ERGO program is actually very similar
to planning. The relationship between the two will be further explored in Section 4.4.
67
Figure 4.4: Program file Examples/grocery-bat.scm
The goal of the robot in this world is to navigate the grocery store, picking up all the
items on its list, and then to proceed to the checkout counter. A program that does this is
shown in Figure 4.5. The program works by first iterating through the shopping list one
68
Figure 4.5: Program file Examples/grocery-main.scm
item at a time, going to where that item is located in the store and putting it in the trolley,
and then going to the location of the checkout counter. To get to a desired location, the
robot first goes to the desired aisle using get-to-aisle, and then up or down the aisle
using get-to-shelf. To get to a desired aisle, the robot first goes to an end of the current
aisle, then moves right or left as needed.
While this program gets the job done, it has some limitations. For one thing, if the
robot is located at position ( a, s) and wants to get to position ( a0 , s0 ) where a0 is not a,
it nondeterministically chooses which end of the current aisle to exit. This can involve
needless travel on its part. For example, assuming there are 11 shelves on each aisle, if
the robot is located at shelf 10 and needs to go to shelf 9 on another aisle, it should exit
the current aisle at the top (shelf 11) and not at the bottom (shelf 0). A better version of
the get-to-location program appears in Figure 4.6. In this version, get-to-aisle is not
used, and the decision about which end of the aisle to exit is based on the values of the
current and goal shelves.
A more serious problem with the program is that the items on the grocery list are
considered in the order they appear on the grocery list. Unless the grocery list has been
prepared with care, this can result in a lot of zigzagging though the store. A better strategy
is to always choose the item on the grocery list that is closest to the current location:
69
Figure 4.6: Better aisle navigation
The function closest-item would go through the current grocery list returning the item
whose distance to the current location was minimal.
While this “greedy” strategy might work well in this grocery store, there are cases
where choosing the closest item repeatedly can fail to produce a trajectory that is minimal
overall. An even better strategy is to choose an arbitrary item from the grocery list, but
on the way there, to watch for and pick up any object that is on the grocery list. (The
:monitor primitive presented in Chapter 5 is well-suited to this type of programming. See
the delivery agent example in Section 5.2.1.) In fact, a simple related strategy employed by
many shoppers with big grocery lists is to traverse each of the aisles in sequence looking
for items that are on the grocery list.
This ERGO procedure says very little about how it should be executed: repeatedly choose
some action in the given list and perform it, after which the given goal condition must be
satisfied. To execute this program, ergo-do ends up having to find a sequence of actions
from the list that can be legally executed starting in the initial state and that terminates in
a state satisfying the goal condition. In other words,
70
(ergo-do (achieve-simple goal? actions))
a pruning condition given by prune? can be used to ensure that some condition is satisfied
throughout the execution of the plan (like safety in the fox, hen, grain planning problem).
Alternatively, a program like
could be used to filter the choice of action according to some given criteria (for example,
only use actions that make a loud noise when nobody will be disturbed). Another tactic is
to only look for plans that are limited in some way (before trying something else):
This procedure searches for a plan to achieve the goal but whose total action cost (according
to a given cost function) does not exceed a given maximum bound.
As we come to know more about the domain, our instructions about what to do can
become more specific. In the nondeterministic get-to-room procedure of Figure 4.2, in-
stead of a search for actions, the program searches for doors to go through. Once these door
choices have been made, the rest of the program follows deterministically. This is signifi-
cant since there may be a very large set of legal actions that could be considered next, and
the unrestricted planning problem might be very difficult to solve.
Consider the grocery store example from Section 4.3. For a large enough grocery list,
it would be quite difficult to solve the grocery shopping as a simple planning problem:
the size of the search tree would simply be too big. On the other hand, a well-constructed
ERGO program can easily deal with hundreds of grocery items. Moreover, the ERGO
program can provide a fine-grained control of the behaviour. It would be quite challenging
to formulate planning problems for each of the various ways of going through the shopping
aisles. It is much simpler to be able to say how to navigate the aisles in a program.
71
4.5 Efficiency considerations
For the most part, the implementation of the ERGO system allows programs to be written
without too much concern for how well they will scale. To see where programs will run
into efficiency issues, it is useful to consider a simple test BAT with these definitions:
(define-fluents afluent 0 bfluent 0)
(define-action a+ afluent (+ afluent 1))
(define-action b+ bfluent (+ bfluent 1))
There are two numeric fluents, afluent and bfluent, which are incremented by two ac-
tions, a+ and b+ respectively.
It is possible to generate extremely long sequences of actions in a fraction of a second
using a deterministic program like this:
(let loop ((n (expt 2 20))) (:when (> n 0) (:act a+) (loop (- n 1))))
The ergo-do function will quickly produce a list of 220 actions. (To avoid seeing them all,
pass the value returned by ergo-do through a function like length.)
Accessing the values of fluents is also very efficient, and so the following program takes
about the same amount of time:
(:while (< afluent (expt 2 20)) (:act a+))
It is not hard to confirm that the time required for the execution of this program scales
linearly (as to be expected), so that
(:while (< afluent (expt 2 25)) (:act a+))
takes about 32 times as long to execute. (Note that a significant proportion of the time
it takes to run this program is spent in Scheme garbage collection. For very time-critical
applications, it may be useful to find out how to turn off this garbage collection.)
Where care is required in programming is with nondeterminism. The program
(:begin
(let loop ((n (expt 2 20)))
(:when (> n 0) (:choose (:act a+) (:act b+)) (loop (- n 1))) )
(:test (= afluent (expt 2 20))) )
is similar to the one above and will generate a list of 220 actions. This will take somewhat
longer to execute, but almost all of the extra time is due to garbage collection. For this
program, the first action selected inside the :choose, the a+ action, is always the correct
one; no backtracking is needed. This can be contrasted with the following program:
(:begin
(let loop ((n 20))
(:when (> n 0) (:choose (:act a+) (:act b+)) (loop (- n 1))) )
(:test (= bfluent 20)))
In this case, the first action selected is never the correct one; the only way to pass the final
test is to always choose b+ actions. Although the final list of actions has only 20 elements,
ergo-do must explore the entire search tree of 220 elements to eventually find that correct
final sequence. This takes about the same time as generating a list with 220 elements,
discounting garbage collection. Consequently, taking into account the Million-Billion Rule
mentioned on page 35, we can predict that the program
72
(:begin
(let loop ((n 30))
(:when (> n 0) (:choose (:act a+) (:act b+)) (loop (- n 1))) )
(:test (= bfluent 30)))
which will take 210 times longer to finish, will have unacceptable execution time.
Similar considerations apply to the backtracking that results from the :star primitive.
Execution of the program
(:begin
(:star (:act a+))
(:test (= afluent 1000)))
will find the successful list of 1000 actions quite quickly. This is because only 1000 possi-
bilities will be considered. This is in contrast with the following program:
(:begin
(:star (:choose (:act a+) (:act b+)))
(:test (= afluent 20)))
In this case, before ergo-do can find the correct list with 20 elements, it must first go
through entire trees containing the sequences of a+ and b+ actions of length 1, 2, . . . , 19.
Again, we can predict that the program
(:begin
(:star (:choose (:act a+) (:act b+)))
(:test (= afluent 30)))
will take too long to execute. Blind search has its limitations.
As noted above, one of the major advantages of ERGO programming over planning is
that the search does not have to be blind. We can see this very dramatically, by inserting
conditions to be satisfied at various points in the search above:
(:begin
(:star (:choose (:act a+) (:act b+)))
(:test (= afluent 10))
(:star (:choose (:act a+) (:act b+)))
(:test (= afluent 20))
(:star (:choose (:act a+) (:act b+)))
(:test (= afluent 30)))
This program is similar to the previous one, except that it insists on progress being made
along the way. The net effect is that ergo-do only has to blindly explore very small trees (no
larger than 210 ) to then assemble the final sequence of length 30. This program terminates
almost immediately.
73
since the implementation relies heavily on macros (not covered in this book) to do much of
its work. Nonetheless, a rough overview of the implementation is presented here for those
who may be interested in extending or revising it.
The macro-defining operator used in the implementation is define-macro so that
causes (foo ’(a b c)) to behave like (car ’(a b c)). Macros are more flexible than
functions since they need not evaluate their arguments, or can do so selectively.
Program expressions in ERGO like (:begin p q) are evaluated before they are used.
They each evaluate to a function that takes three arguments:
• hist, a list of actions performed so far, which will be (reversed and then) returned
by ergo-do on successful completion;
• fail, a function of no arguments called the failure continuation, to call if the ERGO
program cannot take a successful step;
• succ, a function of three arguments called the success continuation, to call if the
ERGO program can take a successful step. (The arguments are explained below).
The last thing an ERGO program must do is to call one of these continuations or ask some
other ERGO program to do so.
For example, the ERGO program construct :if can be defined as follows:
Note first of all that (:if expr pgm1 pgm2 ) expands to a function of three arguments (via
lambda), as required. When called, this function evaluates the given expr (using if) and
then continues with either pgm1 or pgm2 . These must be ERGO programs themselves and so
are expected to take the three arguments as above and to call fail or succ as appropriate.
(One minor complication in the above scheme are the actual names hist, fail, and
succ, which must not collide accidentally with user variables used in expr, pgm1 , or pgm2 .
To ensure this, the implementation actually uses dm-ergo instead of define-macro, to
generate new variables different from all the variables that appear in user programs.)
The :test construct is a simple example of an ERGO program that calls one of the
continuations directly:
74
Either succ or fail is called depending on whether the condition is true. The fail contin-
uation takes no arguments. The arguments for the succ continuation are the history, the
failure continuation, and the steps remaining in the program under consideration. (The #f
indicates no further steps needed for a :test.) Note that fail is passed as an argument to
succ. This is because the success continuation may later decide that something has failed
and need to backtrack to the failure continuation.
To see why intuitively, consider a program like (:begin (:choose pgm1 pgm2 ) pgm3 ).
If pgm1 fails, execution continues by trying pgm2 . If pgm1 succeeds, on the other hand,
execution continues with pgm3 . However, if this pgm3 subsequently fails, the execution
must backtrack and try the pgm2 , just as if pgm1 had originally failed.
The handling of primitive actions via :act is also done by either calling the failure or
success continuation. The definition is (approximately) as follows:
(define-macro (:act a)
‘(lambda (hist fail succ)
(if (legal-action? ,a)
(begin (change-state ,a) (succ (cons ,a hist) fail #f))
(fail))))
If the action is not legal, fail is called; otherwise, the state is changed using change-state,
and succ is called with an updated history of actions. (The history of actions is not used
for online execution, discussed in Chapter 6.)
The basics of backtracking in ERGO can be seen in the definitions of the :choose and
:begin program constructs. Here is how a simpler version of :choose with just two argu-
ments can be defined:
This says that to execute (:choose2 pgm1 pgm2 ), the state of the world is first saved, and
then the program pgm1 is executed with succ as the success continuation, but with a new
failure continuation: if pgm1 fails, rather than simply calling fail, the state is restored, and
then the program pgm2 is be attempted, which in turn will call fail or succ as appropriate.
In other words, try pgm1 , and if that succeeds, execution continues normally; but if it fails,
pgm2 is attempted next.
The :begin construct is similar except that it is the success continuation that is modi-
fied. A two argument version can be defined as follows:
This says that to execute (:begin2 pgm1 pgm2 ), the program pgm1 is executed with fail as
the failure continuation, but with a new success continuation: if pgm1 is able to take a step
75
successfully, execution continues not with succ, but with the remaining steps from pgm1
(if any) followed by pgm2 . (Extra complications arise in the case of concurrent programs
where steps from some other program may be interleaved with this execution.)
The last piece of the implementation puzzle concerns how the interpreter ergo-do
itself is defined. In the simple case, when the execution is ’first mode, the definition is
(roughly) as follows:
(define (ergo-do p)
(define (succ h f c) (if c (c h f succ) (reverse h)))
(save-state-excursion (p ’() (lambda () #f) succ)))
So (ergo-do pgm) calls the ERGO program pgm with an empty history of actions, a failure
continuation that simply returns #f, and a success continuation that continues through
each step of the program until there are no steps left, in which case, it simply returns the
final history of actions in reverse order.
4.7 Exercises
1. Using the house-bat of Chapter 3, write an ERGO program that has the robot visiting
every room of the house.
2. Imagine a robot in a much larger house than the one in Chapter 3. Write an ERGO
program to get to an arbitrary room by first calculating the shortest path.
3. As noted, the :star primitive of ERGO is a feature that can be duplicated using
recursion. Compare the version given in the text to this simpler recursive one:
76
Chapter 5
In more complex cognitive robotic systems, a robot must not only deliberate about what
to do, but also react in an appropriate way to the dynamic world it inhabits. It often helps
to visualize the robot as performing more than one task at the same time. For example,
a robot may be going from room to room picking up garbage, but also closing any open
window it finds, and stopping occasionally to recharge its battery. A program with this
type of behaviour can certainly be written using just the programming primitives seen
in Chapter 4, but it would tend to be messy. An otherwise simple program for picking
up garbage would be interspersed everywhere with extra code to deal with windows and
battery levels. To avoid this, it is better to program each of these simple behaviours as
independent programs and then to ask ERGO to execute them concurrently.
(:conc p1 . . . pn )
and
(:monitor p1 . . . pn ),
where the pi evaluate to subprograms that are to be executed concurrently. The concur-
rency here does not mean true simultaneity, but rather an interleaving of steps. In other
words, each subprogram is assumed to go through a number of steps in order, where a
step is either (:act a), which involves the execution of some legal action, or (:test e),
which involves testing the truth of some condition. When two or more programs are ex-
ecuted concurrently, each subprogram will go though its steps in the order specified, but
between any two steps, steps from other subprograms may appear.
For example, suppose that a, b, c, and d are actions that have been defined as having
no prerequisites. Consider the program
77
(:conc (:begin (:act a) (:act b)) (:begin (:act c) (:act d)))
The first subprogram asks for a then b in sequence; the second subprogram asks for c then
d in sequence. When executed concurrently, there are six possible executions:
Note that in all these cases, the a step precedes the b step and the c step precedes d step,
as required. The :conc primitive, in other words, searches for a legal interleaving that
satisfies the ordering given by the subprograms, backtracking as required.
Suppose now that one of the effects of action b is make the prerequisite of action d
false. In this case, three of the executions above would be ruled out:
(a b c d)
(a c b d)
(c a b d)
These would no longer be legal since they require d to be executed in a state where its pre-
requisite is false. On the other hand, if the c action also happens to make the prerequisite
of d true, the first of the these three would again be a legal interleaving of the actions.
Steps that involves the :test primitive are similar to those involving :act. Consider a
program of the following form: (:conc p (:begin q (:test e) r)) The first subprogram
generates the steps for p, while the second first generates the steps for q, then a test that
e is true, and then more steps for r. The steps from the two subprograms are then to be
interleaved. The effect of the (:test e) is that the interleaving must be such that in some
state after q is done but before r begins, the condition e must be satisfied. This can impose
constraints on where the steps of p can appear. For example, e might be (> f 5), where f
is some fluent, and p might be a loop that increments the value of f from 0. In that case, r
will only be executed after p has incremented it six times. In other words, in the presence
of concurrency, (:test e) can be read as “wait until condition e is true,” where the waiting
is for other concurrent subprograms.
78
It is worth noting that :act and :test are the only programming constructs that gen-
erate steps. For example, when (:if e p q) is interleaved with some other program, the
evaluation of the condition e is not considered a step. In other words, it cannot happen
that e evaluates to true, some other program does something (for example to change e),
and then p begins to execute in the changed state. So the p will always begin execution in
a state where the e is true.
There may be cases where it is actually useful to treat the evaluation of e as a step and
to allow interleaving. This can be achieved by using :test explicitly instead of the :if
primitive, as in the following:
Other programming constructs like :when and :while can also be rewritten to use :test
in an analogous way.
On the other hand, it may sometimes be desirable to avoid any interleaving with certain
programs. For this, the :atomic primitive can be used. The expression
(:atomic p1 . . . pn )
behaves just like (:begin p1 . . . pn ), except that the resulting program is treated as an
indivisible operation with no interleaved steps. So in the end,
The first robot grabs an end and lifts it, but then must wait, since a second lifting is not
allowed (it would exceed the tolerance). So the second robot now grabs the other end and
does two liftings, after which it must wait. Execution continues, alternating between the
two robots until the table has been safely raised.
79
Figure 5.1: Program file Examples/lift-table.scm
;;; Two robots must lift two ends of a table in increments up to a height
;;; while ensuring that the table remains level (to a tolerance)
;;; Two fluents: position of table ends, and holding status of each robot
;;; Two actions: grab an end of a table, and move vertically
(define robots ’(r1 r2)) ; the robots
(define ends ’(e1 e2)) ; the table ends
(define goal-height 6) ; the desired height
(define amount 1) ; the increment for lifting
(define tolerance 1) ; the tolerance
(define-fluents
pos-table (hasheq ’e1 0 ’e2 0) ; vertical pos of table end
held-table (hasheq ’r1 #f ’r2 #f)) ; what robot is holding (#f = nothing)
;; some useful abbreviations
(define (pos e) (hash-ref pos-table e))
(define (held r) (hash-ref held-table r))
(define (table-is-up?) ; both ends higher than goal-height?
(for/and ((e ends)) (>= (pos e) goal-height)))
(define (safe-to-lift? r z) ; ok for robot to lift its end?
(let ((e (held r)))
(let ((e* (for/or ((d ends)) (and (not (eq? e d)) d))))
(<= (pos e) (+ (pos e*) tolerance (- z))))))
;; action of robot r grabbing table end e
(define-action (grab! r e)
#:prereq (and (for/and ((r* robots)) (not (eq? e (held r*))))
(not (held r)))
held-table (hash-set* held-table r e))
;; action of robot r moving his end of the table up by z units
(define-action (lift! r z)
#:prereq (and (held r) (safe-to-lift? r z))
pos-table (let ((e (held r))) (hash-set pos-table e (+ (pos e) z))))
80
the battery level drops below some amount, the normal behaviour is suspended, and the
charging behaviour takes on a higher urgency.
The ERGO program (:monitor p1 . . . pn ) is executed as follows: the program pn is
fully executed; however, before every step of pn , the program pn−1 is fully executed; simi-
larly, before every step of pn−1 , the program pn−2 is fully executed; and so on up to p1 . In
other words, p1 has the highest priority since no step takes place without it first being fully
executed, and pn has the lowest priority, since it only gets to execute a step when all the
other programs have finished.
Typically, these pi (except possibly for pn ) involve either a :when or a :while: if a
certain condition is true, then do something; otherwise do nothing so that a lower priority
program will get a chance to do its part. For example, a program of the form
(:monitor (:when e p) q)
says to do program q except that p should be done before every step of q but only when e
is true. Similarly, the program
(:monitor (:while e p) q)
says to do program q except that p should be done repeatedly before every step of q while
e is true. Consequently, for the program
(:monitor (:while #t p) q)
(:monitor
(:when (< battery-level 10)
(get-to-room ’room1))
(:when (and (< battery-level 15) (eq? (location robot) ’room1))
(:act charge-battery)))
(:for-all box box-list (fetch box) (transfer box ’room3)))
The main task of the robot is to go get each box and carry it to Room3 using the (user-
defined) programs fetch and transfer. However, if at any point in the execution of this
task the robot finds itself in Room1 with a battery level below fifteen, it interrupts the main
task to charge its battery. But before any of this, if the battery level ever drops below ten,
the robot interrupts everything and gets to Room1 using get-to-room.
81
Figure 5.2: Program file Examples/delivery-bot.scm
82
(putdown! a) (move! (1 0)) (move! (0 -1)) (move! (0 -1)) (move! (0 -1))
(move! (0 -1)) (pickup! b) (move! (0 1)) (move! (0 1)) (putdown! b)
(move! (-1 0)) (move! (-1 0)) (move! (-1 0)) (move! (-1 0)) (move! (0 1))
(putdown! c) (move! (1 0)) (move! (0 -1)) (move! (0 -1)) (putdown! d))
The robot begins by picking up the first object, a, that happens to be located where it is, at
(0 0). Then it begins moving towards the goal destination for object a which is (3 4). It
does three moves north, and three moves east along its way, picking up objects c and d at
location (3 3), before reaching (3 4), where it drops off a. It continues this way until all
four objects have reached their goal destination.
Note that this delivery behaviour is completely opportunistic. The robot does not com-
pute in advance which objects it will pick up and in what order. If it happens to be where
an object is, it picks it up and deals with it. This means that if the world changes while the
program is executing, for example, if the objects move by themselves or new objects enter
the grid, the robot can react appropriately to the objects at hand. This sort of reactivity is
the topic of the next section.
5.3 Reactivity
Consider a very simple world with a single Boolean fluent stopped? (which starts out
being false), an action stop! which makes the fluent stopped? true, and an action ding!
which does nothing. By itself, the ERGO program (:until stopped? (:act ding!)) is
not very interesting; it simply repeats the ding! action forever. However, if we imagine
that there are other agents at work performing actions concurrently, the behaviour of the
program must be reinterpreted: the ERGO program repeats the ding! action until it is
stopped by the stop! action. Actions performed by other agents are called exogenous.
This then is the essence of reactivity: the robot performs some task, while monitoring
and reacting to conditions that may be changed exogenously. More generally, the robot
will be performing one or more tasks, while monitoring one or more conditions that may
change as the result of its actions or the actions of other agents. When one of these con-
ditions becomes true, the robot may suspend what it is doing, deal with the condition,
and then resume its normal tasks, perhaps in some modified form. The :monitor control
construct, seen in the previous section, is an ideal way to program this sort of reactivity.
• The elevator will be located on some floor, and will have actions to go from floor
to floor. For simplicity, let us assume that the action up! moves the elevator up one
floor, and that down! moves it down one floor.
• There are a certain number of call buttons that are on, and the goal of the elevator
is to turn them all off using the action turnoff!, which turns off the call button
of the current floor, and finally park the elevator on the first floor. (We put aside
complications such as actual passengers on the elevator for now.)
83
Figure 5.3: Program file Examples/reactive-elevator-bat.scm
• Now let us further assume that there is an exogenous action (turnon! n), whose
effect is to turn on the call button of floor n. In essence, these are external requests
that the elevator is being asked to process. Before the elevator on floor one can
actually stop running, it must ensure that there are no pending requests.
• While the elevator is doing its job, an alarm condition may become true as a result
of an exogenous smoke! event. If an alarm condition is detected, the elevator must
stop moving, and react by repeatedly ringing its bell using the ring! action until the
alarm is finally turned off exogenously by a reset! event (from a firefighter).
• Finally, the elevator has an internal temperature. No matter what else the elevator
is doing (serving floors or sounding alarms), if the temperature becomes too hot
(as a result of exogenous heat! events), it must react by turning on its fan via the
toggle-fan! action; if the temperature becomes too cold (as a result of exogenous
cold! events), it should turn off the fan via the toggle-fan! action.
A simple basic action theory for an elevator like this is shown in Figure 5.3. Note
that exogenous actions are presented right along with the normal ones (called endogenous).
From the point of view of the BAT, they are indistinguishable.
84
Figure 5.4: Program file Examples/reactive-elevator-run.scm
;;; This is the main program for the elevator that appears in the IJCAI-97
;;; paper on ConGolog. The BAT appears in reactive-elevator-bat.scm
(include "reactive-elevator-bat.scm")
;; get to floor n using up and down actions
(define (go_floor n)
(:until (= floor n) (:if (< floor n) (:act up!) (:act down!))))
;; the main elevator program as a priority driven monitor
(define control
(:monitor
(:when (and (< temp -2) fan?) (:act toggle-fan!)) ; handle cold
(:when (and (> temp 2) (not fan?)) (:act toggle-fan!)) ; handle heat
(:while alarm? (:act ring!)) ; stop and ring the bell
(:until (null? on-buttons) ; main serving behaviour
(:for-some n on-buttons ; choose a floor
(go_floor n) (:act turnoff!))) ; serve it
(go_floor 1) ; default homing behaviour
;; (:while #t (:wait)) ; to keep the elevator running when online
))
The main procedure for the elevator is called control and is shown in Figure 5.4. It
consists of a :monitor with five separate tasks in order of priority. The “normal” behaviour
of the robot is the one in the :until loop: until there are no longer floors to be served,
select a floor, go to that floor, and turnoff the call button. (In the program here, the selection
of a floor is trivial. In a more complex setting the on-buttons list would be reordered, for
example, to give certain floors priority, or to take into account which floor has been waiting
the longest, or to minimize elevator motion, etc.) The rest of the :monitor program serves
to augment this normal task, to react to events that happen during the execution, and to
park the elevator when the normal task is complete.
In performing the normal elevator task, a number of up! and down! actions will be
generated. Before each such action, the higher priority :while loop will be executed. In
case the fluent alarm? is false, this loop terminates immediately and the normal behaviour
continues. However, imagine that a smoke! event occurs exogenously, triggering the alarm.
In this case, the alarm? fluent becomes true and the :while loop executes the ring! action
repeatedly. Because this action does not change any fluent, this ringing may continue
indefinitely. However, if a reset! action occurs exogenously, it will set the alarm? fluent
to false, and the :while loop will terminate, allowing the normal behaviour to resume.
Before any alarm or normal behaviour takes place, the higher priority :when programs
will be executed. When the temperature is within range, no action will be needed, and
the moving or ringing will continue as before. However, if enough heat! or cold! actions
take place exogenously, the ringing or moving will be interrupted momentarily so that the
toggle-fan! action can be executed. This is a :when and not a :while, so the interruption
consists of a single action only, after which the interrupted behaviour resumes.
Finally, after all the fan toggling, bell ringing, and floor serving is complete, the lowest
priority task can start execution. This involves parking the elevator on the first floor.
85
This typically requires a number of down! actions. But before each down! action step,
:monitor must execute the higher priority tasks. So if the call button for a new floor
is pushed exogenously, the on-buttons fluent will become non-empty and the normal
elevator behaviour will once again resume. Only when the elevator is on the first floor and
no further action is required will the program terminate.
If this program is used in the simple offline mode seen so far, none of this interesting
reactivity takes place:
What happens here is that the elevator goes down four floors to serve floor 3, then up
two floors to serve floor 5, then down four floors to park on the first floor. Nothing else
happens exogenously to cause it to behave differently. It is only when this program is used
in online mode (in Chapter 6) that a more interesting range of behaviour will emerge.
1 2 3
4 5 6
7 8 9
There is a single action move! whose parameter is the board position to be occupied by the
current player. The idea is that when it is the robot’s turn to play, move! will be a normal,
86
Figure 5.5: Program file Examples/GameExamples/ttt.scm
;;; The game of tic tac toe using a general game player.
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;; The action theory: 2 fluents and 1 action
(define-fluents
board (vector #f #f #f #f #f #f #f #f #f #f) ; the initial board
player ’x) ; player who plays next
;; the current player occupies the board at position sq
(define-action (move! sq)
#:prereq (not (vector-ref board sq)) ; square is not occupied
board (vector-set board sq player) ; player occupies square
player (if (eq? player ’x) ’o ’x)) ; the turns alternate
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;; Auxiliary definitions
(define squares ’(1 2 3 4 5 6 7 8 9))
(define lines ’((1 2 3) (4 5 6) (7 8 9) (1 4 7) (2 5 8) (3 6 9)
(3 5 7) (1 5 9)))
(define (occ sq) (vector-ref board sq))
(define (has-line? pl) ; player pl owns some line?
(for/or ((ln lines)) (for/and ((sq ln)) (eq? pl (occ sq)))))
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;; The game playing
(define (winner) ; the winner of a board or #f
(if (has-line? ’x) 1 ; player1
(if (has-line? ’o) -1 ; player2
(and (and-map occ squares) 0)))) ; a tie
(define (print-board) ; display 3 lines of the board
(define (printsq sq) (display (or (occ sq) "-")))
(define (println ln) (display " ") (for-each printsq ln) (display "\n"))
(for-each println ’((1 2 3) (4 5 6) (7 8 9))))
;; X moves first via minimax; O plays via read
(define (main) (ergo-play-game winner (map move! squares) print-board))
endogenous action; when it is the opponent’s turn to play, move! becomes an exogenous
action. Either way, the action changes the board and who is to play next.
The ultimate goal of a tic-tac-toe robot is to fully occupy one of the eight horizontal,
vertical, or diagonal lines on the board. However, as mentioned above, there can be no
simple offline plan to achieve this goal since a player must respond to the moves of an
opponent. From an online point of view, we can think of the goal of the robot as choosing
a single good move to make starting in some state.
There are two top-level ERGO functions that can be used for playing strategic games:
ergo-generate-move and ergo-play-game. The first of these returns a player (a function
of no arguments that when called generates a single next move), while the second function
87
Figure 5.6: The start of a game of tic-tac-toe
plays an entire game by generating and reading moves repeatedly. The format for the
ergo-generate-move function is as follows:
Leaving aside the optional arguments for now, this is similar to the ergo-simplan function
of Chapter 3. As before, actions is a list of actions to be considered in deciding how to
move. Instead of a binary goal condition, however, the function win of no arguments
indicates when the game is over. The value returned should be false in those states where
the game is not over, but otherwise, it should have a numeric value: 1, if the player who is
moving first (according to the initial state) has won: −1, if the player who is moving first
has lost; and 0, for a tie. The winner function in Figure 5.5 does this for tic-tac-toe, where X
is assumed to move first. The max? argument should be #t or #f according to whether the
generated player will be playing first or second. The format for ergo-play-game is similar:
The extra argument show is a function of no arguments whose effect is to print the state of
the board. This argument is handled by print-board in Figure 5.5. The ergo-play-game
function works by calling the value of ergo-generate-move to get a move, updating the
state, showing the board and a numbered list of legal actions available to the opponent,
reading the opponent’s choice (as a number), updating the state again, and then iterating
until the game is over. (The fact that this function obtains moves from an opposing player
88
makes this an online controller, as we will see in Chapter 6, but one that is rudimentary
enough to consider here.) The start of a game of tic-tac-toe is shown in Figure 5.6.
The game player returned by ergo-generate-move uses minimax with alpha-beta prun-
ing to decide on a next move, but those details are beyond the scope of this book.
5.5 Exercises
1. Change the reactive elevator so that there are priority floors that must be served
before any non-priority ones.
2. Redesign the delivery agent robot so that multiple robots can be working simultane-
ously on the grid. If we assume that the passageways are wide enough to accommo-
date multiple robots (unlike in the next exercise), the main control issue will be to
split up the work to avoid more than one robot traveling to pick up the same object.
3. Consider a world where multiple robots (like the delivery agent) are moving on
a two-dimensional grid towards various destinations. Arrange the BAT so that no
location on the grid can be occupied by more than one robot at a time. Consequently,
89
Figure 5.7: Program file Examples/GameExamples/pousse-bat.scm
there can be a standoff when two robots are trying to get past each other. Program
a scheme (such as a right-of-way convention, or one-way passageways, or something
else) to eliminate such standoffs.
4. Write a program to play Qubic, that is, 3D tic-tac-toe on a 4 × 4 × 4 board, using the
90
Figure 5.8: Program file Examples/GameExamples/pousse-main.scm
;;; The game Pousse in Ergo using the general game player.
(include "pousse-bat.scm")
;; number of cols (resp rows) occupied by player pl in a row (resp col)
(define (num-cols pl i) (for/sum ((j L)) (if (eq? pl (occ i j)) 1 0)))
(define (num-rows pl j) (for/sum ((i L)) (if (eq? pl (occ i j)) 1 0)))
;; sum over all rows (resp cols) of fn of #cols (resp rows) for X and for O
(define (row-counter fn)
(for/sum ((i L)) (fn (num-cols ’x i) (num-cols ’o i))))
(define (col-counter fn)
(for/sum ((j L)) (fn (num-rows ’x j) (num-rows ’o j))))
(define (occupancy) ; occupancy squared
(define (sq xs os) (- (* xs xs) (* os os)))
(+ (row-counter sq) (col-counter sq)))
(define (centrality) ; manhattan dist to corners
(for/sum ((i L))
(for/sum ((j L))
(* (case (occ i j) ((x) 1) ((o) -1) (else 0))
(+ (min i (- N i)) (min j (- N j)))))))
(define (static) (+ (occupancy) (centrality))) ; static evaluation
(define (winner) ; the winner of a board
(define (owned xs os) (if (= xs N) 1 (if (= os N) -1 0)))
(if (member board boards) (if (eq? player ’x) 1 -1)
(let ((ownership (+ (row-counter owned) (col-counter owned))))
(if (> ownership 0) 1 (if (< ownership 0) -1 #f)))))
(define all-moves (for/append ((dir D)) (for/list ((k L)) (push! dir k))))
;; X moves first via minimax; O plays via read
(define (main) (ergo-play-game winner all-moves print-board
#:static static #:depth 6 #:infinity 512))
91
Chapter 6
Cognitive robotics is not just about computing sequences of actions, as with the basic
planning seen in Chapter 3 or the programming in Chapters 4 and 5. What cognitive
robotics is really about is controlling robots through cognition. The reason we care about
producing a sequence of actions is that we intend to use these actions to actually drive a
robot or software agent.
In this chapter, we examine the issues that arise in using the output of a planner or an
ERGO program to control a robot. The architecture of a total system based on ERGO is
shown in Figure 6.1. There are three main components:
• The ERGO component is made up of an ERGO program and its associated basic
action theory. The expected output of this component is a stream of actions to be
performed by the robot. In the simplest case, this component takes no input other
than an invocation by the user. In a more complex setting, it receives a stream of
sensing data from the robot manager as well as reports of other exogenous actions
that have taken place, which may include additional user requests.
• A robot is a piece of machinery with effectors and sensors. Once it is turned on, its
effectors can be activated by a robot manager, and it can report the state of its sensors
to the manager. (A robot may be replaced by a software simulator that displays in
some way what its effectors would have done and generates ersatz sensing data.)
• A robot manager is a program that sits between the ERGO component and the robot.
Its job is twofold: it translates requests for action from the ERGO program into
commands for the robot effectors; it translates readings from the robot sensors into
sensing reports for the ERGO program as needed. This manager can be written
in any computer language that supports the necessary communication. (Typically,
it is the need for communication with specialized robotic hardware that ends up
determining the required computer language for the manager.)
So far, in all the previous chapters, the basic action theories and ERGO programs have been
completely shut off from the outside world. The decisions about the actions to perform
use information available to the program at the outset, but nothing else. (Although the
planning of Chapter 12 also deals with sensing, the actual state of a sensor will not be
available to the planner. The best that can be done there will be to plan for all possible
sensing results, according again to what is known at the outset.)
92
Figure 6.1: The architecture of a cognitive robotic system
. robot
.
.
. -
actions
.
ERGO program Robot manager
.
.
sensing
. data
CO . A
. .software
. . . . . . . . . . . .C . . . . . . . . . . . . . . A. . . . . . . . . . . .
C . effector A sensor
C . commands U
A readings
user exogenous . R
requests actions . O
. dB d
This is what is called offline execution. The cognitive robotics architecture is simple:
before the robot does anything, the ERGO program (or a planning program) is asked to
compute an entire plan, a complete sequence of actions; then the ERGO program is put
away, and the plan is passed to the robot manager for autonomous execution.
In online execution, an ERGO program only computes the next action to be performed
by the robot manager; the manager then executes this action, possibly returning sensing
data to the running ERGO program. The ERGO program then computes the following
action, possibly using information acquired in previous steps, possibly looking ahead to
future actions. The process then iterates, perhaps indefinitely. While this is going on, the
ERGO program may also find out about actions performed exogenously by other agents.
In the first section of this chapter, a complete cognitive robotic system that runs in an
offline mode will be presented. But this is mostly just for contrast with the main focus of
this chapter, which is online control.
• The actual elevator is simulated using a displayed elevator image and call buttons.
93
Figure 6.2: Program file Examples/basic-elevator.scm
• The robot manager, called elevator-server.scm, drives the simulation. (In this case,
the manager happens to be written in Scheme.)
The actions it generates are intended to send an elevator down to floor 3, turnoff the call
button there, then up to floor 5, turnoff the call button there, and finally down to floor 1.
Since the overall system is intended to work in offline mode, the only real requirement
of the robot manager is that it be able to “consume” the actions produced by the ERGO
program and do something interesting with them. For our purposes here, it will read the
94
Figure 6.3: The elevator server
actions as a single list at the outset. (This protocol would not work well if there were
thousands or millions of actions to perform offline.)
Here is how the robot manager elevator-server.scm works. It begins by opening
a graphical window showing a simple image of an elevator. Then, rather than sending
instructions to a real physical elevator in response to the actions it receives as input, it
modifies the graphical display as necessary, moving the elevator, turning off call buttons,
and so on. (The program uses a number of simulation and graphical facilities provided by
Racket Scheme that are beyond the scope of this book. Note that this elevator manager is
written in Scheme but has nothing to do with ERGO.)
To test this elevator manager manually, load the program into Scheme calling its main
function, and provide it with a list of actions to read:
racket -f elevator-server.scm -m
((up 10) (down 2))
Here is what should happen: A narrow window opens showing an elevator shaft with blue
numbers along the left side indicating the floor, like the image shown in Figure 6.3. Red
call buttons are visible at floors 3 and 5. In the simulation, the elevator begins at floor 7,
moves up to floor 10, then down to floor 2, and then the simulation stops. (The simulation
can also be ended at any time by typing a q in the simulation window.)
The same simulation behaviour can be achieved non-interactively by piping the list of
actions through standard input:
So the simplest way to put this elevator manager together with the ERGO control program,
is to pipe the actions computed by ERGO to the manager through standard input:
This should cause the simulated elevator to perform the actions determined by the ERGO
program. Here, for the very first time in this book, all three pieces of a cognitive robotic
system from Figure 6.1 have come together in a (simulated) robot.
95
6.2 The robotic interface for online control
So far, ERGO programs have been run in offline mode: given a program, ergo-do generates
a sequence of actions, a complete legal execution of the entire program. What this means
is that a robot cannot get started on even the first action in the program until ergo-do has
considered the execution of the entire program from start to end. For a realistic program
written on many pages of text and requiring thousands of actions, this is impractical.
In online execution, ergo-do generates a sequence of actions incrementally, sending each
action to the robot manager one at a time, allowing it to decide when it is ready to receive
the next one. In between these steps, exogenous actions can occur that may change the
values of some of the fluents.
So nothing happens yet. To make something happen, a source for exogenous actions and
a target for endogenous actions must be declared using the function define-interface:
The direction argument should evaluate to ’in for incoming exogenous actions or to ’out
for outgoing endogenous actions (that is, actions produced by ERGO programs). The
function argument should evaluate to a function of no arguments (like read) in the case of
’in, and to a function of one argument (like display) in the case of ’out. For exogenous
actions, the input function will be used by ergo-do to receive actions from the outside
world, and for endogenous actions, the output function will be used by ergo-do to send
actions to the outside world. (An ERGO program can have multiple sources of exogenous
actions as well as multiple targets for endogenous ones.)
As a very simple example of an output function, we might use displayln itself, as
shown in Figure 6.4. In this case, ergo-do displays the actions produced by the program
one at at time. Of course, displayln is not actually getting an elevator to move; it is only
simulating sending the actions to an outside world. Moreover, the displayln takes almost
no time to execute, and so the simulated elevator does all its work in a fraction of a second.
A better simulation output function to use is the ERGO function write-endogenous. This
behaves much like displayln except that it also pauses for 1.5 seconds, simulating the time
it might take for each action to take place on a real robot.
The ’in interface for exogenous actions is similar. The function read can be used to
enter simulated exogenous actions: (define-interface ’in read). But a better simula-
tion input function to use is the ERGO function read-exogenous, which is like read, but
also displays a convenient prompt.
96
Figure 6.4: Using define-interface
97
Figure 6.6: Program file Examples/reactive-elevator-tcp2.scm
ready to receive exogenous actions. If the client then produces an exogenous action, that
action will be read by the input function, returned to the ergo-do server, and executed
concurrently with the running ERGO program. The client will then be prompted for the
next exogenous action.
Note that the value passed as the second argument to define-interface is a function
of no arguments, as required, but that the let expression first opens TCP port 8234 using
open-tcp-server. In general, the function argument of define-interface must do what-
ever initialization is necessary for the input or output function to work properly. (In the
implemenattion, the robotic interfaces created by define-interface are run as separate
processes within Racket during the execution of ergo-do.)
To try out the program in Figure 6.5 by hand, call the main function of this elevator
program, and immediately enter the following in another command-line window:
At this point, exogenous actions can be typed in by hand for as long as the main program
is running. For example, if smoke! is entered, the ERGO program will stop the elevator
and start producing ring! actions, until the reset! action is entered. If (turnon! 8) is
entered, the elevator will eventually go up to floor 8 to turn off the call button there. The
telnet connection is closed when ergo-do terminates (but see :wait below).
A different robotic interface for the reactive elevator is shown in Figure 6.6. In this case,
exogenous actions are read at the interaction terminal using read-exogenous. The output
interface, however, is over TCP. This means that when ergo-do is run online, it becomes a
98
TCP client: a server process must already be running and waiting for connections over TCP
port 8123. When the ERGO program produces an action (such as up! or toggle-fan!), that
action is sent to the server using displayln. The ERGO client then waits until it receives an
ok from the server using read. At that point, the output function returns, and the execution
of the ERGO program resumes.
99
6.3.1 Sensing via exogenous actions
But in what sense does the ERGO system know the temperature in the elevator? In the
initial state of the world, the value of the temp fluent is given as 0. The system has com-
plete knowledge of this initial state. And yet, the temperature is expected to change, even
though none of the elevator actions change it. So how does the system maintain its com-
plete knowledge about the current temperature? The answer is that it expects to be told
about temperature changes via exogenous actions. That is, without having to perform any
sensing activities on its own, the ERGO system is told via an exogenous action when the
temperature goes up or down. The effect of a heat! event is precisely to inform an ERGO
program that the temp fluent has gone up by one unit. (The cold!, smoke!, and reset!
events are similar.) This is a form of passive sensing which depends on the occurrence of
exogenous actions, in contrast to a more active form of sensing considered below.
In the reactive elevator example, the temperature changes by one unit at a time. But
there might just as easily have been an exogenous action that models an arbitrary change
in temperature:
(define-action (thermometer-response! x) ; exogenous temperature change
temp x)
In this case, an occurrence of the exogenous event (thermometer-response! 4) would
signal a running ERGO program that the temp fluent had changed to 4 regardless of its
previous value.
It is, of course, outside the control of an ERGO program when exogenous events such
as these take place. In the simplest case, the external world might send an exogenous event
as soon as a fluent of interest changes. But the world might instead send an exogenous
event reporting the current value of a fluent only at regular timed intervals.
Another possibility is that these exogenous events only happen when they are requested
by a running ERGO program. This would be a form of active sensing. We might imagine a
robot with a number of onboard sensors, such as a thermometer, a sonar, a battery level,
and others. An endogenous action such has thermometer-request! would change no
fluents, but would tell the robot manager that a thermometer reading was requested. The
robot manager would then deal with the robot’s sensors and eventually cause an exogenous
action like (thermometer-response! x) to happen for some value of x, which would have
the effect of changing the temperature fluent to x as above. If there are no other exogenous
actions to worry about, an ERGO program might include something like
(:atomic (:act thermometer-request!) (:wait))
to perform the request and wait for the response. (The :atomic sequencing ensures that
other concurrent tasks do not execute between the request and the :wait, during which
the exogenous response might occur, causing the :wait to then hang.)
100
Consider, for example, the motion of a robot from room to room in a building, as
in Chapter 3. In the simplest case, there might be a single action goto! whose effect
is to change the location of the robot from one room to another. This change happens
instantaneously in the sense that no other actions (endogenous or exogenous) can happen
between the time the motion starts and the time it stops. This is the model of robot motion
reflected in the BAT of Figure 3.2.
But of course the motion of an actual robot is far from instantaneous, and in an online
context, it might be better to say that there are two or more instantaneous actions that start
and stop the behaviour:
In the best case, the robot will arrive at its intended destination, and the location fluent
will then be updated. But in some cases, the robot may have been stopped by the robot
manager because the motion was taking too long, or because the robot bumped into some-
thing, or for some other reason. In such a case, the fluent arrived? would be false and it
would then be up to the ERGO program to engage in some sort of failure recovery. (An-
other way of modeling this is for there to be more than one type of exogenous action that
can terminate the motion and change the value of the status fluent.)
In this example of robot motion, the behaviour is initiated by an endogenous action and
terminated by an exogenous one. Interestingly, user requests can be thought of as just the
opposite of this. For example, serving a floor in the reactive elevator is also a behaviour, but
one that is initiated by an exogenous action (the turnon! action, which requests service)
and terminated by an endogenous one (the turnoff action, which indicates completion).
In this case, between the starting and stopping actions, the elevator is in a “serving a floor”
state, reflected by that floor being an element of the on-buttons fluent, and is free to be
engaged in other concurrent actions
101
∗ 6.4 Backtracking and the :search primitive
This final section is concerned with backtracking. In fact, nothing has been said so far
about how online execution interacts with backtracking. Consider the following program:
(:begin
(:choose (:act a) (:act b))
(:act c))
Suppose that action b makes the prerequisite of action c true, but that action a does not.
In offline execution, ergo-do would consider action a first, then consider c, which
would fail, and then backtrack and consider action b instead, then c, which would now
succeed. So there is a single legal execution: action b followed by action c.
In online execution, however, ergo-do generates actions one at a time and sends them
to the outside world for execution. Once :choose has decided to try action a, that choice
is final. The action takes place in the world and there may be no way to undo its effect.
Without looking ahead to see action c, however, there is no way of knowing that action a
was a bad choice that will later lead to failure.
Why not simply defer the execution of action a until it can be determined that there
is a successful completion of the entire program with a as its first action? One of the
motivations of online execution is to generate actions incrementally without having to first
run through the entire program. Given a program like (:begin (:act a) p), the goal is
to get the robot started on action a before trying to find a successful execution of all of p,
which could be gigantic.
This means that in an online context, backtracking is handled in a different way. The
program (:begin p1 . . . pn ) generates actions incrementally starting at p1 without waiting
to see if there is a path all the way through to the end of pn . Consequently, an online
execution of a program may fail in cases where an offline execution would succeed, as in
the simple program above.
To guard against this possibility, the ERGO primitive
(:search p1 . . . pn )
can be used. In online mode, :search behaves like an offline :begin. In other words,
an action is produced by :search and sent to the output interfaces only if a successful
execution of all the pi subprograms in sequence can be found. So the online execution of
the program
(:search
(:choose (:act a) (:act b))
(:act c))
102
program p, it must be the case that a successful execution of both p then q can be found.
But the program (:begin (:search p) q), on the other hand, allows the online execution
of the two pieces to be detached. Before any actions are generated for program p, it must
be the case that a successful execution of all of p can be found. But the consideration of
q is deferred. If we imagine, for example, that q is a very large program, with hundreds
of pages of code, perhaps containing :search operators of its own, this can make an enor-
mous difference. So by using the :search primitive selectively, we obtain the advantages
of offline search (or planning) without having to deal with impossibly large search spaces.
One caveat concerns exogenous actions. The :search primitive uses ergo-do in offline
mode to search for an appropriate sequence of actions, which it then sends online to the
output interfaces. No exogenous actions are considered during the offline search. However,
if an exogenous action occurs while the actions are being sent to the output interfaces, the
computed sequence of actions may no longer be suitable, and failure may still occur. This
means that the :search primitive should only be used in contexts where exogenous actions
cannot invalidate the results of the offline search. (A more complex search operator would
restart the search after an exogenous action to find a replacement sequence of actions.)
103
Projects
Chapter 7
A Simulated World
105
Figure 7.1: The Squirrel World
Acorns and walls are distributed randomly on the grid and are completely passive.
Squirrels on the other hand have a number of actions at their disposal which are either
ordinary actions (that change the world) or sensing actions (that provide information to
the squirrel). Each action has a fixed duration. Squirrels have an energy level, the number
of time units they have left before they expire. Squirrels start at the maximum level, which
is 3000 time units. Energy levels decrease as time advances, but can be increased by eating
an acorn. Squirrels can pick up acorns and carry up to two of them. They can also drop
an acorn they are holding. Finally, squirrels can build additional wall segments. The first
squirrel to be located at a position where there are four acorns wins the game.
106
Note that an action may have no effect. For example, if a squirrel attempts a pick action
and either there is no acorn there or the squirrel is already holding two acorns, then the
action still has a duration of 30 time units, but has no effect. (To be able to find out if there
is an acorn present, the squirrel can use the smell action described below.) Of particular
note is the forward action: if a squirrel attempts this action and there is a wall segment
directly ahead, the action has no effect, but the squirrel is stunned and loses 750 units of
energy. (The look action described below can be used to check for a wall.)
Here is a summary of the four sensing actions:
In addition, the ordinary actions like pick return a sensing result which is either ok when
the action has its intended effect, or fail when the action has no effect.
107
Figure 7.2: Program file Projects/Squirrels/sw-bridge.scm
;;; This is interface code that can be used for any ERGO agent that interacts
;;; with a Squirrel World server.
;; The SW interface parameters (change as necessary)
(define portnum 8123) ; port for SW server
(define machine "localhost") ; machine for SW server
(define tracing? #f) ; print action info for debugging?
;;; Initializing the TCP connection to the SW server
(eprintf "Connecting to the Squirrel World\n")
(define sw-ports (open-tcp-client portnum machine))
(define sw-name (read (car sw-ports))) ; the first output of SW server
(eprintf "Squirrel ˜a ready to go\n" sw-name)
(define sw-acts ; acts to send to SW server
’(feel look smell listen left right forward pick drop eat build quit))
(define sw-responses (hasheq ; exog responses for sensing acts
’feel ’set-energy! ’look ’set-view! ’smell ’set-aroma! ’listen ’set-sound!))
;; Define the two ERGO interfaces (using a channel for exog actions)
(let ((chan (make-channel)) (iport (car sw-ports)) (oport (cadr sw-ports)))
(define (sw-read) (channel-get chan)) ; get exog from sw-write (below)
(define (sw-write act) ; send act over TCP and get response
(and (memq act sw-acts)
(let ()
(displayln act oport)
(let ((ans (read iport)) (exog (hash-ref sw-responses act #f)))
(and (eof-object? ans) (error "No response from Squirrel World"))
(and tracing? (eprintf "Sending: ˜a. Receiving: ˜a\n" act ans))
(and (eq? ans ’fail) (eprintf "Warning: Action ˜a failed\n" act))
(and exog (channel-put chan (list exog ans)))))))
(define-interface ’in sw-read)
(define-interface ’out sw-write))
In other words, when a look action is produced by an ERGO program, it will be followed
by an exogenous (set-view! wall) or (set-view! nothing) action, depending on what
is in front of the squirrel. It is then up to the ERGO basic action theory to decide which
fluents change as the result of these exogenous actions.
The communication method used by sw-bridge is worth noting as it may be useful
in other contexts. The main thing to observe is that the ’in interface (for exogenous
actions) defined by the internal procedure sw-read cannot simply read from the TCP port.
Instead, sw-read attempts to extract an entry from an internal channel, blocking when that
channel is empty. It will be up to the other internal procedure sw-write to put exogenous
actions into that channel. So sw-write really does all the communication work. Given
an action performed by an ERGO program, sw-write first decides using sw-acts whether
the action should be sent to the SW server. If so, it sends the action with displayln and
obtains the response immediately with read. If the action was a sensing action according
to sw-responses, it then constructs an appropriate exogenous action (as described above)
and puts that action in the channel for sw-read.
108
Figure 7.3: Program file Projects/Squirrels/simple-main.scm
;;; This program provides agent behaviour in the Squirrel World (SW)
;;; The squirrel finds a wall ahead, and then runs back and forth.
(include "sw-bridge.scm") ; define the interfaces to SW
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;; Basic action theory
(define-fluents
steps 0 ; the number of steps taken
seen ’nothing) ; the last thing seen by "look"
;; squirrel actions
(define-action left) ; turn left
(define-action look) ; look ahead
(define-action forward ; go forward
steps (+ steps 1))
;; exogenous action
(define-action (set-view! x) ; report from last look action
seen x)
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;; Control program
(define (count) ; go forward until a wall is seen
(:until (eq? seen ’wall) (:act forward) (:act look) (:wait)))
(define (run n) ; forever do 2 lefts and n forwards
(:while #t (:act left) (:act left) (:for-all i n (:act forward))))
(define (main) ; overall: count then run
(ergo-do #:mode ’online (:begin (count) (:let ((n steps)) (run n)))))
109
• (run n) is an ERGO program that repeatedly turns the squirrel around (by perform-
ing two left actions) and then does n forward actions. Note that this program does
not look for any walls, and so the squirrel moves at its top speed.
The main program performs the (count) procedure, then the (run n) procedure where n
is the value of the steps fluent immediately after the (count). (Note that :let is used to
ensure that n has the value of steps after the count. See Section 4.1.6.) The effect is to
have the squirrel walk cautiously up to the first wall ahead of it (perhaps the boundary at
the other side of the world), and then to run quickly back and forth between that wall and
the origin. The main program runs until the squirrel dies, and so the only way to stop it
before then is to stop the SW server (by typing a q in the SW server window).
Once the SW server is running, the simple squirrel program can be run using
The actual behaviour observed in the SW window will of course depend on which squirrel
ends up being connected to the program and on the locations of the walls.
110
Figure 7.4: Program file Projects/Squirrels/systematic-main.scm
(include "sw-bridge.scm")
(include "systematic-bat.scm")
(include "systematic-procs.scm")
(define (run-to x y) ; run to (x,y) but east/west on x-axis only
(:begin
(:for-all i yposition (go-dir ’south))
(:if (< x xposition) (:for-all i (- xposition x) (go-dir ’west))
(:for-all i (- x xposition) (go-dir ’east)))
(:for-all i y (go-dir ’north))))
(define (do-at-nest prog) ; run to (1,1), do prog, then run back
(:let ((x xposition) (y yposition)) (run-to 1 1) prog (run-to x y)))
(define (stroll-north) ; walk north, smelling and looking for walls
(:until (or (eq? seen ’wall) (> yposition (/ ymax 2)))
(go-dir ’north)
(check smell) (check look)))
(define (main)
(ergo-do #:mode ’online
(:monitor (low-energy) (acorns-present)
(:while #t ; systematic search of grid
(stroll-north) ; walk north as far as possible
(run-to xposition 0) ; run back to x-axis
(go-dir ’east) ; take one step east
(check feel) (check smell) (status-report)))))
111
Figure 7.5: Program file Projects/Squirrels/systematic-bat.scm
(define-fluents
seen ’nothing smelled 0 energy 100 carrying 0 stashed 0
xposition 0 yposition 0 direction ’north)
;; motion actions
(define-action quit)
(define-action forward
xposition (+ xposition (case direction ((east) 1) ((west) -1) (else 0)))
yposition (+ yposition (case direction ((north) 1) ((south) -1) (else 0)))
smelled 0
seen ’nothing)
(define-action left
direction (cadr (assq direction dirL))
seen ’nothing)
(define-action right
direction (cadr (assq direction dirR))
seen ’nothing)
;; acorn actions
(define-action eat
carrying (- carrying 1))
(define-action drop
carrying (- carrying 1)
stashed (+ stashed 1))
(define-action pick
carrying (+ carrying 1)
stashed (if (and (= xposition 1) (= yposition 1))
(- stashed 1) ; picking from stash
stashed)) ; picking from elsewhere
;; sensing actions
(define-action feel)
(define-action look)
(define-action smell)
;; exogenous actions
(define-action (set-energy! x) energy x)
(define-action (set-view! x) seen x)
(define-action (set-aroma! x) smelled (car x))
• The alists dirL and dirR are used to map one direction into another direction, the
one that results from turning left or right.
• All the sensing done by the squirrel passes through the procedure (check a), which
performs the sensing action a and waits for an exogenous report.
112
Figure 7.6: Program file Projects/Squirrels/systematic-procs.scm
• The procedure go-dir faces in a given direction, then takes a single step forward.
• The procedure (do-at-nest p) does the following: it remembers its current location
( x, y), it runs to (1, 1), executes p, and then runs backs to ( x, y).
• The procedure dump-at-nest uses do-at-nest to get to the nest, and there drops all
the acorns being carried.
• The procedure status-report prints information about the state of the squirrel.
113
The remaining procedures determine the top-level behaviour of the squirrel:
• The procedure stroll-north walks north using go-dir and at each step, smells for
an acorn and checks for a wall. It continues doing this until the squirrel sees a wall
or gets outs of its quadrant (when its yposition exceeds half of ymax).
• The procedure acorns-present does the following when at least one acorn has been
detected: If the squirrel is already carrying two acorns, it first dumps both of them
in the nest. Then the squirrel picks up an acorn and, if it now has four in total, it
goes to the nest to claim victory. Otherwise, it checks its health and smells for more
acorns at the same spot.
• The procedure low-energy does the following when the energy of the squirrel is
found to be low: If the squirrel is not carrying any acorns, it first goes to get one
from the nest. (If there are none in the nest either, the squirrel gives up and quits.) It
then eats one of the acorns it is carrying.
114
Figure 7.7: Program file Projects/Squirrels/random-main.scm
(include "sw-bridge.scm")
(include "systematic-bat.scm")
(include "systematic-procs.scm")
(define-fluent path ’())
(define-action push-path! path (cons direction path))
(define-action pop-path! path (cdr path))
(define (revdir dir) ; the 180 direction from dir
(cadr (assq (cadr (assq dir dirL)) dirL)))
(define (do-at-nest prog) ; go to nest, do prog, then return
(:begin (:for-all d path (go-dir (revdir d)))
prog
(:for-all d (reverse path) (go-dir d))))
(define (take-step r) ; try a step in a random dir according to r
(define (try-in-order dirs)
(:begin (face-dir (car dirs)) (check look)
(:if (eq? seen ’nothing)
(:begin (:act push-path!) (:act forward) (check smell))
(try-in-order (cdr dirs)))))
(try-in-order
(if (< r .48) ’(north east west south)
(if (< r .96) ’(east north south west) ’(south west north east)))))
(define (path-restart) ; go back to nest and start a new path
(:until (null? path) (go-dir (revdir (car path))) (:act pop-path!)))
(define (main)
(ergo-do #:mode ’online ; random search of grid
(:begin (go-dir ’north) (go-dir ’east) ; start from (1,1) location
(:monitor (low-energy) (acorns-present)
(:while #t
(path-restart) ; start search over
(:for-all i 35 (take-step (random))) ; take 35 random steps
(check feel) (status-report))))))
side. In other words, the squirrel might actually be in some sort of wall maze that might be
quite hard to exit (without starving).
One radical solution to this problem is to perform a non-systematic search of the grid,
where a squirrel moves somewhat randomly, always endeavouring to visit new territory,
while looking for walls and smelling for acorns as necessary. This requires a redesign of
the procedure and its basic action theory.
A simplified version of a randomized searching squirrel is shown in Figure 7.7. It uses
the same BAT and procedures as the systematic squirrel, but with a few modifications. It
uses a new fluent path to keep track of all the directions it has taken since the last time
it left the nest at (1, 1). It uses the same low-energy and acorns-present procedures as
before, but this time, the do-at-nest procedure uses the saved path (which is known to
be clear of walls). The search procedure in this case involves getting the squirrel to take 35
115
steps using take-step, and then running back to the nest and starting on a new path. The
take-step procedure uses try-in-order to attempt one step in a list of possible directions,
taking the first one that is not blocked by a wall. The order of directions is determined
randomly: 48% of the time it tries north, then east, then west, then south; 48% of the time
it tries east, then north, then south, then west; 4% of the time it tries something different:
south, then west, then north, then east. Overall, this means the squirrel heads somewhat to
the north and to the east, unless there are walls blocking it. Unlike the systematic squirrel,
it will not be stymied by walls, including walls that touch the x-axis (or any boundary).
Assuming the SW server is running, the random-main program can be run the same
way as systematic-main. There are cases where it will do better. But overall, it appears
to spend too much time looking for walls and smelling for acorns at the same nearby
locations. Much better would be to somehow keep track of all the “clear space” it has
visited, and to return to its nest only when it has too many acorns, and then to run back to
a perimeter of that clear space. While the search for acorns can be random, the goal should
always be to expand the clear space. This would preserve the advantages of the systematic
search, while avoiding its drawbacks.
116
Chapter 8
A LEGO Robot
After all this time talking about cognitive robotics, we are finally ready to work on an actual
robot. Perhaps the simplest programmable robot is the LEGO Mindstorms, available as an
inexpensive kit. The kit includes plastic LEGO pieces to build a variety of robots, as well
as a collection of sensors and motors. In this chapter, we consider a project that involves a
LEGO robot controlled by ERGO that moves on a small surface, picking up items at various
locations and delivering them to other locations, somewhat like the delivery example of
section 5.2.1. In terms of hardware, all that is needed for this robot beyond the LEGO kit is
a micro-SD card (to install the EV3dev operating system) and a USB Wifi dongle (to allow
wireless TCP communication with ERGO).
http://ev3dev-lang.readthedocs.io/projects/python-ev3dev/en/stable/
https://sites.google.com/site/ev3python/
117
Figure 8.1: The LEGO EV3 Brick
To give a sense of what an EV3 Python program looks like, two small ones are shown
here. Figure 8.3 shows EV3 Python code that gets a vehicular robot to drive in a square
pattern (or close) by dead reckoning, assuming motors on two wheels are connected to
out ports on the Brick. Figure 8.4 shows Python code that gets the vehicle to continue
driving forward until a sensor detects reflected light brighter than a certain threshold,
again assuming the light sensor is connected to an in port on the Brick.
118
Figure 8.3: Program file Servers/EV3/misc/square.py
import time
import ev3dev.ev3 as ev3
# motors must be connected to out ports A and D
left_wheel = ev3.LargeMotor(’outA’)
right_wheel = ev3.LargeMotor(’outD’)
# drive both motors for ms milliseconds
def straight(ms):
left_wheel.run_timed(time_sp=ms,speed_sp=500)
right_wheel.run_timed(time_sp=ms,speed_sp=500)
time.sleep(ms/1000.0)
# turn either left (dir=1) or right (dir=-1)
def turn(dir):
left_wheel.run_timed(time_sp=360,speed_sp=-500*dir)
right_wheel.run_timed(time_sp=360,speed_sp=500*dir)
time.sleep(.5)
# drive a squarish pattern
def square(ms):
for i in range(0,4):
straight(ms)
turn(1)
square(1500)
119
Figure 8.5: Program file Projects/LEGO/tag-bridge.scm
;;; This is interface code that can be used for an ERGO agent that
;;; uses tagged actions for online interactions
;; this calls define-interface after modifying readfn and printfn to use tag
(define (define-tagged-interfaces tag readfn printfn)
(define (read-add-tag)
(let ((r (readfn)))
(if (symbol? r) (list r tag) (cons (car r) (cons tag (cdr r))))))
(define (print-detag a)
(and (not (symbol? a)) (not (null? (cdr a))) (eq? (cadr a) tag)
(printfn (cons (car a) (cddr a)))))
(define-interface ’in read-add-tag)
(define-interface ’out print-detag))
;; setup in and out interfaces over TCP
(define (tag-tcp-setup tag portnum IPaddress)
(eprintf "Setting up interfaces over TCP for ˜a\n" tag)
(define tcp-ports (open-tcp-client portnum IPaddress))
(define-tagged-interfaces tag
(lambda () (read (car tcp-ports)))
(lambda (act) (displayln act (cadr tcp-ports))))
(eprintf "˜a is ready to go\n" tag))
;; setup in and out interfaces with standard IO
(define (tag-stdio-setup)
(eprintf "Setting up interfaces over stdin and stdout\n")
(define-tagged-interfaces ’user read-exogenous write-endogenous))
120
Figure 8.6: Program file Projects/LEGO/test-manager1.scm
;;; This program uses the EV3 Robot Manager 1 to perform the actions below.
;;; Once the robot manager is running, this ERGO program can be run by
;;; racket -l ergo -f test-manager1.scm -m <IPaddress>
;;; where <IPaddress> is the address of the EV3 machine.
(include "tag-bridge.scm")
(define-fluents light 100)
;; endogenous
(define-action (run_motor! r x))
(define-action (req_sensor! r))
;; exogenous
(define-action (reply_sensor! r z) light z)
(define (main . args)
(and (null? args) (error "Must supply an IP address for EV3"))
(tag-tcp-setup ’my-EV3 8123 (car args)) ; port 8123 assumed
(ergo-do #:mode ’online
(:begin (:act (run_motor! ’my-EV3 2000)) ; run the motor for 2000 ms
(:act (req_sensor! ’my-EV3)) ; ask for a sensor reading
(:wait) ; wait for value returned
(:>> (printf "The light had value ˜a\n" light)))))
• signal_exogenous(actName,args)
This function takes two arguments, actName, which is a string, like ’respond!’ or
’wall-detected!’, and a list of arguments for the action, each of should be a number
121
or a string. The function sends the action (converted into a Scheme list) over TCP to
the ERGO system.
• ergo_tcp_session(portnum,initialize,handle_endogenous)
This function starts a TCP server on the EV3 waiting for a connection on the port
given by the portnum argument, and terminates when the session closes. The other
two arguments are functions that are called after a connection is made:
Note that everything that the robot manager does during a TCP session is as a result of
its initialize or handle_endogenous arguments. So it is up to the handle_endogenous
argument to do what it takes for every possible endogenous action it can receive. Typically,
the body of this function will be something like this:
where the acti are the names of all the endogenous actions that can be received from
ERGO, and the proci are Python functions defined in the robot manager. These Python
functions will go on to use the EV3 library to control the motors, read from the sensors, and
then call the function signal_exogenous as necessary. It is a good idea when developing
a robot manager to begin with stubs for these proci functions that simply use print and
input operations instead of motors and sensors.
122
Figure 8.7: Program file Servers/EV3/manager1.py
# This is a robot manager for an EV3 brick interacting with ERGO over TCP
# It accepts the following endogenous actions:
# - (run_motor! t), where t is in milliseconds
# - req_sensor!
# It generates the following exogenous actions:
# - (reply_sensor! n), where n is between 1 and 100
import sys, ev3dev.ev3 as ev3
from ergo_communication import signal_exogenous, ergo_tcp_session
moto = ev3.LargeMotor(’outA’) # need a motor on out port A
colo = ev3.ColorSensor() # need a color sensor on an in port
colo.mode = ’COL-REFLECT’ # color sensor returns 1 to 100
# Run the motor on port A for t milliseconds
def run_motor(t):
moto.run_timed(time_sp=t, speed_sp=500)
# Return a number from 1 to 100 in a reply_sensor! exogenous action
def req_sensor():
signal_exogenous(’reply_sensor!’,[colo.value()])
########################################################################
### The two procedures needed for ergo_tcp_session
def ini():
ev3.Sound.speak(’Starting Manager One’).wait()
print("Starting Manager 1")
def dispatch(actName,args):
if actName==’run_motor!’: run_motor(args[0])
elif actName==’req_sensor!’: req_sensor()
else: print(’No dice’)
ergo_tcp_session(8123,ini,dispatch)
At this point, every time a number n is typed in the telnet session, the EV3 should respond
with the number n + 7, until the session is terminated.
We are now ready to try the program manager1.py, a simple robot manager for the
EV3 shown in Figure 8.7. Note that the two function arguments to ergo_tcp_session here
are ini and dispatch. The dispatch function is prepared to deal with the two endoge-
nous actions produced by the ERGO program test-manager1.scm from Figure 8.6, and
to send to that program the exogenous action it is expecting. So once Python is running
manager1.py on the EV3, we can test its operation by running test-manager1.scm under
ERGO on a different computer in the same local area network and, if all is working well,
we get a LEGO motor to spin for two seconds and the LEGO sensor value printed.
This, then, is a complete cognitive robotic system of the sort shown in Figure 6.1. All
that is left to do is to give it something more interesting to think about.
123
8.3 A delivery robot project
The project we want to consider in this chapter involves a robot delivery service. The idea
is that the LEGO robot is a vehicle of some sort that can move around on the roads over a
given terrain. There will be packages for it to pick up at certain designated locations along
the roads, and packages to deliver at others. Most of the robot’s time will actually be spent
moving on a road from one location to another, possibly turning at intersections, which
are also designated locations. We assume that the robot itself does not put packages on its
cart or take them off; at the appropriate locations, it signals for customers there to do so.
So in the simplest case, we are considering three endogenous actions that the LEGO
robot can perform:
• (leave-location! rob)
The robot starts moving along the road from it current location towards the next one.
It continues following the road until an arrive-at-location! action happens.
• (req-customer-action! rob)
The robot announces its presence at its current location and waits for someone to
take parcels off or put parcels on. It does not leave until the customer-action-done!
action below happens.
We are also considering three exogenous actions that can happen:
• (arrive-at-location! rob)
This action happens when a moving robot arrives at the next location on the road.
• (customer-action-done! rob)
This action happens after someone signals that the parcels to be picked up have been
placed on the robot and the parcels to be delivered have been removed.
124
Figure 8.8: A LEGO Robot with a downward light sensor
125
Figure 8.9: Program file Projects/LEGO/delivery-map.scm
;;; This is the code that defines the map of roads for the LEGO robot
;; The table below is for a map that looks like this:
;; 1 --- 2 --------- 7 ---8
;; / | \
;; | | \
;; 3 ----- 4 --- 5 ---- 6 ----- 9
;; / |
;; 10 11
(define adjacency
(hasheq 1 ’(#f 2 #f #f) 2 ’(#f 7 4 1) 3 ’(#f 4 #f #f) 4 ’(3 2 5 10)
5 ’(4 #f 6 #f) 6 ’(5 7 9 11) 7 ’(2 #f 8 6) 8 ’(7 #f #f 9)
9 ’(6 8 #f #f) 10 ’(#f 4 #f #f) 11 ’(#f 6 #f #f)))
(define (adjs x) (hash-ref adjacency x #f))
;; the next location after leaving x with orientation ori
(define (next-location x ori) (list-ref (adjs x) ori))
;; the next orientation after leaving x with orientation ori
(define (next-orientation x ori)
(let ((locs (adjs (next-location x ori))))
(modulo (+ (for/or ((i 4)) (and (eq? x (list-ref locs i)) i)) 2) 4)))
;; the orientation that results from turning dir = left, right or around
(define (shift-orientation ori dir)
(modulo (+ ori (case dir ((left) -1) ((right) 1) ((around) 2))) 4))
;; the dir that is needed to go from x and ori to an adjacent location y
(define (get-direction x ori y)
(let ((locs (adjs x)))
(case (for/or ((i 4)) (and (eq? y (list-ref locs (modulo (+ ori i) 4))) i))
((0) #f) ((1) ’right) ((2) ’around) ((3) ’left))))
;; a path of adj locations whose first element=start and last element=end
(define (find-path start end)
(let loop ((seen ’()) (nodes (list (list start))))
(define (nexts x) (for/only ((y (adjs x))) (and y (not (memq y seen)))))
(define (next-paths x path) (for/list ((y (nexts x))) (cons y path)))
(if (null? nodes) #f
(let ((path (car nodes)) (x (caar nodes)))
(if (eq? x end) (reverse path)
(loop (cons x seen) (append (cdr nodes) (next-paths x path))))))))
is necessary). Finally, the function find-path returns a shortest list of adjacent locations
between a start and end location.
126
Figure 8.10: Program file Projects/LEGO/delivery-bat.scm
or remove packages, and the fluents pending, onboard, and done as a list of items that
are either to be delivered, are being delivered, or have been delivered respectively. Note
that leave-location! changes in-transit? only; location and orientation are changed
only after the robot arrives at its next location via the exogenous arrive-at-location!.
(The actual movement of the robot will be initiated and terminated by the robot manager.)
However, the prerequisite of the leave-location! action tests the next location to ensure
that there is a location to go to. As for the packages, note that items are put on at the ends
of the onboard and pending lists, making those lists behave like queues.
127
Figure 8.11: Program file Projects/LEGO/delivery-main.scm
128
load or unload at the current location, and if so, requests customer action, and then waits
for a signal to proceed. Note that the robot is opportunistic: if on the way to some location,
it happens to find itself somewhere where there are packages to be picked up or dropped
off, it will stop there before continuing.
The head-for procedure is what is used for taking one step towards a goal location.
It finds the shortest path towards the goal, does a turn as necessary to face the first des-
tination in the path, and then heads out towards that location, waiting until it arrives.
The way the program is structured, the robot only does one step at a time before recon-
sidering where it is heading. This allows it to react to changing circumstances, possibly
reconsidering the rest of the projected moves.
The next-work-loc procedure decides where to go to deliver or pickup packages. If
either nothing is onboard or nothing is pending, the case is clear. But if there are packages
on board and packages waiting to be picked up, the current routine looks at the destination
of the first onboard package and the source of the first pending package and goes to the
location that is the furthest away. (Other scheduling options are certainly possible.)
129
Figure 8.12: Program file Servers/EV3/delivery manager0.py
location, ready to move forward. This may require some experimentation and tuning de-
pending how well the robot can rotate on its own axis.
There are different ways of handling the req-customer-action! action. First the robot
should announce its presence. It can beep, say something using the onboard speech system,
flash its LEDs, or even use a small motor to raise a flag. At this point, the thread should
begin. Maybe the simplest thing to use is a touch sensor aimed upwards. The thread can
repeatedly read the value of that sensor until the touch sensor has been pressed by the
customer, in which case the thread would signal completion and terminate.
The leave-location! action should start the robot moving forward and then monitor
its progress in a loop within a thread. The idea is to check the light sensor aimed down-
wards repeatedly and decide what to do based on whether the sensing value indicates the
130
presence of a road, a designated location, or just ordinary terrain. When the robot is on a
road, it should continue moving; when the robot is at a designated location, it should stop
moving and signal arrival and terminate the thread; when the robot is on ordinary terrain,
it should adjust its motion, to the left or to the right, to get back onto the road. How
much to turn and for how long to get a smooth forward motion along a road (especially
a curving one) is something that will definitely require experimentation and tuning. This
might be the most difficult robot behaviour to get right. (There are some online examples
of programs that get a LEGO robot to follow a line in this way.)
131
Chapter 9
Real-time video games offer one of the more compelling examples of computer-generated
artificial worlds. In these games, a user is shown a graphically-rendered environment, and
then, using a keyboard or other hardware controller, the user must control one or more
agents living in that environment to achieve some goal. In some cases, the goal might
be no more than to explore the environment; in other cases, the goal might be to build a
city or to stay alive. It is typical of these games that other agents also inhabit the same
environment. These might be other users connected to the game somehow (for example,
over a network) or computer-generated agents (often called non-player characters or NPCs).
In this chapter, we consider an ERGO project that involves the construction of a very
simple game called CarChase using the freely-available Unity 3d game development sys-
tem. In this game, there are two primitive cars called Patrol and JoyRide that drive around
an equally primitive terrain. The Patrol car is under user control, while the JoyRide car
will be an NPC controlled externally by ERGO. The object of the game is for the Patrol car
to catch the JoyRide car and tap it on the rear bumper. A snapshot of a game in progress
is shown in Figure 9.1. This shows the Patrol car from behind with its headlights on ap-
proaching the JoyRide car ahead and to its right. (The other triangular-shaped features on
the terrain are supposed to be rocks.)
9.1 Unity 3D
Unity 3d is a system for constructing real-time video games. The games can then be played
on computers, over the web, or on smart phones. What the games actually look like (or
sound like) is completely up to the game designer: Unity 3d provides basic shapes and
textures, but it is up to the programmer to assemble them into reasonable-looking worlds.
When a game is being played, the world is displayed (according to what the main camera
has been told to look at), one video frame at a time. What the objects in the game actually
do from frame to frame is again up to the programmer: Unity 3d provides basic physics for
inactive physical objects, but the programmer must provide scripts in the C# programming
language for objects that are intended to behave on their own.
A full account of Unity 3d and C# are beyond the scope of this book, but there is
considerable material online. See https://unity3d.com/ for information about Unity 3d,
including how to install it on various platforms, and an extensive manual and help system.
132
Figure 9.1: A CarChase game in progress
using UnityEngine;
// This is the script to be attached to the car "Patrol" in the scene.
// It reads arrow keys from the terminal and moves the car accordingly.
public class PatrolCar : MonoBehaviour {
private float forwardSpeed = 80f;
private float turnSpeed = 0.4f;
private Rigidbody myCarBody;
private Transform myCar;
void Start () {
myCarBody = GetComponent<Rigidbody>();
myCar = GetComponent<Transform>();
}
void Update () {
float forwardAmount = Input.GetAxis("Vertical")*forwardSpeed;
float turnAmount = Input.GetAxis("Horizontal")*turnSpeed;
myCar.Rotate(0,turnAmount,0);
myCarBody.AddRelativeForce(0,0,forwardAmount);
}
}
133
called by Unity 3d at the start of a game; and Update, called by Unity 3d before a video
frame is rendered during a game. In this case, the Update method first determines which
arrow keys are currently being pressed by the user. For the vertical, Input.GetAxis returns
1 for up-arrow, -1 for down-arrow, and 0, otherwise; for the horizontal, it returns 1 for left-
arrow, -1 for right-arrow, and 0, otherwise. The Update method then calls the car’s Rotate
and AddRelativeForce methods to change the orientation and position of the Patrol car in
the current frame. (The car has been constructed as a rigid body subject to physics, and
thus force is required to move it forward.) So for example, if no arrow keys are pressed,
the turnAmount and forwardAmount will both be set to zero, and the car will not move in
current frame, unless it is coasting as a result of its physics.
In addition to the Start and Update methods, there are many other special methods
that are called automatically by Unity 3d. The ones used here in the CarChase project for
the other car (the JoyRide car controlled by ERGO) are the following:
This only scratches the surface of the many behaviours that can be controlled by Unity 3d,
which includes GUI events and facilities for multi-player games.
134
Figure 9.3: The form for a Unity 3d robot manager
using UnityEngine;
// This is a script to be attached to an ERGO-controlled object called myRobot
public class myRobot : MonoBehaviour {
private ErgoServer server = new ErgoServer();
// any other declarations
void Start () {
server.Start();
}
void OnApplicationQuit() {
server.Stop();
}
void Update () {
server.Update();
if (server.hasData) {
string act = server.getEndogenous();
// what to do with the endogenous action
}
}
// any other methods
}
These are all methods to be called except for hasData which is a read-only variable. None
of the methods take arguments except for signalExogenous, which takes a string argu-
ment. The methods return no values except for getEndogenous, which returns a string.
On the ERGO side, the file u3d-bridge.scm defines (u3d-start-comm port trace?),
which makes a connection to a Unity 3d server over TCP, and defines the interfaces re-
quired by online ERGO programs (See Section 6.2.) If the trace? argument is true, the
actions sent and received are also displayed on the terminal.
135
Figure 9.4: The terrain with no cars
136
Figure 9.5: A closeup of a car
are checked, so that the car will only turn on its Y-axis. This completes a basic car. Rename
the car Patrol, and make a copy of it called JoyRide for later use below.
To complete the Patrol car, move the Main Camera to within the Patrol object. This
causes the camera to follow the Patrol car, providing a first-person view of the scene during
the game. Position the camera so that it is just behind the Patrol car looking down, as in
Figure 9.1. Finally, the Patrol car needs a controlling script: attach the one from Figure 9.2.
The Patrol car is now ready for a test drive. Confirm that the system works properly
by starting the game, pressing the arrow keys and seeing the Patrol car move appropri-
ately. (Note that vertical and horizontal keys can be pressed simultaneously.) It should be
possible to drive around the entire terrain, avoiding rocks, using the arrow keys.
137
Figure 9.6: Program file Servers/Unity3D/CarChase/Assets/JoyRideCar.cs
using UnityEngine;
// This is the script to be attached to the "JoyRide" car in the scene.
// It accepts actions via an ErgoServer and moves the car accordingly.
public class JoyRideCar : MonoBehaviour {
private float forwardSpeed = 35f;
private float turnSpeed = 0.6f;
private Rigidbody myCarBody;
private Transform myCar;
private int turnDir = 0;
private int forwardDir = 0;
public ErgoServer server = new ErgoServer();
void Start () {
myCarBody = GetComponent<Rigidbody>();
myCar = GetComponent<Transform>();
server.Start();
}
void Update () {
server.Update();
if (server.hasData) {
string act = server.getEndogenousAct();
switch(act) {
case "right-turn!": turnDir = 1; break;
case "left-turn!": turnDir = -1; break;
case "straight!": turnDir = 0; break;
case "stop!": forwardDir = 0; break;
case "go!": forwardDir = 1; break;
case "reverse!": forwardDir = -1; break;
}
}
myCar.Rotate(0,turnDir*turnSpeed,0);
myCarBody.AddRelativeForce(0,0,forwardDir*forwardSpeed);
}
}
Nothing happens in Unity 3d unless it has the focus, so bring it to the front. It should now
display a message saying that a connection from ERGO has been received. Returning to the
telnet window, typing a go! action should cause the JoyRide car to move when Unity 3d
regains the focus. If all is well, the JoyRide car will proceed forward and eventually crash.
The Joyride car needs two more components to give it additional behaviour. First,
create a rear bumper as a Cube of scale (1,1,.16) positioned at the rear of the car. The
MeshRenderer component of this cube should be removed so that the cube will be invisible.
The IsTrigger box of its BoxCollider should be checked so that it will be triggered by
collisions. Then the script shown in Figure 9.7 should be attached. During the game, if
anything touches the rear bumper (namely, the Patrol car), the OnTriggerEnter method
138
Figure 9.7: Program file Servers/Unity3D/CarChase/Assets/BumperScript.cs
using UnityEngine;
// This script is attached to the RearBumper component of the JoyRide car.
// It detects contact with the Patrol car and ends the game.
public class BumperScript : MonoBehaviour {
Light overhead;
void Start() {
overhead = GameObject.Find("OverheadLight").GetComponent<Light>();
}
void OnTriggerEnter(Collider col) {
overhead.intensity = 0.0f;
Application.Quit();
}
}
using UnityEngine;
// This script is attached to the LookAhead component of the JoyRide car.
// It detects collisions and signals them with an exogenous action.
public class LookAheadScript : MonoBehaviour {
ErgoServer server;
Transform myTrans;
void Start() {
myTrans = transform;
server = transform.parent.GetComponent<JoyRideCar>().server;
}
void OnTriggerEnter(Collider col) {
Vector3 myPos = myTrans.position;
Vector3 itsPos = col.ClosestPointOnBounds(myPos);
float dist = Vector3.Distance(myPos,itsPos);
server.signalExogenous("(object-detect! "+dist+")");
}
void OnTriggerExit(Collider col) {
server.signalExogenous("(object-detect! 0)");
}
}
will be called to turn off the overhead light and quit the game.
Finally, the JoyRide car needs a way of seeing objects ahead of it. One way of doing
this is to create a long Cube of scale (15,1,2.7) and placing it at the front of the car. Like
the rear bumper, the cube should be made invisible and set so that it will be triggered by
collisions. The idea is that an object that is directly in front of the car will touch this cube,
and this event can be reported to ERGO so that it can decide what to do. In effect, the
invisible cube behaves like a forward-looking sensor for the car.
139
Figure 9.9: Program file Projects/Unity3D/u3d-car.scm
;;; This is the ERGO code for a robot car doing a JoyRide
;;; It communicates with a running Unity 3d engine.
(include "u3d-bridge.scm") ; bridge to the Unity 3d Engine
(define-fluents
distance 0 ; how far to an object (0 = clear)
turning? #f ; am I turning?
advancing? #f) ; am I going forward?
;; four normal actions
(define-action right-turn! turning? #t)
(define-action straight! turning? #f)
(define-action go! advancing? #t)
(define-action stop! advancing? #f)
;; one exogenous action, reporting on objects ahead
(define-action (object-detect! d) distance d)
;; the car controller
(define (control)
(:monitor
(:when (and (> distance 0) (< distance 20) advancing?)
(:act stop!) (:unless turning? (:act right-turn!)))
(:when (and (> distance 0) (< distance 60) (not turning?))
(:act right-turn!))
(:when (= distance 0)
(:when (not advancing?) (:act go!))
(:when turning? (:act straight!)))
(:while #t (:wait))))
(define (main . args)
(u3d-start-comm 8123 (not (null? args))) ; set up online interfaces
(ergo-do #:mode ’online (control))) ; run the ERGO program
The script to be attached to the cube for this purpose appears in Figure 9.8. What it does
is to first locate the ErgoServer instance used by the JoyRide car itself. Then, if anything
touches the cube, the distance between the cube and the colliding object is calculated and
sent to ERGO using signalExogenous. When the object stops touching the cube, ERGO is
sent another exogenous event with a distance value of zero.
This completes the Unity 3d part of the CarChase project, which now can be saved as
an application, allowing the game to run without the Unity 3d development environment.
140
without turning, waiting for something to happen; if it detects an object ahead in the
distance, it continues forward but initiates a turn to the right; if the object ahead gets too
close, it stops the forward force and initiates or continues the turn to the right; when it no
longer detects an object ahead, it resumes driving straight normally.
Nothing will happen until the game regains the focus, but at that point, the JoyRide car
should start moving under ERGO control. The Patrol car can then be put in pursuit using
the arrow keys, and the chase is on.
141
all or nothing. For the Patrol car this would mean using other keyboard keys as controls;
for the Joyride car this would require new endogenous actions. The simplest way perhaps
is to have two endogenous actions: one that sets a forward force factor to a given amount
(from -1 to 1, say) and one that sets a turning factor. A more complex solution would
involve separate gas and brake controls and a more sophisticated physics for coasting.
The JoyRide car could also make use of the ability to sense a nearby Patrol car. This
might be programmed in the form of a “radar detector” that sends an exogenous event
whenever the Patrol car is close (as part of the car Update method), or more simply, a
rear-looking detector that reports only when the Patrol car is not too far behind.
From a cognitive robotics point of view, perhaps the biggest limitation of the CarChase
project is the purely reactive nature of the ERGO control program. It could have been
written just as easily in C#. In the end, if all a cognitive robot needs to do is to move
forward without bumping into things, there is not much for it to think about! Where the
ERGO language pays off is when the robot has some purpose other than just driving, and
needs to deliberate about what to do. (In other chapters, we have seen cases where a robot
needs to do considerable offline planning before it can decide what to do next.) One way
to consider making the CarChase world richer in this sense is for there to be locations in
the world that the JoyRide car needs to visit in some order to achieve some goal.
It is worth remembering, however, that any agent in Unity 3d that can be controlled
using arrow keys can also be controlled by endogenous actions in a similar way. The main
limitation is that whereas a user at a keyboard gets to see the entire screen (and perhaps
hear things too), the cognitive robot only gets sensing reports via exogenous actions. As
a first step towards levelling the playing field, it will be necessary to ensure that an agent
controlled by ERGO gets as much information as it needs from its sensors.
142
Advanced Topics
∗ Chapter 10
This book has emphasized cognitive robots from the programming perspective. The idea of
cognitive robotics, however, first arose as a problem not in programming, but in knowledge
representation and reasoning. The idea is to imagine a system that can reason in an automated
way from a collection of facts called its knowledge base.
In the case of cognitive robotics, the facts in the knowledge base concern the state of the
world as well as its dynamics, that is, how the state of the world changes as the result of
actions. In this chapter we reconsider the planning and program execution of a cognitive
robot from the perspective of a reasoning agent that uses the language of first-order logic
to represent what it knows, and that reasons using logical inference.
The main reason for moving from programming to logical reasoning is to consider an
agent capable of making effective use of a wider range of knowledge than the BATs seen so
far. For example, an agent might have to deal with incomplete knowledge about the world
it is dealing with, where the values of fluents are not known. (We will see more of this in
Chapters 11 and 12.) More generally, an agent may need to deal with knowledge that does
not fit the current BAT pattern at all. For example, in deciding what to do, an agent may
need to reason about the passage of time, or about the properties of rigid physical objects,
or about how people typically interact. All of these suggest a knowledge base drawn from
a declarative language that is more expressive than the ERGO language of BATs.
10.1.1 Syntax
Expressions in the language of first-order logic are made up of logical and non-logical sym-
bols. The logical symbols are the punctuation marks (parentheses, comma, and period),
the logical connectives (∧, ¬, ∀, and = are sufficient, but others are typically introduced
as abbreviations), and an infinite supply of individual variables. The non-logical symbols
vary from application to application and are made up of predicate symbols and function
symbols. Each non-logical symbol has an “arity” which is a non-negative integer specify-
144
ing how many arguments the predicate or function takes. Function symbols of 0-arity are
called constants.
The expressions in the language come in two forms: terms and well-formed formulas
(or wffs, for short). The terms are defined as the least set satisfying the following:
For constants, c is usually written instead of c(). The wffs are defined inductively as
follows:
3. If α and β are wffs and x is a variable, then ¬α, (α ∧ β) and ∀x.α are wffs.
Usually certain liberties are taken with the syntax: parentheses are added or omitted for
clarity; square brackets and curly braces are used instead of parentheses, and abbreviations
are used: (α ∨ β) for ¬(¬α ∧ ¬β), (α ⊃ β) for (¬α ∨ β), (α ≡ β) for ((α ⊃ β) ∧ ( β ⊃ α)), and
∃x.α for ¬∀x.¬α.
Variables within a wff are considered to have a scope determined by the quantifier ∀.
An appearance of a variable x within a formula is said to be bound if it appears within
a subformula ∀x.α; otherwise the appearance is said to be free. The notation αtx is used
to name the formula that results from replacing every free occurrence of the variable x in
formula α by the term t. A wff without free variables is called a sentence.
10.1.2 Semantics
Terms and formulas of first-order logic are interpreted according to an interpretation M
made up of two parts: a non-empty set D called the domain of interpretation, and a map-
ping from the predicate and function symbols of the language to relations and functions
over D. More precisely, for each predicate symbol P of arity k, the interpretation of P
according to M, written PM is a k-ary relation over D:
PM ⊆ [ D × D... × D ] (k times).
f M ∈ [ D × D... × D → D ] (k times).
The idea is that an interpretation will determine which sentences of the language are true
and which are false. To do so, it specifies which formulas are satisfied for which values of
its free variables. Let µ be a function from the variables to D. Then, the denotation of a
term t with respect to M and µ, written | t |M,µ , is defined inductively by:
1. | x |M,µ = µ( x );
145
2. | f (t1 , . . . , tk ) |M,µ = f M (| t1 |M,µ , . . . , | tk |M,µ ).
3. M, µ |= ¬α iff M, µ 6|= α.
4. M, µ |= (α ∧ β) iff M, µ |= α and M, µ |= β.
For sentences, α is said to be true wrt M iff M, µ |= α for any µ, and false otherwise.
10.1.3 Pragmatics
The main use of first-order logic in our context is for entailment: we say that a set of
sentences Σ logically entails a sentence α, written Σ |= α iff there is no interpretation that
makes all the sentences in Σ ∪ {¬α} true. In other words, every interpretation where the Σ
sentences are true is one where α is also true.
Logical inference is the process of determining the entailments of a given Σ. The most
basic argument that a sentence α is entailed involves using interpretations as above. But an
argument can also be formulated in terms of a collection of rules of inference that allow the
entailment question to be broken down into simpler pieces. Here are some typical rules:
• If α ∈ Σ, then Σ |= α.
• Σ |= ∀x ( x = x ).
A collection of rules like this is said to be sound if it only allows correct entailments to
be derived, and complete if all the correct entailments can be derived using just the rules.
(The four rules above are sound but not complete.)
146
except that some of them are considered to be changeable as the result of actions; these are
called fluents and they take a situation term as their final argument.
So, for example, Broken( x, s) might be the predicate that says whether the object de-
noted by x is broken in the situation denoted by s. The following sentence might be true:
¬Broken(obj1 , S0 ) ∧ Broken(obj1 , do (drop(rob, obj1), S0 ))
Informally, this says that the object obj1 is not broken in the initial situation, but is broken
in the situation that results from doing the action drop(rob, obj1 ) in the initial situation.
In other words, the object is not broken initially, but is broken right after the robot rob
drops it. Note that there is no distinguished “current” situation. A single sentence in the
language can talk about many different situations, past, present, and future.
There is a final distinguished symbol in the language, a predicate symbol Poss, where
Poss( a, s) is intended to be true if the action denoted by a is possible in the situation
denoted by s.
147
This says that an object x will be broken after doing an action a iff a is a dropping action
and x is fragile, or a is a bomb exploding where x is near the bomb, or x was already broken
and a is not the action of repairing it. In other words, Broken is made true by dropping
and exploding actions (under the right circumstances), made false by appropriate repairing
actions, and left unchanged by all other actions.
A situation calculus basic action theory consists of these parts:
Poss( A( x1 , . . . , xn ), s) ≡ φ( x1 , . . . , xn , s)
where φ does not mention Poss and the only situation term it uses is the variable s;
3. successor state axioms: for each predicate fluent P, a sentence of the form
P( x1 , . . . , xn , do ( a, s)) ≡ γ( x1 , . . . , xn , a, s)
where the only situation term in γ is the variable s. (Similar axioms are needed for
the functional fluents;
10.4 An example
An example basic action theory of this form is shown in Figure 10.1. This is the situation
calculus version of the fox, hen, grain problem of Section 3.5.1. In this example, there
are two actions, crossAlone and crossWith( x ) to move the farmer and possibly one other
passenger from one bank of the river to the other, and a single fluent OnLe f t( x, s) that is
intended to hold when object x is located on the left bank of the river in situation s. As
can be seen in the specification of the initial state, there are four objects in the world (not
counting the situations and actions) and they all start out on the left bank of the river.
The precondition axiom states that the farmer can cross the river alone provided it
would be safe to do so: either the hen or both the fox and the grain are on the other side of
the river. In addition, the farmer can cross with passenger x if x is on the same side as the
farmer and it would be safe to do so: either x is the hen, or it would be safe to cross alone.
The successor state axiom characterizes how the crossing actions change the locations
of objects: for any object x and action a, x will be on the left bank after a is performed iff a
moves x and x was not on the left bank before or a does not move x and x was on the left
bank before. (A crossing action moves x when x is the farmer or the passenger.)
148
Figure 10.1: The fox, hen, grain problem in logic
Initial State:
∀x ( x = f armer ∨ x = f ox ∨ x = hen ∨ x = grain)
∀x OnLe f t( x, S0 )
Preconditions:
Poss(crossAlone, s) ≡
[OppSide( f armer, hen, s) ∨ (OppSide( f ox, hen, s) ∧ OppSide(hen, grain, s))]
Poss(crossWith( x ), s) ≡ ¬OppSide( f armer, x, s) ∧ [ x = hen ∨
OppSide( f armer, hen, s) ∨ (OppSide( f ox, hen, s) ∧ OppSide(hen, grain, s))]
Successor States:
OnLe f t( x, do ( a, s)) ≡ ¬(OnLe f t( x, s) ≡ Moves( a, x ))
reinstating the situation variable and then replacing it by t. Given a basic action theory
Σ as above, and a situation-suppressed goal formula G, the planning problem is to find a
sequence of action terms without variables a1 , a2 , . . . , an such that the following hold:
1. Σ |= G [do ( an , . . . , do ( a2 , do ( a1 , S0 )) . . .)].
10.6 Golog
Just as planning can be formulated as a logical reasoning problem, so can the execution of
programs. To do so, for each program δ under consideration, a formula of the situation
149
calculus Do (δ, s1 , s2 ) is defined. This formula is intended to hold when the program δ
started in situation s1 can terminate in situation s2 . (We say “can terminate” instead of
“will terminate” since the programs may be nondeterministic.)
The programming language Golog is the ancestor of ERGO and has constructs similar
to those of Chapter 4. The Do formula for these is defined as follows:
.
• primitive actions: Do ( a, s1 , s2 ) = Poss( a, s1 ) ∧ s2 = do ( a, s1 ).
.
• sequence: Do (δ1 ; δ2 , s1 , s2 ) = ∃s. Do (δ1 , s1 , s) ∧ Do (δ2 , s, s2 ).
.
• test action: Do (φ?, s1 , s2 ) = φ[s1 ] ∧ s2 = s1 . (where φ is situation-suppressed)
.
• nondeterministic choice: Do (δ1 | δ2 , s1 , s2 ) = Do (δ1 , s1 , s2 ) ∨ Do (δ2 , s1 , s2 ).
.
• nondeterministic pick: Do (πx. δ, s1 , s2 ) = ∃x. Do (δ, s1 , s2 ).
.
• nondeterministic iteration: Do (δ∗ , s1 , s2 ) = ∀P(. . . ⊃ P(s1 , s2 ))
where the ellipsis stands for: ∀sP(s, s) ∧ ∀s, s0 , s00 ( P(s, s0 ) ∧ Do (δ, s0 , s00 ) ⊃ P(s, s00 )).
(The Golog language includes other constructs not discussed here.) Note that the Do
formula for the nondeterministic iteration is in fact a formula of second-order logic whose
details need not concern us.
An example of Do with the basic action theory of Figure 10.1 is the following:
With the Do formula defined as above, the execution of programs in general can now be
defined. Given a basic action theory Σ and a Golog program δ, the task of offline execution
is to find a sequence of action terms a1 , a2 , . . . , an such that the following holds:
Σ |= Do (δ, S0 , do ( an , . . . , do ( a2 , do ( a1 , S0 )) . . .)).
In other words, offline execution consists in finding a sequence of actions that constitute a
legal execution of the program starting in S0 according to the Do formula. For the program
above, the offline execution is solved by the sequence hcrossWith(hen), crossAlonei since
the sentence ¬OppSide( f armer, hen, S0 ) is entailed, as are the two required Poss formulas.
As in Section 4.4, planning is once again seen as a special case of offline execution in that
any offline execution of the program [(πa. a)∗ ; G?] is a plan that will achieve the goal G.
150
10.7 From Golog to ERGO and back
In this section, the relationship between Golog (as presented above) and ERGO (as seen
in the rest of the book) is considered. Roughly speaking, the idea is to come up with a
translation between ERGO and Golog such that
(ergo-do δ) returns ~a iff Σ |= Do (δ, S0 , do (~a, S0 )).
As will become clear, however, the match between the two formalisms is not quite exact.
10.7.1 Programs
Perhaps the easiest correspondence between ERGO and Golog concerns the programs
themselves. It is clear that there is a direct correspondence between the programming
constructs seen in this chapter and those in Chapter 3. For example, sequence is handled
with ; in Golog and with :begin in ERGO, but the effect is the same. Similarly, the π
operator in Golog corresponds to the :for-some of ERGO, the main difference being that
ERGO requires a list of elements for the quantification, whereas Golog effectively iterates
over the entire domain. There are primitives for :if and :while in ERGO, whereas these
would need to be expressed in Golog as abbreviations using the given constructs:
.
if φ then δ1 else δ2 = [(φ?; δ1 ) | (¬φ?; δ2 )]
.
while φ do δ = [(φ?; δ)∗ ; ¬φ?]
It is also possible to provide a logical account of the concurrency and online aspects of
ERGO, but this would take us too far afield.
151
(The case where the fluent f has a Scheme function as its value would be handled just
like the initial state axioms, with universally quantified arguments in the axiom. The
case where an action Ai has arguments would be handled by existential quantifiers in the
successor state axiom.) Going from a situation calculus successor state axiom to ERGO is
possible only when the axiom can be massaged into a form similar to the one above.
(define-fluents f 1 e1 . . . f k ek )
f 1 ( S0 ) = e 1 ∧ · · · ∧ f k ( S0 ) = e k .
∀x1 · · · ∀xn . f ( x1 , . . . , xn , S0 ) = e.
When the values of fluents involve other datatypes of Scheme, such as numbers, lists,
tables, and so on, the initial state axioms would need to contain axioms defining the oper-
ations on objects of those types. (It is typical, however, to use a variant of first-order logic
where the properties of numbers are built in.)
Going in the other direction on the other hand, that is, from the situation calculus to
ERGO, is possible only when the initial state axioms uniquely determine the values of the
fluents. For example,
f ( S0 ) = 3
can be translated into ERGO in the obvious way, but the initial state axiom
f ( S0 ) = 3 ∨ f ( S0 ) = 4 ∨ f ( S0 ) = 5
cannot be translated into ERGO at all. This is due to the fact (stated at the start of Chapter 3)
that the BAT representation of ERGO assumes complete knowledge about the initial values
of the fluents. The case of an ERGO with incomplete knowledge is considered only in
Chapters 11 and 12. To handle initial state axioms more generally in ERGO, something like
the #:known keyword introduced in Section 11.1 would be needed.
152
10.8 Bibliographic notes
This chapter deals with the logical foundations of cognitive robotics. Logic itself has a
long history that in Western culture goes back to the ancient Greeks. The modern form of
symbolic logic as used in this chapter is due to Gottlob Frege, with a specific notation due
to Giuseppe Peano, in the early 1900s. Two excellent mathematical textbooks on modern
symbolic logic are [14] and [34].
The original motivation behind the development of symbolic logic was to put all of
mathematical reasoning on a sound footing. But it was John McCarthy, one of the founders
of the field of artificial intelligence (AI) [41], who first proposed using symbolic logic not for
mathematics, but to represent the commonsense knowledge of an agent [33]. A textbook
on this topic is [6], and a more advanced handbook is [20].
McCarthy’s original paper introduced the situation calculus and the idea of planning
as a form of logical reasoning. His formulation suffered from what was called the frame
problem [42]: to work properly, a logical axiomatization had to specify not just the effects
of actions, but all their myriad non-effects as well. It was Ray Reiter who, building on
some earlier work, proposed a solution to this frame problem in [39] in terms of what he
called basic action theories, as presented in this chapter. Reiter went on to write a much
more comprehensive study of the situation calculus in [40]. See [32] for a survey of this
and subsequent work on the situation calculus.
Reiter’s formulation of the situation calculus was very influential and became the main
foundation of what was to become cognitive robotics. (Reiter coined the term in 1993.)
Among many other things, Reiter and colleagues developed the Golog language discussed
in this chapter [31]. Subsequent work incorporated concurrency [12] and made the distinc-
tion between offline and online execution [13], as seen throughout this book. In a very real
sense, this book is an attempt to extract the programming ideas from the on-going research
in cognitive robotics that began with Ray Reiter. For a survey of cognitive robotics from
this knowledge representation point of view, see [30].
153
∗ Chapter 11
Numerical Uncertainty
In previous chapters, a cognitive robot was assumed to have complete knowledge of the
initial state of the world. A robot might not know how the world was changing beyond
its own actions, and we considered how active sensing using exogenous actions could deal
with this in Chapters 5 and 6. But in all cases, we assumed that the initial values of the
fluents were known, so there was no need for sensing there. In more realistic settings,
however, even the initial knowledge of a cognitive robot can be incomplete. Unlike in
Chapter 3, the robot may not know which of two rooms contains a certain box, for example,
and it may need to go to those rooms to find out. Similarly, the robot may only have a
rough idea of how far it is from a nearby wall and need to use its onboard sensors to get a
more accurate picture. In this chapter and the next, we consider how a cognitive robot can
decide what to do in the presence of knowledge that is incomplete in this way.
The easiest way to think about the sort of incomplete knowledge we have in mind is
in terms of the representation of knowledge it uses. In previous chapters, what the robot
knew about the world could be represented by what we called a state, that is, a mapping
from fluents to their values. In this chapter and the next, a robot will not be able to use
a state as its representation, since it may know certain things about some fluents without
knowing their values. Instead, the robot will have to make do with a set of possible states,
any one of which might be the correct representation of the world. For example, if all the
robot knows about a fluent f is that it must have value x or y, then there will be some
possible states where the value of f is x and others where its value is y. We say that x and
y are possible values for the fluent f . The only conditions the robot knows for sure are those
conditions that come out true in all these possible states.
In practice, there might be very little that a cognitive robot actually knows for sure. But
as we will see in this chapter, it may be able to use some numerical information it has to
make an informed guess about the most likely cases. (In Chapter 12, we will see how a
robot might still be able to plan courses of action without this numerical information.)
154
Figure 11.1: A world with three stacks of objects
"
"
"
A
6
" "
B
" "
"
" 5 "" "
"
" 4" " ""
" " "
3
C"
" " "
" " "
" " ""
" "
" " " "
" " " "
"
1 " " "
"
" " "
2 "
e e
"
"
"
C. At any point, the robot may be holding one of these objects. See Figure 11.1. Let us
suppose that each object in a stack can be coloured red or blue. We could represent this
information using a hash-table that maps objects to their colour. However, if colour is the
only property of an object that the robot really cares about, we can simply list the objects
in a stack according to their colour.
A basic action theory for this world is shown in Figure 11.2. There are no objects
initially in stacks B and C, and stack A contains four objects identified only by their colours:
red, red, blue, and red. The goal of the robot (the goal? function) is to make two towers of
uniform colour, that is, to get the all red blocks onto stack B and all the blue objects onto
stack C. A plan for this goal can be found using the basic planner from Chapter 3:
> (ergo-simplan goal? (append (map pick! ’(A B C)) (map put! ’(A B C))))
’((pick! A) (put! B) (pick! A) (put! B)
(pick! A) (put! C) (pick! A) (put! B))
The generated plan obviously depends on the initial list of blocks in stack A. In fact, for
any list of blocks in stack A, the planner can find a plan that works for that initial state.
155
Figure 11.2: A basic action theory for the three stacks world
This function is very much like define-fluents except that it constructs a list of states
rather than a single one. The general form is this:
where the f i are fluents and the ei are their values, as before with define-fluents. The
difference here is v j and the d j . The v j are variables that may be used in any of the ei and
the d j are expressions that evaluate to lists. The idea is that we will get initial states for
all possible values of the variables v j taken from the lists d j . (A d j fexpr can also evaluate
to a positive integer m in which case the v j has a value taken from 0, 1, . . . , m − 1.) So, for
example, if we had k = 3 where the list d1 had three elements, d2 had two elements, and
d3 had four elements, define-states would produce a list of 3 × 2 × 4 = 24 initial states.
In the example above, a list with two initial states is defined. The two states differ only
in the colour x of the second block in stack A. There are two possible values for this block,
which can be confirmed with the ERGO function possible-values:
156
> (possible-values (car (stack ’A)))
’(red)
As mentioned above, an fexpr is considered known to be true by the robot if it comes out
true in all of these states. We can use the ERGO function known? as follows:
> (known? (eq? (caddr (stack ’A)) ’blue)) ; the third block is blue
#t
The value is true since the eq? expression is true in both initial states. However, we have
> (known? (eq? (cadr (stack ’A)) ’blue)) ; the second block is blue
#f
> (known? (eq? (cadr (stack ’A)) ’red)) ; the second block is red
#f
since in each case there is a state in the list where the expression comes out false. More
generally, we say that an fexpr is known to have value x if it has the same value x in all the
states. (In other words, x is its only possible value.) Note that something can be known
about an fexpr even if its value is not known:
> (known? (memq (cadr (stack ’A)) ’(red blue))) ; the second block is red
#t ; or blue
Although the colour of the second object is not known, it is known to be red or blue.
With this view of incomplete knowledge, the larger the list of states, the less complete
is the knowledge of the robot. For example, if the robot only knows the colour of the
top two blocks on stack A, we would use a list with four states to capture the range of
possibilities for the third and fourth blocks:
> (known? (not (eq? (car (stack ’A)) (cadr (stack ’A)))))
#t
In other words, the first object in the stack is known to be different in colour from the
second. However, we have the following:
In other words, the first object in the stack is not known to have the same colour or to have
a different colour from the third object. Observe that the value of
157
(length (stack ’A))
would be known here, since this fexpr has the same value in all the states. Nothing stops
us, however, from considering initial states where even that value is unknown:
(define-states ((u ’(() (red) (blue) (red blue) (red red blue))))
hand ’empty
stacks (hasheq ’A u ’B ’() ’C ’()))
In this case, the number of objects in stack A is unknown, although we do have this:
(With any finite list of states, there will always be a known upper bound on the length of
any stack. There is no way in this representation to represent knowing absolutely nothing
about the number of objects in stack A.)
As a final convenience in specifying the initial state of knowledge, the define-states
function can take an optional keyword #:known followed by a Boolean fexpr e0 so that
behaves just like before except that only states where e0 is true are considered. An example
is the following:
In this case, instead of getting 2 × 2 × 2 × 2 = 16 initial states, we get only 12 of them, just
those where either p or q comes out true. In general, if we want to represent a state of
knowledge where some Boolean formula φ is what is known initially (as seen in the initial
state axioms of Section 10.7.3), we can let the fluents range over all their possible values,
and then constrain the set of initial states using #:known with φ as the e0 .
158
infinite number of possible values to contend with, but some of these values may be much
more likely than others.
For some applications, of course, an assumption of completely accurate sensors and
effectors is quite justified. We might have a primitive action of moving ahead one metre,
for instance, and leave it to the robot manager to perform that action as best as it can with
its onboard sensors and effectors. This is how we think of an elevator moving to a given
floor in a building, for example. But in other cases, small errors can accumulate. We might
want a robot to use what it knows about the world and the goals it is working on to decide
how worthwhile it might be to obtain a more accurate assessment of what it is doing. That
is to say, we might want to consider ERGO programming that deals explicitly with this
uncertainty.
we have four initial states, where the possible values for stack A are: follows:
159
To make this idea precise, we define the notion of the degree of belief (sometimes called
the subjective probability) of a Boolean fexpr e as a real number from 0 to 1 defined as the
sum of the weights of the states where e is true divided by the sum of the weights of all
the states. In mathematical notation, it’s this:
n wi if e is true
∑
i =1 0 otherwise
n .
∑ wi
i =1
So, for example, for the four initial states above, we have the following degrees of belief:
Note that these degrees of belief would have been exactly the same if all the weights had
been multiplied by the same factor, like w1 = 12, w2 = 9, w3 = 6, w4 = 3. This is because we
end up “normalizing” the weights, that is, dividing each weight by the sum of all of them.
Note also that the degrees of belief would have been identical had there been additional
states with a weight of 0.0. Finally, observe that in this scheme, formulas that are known to
be true will get a degree of belief of 1.0, and formulas known to be false will get a degree of
belief of 0.0. (If there are no states with a weight of 0.0, the converse also holds: formulas
with a degree of belief of 1.0 are known to be true, and formulas with a degree of belief of
0.0 are known to be false.)
All degrees of belief other than 1.0 and 0.0 are for formulas that are not known to be
true or to be false, where the degree measures how far we are from those two possibilities.
So, for example, if we had w1 = 44 instead of w1 = 4 (and the other three weights the
same), the degree of belief that there are more blue blocks than red ones would have fallen
from 0.1 to 0.02. In that case, the formula is not quite known to be false, but very close. As
we will see, there will be times when a practical robot may need to treat formulas where
the degree of belief is this low as false.
It is also useful to talk about the conditional belief in e1 given e2 . This is defined as the
degree of belief in the conjunction of e1 and e2 divided by the degree of belief in e2 . (The
conditional belief is undefined when the degree of belief in e2 is 0.0.) For the block example
above, the degree of belief that block 3 is blue is .4, but the degree of belief that block 3 is
blue given that block 2 is blue is about .33. This is a way of saying that our confidence that
block 3 is blue is somewhat lower if we limit ourselves to states where block 2 is blue.
To test the degree of belief in a condition, we can use the ERGO function belief which
is like the known? function except that it returns a number between 0.0 and 1.0:
160
> (belief (eq? (cadr (stack ’A)) (caddr (stack ’A))))
0.5
> (define (count-colour c) (for/sum ((x (stack ’A))) (if (eq? x c) 1 0)))
> (belief (> (count-colour ’blue) (count-colour ’red)))
0.1
We can also provide a second argument to the belief function for conditional belief:
> (belief (eq? ’blue (caddr (stack ’A))) (eq? ’blue (cadr (stack ’A))))
0.33333
So, for example, suppose we have the four states with the four weights above and the value
of some fexpr f in each state is as follows: f 1 = 5, f 2 = 3, f 3 = 5, f 4 = 6. Then the mean
value of f for these four states would be
(5 · 4 + 3 · 3 + 5 · 2 + 6 · 1)
µ= = 4.5
10
When all states have equal weight, the mean is the same as the ordinary average value.
Note that there is a close connection between means and degrees of belief: the degree
of belief in e can be defined as the mean value of the expression (if e then 1 else 0).
Faced with inaccuracies and uncertainty, it will sometimes be necessary to use the mean
as a guess for an unknown numeric value. The ERGO function (sample-mean e) can be
used for this purpose. There are other possible guesses we could consider, but the mean
has the advantage of being neither too low nor too high, in the sense that the total error
we might encounter from being too low is balanced by the total error from being too high.
For example, we might have considered using 5 as a guess for the value of f (which is
the value f takes in two of the four states, after all), but there’s a chance this could be as
much as 2 units too high (in case state s2 turns out to be the correct value). We might also
consider using the unweighted average of 4.75 as the guess, but this seems to give undue
prominence to the possible value of 6 (for the f in s4 ), which has a low weight relative to
the others. The mean value of 4.5 strikes the right balance.
One thing that might be done before using the mean as a guess is to look at the amount
of uncertainty we have about the value. Suppose we have a numeric expression e whose
161
Figure 11.3: A one-dimensional robot world
hhh
h
hhh
sonar
((
((((
-
g g
5 10
mean value is µ. The variance of e is defined as the mean value of (e − µ)2 . For the four
states above, the variance of f would be
162
clearer picture of where it is located, and then at some threshold, stop and make a move.
But finally let us suppose that the motors themselves are also inaccurate. The robot might
request a move of 2.2 metres but end up moving 2.217 metres. The best the robot could do
at that stage would be to use the sonar again to try to find out how close it is to the desired
5 metres, and either settle for that, or consider additional smaller moves. As long as the
motors are not so inaccurate as to move the robot further and further away from the goal
of 5 metres, eventually the robot will be able to get close enough.
What we have sketched in the previous paragraph is in effect a program that uses
inaccurate sensors and effectors to achieve a goal to within a certain tolerance. In this
example, we have a single fluent representing the current distance to the wall, and two
somewhat inaccurate operations, one that moves the robot towards the wall, and one that
senses the distance to the wall. We now develop an ERGO program along these lines,
starting with the BAT.
We are using one million initial states, not because there are exactly that many possible
values for h, but to have a large number of randomly-selected sample values that have an
appropriate mean and variance.
The ERGO functions UNIFORM-GEN, GAUSSIAN-GEN, DISCRETE-GEN, and BINARY-GEN can
be used to generate random numbers. The expression (UNIFORM-GEN x y) has as its value
a floating point number selected at random according to a uniform distribution between
x and y. Similarly, (GAUSSIAN-GEN x y) has as its value a floating point number selected
at random according to a Gaussian (or normal) distribution with mean x and standard
deviation y. In other words, it generates random numbers that cluster around a mean
value of x with a variance of y2 . (In practice, about 68% of the generated numbers will
lie between ( x − y) and ( x + y) and about 95% of them will lie between ( x − 2y) and
( x + 2y).) To generate random values chosen from a fixed list of choices, the expression
(DISCRETE-GEN v1 p1 . . . vn pn ) where the pi add up to 1, generates values from the vi
according to the probability pi . Finally, the expression (BINARY-GEN p) is an abbreviation
for (DISCRETE-GEN #t p #f (- 1.0 p)).
(A side note on random numbers: The random numbers produced by ERGO are of
course only pseudo random numbers generated by the underlying Scheme implementa-
tion. Scheme provides a way of “seeding” its pseudo-random-number generator, that is,
initializing it so that the same sequence of numbers will be produced in distinct runs of
the program. This can be useful for debugging, but is beyond the scope of this book.)
To return to the BAT, what we are doing with the define-states above is generating
a list of one million initial states where the h fluent takes values from 2.0 to 12.0 in a
163
uniformly random way. So for example, the degree of belief that h is between 3 and 4
(say), will be the same as the degree of belief that it is between 6.5 and 7.5. (This is
how a uniform distribution is supposed to work in theory. By using sampling, we forgo
these precise theoretical results and make do with approximations. How close we get to
the theoretical values depends on how many sample states we generate.) If we wanted
instead to say that the unknown value of h was somewhere around 7 metres, with the
range between 3 and 4 much less likely than the range between 6.5 and 7.5, we could have
used something like this:
Among the million samples generated, the range of h values will be centered around 7.0,
with about 95% of the values lying between 4.0 and 10.0.
Another option in the sampling is to choose values along exact intervals between 2.0
and 12.0 and to weigh those samples explicitly:
The ERGO functions UNIFORM, GAUSSIAN, DISCRETE, and BINARY are random number testing
functions analogous to the random number generating functions described above. In all
cases, the functions are given an additional first argument z and return a number between
0 and 1 measuring the relative likelihood of the z value. So the (GAUSSIAN h 7.0 1.5) in
the definition here says the weight of a sample should be the likelihood of getting its value
of h, assuming that those values were distributed according to a Gaussian with mean 7.0
and standard deviation 1.5. (This method of sampling has the advantage of avoiding any
randomness in the sampling process, but has the disadvantage of having many samples
with weights very close to 0. The other method is usually more accurate.)
Note that we can have fluents with uncertain values mixed with fluents with certain
values by using something like this:
In this case, we would get the following. The h fluent will be as above. The temp fluent
will get the value hot in about .2 of the states, cold in about .6 of them, and tepid in the
rest. The fluent q will get the value 7 in all states. In other words, the mean value of q will
be 7.0 and its variance will be 0.0. Since the temp value is not numeric, there is no mean or
variance, but the degree of belief that the value of temp is either cold or tepid should be
about .8. (See Section 11.4 for dealing with fluents whose values are not independent.)
164
(define-action (move! x) ; advance x units
h (max 0 (- h (* x (GAUSSIAN-GEN 1.0 .2)))))
Because of inaccuracies in the motor or for other reasons, the value of h after the action
(move! 1) will not be exactly (- h 1), but will deviate from this value like a random
number chosen from a Gaussian distribution with standard deviation .2. Even if we knew
the value of h precisely before the move, after the move, we would no longer know its
value, although we expect it to be close to (- h 1). How close? This depends on the
properties of the effector. For a very accurate motor, the standard deviation should be
small; for an inaccurate motor, it might be large. In other examples, the mean might not
even be (- h 1); there may be a bias in the motor that makes it more likely to overshoot
than to undershoot. Similarly, we can use an expression like
to say that the variance in the moving can depend on other considerations, here whether
or not the slippery fluent is true at the time of the move.
As in Section 6.3, the sensing involves a request for sensing information followed by an
exogenous report. The first action, sonar-request!, will be used as an ordinary action (of
no arguments) in the ERGO program. The second, sonar-response!, will be used as an
exogenous action (of one argument) that happens implicitly. When (sonar-response! z)
happens for some value z, the z is then understood to be the value returned by the sonar
sensor asked to measure the current distance to the wall.
How the z returned by sonar-response is then used is quite different from what we
did before. We do not want to try to update the h fluent according to this value as we did
in Chapter 6. It’s not as if we learned that our previous value of h is out of date and that
the distance to the wall has changed to z. Rather, what we want to do is to update our
beliefs about the distance to the wall in a way that accommodates both our previous beliefs
and this new bit of sensing information.
The way we do this is to change the weight of each sample according to the value z
returned. Typically, we want the new weight to be the product of two numbers: the pre-
vious weight and the number between 0 and 1 returned by the GAUSSIAN testing function.
The effect of the first number is to give higher weights to those sample states that were
considered more likely before; the effect of the second number is to give higher weights to
those sample states whose h values are closer to the z value returned by the sonar.
In other words, although we do not want to assume that the sonar gives us the true
value of h, we do want to assume that it will give a value z that is close. Hence, we want
to rate more highly those states whose h values are close to this z. How seriously should
165
we take this z value? This depends on the sensor. So the expression (GAUSSIAN z h .4) in
the definition in effect says that the sensing results z from the sonar will deviate from the
true value of h like a random number chosen from a Gaussian distribution with standard
deviation .4. Other sensors might be more or less accurate and so have smaller or larger
standard deviations. As with ordinary actions, we can also use expressions like
(define-action (sonar-response! z)
weight (* weight (GAUSSIAN z h (if raining .9 .4))))
to say that inaccuracies in the sensing may depend on other considerations, in this case,
whether or not the raining fluent is true at the time the sonar is used.
166
Figure 11.4: Program file Examples/sonar-robot.scm
;;; This is program for a robot with incomplete knowledge that starts out
;;; somewhere between 2 and 12 units away from the wall, and uses a noisy
;;; sensor and noise effector to move to close to 5 units away from the wall
(define-states ((i 1000000))
h (UNIFORM-GEN 2.0 12.0)) ; distance to wall
(define-action (move! x) ; advance x units
h (max 0 (- h (* x (GAUSSIAN-GEN 1.0 .2)))))
(define-action sonar-request!) ; sonar endogenous
(define-action (sonar-response! z) ; sonar exogenous
weight (* weight (GAUSSIAN z h .4)))
(define (get-close x)
(:let loop ()
(:while (> (sample-variance h) .02) ; too much uncertainty?
(:begin (:act sonar-request!) (:wait))) ; get sonar data
(:let ((d (- (sample-mean h) x))) ; get distance to x
(:when (> (abs d) .05) ; |d| is still large?
(:act (move! d)) ; move the robot
(loop))))) ; repeat
;; program interacts with an external robot manager on port 8123
(define (main)
(let ((ports (open-tcp-client 8123)))
(define-interface ’in (lambda () (read (car ports))))
(define-interface ’out (lambda (act) (displayln act (cadr ports)))))
(ergo-do #:mode ’online (get-close 5)))
A procedure like this would normally be run online (as discussed in Chapter 6) in-
teracting with the manager of a physical robot. So ERGO will need an external interface
that can receive endogenous actions (move! and sonar-request!) and return exogenous
actions (sonar-response!). The program here assumes that there is a robot manager of
some sort already running and listening on port 8123. (A simulation of such a manager
appears in the file Servers/sonar-robot-server.scm.)
where the ei are fexprs which may or may not involve random elements. In many cases,
these fexprs can be evaluated in any order because they are independent of each other. For
example, in the example used above
167
Figure 11.5: A Bayesian network example
?
hear bark
the value of the temp fluent was independent of the value of the h fluent. What this means
more precisely is that if we generate enough sample states, we expect half of them to have
(< h 7) true, say, and among the states where (eq? temp ’cold) is true, we expect half
of those to have (< h 7) true as well. This is what we mean when we say the two fluent
values are independent.
But in some cases, there will be dependencies among the fluents, where the proportion
of states where one fluent has a value is not preserved when we consider just those states
where other fluents have certain values. For this reason, when we have a define-states
expression, ERGO evaluates the ei in sequential order so that the value given to f i can de-
pend on the values given to earlier f j . (We already used this property in the define-states
expression of Section 11.3.1 that had the explicit weight value.) These dependencies define
what is called a Bayesian network (or a belief network) of fluents: the nodes are the fluents,
and there is an edge from fluent f j to f i if f j appears in ei , in other words, if the value of f i
depends on the value of f j .
Figure 11.5 shows an example (from the literature) of a Bayesian network with depen-
dent and independent fluents (all of which happen to have binary values): a family may or
may not be in the house, the family dog may or may not have bowel problems, the external
light on the house may or may not be on, the dog may or may not be placed outdoors,
and we may or may not hear barking. So, for example, the proportion of states where the
dog is out depends on whether or not the family is out, but the proportion of states where
the dog has a bowel problem does not depend on whether the family is out. Similarly, the
proportion of states where barking can be heard depend on whether or not the dog is out,
and so it depends indirectly on whether or not the family is out.
Here are the numbers for this Bayesian network example, showing the dependencies:
168
(define-state ((i 1000000))
family-out (BINARY-GEN .15)
bowel-problem (BINARY-GEN .01)
light-on (if family-out (BINARY-GEN .6) (BINARY-GEN .05))
dog-out (if family-out
(if bowel-problem (BINARY-GEN .99) (BINARY-GEN .9))
(if bowel-problem (BINARY-GEN .97) (BINARY-GEN .3)))
hear-bark (if dog-out (BINARY-GEN .7) (BINARY-GEN .01)))
In about 15% of the states, the fluent family-out will be true, and in about 1% of the
states, the fluent bowel-problem will be true. These two fluents are independent. But for
light-on, the fluent will be true in about 60% of those states where family-out is true,
but only in about 5% of those states where family-out is false. So light-on depends on
family-out. Similarly dog-out depends on both family-out and on bowel-problem, and
hear-bark depends on dog-out.
The ERGO belief function can be used to estimate the degree of belief of any formula
involving the fluents of a Bayesian network. (The accuracy of the estimate depends on the
number of samples used.) For the above network, the degree of belief that the family is
out given that the light is on but we don’t hear barking can be estimated as follows:
> (belief family-out (and light-on (not hear-bark)))
0.5011908149967795
(The theoretically correct value here is .5).
In more general terms, it is worth noting that when we have a collection of fluents f i
that are independent, the degree of belief that f 1 = x1 and f 2 = x2 and . . . and f n = xn will
always be the product of the degrees of belief that f i = xi . So, for example:
> (belief (and (not family-out) bowel-problem))
.008394
(This is about .85 × .01.) But with dependencies, such as in a Bayesian network, this is no
longer true. The degree of belief in the conjunction will end up being the product of the
degrees of conditional belief that f i = xi given that its dependent fluents f j have their values
x j . This can be seen in this example:
> (belief light-on (not (family-out)))
0.050516094862590476
> (belief light-on)
0.133473
> (belief (and (not family-out) light-on))
.04291
(This is about .85 × .05 and not .85 × .13.) In this case, the degree of belief in the conjunction
is not simply the product of the degrees of belief in the two conjuncts.
169
adjustment can depend on the values of other fluents. For example, as noted above, a
sonar value might depend on the value of the raining fluent. In this case, once the weights
are adjusted, it need not be the case that two independent fluents remain independent.
(Roughly, the mean and variance of h across all states may no longer agree with the mean
and variance of h across those states where raining is true.)
With ordinary actions, the situation is more complex. We may have had a dependency
between light-on and family-out precisely because we thought there was some sort of
causal relation between the two fluents: actions that causes the family-out fluent to change
also cause the light-on fluent to change as well. For actions that are completely accurate,
we can model this causal connection by arranging that the actions that change one fluent
also change the other. But for actions that (for one reason or another) may or may not
change the first fluent, we cannot use define-action as before.
For example, imagine an action maybe-go-out! that changes the fluent family-out to
true, but only 80% of the time (for one reason or another). We might want to say that if
the value is changed, then there is a 60% chance that the light will be turned on after the
action. If we were to use
(define-action maybe-go-out!
family-out (BINARY .8)
light-on (if family-out (BINARY-GEN .6) light-on))
then we would be saying that the new value of light-on depends on the previous value
of family-out (which is the way define-action has been used until now). To say that
the value of light-on depends on the new value of family-out, ERGO allows an optional
#:sequential? keyword argument in a define-action expression:
This ensures that the ei are evaluated in sequential order, so that the new value given to an
f i can depend on the new values given to earlier f j , preserving the dependence between
the two fluents.
170
might have a different weight. (In the case of real-valued fluents, the situation is much
worse: there would be uncountably many possible values for the fluents, so the summation
of weights must be reformulated as the integration of densities.) Judea Pearl was perhaps
the first to tackle this issue in [35]. He is credited with the proposal for Bayesian networks
[10, 26], which is one important way of making the specification of all the possible states
and their weights more manageable. However, reasoning with a Bayesian network directly
appears to be difficult [9], which is why an approximate method based on sampling (what
Pearl called “stochastic simulation”) is often used, as it was here. The example Bayesian
network used in this chapter is taken from [7].
The application of probability to various facets of (non-cognitive) robotics has a long
history, especially for sensing, localization, path planning, and learning. An excellent
textbook on the subject is [45]. The application of probability to the programming of
cognitive robots, where an agent would still need to reason about the prequisites and
effects of actions, for instance, is somewhat newer. One proposal based on the Golog
language (discussed in Chapter 10) is [5], but other variants exist. The probabilistic version
of ERGO presented in this chapter is indeed one such variant, and draws heavily from [1]
and [3], with the sonar example used in this chapter taken from [3]. For other ways of
looking at probabilistic programming in general, see [18] and [38].
171
∗ Chapter 12
Generalized Planning
In Chapter 11, we explored the idea of a cognitive robot with incomplete knowledge, with
an emphasis on real-valued numerical fluents and on the uncertainty that arises due to
noise and inaccuracies in a robotic system. In this chapter, we continue the exploration of
incomplete knowledge but without this focus on numbers.
The starting point of this chapter will be the representation of incomplete knowledge
presented in Section 11.1. To recap very briefly, instead of using define-fluents to create
a single state mapping fluents to their known values, the function define-states is used
to create a list of possible states, each of which maps fluents to values, and each of which
might be a correct representation of the world.
In this chapter, we will deal with fluents that only have a small number of possible
values (so sampling will not be necessary) and where numerical inaccuracies in the sensors
and effectors are not the issue. In contrast to the online programming of Chapter 11, we
will consider how a cognitive robot can use offline planning in such a context.
(So there are a total of sixteen states in the list of initial states.) Consider the goal defined
by the following function:
172
(define (all-on-C?)
(and (eq? hand ’empty) (null? (stack ’A)) (null? (stack ’B))))
In each of the sixteen states, the actions in the sequence can be legally performed and in
each case, the goal of having all the objects in stack C will be satisfied in the final state.
To generate a conformant plan like this, the ergo-simplan function can be used exactly
as before. What the implementation does is to search for a plan that works for one of the
initial states, but before accepting it, the procedure confirms that the plan also works for
the remaining states. (Other implementation strategies are certainly possible.)
In a plan like this, we are obviously going beyond a mere sequence of actions. (There
is a form of branching here, and later we will see that there can be loops.) Nonetheless,
this non-sequential plan has two desirable properties: first, it does the job properly in all
sixteen initial states, that is, in each initial state, the actions recommended by the plan
can be legally executed and will result in final states where the goal condition is true;
second, like a simple sequence of actions, the plan can be executed by a robot manager
independently of ERGO, that is, without any information about the current BAT, the state
of the world, or the goals under consideration.
This idea will be made more precise below, and we will show how such plans can be
generated automatically in ERGO. But first, we must examine in more detail the sort of
sensing being considered to support this form of planning.
173
12.3 Using sensing information offline
In earlier chapters, sensing information was used to update the values of certain changing
fluents (like the temperature in the elevator, in Chapter 6). In the case of incomplete
knowledge, however, sensing information is not used to track a changing world, but to
revise the state of knowledge of the robot. (In Chapter 11, it was the state of belief that
was revised.) The robot begins by not knowing the colour of the first block in stack A, but
then picks it up, does the sensing, and then comes to know what that colour is. We can
use this idea in an offline setting: we can consider the range of sensing results we might
obtain, calculate how each of them would change the state of knowledge, and then plan
appropriately for each contingency.
(define-action action-name
#:prereq fexpr
#:sensing fexpr
fluent1 fexpr1
...
fluentn fexprn )
The idea is that whenever this action is performed, the available sensors will eventually
report a value for the sensing result. The range of results to be expected are the possible
values of the given #:sensing fexpr. For example, a sensing action sense-colour! that
tells the robot the colour of the object in its hand can be defined as follows:
(define-action sense-colour!
#:prereq (not (eq? hand ’empty))
#:sensing hand)
After performing this sense-colour! action, no fluents change, but the sensors will even-
tually tell the robot some value (red or blue), which is to be understood as the value of the
hand fluent, which can then be used to update the state of knowledge.
A sensing action is not limited to making known the value of a given fluent only. For
example, imagine a weaker colour sensor that can only detect if what is being held is red
or not. This can be formalized using an action like sense-red! defined as follows:
(define-action sense-red!
#:prereq (not (eq? hand ’empty))
#:sensing (eq? hand ’red) )
In this case, the sensor is assumed to provide a Boolean value, #t or #f. The robot can then
eliminate those states for which the fexpr (eq? hand ’red) has a different value, and thus
come to know whether or not the object it is holding is red. If there are only red and blue
objects in the world, this is the same as sense-colour!. But in a world of red, blue, and
174
green objects, the robot would not be able to tell the difference between a green and a blue
object in its hand since the sensing result would be #f in both cases.
A slightly different way of handling the colour information in this world is to imagine
that the sensing information is relayed any time an object is picked up:
In this case, the pick! action is assumed to change the world as before (moving an object
from a stack to the robot’s hand), and simultaneously, to provide red or blue as the value
for the sensing result. It is important to note that after doing this pick! action, the robot
comes to know what the first element of the stack was (just prior to the pick! action) and
not what the first element of the stack will be (after the pick! action). The elimination of
states (or acquisition of knowledge) is the same as for the sense-colour! action.
So far, this colour sensing information will not allow a robot to deal with the different
numbers of objects it may need to deal with on the stacks. For this, one might suppose
that the robot has a second sensor that will tell it if there are any objects left on a stack:
This sensing action is assumed to tell the robot empty or non-empty according to whether
or not there are objects on the given stack, which would again allow it to eliminate world
states from its list of possible states appropriately.
175
Figure 12.1: Program file Examples/PlanningExamples/red-blue-bat.scm
176
Figure 12.2: A plan to make two coloured towers
(pick! A)
2. If the current state is a final state of the plan, then terminate the execution.
3. Otherwise get the robot to perform the action associated with the current state, and
obtain any sensing report from the sensors.
4. Make a transition from the current state to the next state of the plan according to the
sensing report obtained at step 3, and then go to step 2.
This is a completely offline use of ERGO. That is, the robot manager is not being asked
to send the sensing results of the actions to ERGO for consideration. Instead the manager
uses those sensing results itself to traverse the plan it was given.
Note that any fixed sequence of actions (as produced by ergo-simplan, and the offline
mode of ergo-do) can be trivially encoded as an FSA plan: put an edge to the first action
in the sequence, put unlabelled edges from each action in the sequence to the next one,
and finally, put an unlabelled edge from the last action in the sequence to a final stop state.
A trace is considered undefined when there is an action that cannot be performed legally
or when there is a sensing result for which there is no edge in the plan. An FSA plan is
then considered to solve a planning problem if for each initial state of the world given by
the BAT, the trace of the plan solves the planning problem in that state.
177
This definition of plan correctness reduces it to the correctness of a sequence of actions
in a single initial state (as in Chapter 3). But it is sometimes also useful to take a different
perspective and consider planning not in terms of individual world states, but in terms
of the states of knowledge. For an FSA plan to solve a planning problem, the following
conditions must hold:
• at the final state of the plan, the given goal condition must be known to be satisfied,
according to the current state of knowledge.
• at a non-final state, the prerequisite of the action in the plan must be known to be
satisfied, according to the current state of knowledge;
• at a non-final state, each possible sensing results of the action in the plan, according
to the current state of knowledge, must be represented by a transition in the plan;
Implicit in this new specification is the idea that the planner will be in various states of
knowledge at various states of the plan. For example, at the start, the state of knowledge is
as given by the BAT. (This implies that the prerequisite of the first action in the plan must
be known to be true initially.) Then, after performing an action and obtaining a sensing
result, the state of knowledge evolves: world states are eliminated if they conflict with the
sensing result, as described in Section 12.3.1. At the end, the final goal condition must be
known to be true.
178
Figure 12.3: Program file Examples/PlanningExamples/two-towers.scm
There are twelve gold bars that have the same weight, except for one of them
that is either heavier or lighter than the others. There is a balance scale that can
compare the weight of any two collections of bars: it will indicate whether the
collection on the left is heavier, or the one on the right is heavier, or whether
the two collections have the same weight. The goal is to determine which bar
is the odd one, whether it is heavier or lighter than the others, and to do so
using the balance scale only three times.
With twelve bars and three weighing actions, this is a very challenging problem! (What
makes the problem so hard is that there are an extremely large number of possible weigh-
ing actions to consider, and the smallest solution to the problem is a plan with thirty seven
states in it. We will see how to deal this in Section 12.5.) But a variant of the problem
where there are only three gold bars and where the balance scale can only be used twice is
much easier. A representation of this three-bar variant is shown in Figures 12.4 and 12.5.
In this version, the fluents odd-bar and odd-weight represent the (unknown) gold
bar and its (unknown) weight, heavy or light. (So for three bars, there will be six ini-
tial states.) The fluent tries represents how many weighing actions remain: it starts at
allowed-weighs, and is decremented each time a weigh! action is performed. The other
action in this BAT is the say! action, which simply announces the odd bar and its weight,
making the announced fluent true, which is the goal to be achieved.
The important part of the weigh! action is its sensing result: it returns left, right,
or even, according to whether the scale goes down on the left, down on the right, or not
all for the two lists of bars it is given as arguments. (For simplicity, the weigh! action is
restricted to lists of bars of the same length.) If the sensing result is left, for example, this
indicates that either the odd bar is heavy and appears in the first argument of the action,
or the odd bar is light and appears in the second argument.
179
Figure 12.4: Program file Examples/PlanningExamples/bars-bat.scm
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;; This is the odd bar problem, where the number of bars and the number
;;; of allowed weigh actions are defined elsewhere
;;; Fluents:
;;; odd-bar: the bar that is odd
;;; odd-weight: heavy or light for the odd bar
;;; announced?: true only after odd bar has been announced
;;; tries: numbers of weighing actions remaining
;;; Actions:
;;; (say! b w): announce that bar b is the odd one and of weight w
;;; (weigh! l1 l2): compare weight of the bars in list l1 vs list l2
(define weights ’(heavy light))
(define-states ((b bars) (w weights)) ; bars defined elsewhere
odd-bar b
odd-weight w
announced? #f
tries allowed-weighs) ; allowed-weighs defined elsewhere
(define-action (say! b w)
#:prereq (and (eq? b odd-bar) (eq? w odd-weight))
announced? #t)
(define-action (weigh! l1 l2)
#:prereq (and (> tries 0) (= (length l1) (length l2)))
#:sensing (cond
((memq odd-bar l1) (if (eq? odd-weight ’heavy) ’left ’right))
((memq odd-bar l2) (if (eq? odd-weight ’heavy) ’right ’left))
(else ’even))
tries (- tries 1))
(define all-say-acts
(for/append ((b bars)) (for/list ((w weights)) (say! b w))))
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;; This is the odd bar problem, with 3 bars and 2 weighings allowed.
(define bars 3)
(define allowed-weighs 2)
(include "bars-bat.scm")
(define (main)
(ergo-genplan (lambda () announced?)
(append all-say-acts
’((weigh! (0) (1)) (weigh! (0) (2)) (weigh! (1) (2))) )))
180
Figure 12.6: A plan for the 3bars problem
(say! 1 heavy) (say! 0 light) (say! 2 light) (say! 2 heavy) (say! 1 light) (say! 0 heavy)
STOP
Since the goal is to make the announced fluent true, and since this is what a say! action
does, why is it then not sufficient to have a plan consisting of a single say! action? The
answer is that this action can only be performed if its prerequisite is known to be true,
which only happens when the odd-bar and odd-weight fluents have known values. To get
to such a state, it will be necessary to first perform some weigh! actions.
The FSA plan that the program finds is shown in Figure 12.6. It begins by weighing
bars 0 and 1, an action whose prerequisite is indeed known to be true at the outset. There
are then three possible sensing outcomes. Consider the one labeled even, and how the
state of knowledge changes. Since an even outcome only happens in states where bar 2 is
the odd bar, any initial state where this is not the case is eliminated, leaving only two of the
original six states. Next, the plan is to weigh bars 0 and 2. In this case, the even outcome is
not possible according to what is known. (It would require bar 2 to have the same weight
as bar 0, which is now known to be false.) Consider the left outcome. Since bar 0 is not
heavy, this outcome only happens when bar 2 is light. At this point, the prerequisite of a
say! action is satisfied, and so the plan is to next announce that bar 2 is light, and then
stop. All the other paths through the plan are analogous. There are six traces through the
FSA plan, each involving two weigh! actions and the appropriate say! action.
181
Figure 12.7: A plan for the towers of Hanoi
(pick! B)
fail ok
ok
fail (pick! A)
ok ok
ok
fail (pick! C)
ok
(put! A)
this robotic version, the robot has no way of knowing initially how many disks there are,
or even if the number of disks is even or odd. Nonetheless, there is an FSA plan that works
for an initial state with any number of disks on stack A, shown in Figure 12.7.
Unlike the case with the red and blue blocks, the fact that this FSA plan does the job is
far from obvious. Because the number of disks is unknown at the outset, the goal is solved
in an unusual way. For instance, as can be seen from the figure, the first disk from stack A
is always placed on stack B. This is the right thing to do when there are an even number of
disks on stack A. But when there are an odd number of disks, what happens is that all the
disks end up being moved to stack B first, and then they are all moved to stack C. Another
unusual aspect of the problem is that the robot has no way of knowing which disk is the
largest (at the bottom of stack A). Consequently, it may end up picking up that largest disk
and putting it right back when there happen to be disks on the two other stacks.
A complete program for the towers of Hanoi problem appears in Figure 12.8. The
BAT is a variant of the one in Figure 12.1, but where both the pick! and put! actions
now have sensing results. Let us consider some of the additional optional arguments of
ergo-genplan. The procedure works by iterative deepening, looking for plans with ever
more plan states, up to some maximum specified by the optional #:states argument.
(If the #:deep? argument is set to false, only plans of the size given by #:states are
considered.) Because even a small plan can loop forever, a separate #:steps argument can
182
Figure 12.8: Program file Examples/PlanningExamples/hanoi.scm
;;; This is a program for the robotic version of the Towers of Hanoi.
(define-states ((ini ’((1) (1 2) (1 2 3) (1 2 3 4))))
hand ’empty ; start holding nothing
stacks (hasheq ’A ini ’B ’() ’C ’())) ; all stacks empty
;; abbreviation to get the content of a stack A, B, or C
(define (stack st) (hash-ref stacks st))
;; the contents of hand cannot be placed on stack st
(define (cannot-put? st)
(and (not (null? (stack st))) (> hand (car (stack st)))))
;; action to pick up an object and return either ’ok’ or ’fail’
(define-action (pick! st)
#:prereq (eq? hand ’empty)
#:sensing (if (null? (stack st)) ’fail ’ok)
hand (if (null? (stack st)) ’empty (car (stack st)))
stacks (if (null? (stack st)) stacks
(hash-set stacks st (cdr (stack st)))) )
;; action to put what is being held on a stack and return ’ok’ or ’fail’
(define-action (put! st)
#:prereq (not (eq? hand ’empty))
#:sensing (if (cannot-put? st) ’fail ’ok)
hand (if (cannot-put? st) hand ’empty)
stacks (if (cannot-put? st) stacks
(hash-set stacks st (cons hand (stack st))) ))
;; get all the disks onto stack C
(define (goal?) (and (eq? hand ’empty) (null? (stack ’A)) (null? (stack ’B))))
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;; The main program
(define (main)
(ergo-genplan goal? (append (map pick! ’(A B C)) (map put! ’(A B C)))
#:steps 60))
be set to an upper bound of how many actions should be allowed in attempting to reach a
goal state. (In the case of the towers of Hanoi, the default value of 30 steps is insufficient,
and a higher value of 60 is specified to deal with the initial state having four disks.) The
final optional argument to ergo-genplan, #:loose?, is considered in the next problem.
183
Figure 12.9: Program file Examples/PlanningExamples/striped.scm
;;; A robot world just like the Two Towers problem but with a harder goal.
;; all lists of length n made up of only red and blue elements
(define (all-blocks n)
(if (= n 0) ’(())
(let ((x (all-blocks (- n 1))))
(append (for/list ((e x)) (cons ’blue e))
(for/list ((e x)) (cons ’red e))))))
;; all lists of length n with an even number of blue and red elements
(define (even-blocks n)
(for/only ((x (all-blocks n)))
(= (for/sum ((b x)) (if (eq? b ’blue) 1 0)) (/ n 2))))
;; stack A: up to 6 blocks with equal number of red and blue ones
(define stackA-values (for/append ((n 4)) (even-blocks (* n 2))))
(include "red-blue-bat.scm") ; the BAT from the Two Towers problem
;; the goal to be achieved: a striped tower on stack C
(define (goal?)
(and (eq? hand ’empty) (null? (stack ’A)) (null? (stack ’B))
(striped? (stack ’C)) ))
;; the recursive definition of a striped tower with blue at the top
(define (striped? x)
(or (null? x)
(and (eq? (car x) ’blue) (eq? (cadr x) ’red) (striped? (cddr x)))))
;; find a plan that works for all states with up to 6 blocks
(define (main)
(ergo-genplan #:states 10 #:loop? #t #:loose? #f
goal? (append (map pick! ’(A B)) (map put! ’(B C)))))
so on to a red one at the bottom of the stack. Finally, it is only permissible to pick objects
up from stacks A and B, and to put objects down on stacks B and C.
What makes this problem challenging from a search point of view is that the smallest
FSA plan that achieves the goal (for the given world states) has ten states: see Figure 12.10.
The problem can be solved by ergo-genplan with iterative deepening, but even using the
#:loop? argument to avoid visiting the same state twice, it takes much too long.
One of the reasons ergo-genplan can take a long time is that it begins by looking at
the sensing outcome where the resulting knowledge is the least constrained. This is fine for
quickly eliminating actions that in some cases fail to provide enough information. But for
domains with many legal actions, it can be much better to look first at the sensing outcomes
where the resulting knowledge is the most constrained (to help build up a plan with the
specific knowledge). To do so, an optional #:loose? argument is used with ergo-genplan.
The default value here is #t, but if #f is given, as it is in Figure 12.9, the most constrained
sensing outcomes are considered first. With this, the program works much better, although
it will still take minutes (but not hours) to find the desired plan.
184
Figure 12.10: A plan for the striped tower problem
(pick! A)
(pick! B)
(put! C) (pick! A)
blue red
(put! C) (put! B)
(pick! B)
red
(put! C)
185
Figure 12.11: Program file Examples/PlanningExamples/prog-two-towers.scm
186
Figure 12.12: Program file Examples/PlanningExamples/prog-striped.scm
searching involved in the solution. The get-colour procedure says exactly what should
be done when trying to get the next block of a certain colour from stack A, including using
stack B to store blocks of the wrong colour until they are needed.
Now let us turn to the original odd bar problem, with twelve bars and three possible
weighings. As already noted, there were too many possible actions to consider at each step
of the plan to blindly search through. For example, consider a first weighing action with
three bars on each side of the scale. There will be (12 × 11 × 10)/6 choices for the bars on
the left, and for each of those, (9 × 8 × 7)/6 choices on the right. But considering all these
choices is a waste of time. If any of them work as the first action of a plan, then any of
the others would also work as the first action of a plan. This is because, initially, all the
bars are “equivalent”: nothing special is known about any one bar that is not known about
the others. For this reason, as our first weighing action, we only need to consider a single
weighing action of k bars on each side, for 1 ≤ k ≤ 6. Similarly, after doing a first weighing
action, say with bars 0,1,2 on the left and bars 3,4,5 on the right, then for each sensing
result, bars 0,1,2 will be equivalent, bars 3,4,5 will be equivalent, and the remaining bars
will be equivalent. In general, at any given stage, we can partition the bars into equivalence
classes and only consider weighing actions that use representatives from these classes.
A program that does just this is shown in Figure 12.13. All of the effort here is on
limiting the weighing actions to be considered. Looking at the last few lines of the pro-
gram, we can see that it says to do three actions taken from the value of (weigh-acts)
followed by a say! action as before. The function weigh-acts gets the possible values
for (list odd-bar odd-weight) according to what is currently known, partitions the bars
into four equivalence classes, then selects k bars from these classes for the left and right
sides, where 1 ≤ k ≤ 6. Overall, the program runs to completion in a second or so, and
produces an FSA plan with thirty seven states (and sixty transitions), too big to display
here. (Note that the program discovers that the first weighing action of the three must
187
Figure 12.13: Program file Examples/PlanningExamples/prog-12bars.scm
;;; This is the odd bar problem, with 12 bars and 3 weighings allowed.
(define bars 12)
(define allowed-weighs 3)
(include "bars-bat.scm")
;; partition the list of all bars into 4 groups: both, heavy, light, neither
(define (ppart odds)
(let loop ((b 0) (both ’()) (heavy ’()) (light ’()) (neither ’()))
(if (= b bars) (list both heavy light neither)
(if (member (list b ’heavy) odds)
(if (member (list b ’light) odds)
(loop (+ b 1) (cons b both) heavy light neither)
(loop (+ b 1) both (cons b heavy) light neither) )
(if (member (list b ’light) odds)
(loop (+ b 1) both heavy (cons b light) neither)
(loop (+ b 1) both heavy light (cons b neither)) )))))
;; choose k bars for left part of weighing action then call rbars for right
(define (lbars k part arg rem)
(if (null? (cdr part))
(if (> k (length (car part))) ’()
(let ((left (append (take (car part) k) arg)))
(rbars (length left) (cons (drop (car part) k) rem) ’() left)))
(for/append ((i (+ 1 (min k (length (car part))))))
(lbars (- k i) (cdr part) (append (take (car part) i) arg)
(cons (drop (car part) i) rem)))))
;; choose k bars for right part of weighing action then build the action
(define (rbars k part arg left)
(if (null? (cdr part))
(if (> k (length (car part))) ’()
(list (weigh! left (append (take (car part) k) arg))))
(for/append ((i (+ 1 (min k (length (car part))))))
(rbars (- k i) (cdr part) (append (take (car part) i) arg)
left))))
;; the weighing actions representatives, in terms of possible values
(define (weigh-acts)
(let ((pp (ppart (possible-values (list odd-bar odd-weight)))))
(for/append ((k (quotient bars 2))) (lbars (+ k 1) pp ’() ’()))))
(define (main)
(ergo-gendo
(:begin (:for-all i allowed-weighs (:for-some a (weigh-acts) (:act a)))
(:for-some a all-say-acts (:act a)))))
have exactly four bars on each side of the scale. Anything less or more will leave too many
possibilities to sort out in the remaining steps.)
As a final point regarding these knowledge-based programs, it is worth noting that
writing them is somewhat of a black art. This can be seen in the ERGO program that
188
Figure 12.14: Program file Examples/PlanningExamples/prog-hanoi.scm
;;; This is an offline program for the robotic version of the Towers of Hanoi.
(include "hanoi.scm")
;; pick from stack x and then put to stack y. Stop when Peg B is empty
(define (picker x y z)
(:begin (:act (pick! x))
(:if (known? (eq? hand ’empty)) ; the pick action failed
(:unless (eq? x ’B) (picker y z x))
(putter y z x))))
;; put to stack x and then pick from stack y
(define (putter x y z)
(:begin (:act (put! x))
(:if (known? (eq? hand ’empty)) ; the put action was successful
(picker y z x)
(putter y z x))))
(define (main) (ergo-gendo (picker ’A ’B ’C)))
solves the Towers of Hanoi problem, shown in Figure 12.14. It is far from obvious how to
write such a program. It does not look at all like the FSA plan that it generates, shown in
Figure 12.7. The program is recursive, for one thing, even though the FSA plan it produces
is clearly not. But the program is not recursive in the way that it would be in the usual
formulation of the problem, where the number of disks is known. Analyzing in what sense
this program is correct for any number of disks remains an open problem.
189
of generating plans with branches, but subsequent work like [4, 23, 44] focusses more on
plans with loops. The approach here, with FSA plans and explicit sensing actions, is taken
from [22] and [24], from which the examples in this chapter are drawn. The idea of solving
a planning problem for a given set of initial states, and then seperately confirming that it
also works for a larger perhaps infinite set of states was first suggested in [28]. The move
from generalized planning to offline knowledge-based programming in the final section
(like the move from planning to programming in Chapter 4) has not appeared anywhere.
190
End Matter
Final Thoughts
This was a book about programming cognitive robots, and explored a number of ideas on
the topic: declarative specifications, planning, high-level programming, search and back-
tracking, game playing, offline and online execution, reactivity, incomplete knowledge. But
all the examples considered in the book were small, no more than a hundred lines of code
each. From a programming point of view, a good question to ask is this: how will the
ideas presented in this book scale up as the cognitive robotic systems grow in size and
complexity?
Scaling up
I think there are two answers to this question. In one sense, the systems will scale well.
While the examples considered here involved tens of fluents, there should be no major
problems dealing with hundreds or even thousands of them. By using arrays and hash-
tables, it should be possible to deal with millions of individual changing values. Similarly,
the examples in this book used tens of primitive actions, but hundreds or thousands of
them should still work quite well. Since actions can have parameters, it should be possible
to deal with millions of individual action instances. Regarding the sizes of ERGO pro-
grams, it should be possible using libraries to build systems made of many thousands of
lines of code, although it will be necessary to manage the namespaces more carefully than
was done here. (Racket Scheme has a comprehensive module facility, outside the scope of
this book.) So from this point of view, the ERGO systems should scale quite comfortably.
But there is another sense in which ERGO systems will not scale well at all, and that is
with respect to combinatorics. To take an extreme example, consider the sequential planning
seen in Chapter 3. If we had a large BAT (as above) with a million primitive action instances
say, then searching for a plan consisting of just two actions might already be too much. We
cannot expect even the fastest of computers to be able to cycle comfortably through the
1012 possibilities. But this issue is not due to the size of the BAT. As we have seen, the
problem is already there in a BAT with just ten actions. In this case, searching blindly for
a plan with twelve steps would again mean looking at 1012 possibilities. In fact, even with
just two actions in a BAT, we already have the problem, in that we cannot expect to search
blindly for a plan of forty actions, again about 1012 possibilities. The problem here is not
the size of the BAT at all. It is the fact that we need to consider the very large number of
cases that can arise when trying to reason with the BAT for some purpose.
How to deal with this problem? As we have seen in this book, the solution is to avoid
blind search whenever the combinatorics makes the cost too high. (Recall the Million-
Billion Rule for ERGO searching on page 35.) The word “blind” in this context really
192
means “without knowledge of how to do any better.” The idea is that a cognitive robot
should be able to use what it knows to avoid working through so many possibilities. This
was the primary reason we moved from planning to high-level programming, starting in
Chapter 4.
Consider, for instance, the job of finding a path from one location to another, something
that came up a few times in the book. In the simple case, with tens of locations say, nothing
special is needed to search for a path. But even with hundreds of locations, these simple
methods no longer work. What does work, however, is to structure the locations into
hierarchic regions. Even though we might be dealing with billions of individual locations,
we can structure the hierarchy so that there might only be tens of interconnected regions
at each level of the hierarchy. This means that it will remain feasible to find paths from one
region to another at the same level. So, for example, to find a path from a street address
in Toronto to a street address in Montreal, we can first find a path on the major highways
from Toronto to Montreal, ignoring lower street-level details completely.
So in a challenging setting, the idea is not to avoid search completely, but to make sure
that the combinatorics are kept manageable. In many cases, there will be expertise about
the problem domain (such as the idea of structuring a map hierarchically) that can guide
a cognitive robot towards a solution. Of course, the ERGO system itself does not provide
this expertise; it is something that needs to be programmed.
193
Knowledge representation and reasoning
Fortunately, there are some ideas about what to do. Consider again the case where there
are forty Boolean fluents with unknown values. Typically, a cognitive robot will still know
something, even if it does not know the values of any of those fluents. Let us assume that
what is known about these fluents can be represented by a formula φ (as was suggested
with the #:known keyword in Chapter 11). To determine if some other formula ψ is known
to be true, instead of going through all the states where φ is true to see if ψ is also true, we
take our job to be that of determining whether or not φ logically entails ψ. In other words,
instead of pouring over a very large number of initial states, we look closely at φ itself and
see if we can derive ψ from it.
This is of course what is done in the area of knowledge representation and reasoning, as
noted in Chapter 10. We represent what is known using a formula φ (or a finite set of
such formulas), that we call the knowledge base, and then try to reason directly with this
knowledge base to see what is entailed. Of course, one way to do this is would be to
construct the list of states that satisfy the φ and work from there. But this is precisely
what we hope to avoid. Indeed, for forty Boolean fluents it might not even be feasible to
construct a list of states in the first place, and for a hundred fluents (still a small BAT),
making a list of all the states is totally out of the question.
What are the good options for reasoning in this sense, and will they scale up well
enough to serve for cognitive robotics? That is the big question. For certain types of
knowledge bases, the answer is definitely yes. One well known example is the so-called
Horn case, where the knowledge base is a set of Horn clauses, that is, disjunctions of
fluents or their negations where at most one fluent appears unnegated in each disjunction.
For a knowledge base that happens to be in this form, it is quite feasible to deal not just
with forty, but with thousands of fluents. But the techniques that have been proposed
so far to deal with arbitrary knowledge bases without restriction, appear to be much too
weak. There are some ideas about what to do, including using inference methods that are
only approximately correct. Whether any of these ideas will lead to practical systems for
cognitive robotics remains a question for the future.
194
Figures and Program Files
195
7.7 Program file Projects/Squirrels/random-main.scm . . . . . . . . . . . . . 115
196
Scheme Functions Used
This is a list of the predefined Scheme functions presented in Section 2.4 that are used in
the example programs in this book. Most of these are Racket Scheme primitives, but a few
are defined in the files System/misc.scm and System/arrays.scm. (Macros and special
forms are also included in this list.)
• Symbols:
eq? equal? symbol?
• Numbers:
+ - * / < <= = > >= abs min modulo quotient random
• Lists:
append assq caddr cadr car cddr cdr cons drop length list list-ref
member memq null? remove remove* reverse take
• Strings:
display displayln eprintf printf
• Functions:
and-map apply for-each lambda map
• Boolean values:
and not or
• For programming:
case define else error for for/and for/append for/list for/only for/or
for/sum if include let ’(quote) ‘(quasiquote) ,(unquote)
197
Index of Ergo Keywords
198
(v1.5) The fluents listed are those that are considered to be changed by (:for-some var list pgm . . . pgm)
the action. The value of the fluent after the action will be the Perform (:begin pgm. . . pgm) for some value of var from
ERGO on a Page
value of the corresponding fexpr before the action. (All changes the list, chosen nondeterministically.
are considered to be done in parallel.)
The ERGO File
An ERGO file for an application typically has three parts: a basic
The action in the definition can be a symbol or a list of symbols (:conc pgm . . . pgm)
action theory, the definitions of some ERGO procedures (using
(name var . . . var) for an action with parameters.. Concurrently perform all of the programs,
define as usual), and an optional robotic interface. These three nondeterministically interleaving any single steps.
parts are described further below, and may be loaded from other In addition, the fluent can be the special symbol #:prereq or
files by using include. The ERGO file usually ends with a main #:sensing, in which case, the corresponding fexpr defines the (:atomic pgm . . . pgm)
function that calls ergo-do or one of the planning functions: prerequisite or sensing result of the action. Perform (:begin pgm. . . pgm) but with no interleaving
The define-action expression has the effect of defining the from concurrent programs.
(define (main) (ergo-do [#:mode mode] pgm)) action itself as a global variable whose value is the action symbol
The pgm here is an expression that evaluates to an ERGO (or a list of the action symbol and its arguments). (:monitor pgm1 pgm2 . . . pgmn )
program. If the execution mode is specified, it should be one of Perform pgm1 before every step of pgm2 , which is
’offline, ’first (the default), or ’online. (The robotic interface ERGO Programs performed before every step of pgm3 , and so on.
part is needed only when the mode is ’online.)
Each of the following expressions evaluates to an ERGO program: (:>> fexpr . . . fexpr)
Using Scheme :nil Succeed after evaluating the expressions (for effect).
Within an ERGO file, ERGO can be intermingled with Scheme The program that always succeeds. (:<< fexpr . . . fexpr)
variables and function definitions that appear in the usual way. :fail Like :>> except that evaluation only happens on failure.
In what follows, we use “fexpr” to mean any Scheme expression The program that always fails.
where the fluents of the basic action theory (see define-fluents (:let ((var fexpr) . . . (var fexpr)) pgm . . . pgm)
below) may appear as global variables. (:test fexpr)
Perform (:begin pgm. . . pgm) in an environment where
Succeed or fail according to whether or not the current
each variable var has the value of fexpr.
Running ERGO value of fexpr is true.
Once ERGO has been properly installed, it is possible to call the (:act action) (:wait)
Fail if the action has a prerequisite that evaluates to false, Succeed after the next exogenous action happens.
199
main function in an ERGO file called my-ergo-app.scm as follows:
but succeed otherwise, and move to a new state where the
> racket -l ergo -f my-ergo-app.scm -m (:search pgm . . . pgm)
fluents are updated as per its define-action (see above).
Perform (:begin pgm. . . pgm) online, but using lookahead
Basic Action Theory (:begin pgm . . . pgm) for any nondeterminism to guard against failure.
Sequentially perform all of the programs.
A basic action theory has definitions for the fluents and actions.
(:choose pgm . . . pgm) Note that obtaining the value of these expressions is not the same
Fluents Nondeterministically perform one of the programs. as executing them. Execution is what is done by ergo-do alone.
The fluents of a basic action theory are defined using one or more (:if fexpr pgm1 pgm2 )
expressions in the file of the form Behave like pgm1 if the current value of fexpr is true, but Robotic Interface
(define-fluents like pgm2 otherwise.
A robotic interface is defined by a set of expressions of the form:
fluent ini (:when fexpr pgm . . . pgm)
··· Behave like (:if fexpr (:begin pgm. . . pgm) :nil)
fluent ini ) (define-interface ’out body)
(:unless fexpr pgm . . . pgm) The body should evaluate to a function of one argument
where each fluent is a symbol and each ini is a Scheme form that Behave like (:when (not fexpr) pgm. . . pgm). (like displayln) that will send an endogenous action to
provides the value of the fluent in the initial state. This has the
(:until fexpr pgm . . . pgm) an outside target, blocking until the target is ready.
effect of defining the fluents as global variables that can then be
Perform (:begin pgm. . . pgm) repeatedly until the value
used later in fexprs for actions and programs. (define-interface ’in body)
of fexpr becomes true.
Any valid Scheme datum can be used as the value of a fluent, The body should evaluate to a function of no arguments
including lists, vectors, hash-tables, and functions. (:while fexpr pgm . . . pgm)
(like read) that will return the next exogenous action
Behave like (:until (not fexpr) pgm. . . pgm).
received from an outside source, blocking when none.
Actions (:star pgm . . . pgm)
Each action is defined by an expression of the following form: Perform (:begin pgm. . . pgm) repeatedly for some The bodies can perform whatever initialization is needed for the
nondeterministically chosen number of times. functions they return to work properly (such as opening files, or
(define-action action
fluent fexpr (:for-all var list pgm . . . pgm) making TCP connections). More than one in and out interface
··· Perform (:begin pgm. . . pgm) repeatedly for all values of can be defined. The functions write-endogenous and
fluent fexpr ) the variable var taken from the list list. read-exogenous can be used as default bodies.
Index of Technical Terms
lexically scoping, 25
logical entailment, 146
200
Bibliography
[1] Vaishak Belle and Hector J. Levesque, Reasoning about continuous uncertainty in
the situation calculus. In Proceedings of the International Joint Conference on Artificial
Intelligence. Beijing, 2013.
[2] Vaishak Belle and Hector J. Levesque, Foundations for generalized planning in un-
bounded stochastic domains. In Proceedings of the International Conference on Principles
of Knowledge Representation and Reasoning Cape Town, South Africa, 2016.
[4] Blai Bonet, Giuseppe de Giacomo, Hector Geffner, and Sasha Rubin. Generalized
planning: non-deterministic abstractions and trajectory constraints. In Proceedings of
the International Joint Conference on Artificial Intelligence. Melbourne, 2017.
[5] Craig Boutilier, Ray Reiter, Mikhail Soutchanski and Sebastian Thrun. Decision-
theoretic, high-level robot programming in the situation calculus. in Proceedings of
the AAAI National Conference on Artificial Intelligence. Austin, TX, 2000.
[6] Ronald J. Brachman and Hector J. Levesque. Knowledge Representation and Reasoning.
San Francisco: Morgan Kaufmann, 2004.
[7] Eugene Charniak. Bayesian networks without tears. AI Magazine, 12(4):50–63, 1991.
[8] Kai Lai Chung. A Course in Probability Theory. San Diego, CA: Academic Press, 2001.
[9] Gregory F. Cooper. Probabilistic inference using belief networks is NP-hard. Artificial
Intelligence, 42:393–405, 1990.
[11] Bruno de Finetti. Theory of Probability: A Critical Introductory Treatment, New York:
Wiley, 1974.
[12] Giuseppe De Giacomo, Yves Lespérance, and Hector J. Levesque. ConGolog, a con-
current programming language based on the situation calculus. Artificial Intelligence,
121(1-2):109–169, 2000.
201
[13] Giuseppe De Giacomo, Yves Lespérance, Hector J. Levesque, and Sebastian Sardiña.
On the semantics of deliberation in IndiGolog—from theory to implementation. In
Proceedings of the International Conference on Principles of Knowledge Representation and
Reasoning. Rome, 2002.
[14] Herbert B. Enderton. A Mathematical Introduction to Logic. San Diego, CA: Academic
Press, 2001.
[15] Ron Fagin, Joseph Halpern, Yoram Moses and Moshe Vardi. Reasoning about Knowl-
edge. Cambridge, MA: MIT Press, 1995.
[16] William Feller. An Introduction to Probability Theory and Its Applications. Vol. 1 New
York: Wiley, 1968.
[17] Malik Ghallab, Dana Nau, and Paolo Traverso. Automated Planning: Theory and Prac-
tice. San Francisco: Morgan Kaufmann, 2004.
[18] Noah Goodman, Vikash Mansinghka, Daniel Roy, Keith Bonawitz, and Josh Tenen-
baum. Church: a language for generative models. In Proceedings of the Conference on
Uncertainty in Artificial Intelligence. Helsinki, 2008.
[19] Joseph Halpern. Reasoning about Uncertainty. Cambridge, MA: MIT Press, 2003.
[20] Frank van Harmelen, Vladimir Lifschitz and Bruce Porter, eds. Handbook of Knowledge
Representation. Amsterdam: Elsevier, 2008.
[21] Jaako Hintikka. Knowledge and Belief. Ithaca, NY: Cornell University Press, 1962.
[22] Yuxiao Hu. Generation and Verification of Plans with Loops Ph.D. thesis, Dept. of
Computer Science, University of Toronto, 2012.
[23] Yuxiao Hu and Giuseppe de Giacomo. Generalized planning: synthesizing plans that
work for multiple environments. In Proceedings of the International Joint Conference on
Artificial Intelligence. Barcelona, 2011.
[24] Yuxiao Hu and Hector J. Levesque. A correctness result for reasoning about one-
dimensional planning problems. In Proceedings of the International Conference on Prin-
ciples of Knowledge Representation and Reasoning Toronto, 2010.
[25] Edwin Jaynes. Probability theory: The Logic of Science. Cambridge, UK: Cambridge
University Press, 2003.
[26] Finn Jensen. An Introduction to Bayesian Networks. London: University College Lon-
don Press, 1996.
[27] Richard Jeffrey. Probability and the Art of Judgment. Cambridge, UK: Cambridge
University Press, 1992.
[28] Hector J. Levesque. Planning with loops. In Proceedings of the International Joint
Conference on Artificial Intelligence. Edinburgh, 2005.
202
[29] Hector J. Levesque and Gerhard Lakemeyer. The logic of knowledge bases. Cambridge,
MA: MIT Press, 2000.
[30] Hector J. Levesque and Gerhard Lakemeyer. Cognitive robotics. In [20], 869–886.
[31] Hector J. Levesque, Raymond Reiter, Yves Lespérance, Fangzhen Lin, and Richard
Scherl. Golog: A logic programming language for dynamic domains. Journal of Logic
Programming, 31:59–84, 1997.
[33] John McCarthy. Programs with common sense. Reprinted in Readings in Knowl-
edge Representation, ed. Ronald J. Brachman and Hector J. Levesque. San Francisco:
Morgan Kaufmann, 1986.
[35] Judea Pearl. Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference,
Revised second printing. San Francisco: Morgan Kaufmann, 1988.
[36] Judea Pearl. Belief networks revisited. Artificial Intelligence, 59:49–56, 1993.
[37] Ronald Petrick and Fahiem Bacchus. A knowledge-based approach to planning with
incomplete information and sensing. In Proceedings of the International Conference on
Artificial Intelligence Planning and Scheduling. Menlo Park, CA: AAAI Press, 2002.
[39] Raymond Reiter. The frame problem in the situation calculus: a simple solution
(sometimes) and a completeness result for goal regression. In Artificial Intelligence
and Mathematical Theory of Computation: Papers in Honor of John McCarthy, Vladimir
Lifschitz, ed. New York: Academic Press, 1991.
[40] Ray Reiter. Knowledge in Action: Logical Foundations for Specifying and Implementing
Dynamical Systems. Cambridge, MA: MIT Press, 2001.
[41] Stuart J. Russell and Peter Norvig. Artificial Intelligence: A Modern Approach. Upper
Saddle River, NJ: Prentice Hall, 2009.
[42] Murray Shanahan. Solving the Frame Problem: A Mathematical Investigation of the Com-
mon Sense Law of Inertia. Cambridge, MA: MIT Press, 1997.
[44] Siddharth Srivastava, Neil Immerman and Shlomo Zilberstein Computing applica-
bility conditions for plans with loops. In Proceedings of the International. Conference on
Automated Planning and Scheduling. Toronto, 2010.
[45] Sebastian Thrun, Wolfram Burgard, and Dieter Fox. Probabilistic Robotics (Intelligent
Robotics and Autonomous Agents). Cambridge, MA: MIT Press, 2005.
203