Artificial Intelligence A Modern Approach - 1
and subtracting machine called the Pascaline. Leibniz improved on this in 1694, building a
mechanical device that multiplied by doing repeated addition. Progress stalled for over a century
until Charles Babbage (1791-1871) dreamed that logarithm tables could be computed by machine.
He designed a machine for this task, but never completed the project. Instead, he turned to the
design of the Analytical Engine, for which Babbage invented the ideas of addressable memory,
stored programs, and conditional jumps. Although the idea of programmable machines was
not new—in 1805, Joseph Marie Jacquard invented a loom that could be programmed using
punched cards—Babbage's machine was the first artifact possessing the characteristics necessary
for universal computation. Babbage's colleague Ada Lovelace, daughter of the poet Lord Byron,
wrote programs for the Analytical Engine and even speculated that the machine could play chess
or compose music. Lovelace was the world's first programmer, and the first of many to endure
massive cost overruns and to have an ambitious project ultimately abandoned. Babbage's basic
design was proven viable by Doron Swade and his colleagues, who built a working model using
only the mechanical techniques available at Babbage's time (Swade, 1993). Babbage had the
right idea, but lacked the organizational skills to get his machine built.
AI also owes a debt to the software side of computer science, which has supplied the
operating systems, programming languages, and tools needed to write modern programs (and
papers about them). But this is one area where the debt has been repaid: work in AI has pioneered
many ideas that have made their way back to "mainstream" computer science, including time
sharing, interactive interpreters, the linked list data type, automatic storage management, and
some of the key concepts of object-oriented programming and integrated program development
environments with graphical user interfaces.
Linguistics (1957-present)
In 1957, B. F. Skinner published Verbal Behavior. This was a comprehensive, detailed account
of the behaviorist approach to language learning, written by the foremost expert in the field. But
curiously, a review of the book became as well-known as the book itself, and served to almost kill
off interest in behaviorism. The author of the review was Noam Chomsky, who had just published
a book on his own theory, Syntactic Structures. Chomsky showed how the behaviorist theory did
not address the notion of creativity in language—it did not explain how a child could understand
and make up sentences that he or she had never heard before. Chomsky's theory—based on
syntactic models going back to the Indian linguist Panini (c. 350 B.C.)—could explain this, and
unlike previous theories, it was formal enough that it could in principle be programmed.
Later developments in linguistics showed the problem to be considerably more complex
than it seemed in 1957. Language is ambiguous and leaves much unsaid. This means that
understanding language requires an understanding of the subject matter and context, not just an
understanding of the structure of sentences. This may seem obvious, but it was not appreciated
until the early 1960s. Much of the early work in knowledge representation (the study of how to
put knowledge into a form that a computer can reason with) was tied to language and informed
by research in linguistics, which was connected in turn to decades of work on the philosophical
analysis of language.
She also gave her name to Ada, the U.S. Department of Defense's all-purpose programming language.
1 INTRODUCTION
Humankind has given itself the scientific name homo sapiens—man the wise—because our
mental capacities are so important to our everyday lives and our sense of self. The field of
artificial intelligence, or AI, attempts to understand intelligent entities. Thus, one reason to
study it is to learn more about ourselves. But unlike philosophy and psychology, which are
also concerned with intelligence, AI strives to build intelligent entities as well as understand
them. Another reason to study AI is that these constructed intelligent entities are interesting and
useful in their own right. AI has produced many significant and impressive products even at this
early stage in its development. Although no one can predict the future in detail, it is clear that
computers with human-level intelligence (or better) would have a huge impact on our everyday
lives and on the future course of civilization.
AI addresses one of the ultimate puzzles. How is it possible for a slow, tiny brain, whether
biological or electronic, to perceive, understand, predict, and manipulate a world far larger and
more complicated than itself? How do we go about making something with those properties?
These are hard questions, but unlike the search for faster-than-light travel or an antigravity device,
the researcher in AI has solid evidence that the quest is possible. All the researcher has to do is
look in the mirror to see an example of an intelligent system.
AI is one of the newest disciplines. It was formally initiated in 1956, when the name
was coined, although at that point work had been under way for about five years. Along with
modern genetics, it is regularly cited as the "field I would most like to be in" by scientists in other
disciplines. A student in physics might reasonably feel that all the good ideas have already been
taken by Galileo, Newton, Einstein, and the rest, and that it takes many years of study before one
can contribute new ideas. AI, on the other hand, still has openings for a full-time Einstein.
The study of intelligence is also one of the oldest disciplines. For over 2000 years, philoso-
phers have tried to understand how seeing, learning, remembering, and reasoning could, or should,
be done.1 The advent of usable computers in the early 1950s turned the learned but armchair
speculation concerning these mental faculties into a real experimental and theoretical discipline.
Many felt that the new "Electronic Super-Brains" had unlimited potential for intelligence. "Faster
Than Einstein" was a typical headline. But as well as providing a vehicle for creating artificially
intelligent entities, the computer provides a tool for testing theories of intelligence, and many
theories failed to withstand the test—a case of "out of the armchair, into the fire." AI has turned
out to be more difficult than many at first imagined, and modern ideas are much richer, more
subtle, and more interesting as a result.
AI currently encompasses a huge variety of subfields, from general-purpose areas such as
perception and logical reasoning, to specific tasks such as playing chess, proving mathematical
theorems, writing poetry, and diagnosing diseases. Often, scientists in other fields move gradually
into artificial intelligence, where they find the tools and vocabulary to systematize and automate
the intellectual tasks on which they have been working all their lives. Similarly, workers in AI
can choose to apply their methods to any area of human intellectual endeavor. In this sense, it is
truly a universal field.
We have now explained why AI is exciting, but we have not said what it is. We could just say,
"Well, it has to do with smart programs, so let's get on and write some." But the history of science
shows that it is helpful to aim at the right goals. Early alchemists, looking for a potion for eternal
life and a method to turn lead into gold, were probably off on the wrong foot. Only when the aim
changed, to that of finding explicit theories that gave accurate predictions of the terrestrial world,
in the same way that early astronomy predicted the apparent motions of the stars and planets,
could the scientific method emerge and productive science take place.
Definitions of artificial intelligence according to eight recent textbooks are shown in Figure 1.1. These definitions vary along two main dimensions. The ones on top are concerned
with thought processes and reasoning, whereas the ones on the bottom address behavior. Also,
the definitions on the left measure success in terms of human performance, whereas the ones
RATIONALITY on the right measure against an ideal concept of intelligence, which we will call rationality. A
system is rational if it does the right thing. This gives us four possible goals to pursue in artificial
intelligence, as seen in the caption of Figure 1.1.
Historically, all four approaches have been followed. As one might expect, a tension exists
between approaches centered around humans and approaches centered around rationality.2 A
human-centered approach must be an empirical science, involving hypothesis and experimental
confirmation.

1. A more recent branch of philosophy is concerned with proving that AI is impossible. We will return to this interesting viewpoint in Chapter 26.
2. We should point out that by distinguishing between human and rational behavior, we are not suggesting that humans are necessarily "irrational" in the sense of "emotionally unstable" or "insane." One merely need note that we often make mistakes; we are not all chess grandmasters even though we may know all the rules of chess; and unfortunately, not everyone gets an A on the exam. Some systematic errors in human reasoning are cataloged by Kahneman et al. (1982).
Section 1.1. What is AI?
"The exciting new effort to make computers "The study of mental faculties through the
think . . . machines with minds, in the full use of computational models"
and literal sense" (Haugeland, 1985) (Charniak and McDermott, 1985)
"[The automation of] activities that we asso- "The study of the computations that make
ciate with human thinking, activities such as it possible to perceive, reason, and act"
decision-making, problem solving, learning (Winston, 1992)
..."(Bellman, 1978)
"The art of creating machines that perform "A field of study that seeks to explain and
functions that require intelligence when per- emulate intelligent behavior in terms of
formed by people" (Kurzweil, 1990) computational processes" (Schalkoff, 1 990)
"The study of how to make computers do "The branch of computer science that is con-
things at which, at the moment, people are cerned with the automation of intelligent
better" (Rich and Knight, 1 99 1 ) behavior" (Luger and Stubblefield, 1993)
Figure 1.1 Some definitions of AI. They are organized into four categories:
Systems that think like humans. Systems that think rationally.
Systems that act like humans. Systems that act rationally.
TOTAL TURING TEST the so-called total Turing Test includes a video signal so that the interrogator can test the
subject's perceptual abilities, as well as the opportunity for the interrogator to pass physical
objects "through the hatch." To pass the total Turing Test, the computer will need
COMPUTER VISION - computer vision to perceive objects, and
ROBOTICS - robotics to move them about.
Within AI, there has not been a big effort to try to pass the Turing test. The issue of acting
like a human comes up primarily when AI programs have to interact with people, as when an
expert system explains how it came to its diagnosis, or a natural language processing system has
a dialogue with a user. These programs must behave according to certain normal conventions of
human interaction in order to make themselves understood. The underlying representation and
reasoning in such a system may or may not be based on a human model.
all men are mortal; therefore Socrates is mortal." These laws of thought were supposed to govern
LOGIC the operation of the mind, and initiated the field of logic.
The development of formal logic in the late nineteenth and early twentieth centuries, which
we describe in more detail in Chapter 6, provided a precise notation for statements about all kinds
of things in the world and the relations between them. (Contrast this with ordinary arithmetic
notation, which provides mainly for equality and inequality statements about numbers.) By 1965,
programs existed that could, given enough time and memory, take a description of a problem
in logical notation and find the solution to the problem, if one exists. (If there is no solution,
LOGICIST the program might never stop looking for it.) The so-called logicist tradition within artificial
intelligence hopes to build on such programs to create intelligent systems.
There are two main obstacles to this approach. First, it is not easy to take informal
knowledge and state it in the formal terms required by logical notation, particularly when the
knowledge is less than 100% certain. Second, there is a big difference between being able to
solve a problem "in principle" and doing so in practice. Even problems with just a few dozen
facts can exhaust the computational resources of any computer unless it has some guidance as to
which reasoning steps to try first. Although both of these obstacles apply to any attempt to build
computational reasoning systems, they appeared first in the logicist tradition because the power
of the representation and reasoning systems are well-defined and fairly well understood.
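The kind of program described above can be illustrated with a minimal sketch. The following forward-chaining inference over propositional definite clauses is an invented toy example, not a system from the text; real logicist systems work with full first-order logic:

```python
# Minimal forward chaining over propositional definite clauses.
# Repeatedly fires any rule whose premises are all known, until no
# new conclusions can be added. Illustrative sketch only.

def forward_chain(rules, facts):
    """rules: list of (premises, conclusion); facts: set of known atoms."""
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if conclusion not in facts and all(p in facts for p in premises):
                facts.add(conclusion)
                changed = True
    return facts

# Aristotle's syllogism, propositionalized for Socrates:
rules = [(["socrates_is_a_man"], "socrates_is_mortal")]
facts = forward_chain(rules, {"socrates_is_a_man"})
print("socrates_is_mortal" in facts)  # True
```

Even this toy version hints at the second obstacle above: with many rules, the loop blindly retries every rule on every pass, and nothing tells it which reasoning steps to try first.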
development than approaches based on human behavior or human thought, because the standard
of rationality is clearly defined and completely general. Human behavior, on the other hand,
is well-adapted for one specific environment and is the product, in part, of a complicated and
largely unknown evolutionary process that still may be far from achieving perfection. This
book will therefore concentrate on general principles of rational agents, and on components for
constructing them. We will see that despite the apparent simplicity with which the problem can
be stated, an enormous variety of issues come up when we try to solve it. Chapter 2 outlines
some of these issues in more detail.
One important point to keep in mind: we will see before too long that achieving perfect
rationality—always doing the right thing—is not possible in complicated environments. The
computational demands are just too high. However, for most of the book, we will adopt the
working hypothesis that understanding perfect decision making is a good place to start. It
simplifies the problem and provides the appropriate setting for most of the foundational material
LIMITED RATIONALITY in the field. Chapters 5 and 17 deal explicitly with the issue of limited rationality—acting
appropriately when there is not enough time to do all the computations one might like.
In this section and the next, we provide a brief history of AI. Although AI itself is a young field,
it has inherited many ideas, viewpoints, and techniques from other disciplines. From over 2000
years of tradition in philosophy, theories of reasoning and learning have emerged, along with the
viewpoint that the mind is constituted by the operation of a physical system. From over 400 years
of mathematics, we have formal theories of logic, probability, decision making, and computation.
From psychology, we have the tools with which to investigate the human mind, and a scientific
language within which to express the resulting theories. From linguistics, we have theories of
the structure and meaning of language. Finally, from computer science, we have the tools with
which to make AI a reality.
Like any history, this one is forced to concentrate on a small number of people and events,
and ignore others that were also important. We choose to arrange events to tell the story of how
the various intellectual components of modern AI came into being. We certainly would not wish
to give the impression, however, that the disciplines from which the components came have all
been working toward AI as their ultimate fruition.
and his student Aristotle laid the foundation for much of western thought and culture. The
philosopher Hubert Dreyfus (1979, p. 67) says that "The story of artificial intelligence might well
begin around 450 B.C." when Plato reported a dialogue in which Socrates asks Euthyphro,3 "I
want to know what is characteristic of piety which makes all actions pious... that I may have it
to turn to, and to use as a standard whereby to judge your actions and those of other men."4 In
other words, Socrates was asking for an algorithm to distinguish piety from non-piety. Aristotle
went on to try to formulate more precisely the laws governing the rational part of the mind. He
developed an informal system of syllogisms for proper reasoning, which in principle allowed one
to mechanically generate conclusions, given initial premises. Aristotle did not believe all parts
of the mind were governed by logical processes; he also had a notion of intuitive reason.
Now that we have the idea of a set of rules that can describe the working of (at least part
of) the mind, the next step is to consider the mind as a physical system. We have to wait for
Rene Descartes (1596-1650) for a clear discussion of the distinction between mind and matter,
and the problems that arise. One problem with a purely physical conception of the mind is that
it seems to leave little room for free will: if the mind is governed entirely by physical laws, then
it has no more free will than a rock "deciding" to fall toward the center of the earth. Although a
DUALISM strong advocate of the power of reasoning, Descartes was also a proponent of dualism. He held
that there is a part of the mind (or soul or spirit) that is outside of nature, exempt from physical
laws. On the other hand, he felt that animals did not possess this dualist quality; they could be
considered as if they were machines.
MATERIALISM An alternative to dualism is materialism, which holds that all the world (including the
brain and mind) operates according to physical law.5 Wilhelm Leibniz (1646-1716) was probably
the first to take the materialist position to its logical conclusion and build a mechanical device
intended to carry out mental operations. Unfortunately, his formulation of logic was so weak that
his mechanical concept generator could not produce interesting results.
It is also possible to adopt an intermediate position, in which one accepts that the mind
has a physical basis, but denies that it can be explained by a reduction to ordinary physical
processes. Mental processes and consciousness are therefore part of the physical world, but
inherently unknowable; they are beyond rational understanding. Some philosophers critical of
AI have adopted exactly this position, as we discuss in Chapter 26.
Barring these possible objections to the aims of AI, philosophy had thus established a
tradition in which the mind was conceived of as a physical device operating principally by
reasoning with the knowledge that it contained. The next problem is then to establish the
EMPIRICIST source of knowledge. The empiricist movement, starting with Francis Bacon's (1561-1626)
Novum Organum,6 is characterized by the dictum of John Locke (1632-1704): "Nothing is in
the understanding, which was not first in the senses." David Hume's (1711-1776) A Treatise
INDUCTION of Human Nature (Hume, 1978) proposed what is now known as the principle of induction:
3. The Euthyphro describes the events just before the trial of Socrates in 399 B.C. Dreyfus has clearly erred in placing it 51 years earlier.
4. Note that other translations have "goodness/good" instead of "piety/pious."
5. In this view, the perception of "free will" arises because the deterministic generation of behavior is constituted by the operation of the mind selecting among what appear to be the possible courses of action. They remain "possible" because the brain does not have access to its own future states.
6. An update of Aristotle's organon, or instrument of thought.
that general rules are acquired by exposure to repeated associations between their elements.
The theory was given more formal shape by Bertrand Russell (1872-1970) who introduced
LOGICAL POSITIVISM logical positivism. This doctrine holds that all knowledge can be characterized by logical
OBSERVATION SENTENCES theories connected, ultimately, to observation sentences that correspond to sensory inputs.7 The
CONFIRMATION THEORY confirmation theory of Rudolf Carnap and Carl Hempel attempted to establish the nature of the
connection between the observation sentences and the more general theories—in other words, to
understand how knowledge can be acquired from experience.
The final element in the philosophical picture of the mind is the connection between
knowledge and action. What form should this connection take, and how can particular actions
be justified? These questions are vital to AI, because only by understanding how actions are
justified can we understand how to build an agent whose actions are justifiable, or rational.
Aristotle provides an elegant answer in the Nicomachean Ethics (Book III. 3, 1112b):
We deliberate not about ends, but about means. For a doctor does not deliberate whether he
shall heal, nor an orator whether he shall persuade, nor a statesman whether he shall produce
law and order, nor does any one else deliberate about his end. They assume the end and
consider how and by what means it is attained, and if it seems easily and best produced
thereby; while if it is achieved by one means only they consider how it will be achieved by
this and by what means this will be achieved, till they come to the first cause, which in the
order of discovery is last . . . and what is last in the order of analysis seems to be first in the
order of becoming. And if we come on an impossibility, we give up the search, e.g. if we
need money and this cannot be got: but if a thing appears possible we try to do it.
Aristotle's approach (with a few minor refinements) was implemented 2300 years later by Newell
and Simon in their GPS program, about which they write (Newell and Simon, 1972):
MEANS-ENDS ANALYSIS The main methods of GPS jointly embody the heuristic of means-ends analysis. Means-ends
analysis is typified by the following kind of common-sense argument:
I want to take my son to nursery school. What's the difference between what I
have and what I want? One of distance. What changes distance? My automobile.
My automobile won't work. What is needed to make it work? A new battery.
What has new batteries? An auto repair shop. I want the repair shop to put in a
new battery; but the shop doesn't know I need one. What is the difficulty? One
of communication. What allows communication? A telephone . . . and so on.
This kind of analysis—classifying things in terms of the functions they serve and oscillating
among ends, functions required, and means that perform them—forms the basic system of
heuristic of GPS.
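The nursery-school argument can be sketched as a simple difference-reduction loop. This is an illustrative reconstruction, not GPS itself; the operators, the differences they reduce, and their preconditions are invented for the example:

```python
# Toy means-ends analysis: to achieve a goal, find an operator that
# reduces the difference, recursively achieve its preconditions first,
# then apply it. Illustrative sketch only, not the actual GPS program.

# Each operator: (name, difference it reduces, preconditions, effects).
OPERATORS = [
    ("drive-to-school", "at-school", {"car-works"}, {"at-school"}),
    ("install-battery", "car-works", {"have-battery"}, {"car-works"}),
    ("buy-battery", "have-battery", {"phoned-shop"}, {"have-battery"}),
    ("phone-shop", "phoned-shop", set(), {"phoned-shop"}),
]

def means_ends(state, goal):
    """Return a plan (list of operator names) achieving `goal`, or None."""
    if goal in state:
        return []
    for name, reduces, preconditions, effects in OPERATORS:
        if reduces == goal:
            plan = []
            for p in preconditions:           # achieve preconditions first
                subplan = means_ends(state, p)
                if subplan is None:
                    return None
                plan += subplan
                state = state | {p}
            return plan + [name]
    return None                               # no operator reduces this difference

print(means_ends(set(), "at-school"))
# ['phone-shop', 'buy-battery', 'install-battery', 'drive-to-school']
```

The recursion mirrors the argument in the quotation: each "What is the difficulty?" question selects an operator, and each unmet precondition becomes a new subgoal.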
Means-ends analysis is useful, but does not say what to do when several actions will achieve the
goal, or when no action will completely achieve it. Arnauld, a follower of Descartes, correctly
described a quantitative formula for deciding what action to take in cases like this (see Chapter 16).
John Stuart Mill's (1806-1873) book Utilitarianism (Mill, 1863) amplifies on this idea. The more
formal theory of decisions is discussed in the following section.
7. In this picture, all meaningful statements can be verified or falsified either by analyzing the meaning of the words or by carrying out experiments. Because this rules out most of metaphysics, as was the intention, logical positivism was unpopular in some circles.
Section 1.2. The Foundations of Artificial Intelligence
stances cannot be solved in any reasonable time. Therefore, one should strive to divide the overall
problem of generating intelligent behavior into tractable subproblems rather than intractable ones.
REDUCTION The second important concept in the theory of complexity is reduction, which also emerged in
the 1960s (Dantzig, 1960; Edmonds, 1962). A reduction is a general transformation from one
class of problems to another, such that solutions to the first class can be found by reducing them
to problems of the second class and solving the latter problems.
NP COMPLETENESS How can one recognize an intractable problem? The theory of NP-completeness, pioneered
by Stephen Cook (1971) and Richard Karp (1972), provides a method. Cook and Karp showed
the existence of large classes of canonical combinatorial search and reasoning problems that
are NP-complete. Any problem class to which an NP-complete problem class can be reduced
is likely to be intractable. (Although it has not yet been proved that NP-complete problems
are necessarily intractable, few theoreticians believe otherwise.) These results contrast sharply
with the "Electronic Super-Brain" enthusiasm accompanying the advent of computers. Despite
the ever-increasing speed of computers, subtlety and careful use of resources will characterize
intelligent systems. Put crudely, the world is an extremely large problem instance!
Besides logic and computation, the third great contribution of mathematics to AI is the
theory of probability. The Italian Gerolamo Cardano (1501-1576) first framed the idea of
probability, describing it in terms of the possible outcomes of gambling events. Before his time,
the outcomes of gambling games were seen as the will of the gods rather than the whim of chance.
Probability quickly became an invaluable part of all the quantitative sciences, helping to deal
with uncertain measurements and incomplete theories. Pierre Fermat (1601-1665), Blaise Pascal
(1623-1662), James Bernoulli (1654-1705), Pierre Laplace (1749-1827), and others advanced
the theory and introduced new statistical methods. Bernoulli also framed an alternative view
of probability, as a subjective "degree of belief" rather than an objective ratio of outcomes.
Subjective probabilities therefore can be updated as new evidence is obtained. Thomas Bayes
(1702-1761) proposed a rule for updating subjective probabilities in the light of new evidence
(published posthumously in 1763). Bayes' rule, and the subsequent field of Bayesian analysis,
form the basis of the modern approach to uncertain reasoning in AI systems. Debate still rages
between supporters of the objective and subjective views of probability, but it is not clear if the
difference has great significance for AI. Both versions obey the same set of axioms. Savage's
(1954) Foundations of Statistics gives a good introduction to the field.
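Bayes' rule is short enough to state and compute directly: P(H|E) = P(E|H)P(H)/P(E). The following sketch applies it to a diagnostic test; the prevalence and accuracy figures are made-up illustrative numbers, not data from any source:

```python
# Bayes' rule for a binary hypothesis H and positive evidence E:
#   P(H|E) = P(E|H) P(H) / P(E),  where
#   P(E)   = P(E|H) P(H) + P(E|~H) P(~H).
# All numbers below are invented for illustration.

def bayes_update(prior, likelihood, false_positive_rate):
    """Posterior probability of H given one piece of positive evidence E."""
    p_evidence = likelihood * prior + false_positive_rate * (1 - prior)
    return likelihood * prior / p_evidence

# A disease with 1% prevalence; the test detects 90% of true cases
# but also fires on 5% of healthy people.
posterior = bayes_update(prior=0.01, likelihood=0.9, false_positive_rate=0.05)
print(round(posterior, 3))  # 0.154
```

The result illustrates why subjective updating matters: even after a positive test, the posterior stays low because the prior was low, a point that pure objective frequencies of the test alone do not convey.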
As with logic, a connection must be made between probabilistic reasoning and action.
DECISION THEORY Decision theory, pioneered by John von Neumann and Oskar Morgenstern (1944), combines
probability theory with utility theory (which provides a formal and complete framework for
specifying the preferences of an agent) to give the first general theory that can distinguish good
actions from bad ones. Decision theory is the mathematical successor to utilitarianism, and
provides the theoretical basis for many of the agent designs in this book.
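The core of the combination just described can be shown in a few lines: weight each outcome's utility by its probability, and pick the action with the maximum expected utility. The probabilities and utilities below are invented for the example:

```python
# Choosing among actions by maximum expected utility (MEU).
# Outcome probabilities and utilities are illustrative only.

def expected_utility(outcomes):
    """outcomes: list of (probability, utility) pairs for one action."""
    return sum(p * u for p, u in outcomes)

def best_action(actions):
    """actions: dict mapping action name -> list of (probability, utility)."""
    return max(actions, key=lambda name: expected_utility(actions[name]))

actions = {
    "take-umbrella": [(0.3, 50), (0.7, 80)],    # rain / no rain
    "leave-umbrella": [(0.3, -100), (0.7, 100)],
}
print(best_action(actions))  # take-umbrella
```

Here taking the umbrella wins (expected utility about 71 versus 40) even though leaving it behind has the single best outcome: decision theory trades off the good case against the risk, which is exactly what distinguishes it from simply preferring the best possible result.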
Psychology (1879-present)
Scientific psychology can be said to have begun with the work of the German physicist Hermann
von Helmholtz (1821-1894) and his student Wilhelm Wundt (1832-1920). Helmholtz applied
the scientific method to the study of human vision, and his Handbook of Physiological Optics