
ARTIFICIAL INTELLIGENCE

for
T.E. Computer Engg. / I. T

As per Revised KBCNMU Syllabus (w.e.f. 2019-2020)

Notes prepared by:


Mr. Pravin K. Patil
Assistant Professor, Computer Engineering Department

SSBT’s College of Engineering and Technology Bambhori Jalgaon

Based on Reference Book:

Elaine Rich, Kevin Knight and Shivashankar B. Nair, "Artificial Intelligence",
3rd Edition, TMH.
UNIT – I
INTRODUCTION TO ARTIFICIAL INTELLIGENCE

Contents:
 Definitions of AI
 History
 Turing Test
 AI Problems and Techniques:
o Problem as a State Space Search
o Problem Characteristics
 Production System:
o Water Jug Problem
 Heuristic Searching Techniques:
o BFS
o DFS
o A*
o AO*
o Means Ends Analysis

Definitions of AI
 AI is the branch of computer science that deals with symbolic, non-algorithmic
methods of problem solving. It is the study of how to make computers do things
which, at the moment, people do better.
 Artificial Intelligence is concerned with the design of intelligence in an
artificial device. The term was coined by John McCarthy (Father of AI) in
1956.
 There are two ideas in the definition.
1. Intelligence
2. Artificial device
 A system with intelligence may be expected to behave as intelligently as a human,
or it may be expected to behave in the best possible (rational) manner.
 Artificial intelligence is about designing systems that are as intelligent as
humans. This involves trying to understand human thought and an effort to
build machines that emulate the human thought process. This view is the
cognitive science approach to AI.
History of AI
 Intellectual roots of AI date back to the early studies of the nature of knowledge and
reasoning. The dream of making a computer imitate humans also has a very early history.
 The concept of intelligent machines is found in Greek mythology. There is an ancient
story about Pygmalion, the legendary king of Cyprus. He fell in love with an
ivory statue he made to represent his ideal woman. The king prayed to the goddess Aphrodite,
and the goddess miraculously brought the statue to life. Other myths involve human-like
artifacts. As a present from Zeus to Europa, Hephaestus created Talos, a huge robot. Talos
was made of bronze and his duty was to patrol the beaches of Crete.
 Aristotle (384-322 BC) developed an informal system of syllogistic logic, which is the basis
of the first formal deductive reasoning system.
 Early in the 17th century, Descartes proposed that bodies of animals are nothing more than
complex machines.
 Pascal in 1642 made the first mechanical digital calculating machine.
 In the 19th century, George Boole developed a binary algebra representing (some) "laws of
thought."
 Charles Babbage & Ada Byron worked on programmable mechanical calculating machines.
 In the late 19th century and early 20th century, mathematical philosophers like Gottlob Frege,
Bertrand Russell, Alfred North Whitehead, and Kurt Gödel built on Boole's initial logic
concepts to develop mathematical representations of logic problems.
 The advent of electronic computers provided a revolutionary advance in the ability to study
intelligence.
 In 1943 McCulloch & Pitts developed a Boolean circuit model of brain. They wrote the paper
“A Logical Calculus of Ideas Immanent in Nervous Activity”, which explained how it is
possible for neural networks to compute.
 Marvin Minsky and Dean Edmonds built the SNARC in 1951, the first randomly
wired neural-network learning machine (SNARC stands for Stochastic Neural Analog
Reinforcement Computer). It was a neural network computer that used 3000 vacuum tubes and
a network of 40 neurons.
 In 1950 Turing wrote an article on “Computing Machinery and Intelligence” which
articulated a complete vision of AI. Turing’s paper talked of many things, of solving
problems by searching through the space of possible solutions, guided by heuristics. He
illustrated his ideas on machine intelligence by reference to chess. He even propounded the
possibility of letting the machine alter its own instructions so that machines can learn from
experience.
 In 1956 a famous conference took place in Dartmouth. The conference brought together the
founding fathers of artificial intelligence for the first time. In this meeting the term “Artificial
Intelligence” was adopted.
 Between 1952 and 1956, Samuel had developed several programs for playing checkers. In
1956, Newell & Simon’s Logic Theorist was published. It is considered by many to be the
first AI program. In 1959, Gelernter developed a Geometry Engine. In 1961 James Slagle
(PhD dissertation, MIT) wrote a symbolic integration program, SAINT. It was written in
LISP and solved calculus problems at the college freshman level. In 1963, Thomas Evans'
program ANALOGY was developed, which could solve IQ-test-type analogy problems.
 In 1963, Edward A. Feigenbaum & Julian Feldman published Computers and Thought, the
first collection of articles about artificial intelligence.
 In 1965, J. Allen Robinson invented a mechanical proof procedure, the Resolution Method,
which allowed programs to work efficiently with formal logic as a representation language. In
1967, the Dendral program (Feigenbaum, Lederberg, Buchanan, Sutherland at Stanford) was
demonstrated which could interpret mass spectra on organic chemical compounds. This was
the first successful knowledge-based program for scientific reasoning.
 In 1969 the SRI robot, Shakey, demonstrated combining locomotion, perception and problem
solving.
 The years from 1969 to 1979 marked the early development of knowledge-based systems. In
1974, MYCIN demonstrated the power of rule-based systems for knowledge representation
and inference in medical diagnosis and therapy. Knowledge representation schemes were
developed. These included frames, developed by Minsky. Logic-based languages like Prolog
and Planner were developed.
 In the 1980s, Lisp machines were developed and marketed. Around 1985, neural networks
returned to popularity. In 1988, there was a resurgence of probabilistic and decision-theoretic
methods.
 The early AI systems used general-purpose methods and little domain knowledge. AI
researchers realized that specialized knowledge is required for rich tasks, in order to focus
reasoning.
 The 1990s saw major advances in all areas of AI, including machine learning, data
mining, intelligent tutoring, case-based reasoning, multi-agent planning, scheduling, uncertain
reasoning, natural language understanding and translation, vision, virtual reality, games, and
other topics.
 Rod Brooks' COG Project at MIT, with numerous collaborators, made significant progress in
building a humanoid robot.
 The first official RoboCup soccer match, featuring table-top matches with 40 teams of
interacting robots, was held in 1997.
 In the late 90s, Web crawlers and other AI-based information extraction programs became
essential to widespread use of the World Wide Web. Interactive robot pets ("smart toys")
became commercially available, realizing the vision of the 18th-century novelty toy makers.
 In 2000, the Nomad robot explored remote regions of Antarctica, looking for meteorite
samples.

We will now look at a few famous AI systems that have been developed over the years.

1. ALVINN: Autonomous Land Vehicle In a Neural Network


In 1989, Dean Pomerleau at CMU created ALVINN, a system which learns to
control vehicles by watching a person drive. It contains a neural network whose input is a
30x32-unit two-dimensional camera image. The output layer is a representation of the
direction the vehicle should travel. The system drove a car from the East Coast of the USA to
the West Coast, a total of about 2850 miles. Of this, about 50 miles were driven by a
human and the rest solely by the system.
2. Deep Blue
In 1997, the Deep Blue chess program, created by IBM, beat the then world chess
champion, Garry Kasparov.
3. Machine translation
A system capable of translations between people speaking different languages will be a
remarkable achievement of enormous economic and cultural benefit. Machine translation
is one of the important fields of endeavour in AI. While some translating systems have
been developed, there is a lot of scope for improvement in translation quality.
4. Autonomous agents
In space exploration, robotic space probes autonomously monitor their surroundings,
make decisions and act to achieve their goals. NASA's Mars rovers successfully
completed their primary three-month missions in April, 2004. The Spirit rover had been
exploring a range of Martian hills that took two months to reach. It is finding curiously
eroded rocks that may be new pieces to the puzzle of the region's past. Spirit's twin,
Opportunity, had been examining exposed rock layers inside a crater.
5. Internet agents
The explosive growth of the internet has also led to growing interest in internet agents to
monitor users' tasks, seek needed information, and to learn which information is most
useful.

Turing Test
 Turing held that in the future computers could be programmed to acquire abilities
rivalling human intelligence. As part of his argument Turing put forward the idea of an
'imitation game', in which a human being and a computer would be interrogated under
conditions where the interrogator would not know which was which, the
communication being entirely by textual messages. Turing argued that if the
interrogator could not distinguish them by questioning, then it would be unreasonable
not to call the computer intelligent. Turing's 'imitation game' is now usually called 'the
Turing test' for intelligence.

 Consider that there are two rooms, A and B. One of the rooms contains a computer;
the other contains a human. The interrogator is outside and does not know which is
which. He can ask questions through a teletype and receives answers from
both A and B. The interrogator needs to identify which of A and B is the human. To pass
the Turing test, the machine has to fool the interrogator into believing that it is human.
Applications of AI
1. Artificial Intelligence in Healthcare: Companies are applying machine learning to
make better and faster diagnoses than humans. One of the best-known technologies is
IBM’s Watson. It understands natural language and can respond to questions asked of
it. The system mines patient data and other available data sources to form a
hypothesis, which it then presents with a confidence scoring schema. AI seeks to
emulate human intelligence in computer technology that could assist
both the doctor and the patient in the following ways:
 By providing a laboratory for the examination, representation and cataloguing
of medical information
 By devising novel tools to support decision making and research
 By integrating activities in medical, software and cognitive sciences
 By offering a content-rich discipline for the future scientific medical
communities.
2. Artificial Intelligence in business: Robotic process automation is being applied to
highly repetitive tasks normally performed by humans. Machine learning algorithms
are being integrated into analytics and CRM (Customer relationship management)
platforms to uncover information on how to better serve customers. Chatbots have
already been incorporated into websites and e-commerce platforms to provide immediate
service to customers. Automation of job positions has also become a talking point
among academics and IT consultancies.

3. AI in education: It automates grading, giving educators more time. It can also assess
students and adapt to their needs, helping them work at their own pace.

4. AI in Autonomous vehicles: Just like humans, self-driving cars need sensors
to understand the world around them and a brain to collect and process information
and choose specific actions based on the information gathered. Autonomous vehicles are
equipped with advanced tools to gather information, including long-range radar, cameras, and
LIDAR. Each of these technologies is used in a different capacity and each collects
different information. This information is useless unless it is processed and some
form of action is taken based on it. This is where
artificial intelligence comes into play; it can be compared to the human brain. AI has
several applications for these vehicles, and among them the more immediate ones are
as follows:
 Directing the car to a gas station or recharging station when it is running low on
fuel.
 Adjusting the trip's directions based on known traffic conditions to find the
quickest route.
 Incorporating speech recognition for advanced communication with passengers.
 Natural language interfaces and virtual assistance technologies.

5. AI for robotics will allow us to address the challenges of taking care of an aging
population and allow far longer independence. It will drastically reduce, maybe even
eliminate, traffic accidents and deaths, as well as enable disaster response in dangerous
situations, for example the nuclear meltdown at the Fukushima power plant.

6. Cyborg Technology: One of the main limitations of being human is simply our own
bodies and brains. Researcher Shimon Whiteson thinks that in the future we will be able
to augment ourselves with computers and enhance many of our own natural abilities.
Though many of these possible cyborg enhancements would be added for convenience,
others may serve a more practical purpose. Yoky Matsuoka of Nest believes that AI will
become useful for people with amputated limbs, as the brain will be able to communicate
with a robotic limb to give the patient more control. This kind of cyborg technology
would significantly reduce the limitations that amputees deal with daily.

7. Game Playing: You can buy machines that can play master level chess for a few hundred
dollars. There is some AI in them, but they play well against people mainly through brute-force
computation, looking at hundreds of thousands of positions. To beat a world
champion by brute force and known reliable heuristics requires being able to look at 200
million positions per second.

8. Speech Recognition: In the 1990s, computer speech recognition reached a practical level
for limited purposes. Thus United Airlines has replaced its keyboard tree for flight
information by a system using speech recognition of flight numbers and city names. It is
quite convenient. On the other hand, while it is possible to instruct some computers using
speech, most users have gone back to the keyboard and the mouse as still more
convenient.
9. Understanding Natural Language: Just getting a sequence of words into a computer is
not enough. Parsing sentences is not enough either. The computer has to be provided with
an understanding of the domain the text is about, and this is presently possible only for
very limited domains.

10. Computer Vision: The world is composed of three-dimensional objects, but the inputs to
the human eye and computers & TV cameras are two dimensional. Some useful programs
can work solely in two dimensions, but full computer vision requires partial three-
dimensional information that is not just a set of two-dimensional views. At present there
are only limited ways of representing three-dimensional information directly, and they are
not as good as what humans evidently use.

11. Expert Systems: A ``knowledge engineer'' interviews experts in a certain domain and
tries to embody their knowledge in a computer program for carrying out some task. One
of the first expert systems was MYCIN in 1974, which diagnosed bacterial infections of
the blood and suggested treatments. It did better than medical students or practicing
doctors, provided its limitations were observed. Namely, its ontology included bacteria,
symptoms, and treatments and did not include patients, doctors, hospitals, death,
recovery, and events occurring in time. Its interactions depended on a single patient being
considered. Since the experts consulted by the knowledge engineers knew about patients,
doctors, death, recovery, etc., it is clear that the knowledge engineers forced what the
experts told them into a predetermined framework. The usefulness of current expert
systems depends on their users having common sense.

Q. Define Artificial Intelligence. Explain applications of Artificial Intelligence. (KBCNMU
December 2019 Examination)

AI Problems and Techniques


 AI Problem:
o An AI problem is solved by using an AI technique. A problem that is solved by
using knowledge and logic is called an AI problem.
o Knowledge consists of:
 Facts
 Concepts
 Theories
 Procedures
 Relationships between them
 AI Technique
o In the real world, knowledge has some unwelcome properties −
 Its volume is huge, next to unimaginable.
 It is not well organized or well formatted.
 It keeps changing constantly.
o An AI technique is a technique which is used to solve an AI problem.
o An AI technique is a way to organize and use knowledge efficiently. It is a
method that exploits knowledge, which should be represented in such a way that −
a) The knowledge captures generalizations.
b) It can be understood by the people who provide it.
c) It can be easily modified to correct errors and to reflect changes in the
world.
 Problem as a State Space Search:

Problem Space: It is a hypothetical space in which we represent all the states of a
problem, including the initial, intermediate and final states.

Steps to solve an AI problem:

o Define the problem precisely (specify the problem)
o Analyse the problem
o Isolate and represent the task knowledge that is necessary to solve the problem
o Choose the best problem-solving technique and apply it to the particular
problem

 Characteristics of AI Problems:
1. Is the problem decomposable into a set of independent, smaller or easier sub-problems?
2. Can solution steps be ignored, or at least undone, if they prove unwise?
3. Is the problem's universe predictable? (Are we certain of reaching the solution?)
4. Is a good solution to the problem obvious without comparing it to all other possible solutions?
5. Is the knowledge base we use for solving the problem internally consistent?
6. Is a large amount of knowledge absolutely required to solve the problem?
7. Does solving the problem require interaction between the computer and a human?

Production System

 A production system (or production rule system) is a computer program typically used
to provide some form of artificial intelligence, which consists primarily of a set of
rules about behaviour.
 These rules, termed productions, are a basic representation found useful in automated
planning, expert systems and action selection.
 A production system provides the mechanism necessary to execute productions in
order to achieve some goal for the system.
 Productions consist of two parts: a sensory precondition (or "IF" statement) and an
action (or "THEN").
 If a production's precondition matches the current state of the world, then the
production is said to be triggered.
 If a production's action is executed, it is said to have fired.
 A production system also contains a database, sometimes called working memory,
which maintains data about current state or knowledge, and a rule interpreter.

WATER – JUG PROBLEM


Problem Statement: You are given two jugs, a 4-gallon one and a 3-gallon one, a pump
which has unlimited water which you can use to fill the jug, and the ground on which water
may be poured. Neither jug has any measuring markings on it. How can you get exactly 2
gallons of water in the 4-gallon jug?

One possible solution without using AI technique and Production System:


 Initially both the jugs are empty.
 Fill the 3-gallon jug.
 Pour the water from the 3-gallon jug into the 4-gallon jug. (After pouring 3 gallons
into the 4-gallon jug, the 4-gallon jug still has 1 gallon of empty capacity.)
 Now the 3-gallon jug is empty, so fill the 3-gallon jug again.
 Pour water from the 3-gallon jug into the 4-gallon jug until the 4-gallon jug is full.
(After this, the 3-gallon jug has 2 gallons of water left.)
 Empty the 4-gallon jug onto the ground.
 Pour the 2 gallons of water from the 3-gallon jug into the 4-gallon jug.
One possible solution by using AI technique and Production System:

State Representation and Initial State


We will represent a state of the problem as a tuple (x, y), where x represents the
amount of water in the 4-gallon jug and y represents the amount of water in the 3-gallon jug.
The initial state is (0, 0) and the goal state is (2, n) for any value of n.

Now, we must define a set of production rules which will take us from the Initial (Start) State
to the Goal State. The Production Rules are as follows:

Production Rules for Water Jug Problem


Possible Solution
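As an illustrative sketch (Python is assumed; the rule set encodes the standard fill/empty/pour productions for the jug capacities above, and a simple breadth-first control strategy chooses which triggered rule to fire), the production system might look like:

```python
from collections import deque

# Production rules as (precondition, action) pairs over states (x, y),
# where x is the 4-gallon jug and y is the 3-gallon jug.
RULES = [
    (lambda x, y: x < 4, lambda x, y: (4, y)),           # fill the 4-gallon jug
    (lambda x, y: y < 3, lambda x, y: (x, 3)),           # fill the 3-gallon jug
    (lambda x, y: x > 0, lambda x, y: (0, y)),           # empty the 4-gallon jug
    (lambda x, y: y > 0, lambda x, y: (x, 0)),           # empty the 3-gallon jug
    (lambda x, y: x + y >= 4 and y > 0,                  # pour 3 -> 4 until 4 is full
     lambda x, y: (4, y - (4 - x))),
    (lambda x, y: x + y >= 3 and x > 0,                  # pour 4 -> 3 until 3 is full
     lambda x, y: (x - (3 - y), 3)),
    (lambda x, y: x + y <= 4 and y > 0,                  # pour all of 3 into 4
     lambda x, y: (x + y, 0)),
    (lambda x, y: x + y <= 3 and x > 0,                  # pour all of 4 into 3
     lambda x, y: (0, x + y)),
]

def solve(start=(0, 0), goal_x=2):
    """Breadth-first control strategy over the state space.
    Returns the sequence of states from start to a state with x == goal_x."""
    frontier = deque([[start]])
    visited = {start}
    while frontier:
        path = frontier.popleft()
        x, y = path[-1]
        if x == goal_x:
            return path
        for pre, act in RULES:          # fire every rule whose precondition matches
            if pre(x, y):
                nxt = act(x, y)
                if nxt not in visited:  # working memory: states seen so far
                    visited.add(nxt)
                    frontier.append(path + [nxt])
    return None

print(solve())
```

Because the control strategy is breadth-first, the returned sequence is a shortest solution (6 rule firings, 7 states).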

Q. What is a Production System? Solve the following Water-Jug Problem: you are given two
jugs, an 8-gallon one and a 6-gallon one, a pump which has unlimited water which you can use to fill
the jugs, and the ground on which water may be poured. Neither jug has any measuring markings
on it. How can you get exactly 4 gallons of water in the 8-gallon jug? (KBCNMU December 2019
Examination)
Heuristic Searching Techniques
Breadth First Search (BFS)
 It starts from the root node, explores the neighbouring nodes first, and then moves
towards the next-level neighbours.
 It generates one level of the tree at a time until the solution is found.
 BFS searches breadth-wise in the problem space.
 Breadth-first search is like traversing a tree where each node is a state which may be
a potential candidate for a solution.
 It expands nodes from the root of the tree and then generates one level of the tree at a
time until a solution is found.
 Example: Construct a tree with the initial state as root. Generate all its child nodes.
Now, for each leaf node, generate all its successors by applying the appropriate rules.
Continue the process until the goal state is reached. This is called Breadth First Search.
 Algorithm - Breadth First Search

1. Create a variable called NODE-LIST and set it to the initial state.
2. Loop until the goal state is found or NODE-LIST is empty:
i. Remove the first element from NODE-LIST; call it E. If
NODE-LIST was empty, then quit.
ii. For each way that each rule can match the state described in E,
do:
a) Apply the rule to generate a new state.
b) If the new state is the goal state, quit and return this
state.
c) Otherwise, add this state to the end of NODE-LIST.

Working of Breadth First Search Algorithm with example
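A minimal sketch of the algorithm above (Python is assumed; the graph, start node and goal node are made-up examples), with NODE-LIST as a FIFO queue of paths:

```python
from collections import deque

# A small state space as an adjacency list (assumed example); 'G' is the goal.
GRAPH = {
    'A': ['B', 'C'],
    'B': ['D', 'E'],
    'C': ['F', 'G'],
    'D': [], 'E': [], 'F': [], 'G': [],
}

def bfs(start, goal):
    """NODE-LIST is a FIFO queue; returns the path found, or None."""
    node_list = deque([[start]])      # step 1: NODE-LIST holds the initial state
    seen = {start}
    while node_list:                  # step 2: loop until goal found or list empty
        path = node_list.popleft()    # step i: remove the first element, E
        if path[-1] == goal:
            return path
        for nxt in GRAPH[path[-1]]:   # step ii: apply every rule that matches E
            if nxt not in seen:
                seen.add(nxt)
                node_list.append(path + [nxt])  # step c: add to the END (FIFO)
    return None

print(bfs('A', 'G'))   # -> ['A', 'C', 'G']
```

Adding new states to the end of NODE-LIST is what makes the search level-by-level; replacing the queue with a stack would turn it into depth-first search.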

 Advantages of Breadth – First Search


1. BFS will never get trapped exploring a blind alley.
2. If there is a solution, then BFS is guaranteed to find it. Furthermore, if there
are multiple solutions, then BFS will find the minimal solution. This is
guaranteed by the fact that BFS never explores longer paths until all
shorter ones have been examined.
3. BFS will always find the shortest path to the goal.

 Disadvantages of Breadth – First Search:


1. The main drawback of breadth-first search is its memory requirement. Each
level of the tree must be saved in order to generate the next level, and the
amount of memory is proportional to the number of nodes stored. As a result,
BFS is severely space-bound in practice and will exhaust the memory available
on typical computers in a matter of minutes.
2. If the solution is far away from the root, breadth-first search will consume a
lot of time.

Depth - First Search (DFS)


 In DFS, we pursue a single branch of the tree until it yields a solution or until a
decision to terminate the path is made.
 In such cases, backtracking occurs.
 The most recently visited node is revisited and a new path is created.
 This search procedure is called Depth – First Search.

Depth – First Search


 Algorithm - Depth First Search

1. If the initial state is a goal state, quit and return success.

2. Otherwise, loop until success or failure is signalled:
a) Generate a successor, say E, of the initial state. If there are no more
successors, signal failure.
b) Call Depth - First Search with E as the initial state.
c) If success is returned, signal success. Otherwise, continue in this loop.
Working of Depth - First Search Algorithm with example
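The recursive structure of the algorithm above can be sketched as follows (Python is assumed; the graph and goal node are the same made-up example used for BFS):

```python
# A small state space as an adjacency list (assumed example); 'G' is the goal.
GRAPH = {
    'A': ['B', 'C'],
    'B': ['D', 'E'],
    'C': ['F', 'G'],
    'D': [], 'E': [], 'F': [], 'G': [],
}

def dfs(state, goal, path=None):
    """Recursive depth-first search; returns a path or None (backtracking
    happens when the loop over successors is exhausted without success)."""
    path = (path or []) + [state]
    if state == goal:                    # step 1: initial state is the goal
        return path
    for succ in GRAPH[state]:            # step 2a: generate successors one by one
        if succ not in path:             # avoid revisiting nodes on the current path
            result = dfs(succ, goal, path)   # step 2b: recurse with E as initial state
            if result:                   # step 2c: propagate success
                return result
    return None                          # no successor worked: signal failure here

print(dfs('A', 'G'))   # -> ['A', 'C', 'G']
```

Note that only the current path is kept in memory, which is exactly the memory advantage listed below.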

 Advantages of Depth – First Search


1. Depth – First Search requires less memory, since only the nodes on the current
path are stored.
2. Sometimes depth – first search may find a solution without examining many
of the other nodes at all.

 Disadvantages of Depth – First Search


1. Depth-First Search is not guaranteed to find the solution.
2. There is no guarantee to find a minimal solution, if more than one solution
exists.

Q. Explain Breadth First Search and Depth First Search algorithm with its advantages.
(KBCNMU December 2019 Examination)

Best First Search


 Best First Search is a combination of both Depth First Search and Breadth First Search. In
Best First Search we follow a single path at a time, but we can switch from that path
whenever some other path looks more promising than the current path.
 At each step of Best First Search, we select the most promising node by applying an
appropriate Heuristic Function to each of them.
 For the practical implementation of Best First Search, we need to use two lists of
nodes:
1. OPEN: nodes that have been generated and have had the Heuristic Function
applied to them, but which have not yet been examined.
2. CLOSED: nodes that have already been examined. We have to keep these
nodes in memory for future backtracking.
 Heuristic Function: f’(n) = g(n) + h’(n)
Where,
a) g(n) = cost from the initial state to the current state n.
b) h’(n) = estimated cost from the current state n to a goal node.
 Algorithm – Best First Search
1. Start with OPEN, containing just the initial state.
2. Until the goal node is found or there are no nodes left in OPEN, do:
a) Pick the best node in OPEN
b) Generate its successors
c) For each successor, do:
i. If it has not been generated before, evaluate it, add it to OPEN
and record its parent.
ii. If it has been generated before, change the parent if this new
path is better than the previous one. In this case, update the
cost of getting to this node and to any successors that this node
may already have.
 Example:

1. Step 1: OPEN {A} CLOSED {}


2. Step 2: OPEN {A,B,C,D} CLOSED {}
3. Step 3: OPEN {A,B,C,E,F} CLOSED {D}
4. Step 4: OPEN {A,C,D,E,F,G,H} CLOSED {B,D}
5. Step 5: OPEN {A,C,D,F,G,H,I,J} CLOSED{B,D,E}
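A minimal sketch of best-first search with f'(n) = g(n) + h'(n) (Python is assumed; the weighted graph and the h' estimates are made-up examples, and OPEN is kept as a priority queue):

```python
import heapq

# A hypothetical state space: node -> [(neighbour, step_cost), ...]
GRAPH = {
    'A': [('B', 1), ('C', 2)],
    'B': [('D', 4), ('E', 2)],
    'C': [('F', 3)],
    'D': [], 'E': [('G', 3)], 'F': [('G', 5)], 'G': [],
}
H = {'A': 6, 'B': 4, 'C': 5, 'D': 7, 'E': 2, 'F': 4, 'G': 0}  # assumed h' estimates

def best_first(start, goal):
    """Repeatedly pick the most promising OPEN node by f'(n) = g(n) + h'(n)."""
    open_list = [(H[start], 0, start, [start])]   # OPEN entries: (f', g, node, path)
    closed = set()                                # CLOSED: nodes already examined
    while open_list:
        f, g, node, path = heapq.heappop(open_list)   # best node in OPEN
        if node == goal:
            return path, g
        if node in closed:
            continue
        closed.add(node)
        for succ, cost in GRAPH[node]:                # generate its successors
            if succ not in closed:
                heapq.heappush(open_list,
                               (g + cost + H[succ], g + cost, succ, path + [succ]))
    return None

print(best_first('A', 'G'))   # -> (['A', 'B', 'E', 'G'], 6)
```

Because the heap always surfaces the lowest f', the search can abandon the current path the moment another OPEN node looks more promising, which is exactly the switching behaviour described above.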

A* Algorithm
 The A* algorithm is the practical implementation of Best First Search. It was first
described in 1968 by Peter Hart, Nils Nilsson and Bertram Raphael of Stanford Research
Institute.
 Similar to Best First Search, it uses two lists of nodes:
1. OPEN: nodes that have been generated but have not yet been examined.
2. CLOSED: nodes that have already been examined.
 Whenever a new node is generated, check whether it has been generated before.
 It uses a heuristic evaluation function, f(n)
 f (n) is the approximate distance of a node, n, from a goal node.
 For two nodes m and n, if f(m) < f(n), then m is more likely to be on an optimal path
 f(n) may not be 100% accurate, but it should give better results than pure guesswork.
 f’(n) = g(n)+ h’(n)
Where,
g(n) = cost from the initial state to the current state n
h’(n) = estimated cost from node n to a goal node

 Algorithm
1. Start with OPEN containing only the initial node; calculate its f’.
2. Until the goal node is found, repeat the following procedure:
a) If there are no nodes in OPEN, report failure. Otherwise, pick the node from
OPEN having the lowest f’ and treat it as the BEST node.
b) Remove it from OPEN and put it on CLOSED. If the BEST node is the goal
node, then exit. Otherwise, generate the successors of the BEST node. For each
successor do the following:
i. Calculate its f’.
ii. Check whether the successor is on OPEN; if it is, call
that node OLD.
iii. If the successor is not on OPEN, check whether it is on CLOSED; in
that case, call that node CLOSED-OLD.
iv. If the successor was on neither OPEN nor CLOSED, put it on OPEN
and calculate its f’.
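The following sketch captures the core of A* (Python is assumed; the weighted graph and h' values are made-up examples). Instead of explicit OLD / CLOSED-OLD nodes, it keeps the best known g for each node, which serves the same purpose of switching a node's parent when a cheaper path is found:

```python
import heapq

# Hypothetical weighted graph and h' estimates (assumed values).
GRAPH = {
    'S': [('A', 1), ('B', 4)],
    'A': [('B', 2), ('C', 5), ('G', 12)],
    'B': [('C', 2)],
    'C': [('G', 3)],
    'G': [],
}
H = {'S': 7, 'A': 6, 'B': 4, 'C': 2, 'G': 0}

def a_star(start, goal):
    """A*: always expand the OPEN node with the lowest f' = g + h'."""
    open_list = [(H[start], 0, start, [start])]   # (f', g, node, path)
    best_g = {start: 0}                  # cheapest g found so far (the OLD-node check)
    while open_list:
        f, g, node, path = heapq.heappop(open_list)
        if node == goal:
            return path, g
        if g > best_g.get(node, float('inf')):
            continue                     # a better path to this node was already found
        for succ, cost in GRAPH[node]:
            new_g = g + cost
            if new_g < best_g.get(succ, float('inf')):   # keep the cheaper parent
                best_g[succ] = new_g
                heapq.heappush(open_list,
                               (new_g + H[succ], new_g, succ, path + [succ]))
    return None

print(a_star('S', 'G'))   # -> (['S', 'A', 'B', 'C', 'G'], 8)
```

In this example the direct arc A→G (cost 12) is discovered first, but the cheaper route through B and C later lowers G's g value, illustrating the parent-switching step of the algorithm.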
 Example 1: 8 puzzle problem using A* Algorithm

 Example 2: 8 puzzle problem using A*Algorithm


 Video Tutorial on 8 puzzle problem using A* algorithm:

https://www.youtube.com/watch?v=wJu3IZq1NFs&t=333s

Q. Write A* Algorithm with example. (KBCNMU December 2019 Examination)

AO* Algorithm
 AO* algorithm was proposed by Nilsson in 1980.
 Rather than using the two lists, OPEN and CLOSED, that were used in the A* algorithm,
the AO* algorithm uses a single structure, GRAPH G, representing the part of the search
graph that has been explicitly generated so far.
 Each node in the graph points down to its immediate successors and up to its
immediate predecessors.
 Each node also has an associated h’ value.
 Algorithm:
1. Let GRAPH G consist only of the node representing the initial state. (Call this
node INIT.) Compute h' (INIT).
2. Until INIT is labelled SOLVED or h’ (INIT) becomes greater than FUTILITY,
repeat the following procedure:
a) Trace the labelled arcs from INIT and select an unexpanded node on this
path. Call this node NODE.
b) Generate the successors of NODE. If there are no successors, then
assign FUTILITY as the h' value of NODE. This means that NODE is not
solvable. If there are successors, then for each one do the following:
i. Add SUCCESSOR to GRAPH G.
ii. If SUCCESSOR is a terminal node, mark it SOLVED and assign
zero to its h' value.
iii. If SUCCESSOR is not a terminal node, compute its h' value.
c) Propagate the newly discovered information up the graph by doing the
following:
Let S be the set of nodes that have been marked SOLVED or whose h'
values have changed. Initialize S to NODE. Until S is empty, repeat the
following procedure:
i. Select a node from S, call it CURRENT, and remove it from S.
ii. Compute the cost of each of the arcs emerging from CURRENT.
Assign the minimum of these costs as the new h' of CURRENT.
iii. Mark the minimum-cost path as the best path out of CURRENT.
iv. Mark CURRENT SOLVED if all of the nodes connected to it
through the newly marked arc have been labelled SOLVED.
v. If CURRENT has been marked SOLVED or its h' has just
changed, its new status must be propagated backwards up the
graph. Hence, add all the ancestors of CURRENT to S.
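The full AO* bookkeeping (marked arcs, SOLVED labels, backward propagation) is elaborate, but its core cost rule can be shown in a few lines. The sketch below (Python is assumed; the AND-OR graph and the unit edge cost are made-up examples) computes the revised cost of a node on a small acyclic AND-OR graph: an OR choice takes the minimum over its arcs, and an AND arc sums the costs of all its successors:

```python
# A toy acyclic AND-OR graph (assumed example). Each node maps to a list of
# arcs; an arc is a tuple of successor nodes (length > 1 means an AND arc).
# Terminal nodes (no outgoing arcs) are solved with cost 0.
AND_OR = {
    'A': [('B',), ('C', 'D')],   # A can be solved via B, or via C AND D
    'B': [('E', 'F')],           # B needs both E and F
    'C': [], 'D': [], 'E': [], 'F': [],
}
EDGE_COST = 1                    # assume every individual edge costs 1

def solve_cost(node, graph):
    """Revised cost of a node: minimum over its arcs of the sum, across the
    arc's successors, of (edge cost + successor's cost). Terminals cost 0."""
    arcs = graph[node]
    if not arcs:
        return 0
    return min(sum(EDGE_COST + solve_cost(s, graph) for s in arc)
               for arc in arcs)

print(solve_cost('A', AND_OR))   # -> 2 (the AND arc C,D beats the path via B)
```

Here solving A via B costs 1 + (1 + 1) = 3, while the AND arc to C and D costs 1 + 1 = 2, so AO*'s propagation step would mark the C,D arc as the best path out of A.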

 Video Tutorial on AO* algorithm with example:

https://www.youtube.com/watch?v=PhRayhkbJCo

Means – Ends Analysis


 Most search strategies take either a forward or a backward approach; however,
often a mixture of the two directions is appropriate.
 Such a mixed strategy makes it possible to solve the major parts of a problem first
and to go back and solve the smaller parts later. Such a technique is called "Means - Ends Analysis".
 The means - ends analysis process centres around finding the difference between the
current state and the goal state.
 The problem space of means - ends analysis consists of the following:
1. An initial state
2. One or more goal states
3. A set of operators, with a set of preconditions and results
4. A difference function that computes the difference between two states, i.e. the
current state and the goal state.

 Means- Ends Analysis is useful for many human planning activities.


 Example: Problem for a Household Robot (Means - Ends Analysis)

 Here the goal is to move a desk with two things on it from one room to another.
 The main difference between the start state and the goal state is the location of the desk.

Difference Table

 The following table shows the available set of operators with their Pre-Conditions and
Results
 Algorithm
1. Compare CURRENT to GOAL. If there are no differences, return success.
2. Otherwise, select the most important difference and reduce it by doing the
following until success or failure is indicated:
a) Select an as-yet-untried operator O that is applicable to the current
difference. If there are no such operators, then signal failure.
b) Attempt to apply O to the current state. Generate descriptions of two
states: O-START, a state in which O’s preconditions are satisfied, and O-
RESULT, the state that would result if O were applied in O-START.
c) If FIRST-PART (MEA applied to CURRENT and O-START) and LAST-PART
(MEA applied to O-RESULT and GOAL) are both successful, then signal success.
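A toy sketch of the recursion above (Python is assumed; the fact names and the two household-robot operators are illustrative assumptions, not the textbook's difference table). States are sets of facts, and each operator has preconditions and an add/delete list; the FIRST-PART call satisfies preconditions before the operator is applied, and the LAST-PART call finishes the remaining differences:

```python
# Hypothetical operators for the household-robot flavour of problem.
OPERATORS = {
    'PUSH(desk)':  {'pre': {'desk-clear'}, 'add': {'desk-moved'}, 'del': set()},
    'CLEAR(desk)': {'pre': set(),          'add': {'desk-clear'}, 'del': set()},
}

def mea(current, goal):
    """Reduce the most important difference between current and goal,
    recursively satisfying operator preconditions first. Returns a plan
    (list of operator names) or None on failure."""
    diff = goal - current
    if not diff:
        return []                                     # step 1: no differences
    for name, op in OPERATORS.items():                # step 2a: a relevant operator
        if op['add'] & diff:
            plan = mea(current, current | op['pre'])  # FIRST-PART: reach O-START
            if plan is None:
                continue
            # O-RESULT (simplified: assume the plan only adds the preconditions)
            state = (current | op['pre'] | op['add']) - op['del']
            rest = mea(state, goal)                   # LAST-PART: O-RESULT to GOAL
            if rest is not None:
                return plan + [name] + rest
    return None                                       # step 2: failure

print(mea(set(), {'desk-moved'}))   # -> ['CLEAR(desk)', 'PUSH(desk)']
```

PUSH(desk) reduces the main difference (location), but its precondition introduces a smaller sub-problem (clearing the desk) that the recursion solves first, which is the essence of means - ends analysis.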
UNIT – II
KNOWLEDGE ENGINEERING
Contents:
 Knowledge Representation Issues
 Knowledge Representation using Predicate Logic
 Knowledge Representation using Rules
 Weak and Strong Filler Structure for Knowledge Representation
o Semantic Net
o Frames
o Script
o Conceptual Dependency

Knowledge Representation Issues


 The fundamental goal of Knowledge Representation is to facilitate drawing
inferences (conclusions) from knowledge.
 Many issues arise while using KR techniques. Some of these are explained below.

1. Important Attributes:
 Are there any attributes of objects so basic that they occur in almost every
problem domain?
2. Relationships among attributes:
 Are there any important relationships that exist among object attributes?
3. Choosing Granularity:
 At what level of detail should the knowledge be represented?
4. Sets of objects:
 How should sets of objects be represented?
5. Finding the Right Structure:
 Given a large amount of knowledge stored, how can the relevant parts be
accessed?

1. Important Attributes :
 There are two attributes, "instance" and "isa", that are of general significance;
they typically appear in hierarchical structures.
 These attributes are important because they support property inheritance and
play an important role in knowledge representation.
 The way the attributes "instance" and "isa" are logically expressed is:
 Example: A simple sentence like "Joe is a musician“

Here "is a" (called IsA) is a way of expressing what logically is called a
class-instance relationship between the subjects represented by the terms "Joe" and
"musician".

"Joe" is an instance of the class of things called "musician".

"Joe" plays the role of instance,

"Musician" plays the role of class in that sentence.

Here,

[Joe] IsA [Musician]

i.e., [Instance] IsA [Class]
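Property inheritance through "instance" and "isa" links can be sketched in a few lines of Python. This is an assumed illustration (the classes "person" and the attributes "plays"/"legs" are made up for the example): a lookup walks from an instance up the isa chain until the attribute is found.

```python
# Hypothetical sketch of "instance" / "isa" property inheritance.
isa = {"musician": "person"}              # class -> superclass
instance = {"Joe": "musician"}            # instance -> class
properties = {
    "person":   {"legs": 2},
    "musician": {"plays": "music"},
}

def lookup(obj, attr):
    """Walk from an instance up the isa chain until attr is found."""
    cls = instance.get(obj, obj)
    while cls is not None:
        if attr in properties.get(cls, {}):
            return properties[cls][attr]
        cls = isa.get(cls)
    return None

print(lookup("Joe", "plays"))  # music (from the class musician)
print(lookup("Joe", "legs"))   # 2 (inherited via isa from person)
```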

2. Relationship among Attributes

 The attributes used to describe objects are themselves entities that we represent.
 The relationships among the attributes of an object, independent of the specific
knowledge they encode, may hold properties like:
a. Inverses
b. An isa hierarchy of attributes
c. Techniques for reasoning about values
d. Single-valued attributes

a. Inverses:
 This is about a consistency check while a value is added to one attribute. Entities
are related to each other in many different ways. The figure shows attributes
(isa, instance, and team), each with a directed arrow, originating at the object
being described and terminating either at another object or at its value.
There are two ways of realizing this:
 First, represent both directions of a relationship in a single representation; e.g., a
logical representation, team(Pee-Wee-Reese, Brooklyn-Dodgers), that can be
interpreted as a statement about either Pee-Wee-Reese or the Brooklyn-Dodgers.
 Second, use attributes that focus on a single entity, but use them in pairs, one the
inverse of the other; e.g., team = Brooklyn-Dodgers attached to Pee-Wee-Reese,
and its inverse team-members = Pee-Wee-Reese attached to Brooklyn-Dodgers.
 This second approach is followed in semantic net and frame-based systems,
accompanied by a knowledge acquisition tool that guarantees the consistency
of inverse slots by checking, each time a value is added to one attribute, that the
corresponding value is added to the inverse.
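The inverse-slot consistency check can be made concrete with a small sketch. The dictionary names below (team, team_members) are assumptions chosen to match the baseball example: every update to the attribute automatically updates its inverse.

```python
# Sketch: an attribute and its inverse kept consistent on every update.
team = {}           # player -> team        (the attribute)
team_members = {}   # team   -> {players}   (its inverse)

def set_team(player, club):
    """Set a player's team; the inverse slot is updated in the same step."""
    if player in team:                            # moving clubs: remove old link
        team_members[team[player]].discard(player)
    team[player] = club
    team_members.setdefault(club, set()).add(player)

set_team("Pee-Wee-Reese", "Brooklyn-Dodgers")
print(team_members["Brooklyn-Dodgers"])  # {'Pee-Wee-Reese'}
```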

b. An IsA hierarchy of Attributes


 Just like there are classes of objects and specialized subset of classes,
similarly, there are attributes and specialization of the attributes.
 For e.g. the attribute height is a specialized attribute of the general attribute
physical – size.
 This specialization – generalization relationship among the attributes is
important for the purpose of inheritance.

c. Techniques for reasoning about values


 Sometimes, values of the attributes are explicitly specified when the
knowledge base is created.
 But there are also values of the attributes that are not given explicitly.
 This decision-making process, of deciding whether the values of the attributes
are to be specified explicitly or derived when needed, is known as a reasoning system.
 The following different types of information play a major role in this
reasoning system:
o Information about the type of the value. For e.g. the value of the attribute
height must be a number measured in units of length.
o Constraints on the value. For e.g. the age of a person cannot be greater than
the age of his / her parents.
o Rules for computing the value when it is needed. For e.g. in the above
figure, the value of the attribute bats will be computed only if it is
needed.
o Rules that describe the action to be taken if the value of the attribute
ever becomes known.

d. Single Valued Attributes


 A specific but very useful kind of attribute is one that is guaranteed to
take a unique value. For e.g. a baseball player can, at one time, have only a single
height and can be a member of only one team. So the attributes height and
team are called single-valued attributes.

3. Choosing the Granularity


 While deciding the granularity of representation, it is necessary to know the
following:
a) What are the primitives and at what level should the knowledge be
represented?
b) What should be the number (small or large) of low-level primitives or
high-level facts?
 High-level facts may be insufficient to draw conclusions, while low-level
primitives may require a lot of storage.
 Example: Suppose that we are interested in following facts:
John spotted Alex.
 Now, this could be represented as
"Spotted (agent (John), object (Alex))"
 Such a representation can make it easy to answer questions such as:
Who spotted Alex?
 Suppose we want to know:
"Did John see Alex?"
 Given only the one fact above, the system cannot discover that answer.
 Hence, we can add other facts, such as the rule
"Spotted (x, y) → saw (x, y)"
4. Sets of Objects
 Certain properties of objects are true of them as members of a set, but not
as individuals.
 Example: Consider the assertions made in the sentences
There are more sheep than people in Australia,
and
English speakers can be found all over the world.
 The only way to describe these facts is to attach the assertions to the sets
representing People, Sheep, and English speakers.
 The reason to represent sets of objects is that if a property is true for all or most
elements of a set, then it is more efficient to associate it once with the set rather
than to associate it explicitly with every element of the set.
 This is done in different ways :
 In logical representation through the use of universal quantifier, and
 In hierarchical structure where node represents sets, the
inheritance propagates set level assertion down to individual.

5. Finding the right Structure


 Finding the right structure means accessing the structure that correctly describes a
particular situation.
 This requires selecting an initial structure and then revising the choice. While doing
so, it is necessary to solve the following problems:
a) How to perform an initial selection of the most appropriate structure.
b) How to fill in appropriate details from the current situation.
c) How to find a better structure if the one chosen initially turns out not to be
appropriate.
d) What to do if none of the available structures is appropriate.

Q. Discuss knowledge representation issues in detail (KBCNMU December 2019 Examination)

Knowledge Representation using Predicate Logic


 Predicate logic is one of the knowledge representation schemes that satisfy the
requirements of any language.
 Predicate logic is powerful enough for expression and reasoning.
 Every complete sentence contains two parts:
1. Subject - The subject is what (or whom) the sentence is about.
2. Predicate - The predicate tells something about the subject
 Example: Judy {runs}
 Here Judy is the Subject and runs is the Predicate.
 The predicate always includes a verb and tells something about the subject.
 A predicate is a verb phrase template that describes a property of objects, or a relation
among objects represented by the variables.
 Example:
“The car Tom is driving is blue"
"The sky is blue"
"The cover of this book is blue"
 The predicate "is blue" describes a property.
 Predicates can be given names;
 Let 'B' be the name for the predicate "is_blue".
 The sentence is then represented as "B(x)", read as "x is blue", where 'x' represents an
arbitrary object.
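The idea of a named predicate can be sketched directly in code: B(x) is just a Boolean-valued function over objects. The set of blue things below is an assumption made for illustration.

```python
# Sketch: the predicate B(x), read "x is blue", as a Boolean function.
blue_things = {"sky", "car", "book-cover"}   # assumed domain knowledge

def B(x):
    """B(x) reads 'x is blue'."""
    return x in blue_things

print(B("sky"))    # True
print(B("grass"))  # False
```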

Knowledge Representation using Rules



Another popular approach to knowledge representation is to use production
rules, sometimes called IF-THEN rules.
 Production rules are a simple but powerful form of knowledge representation,
providing the flexibility of combining declarative and procedural representation
in a unified form.
 Examples of production rules :
− IF condition THEN action
− IF premise THEN conclusion
− IF proposition p1 and proposition p2 are true THEN proposition p3
is true
 The advantages of production rules :
o They are modular.
o Each rule defines a small and independent piece of knowledge.
o New rules may be added and old ones deleted.
o Rules are usually independent of other rules.
 The production rules as knowledge representation mechanism are used in the design
of many "Rule-based systems” also called "Production systems" .
 Reasoning/Chaining
 A rule-based system architecture consists of a set of rules, a set of facts, and an
inference engine. The need is to find what new facts can be derived.
 It is the process to solve the problem from initial state to goal state.
 Given a set of rules, there are essentially two ways to generate new
knowledge:
1. Forward chaining
2. Backward chaining.
1. Forward chaining :
 It is also called data driven.
 It starts with the facts, and sees what rules apply.
 It starts from initial to goal state.
 Condition rules represent actions to be taken when specified facts
occur in working memory.
 Typically, actions involve adding or deleting facts from the working
memory.

Fig. Forward chaining: the user supplies facts to the working memory; the
inference engine matches rule conditions against these facts and adds the
derived facts back to the working memory.

2. Backward chaining :
 It is also called goal driven.
 It starts with something to find out, and looks for rules that will help in
answering it.
 It starts from Goal back to Initial state
 Backward chaining means reasoning from goals back to facts.
 The idea is to focus on the search.
 Rules and facts are processed using backward chaining interpreter.
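The forward-chaining loop described above can be sketched in a few lines of Python. The frog rules are illustrative assumptions: rules whose premises are all present in working memory fire and add their conclusion, until no new facts appear.

```python
# Hedged sketch of forward chaining (data-driven reasoning).
rules = [
    ({"croaks", "eats flies"}, "is a frog"),   # IF premises THEN conclusion
    ({"is a frog"}, "is green"),
]
facts = {"croaks", "eats flies"}               # the working memory

changed = True
while changed:                                 # repeat until no rule fires
    changed = False
    for premises, conclusion in rules:
        if premises <= facts and conclusion not in facts:
            facts.add(conclusion)              # fire the rule
            changed = True

print(sorted(facts))
```

A backward-chaining interpreter would instead start from a goal such as "is green" and search for rules whose conclusion matches it, recursively trying to establish their premises.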
Weak and Strong Filler Structure for Knowledge Representation

1. Semantic Nets
 "Semantic Nets" were first invented for computers by Richard H. Richens of
the Cambridge Language Research Unit in 1956
 A semantic network, or frame network, is a network that represents semantic relations
between concepts.
 This is often used as a form of knowledge representation.
 It is a directed or undirected graph consisting of vertices, which represent concepts,
and edges, which represent semantic relations between concepts.
 It is used to analyse the meaning of words within a sentence.
 It is graphically shown in the form of directed graph consisting of nodes and arcs.
 The nodes represent objects and arcs represent links or edges.
 Semantic networks are an alternative to predicate logic as a form of knowledge
representation.
 The idea is that we can store our knowledge in the form of a graph, with nodes
representing objects in the world, and arcs representing relationships between those
objects.
 Example: Consider a semantic net representing the following data:
o Tom is a cat.
o Tom caught a bird.
o Tom is owned by John.
o Tom is ginger in colour.
o Cats like cream.
o The cat sat on the mat.
o A cat is a mammal.
o A bird is an animal.
o All mammals are animals.
o Mammals have fur
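One simple way to encode such a net in code is as a set of labelled edges, with an isa chain supporting inherited queries. This is a hedged sketch using a few of the facts above; the relation names (isa, have, colour) are assumptions for illustration.

```python
# Sketch: part of the semantic net as labelled (node, relation) -> value edges.
edges = {
    ("Tom", "isa"): "cat",
    ("cat", "isa"): "mammal",
    ("mammal", "isa"): "animal",
    ("mammal", "have"): "fur",
    ("Tom", "colour"): "ginger",
}

def query(node, relation):
    """Follow isa links upward until the relation is found."""
    while node is not None:
        if (node, relation) in edges:
            return edges[(node, relation)]
        node = edges.get((node, "isa"))   # climb the hierarchy
    return None

print(query("Tom", "have"))    # fur -- inherited from mammal
print(query("Tom", "colour"))  # ginger -- stored directly on Tom
```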

 Advantages of Semantic Nets


1. Semantic networks are a natural representation of knowledge.
2. Semantic networks convey meaning in a transparent manner.
3. These networks are simple and easily understandable.

 Disadvantages of Semantic Nets


1. Semantic networks take more computational time at runtime, as we may need to
traverse the network to answer a question. In the worst case, we may traverse
the entire network and find that the solution does not exist in it.
2. Semantic networks try to model human-like memory (which has about 10^15
neurons and links) to store information, but in practice it is not possible to
build such a vast semantic network.
3. These types of representations are inadequate as they do not have any equivalent
quantifier, e.g., for all, for some, none, etc.
4. Semantic networks do not have any standard definition for the link names.
5. These networks are not intelligent and depend on the creator of the system.

Q. Write a short note on Semantic Nets (KBCNMU December 2019 Examination)

2. Frames
 Frames were proposed by Marvin Minsky in his 1974 article "A Framework for
Representing Knowledge".
 Frames were originally derived from semantic networks and are therefore part of
structure based knowledge representations.
 Frame is a collection of attributes and associated values that describe some entity in
the world.
 A frame is a record like structure which consists of a collection of attributes and its
values to describe an entity in the world.
 Frames are an AI data structure which divides knowledge into substructures by
representing stereotyped situations.
 It consists of a collection of slots and slot values. These slots may be of any type and
sizes. Slots have names and values which are called facets.
 Facets: The various aspects of a slot is known as Facets.
 Facets are features of frames which enable us to put constraints on the frames.
 A frame may consist of any number of slots, and a slot may include any number of
facets and facets may have any number of values.
 A frame is also known as a slot-filler knowledge representation in artificial
intelligence.
 Frames are derived from semantic networks and later evolved into our modern-day
classes and objects.
 A single frame is not much useful. Frames system consists of a collection of frames
which are connected.
 In the frame, knowledge about an object or event can be stored together in the
knowledge base.
 The frame is a type of technology which is widely used in various applications
including Natural language processing and machine visions.
 Example
 Let's suppose we are taking an entity, Peter. Peter is an engineer by profession,
his age is 25, he lives in the city London, and the country is England. Following is the
frame representation for this:

Slot Filler

Name Peter

Profession Engineer

Age 25

City London

Country England

Marital status Single

Weight 78
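A minimal frame system can be sketched as dictionaries of slots, with an isa slot providing default inheritance between frames. The frame names and slot values here are assumptions extending the Peter example.

```python
# Sketch: frames as slot dictionaries; "isa" links give default inheritance.
frames = {
    "person":   {"isa": None,       "slots": {"legs": 2}},
    "engineer": {"isa": "person",   "slots": {"profession": "Engineer"}},
    "Peter":    {"isa": "engineer",
                 "slots": {"name": "Peter", "age": 25, "city": "London"}},
}

def get_slot(frame, slot):
    """Return a slot's filler, searching up the isa chain for defaults."""
    while frame is not None:
        if slot in frames[frame]["slots"]:
            return frames[frame]["slots"][slot]
        frame = frames[frame]["isa"]
    return None

print(get_slot("Peter", "profession"))  # Engineer -- inherited from engineer
print(get_slot("Peter", "legs"))        # 2 -- inherited from person
```

This is essentially how frames evolved into modern classes and objects: slots become fields, and isa becomes subclassing.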
 Advantages of frame representation:
1. The frame knowledge representation makes the programming easier by grouping the
related data.
2. The frame representation is comparably flexible and used by many applications in AI.
3. It is very easy to add slots for new attribute and relations.
4. It is easy to include default data and to search for missing values.
5. Frame representation is easy to understand and visualize.

 Disadvantages of frame representation:


1. In a frame system, the inference mechanism cannot be easily processed.
2. The inference mechanism cannot proceed smoothly with frame representation.
3. Frame representation has a much generalized approach.

Q. Write a short note on Frames (KBCNMU December 2019 Examination)

3. Conceptual Dependency(CD)
 CD theory was developed by Schank in 1973 to 1975 to represent the meaning of
Natural Language sentences.
 It helps in drawing inferences
 It is independent of the language
 CD representation of a sentence is not built using words in the sentence rather built
using conceptual primitives which give the intended meanings of words.
 CD provides structures and specific set of primitives from which representation can
be built.
 Conceptual dependency (CD) is a theory of natural language processing which mainly
deals with representation of semantics of a language.
 It helps to construct computer programs which can understand natural language.
 It helps to make inferences from the statements and also to identify conditions in
which two sentences can have similar meaning,
 It provide facilities for the system to take part in dialogues and answer questions,
 To provide a means of representation which are language independent.
 Knowledge is represented in CD by elements called conceptual structures.
 The basis of CD representation is that two sentences with identical meaning
must have a single representation, and implicitly packed information must be
stated explicitly.
 In order that knowledge is represented in CD form, certain primitive actions have
been developed.
Table: Primitive Acts of CD

Symbol Meaning Example

ATRANS transfer a relationship give

PTRANS transfer physical location of an object go

PROPEL apply physical force to an object push

MOVE move body part by owner kick

GRASP grab an object by an actor grasp

INGEST ingest an object by an animal eat

EXPEL expel from an animal’s body Cry

MTRANS transfer mental information Tell

MBUILD mentally make new information Decide

CONC conceptualize or think about an idea think

SPEAK produce sound say

ATTEND focus sense organ listen

 Six primitive conceptual categories provide building blocks which are the
set of allowable dependencies in the concepts in a sentence:
 PP -- Real world objects.
 ACT -- Real world actions.
 PA -- Attributes of objects.
 AA -- Attributes of actions.
 T -- Times.
 LOC -- Locations.

 Few conventions:
o Arrows indicate directions of dependency
o Double arrow indicates two way links between actor and action.
o O – for the object case relation
o R – for the recipient case relation
o P – for past tense
o D – destination
 The triple arrow is also a two-way link, but between an object (PP) and its
attribute (PA), i.e. PP ⇔ PA.
 It represents isa-type dependencies. E.g.
 Dave ⇔ lecturer
o Dave is a lecturer.

 Examples of Conceptual Dependency

1. "Jenny cried":
Jenny ⇔ EXPEL (p) —o→ tears, —d→ from eyes (poss-by Jenny) to an unknown
destination.
2. "Mike went to India":
Mike ⇔ PTRANS (p), —d→ to India from an unknown source.
3. "Mary read a novel":
Mary ⇔ MTRANS (p) —o→ info, —d→ from the novel to CP (Mary), with the
instrument:
Mary ⇔ ATTEND (p) —o→ eyes, —d→ to the novel.
Q. Explain Conceptual Dependency with various primitives and show conceptual dependency
relation for the following- 1. Seema is a teacher 2. A nice flower (KBCNMU December 2019
Examination)

4. Script
 Script was developed by Schank and Abelson, 1977
 A script is a structured representation describing a stereotyped sequence of events in a
particular context.
 Scripts are used in natural language understanding systems to organize a knowledge
base in terms of the situations that the system should understand
 A script is a structure that prescribes a set of circumstances which could be expected
to follow on from one another.
 It is similar to a thought sequence or a chain of situations which could be anticipated.
 It could be considered to consist of a number of slots or frames but with more
specialised roles.
 Scripts are beneficial because:
 Events tend to occur in known runs or patterns.
 Causal relationships between events exist.
 Entry conditions exist which allow an event to take place
 Prerequisites exist for events taking place, e.g. when a student progresses
through a degree scheme or when a purchaser buys a house.

 The components of a script include:


 Entry Conditions: These must be satisfied before events in the script can
occur.
 Results: Conditions that will be true after events in script occur.
 Props: Slots representing objects involved in events.
 Roles: Persons involved in the events.
 Track: Variations on the script. Different tracks may share components of the
same script.
 Scenes: The sequence of events that occur. Events are represented
in conceptual dependency form
 Example: Script for Bank Robbery
 Example: Script for Restaurant
Q. What is Script? Write Script for Banking with components(KBCNMU December 2019
Examination)
UNIT – III
Game Playing and Planning
Contents:
 Min- Max Search Algorithm
 Alpha – Beta Pruning
 Min- Max Search with Additional Refinements
 Overview of Planning and types
 Goal Stack Planning
o Block World Problem
o STRIPS
 Nonlinear, Hierarchical and other Planning Techniques

Min – Max Search Algorithm


 Min- Max algorithm is a recursive or backtracking algorithm which is used in
decision-making and game theory.
 It provides an optimal move for the player assuming that opponent is also playing
optimally.
 Min - Max algorithm uses recursion to search through the game-tree.
 Min - Max algorithm is mostly used for game playing in AI, such as Chess,
Checkers, Tic-Tac-Toe, Go, and various other two-player games.
 In this algorithm two players play the game; one is called MAX and the other is
called MIN.
 Each player plays so that the opponent gets the minimum benefit while they
themselves get the maximum benefit.
 Both players of the game are opponents of each other, where MAX will select the
maximized value and MIN will select the minimized value.
 This algorithm performs a depth-first search algorithm for the exploration of the
complete game tree.
 This algorithm proceeds all the way down to the terminal nodes of the tree, then
backtracks up the tree as the recursion unwinds.

 Example of Min-Max Algorithm

o The working of the Min – Max algorithm can be easily described using an
example.
o Below we have taken an example of game-tree which is representing the two-
player game.
o In this example, there are two players one is called Maximizer and other is
called Minimizer.
o Maximizer will try to get the Maximum possible score, and Minimizer will
try to get the Minimum possible score.
o This algorithm applies DFS, so in this game-tree, we have to go all the way
through the leaves to reach the terminal nodes.
o At the terminal node, the terminal values are given so we will compare those
values and back track the tree until the initial state occurs.
o Following are the main steps involved in solving the two-player game tree:
o Step 1:-
 In the first step, the algorithm generates the entire game-tree and apply
the utility function to get the utility values for the terminal states.
 In the below tree diagram, let's take A is the initial state of the tree.
 Suppose the Maximizer takes the first turn, which has a worst-case
initial value of -infinity, and the Minimizer takes the next turn, which has a
worst-case initial value of +infinity.

o Step 2:-
 Now, first we find the utilities value for the Maximizer, its initial value
is -∞, so we will compare each value in terminal state with initial value
of Maximizer and determines the higher nodes values.
 It will find the maximum among the all.
 For node D: max(-1, -∞) => max(-1, 4) = 4
 For node E: max(2, -∞) => max(2, 6) = 6
 For node F: max(-3, -∞) => max(-3, -5) = -3
 For node G: max(0, -∞) => max(0, 7) = 7

o Step 3:-
 In the next step, it's a turn for Minimizer, so it will compare all nodes
value with +∞, and will find the 3rd layer node values.
 For node B= min(4,6) = 4
 For node C= min (-3, 7) = -3
o Step 4:-
 Now it's a turn for Maximizer, and it will again choose the maximum
of all nodes value and find the maximum value for the root node.
 For node A max(4, -3)= 4
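The four steps above can be sketched as a short recursive Python function. The tree is encoded as nested lists with the same terminal values as the worked example (leaves [-1, 4] and [2, 6] under B; [-3, -5] and [0, 7] under C).

```python
# Sketch of the Min-Max algorithm on the worked example tree.
def minimax(node, maximizing):
    """Recursively evaluate a game tree of nested lists / terminal numbers."""
    if not isinstance(node, list):          # terminal node: return its utility
        return node
    values = [minimax(child, not maximizing) for child in node]
    return max(values) if maximizing else min(values)

# A (MAX) -> B, C (MIN) -> D, E, F, G (MAX) -> terminal values
tree = [[[-1, 4], [2, 6]], [[-3, -5], [0, 7]]]
print(minimax(tree, True))  # 4
```

Tracing it reproduces the steps: D=4, E=6, F=-3, G=7; then B=min(4,6)=4, C=min(-3,7)=-3; finally A=max(4,-3)=4.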

Q. Explain Min – Max Search Algorithm with suitable example (KBCNMU December 2019
Examination)
Alpha – Beta Pruning

 In Min – Max search, the time required for searching increases exponentially as
the tree gets deeper. Since it uses the DFS technique, all nodes / paths are
traversed whether or not they are promising.
 Alpha-beta pruning is a modified version of the Min - Max algorithm. It is an
optimization technique for the Min - Max algorithm.
 As we have seen, the number of game states the algorithm has to examine is
exponential in the depth of the tree.
 We cannot eliminate the exponent, but we can effectively cut it in half.
 Hence there is a technique by which, without checking each node of the game tree,
we can compute the correct decision; this technique is called Pruning.
 This involves two threshold parameters Alpha and Beta so it is called Alpha-Beta
Pruning.
 Alpha-beta pruning can be applied at any depth of a tree, and sometimes it not only
prunes the tree leaves but also entire sub-tree.
 The two-parameter can be defined as:
a) Alpha: The best (highest-value) choice we have found so far at any point along
the path of Maximizer. The initial value of alpha is -∞.
b) Beta: The best (lowest-value) choice we have found so far at any point along the
path of Minimizer. The initial value of beta is +∞.
 The Alpha-beta pruning returns the same move as the standard Min – Max algorithm
does, but it removes all the nodes which are not really affecting the final decision but
making algorithm slow. Hence by pruning these nodes, it makes the algorithm fast.
 Condition for Alpha – Beta Pruning

α>=β

 The Max player will only update the value of alpha.


 The Min player will only update the value of beta.

 Example
 Step 1: At the first step the, Max player will start first move from node A where
α= -∞ and β= +∞, these value of alpha and beta passed down to node B where
again α= -∞ and β= +∞, and Node B passes the same value to its child D.
 Step 2: At Node D, the value of α will be calculated as its turn for Max. The
value of α is compared with firstly 2 and then 3, and the max (2, 3) = 3 will be the
value of α at node D and node value will also 3.
 Step 3: Now algorithm backtracks to node B, where the value of β will change as
this is a turn of Min, Now β= +∞, will compare with the available subsequent
nodes value, i.e. min (∞, 3) = 3, hence at node B now α= -∞, and β= 3.
 Step 4: In the next step, algorithm traverses the next successor of Node B which is
node E, and the values of α= -∞, and β= 3 will also be passed. At node E, Max will
take its turn, and the value of alpha will change. The current value of alpha will be
compared with 5, so max (-∞, 5) = 5, hence at node E α= 5 and β= 3, where α>=β, so
the right successor of E will be pruned, and algorithm will not traverse it, and the
value at node E will be 5.
 Step 5:-At next step, algorithm again backtrack the tree, from node B to node A. At
node A, the value of alpha will be changed the maximum available value is 3 as max
(-∞, 3) = 3, and β= +∞, these two values now passes to right successor of A which is
Node C. At node C, α=3 and β= +∞, and the same values will be passed on to node F.
 Step 6: At node F, again the value of α will be compared with left child which is 0,
and max(3,0)= 3, and then compared with right child which is 1, and max(3,1)= 3 still
α remains 3, but the node value of F will become 1.

 Step 7:- Node F returns the node value 1 to node C, at C α= 3 and β= +∞, here
the value of beta will be changed, it will compare with 1 so min (∞, 1) = 1. Now
at C, α=3 and β= 1, and again it satisfies the condition α>=β, so the next child of
C which is G will be pruned, and the algorithm will not compute the entire sub-
tree G.
 Step 8:- C now returns the value of 1 to A here the best value for A is max (3, 1) =
3. Following is the final game tree which is the showing the nodes which are
computed and nodes which has never computed. Hence the optimal value for the
Maximizer is 3 for this example.
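The eight steps above can be sketched as a recursive function with the α and β bounds threaded through the calls. The tree matches the worked example (D = [2, 3], E's first child 5, F = [0, 1]); the values of the pruned leaves are arbitrary placeholders, since the algorithm never visits them.

```python
# Sketch of alpha-beta pruning on the worked example tree.
def alphabeta(node, alpha, beta, maximizing):
    if not isinstance(node, list):          # terminal node
        return node
    if maximizing:
        value = float("-inf")
        for child in node:
            value = max(value, alphabeta(child, alpha, beta, False))
            alpha = max(alpha, value)
            if alpha >= beta:               # cut-off: MIN will never allow this
                break
        return value
    else:
        value = float("inf")
        for child in node:
            value = min(value, alphabeta(child, alpha, beta, True))
            beta = min(beta, value)
            if alpha >= beta:               # cut-off: MAX will never allow this
                break
        return value

# A -> B, C -> D, E, F, G; 9 and [7, 5] are never examined (pruned)
tree = [[[2, 3], [5, 9]], [[0, 1], [7, 5]]]
print(alphabeta(tree, float("-inf"), float("inf"), True))  # 3
```

Running it reproduces the trace: at E the first child gives 5, so α = 5 ≥ β = 3 prunes E's second child; at C, F returns 1, so β = 1 ≤ α = 3 prunes G; the root value is 3, the same as plain Min-Max would return.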

Q. Explain Alpha – Beta Pruning with suitable example (KBCNMU December 2019 Examination)
Min – Max Search with Additional Refinements

1. Waiting for Quiescence


 One of the important factors that should be considered in determining when to
stop going deeper in the search tree is whether the situation is relatively stable.
 Consider the following example:

 If we consider the tree in fig 12.7, node B looks the most promising node. But if
we explore the tree up to one more level, our decision changes drastically. As shown
in fig 12.8, when node B is further explored, its value changes and B no longer
looks like the best node.
 To avoid such situations, we should continue the search until no drastic change occurs
from one level to the next. This is called Waiting for Quiescence.
2. Secondary Search
 It is always better to double check that a particular chosen move always proves to be
promising, irrespective of the depth of the tree.
 Suppose we explored a particular tree up to level 6 and based on it we decided to
choose a particular move.
 But while making this decision it is always better to double check our decision.
 Although it would be expensive to search the complete tree up to two more levels, i.e.
level 8, it is feasible to at least search a single branch of the tree up to an
additional two levels to make sure that our decision still looks good. This is called
Secondary Search.

3. Using the book moves


 For complicated games, sometimes it is not feasible to select the best move by simply
looking at the current game situation and extracting the correct move as per the rule
book of the game. But for some segments of the game this approach is reasonable.
 For example, in the game of chess, the opening move sequences and the closing game
sequences are always performed as per the rule book of the game. Such moves are
called the Book Moves of the game.
 The use of book moves in the opening sequence and in the closing sequence,
combined with the use of min – max search procedure in the mid game provides a
good example of the way knowledge and search can be combined to produce more
effective results.

4. Alternatives to Min – Max


 One important problematic aspect of Min – Max search is that it heavily relies on the
assumption that the opponent will always choose the optimal path.
 This assumption is acceptable in winning situations where a move that is guaranteed
to be good for us is to be found.
 However, in a losing situations, it is better to take a risk that opponent will make a
mistake.
 Suppose we have two moves, both of which, if the opponent plays correctly, will put
us in a losing situation. It’s just that one is slightly worse than the other.
 In this case, the Min – Max search will by default choose the one which is less bad.
 But what if we decide to choose the move which is the worst among the two and the
opponent makes a mistake?
 This single mistake of the opponent could lead us to a very good situation even if we
chose the worst move.
 A similar situation arises when one move appears to be slightly more advantageous
assuming that opponent plays perfectly.
 It would be better to choose the less advantageous move if it could lead to
significantly better situation, if the opponent makes a mistake.
 Thus, in some situations, Min – Max search stands on shaky theoretical ground:
there are cases where the deeper the search, the poorer the result obtained by
min – max.

Planning
 Planning is the process of computing several steps of a problem-solving procedure
before executing any of them.
 Planning plays a major role in Artificial Intelligence, because in AI we have to
exploit knowledge in the proper direction so that we do not reach dead ends
(halting positions).
 The task of finding sequences of actions in a state space where the states have logical
representations is called A.I. Planning.
 Types of Planning
1. Linear Planning:
In linear planning, we decompose a complex problem into a number of sub problems
(sub plans) in such a way that all the sub plans are isolated from each other
(logically separate).

Fig. Linear Planning (Initial State, independent subplans A1, A2, A3, Final State)

2. Non Linear Planning:


In non linear planning, we decompose a complex problem into a number of sub
problems (sub plans) in such a way that the sub plans depend on each other: the next
sub plan is executed only after the execution of the previous one.

Fig. Non Linear Planning (Initial State, dependent subplans A1 → A2 → A3, Final State)

3. Hierarchical Planning:
It is a combination of both linear and non linear planning. In this, we decompose the
task in such a way that some sub plans are dependent and some are independent.

Fig. Hierarchical Planning (subplans A1, A2, A3, A4)

4. Reactive system:
In this, the system gives a response to a particular action, so the execution of
the system depends on a particular signal.
e.g. Thermostat
5. Triangular Table Planning:
 It provides a way of recording the goals and the operator expected to satisfy
each of them.
 If something unexpected happens during execution of the plan, the table provides
the information required to patch that plan.

Goal Stack Planning


 Block World Problem

 In Block World Problem, there is a flat surface on which blocks can be placed.
 There are number of square blocks, all of the same size.
 They can be stacked one upon another.
 There is a robot that can manipulate the blocks.

 The actions it can perform include:


1. UNSTACK (A, B) – Pickup block A from its current position on Block B. To
perform this action, the robot arm must be empty and block A must not have
any block on it.
2. STACK (A, B) – Place block A on block B. To perform this action, the robot
arm must be holding block A and the surface of block B must be clear.
3. PICKUP (A) – Pickup block A from the table and hold it. To perform this
action, the robot arm must be empty and there must be nothing on the top of
block A.
4. PUTDOWN (A) – Put block A down on the table. To perform this action, the
robot arm must be holding block A.
 Note that the robot arm can hold only one block at a time. Also, all the blocks are
of equal size, so each block can have at most one block on top of it.
 Following are the important Predicates that are used to perform the above actions:
1. ON (A, B) – Block A is placed on block B.
2. ONTABLE (A) – Block A is on the table.
3. CLEAR (A) – There is nothing on top of block A.
4. HOLDING (A) – The arm is holding block A.
5. ARMEMPTY - The robot arm is holding nothing.
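One convenient way to work with these predicates, used here as a minimal sketch and not taken from the notes, is to represent a block world state as a set of predicate tuples and to check an action's preconditions as a subset test:

```python
# A block-world state as a set of predicate tuples (assumed representation).
state = {
    ("ON", "B", "C"),       # ON(B, C): block B is on block C
    ("ONTABLE", "C"),
    ("ONTABLE", "D"),
    ("CLEAR", "B"),
    ("CLEAR", "D"),
    ("ARMEMPTY",),          # the robot arm is holding nothing
}

def can_unstack(state, a, b):
    """UNSTACK(a, b) requires ON(a, b), CLEAR(a) and an empty arm."""
    return {("ON", a, b), ("CLEAR", a), ("ARMEMPTY",)} <= state

print(can_unstack(state, "B", "C"))  # True
print(can_unstack(state, "C", "B"))  # False
```

The `<=` operator tests that every required predicate is present in the current state, which is exactly what "the preconditions hold" means here.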

Q. State and Explain various predicates and actions used in block world problem (KBCNMU
December 2019 Examination)

 STRIPS
 STRIPS stands for "STanford Research Institute Problem Solver".
 It is one of the mechanisms to solve the block world problem.
 In this approach, each operation is described by three types of lists:
o PRECONDITION: This list contains those predicates that must be true for the
operator to be applied.
o ADD: This list contains the new predicates that the Operator causes to become
true.
o DELETE: This list contains old predicates that the Operator causes to become
false.
 In many cases, the PRECONDITION list is identical to the DELETE list.
 For example, to pick up a block, the PRECONDITION is that the robot arm must be
empty. As soon as the robot picks up the block, the arm is no longer empty.
 In this case the PRECONDITION and the DELETE lists will be the same.
 But there will be one more PRECONDITION for picking up a block that is the block
must not have other block on top of it.
 Now, this PRECONDITION will be true even after picking the block, because, even
after picking up the block, there will not be anything on the top of that block.
 This is the reason that PRECONDITION and DELETE list must be maintained
separately.
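The three lists can be written down directly as data. The sketch below, an assumed encoding rather than the book's notation, describes UNSTACK in STRIPS style and applies it to a state of predicate tuples:

```python
# STRIPS-style operator: PRECONDITION, DELETE and ADD lists of predicates.
def unstack(a, b):
    return {
        "precondition": {("ON", a, b), ("CLEAR", a), ("ARMEMPTY",)},
        "delete":       {("ON", a, b), ("ARMEMPTY",)},
        "add":          {("HOLDING", a), ("CLEAR", b)},
    }

def apply_op(state, op):
    """Apply an operator if its preconditions hold, else leave state unchanged."""
    if op["precondition"] <= state:
        return (state - op["delete"]) | op["add"]
    return state

state = {("ON", "B", "C"), ("CLEAR", "B"), ("ARMEMPTY",), ("ONTABLE", "C")}
state = apply_op(state, unstack("B", "C"))
print(("HOLDING", "B") in state)  # True
```

Notice that CLEAR(B) survives the action because it is a precondition but not in the DELETE list, which is precisely why the two lists must be maintained separately.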

 Solving Block World Problem using Goal Stack Planning


 First step is to push the goal into the stack.

 Next push the individual predicates of the goal into the stack.

 Now pop an element out from the stack


 The element is ON (B,D) which is a predicate and it is not true in our current world.
So the next step is to push the relevant action which could achieve the subgoal
ON(B,D) into the stack.

 Now again push the precondition of the action Stack (B,D) into the stack.

 POP an element out from the stack.


 After popping we see that CLEAR (D) is true in the current world model so we don’t
have to do anything.

 So again pop the stack,

 The popped element is HOLDING (B) which is a predicate and note that it is not true
in our current world. So we have to push the relevant action into the stack.
 Let’s push the action UNSTACK (B,C) into the stack.

 Now push the individual precondition of UNSTACK (B,C) into the stack.

 POP the stack. Note here that on popping we see that ON(B,C), CLEAR(B)
and ARMEMPTY are true in our current world. So don’t do anything.

 Now again pop the stack .


 When we do that we will get an action, so just apply the action to the current world
and add that action to plan list.

 Again pop an element. Now it is STACK(B,D), which is an action, so apply it to the
current state and add it to the PLAN. PLAN = { UNSTACK(B,C), STACK(B,D) }
 Now the stack will look like the one given below and our current world is like the one
above.

 Again pop the stack. The popped element is a predicate and it is not true in our current
world so push the relevant action into the stack.
 STACK(C,A) is pushed now into the stack and now push the individual preconditions
of the action into the stack.

 Now pop the stack. We will get CLEAR(A) and it is true in our current world so do
nothing. Next element that is popped is HOLDING(C) which is not true so push the
relevant action into the stack.

 In order to achieve HOLDING(C) we have to push the action PICKUP(C) and its
individual preconditions into the stack.
 Now on popping we get ONTABLE(C), which is true in our current world. Next,
CLEAR(C) is popped and that also is achieved. Then PICKUP(C) is popped, which is
an action, so apply it to the current world and add it to the PLAN. The world model
and stack will look like below,

PLAN = { UNSTACK(B,C), STACK(B,D), PICKUP(C) }

 Again POP the stack. We will get STACK(C,A), which is an action; apply it to the
world and insert it into the PLAN.
PLAN = { UNSTACK(B,C), STACK(B,D), PICKUP(C), STACK(C,A) }

 Now pop the stack we will get CLEAR(C) which is already achieved in our current

situation. So we don’t need to do anything. At last when we pop the element we will

get all the three subgoal which is true and our PLAN will contain all the necessary

actions to achieve the goal.


PLAN = { UNSTACK(B,C), STACK(B,D), PICKUP(C), STACK(C,A) }
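The stack-driven loop traced above can be written compactly. The operator table below is a simplified assumption: it lists only the four grounded operators this walkthrough needs, each as a STRIPS-style (precondition, delete, add) triple, and a full planner would also have to order conjunctive subgoals and detect goal interactions:

```python
# Goal-stack planning sketch for the walkthrough (simplified assumptions).
OPS = {
    ("UNSTACK", "B", "C"): ([("ON", "B", "C"), ("CLEAR", "B"), ("ARMEMPTY",)],
                            {("ON", "B", "C"), ("ARMEMPTY",)},
                            {("HOLDING", "B"), ("CLEAR", "C")}),
    ("STACK", "B", "D"):   ([("CLEAR", "D"), ("HOLDING", "B")],
                            {("CLEAR", "D"), ("HOLDING", "B")},
                            {("ON", "B", "D"), ("CLEAR", "B"), ("ARMEMPTY",)}),
    ("PICKUP", "C"):       ([("ONTABLE", "C"), ("CLEAR", "C"), ("ARMEMPTY",)],
                            {("ONTABLE", "C"), ("ARMEMPTY",)},
                            {("HOLDING", "C")}),
    ("STACK", "C", "A"):   ([("CLEAR", "A"), ("HOLDING", "C")],
                            {("CLEAR", "A"), ("HOLDING", "C")},
                            {("ON", "C", "A"), ("CLEAR", "C"), ("ARMEMPTY",)}),
}

def achiever(goal):
    # Pick an operator whose ADD list contains the goal predicate.
    for name, (_, _, add) in OPS.items():
        if goal in add:
            return name

def goal_stack_plan(state, goals):
    stack, plan = list(goals), []     # last goal in the list is popped first
    while stack:
        top = stack.pop()
        if top in OPS:                        # an action: apply and record it
            pre, delete, add = OPS[top]
            state = (state - delete) | add
            plan.append(top)
        elif top not in state:                # an unsatisfied predicate goal
            op = achiever(top)
            stack.append(op)                  # the action itself, with its
            stack.extend(OPS[op][0])          # preconditions on top of it
    return plan

initial = {("ON", "B", "C"), ("ONTABLE", "C"), ("ONTABLE", "A"), ("ONTABLE", "D"),
           ("CLEAR", "B"), ("CLEAR", "A"), ("CLEAR", "D"), ("ARMEMPTY",)}
plan = goal_stack_plan(initial, [("ON", "C", "A"), ("ON", "B", "D")])
print(plan)   # the four actions from the walkthrough, in order
```

Running it reproduces the plan derived by hand: UNSTACK(B,C), STACK(B,D), PICKUP(C), STACK(C,A).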

Q. Write various STRIPS style operators for block world problem with example (KBCNMU
December 2019 Examination)
UNIT – IV
Understanding NLP and Expert System
Contents:
 Natural Language Processing Steps
 Learning Techniques
 Introduction to Expert System
o Architecture of Expert System
o Expert System Shell
o Knowledge Acquisition in Expert System
 Understanding Constraint Satisfaction
o Waltz Algorithm
o Constraint Determination
o Trihedral Figure Labelling

Natural Language Processing Steps


 Natural Language Processing, usually shortened as NLP, is a branch of artificial
intelligence that deals with the interaction between computers and humans using
the natural language.
 Following are the important steps in NLP:
1. Morphological Analysis
2. Syntactic Analysis
3. Semantic Analysis
4. Discourse Integration
5. Pragmatic Analysis
1. Morphological Analysis
 Individual words are analysed into their components and nonword tokens such as
punctuation marks are separated from the words.
 For example, consider the sentence, “I want to print Bill’s .init file.”
 In the above sentence, morphological analysis will do two things:
 Pull apart the word “Bill’s” into its proper noun “Bill” and the possessive
suffix “s”
 Recognize the sequence “.init” as a file extension that is functioning as an
adjective in the sentence.
 In addition to this, it will also assign syntactic categories to all the word in the
sentence.

2. Syntactic Analysis
 Linear sequence of words is transformed into structures that show how the words
relate to each other.
 Some word sequences may be rejected if they violate the language’s rules for how
the words may be combined.
 For example, the English Syntactic Analyser would reject the sentence “Boy the go the
to store”.
 Syntactic Analysis exploits the results of morphological analysis to build a structural
description of a sentence.
 The goal of this process called “Parsing”, is to convert the flat list of words that form
a sentence into a structure that defines the units that are represented by that flat list of
words.
 The following figure shows the parse tree for the sentence “I want to print Bill’s .init
file.”

 Here RM denotes Reference Markers. Each reference markers corresponds to some


entity that has been mentioned in the sentence.
 These reference markers provide a place to accumulate information about the entities.
This information can be used in further steps of the process.

3. Semantic Analysis
 Here the structures created by syntactic analysis are assigned meanings.
 In other words, mapping is made between the syntactic structures and the objects in
the task domain.
 Semantic Analysis must do two things:
o It must map individual words into appropriate objects in the knowledge base
or database.
o It must create correct structures to correspond to the way the meanings of
individual words combine with each other.

4. Discourse Integration
 Consider the sentence “I want to print Bill’s .init file.”
 In this sentence, we do not exactly know, to whom, the pronoun “I” or the proper
noun “Bill” refers to.
 To pin down these references, we need a model of discourse integration from which
we can learn who the current user “I” is and who the person named “Bill” is.
 Once the current reference for Bill is known we can also find the reference to actual
file whose extension is .init.

5. Pragmatic Analysis
 Pragmatic Analysis is part of the process of extracting information from text.
 Specifically, it is the portion that focuses on taking a structured set of text and figuring
out what the actual meaning was.
 Pragmatic Analysis is very important with respect to extracting the context of the
information.
 Many times the context in which the sentence was said or written is very important.
So Pragmatic Analysis becomes a crucial part of NLP process.
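The first of these steps can be sketched in a few lines on the notes' own example sentence. This is a toy illustration under assumed rules (real systems use full morphological lexicons, not two regex-style checks):

```python
import re

# Toy morphological analysis: split possessive suffixes and tag tokens
# that look like file extensions (e.g. ".init"). Illustrative only.
def morphological_analysis(sentence):
    tokens = []
    for word in sentence.rstrip(".").split():
        if word.endswith("'s"):               # pull apart "Bill's" -> Bill + 's
            tokens += [word[:-2], "'s"]
        elif re.fullmatch(r"\.\w+", word):    # a file extension like ".init"
            tokens.append((word, "EXTENSION"))
        else:
            tokens.append(word)
    return tokens

print(morphological_analysis("I want to print Bill's .init file."))
```

The output separates "Bill" from its possessive suffix and tags ".init" as an extension, mirroring the two things the notes say morphological analysis must do for this sentence.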

Q. Explain NLP steps in detail (KBCNMU December 2019 Examination)

Learning Techniques
1. Rote Learning
 Rote learning is defined as the memorization of information based on repetition.
 The two best examples of rote learning are the alphabet and numbers.
 Slightly more complicated examples include multiplication tables and spelling words.
 At the high-school level, scientific elements and their chemical numbers must be
memorized by rote.
 And, many times, teachers use rote learning without even realizing they do so.
 Memorization isn’t the most effective way to learn, but it’s a method many students
and teachers still use.
 A common rote learning technique is preparing quickly for a test, also known as
cramming.

 Advantages of Rote Learning


o Ability to quickly recall basic facts
o Helps develop foundational knowledge
 Disadvantages of Rote Learning
o Can be repetitive
o Easy to lose focus
o Doesn’t allow for a deeper understanding of a subject
o Doesn’t encourage the use of social skills
o No connection between new and previous knowledge
o May result in wrong impression or understanding a concept

 When rote memorization is applied as the main focus of learning, it is not considered
higher-level thought or critical thinking.
 Opponents to rote memorization argue that creativity in students is stunted and
suppressed, and students do not learn how to think, analyze or solve problems.
 These educators believe, instead, that a more associative or constructive learning
should be applied in the classroom. If the majority of the student’s day is spent on
repetition, the foundation for learning becomes shaky.
 Rote learning is the cornerstone of higher-level thinking and should not be ignored.
 Especially in today’s advanced technological world, rote memorization might be even
more important than ever
 If you can easily access the information when performing a certain task, the brain is
free to make major leaps in learning.

2. Learning by taking Advice


 This is a simple form of learning. Suppose a programmer writes a set of instructions
to instruct the computer what to do, the programmer is a teacher and the computer is a
student. Once learned (i.e. programmed), the system will be in a position to do new
things.
 The advice may come from many sources: human experts, internet to name a few.
This type of learning requires more inference than rote learning. The knowledge must
be transformed into an operational form before stored in the knowledge base.
Moreover the reliability of the source of knowledge should be considered.

Expert System

 An expert system is a system that employs human knowledge captured in a computer


to solve problems that ordinarily require human expertise.
 A computer program that emulates the behaviour of human experts who are solving
real-world problems associated with a particular domain of knowledge.
 An expert system is a computer system that emulates, or acts in all respects, with the
decision-making capabilities of a human expert.
 The expert systems are the computer applications developed to solve complex
problems in a particular domain, at the level of extra-ordinary human intelligence and
expertise.

 Architecture of Expert Systems


The Architecture of Expert System include following 3 major components −
1. Knowledge Base
2. Inference Engine
3. User Interface

Fig. Architecture of expert system

1. Knowledge Base
 It contains domain-specific and high-quality knowledge. Knowledge is required to
exhibit intelligence. The success of any ES majorly depends upon the collection of
highly accurate and precise knowledge
 The data is collection of facts. The information is organized as data and facts about the
task domain. Data, information, and past experience combined together are termed
as knowledge.
 The knowledge base of an ES is a store of both, factual and heuristic knowledge.
1. Factual Knowledge − It is the information widely accepted by the Knowledge
Engineers and scholars in the task domain.
2. Heuristic Knowledge − It is about practice, accurate judgement, one’s ability of
evaluation, and guessing.
 Knowledge representation is the method used to organize and formalize the
knowledge in the knowledge base. It is in the form of IF-THEN-ELSE rules
 The success of any expert system majorly depends on the quality, completeness, and
accuracy of the information stored in the knowledge base.
 The knowledge base is formed by readings from various experts, scholars, and
the Knowledge Engineers

2. Inference Engine
 Use of efficient procedures and rules by the Inference Engine is essential in deducting
a correct, flawless solution.
 In case of knowledge-based ES, the Inference Engine acquires and manipulates the
knowledge from the knowledge base to arrive at a particular solution.
 In case of rule based ES, it −
 Applies rules repeatedly to the facts, which are obtained from earlier rule
application.
 Adds new knowledge into the knowledge base if required.
 Resolves rules conflict when multiple rules are applicable to a particular case
 To recommend a solution, the Inference Engine uses the following strategies −
I. Forward Chaining
II. Backward Chaining

 Forward Chaining
 It is a strategy of an expert system to answer the question, “What can happen
next?”
 Here, the Inference Engine follows the chain of conditions and derivations and
finally deduces the outcome.
 It considers all the facts and rules, and sorts them before concluding to a solution.
 This strategy is followed for working on conclusion, result, or effect.
 For example, prediction of share market status as an effect of changes in interest
rates.

Fig. Forward Chaining


 Backward Chaining
 With this strategy, an expert system finds out the answer to the question, “Why
this happened?”
 On the basis of what has already happened, the Inference Engine tries to find out
which conditions could have happened in the past for this result.
 This strategy is followed for finding out cause or reason.
 For example, diagnosis of blood cancer in humans.

Fig. Backward Chaining
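The forward chaining strategy described above can be sketched as a tiny rule engine. The rule format (an antecedent set of facts and a single consequent) and the interest-rate rules are illustrative assumptions, loosely echoing the share market example:

```python
# Minimal forward chaining: fire rules whose antecedents are all known,
# adding consequents to the fact base until nothing changes.
def forward_chain(facts, rules):
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for antecedents, consequent in rules:
            if antecedents <= facts and consequent not in facts:
                facts.add(consequent)
                changed = True
    return facts

rules = [({"interest rates fall"}, "borrowing rises"),
         ({"borrowing rises"}, "share market rises")]
facts = forward_chain({"interest rates fall"}, rules)
print("share market rises" in facts)  # True
```

Backward chaining would instead start from "share market rises" and search backwards for rules whose consequent matches it, recursively trying to establish their antecedents.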

3. User Interface
 User interface provides interaction between the user of the ES and the ES itself. It
generally uses Natural Language Processing so that it can be used by a user who is
well-versed in the task domain. The user of the ES need not necessarily be an expert
in Artificial Intelligence.
 It explains how the ES has arrived at a particular recommendation. The explanation
may appear in the following forms −
 Natural language displayed on screen.
 Verbal narrations in natural language.
 Listing of rule numbers displayed on the screen.
 Requirements of Efficient ES User Interface
 It should help users to accomplish their goals in shortest possible way.
 It should be designed to work for user’s existing or desired work practices.
 Its technology should be adaptable to user’s requirements; not the other way
round.
 It should make efficient use of user input.

Q. Draw and Explain architecture of Expert System with its advantages (KBCNMU December
2019 Examination)
Expert System Shell

 A new expert system can be developed by adding domain knowledge to the shell. The
figure depicts generic components of expert system.

 Expert System Shell:

 Knowledge acquisition system: It is the first and fundamental step. It helps to collect
the expert’s knowledge required to solve the problems and build the knowledge base.

 Knowledge Base: This component is the heart of expert systems. It stores all factual
and heuristic knowledge about the application domain. It provides with the various
representation techniques for all the data.

 Inference mechanism: Inference engine is the brain of the expert system. This
component is mainly responsible for generating inference from the given knowledge
from the knowledge base and produce line of reasoning in turn the result of the user's
query.

 Explanation subsystem: This part of shell is responsible for explaining or justifying


the final or intermediate result of user query. It is also responsible to justify need of
additional knowledge.

 User interface: It is the means of communication with the user. It decides the utility
of expert system.

 Building expert systems by using shells has significant advantages. It is always
advisable to use a shell to develop an expert system, as it avoids building the system
from scratch.
 To build an expert system using a system shell, one needs to enter all necessary
knowledge about the task domain into the shell.

 Initially each expert system that was built, was created right from scratch. But after
several systems has been built in this way, it became clear that they had a lot in
common.

 In particular, since the systems were built as a set of rules combined with an
interpreter for those rules, it was possible to separate the interpreter from
the domain-specific knowledge and thus create a system that could be used to
construct a new expert system, by adding the new knowledge corresponding to the
new problem domain. Such resulting interpreters are called shells.

 Earlier expert system shells provide mechanisms for knowledge representation,


reasoning and explanation. Later on tools for knowledge Acquisition were added.

 But with the time, expert systems needed to do something else as well. They need to
make it easy to integrate expert system with other programs.

 They need to access the corporate databases, embedded with larger application
programs etc. So the most important feature of expert system shell is to provide easy
to use interface.
Q. Write a short note on Expert System Shell (KBCNMU December 2019 Examination)

Knowledge Acquisition in Expert System

 Knowledge acquisition is an activity of knowledge engineering that is very important


in the initial phase of system.
 It deals with building the fundamental knowledge base and updating it in the
application phase of the system.
 The domain knowledge acquired initially is a combination of factual knowledge and
the heuristic knowledge.
 The development phase of knowledge acquisition includes the activities that occur
before acquiring knowledge from domain experts, such as domain identification,
domain knowledge conceptualization, knowledge formalization and encoding, and
knowledge refinement and validation.
 The knowledge acquisition module is responsible for identifying, acquiring, and
storing new knowledge that would be useful in decision making.
 This module contains the following agents:
o Acquisition agent
o Mapping agent
o Storage agent.
 Based on their day-to-day activities, stakeholders may either identify new knowledge
that needs to be gathered or create knowledge that could be used by others within the
organization.
 Any knowledge that is generated needs to be represented and stored in such a manner
that it is readily accessible and usable by others.
 The mapping and storage agents are responsible for generating the internal
representation of the knowledge so that they can be stored, organized, disseminated,
and used by other stakeholders and applications.
 The knowledge within the repositories can be organized based on subject or a specific
taxonomy.

Understanding Constraint Satisfaction

 In artificial intelligence, constraint satisfaction is the process of finding a solution


to a set of constraints that impose conditions that the variables must satisfy.
 Constraint satisfaction problems (CSPs) are mathematical questions defined as a set
of objects whose state must satisfy a number of constraints or limitations.
 Example:
1. Consider a Sudoku game with some numbers filled initially in some squares.
2. You are expected to fill the empty squares with numbers ranging from 1 to 9
in such a way that no row, column or a block has a number repeating itself.
3. This is a very basic constraint satisfaction problem.
4. You are supposed to solve a problem keeping in mind some constraints.
5. The remaining squares that are to be filled are known as variables, and the
range of numbers (1-9) that can fill them is known as a domain. Variables
take on values from the domain.
6. The conditions governing how a variable will choose its domain are known as
constraints.
 A constraint satisfaction problem (CSP) is a problem that requires its solution
within some limitations or conditions also known as constraints. It consists of the
following:
1. A finite set of variables which stores the solution (V = {V1, V2, V3,....., Vn})
2. A set of discrete values known as domain from which the solution is picked
(D = {D1, D2, D3,.....,Dn})
3. A finite set of constraints (C = {C1, C2, C3,......, Cn})
 Also, note that all these sets should be finite except for the domain set.
 Each variable in the variable set can have different domains.
 For example, consider the Sudoku problem again. Suppose that a row, column and
block already have 3, 5 and 7 filled in. Then the domain for all the variables in that
row, column and block will be {1, 2, 4, 6, 8, 9}.
 A problem to be converted to CSP requires the following steps:
o Step 1: Create a variable set.
o Step 2: Create a domain set.
o Step 3: Create a constraint set with variables and domains (if possible)
after considering the constraints.
o Step 4: Find an optimal solution
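The four steps above can be sketched with a small backtracking solver. The variables, domains, and constraint (a simple "adjacent variables must differ" rule, as in map colouring) are illustrative assumptions, not taken from the notes:

```python
# Minimal backtracking CSP solver: assign each variable a value from its
# domain that does not conflict with already-assigned neighbours.
def backtrack(assignment, variables, domains, conflicts):
    if len(assignment) == len(variables):
        return assignment                     # every variable assigned
    var = next(v for v in variables if v not in assignment)
    for value in domains[var]:
        # constraint check: no conflicting neighbour holds this value
        if all(assignment.get(n) != value for n in conflicts[var]):
            assignment[var] = value
            result = backtrack(assignment, variables, domains, conflicts)
            if result:
                return result
            del assignment[var]               # undo and try the next value
    return None

variables = ["V1", "V2", "V3"]                           # step 1: variable set
domains = {v: ["red", "green"] for v in variables}       # step 2: domain set
conflicts = {"V1": ["V2"], "V2": ["V1", "V3"], "V3": ["V2"]}  # step 3: constraints
print(backtrack({}, variables, domains, conflicts))      # step 4: a solution
```

A Sudoku solver has exactly this shape; only the variables (empty squares), the domains (1 to 9), and the constraint check (row, column, and block uniqueness) change.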

 Waltz’s Algorithm
There are two important steps in the use of constraints in problem solving:

o Analyse the problem domain to determine the actual constraints.


o Solve the problem by applying a constraint satisfaction algorithm.

 Consider for example a three dimensional line drawing. The analysis process is to
determine the object described by the lines. The geometric relationships between
different types of line junctions help to determine the object types.

Fig. Three dimensional polyhedral junction types

 In Waltz’s algorithm, labels are assigned to lines of various types: concave edges
are produced by two adjacent touching surfaces which form a concave angle (less than 180
degrees).
 Conversely, convex edges form a convex angle (greater than 180 degrees), and a
boundary edge outlines a surface that obstructs other objects.
 To label a concave edge, a minus sign is used.
 Convex edges are labeled with a plus sign, and a right or left arrow is used to
label the boundary edges.
 By restricting vertices to be the intersection of three object faces, it is possible to
reduce the number of basic vertex types to only four :
The L, The Arrow, The T, The Fork

Fig. Valid junction labels for three-dimensional shapes.

 When a three-dimensional object is viewed from all possible positions, the four
junction types, together with the valid edge labels, give rise to sixteen different
permissible junction configurations as shown in figure
 Geometric constraints, together with a consistent labeling scheme, can simplify the
object identification process.

Algorithm:
1. Find the lines at the border of the scene boundary and label them. These lines can
be found by finding an outline such that no vertices are outside it. We do this first
because this labeling will impose additional constraints on the other labelings in
the figure.
2. Number the vertices of the figure to be analyzed. These numbers will correspond
to the order in which the vertices will be visited during the labeling process. To
decide on a numbering, do the following:
a. Start at any vertex on the boundary of the figure. Since boundary lines are
known, the vertices involving them are more highly constrained than are
interior ones.
b. Move from the vertex along the boundary to an adjacent unnumbered vertex
and continue until all boundary vertices have been numbered.
c. Number interior vertices by moving from a numbered vertex to some adjacent
unnumbered one. By always labeling a vertex next to one that has already
been labeled, maximum use can be made of the constraints.
3. Visit each vertex V in order and attempt to label it by doing the following:
a. Using the set of possible vertex labelings given in the list of 18 physically
possible trihedral vertices, attach to V a list of its possible labelings.
b. See whether some of these labelings can be eliminated on the basis of local
constraints. To do this, examine each vertex A that is adjacent to V and
that has already been visited. Check to see that for each proposed labeling
for V, there is a way to label the line between V and A in such a way that
at least one of the labelings listed for A is still possible. Eliminate from
V’s list any labeling for which this is not the case.
c. Use the set of labelings just attached to V to constrain the labels at vertices
adjacent to V. For each vertex A that was visited in the last step, do the
following:
i. Eliminate all labelings of A that are not consistent with at least
one labeling of V.
ii. If any labelings were eliminated, continue constraint propagation
by examining the vertices adjacent to A and checking for
consistency with the restricted set of labelings now attached to A.
iii. Continue to propagate until there are no adjacent labeled vertices or
until there is no change made to the existing set of labelings.
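The label-elimination core of this algorithm can be sketched as a generic propagation loop. The junctions, labels, and compatibility relation below are made-up stand-ins, not the real trihedral tables; the point is the mechanism, in which a candidate labeling survives only if every adjacent junction still has a compatible candidate:

```python
# Toy constraint propagation: repeatedly drop any candidate label that no
# labeling of some neighbour can agree with, until nothing changes.
def propagate(candidates, adjacent, compatible):
    changed = True
    while changed:
        changed = False
        for v in candidates:
            for label in list(candidates[v]):          # copy: we mutate below
                for a in adjacent[v]:
                    if not any(compatible(v, label, a, other)
                               for other in candidates[a]):
                        candidates[v].remove(label)    # no neighbour agrees
                        changed = True
                        break
    return candidates

# Two adjacent junctions; a labeling is "compatible" here simply when the
# shared-edge label is the same on both ends (a stand-in constraint).
cands = {"A": {"+", "-"}, "B": {"+"}}
adj = {"A": ["B"], "B": ["A"]}
same = lambda v, l, a, o: l == o
print(propagate(cands, adj, same))    # A loses "-": {'A': {'+'}, 'B': {'+'}}
```

In the real Waltz labeling, the candidates are the permissible junction configurations and compatibility means the two junctions assign the same label (+, -, or arrow direction) to their shared line.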

 A set of labelling rules which facilitates this process can be developed for different
classes of objects. The following rules will apply for many polyhedral objects.
o The arrow should be directed to mark boundaries by traversing the object in a
clockwise direction.
o Unbroken lines should have the same label assigned at both ends.
o When a fork is labelled with a + edge, it must have all three edges labelled as +.
o Arrow junctions which have a + label on both barbed edges must also have a +
label on the shaft.
 These rules can be applied to a polygonal object as given in figure

Fig. Example of Object labeling.

 Starting with any edge having an object face on its right, the external boundary is
labelled with the + in a clockwise direction. Interior lines are then labelled with + or -
consistent with the other labelling rules.
 To see how Waltz’s constraint satisfaction algorithm works, consider the line
drawing of a pyramid as given in the figure below. At the right side of the pyramid are
all possible labellings for the four junctions A, B, C and D.

 Using these labels as mutual constraints on connected junctions, permissible labels for
the whole pyramid can be determined. The constraint satisfaction procedure works
as follows.
 Starting at an arbitrary junction, say A, a record of all permissible labels is made for
that junction. An adjacent junction is then chosen, say B, and labels which are
inconsistent with the line AB are eliminated from the permissible A and B lists.
In this case, the line joining A and B can only be a +, a –, or an up arrow;
consequently, two of the possible A labellings can be eliminated.
 Choosing junction C next, we find that the BC constraints are satisfied by all of the B
and C labellings, so no reduction is possible in this step. On the other hand, the line
AC must be labelled as a – or as an up-left arrow to be consistent. Therefore, an
additional label for A can be eliminated, reducing the remainder to the following.

 This new restriction on A now permits the elimination of one B labelling to maintain
consistency. Thus, only one permissible B labelling now remains.

 This reduction, in turn, places a new restriction on BC, permitting the elimination of
one C label, since BC must now be labelled as a + only. This leaves the remaining C
labels as shown in the side diagram.
 Moving now to junction D, we see that of the six possible D labellings, only three
satisfy the BD constraint of an up or down arrow.
 Continuing with the above procedure, we see that further label eliminations are not
possible since all constraints have been satisfied. The process is completed by finding
the different combinations of unique labellings that can be assigned to the figure. An
enumeration of the remaining labels shows that it is possible to find only three
different labellings.
Q. Explain Waltz Algorithm with its limitations (KBCNMU December 2019 Examination)
UNIT – V
Neural Networks
Contents:
 Characteristics of Neural Networks
 Biological Neural Network
o Features of Biological Neural Network
o Performance comparison of computer and Biological Neural Network
 Historical Development of Neural Network
 Artificial Neural Network
o Terminologies
 Models of Neuron
o McCulloch–Pitts Model
 Perceptron
 Adaline Topology
 Basic Learning Laws
 Learning Methods
o Supervised
o Unsupervised

Neural Network
 Neural networks are multi-layer networks of neurons (the blue and magenta
nodes in the chart below) that we use to classify things, make predictions, etc.
 Below is the diagram of a simple neural network with five inputs, five outputs, and two
hidden layers of neurons.
 Starting from the left, we have:
1. The input layer of our model in orange.
2. Our first hidden layer of neurons in blue.
3. Our second hidden layer of neurons in magenta.
4. The output layer (a.k.a. the prediction) of our model in green.
 The arrows that connect the dots show how all the neurons are interconnected and
how data travels from the input layer all the way through to the output layer.
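The layer-by-layer flow described above can be sketched in a few lines of Python. This is a minimal illustration only: a 5-3-3-5 network matching the description, with sigmoid activations and random placeholder weights (the layer sizes and seed are assumptions for the demo):

```python
import math
import random

def forward(x, layers):
    """One feed-forward pass.  `layers` is a list of (W, b) pairs, where
    W[i][j] is the weight from input j to neuron i of that layer."""
    a = x
    for W, b in layers:
        # each neuron: weighted sum of the previous layer plus bias,
        # squashed through a sigmoid
        a = [1.0 / (1.0 + math.exp(-(sum(w * v for w, v in zip(row, a)) + bi)))
             for row, bi in zip(W, b)]
    return a

def random_layer(n_in, n_out, rng):
    """Random placeholder weights and biases for one fully connected layer."""
    return ([[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)],
            [rng.uniform(-1, 1) for _ in range(n_out)])

rng = random.Random(42)
net = [random_layer(5, 3, rng), random_layer(3, 3, rng), random_layer(3, 5, rng)]
output = forward([1, 0, 1, 0, 1], net)   # five inputs in, five outputs out
```

Each `(W, b)` pair plays the role of one set of arrows in the diagram: the data enters at the input layer and is transformed once per layer until it reaches the output layer.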

Biological Neural Network

 Artificial NN draw much of their inspiration from the biological nervous system. It is
therefore very useful to have some knowledge of the way this system is organized.
 Most living creatures, which have the ability to adapt to a changing environment,
need a controlling unit which is able to learn. Higher developed animals and
humans use very complex networks of highly specialized neurons to perform this
task.
 The control unit - or brain - can be divided in different anatomic and functional
sub-units, each having certain tasks like vision, hearing, motor and sensor control.
 The brain is connected by nerves to the sensors and actors in the rest of the body.
 The brain consists of a very large number of neurons, about 10^11 on average.
These can be seen as the basic building bricks for the central nervous system (CNS).
 The neurons are interconnected at points called synapses. The complexity of the
brain is due to the massive number of highly interconnected simple units working in
parallel, with an individual neuron receiving input from up to 10000 others.
 The neuron contains all structures of an animal cell. The complexity of the structure
and of the processes in a simple cell is enormous. Even the most sophisticated neuron
models in artificial neural networks seem comparatively toy-like.
 Structurally the neuron can be divided in three major parts:
o The cell body (soma),
o The dendrites
o The axon

 These are also called as the components of Biological Neural Network


 As shown in the above diagram, a typical neuron consists of the following four
components with the help of which we can explain its working –
a) Dendrites − They are tree-like branches, responsible for receiving
information from the other neurons the neuron is connected to. In a sense, we
can say that they are like the ears of the neuron.
b) Soma − It is the cell body of the neuron and is responsible for processing the
information received from the dendrites.
c) Axon − It is just like a cable through which the neuron sends the information.
d) Synapses − These are the connections between the axon and other neurons' dendrites.

 Characteristics of Biological Neural Networks


 Massive connectivity.
 Nonlinear, Parallel, Robust and Fault Tolerant.
 Capability to adapt to surroundings.
 Ability to learn and generalize from known examples.
 Collective behaviour is different from individual behaviour
Q. Draw and Explain components of biological neural network (KBCNMU December 2019
Examination)

Artificial Neural Network


 An artificial neural network (ANN) is a computing system designed to
simulate the way the human brain analyses and processes information.
 It is the foundation of artificial intelligence (AI) and solves problems that would
prove impossible or difficult by human or statistical standards.
 ANNs have self-learning capabilities that enable them to produce better results as
more data becomes available.
 Artificial neural networks are built like the human brain, with neuron nodes
interconnected like a web.
 The human brain has hundreds of billions of cells called neurons. Each neuron is
made up of a cell body that is responsible for processing information by carrying
information towards (inputs) and away (outputs) from the brain.
 An ANN has hundreds or thousands of artificial neurons called processing units,
which are interconnected by nodes.
 These processing units are made up of input and output units. The input units
receive various forms and structures of information based on an internal weighting
system, and the neural network attempts to learn about the information presented to
produce one output report.
 Just like humans need rules and guidelines to come up with a result or output, ANNs
also use a set of learning rules called back propagation, an abbreviation for
backward propagation of error, to perfect their output results.
 An ANN initially goes through a training phase where it learns to recognize patterns
in data, whether visually, aurally, or textually.
 During this supervised phase, the network compares its actual output produced with
what it was meant to produce—the desired output.
 The difference between both outcomes is adjusted using back propagation.
 This means that the network works backward, going from the output unit to the input
units to adjust the weight of its connections between the units until the difference
between the actual and desired outcome produces the lowest possible error.
 During the training and supervisory stage, the ANN is taught what to look for and
what its output should be, using yes/no questions with binary numbers.
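The compare-and-adjust cycle described above can be illustrated with a single sigmoid neuron trained by gradient descent on squared error. This is a minimal sketch rather than full multi-layer back propagation; the AND task, learning rate and epoch count are arbitrary choices for the demo:

```python
import math

def predict(x, w, b):
    """Sigmoid neuron output for input vector x."""
    return 1.0 / (1.0 + math.exp(-(sum(wi * xi for wi, xi in zip(w, x)) + b)))

def train(samples, epochs=2000, lr=0.5):
    """Repeatedly compare actual vs desired output and push each weight
    in the direction that lowers the squared error."""
    n = len(samples[0][0])
    w, b = [0.0] * n, 0.0
    for _ in range(epochs):
        for x, target in samples:
            y = predict(x, w, b)
            delta = (y - target) * y * (1.0 - y)   # error times sigmoid gradient
            w = [wi - lr * delta * xi for wi, xi in zip(w, x)]
            b -= lr * delta
    return w, b

# labelled (input, desired output) pairs: teach the neuron the AND function
w, b = train([([0, 0], 0), ([0, 1], 0), ([1, 0], 0), ([1, 1], 1)])
```

The `delta` term is the "difference between the actual and desired outcome" flowing backwards into the weights; in a multi-layer network the same idea is applied layer by layer from the output units back to the input units.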

Performance Comparison of Biological Neural Network and Artificial


Neural Network
1. Size:
 Our brain contains about 86 billion neurons and more than 100 trillion (or,
according to some estimates, 1000 trillion) synapses (connections).
 The number of “neurons” in artificial networks is much less than
that (usually in the ballpark of 10–1000)

2. Topology:
 All artificial layers compute one by one, that is one layer at a time, instead of
being part of a network that has nodes computing asynchronously.
 Feed forward networks compute the state of one layer of artificial neurons and
their weights, and then use the results to compute the following layer the same
way.
 During back propagation, the algorithm computes some change in the weights
the opposing way, to reduce the difference of the feed forward computational
results in the output layer from the expected values of the output layer.
 In artificial networks, layers are not connected to non-neighbouring layers.
 In biological networks, neurons can fire asynchronously in parallel, with a small
portion of highly connected neurons (hubs) and a large number of less connected ones.
3. Speed:
 Certain biological neurons can fire around 200 times a second on average.
 Signals travel at different speeds depending on the type of the nerve impulse.
 Signal travel speeds also vary from person to person depending on their
gender, age, height, temperature, medical condition, lack of sleep etc.
 Information in artificial neurons is instead carried over by the
continuous, floating point number values of synaptic weights.

4. Fault-tolerance:
 Biological neuron networks due to their topology are also fault-tolerant.
 Information is stored redundantly so minor failures will not result in memory
loss.
 They don't have one "central" part. The brain can also recover and heal to an
extent.
 Artificial neural networks are not modelled for fault tolerance or self-
regeneration.

5. Power consumption:
 The brain consumes about 20% of all the human body‘s energy whereas a
single Nvidia GeForce Titan X GPU runs on 250 watts alone, and requires a
power supply instead.
 Our machines are way less efficient than biological systems.
 Computers also generate a lot of heat when used, with consumer GPUs
operating safely between 50–80 degrees Celsius instead of 36.5–37.5 °C.

6. Learning:
 We still do not understand how brains learn, or how redundant connections
store and recall information.
 Brain fibres grow and reach out to connect to other neurons, neuroplasticity
allows new connections to be created or areas to move and change function,
and synapses may strengthen or weaken based on their importance.
 Artificial neural networks on the other hand, have a predefined model,
where no further neurons or connections can be added or removed.
 Only the weights of the connections can change during training.
 The networks start with random weight values and will slowly try to reach a
point where further changes in the weights would no longer improve
performance
 Unlike the brain, artificial neural networks don't learn by recalling
information: they only learn during training, but will always "recall" the
same, learned answers afterwards, without making a mistake.
 The great thing about this is that "recalling" can be done on much weaker
hardware as many times as we want to.
 It is also possible to use previously pretrained models and improve them by
training with additional examples that have the same input features.

Biological Neural Network (BNN) | Artificial Neural Network (ANN)
--------------------------------|--------------------------------
Soma                            | Node
Dendrites                       | Input
Synapse                         | Weights or interconnections
Axon                            | Output

Criteria         | BNN                                           | ANN
-----------------|-----------------------------------------------|-----------------------------------------------
Processing       | Massively parallel; slow but superior to ANN  | Massively parallel; fast but inferior to BNN
Size             | 10^11 neurons and 10^15 interconnections      | 10^2 to 10^4 nodes, depending mainly on the application and the network designer
Learning         | Can tolerate ambiguity                        | Requires very precise, structured and formatted data
Fault tolerance  | Performance degrades with even partial damage | Capable of robust performance, hence has the potential to be fault tolerant
Storage capacity | Stores the information in the synapses        | Stores the information in continuous memory locations
Historical Development of Neural Network

 The study of the human brain is thousands of years old.


 The first step towards neural networks took place in 1943, when Warren McCulloch,
a neurophysiologist, and a young mathematician, Walter Pitts, wrote a paper on how
neurons might work. They modelled a simple neural network with electrical circuits.
 Donald Hebb took the idea further in his book, The Organization of Behaviour
(1949), proposing that neural pathways strengthen over each successive use, especially
between neurons that tend to fire at the same time thus beginning the long journey
towards quantifying the complex processes of the brain.
 Two major concepts that are precursors to Neural Networks are
o 'Threshold Logic': converting continuous input to discrete output
o 'Hebbian Learning': a model of learning based on neural plasticity, proposed
by Donald Hebb in his book "The Organization of Behaviour", often
summarized by the phrase "Cells that fire together, wire together."
Both were proposed in the 1940s.
 In the 1950s, as researchers began trying to translate these networks onto computational
systems, the first Hebbian network was successfully implemented at MIT in 1954.
 In 1956 the Dartmouth Summer Research Project on Artificial Intelligence provided a
boost to both artificial intelligence and neural networks. This stimulated research in AI
and in the much lower level neural processing part of the brain.
 In 1957, John von Neumann suggested imitating simple neuron functions by using
telegraph relays or vacuum tubes.
 Frank Rosenblatt, a psychologist at Cornell, was working on understanding the
comparatively simpler decision systems. In an attempt to understand and quantify this
process, he proposed the idea of a Perceptron in 1958, calling it Mark I Perceptron.
 In 1959, Bernard Widrow and Marcian Hoff of Stanford developed models they
called ADALINE and MADALINE.
 These models were named for their use of Multiple ADAptive LINear Elements.
 MADALINE was the first neural network to be applied to a real-world problem.
 It is an adaptive filter which eliminates echoes on phone lines. This neural network is
still in commercial use.
 Marvin Minsky & Seymour Papert proved the Perceptron to be limited in their
book, Perceptrons
 Progress on neural network research halted due to fear, unfulfilled claims, etc. until 1981.
 1982 — John Hopfield presented a paper to the national Academy of Sciences. His
approach to create useful devices; he was likeable, articulate, and charismatic.
 1985 — American Institute of Physics began what has become an annual meeting —
Neural Networks for Computing.
 By 1987, the Institute of Electrical and Electronics Engineers' (IEEE) first
International Conference on Neural Networks drew more than 1,800 attendees.
 In 1997 — A recurrent neural network framework, Long Short-Term Memory
(LSTM) was proposed by Schmidhuber & Hochreiter.
 In 1998, Yann LeCun published Gradient-Based Learning Applied to Document
Recognition.
 Thus by the 1990s, neural networks were definitely back, this time truly catching the
imagination of the world and finally coming to par with expectations.

McCulloch-Pitts Neuron
 The first computational model of a neuron was proposed by Warren McCulloch
(neuroscientist) and Walter Pitts (logician) in 1943.
 It may be divided into 2 parts.
 The first part, g, takes an input, performs an aggregation and, based on the aggregated
value, the second part, f, makes a decision.
 Let's suppose that we want to predict whether John wants to watch a
random football game on TV or not.
 The inputs are all Boolean, i.e., {0, 1}, and the output variable is also Boolean {1: will
watch it, 0: won't watch it}.
o So, x_1 could be isPremierLeagueOn (John likes the Premier League more)
o x_2 could be isItAFriendlyGame (John tends to care less about the friendlies)
o x_3 could be isNotHome (can't watch it as John is not at home)
o x_4 could be isManUnitedPlaying (John is a big Man United fan) and so on.
 These inputs can either be excitatory or inhibitory.
 Inhibitory inputs are those that have maximum effect on the decision making
irrespective of other inputs i.e., if x_3 is 1 (not home) then output will always be 0 i.e.,
the neuron will never fire, so x_3 is an inhibitory input.
 Excitatory inputs are NOT the ones that will make the neuron fire on their own, but
they might fire it when combined together. Formally, this is what is going on:
g(x1, x2, ..., xn) = x1 + x2 + ... + xn
y = f(g(x)) = 1 if g(x) >= theta, and 0 otherwise
 We can see that g(x) is just doing a sum of the inputs, a simple aggregation,
and theta here is called the thresholding parameter.

 Functions of McCulloch-Pitts Neuron

1. Boolean Functions Using M-P Neuron

 So far we have seen how the M-P neuron works.

 Now let's look at how this very neuron can be used to represent a few Boolean
functions.

 Our inputs are all boolean and the output is also boolean so essentially, the neuron
is just trying to learn a boolean function.

 A lot of boolean decision problems can be cast into this, based on appropriate input
variables— like whether to continue reading this post, whether to watch Friends
after reading this post etc. can be represented by the M-P neuron.

2. AND Function

 An AND function neuron would only fire when ALL the inputs are ON
3. OR Function

 An OR function neuron would fire if ANY of the inputs is ON

4. NOT Function

 For a NOT neuron, 1 outputs 0 and 0 outputs 1.
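These three functions can be reproduced with a tiny M-P neuron implementation. This is a sketch under one modelling choice: the NOT neuron is realised here by treating its single input as inhibitory with threshold 0:

```python
def mp_neuron(inputs, theta, inhibitory=()):
    """McCulloch-Pitts neuron: never fires if any inhibitory input is ON;
    otherwise fires (1) iff the sum of excitatory inputs reaches theta."""
    if any(inputs[i] for i in inhibitory):
        return 0
    g = sum(v for i, v in enumerate(inputs) if i not in inhibitory)
    return 1 if g >= theta else 0

AND = lambda a, b: mp_neuron([a, b], theta=2)             # fires only if ALL inputs are ON
OR  = lambda a, b: mp_neuron([a, b], theta=1)             # fires if ANY input is ON
NOT = lambda a: mp_neuron([a], theta=0, inhibitory=(0,))  # inverts its single input
```

Only the threshold theta (and the choice of inhibitory inputs) changes between the three functions; the aggregation g(x) is the same plain sum in every case.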

Learning Methods
1. Supervised Learning
 If you're learning a task under supervision, someone is present judging
whether you're getting the right answer; such learning is called Supervised
Learning.
 Similarly, in supervised learning, we have a full set of labelled data while
training an algorithm.
 Fully labelled means that each example in the training dataset is tagged with
the answer the algorithm should come up with on its own.
 For example, suppose you are given a basket filled with different kinds of
fruits.
 Now the first step is to train the machine with all different fruits one by one
like this:

 If the shape of the object is rounded with a depression at the top and its colour
is red, then it will be labelled as Apple.
 If the shape of the object is a long curving cylinder and its colour is
green-yellow, then it will be labelled as Banana.
 Now suppose that after training, you give the machine a new fruit from the
basket, say a banana, and ask it to identify it.

 Since the machine has already learned from the previous data, this
time it has to use that knowledge wisely.
 It will first classify the fruit with its shape and color and would confirm the
fruit name as BANANA and put it in Banana category.
 Thus the machine learns from the training data (the basket containing fruits)
and then applies that knowledge to the test data (the new fruit).
 There are two main areas where supervised learning is useful:
o Classification problems
o Regression problems.
 Classification problems ask the algorithm to predict a discrete value,
identifying the input data as the member of a particular class, or group.
 On the other hand, regression problems look at continuous data
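The fruit example above can be sketched as a tiny supervised classifier: every training example carries its label, and a new fruit gets the label of its most similar example. This is a toy 1-nearest-neighbour illustration, and the numeric (roundness, length) feature encoding is an assumption made up for the sketch:

```python
def nearest_neighbour(train, query):
    """1-NN: return the label of the training example closest to `query`.
    train: list of (features, label) pairs with numeric feature tuples."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(train, key=lambda ex: sq_dist(ex[0], query))[1]

# fully labelled training data: (roundness, length) -> fruit name (toy encoding)
train = [((0.9, 0.2), "apple"), ((0.8, 0.3), "apple"),
         ((0.1, 0.9), "banana"), ((0.2, 0.8), "banana")]
label = nearest_neighbour(train, (0.85, 0.25))   # a round, short fruit
```

Because the output is a discrete class name rather than a continuous number, this is a classification problem in the sense described above.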

2. Unsupervised Learning

 Unsupervised learning is the training of a machine using information that is
neither classified nor labelled, allowing the algorithm to act on that
information without guidance.
 Here the task of machine is to group unsorted information according to
similarities, patterns and differences without any prior training of data.
 Unlike supervised learning, no teacher is provided, which means no training will
be given to the machine.
 Therefore the machine must find the hidden structure in unlabelled data
by itself.
 For instance, suppose the machine is given an image containing both dogs and cats
that it has never seen before.

 Thus the machine has no idea about the features of dogs and cats, so it cannot
categorize the image into dogs and cats.
 But it can categorize them according to their similarities, patterns, and differences i.e.,
we can easily categorize the above picture into two parts.
 The first part may contain all pics having dogs in it and the second part may contain
all pics having cats in it. Here the machine did not learn anything beforehand,
meaning there is no training data or examples.
 It allows the model to work on its own to discover patterns and information that was
previously undetected.
 It mainly deals with unlabelled data.
 Unsupervised learning classified into two categories of algorithms
o Clustering: A clustering problem is where you want to discover the inherent
groupings in the data, such as grouping customers by purchasing behaviour.
o Association: An association rule learning problem is where you want to
discover rules that describe large portions of your data, such as people that buy
X also tend to buy Y.
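The clustering idea can be sketched with a minimal k-means loop that groups unlabelled 2-D points purely by similarity. This is an illustrative toy, not a library-grade implementation; the data points, seed and iteration count are arbitrary choices for the demo:

```python
import random

def kmeans(points, k, iters=20, seed=0):
    """Minimal k-means: assign each point to its nearest centre, then move
    each centre to the mean of its cluster; repeat a fixed number of times."""
    rng = random.Random(seed)
    centres = rng.sample(points, k)
    clusters = [[] for _ in range(k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            # nearest centre by squared Euclidean distance
            i = min(range(k),
                    key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centres[c])))
            clusters[i].append(p)
        # recompute each centre; keep the old one if its cluster went empty
        centres = [tuple(sum(v) / len(c) for v in zip(*c)) if c else centres[i]
                   for i, c in enumerate(clusters)]
    return clusters

# two obvious groups of points, with no labels attached
points = [(0, 0), (0.1, 0), (0, 0.1), (5, 5), (5.1, 5), (5, 5.1)]
clusters = kmeans(points, 2)
```

No example is ever told which group it belongs to; the groupings emerge from the distances alone, which is exactly the "inherent groupings in the data" a clustering problem asks for.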
