AI Unit PDF
for
T.E. Computer Engg. / I. T
Contents:
Definitions of AI
History
Turing Test
AI Problems and Techniques:
o Problem as a State Space Search
o Problem Characteristics
Production System:
o Water Jug Problem
Heuristic Searching Techniques:
o BFS
o DFS
o A*
o AO*
o Means Ends Analysis
Definitions of AI
AI is the branch of computer science which deals with symbolic, non-algorithmic methods of
problem solving. It is the study of how to make computers do things which, at the moment,
humans do better.
Artificial Intelligence is concerned with the design of intelligence in an
artificial device. The term was coined by John McCarthy (Father of AI) in
1956.
There are two ideas in the definition.
1. Intelligence
2. Artificial device
A system with intelligence is expected to behave as intelligently as a human, or to behave in
the best possible manner.
Artificial intelligence is about designing systems that are as intelligent as
humans. This involves trying to understand human thought and an effort to
build machines that emulate the human thought process. This view is the
cognitive science approach to AI.
History of AI
Intellectual roots of AI date back to the early studies of the nature of knowledge and
reasoning. The dream of making a computer imitate humans also has a very early history.
The concept of intelligent machines is found in Greek mythology. There is an ancient story
about Pygmalion, the legendary king of Cyprus. He fell in love with an
ivory statue he made to represent his ideal woman. The king prayed to the goddess Aphrodite,
and the goddess miraculously brought the statue to life. Other myths involve human-like
artifacts. As a present from Zeus to Europa, Hephaestus created Talos, a huge robot. Talos
was made of bronze and his duty was to patrol the beaches of Crete.
Aristotle (384-322 BC) developed an informal system of syllogistic logic, which is the basis
of the first formal deductive reasoning system.
Early in the 17th century, Descartes proposed that bodies of animals are nothing more than
complex machines.
Pascal in 1642 made the first mechanical digital calculating machine.
In the 19th century, George Boole developed a binary algebra representing (some) "laws of
thought."
Charles Babbage & Ada Byron worked on programmable mechanical calculating machines.
In the late 19th century and early 20th century, mathematical philosophers like Gottlob Frege,
Bertrand Russell, Alfred North Whitehead, and Kurt Gödel built on Boole's initial logic
concepts to develop mathematical representations of logic problems.
The advent of electronic computers provided a revolutionary advance in the ability to study
intelligence.
In 1943 McCulloch & Pitts developed a Boolean circuit model of the brain. They wrote the paper
“A Logical Calculus of Ideas Immanent in Nervous Activity”, which explained how it is
possible for neural networks to compute.
Marvin Minsky and Dean Edmonds built the SNARC in 1951, the first randomly wired neural
network learning machine (SNARC stands for Stochastic Neural-Analog Reinforcement
Calculator). It was a neural network computer that used 3000 vacuum tubes to simulate a
network of 40 neurons.
In 1950 Turing wrote an article on “Computing Machinery and Intelligence” which
articulated a complete vision of AI. Turing’s paper talked of many things, of solving
problems by searching through the space of possible solutions, guided by heuristics. He
illustrated his ideas on machine intelligence by reference to chess. He even propounded the
possibility of letting the machine alter its own instructions so that machines can learn from
experience.
In 1956 a famous conference took place in Dartmouth. The conference brought together the
founding fathers of artificial intelligence for the first time. In this meeting the term “Artificial
Intelligence” was adopted.
Between 1952 and 1956, Samuel had developed several programs for playing checkers. In
1956, Newell & Simon’s Logic Theorist was published. It is considered by many to be the
first AI program. In 1959, Gelernter developed a Geometry Engine. In 1961 James Slagle
(PhD dissertation, MIT) wrote a symbolic integration program, SAINT. It was written in
LISP and solved calculus problems at the college freshman level. In 1963, Thomas Evans's
program Analogy was developed, which could solve IQ-test-style analogy problems.
In 1963, Edward A. Feigenbaum & Julian Feldman published Computers and Thought, the
first collection of articles about artificial intelligence.
In 1965, J. Allen Robinson invented a mechanical proof procedure, the Resolution Method,
which allowed programs to work efficiently with formal logic as a representation language. In
1967, the Dendral program (Feigenbaum, Lederberg, Buchanan, Sutherland at Stanford) was
demonstrated which could interpret mass spectra on organic chemical compounds. This was
the first successful knowledge-based program for scientific reasoning.
In 1969 the SRI robot, Shakey, demonstrated combining locomotion, perception and problem
solving.
The years from 1969 to 1979 marked the early development of knowledge-based systems. In
1974, MYCIN demonstrated the power of rule-based systems for knowledge representation
and inference in medical diagnosis and therapy. Knowledge representation schemes were
developed; these included frames, developed by Minsky. Logic-based languages like Prolog
and Planner were developed.
In the 1980s, Lisp machines were developed and marketed. Around 1985, neural networks
returned to popularity. In 1988, there was a resurgence of probabilistic and decision-theoretic methods.
The early AI systems used general-purpose methods and little knowledge. AI researchers realized that
specialized knowledge is required for rich tasks in order to focus reasoning.
The 1990's saw major advances in all areas of AI including machine learning, data
mining, intelligent tutoring, case-based reasoning, multi-agent planning, scheduling, uncertain
reasoning, natural language understanding and translation, vision, virtual reality, games, and
other topics.
Rod Brooks' COG Project at MIT, with numerous collaborators, made significant progress in
building a humanoid robot.
The first official Robo-Cup soccer match, featuring table-top matches with 40 teams of
interacting robots, was held in 1997.
In the late 90s, Web crawlers and other AI-based information extraction programs became
essential to widespread use of the World Wide Web. Interactive robot pets ("smart toys")
became commercially available, realizing the vision of 18th-century novelty toy makers.
In 2000, the Nomad robot explored remote regions of Antarctica looking for meteorite
samples.
We will now look at a few famous AI systems that have been developed over the years.
Turing Test
Turing held that in future computers can be programmed to acquire abilities rivaling
human intelligence. As part of his argument Turing put forward the idea of an
'imitation game', in which a human being and a computer would be interrogated under
conditions where the interrogator would not know which was which, the
communication being entirely by textual messages. Turing argued that if the
interrogator could not distinguish them by questioning, then it would be unreasonable
not to call the computer intelligent. Turing's 'imitation game' is now usually called 'the
Turing test' for intelligence.
Consider that there are two rooms, A and B. One of the rooms contains a computer.
The other contains a human. The interrogator is outside and does not know which one
is a computer. He can ask questions through a teletype and receives answers from
both A and B. The interrogator needs to identify which of A and B is the human. To pass
the Turing test, the machine has to fool the interrogator into believing that it is human.
Applications of AI
1. Artificial Intelligence in Healthcare: Companies are applying machine learning to
make better and faster diagnoses than humans. One of the best-known technologies is
IBM’s Watson. It understands natural language and can respond to questions asked of
it. The system mines patient data and other available data sources to form a
hypothesis, which it then presents with a confidence scoring schema. AI emulates human
intelligence in computer technology and could assist both the doctor and the patient in the
following ways:
By providing a laboratory for the examination, representation and cataloguing of
medical information
By devising novel tools to support decision making and research
By integrating activities in the medical, software and cognitive sciences
By offering a content-rich discipline for future scientific medical
communities.
2. Artificial Intelligence in business: Robotic process automation is being applied to
highly repetitive tasks normally performed by humans. Machine learning algorithms
are being integrated into analytics and CRM (Customer relationship management)
platforms to uncover information on how to better serve customers. Chatbots have
already been incorporated into websites and e-commerce platforms to provide immediate
service to customers. Automation of job positions has also become a talking point
among academics and IT consultancies.
3. AI in education: It automates grading, giving educators more time. It can also assess
students and adapt to their needs, helping them work at their own pace.
4. AI in Autonomous vehicles: Just like humans, self-driving cars need sensors
to understand the world around them and a brain to collect and process information and
to choose specific actions based on the information gathered. Autonomous vehicles are
equipped with advanced tools to gather information, including long-range radar, cameras, and
LIDAR. Each of these technologies is used in a different capacity and each collects
different information. This information is useless unless it is processed and some
form of action is taken based on it. This is where
artificial intelligence comes into play; it can be compared to the human brain. AI has
several applications for these vehicles and among them the more immediate ones are
as follows:
Directing the car to a gas station or recharging station when it is running low on
fuel.
Adjusting the trip's directions based on known traffic conditions to find the
quickest route.
Incorporating speech recognition for advanced communication with passengers.
Natural language interfaces and virtual assistance technologies.
5. AI for robotics will allow us to address the challenges of taking care of an aging
population and allow much longer independence. It will drastically reduce, maybe even
eliminate, traffic accidents and deaths, as well as enable disaster response in dangerous
situations, for example the nuclear meltdown at the Fukushima power plant.
6. Cyborg Technology: One of the main limitations of being human is simply our own
bodies and brains. Researcher Shimon Whiteson thinks that in the future, we will be able
to augment ourselves with computers and enhance many of our own natural abilities.
Though many of these possible cyborg enhancements would be added for convenience,
others may serve a more practical purpose. Yoky Matsuoka of Nest believes that AI will
become useful for people with amputated limbs, as the brain will be able to communicate
with a robotic limb to give the patient more control. This kind of cyborg technology
would significantly reduce the limitations that amputees deal with daily.
7. Game Playing: You can buy machines that can play master level chess for a few hundred
dollars. There is some AI in them, but they play well against people mainly through brute
force computation--looking at hundreds of thousands of positions. To beat a world
champion by brute force and known reliable heuristics requires being able to look at 200
million positions per second.
8. Speech Recognition: In the 1990s, computer speech recognition reached a practical level
for limited purposes. Thus United Airlines has replaced its keyboard tree for flight
information by a system using speech recognition of flight numbers and city names. It is
quite convenient. On the other hand, while it is possible to instruct some computers using
speech, most users have gone back to the keyboard and the mouse as still more
convenient.
9. Understanding Natural Language: Just getting a sequence of words into a computer is
not enough. Parsing sentences is not enough either. The computer has to be provided with
an understanding of the domain the text is about, and this is presently possible only for
very limited domains.
10. Computer Vision: The world is composed of three-dimensional objects, but the inputs to
the human eye and computers & TV cameras are two dimensional. Some useful programs
can work solely in two dimensions, but full computer vision requires partial three-
dimensional information that is not just a set of two-dimensional views. At present there
are only limited ways of representing three-dimensional information directly, and they are
not as good as what humans evidently use.
11. Expert Systems: A "knowledge engineer" interviews experts in a certain domain and
tries to embody their knowledge in a computer program for carrying out some task. One
of the first expert systems was MYCIN in 1974, which diagnosed bacterial infections of
the blood and suggested treatments. It did better than medical students or practicing
doctors, provided its limitations were observed. Namely, its ontology included bacteria,
symptoms, and treatments and did not include patients, doctors, hospitals, death,
recovery, and events occurring in time. Its interactions depended on a single patient being
considered. Since the experts consulted by the knowledge engineers knew about patients,
doctors, death, recovery, etc., it is clear that the knowledge engineers forced what the
experts told them into a predetermined framework. The usefulness of current expert
systems depends on their users having common sense.
Characteristics of AI Problems:
1. Is the problem decomposable into a set of independent, smaller or easier sub-problems?
2. Can solution steps be ignored or at least undone if they prove unwise?
3. Is the problem's universe predictable? (Are we certain to reach a solution?)
4. Is a good solution to the problem obvious without comparison to all other possible solutions?
5. Is the knowledge base used for solving the problem internally consistent?
6. Is a large amount of knowledge required to solve the problem?
7. Does solving the problem require interaction between the computer and a human?
Production System
A production system (or production rule system) is a computer program typically used
to provide some form of artificial intelligence, which consists primarily of a set of
rules about behaviour.
These rules, termed productions, are a basic representation found useful in automated
planning, expert systems and action selection.
A production system provides the mechanism necessary to execute productions in
order to achieve some goal for the system.
Productions consist of two parts: a sensory precondition (or "IF" statement) and an
action (or "THEN").
If a production's precondition matches the current state of the world, then the
production is said to be triggered.
If a production's action is executed, it is said to have fired.
A production system also contains a database, sometimes called working memory,
which maintains data about current state or knowledge, and a rule interpreter.
To solve such a problem, we must define a set of production rules which will take us from the
initial (start) state to the goal state.
Q. What is Production Systems? Solve the following Water – Jug Problem: If you are given two
jugs, 8-gallon one and a 6-gallon one, a pump which has unlimited water which you can use to fill
the jug, and the ground on which water may be poured. Neither jug has any measuring markings
on it. How can you get exactly 4 gallons of water in the 8-gallon jug (KBCNMU December 2019
Examination)
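A minimal Python sketch of this water-jug problem written as a production system is given below; it is an illustrative sketch, not the prescribed examination answer. The state is a pair (x, y) giving the amounts in the 8-gallon and 6-gallon jugs, each production rule is a (condition, action) pair, and a breadth-first control strategy applies the rules until 4 gallons are left in the 8-gallon jug.

# Water-jug problem as a production system: state = (x, y),
# x = water in the 8-gallon jug, y = water in the 6-gallon jug.
from collections import deque

CAP_X, CAP_Y, GOAL = 8, 6, 4

rules = [
    (lambda x, y: x < CAP_X, lambda x, y: (CAP_X, y)),   # fill the 8-gallon jug
    (lambda x, y: y < CAP_Y, lambda x, y: (x, CAP_Y)),   # fill the 6-gallon jug
    (lambda x, y: x > 0,     lambda x, y: (0, y)),       # empty the 8-gallon jug onto the ground
    (lambda x, y: y > 0,     lambda x, y: (x, 0)),       # empty the 6-gallon jug onto the ground
    # pour from the 6-gallon jug into the 8-gallon jug until it is full or the 6-gallon jug is empty
    (lambda x, y: y > 0 and x < CAP_X,
     lambda x, y: (min(CAP_X, x + y), y - (min(CAP_X, x + y) - x))),
    # pour from the 8-gallon jug into the 6-gallon jug until it is full or the 8-gallon jug is empty
    (lambda x, y: x > 0 and y < CAP_Y,
     lambda x, y: (x - (min(CAP_Y, x + y) - y), min(CAP_Y, x + y))),
]

def solve(start=(0, 0)):
    """Breadth-first control strategy over the production rules."""
    frontier, visited = deque([(start, [start])]), {start}
    while frontier:
        (x, y), path = frontier.popleft()
        if x == GOAL:                      # goal: 4 gallons in the 8-gallon jug
            return path
        for cond, act in rules:
            if cond(x, y):
                nxt = act(x, y)
                if nxt not in visited:
                    visited.add(nxt)
                    frontier.append((nxt, path + [nxt]))
    return None

print(solve())   # prints one shortest sequence of states ending with 4 gallons in the 8-gallon jug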
Heuristic Searching Techniques
Breadth First Search (BFS)
It starts from the root node, explores the neighbouring nodes first and moves towards
the next level neighbours.
It generates one level of the tree at a time until the solution is found.
BFS searches breadth-wise in the problem space.
Breadth-First search is like traversing a tree where each node is a state which may be
a potential candidate for solution.
It expands nodes from the root of the tree and then generates one level of the tree at a
time until a solution is found.
Example: Construct a tree with initial state as root. Generate all its child nodes.
Now for each leaf node, generate all its successors by applying appropriate rules.
Continue the process until goal state is reached. This is called Breadth First Search.
Algorithm - Breadth First Search
1. Create a variable called NODE - LIST and set it to the initial state.
2. Loop until the goal state is found or NODE - LIST is empty.
i. Remove the first element from the NODE – LIST, say E. If
NODE - LIST was empty then quit.
ii. For each way that each rule can match the state described in E
do:
a) Apply the rule to generate a new state.
b) If the new state is the goal state, quit and return this
state.
c) Otherwise add this state to the end of NODE-LIST
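The NODE-LIST procedure above can be sketched in Python roughly as follows; the functions successors() and is_goal() are assumed to be supplied by the particular problem and are not defined in the notes.

# Breadth-first search following the NODE-LIST description above.
from collections import deque

def breadth_first_search(initial_state, successors, is_goal):
    node_list = deque([initial_state])      # step 1: NODE-LIST starts with the initial state
    visited = {initial_state}               # avoid re-expanding repeated states
    while node_list:                        # step 2: loop until NODE-LIST is empty
        state = node_list.popleft()         # 2.i: remove the first element, E
        for new_state in successors(state): # 2.ii: apply every applicable rule to E
            if is_goal(new_state):          # 2.ii.b: goal found, return it
                return new_state
            if new_state not in visited:    # 2.ii.c: otherwise append it to NODE-LIST
                visited.add(new_state)
                node_list.append(new_state)
    return None                             # NODE-LIST exhausted: no solution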
Q. Explain Breadth First Search and Depth First Search algorithm with its advantages.
(KBCNMU December 2019 Examination)
A* Algorithm
The A* algorithm is a practical implementation of Best First Search. It was first
described in 1968 by Peter Hart, Nils Nilsson and Bertram Raphael of the Stanford Research
Institute.
Similar to Best First Search, it uses two lists of nodes:
1. OPEN: nodes that have been generated but have not yet been examined.
2. CLOSED: nodes that have already been examined.
Whenever a new node is generated, check whether it has been generated before.
It uses a heuristic evaluation function, f(n)
f (n) is the approximate distance of a node, n, from a goal node.
For two nodes m and n, if f(m) < f(n), then m is more likely to be on an optimal path.
f(n) may not be 100% accurate, but it should give better results than pure guesswork.
f’(n) = g(n)+ h’(n)
Where,
g(n) = cost from the initial state to the current state n
h’(n) = estimated cost from node n to a goal node
Algorithm
1. Start with OPEN containing only the initial node. Calculate its f'.
2. Until a goal node is found, repeat the following procedure:
a) If there are no nodes on OPEN, report failure. Otherwise pick the node from
OPEN having the lowest f' and treat it as the BEST node.
b) Remove it from OPEN and put it on CLOSED. If the BEST node is a goal
node then exit. Otherwise generate the successors of the BEST node. For each
successor do the following:
i. Calculate its f'.
ii. Check whether the successor is already on OPEN; if it is, call
that node the OLD node.
iii. If the successor is not on OPEN, see whether it is on CLOSED. If so,
call that node the CLOSED OLD node.
iv. If the successor was on neither OPEN nor CLOSED, put it on OPEN and
calculate its f'.
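A rough Python sketch of the OPEN/CLOSED loop above is given below. It assumes hashable states, a consistent heuristic h, and a successors() function yielding (state, step-cost) pairs; these names are illustrative assumptions, not part of the notes.

# A* search using f'(n) = g(n) + h'(n).
import heapq, itertools

def a_star(start, successors, h, is_goal):
    counter = itertools.count()                   # tie-breaker so the heap never compares states
    open_list = [(h(start), next(counter), start)]  # OPEN: generated but not yet examined
    g = {start: 0}                                # g(n): cost from the initial state to n
    parent = {start: None}
    closed = set()                                # CLOSED: nodes already examined
    while open_list:                              # if OPEN becomes empty, report failure
        f, _, best = heapq.heappop(open_list)     # BEST: the OPEN node with the lowest f'
        if best in closed:
            continue
        if is_goal(best):                         # BEST is a goal node: rebuild the path and exit
            path = []
            while best is not None:
                path.append(best)
                best = parent[best]
            return list(reversed(path))
        closed.add(best)                          # move BEST from OPEN to CLOSED
        for succ, cost in successors(best):       # generate the successors of BEST
            new_g = g[best] + cost
            if succ not in g or new_g < g[succ]:  # new node, or a cheaper route to an OLD node
                g[succ] = new_g
                parent[succ] = best
                heapq.heappush(open_list, (new_g + h(succ), next(counter), succ))
    return None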
Example 1: 8 puzzle problem using A* Algorithm
https://www.youtube.com/watch?v=wJu3IZq1NFs&t=333s
AO* Algorithm
AO* algorithm was proposed by Nilsson in 1980.
Rather than using two lists, OPEN and CLOSED, that were used in A* algorithm,
AO* algorithm will use a single structure GRAPH G, representing the part of search
graph that has been explicitly generated so far.
Each node in the graph will point down to its immediate successors and up to its
immediate predecessors.
Each node will also have its associated h’ value.
Algorithm:
1. Let GRAPH G consist of only the node representing the initial state (call this
node INIT). Compute h'(INIT).
2. Until INIT is labelled SOLVED or h’ becomes greater than FUTILITY,
repeat the following procedure.
a) Trace the labelled arcs from INIT and select an unexpanded node on this
path. Call this node NODE.
b) Generate the successors of NODE. If there are no successors then
assign FUTILITY as h' value of NODE. This means that NODE is not
solvable. If there are successors then for each one do the following:
i. Add the SUCCESSOR to GRAPH G.
ii. If the successor is a terminal node, mark it SOLVED and assign
zero to its h' value.
iii. If the successor is not a terminal node, compute its h' value.
c) Propagate the newly discovered information up the graph by doing the
following:
Let S be the set of nodes that have been marked SOLVED. Initialize S
to NODE. Until S is empty repeat the following procedure:
i. Select a node from S, call it CURRENT, and remove it from S.
ii. Compute the cost of each of the arcs emerging from CURRENT and assign
the minimum of these costs as the new h' value of CURRENT.
iii. Mark the minimum-cost path as the best path out of CURRENT.
iv. Mark CURRENT SOLVED if all of the nodes connected to it
through the new marked arc have been labelled SOLVED.
v. If CURRENT has been marked SOLVED or its h' has just
changed, its new status must be propagated backwards up the
graph. Hence all the ancestors of CURRENT are added to S.
https://www.youtube.com/watch?v=PhRayhkbJCo
Means – Ends Analysis (MEA)
Here the goal is to move a desk with two things on it from one room to another.
The main difference between the start and goal states is the location.
Difference Table
The following table shows the available set of operators with their preconditions and
results.
Algorithm
1. Compare CURRENT to GOAL. If no differences, return.
2. Otherwise select most important difference and reduce it by doing the
following until success or failure is indicated.
a) Select an as yet untried operator O that is applicable to the current
difference. If there are no such operators then signal failure.
b) Attempt to apply O to the current state. Generate descriptions of two
states O-START a state in which O’s preconditions are satisfied and O-
RESULT, the state that would result if O were applied in O-START.
c) If FIRST-PART = MEA(CURRENT, O-START) and LAST-PART =
MEA(O-RESULT, GOAL) are both successful, then signal success.
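A minimal Python sketch of the FIRST-PART / LAST-PART recursion above, over STRIPS-style operators, is shown below. The operator format (precond / add / delete) and the PUSH(desk) example are illustrative assumptions, not taken from the difference table.

# Recursive means-ends analysis over states represented as frozensets of facts.
def apply_op(state, op):
    return frozenset((state - op["delete"]) | op["add"])

def mea(current, goal, operators, depth=10):
    if goal <= current:                      # 1. no differences left between CURRENT and GOAL
        return []
    if depth == 0:
        return None                          # give up: signal failure
    for diff in goal - current:              # 2. pick an outstanding difference
        for op in operators:                 # 2.a an untried operator relevant to that difference
            if diff in op["add"]:
                first = mea(current, op["precond"], operators, depth - 1)   # FIRST-PART
                if first is None:
                    continue
                mid = current
                for o in first:
                    mid = apply_op(mid, o)
                mid = apply_op(mid, op)      # 2.b apply O in O-START, giving O-RESULT
                last = mea(mid, goal, operators, depth - 1)                 # LAST-PART
                if last is not None:
                    return first + [op] + last
    return None                              # no operator reduced the difference: failure

# Illustrative one-operator example for the desk-moving task.
ops = [{"name": "PUSH(desk)",
        "precond": {"at(desk, roomA)"},
        "add": {"at(desk, roomB)"},
        "delete": {"at(desk, roomA)"}}]
plan = mea(frozenset({"at(desk, roomA)"}), {"at(desk, roomB)"}, ops)
print([o["name"] for o in plan])             # ['PUSH(desk)']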
Issues in Knowledge Representation
1. Important Attributes:
Are there any attributes of objects so basic that they occur in almost every problem
domain?
2. Relationships among attributes:
Are there any important relationships that exist among object attributes?
3. Choosing Granularity:
At what level of detail should the knowledge be represented?
4. Sets of objects:
How should sets of objects be represented?
5. Finding the Right Structure:
Given a large amount of stored knowledge, how can the relevant parts be
accessed when they are needed?
1. Important Attributes :
There are attributes that are of general significance.
There are two attributes "instance" and "isa” that are of general importance.
These attributes are important because they support property inheritance.
“IsA” and “Instance”
Consider the statement “Joe is a musician”. Here "is a" (called IsA) is a way of expressing
what logically is called a class-instance relationship between the entities represented by the
terms "Joe" and "musician".
a. Inverses:
This is about performing a consistency check when a value is added to one attribute.
Entities are related to each other in many different ways. The figure below
shows three attributes (isa, instance, and team), each with a directed arrow
originating at the object being described and terminating at its value.
There are two ways of realizing this:
First, represent both relationships in a single representation; e.g., a logical
representation, team(Pee-Wee-Reese, Brooklyn-Dodgers), that can be
interpreted equally as a statement about Pee-Wee-Reese or about the Brooklyn-Dodgers.
Second, use attributes that focus on a single entity but use them in pairs, one the
inverse of the other; e.g., team = Brooklyn-Dodgers stored with Pee-Wee-Reese, and
team-members = Pee-Wee-Reese stored with the Brooklyn-Dodgers.
This second approach is followed in semantic net and frame-based systems,
accompanied by a knowledge acquisition tool that guarantees the consistency
of inverse slots by checking, each time a value is added to one attribute, that the
corresponding value is added to the inverse.
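A small Python sketch of this second approach is given below, assuming team / team-members as the pair of inverse attribute names; the helper function and the data layout are illustrative.

# Inverse attribute pairs with an automatic consistency check.
INVERSE = {"team": "team-members", "team-members": "team"}

kb = {
    "Pee-Wee-Reese": {},
    "Brooklyn-Dodgers": {},
}

def add_value(entity, attribute, value):
    kb.setdefault(entity, {}).setdefault(attribute, set()).add(value)
    inv = INVERSE.get(attribute)
    if inv is not None:                      # consistency check: also record the inverse link
        kb.setdefault(value, {}).setdefault(inv, set()).add(entity)

add_value("Pee-Wee-Reese", "team", "Brooklyn-Dodgers")
print(kb["Brooklyn-Dodgers"]["team-members"])   # {'Pee-Wee-Reese'}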
Figure: in a rule-based system, facts in working memory are matched against the rules by the
inference engine, which adds newly derived facts back to working memory.
2. Backward chaining :
It is also called goal driven.
It starts with something to find out, and looks for rules that will help in
answering it.
It starts from Goal back to Initial state
Backward chaining means reasoning from goals back to facts.
The idea is to focus on the search.
Rules and facts are processed using a backward-chaining interpreter, a minimal sketch of which follows.
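The sketch below is an illustrative backward-chaining interpreter; the rules and facts are made-up examples, not taken from the notes.

# Backward chaining: reason from the goal back towards known facts.
def backward_chain(goal, facts, rules, depth=10):
    if goal in facts:                       # the goal is already a known fact
        return True
    if depth == 0:
        return False
    for antecedents, consequent in rules:
        if consequent == goal:              # a rule whose THEN part could establish the goal
            if all(backward_chain(a, facts, rules, depth - 1) for a in antecedents):
                return True                 # every sub-goal was established
    return False

rules = [(["has_fur", "gives_milk"], "is_mammal"),
         (["is_mammal", "eats_meat"], "is_carnivore")]
facts = {"has_fur", "gives_milk", "eats_meat"}
print(backward_chain("is_carnivore", facts, rules))   # True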
Weak and Strong Slot-and-Filler Structures for Knowledge Representation
1. Semantic Nets
"Semantic Nets" were first invented for computers by Richard H. Richens of
the Cambridge Language Research Unit in 1956
A semantic network, or frame network, is a network that represents semantic relations
between concepts.
This is often used as a form of knowledge representation.
It is a directed or undirected graph consisting of vertices, which represent concepts,
and edges, which represent semantic relations between concepts.
It is used to analyse the meaning of words within a sentence.
It is graphically shown in the form of directed graph consisting of nodes and arcs.
The nodes represent objects and arcs represent links or edges.
Semantic networks are an alternative to predicate logic as a form of knowledge
representation.
The idea is that we can store our knowledge in the form of a graph, with nodes
representing objects in the world, and arcs representing relationships between those
objects.
Example
The semantic net for this example represents the following data:
o Tom is a cat.
o Tom caught a bird.
o Tom is owned by John.
o Tom is ginger in colour.
o Cats like cream.
o The cat sat on the mat.
o A cat is a mammal.
o A bird is an animal.
o All mammals are animals.
o Mammals have fur
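The same facts can be stored as (node, relation, node) triples; the Python sketch below is illustrative and adds a simple query function that inherits properties along isa links.

# Semantic net as a list of triples with property inheritance along "isa".
triples = [
    ("Tom", "isa", "cat"), ("Tom", "caught", "bird"), ("Tom", "owned_by", "John"),
    ("Tom", "colour", "ginger"), ("cat", "likes", "cream"), ("cat", "sat_on", "mat"),
    ("cat", "isa", "mammal"), ("bird", "isa", "animal"),
    ("mammal", "isa", "animal"), ("mammal", "has", "fur"),
]

def value(node, relation):
    """Look up a relation on a node, inheriting along the isa hierarchy."""
    for s, r, o in triples:
        if s == node and r == relation:
            return o
    for s, r, o in triples:                 # climb one isa link and retry
        if s == node and r == "isa":
            return value(o, relation)
    return None

print(value("Tom", "has"))     # 'fur' (inherited: Tom isa cat isa mammal, mammals have fur)
print(value("Tom", "colour"))  # 'ginger' (stored directly on Tom)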
2. Frames
Frames were proposed by Marvin Minsky in his 1974 article "A Framework for
Representing Knowledge."
Frames were originally derived from semantic networks and are therefore part of
structure based knowledge representations.
Frame is a collection of attributes and associated values that describe some entity in
the world.
A frame is a record like structure which consists of a collection of attributes and its
values to describe an entity in the world.
Frames are an AI data structure which divides knowledge into substructures by
representing stereotyped situations.
It consists of a collection of slots and slot values. These slots may be of any type and
sizes. Slots have names and values which are called facets.
Facets: The various aspects of a slot is known as Facets.
Facets are features of frames which enable us to put constraints on the frames.
A frame may consist of any number of slots, and a slot may include any number of
facets and facets may have any number of values.
A frame is also known as slot-filler knowledge representation in artificial
intelligence.
Frames are derived from semantic networks and later evolved into our modern-day
classes and objects.
A single frame is not very useful by itself. A frame system consists of a collection of frames
which are connected.
In the frame, knowledge about an object or event can be stored together in the
knowledge base.
The frame is a type of technology which is widely used in various applications
including natural language processing and machine vision.
Example
Let us suppose we are taking an entity, Peter. Peter is a doctor by profession, his age is 25,
he lives in the city of London, and his country is England. The following is a frame
representation for this:
Slot: Filler
Name: Peter
Profession: Doctor
Age: 25
Weight: 78
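A minimal Python sketch of this frame as a dictionary is given below; the isa link to a Person frame is an illustrative assumption added only to show slot-value inheritance.

# A frame system as nested dictionaries with inheritance through "isa".
frames = {
    "Person": {"legs": 2},
    "Peter":  {"isa": "Person", "profession": "Doctor", "age": 25, "weight": 78},
}

def get_slot(frame, slot):
    """Return a slot's filler, inheriting from the parent frame via 'isa'."""
    while frame is not None:
        if slot in frames[frame]:
            return frames[frame][slot]
        frame = frames[frame].get("isa")    # climb to the parent frame
    return None

print(get_slot("Peter", "profession"))  # Doctor (stored directly on the Peter frame)
print(get_slot("Peter", "legs"))        # 2 (inherited from the Person frame)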
Advantages of frame representation:
1. The frame knowledge representation makes the programming easier by grouping the
related data.
2. The frame representation is comparably flexible and used by many applications in AI.
3. It is very easy to add slots for new attribute and relations.
4. It is easy to include default data and to search for missing values.
5. Frame representation is easy to understand and visualize.
3. Conceptual Dependency(CD)
CD theory was developed by Schank between 1973 and 1975 to represent the meaning of
natural language sentences.
It helps in drawing inferences
It is independent of the language
The CD representation of a sentence is not built using the words in the sentence; rather, it is
built using conceptual primitives which give the intended meanings of the words.
CD provides structures and a specific set of primitives from which representations can
be built.
Conceptual dependency (CD) is a theory of natural language processing which mainly
deals with representation of semantics of a language.
It helps to construct computer programs which can understand natural language.
It helps to make inferences from statements and also to identify conditions in
which two sentences can have a similar meaning.
It provides facilities for the system to take part in dialogues and answer questions.
It provides a means of representation which is language independent.
Knowledge is represented in CD by elements called conceptual structures. The basis of CD
representation is that for two sentences which have identical meaning there must be only one
representation, and implicitly packed information must be stated explicitly.
In order that knowledge is represented in CD form, certain primitive actions have
been developed.
Table: Primitive Acts of CD
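The standard primitive ACTs defined by Schank are:
ATRANS – Transfer of an abstract relationship (e.g. give)
PTRANS – Transfer of the physical location of an object (e.g. go)
PROPEL – Application of physical force to an object (e.g. push)
MOVE – Movement of a body part by its owner (e.g. kick)
GRASP – Grasping of an object by an actor (e.g. clutch)
INGEST – Ingestion of an object by an animal (e.g. eat)
EXPEL – Expulsion of something from an animal's body (e.g. cry)
MTRANS – Transfer of mental information (e.g. tell)
MBUILD – Construction of new information from old (e.g. decide)
SPEAK – Production of sounds (e.g. say)
ATTEND – Focusing of a sense organ toward a stimulus (e.g. listen)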
Six primitive conceptual categories provide building blocks which are the
set of allowable dependencies in the concepts in a sentence:
PP -- Real world objects.
ACT -- Real world actions.
PA -- Attributes of objects.
AA -- Attributes of actions.
T -- Times.
LOC -- Locations.
Few conventions:
o Arrows indicate directions of dependency
o Double arrow indicates two way links between actor and action.
o O – for the object case relation
o R – for the recipient case relation
o P – for past tense
o D – destination
The triple arrow is also a two-way link, but between an object (PP) and its
attribute (PA), i.e. PP ⇛ PA.
It represents isa-type dependencies, e.g.
Dave ⇛ lecturer
o Dave is a lecturer.
Sentences and their CD representations (the arrow diagrams of the original are summarized below):
o Jenny cried: Jenny ⇔ EXPEL (p), object: tears, direction: from Jenny's eyes to an unknown destination.
o Mike went to India: Mike ⇔ PTRANS (p), direction: from an unknown source to India.
o Mary read a novel: Mary ⇔ MTRANS (p), object: information, direction: from the novel to CP(Mary),
instrument: Mary ATTEND eyes to the novel.
Q. Explain Conceptual Dependency with various primitives and show conceptual dependency
relation for the following- 1. Seema is a teacher 2. A nice flower (KBCNMU December 2019
Examination)
4. Script
Scripts were developed by Schank and Abelson in 1977.
A script is a structured representation describing a stereotyped sequence of events in a
particular context.
Scripts are used in natural language understanding systems to organize a knowledge
base in terms of the situations that the system should understand
A script is a structure that prescribes a set of circumstances which could be expected
to follow on from one another.
It is similar to a thought sequence or a chain of situations which could be anticipated.
It could be considered to consist of a number of slots or frames but with more
specialised roles.
Scripts are beneficial because:
Events tend to occur in known runs or patterns.
Causal relationships between events exist.
Entry conditions exist which allow an event to take place.
Prerequisites exist for events taking place, e.g. when a student progresses
through a degree scheme or when a purchaser buys a house.
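A minimal Python sketch of a script stored as a frame-like dictionary is shown below, using the classic restaurant script of Schank and Abelson; the slot names are illustrative assumptions.

# A script as a structured, stereotyped sequence of events.
restaurant_script = {
    "entry_conditions": ["customer is hungry", "customer has money"],
    "props":  ["tables", "menu", "food", "bill", "money"],
    "roles":  ["customer", "waiter", "cook", "cashier"],
    "scenes": {
        "entering": ["customer enters", "customer finds a table", "customer sits down"],
        "ordering": ["waiter brings menu", "customer orders food"],
        "eating":   ["cook prepares food", "waiter brings food", "customer eats"],
        "exiting":  ["waiter brings bill", "customer pays cashier", "customer leaves"],
    },
    "results": ["customer is not hungry", "customer has less money"],
}

# The stereotyped sequence lets a system fill in events a story never states explicitly,
# e.g. that the customer ate, given only "John went to a restaurant and ordered a pizza."
print(restaurant_script["scenes"]["eating"])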
Min – Max Search Algorithm
o The working of the Min – Max algorithm can be easily described using an
example.
o Below we have taken an example of game-tree which is representing the two-
player game.
o In this example, there are two players; one is called Maximizer and the other is
called Minimizer.
o Maximizer will try to get the Maximum possible score, and Minimizer will
try to get the Minimum possible score.
o This algorithm applies DFS, so in this game-tree, we have to go all the way
through the leaves to reach the terminal nodes.
o At the terminal nodes, the terminal values are given, so we will compare those
values and backtrack the tree until the initial state is reached.
o Following are the main steps involved in solving the two-player game tree:
o Step 1:-
In the first step, the algorithm generates the entire game-tree and applies
the utility function to get the utility values for the terminal states.
In the tree diagram below, let us take A as the initial state of the tree.
Suppose the Maximizer takes the first turn, which has a worst-case initial value
of -infinity, and the Minimizer takes the next turn, which has a worst-case
initial value of +infinity.
o Step 2:-
Now, first we find the utility values for the Maximizer. Its initial value
is -∞, so we compare each terminal value with the Maximizer's initial value
and determine the values of the higher nodes.
It finds the maximum among them all.
For node D: max(-1, 4) = 4
For node E: max(2, 6) = 6
For node F: max(-3, -5) = -3
For node G: max(0, 7) = 7
o Step 3:-
In the next step, it is the Minimizer's turn, so it will compare all node
values with +∞ and find the third-layer node values.
For node B= min(4,6) = 4
For node C= min (-3, 7) = -3
o Step 4:-
Now it is the Maximizer's turn again; it will choose the maximum
of all node values and find the value of the root node.
For node A max(4, -3)= 4
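A minimal Python sketch of this procedure on the example tree is given below; the nested-list encoding of the game tree (with the terminal values -1, 4, 2, 6, -3, -5, 0 and 7 used in the steps above) is an assumption standing in for the missing diagram.

# Min-Max: DFS down to the terminal values, then back the values up the tree.
def minimax(node, maximizing):
    if not isinstance(node, list):          # terminal node: return its utility value
        return node
    values = [minimax(child, not maximizing) for child in node]
    return max(values) if maximizing else min(values)

# A = [B, C], B = [D, E], C = [F, G], with the leaf values used in steps 2-4 above.
tree = [[[-1, 4], [2, 6]], [[-3, -5], [0, 7]]]
print(minimax(tree, maximizing=True))       # 4, as computed for node A above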
Q. Explain Min – Max Search Algorithm with suitable example (KBCNMU December 2019
Examination)
Alpha – Beta Pruning
In Min – Max search, the time required for searching increases exponentially as
the tree gets deeper. As it uses the DFS technique, all the nodes / paths have to be
traversed whether they are promising or not.
Alpha-beta pruning is a modified version of the Min - Max algorithm. It is an
optimization technique for the Min - Max algorithm.
As we have seen, the number of game states the algorithm has to examine
is exponential in the depth of the tree.
We cannot eliminate the exponent, but we can effectively cut it in half.
Hence there is a technique by which without checking each node of the game tree
we can compute the correct decision, and this technique is called Pruning.
This involves two threshold parameters Alpha and Beta so it is called Alpha-Beta
Pruning.
Alpha-beta pruning can be applied at any depth of a tree, and sometimes it prunes not only
the tree leaves but also entire sub-trees.
The two parameters can be defined as:
a) Alpha: The best (highest-value) choice we have found so far at any point along
the path of Maximizer. The initial value of alpha is -∞.
b) Beta: The best (lowest-value) choice we have found so far at any point along the
path of Minimizer. The initial value of beta is +∞.
Alpha-beta pruning returns the same move as the standard Min – Max algorithm,
but it removes all the nodes which do not really affect the final decision and only
make the algorithm slow. By pruning these nodes, it makes the algorithm faster.
Condition for Alpha – Beta Pruning
α>=β
Example
Step 1: In the first step, the Max player starts the first move from node A, where
α = -∞ and β = +∞. These values of alpha and beta are passed down to node B, where
again α = -∞ and β = +∞, and node B passes the same values to its child D.
Step 2: At node D, the value of α is calculated, as it is Max's turn. The
value of α is compared first with 2 and then with 3; max(2, 3) = 3 becomes the
value of α at node D, and the node value is also 3.
Step 3: Now the algorithm backtracks to node B, where the value of β changes, as
it is Min's turn. β = +∞ is compared with the available successor node value,
i.e. min(∞, 3) = 3; hence at node B, α = -∞ and β = 3.
Step 4: In the next step, the algorithm traverses the next successor of node B, which is
node E, and the values α = -∞ and β = 3 are passed down. At node E, Max takes
its turn and the value of alpha changes. The current value of alpha is
compared with 5, so max(-∞, 5) = 5; hence at node E, α = 5 and β = 3. Since α >= β,
the right successor of E is pruned and the algorithm does not traverse it; the
value at node E is 5.
Step 5: In the next step, the algorithm again backtracks the tree, from node B to node A. At
node A the value of alpha changes; the maximum available value is 3, as max(-∞, 3) = 3,
and β = +∞. These two values are now passed to the right successor of A, which is
node C. At node C, α = 3 and β = +∞, and the same values are passed on to node F.
Step 6: At node F, the value of α is again compared, first with the left child, which is 0,
giving max(3, 0) = 3, and then with the right child, which is 1, giving max(3, 1) = 3.
α remains 3, but the node value of F becomes 1.
Step 7: Node F returns the node value 1 to node C. At C, α = 3 and β = +∞; here
the value of beta changes, as it is compared with 1, so min(∞, 1) = 1. Now
at C, α = 3 and β = 1, and again the condition α >= β is satisfied, so the next child of
C, which is G, is pruned, and the algorithm does not compute the entire sub-
tree G.
Step 8: C now returns the value 1 to A. Here the best value for A is max(3, 1) =
3. The final game tree shows the nodes which were computed and the nodes which
were never computed. Hence the optimal value for the
Maximizer in this example is 3.
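A minimal Python sketch of alpha-beta pruning on this example is given below; the tree shape and the leaf values for the pruned branches are assumptions chosen so that the run reproduces the pruning steps described above.

# Alpha-beta pruning added to the Min-Max recursion.
def alphabeta(node, maximizing, alpha=float("-inf"), beta=float("inf")):
    if not isinstance(node, list):           # terminal node: return its utility value
        return node
    if maximizing:
        best = float("-inf")
        for child in node:
            best = max(best, alphabeta(child, False, alpha, beta))
            alpha = max(alpha, best)
            if alpha >= beta:                # pruning condition: skip the remaining children
                break
        return best
    best = float("inf")
    for child in node:
        best = min(best, alphabeta(child, True, alpha, beta))
        beta = min(beta, best)
        if alpha >= beta:                    # pruning condition on the Min side
            break
    return best

# A = [B, C]; B = [D, E]; C = [F, G]; D = [2, 3], F = [0, 1] as in the steps above,
# while the pruned leaves of E and the whole of G get arbitrary placeholder values.
tree = [[[2, 3], [5, 9]], [[0, 1], [7, 5]]]
print(alphabeta(tree, maximizing=True))      # 3, with E's right child and sub-tree G pruned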
Q. Explain Alpha – Beta Pruning with suitable example (KBCNMU December 2019 Examination)
Min – Max Search with Additional Refinements
1. Waiting for Quiescence
If we consider the tree in fig 12.7, node B looks like the most promising node. But if
we explore the tree up to one more level, our decision changes drastically. As shown
in fig 12.8, when node B is further explored, the value of node B changes and B no
longer looks like the best node.
To avoid such situations, we should continue the search until no drastic change occurs
from one level to the next. This is called Waiting for Quiescence.
2. Secondary Search
It is always better to double check that a particular chosen move always proves to be
promising, irrespective of the depth of the tree.
Suppose we explored a particular tree up to level 6 and based on it we decided to
choose a particular move.
But while making this decision it is always better to double check our decision.
Although it would be expensive to search the complete tree up to two more levels, i.e.
level 8, it is feasible to at least search a single branch of the tree up to an
additional two levels to make sure that our decision still looks good. This is called
Secondary Search.
Planning
Planning is the process of computing several steps of a problem-solving procedure before
executing any of them.
Planning plays a major role in Artificial Intelligence, because in AI we have to exploit
knowledge in the proper direction so that we do not reach dead ends (halting positions).
The task of finding sequences of actions in a state space where the states have logical
representations is called AI planning.
Types of Planning
1. Linear Planning:
In the linear planning process, we decompose a complex problem into a number of
sub-problems in such a way that all the sub-problems (sub-plans) are isolated from each
other (logically separate).
Figure: a linear plan - a sequence of actions A1, A2, A3 leading from the initial state to the final state.
2. Non-Linear Planning:
In non-linear planning, the sub-problems are not independent; the sub-plans may interact,
so steps belonging to different sub-plans may have to be interleaved.
3. Hierarchical Planning:
It is a combination of both Linear & Non Linear planning. In this we decompose the
task in such a way that, some sub plans are dependent and some are independent.
4. Reactive Systems:
Here the system gives a response to a particular action, so the execution of
the system depends on a particular signal.
e.g. a thermostat
5. Triangular Table Planning:
It provides a way of recording the goals that each operator is expected to satisfy.
If something unexpected happens during execution of the plan, the table provides
the information required to patch the plan.
Block World Problem
In the Block World Problem, there is a flat surface on which blocks can be placed.
There are a number of square blocks, all of the same size.
They can be stacked one upon another.
There is a robot that can manipulate the blocks.
Q. State and Explain various predicates and actions used in block world problem (KBCNMU
December 2019 Examination)
STRIPS
STRIPS stands for "STanford Research Institute Problem Solver".
It is one of the mechanisms to solve the block world problem.
In this approach, each operation is described by three types of lists:
o PRECONDITION: This list contains those predicates that must be true for the
operator to be applied.
o ADD: This list contains the new predicates that the Operator causes to become
true.
o DELETE: This list contains old predicates that the Operator causes to become
false.
In most of the cases, the PRECONDITION list is often identical to the DELETE list.
For example, to pick up a block, a PRECONDITION is that the robot arm must be
empty. As soon as the robot picks up the block, the arm is no longer empty.
In this case the PRECONDITION and the DELETE lists will be the same.
But there is one more PRECONDITION for picking up a block: the block must not
have another block on top of it. This PRECONDITION remains true even after
picking up the block, because there will still be nothing on top of that block.
This is the reason that PRECONDITION and DELETE list must be maintained
separately.
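A minimal Python sketch of STRIPS-style operators for the block-world actions used in the trace below is given here; the exact contents of the ADD and DELETE lists follow the usual textbook definitions and should be read as an illustrative sketch.

# STRIPS operators: each one is a PRECONDITION, ADD and DELETE list.
def make_op(name, precond, add, delete):
    return {"name": name, "precond": set(precond), "add": set(add), "delete": set(delete)}

def unstack(x, y):
    return make_op(f"UNSTACK({x},{y})",
                   precond=[f"ON({x},{y})", f"CLEAR({x})", "ARMEMPTY"],
                   add=[f"HOLDING({x})", f"CLEAR({y})"],
                   delete=[f"ON({x},{y})", "ARMEMPTY"])

def stack(x, y):
    return make_op(f"STACK({x},{y})",
                   precond=[f"HOLDING({x})", f"CLEAR({y})"],
                   add=[f"ON({x},{y})", "ARMEMPTY"],
                   delete=[f"HOLDING({x})", f"CLEAR({y})"])

def pickup(x):
    return make_op(f"PICKUP({x})",
                   precond=[f"ONTABLE({x})", f"CLEAR({x})", "ARMEMPTY"],
                   add=[f"HOLDING({x})"],
                   delete=[f"ONTABLE({x})", "ARMEMPTY"])

def apply_op(state, op):
    assert op["precond"] <= state, "the PRECONDITION list must hold before the operator is applied"
    return (state - op["delete"]) | op["add"]

# Assumed initial world for the trace below: B is on C; A, C and D are on the table.
state = {"ON(B,C)", "ONTABLE(A)", "ONTABLE(C)", "ONTABLE(D)",
         "CLEAR(A)", "CLEAR(B)", "CLEAR(D)", "ARMEMPTY"}
state = apply_op(state, unstack("B", "C"))   # HOLDING(B) and CLEAR(C) now hold
state = apply_op(state, stack("B", "D"))     # ON(B,D) now holds, the arm is empty again
print(sorted(state))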
Goal Stack Planning example: the goal is a conjunction of predicates including ON(B,D) and ON(C,A); initially B is on C, while A, C and D are on the table. Push the compound goal onto the stack, and then push the individual predicates of the goal onto the stack.
The predicate ON(B,D) is not true in the current world, so push the action STACK(B,D) onto the stack, and then push the preconditions of STACK(B,D) onto the stack.
The popped element is HOLDING (B) which is a predicate and note that it is not true
in our current world. So we have to push the relevant action into the stack.
Let’s push the action UNSTACK (B,C) into the stack.
Now push the individual precondition of UNSTACK (B,C) into the stack.
Pop the stack. Note that on popping we see that ON(B,C), CLEAR(B) and ARMEMPTY
are true in our current world, so we do nothing.
Again pop elements from the stack. First UNSTACK(B,C) is popped; it is an action, so apply it
to the current state and add it to the PLAN. After its result HOLDING(B) and the remaining
preconditions are verified, STACK(B,D) is popped, applied and added to the PLAN.
PLAN = { UNSTACK(B,C), STACK(B,D) }
Now the stack will look like the one given below and our current world is like the one
above.
Again pop the stack. The popped element is a predicate and it is not true in our current
world so push the relevant action into the stack.
STACK(C,A) is pushed now into the stack and now push the individual preconditions
of the action into the stack.
Now pop the stack. We will get CLEAR(A) and it is true in our current world so do
nothing. Next element that is popped is HOLDING(C) which is not true so push the
relevant action into the stack.
In order to achieve HOLDING(C) we have to push the action PICKUP(C) and its
individual preconditions into the stack.
Now, on popping, we get ONTABLE(C), which is true in our current world. Next
CLEAR(C) is popped, and that also is achieved. Then PICKUP(C) is popped, which is
an action, so apply it to the current world and add it to the PLAN. The world model
and stack will look like the ones below.
Again pop the stack; we get STACK(C,A), which is an action, so apply it to the
world and add it to the PLAN.
PLAN = { UNSTACK(B,C), STACK(B,D), PICKUP(C), STACK(C,A) }
Now pop the stack; we get CLEAR(C), which is already achieved in our current
situation, so we do not need to do anything. At last, when we pop the final element, we
get the compound goal, all of whose subgoals are now true, and our PLAN contains all the necessary operators.
Q. Write various STRIPS style operators for bock world problem with example (KBCNMU
December 2019 Examination)
2. Syntactic Analysis
Linear sequence of words is transformed into structures that show how the words
relate to each other.
Some word sequences may be rejected if they violate the language’s rules for how
the words may be combined.
For example the English Syntactic Analyser would reject the sentence “Boy the go the
to store”
Syntactic Analysis exploits the results of morphological analysis to build a structural
description of a sentence.
The goal of this process called “Parsing”, is to convert the flat list of words that form
a sentence into a structure that defines the units that are represented by that flat list of
words.
The following figure shows the parse tree for the sentence "I want to print Bill's .init
file."
3. Semantic Analysis
Here the structures created by syntactic analysis are assigned meanings.
In other words, mapping is made between the syntactic structures and the objects in
the task domain.
Semantic Analysis must do two things:
o It must map individual words into appropriate objects in the knowledge base
or database.
o It must create correct structures to correspond to the way the meanings of
individual words combine with each other.
4. Discourse Integration
Consider the sentence "I want to print Bill's .init file."
In this sentence, we do not know exactly to whom the pronoun "I" or the proper
noun "Bill" refers.
To pin down these references, we need a model of discourse integration from which
we can learn who the current user “I” is and who the person named “Bill” is.
Once the current referent of Bill is known, we can also find the reference to the actual
file whose extension is .init.
5. Pragmatic Analysis
Pragmatic Analysis is part of the process of extracting information from text.
Specifically, it is the portion that focuses on taking a structured set of text and figuring
out what the actual meaning was.
Pragmatic Analysis is very important with respect to extracting the context of the
information.
Many times the context in which the sentence was said or written is very important.
So Pragmatic Analysis becomes a crucial part of NLP process.
Learning Techniques
1. Rote Learning
Rote learning is defined as the memorization of information based on repetition.
The two best examples of rote learning are the alphabet and numbers.
Slightly more complicated examples include multiplication tables and spelling words.
At the high-school level, scientific elements and their chemical numbers must be
memorized by rote.
And, many times, teachers use rote learning without even realizing they do so.
Memorization isn’t the most effective way to learn, but it’s a method many students
and teachers still use.
A common rote learning technique is preparing quickly for a test, also known as
cramming.
When rote memorization is applied as the main focus of learning, it is not considered
higher-level thought or critical thinking.
Opponents of rote memorization argue that creativity in students is stunted and
suppressed, and students do not learn how to think, analyze or solve problems.
These educators believe, instead, that a more associative or constructive learning
should be applied in the classroom. If the majority of the student’s day is spent on
repetition, the foundation for learning becomes shaky.
Rote learning is the cornerstone of higher-level thinking and should not be ignored.
Especially in today's advanced technological world, rote memorization might be even
more important than ever.
If you can easily access the information when performing a certain task, the brain is
free to make major leaps in learning.
Expert System
1. Knowledge Base
It contains domain-specific and high-quality knowledge. Knowledge is required to
exhibit intelligence. The success of any ES depends largely upon the collection of
highly accurate and precise knowledge.
Data is a collection of facts. The information is organized as data and facts about the
task domain. Data, information, and past experience combined together are termed
as knowledge.
The knowledge base of an ES is a store of both, factual and heuristic knowledge.
1. Factual Knowledge − It is the information widely accepted by the Knowledge
Engineers and scholars in the task domain.
2. Heuristic Knowledge − It is about practice, accurate judgement, one’s ability of
evaluation, and guessing.
Knowledge representation is the method used to organize and formalize the
knowledge in the knowledge base. It is in the form of IF-THEN-ELSE rules
The success of any expert system majorly depends on the quality, completeness, and
accuracy of the information stored in the knowledge base.
The knowledge base is formed by readings from various experts, scholars, and
the Knowledge Engineers
2. Inference Engine
Use of efficient procedures and rules by the Inference Engine is essential in deducing
a correct, flawless solution.
In case of knowledge-based ES, the Inference Engine acquires and manipulates the
knowledge from the knowledge base to arrive at a particular solution.
In case of rule based ES, it −
Applies rules repeatedly to the facts, which are obtained from earlier rule
application.
Adds new knowledge into the knowledge base if required.
Resolves rule conflicts when multiple rules are applicable to a particular case.
To recommend a solution, the Inference Engine uses the following strategies −
I. Forward Chaining
II. Backward Chaining
Forward Chaining
It is a strategy of an expert system to answer the question, “What can happen
next?”
Here, the Inference Engine follows the chain of conditions and derivations and
finally deduces the outcome.
It considers all the facts and rules, and sorts them before concluding to a solution.
This strategy is followed for working on conclusion, result, or effect.
For example, prediction of share market status as an effect of changes in interest
rates.
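A minimal Python sketch of forward chaining is shown below; the interest-rate rules are made-up examples echoing the share-market illustration above.

# Forward chaining: apply rules repeatedly to the working-memory facts
# until no rule adds anything new.
def forward_chain(facts, rules):
    facts = set(facts)
    changed = True
    while changed:                          # keep firing rules on newly derived facts
        changed = False
        for antecedents, consequent in rules:
            if set(antecedents) <= facts and consequent not in facts:
                facts.add(consequent)       # fire the rule: add new knowledge
                changed = True
    return facts

rules = [(["interest_rates_fall"], "borrowing_increases"),
         (["borrowing_increases"], "company_profits_rise"),
         (["company_profits_rise"], "share_prices_rise")]
print(forward_chain({"interest_rates_fall"}, rules))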
3. User Interface
The user interface provides interaction between the user of the ES and the ES itself. It
generally uses natural language processing so that it can be used by a user who is well
versed in the task domain. The user of the ES need not necessarily be an expert in
Artificial Intelligence.
It explains how the ES has arrived at a particular recommendation. The explanation
may appear in the following forms −
Natural language displayed on screen.
Verbal narrations in natural language.
Listing of rule numbers displayed on the screen.
Requirements of Efficient ES User Interface
It should help users to accomplish their goals in shortest possible way.
It should be designed to work for user’s existing or desired work practices.
Its technology should be adaptable to user’s requirements; not the other way
round.
It should make efficient use of user input.
Q. Draw and Explain architecture of Expert System with its advantages (KBCNMU December
2019 Examination)
Expert System Shell
A new expert system can be developed by adding domain knowledge to the shell. The
figure depicts the generic components of an expert system.
Knowledge acquisition system: This is the first and fundamental component. It helps to collect
the experts' knowledge required to solve the problem and build the knowledge base.
Knowledge Base: This component is the heart of an expert system. It stores all factual
and heuristic knowledge about the application domain. It provides various
representation techniques for the data.
Inference mechanism: The inference engine is the brain of the expert system. This
component is mainly responsible for generating inferences from the knowledge
in the knowledge base and producing a line of reasoning which yields, in turn, the result
of the user's query.
User interface: It is the means of communication with the user. It decides the utility
of the expert system.
Initially, each expert system that was built was created from scratch. But after
several systems had been built in this way, it became clear that they had a lot in
common.
In particular, since the systems were built as a set of rules combined with an
interpreter for those rules, it was possible to separate the interpreter from
the domain-specific knowledge and thus create a system that could be used to
construct a new expert system by adding new knowledge corresponding to the
new problem domain. The resulting interpreters are called shells.
But over time, expert systems needed to do something else as well: they needed to be
easy to integrate with other programs. They need to access corporate databases, be
embedded in larger application programs, etc. So another important feature of an expert
system shell is that it provides an easy-to-use interface to other software.
Q. Write a short note on Expert System Shell (KBCNMU December 2019 Examination)
Waltz’s Algorithm
There are two important steps in the use of constraints in problem solving.
Consider, for example, a three-dimensional line drawing. The analysis process is to
determine the object described by the lines. The geometric relationships between
different types of line junctions help to determine the object types.
In Waltz's algorithm, labels are assigned to lines of various types. For example, concave edges
are produced by two adjacent touching surfaces which form a concavely viewed depth (an
angle of less than 180 degrees).
Conversely, convex edges produce a convexly viewed depth (greater than 180
degrees), and a boundary edge outlines a surface that obstructs other objects.
To label a concave edge, a minus sign is used.
Convex edges are labeled with a plus sign, and a right or left arrow is used to
label the boundary edges.
By restricting vertices to be the intersection of three object faces, it is possible to
reduce the number of basic vertex types to only four :
The L, The Arrow, The T, The Fork
When a three-dimensional object is viewed from all possible positions, the four
junction types, together with the valid edge labels, give rise to eighteen different
permissible junction configurations, as shown in the figure.
Geometric constraints, together with a consistent labeling scheme, can simplify the
object identification process.
Algorithm:
1. Find the lines at the border of the scene boundary and label them. These lines can
be found by finding an outline such that no vertices are outside it. We do this first
because this labeling will impose additional constraints on the other labelings in
the figure.
2. Number the vertices of the figure to be analyzed. These numbers will correspond
to the order in which the vertices will be visited during the labeling process. To
decide on a numbering, do the following:
a. Start at any vertex on the boundary of the figure. Since boundary lines are
known, the vertices involving them are more highly constrained than are
interior ones.
b. Move from the vertex along the boundary to an adjacent unnumbered vertex
and continue until all boundary vertices have been numbered.
c. Number interior vertices by moving from a numbered vertex to some adjacent
unnumbered one. By always labeling a vertex next to one that has already
been labeled, maximum use can be made of the constraints.
3. Visit each vertex V in order and attempt to label it by doing the following:
a. Using the set of possible vertex labelings given in the list of 18 physically
possible trihedral vertices, attach to V a list of its possible labelings.
b. See whether some of these labelings can be eliminated on the basis of local
constraints. To do this, examine each vertex A that is adjacent to V and
that has already been visited. Check to see that for each proposed labeling
for V, there is a way to label the line between V and A in such a way that
at least one of the labelings listed for A is still possible. Eliminate from
V's list any labeling for which this is not the case.
c. Use the set of labelings just attached to V to constrain the labels at vertices
adjacent to V. For each vertex A that was visited in the last step, do the
following:
i. Eliminate all labelings of A that are not consistent with at least
one labeling of V.
ii. If any labelings were eliminated, continue constraint propagation
by examining the vertices adjacent to A and checking for
consistency with the restricted set of labelings now attached to A.
iii. Continue to propagate until there are no adjacent labeled vertices or
until there is no change made to the existing set of labelings.
A set of labelling rules which facilitates this process can be developed for different
classes of objects. The following rules will apply for many polyhedral objects.
o The arrow should be directed to mark boundaries by traversing the object in a
clockwise direction.
o Unbroken lines should have the same label assigned at both ends.
o When a fork is labelled with a + edge, it must have all three edges label as + .
o Arrow junctions which have a + label on both barb edges must also have a +
label on the shaft.
These rules can be applied to a polygonal object as given in figure
Starting with any edge having an object face on its right, the external boundary is
labelled with the + in a clockwise direction. Interior lines are then labelled with + or -
consistent with the other labelling rules.
To see how Waltz's constraint satisfaction algorithm works, consider the line
drawing of a pyramid given in the figure below. At the right side of the pyramid are
all possible labellings for the four junctions A, B, C and D.
Using these labels as mutual constraints on connected junctions, permissible labels for
the whole pyramid can be determined. The constraint satisfaction procedure works
as follows.
Starting at an arbitrary junction, say A, a record of all permissible labels is made for
that junction. An adjacent junction is then chosen, say B, and labels which are
inconsistent with the line AB are then eliminated from the permissible A and B lists.
In this case, the line joining A and B can only be a +, a - or an up arrow; consequently,
two of the possible A labellings can be eliminated, and the remaining ones are as shown
in the figure.
Choosing junction C next, we find that the BC constraints are satisfied by all of the B
and C labellings, so no reduction is possible at this step. On the other hand, the line
AC must be labelled as a - or as an up-left arrow to be consistent. Therefore, an
additional label for A can be eliminated, reducing the remainder to the following.
This new restriction on A now permits the elimination of one B labelling to maintain
consistency, so only one permissible B labelling now remains.
This reduction, in turn, places a new restriction on BC, permitting the elimination of
one C label, since BC must now be labelled as a + only. This leaves the remaining C
labels as shown in the side diagram.
Moving now to junction D, we see that of the six possible D labellings, only three
satisfy the BD constraint of an up or down arrow.
Continuing with the above procedure, we see that further label eliminations are not
possible, since all constraints have been satisfied. The process is completed by finding
the different combinations of unique labellings that can be assigned to the figure. An
enumeration of the remaining labels shows that only three different labellings of the
figure are possible.
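The pruning in this walkthrough is essentially constraint propagation over junction labelings. Below is a minimal Python sketch of that step, assuming the candidate labelings for each junction and the names of the lines shared by adjacent junctions are supplied as plain dictionaries; building these inputs from a drawing, and handling the reversal of boundary-arrow directions between the two ends of a line, are not shown here.

from collections import deque

def waltz_filter(candidates, shared_line):
    # candidates:  junction name -> list of labelings; each labeling is a dict
    #              mapping an incident line name to a label such as '+', '-' or an arrow.
    # shared_line: (junction, junction) pair -> name of the line the pair shares.
    queue = deque()
    for a, b in shared_line:
        queue.append((a, b))
        queue.append((b, a))
    while queue:
        a, b = queue.popleft()
        line = shared_line.get((a, b)) or shared_line.get((b, a))
        # Keep a labeling of junction a only if some labeling of junction b
        # assigns the same label to the line they share.
        keep = [la for la in candidates[a]
                if any(la[line] == lb[line] for lb in candidates[b])]
        if len(keep) < len(candidates[a]):
            candidates[a] = keep
            # a was restricted, so all of a's neighbours must be re-checked.
            for x, y in shared_line:
                if a in (x, y):
                    queue.append((y if x == a else x, a))
    return candidates

Seeding candidates with the junction lists for A, B, C and D and calling waltz_filter mirrors the eliminations traced above for the pyramid.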
Q. Explain Waltz Algorithm with its limitations (KBCNMU December 2019 Examination)
Neural Network
Neural networks are multi-layer networks of neurons (the blue and magenta
nodes in the figure below) that we use to classify things, make predictions, etc.
Below is the diagram of a simple neural network with five inputs, five outputs, and two
hidden layers of neurons.
Starting from the left, we have:
1. The input layer of our model in orange.
2. Our first hidden layer of neurons in blue.
3. Our second hidden layer of neurons in magenta.
4. The output layer (a.k.a. the prediction) of our model in green.
The arrows that connect the nodes show how all the neurons are interconnected and
how data travels from the input layer all the way through to the output layer.
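As a minimal sketch of how data travels from the input layer through the hidden layers to the output layer, the snippet below pushes a five-element input through two hidden layers; the hidden-layer sizes, random weights and sigmoid activation are assumptions made only for illustration.

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(8, 5)), np.zeros(8)   # input layer   -> first hidden layer
W2, b2 = rng.normal(size=(6, 8)), np.zeros(6)   # first hidden  -> second hidden layer
W3, b3 = rng.normal(size=(5, 6)), np.zeros(5)   # second hidden -> output layer

def forward(x):
    # Each layer's output becomes the next layer's input.
    h1 = sigmoid(W1 @ x + b1)
    h2 = sigmoid(W2 @ h1 + b2)
    return sigmoid(W3 @ h2 + b3)

print(forward(np.array([1.0, 0.0, 0.5, 0.2, 0.9])))   # five output values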
Artificial neural networks draw much of their inspiration from the biological nervous
system, so it is useful to have some knowledge of the way that system is organized.
Most living creatures, which have the ability to adapt to a changing environment,
need a controlling unit which is able to learn. Higher developed animals and
humans use very complex networks of highly specialized neurons to perform this
task.
The control unit - or brain - can be divided in different anatomic and functional
sub-units, each having certain tasks like vision, hearing, motor and sensor control.
The brain is connected by nerves to the sensors and actors in the rest of the body.
The brain consists of a very large number of neurons, about 10^11 on average.
These can be seen as the basic building bricks for the central nervous system (CNS).
The neurons are interconnected at points called synapses. The complexity of the
brain is due to the massive number of highly interconnected simple units working in
parallel, with an individual neuron receiving input from up to 10000 others.
The neuron contains all structures of an animal cell. The complexity of the structure
and of the processes in a simple cell is enormous. Even the most sophisticated neuron
models in artificial neural networks seem comparatively toy-like.
Structurally the neuron can be divided into three major parts:
o The cell body (soma),
o The dendrites
o The axon
2. Topology:
All artificial layers compute one by one, that is one layer at a time, instead of
being part of a network that has nodes computing asynchronously.
Feed forward networks compute the state of one layer of artificial neurons and
their weights, and then use the results to compute the following layer the same
way.
During back-propagation, the algorithm propagates weight changes in the opposite
direction, to reduce the difference between the feed-forward results at the output
layer and the expected output values.
In artificial networks, layers are not connected to non-neighbouring layers.
Biological neurons, by contrast, can fire asynchronously and in parallel; the biological
network has a small portion of highly connected neurons (hubs) and a large number of
less connected ones.
3. Speed:
Certain biological neurons can fire around 200 times a second on average.
Signals travel at different speeds depending on the type of the nerve impulse.
Signal travel speeds also vary from person to person depending on their
gender, age, height, temperature, medical condition, lack of sleep etc.
Information in artificial neurons is instead carried over by the
continuous, floating point number values of synaptic weights.
4. Fault-tolerance:
Biological neuron networks due to their topology are also fault-tolerant.
Information is stored redundantly so minor failures will not result in memory
loss.
They don't have one "central" part. The brain can also recover and heal to an
extent.
Artificial neural networks are not modelled for fault tolerance or self-
regeneration.
5. Power consumption:
The brain consumes about 20% of all the human body's energy, whereas a
single Nvidia GeForce Titan X GPU alone runs on 250 watts and requires its own
power supply.
Our machines are way less efficient than biological systems.
Computers also generate a lot of heat when used, with consumer GPUs
operating safely between 50–80 degrees Celsius instead of 36.5–37.5 °C.
6. Learning:
We still do not understand how brains learn, or how redundant connections
store and recall information.
Brain fibres grow and reach out to connect to other neurons, neuroplasticity
allows new connections to be created or areas to move and change function,
and synapses may strengthen or weaken based on their importance.
Artificial neural networks on the other hand, have a predefined model,
where no further neurons or connections can be added or removed.
Only the weights of the connections can change during training.
The networks start with random weight values and slowly adjust them, trying to
reach a point where further changes in the weights would no longer improve
performance (a minimal weight-update sketch is given after this comparison).
Unlike the brain, artificial neural networks don't learn by recalling
information; they only learn during training, but will always "recall" the
same learned answers afterwards, without making a mistake.
The great thing about this is that "recalling" can be done on much weaker
hardware, as many times as we want.
It is also possible to use previously pretrained models and improve them by
training with additional examples that have the same input features.
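The idea of starting from random weights and slowly adjusting them until further changes no longer help can be illustrated with a toy, single-weight example; the model y = w * x, the data points and the learning rate below are made up purely for illustration.

import random

data = [(1.0, 2.1), (2.0, 3.9), (3.0, 6.2)]   # toy (x, y) pairs
w = random.uniform(-1.0, 1.0)                 # start from a random weight
lr = 0.01                                     # learning rate

for epoch in range(500):
    # Average gradient of the squared error, then step against it.
    grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
    w -= lr * grad

print(w)   # settles near 2; further updates barely change it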
Comparison of biological and artificial neural networks:
Criteria   | Biological Neural Network (BNN)                | Artificial Neural Network (ANN)
Structure  | Soma                                           | Node
           | Dendrites                                      | Input
           | Axon                                           | Output
Processing | Massively parallel, slow but superior to ANN   | Massively parallel, fast but inferior to BNN
Size       | About 10^11 neurons and 10^15 interconnections | 10^2 to 10^4 nodes, depending mainly on the type of application and the network designer
Learning   | They can tolerate ambiguity                    | Very precise, structured and formatted data is required
McCulloch-Pitts Neuron
The first computational model of a neuron was proposed by Warren McCulloch
(a neuroscientist) and Walter Pitts (a logician) in 1943.
It may be divided into two parts.
The first part, g, takes an input, performs an aggregation and, based on the aggregated
value, the second part, f, makes a decision.
Let's suppose that we want to predict whether John wants to watch a
random football game on TV or not.
The inputs are all Boolean, i.e. {0, 1}, and the output variable is also Boolean
{1: will watch it, 0: won't watch it}.
o So, x_1 could be isPremierLeagueOn (John likes the Premier League more)
o x_2 could be isItAFriendlyGame (John tends to care less about friendlies)
o x_3 could be isNotHome (John can't watch it when he is not at home)
o x_4 could be isManUnitedPlaying (John is a big Man United fan), and so on.
These inputs can either be excitatory or inhibitory.
Inhibitory inputs are those that have maximum effect on the decision making
irrespective of other inputs i.e., if x_3 is 1 (not home) then output will always be 0 i.e.,
the neuron will never fire, so x_3 is an inhibitory input.
Excitatory inputs are NOT the ones that will make the neuron fire on their own but
they might fire it when combined together. Formally, this is what is going on:
We can see that g(x) is just a sum of the inputs, a simple aggregation, and theta here is
called the thresholding parameter.
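The usual formal statement of this (a reconstruction of the formula the text refers to, not the original figure) is: g(x_1, x_2, ..., x_n) = x_1 + x_2 + ... + x_n, and y = f(g(x)) = 1 if g(x) >= theta, otherwise y = 0, with the extra rule that y = 0 whenever any inhibitory input is 1.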
Now let's look at how this very neuron can be used to represent a few Boolean
functions.
Our inputs are all Boolean and the output is also Boolean, so essentially the neuron
is just trying to learn a Boolean function.
A lot of Boolean decision problems, given appropriate input variables (such as whether
to continue reading, or whether to watch Friends afterwards), can be represented by
the M-P neuron.
2. AND Function
An AND function neuron fires only when ALL the inputs are ON (the threshold theta
equals the number of inputs; see the sketch after this list).
3. OR Function
An OR function neuron fires when AT LEAST ONE of the inputs is ON (threshold theta = 1).
4. NOT Function
A NOT function neuron has a single inhibitory input and fires only when that input is OFF.
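A minimal Python sketch of an M-P style unit and these Boolean functions follows; the helper name and the particular threshold values are chosen here only for illustration.

def mp_neuron(excitatory, inhibitory, theta):
    # Fire (return 1) only if no inhibitory input is on and the sum of the
    # excitatory inputs reaches the threshold theta.
    if any(inhibitory):
        return 0
    return 1 if sum(excitatory) >= theta else 0

# AND of two inputs: fires only when both are ON (theta = 2)
print(mp_neuron([1, 1], [], theta=2))   # 1
print(mp_neuron([1, 0], [], theta=2))   # 0

# OR of two inputs: fires when at least one is ON (theta = 1)
print(mp_neuron([0, 1], [], theta=1))   # 1

# NOT of one input: the input is inhibitory, theta = 0
print(mp_neuron([], [0], theta=0))      # 1 (input OFF -> fires)
print(mp_neuron([], [1], theta=0))      # 0 (input ON  -> inhibited)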
Learning Methods
1. Supervised Learning
If you are learning a task under supervision, with someone present who judges
whether you are getting the right answer, then such learning is called Supervised
Learning.
Similarly, in supervised learning, we have a full set of labelled data while
training an algorithm.
Fully labelled means that each example in the training dataset is tagged with
the answer the algorithm should come up with on its own.
For example, suppose you are given a basket filled with different kinds of
fruits.
Now the first step is to train the machine with all the different fruits one by one,
like this:
If the shape of the object is rounded with a depression at the top and its colour is red,
then it will be labelled as Apple.
If the shape of the object is a long curving cylinder with a green-yellow colour, then it
will be labelled as Banana.
Now suppose that, after training, the machine is given a new, separate fruit (say a
banana) from the basket and asked to identify it.
Since the machine has already learned from the previous data, it now has to use that
knowledge wisely.
It will first classify the fruit by its shape and colour, confirm the fruit name as
BANANA, and put it in the Banana category.
Thus the machine learns from the training data (the basket containing fruits)
and then applies that knowledge to the test data (the new fruit).
There are two main areas where supervised learning is useful:
o Classification problems
o Regression problems.
Classification problems ask the algorithm to predict a discrete value,
identifying the input data as a member of a particular class or group.
Regression problems, on the other hand, deal with continuous data, where the
algorithm must predict a continuous value.
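A tiny code illustration of classification: the fruit example above can be mimicked with a 1-nearest-neighbour rule. The numeric "roundness" and "yellowness" features and the handful of training examples are made up for illustration; no particular library is assumed.

# Each training example: (features, label); features = (roundness, yellowness)
train = [
    ((0.9, 0.1), "Apple"),    # round, red          -> Apple
    ((0.8, 0.2), "Apple"),
    ((0.2, 0.9), "Banana"),   # long, green-yellow  -> Banana
    ((0.1, 0.8), "Banana"),
]

def predict(features):
    # Label the new fruit the same as its closest training example.
    def dist(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b))
    return min(train, key=lambda ex: dist(ex[0], features))[1]

print(predict((0.15, 0.85)))   # -> "Banana"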
2. Unsupervised Learning
Now suppose instead that the machine is shown a collection of pictures of dogs and
cats that it has never seen before. The machine has no idea about the features of dogs
and cats, so it cannot categorize the pictures as dogs and cats.
But it can categorize them according to their similarities, patterns, and differences;
for example, it can easily split the pictures into two parts.
The first part may contain all the pictures having dogs in them, and the second part may
contain all the pictures having cats in them. Here the machine has not learned anything
beforehand, which means there is no training data or labelled examples.
It allows the model to work on its own to discover patterns and information that was
previously undetected.
It mainly deals with unlabelled data.
Unsupervised learning is classified into two categories of algorithms (a small
clustering sketch is given after this list):
o Clustering: A clustering problem is where you want to discover the inherent
groupings in the data, such as grouping customers by purchasing behaviour.
o Association: An association rule learning problem is where you want to
discover rules that describe large portions of your data, such as people that buy
X also tend to buy Y.
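A small clustering sketch: the 2-D points below stand in for unlabelled examples (say, image features of the dog and cat pictures), and k-means, here via scikit-learn as one common choice, groups them purely by similarity, without any labels.

import numpy as np
from sklearn.cluster import KMeans

X = np.array([[1.0, 1.2], [0.9, 1.1], [1.1, 0.9],    # one natural group
              [5.0, 5.1], [5.2, 4.9], [4.8, 5.0]])   # another natural group

model = KMeans(n_clusters=2, random_state=0, n_init=10).fit(X)
print(model.labels_)   # e.g. [0 0 0 1 1 1]: similar points fall in the same cluster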