Unit 3 AI Notes
Artificial intelligence
o An intelligent agent needs knowledge about the real world to make
decisions and reason so that it can act efficiently.
o Knowledge-based agents are agents that can maintain an internal state of
knowledge, reason over that knowledge, update it after observations, and
take actions. These agents can represent the world with some formal
representation and act intelligently.
o Knowledge-based agents are composed of two main parts:
o Knowledge-base and
o Inference system.
Inference system
Inference means deriving new sentences from old ones. The inference system
allows us to add a new sentence to the knowledge base. A sentence is a
proposition about the world. The inference system applies logical rules to the
KB to deduce new information, generating new facts so that the agent can update
the KB. An inference system works mainly in two modes, which are given as:
o Forward chaining
o Backward chaining
1. TELL: This operation tells the knowledge base what it perceives from the
environment.
2. ASK: This operation asks the knowledge base what action it should perform.
3. Perform: It performs the selected action.
function KB-AGENT(percept) returns an action
    persistent: KB, a knowledge base
                t, a counter, initially 0, indicating time
    TELL(KB, MAKE-PERCEPT-SENTENCE(percept, t))
    action = ASK(KB, MAKE-ACTION-QUERY(t))
    TELL(KB, MAKE-ACTION-SENTENCE(action, t))
    t = t + 1
    return action
The knowledge-based agent takes percept as input and returns an action as output.
The agent maintains the knowledge base, KB, and it initially has some background
knowledge of the real world. It also has a counter to indicate the time for the whole
process, and this counter is initialized with zero.
Each time the function is called, it performs the three operations listed above:
it TELLs the knowledge base what it perceives, ASKs the knowledge base what
action it should perform, and performs the selected action (TELLing the KB which
action was chosen). A minimal sketch of this loop appears below.
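The following is a minimal Python sketch of this loop, assuming a toy knowledge base; the KnowledgeBase class and its trivial ask rule are illustrative stand-ins for a real inference system, not the actual algorithm from any library.

class KnowledgeBase:
    def __init__(self):
        self.facts = set()

    def tell(self, sentence):
        # Add a new sentence (here, a plain tuple) to the knowledge base.
        self.facts.add(sentence)

    def ask(self, query):
        # Toy inference: if the percept at time t was "dirty", act "clean";
        # otherwise act "move". A real KB would run logical inference here.
        t = query[1]
        return "clean" if ("percept", "dirty", t) in self.facts else "move"

KB = KnowledgeBase()
t = 0  # time counter, initially 0

def kb_agent(percept):
    global t
    KB.tell(("percept", percept, t))      # TELL the KB what was perceived
    action = KB.ask(("action-query", t))  # ASK the KB what action to take
    KB.tell(("action", action, t))        # TELL the KB which action was chosen
    t += 1
    return action

print(kb_agent("dirty"))  # -> clean
print(kb_agent("clean"))  # -> move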
Levels of knowledge-based agents:
A knowledge-based agent can be described at three levels.
1. Knowledge level
The knowledge level is the first level of a knowledge-based agent. At this level,
we specify what the agent knows and what the agent's goals are; with these
specifications, we can fix its behavior. For example, suppose an automated taxi
agent needs to go from station A to station B, and it knows the way from A to B;
this sits at the knowledge level.
2. Logical level:
At this level, we consider how the knowledge is represented and stored:
sentences are encoded into some logic. At the logical level, an encoding of
knowledge into logical sentences occurs, and we can expect the automated taxi
agent to reason its way to destination B.
3. Implementation level:
This is the physical representation of logic and knowledge. At the
implementation level, the agent performs actions as per the logical and
knowledge levels. At this level, the automated taxi agent actually implements
its knowledge and logic so that it can reach the destination.
However, in the real world, a successful agent can be built by combining both
declarative and procedural approaches, and declarative knowledge can often be
compiled into more efficient procedural code.
What is knowledge
representation?
Humans are best at understanding, reasoning about, and interpreting knowledge.
Humans know things, and according to their knowledge they perform various
actions in the real world. How machines can do all of these things is the
subject of knowledge representation and reasoning.
What to Represent:
Following are the kinds of knowledge which need to be represented in AI systems:
o Objects: All the facts about objects in our world domain. E.g., guitars
have strings; trumpets are brass instruments.
o Events: Events are the actions which occur in our world.
o Performance: It describes behavior which involves knowledge about how to
do things.
o Meta-knowledge: It is knowledge about what we know.
o Facts: Facts are the truths about the real world and what we represent.
o Knowledge-Base: The central component of a knowledge-based agent is the
knowledge base, represented as KB. The knowledge base is a group of
sentences (here, "sentence" is used as a technical term; it is not
identical to a sentence in the English language).
Types of knowledge
Following are the various types of knowledge:
1. Declarative knowledge
2. Procedural knowledge
3. Meta-knowledge
4. Heuristic knowledge
5. Structural knowledge
Suppose you meet a person who speaks a language you do not know; how will you
be able to act on what they say? The same applies to the intelligent behavior
of agents.
As the diagram below shows, there is one decision maker which acts by sensing
the environment and using knowledge. If the knowledge part is not present,
it cannot display intelligent behavior.
AI knowledge cycle:
An Artificial intelligence system has the following components for displaying
intelligent behavior:
o Perception
o Learning
o Knowledge Representation and Reasoning
o Planning
o Execution
The above diagram shows how an AI system can interact with the real world and
which components help it show intelligence. The AI system has a Perception
component by which it retrieves information from its environment; this can be
visual, audio, or another form of sensory input. The Learning component is
responsible for learning from the data captured by the Perception component.
In the complete cycle, the main components are Knowledge Representation and
Reasoning; these two components are what make a machine show human-like
intelligence. They are independent of each other but also coupled together.
Planning and Execution depend on the analysis of knowledge representation and
reasoning.
Approaches to knowledge
representation:
There are mainly four approaches to knowledge representation, which are given
below:
1. Simple relational knowledge:
It is the simplest way of storing facts using the relational method, in which
facts about a set of objects are set out systematically in columns, much like
rows of a database table. This approach offers little opportunity for
inference. Example:
Player    Weight    Age
Player1   65        23
Player2   58        18
Player3   75        24
2. Inheritable knowledge:
o In the inheritable knowledge approach, all data is stored in a hierarchy
of classes.
o All classes are arranged in a generalized form, in a hierarchical manner.
o In this approach, we apply the inheritance property: elements inherit
values from other members of their class.
o This approach expresses the relation between an instance and its class,
which is called the instance relation.
o Every individual frame can represent a collection of attributes and their
values.
o In this approach, objects and values are represented in boxed nodes, and
arrows point from objects to their values.
o Example: a class hierarchy diagram with instance and is-a relations.
3. Inferential knowledge:
o The inferential knowledge approach represents knowledge in the form of
formal logic.
o This approach can be used to derive more facts, and it guarantees
correctness.
o Example: Let's suppose there are two statements:
a. Marcus is a man
b. All men are mortal
Then it can be represented as:
man(Marcus)
∀x: man(x) → mortal(x)
4. Procedural knowledge:
o The procedural knowledge approach uses small programs and code which
describe how to do specific things and how to proceed.
o In this approach, one important construct is the If-Then rule.
o With this kind of knowledge, we can use various coding languages, such as
LISP and Prolog.
o We can easily represent heuristic or domain-specific knowledge using this
approach.
o However, it is not necessarily possible to represent all cases with this
approach.
Requirements for a knowledge representation system:
1. Representational Accuracy:
The KR system should have the ability to represent all kinds of required
knowledge.
2. Inferential Adequacy:
The KR system should have the ability to manipulate the representational
structures to produce new knowledge corresponding to existing structures.
3. Inferential Efficiency:
The ability to direct the inferential mechanism in the most productive
directions by storing appropriate guides.
Techniques of knowledge
representation
There are mainly four ways of knowledge representation which are given as follows:
1. Logical Representation
2. Semantic Network Representation
3. Frame Representation
4. Production Rules
1. Logical Representation
Logical representation is a language with concrete rules which deals with
propositions and has no ambiguity in representation. Logical representation
means drawing conclusions based on various conditions. This representation lays
down some important communication rules and consists of a precisely defined
syntax and semantics which support sound inference. Each sentence can be
translated into logic using this syntax and semantics.
Syntax:
o Syntax consists of the rules which decide how we can construct legal
sentences in the logic.
o It determines which symbols we can use in knowledge representation and how
to write those symbols.
Semantics:
o Semantics are the rules by which we can interpret a sentence in the logic.
o Semantics also involves assigning a meaning to each sentence.
Logical representation can be categorised into mainly two logics:
a. Propositional logic
b. Predicate logic
Note: We will discuss propositional logic and predicate logic later in these notes.
Note: Do not be confused with logical representation and logical reasoning as logical
representation is a representation language and reasoning is a process of thinking
logically.
2. Semantic Network Representation
Semantic networks represent knowledge in the form of graphical networks, where
nodes represent objects or concepts and arcs represent the relations between
them. Example: the following are some statements which we need to represent in
the form of nodes and arcs.
Statements:
a. Jerry is a cat.
b. Jerry is a mammal
c. Jerry is owned by Priya.
d. Jerry is brown colored.
e. All mammals are animals.
In the above diagram, we have represented the different types of knowledge in
the form of nodes and arcs, where each object is connected to other objects by
some relation.
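These statements can be sketched in Python as a set of (subject, relation, object) triples; the triple encoding and the is_a helper below are illustrative choices, with inference limited to following "is-a" arcs.

# A minimal semantic-network sketch: facts are (subject, relation, object)
# triples, and is_a() follows "is-a" arcs transitively.

triples = {
    ("Jerry", "is-a", "Cat"),
    ("Cat", "is-a", "Mammal"),      # Jerry is a mammal via Cat
    ("Mammal", "is-a", "Animal"),   # all mammals are animals
    ("Jerry", "owned-by", "Priya"),
    ("Jerry", "has-color", "Brown"),
}

def is_a(node, target):
    # Follow "is-a" arcs from node; return True if target is reachable.
    frontier = [node]
    while frontier:
        current = frontier.pop()
        if current == target:
            return True
        frontier += [o for (s, r, o) in triples
                     if s == current and r == "is-a"]
    return False

print(is_a("Jerry", "Animal"))  # -> True (Cat -> Mammal -> Animal)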
3. Frame Representation
A frame is a record-like structure which consists of a collection of attributes
and their values to describe an entity in the world. Frames are an AI data
structure which divides knowledge into substructures by representing
stereotyped situations. A frame consists of a collection of slots and slot
values; these slots may be of any type and size. Slots have names and values,
which are called facets.
Facets: The various aspects of a slot are known as facets. Facets are features
of frames which enable us to put constraints on frames. Example: IF-NEEDED
facets are called when the data of a particular slot is needed. A frame may
consist of any number of slots, a slot may include any number of facets, and a
facet may have any number of values. A frame is also known as slot-filler
knowledge representation in artificial intelligence.
Frames are derived from semantic networks and later evolved into our modern-day
classes and objects. A single frame is not very useful by itself; a frame
system consists of a collection of connected frames. In a frame, knowledge
about an object or event can be stored together in the knowledge base. Frames
are widely used in various applications, including natural language processing
and machine vision.
Example: 1
Let's take an example of a frame for a book
Slots    Fillers
Title    Artificial Intelligence
Year     1996
Page     1152
Example 2:
Let's suppose we are taking an entity, Peter. Peter is a doctor by profession,
his age is 25, and he lives in the city of London, in England. The following is
the frame representation for this:
Slots        Fillers
Name         Peter
Profession   Doctor
Age          25
Weight       78
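A frame can be sketched in Python as a nested dictionary of slots and facets; the structure below is one illustrative encoding, where the hypothetical BMI slot has an IF-NEEDED facet, a procedure fired only when the slot's value is requested.

# A minimal frame sketch: each slot maps to facets such as "value" and an
# optional "if-needed" procedure that computes the filler on demand.

peter = {
    "Name":       {"value": "Peter"},
    "Profession": {"value": "Doctor"},
    "Age":        {"value": 25},
    "Weight":     {"value": 78},
    # BMI stores no value; an IF-NEEDED facet computes it when asked
    # (a height of 1.75 m is assumed purely for illustration).
    "BMI":        {"if-needed": lambda f: f["Weight"]["value"] / (1.75 ** 2)},
}

def get_slot(frame, slot):
    facets = frame[slot]
    if "value" in facets:
        return facets["value"]
    return facets["if-needed"](frame)  # fire the IF-NEEDED facet

print(get_slot(peter, "Profession"))     # -> Doctor
print(round(get_slot(peter, "BMI"), 1))  # computed on demand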
4. Production Rules
A production rules system consists of (condition, action) pairs, which mean
"If condition then action". It has mainly three parts:
o The set of production rules
o Working memory
o The recognize-act cycle
In a production rules system, the agent checks for the condition; if the
condition holds, the production rule fires and the corresponding action is
carried out. The condition part of a rule determines which rule may be applied
to a problem, and the action part carries out the associated problem-solving
steps. This complete process is called a recognize-act cycle.
The working memory contains the description of the current state of problem
solving, and a rule can write knowledge to the working memory. This knowledge
may then match and fire other rules.
If a new situation (state) is generated, multiple production rules may be
triggered together; this set of rules is called the conflict set. In this
situation, the agent must select one rule from the set to fire, which is called
conflict resolution. A minimal sketch of this cycle, using the bus rules below,
is shown after the example.
Example:
o IF (at bus stop AND bus arrives) THEN action (get into the bus)
o IF (on the bus AND paid AND empty seat) THEN action (sit down).
o IF (on bus AND unpaid) THEN action (pay charges).
o IF (bus arrives at destination) THEN action (get down from the bus).
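The sketch below encodes a few of the bus rules as (condition, action) pairs over a working memory of facts; the fact strings and the first-applicable-rule policy are illustrative choices (a trivially simple conflict-resolution strategy).

# A minimal production-system sketch: working memory is a set of facts,
# each rule is a (condition, action) pair, and the recognize-act cycle
# fires the first applicable rule.

working_memory = {"at bus stop", "bus arrives"}

def board_bus(m):
    m.discard("at bus stop")
    m.add("on the bus")

rules = [
    (lambda m: "at bus stop" in m and "bus arrives" in m, board_bus),
    (lambda m: "on the bus" in m and "paid" not in m,
     lambda m: m.add("paid")),
    (lambda m: "on the bus" in m and "paid" in m and "seated" not in m,
     lambda m: m.add("seated")),
]

# Recognize-act cycle: match conditions, select a rule, act; repeat
# until no rule fires.
fired = True
while fired:
    fired = False
    for condition, action in rules:
        if condition(working_memory):
            action(working_memory)
            fired = True
            break

print(working_memory)  # -> {'bus arrives', 'on the bus', 'paid', 'seated'}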
Propositional Logic
Propositional logic is the simplest form of logic, in which all statements are
made up of propositions: declarative sentences which are either true or false.
Example:
a) It is Sunday.
b) The Sun rises from the West. (false proposition)
c) 3 + 3 = 7 (false proposition)
d) 5 is a prime number.
Propositions can be of two types:
a. Atomic propositions: single proposition symbols. Example: "2 + 2 is 4."
b. Compound propositions: constructed by combining simpler propositions with
logical connectives. Example: "It is raining today, and the streets are wet."
Truth Table:
In propositional logic, we need to know the truth values of propositions in all
possible scenarios. We can enumerate all possible combinations of truth values
under the logical connectives, and the representation of these combinations in
tabular format is called a truth table. Following are the truth tables for all
logical connectives:
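For reference, the truth table for the basic connectives over two propositions P and Q is:

P      Q      ¬P     P ∧ Q   P ∨ Q   P → Q   P ⇔ Q
True   True   False  True    True    True    True
True   False  False  False   True    False   False
False  True   True   False   True    True    False
False  False  True   False   False   True    True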
Truth table with three propositions:
We can build a proposition composed of three propositions P, Q, and R. Its
truth table is made up of 8 rows, since three proposition symbols give
2^3 = 8 combinations of truth values.
Precedence of connectives:
Just like arithmetic operators, there is a precedence order for propositional connectors or
logical operators. This order should be followed while evaluating a propositional problem.
Following is the list of the precedence order for operators:
Precedence             Operators
First precedence       Parenthesis
Second precedence      Negation (¬)
Third precedence       Conjunction (AND, ∧)
Fourth precedence      Disjunction (OR, ∨)
Fifth precedence       Implication (→)
Sixth precedence       Biconditional (⇔)
Note: For a better understanding, use parentheses to make sure of the correct
interpretation. For example, ¬R ∨ Q is interpreted as (¬R) ∨ Q.
Logical equivalence:
Logical equivalence is one of the features of propositional logic. Two propositions are said to
be logically equivalent if and only if the columns in the truth table are identical to each
other.
Let's take two propositions A and B; logical equivalence is written as A ⇔ B.
In the truth table below, we can see that the columns for ¬A ∨ B and A → B are
identical; hence A → B is logically equivalent to ¬A ∨ B.
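The truth table for this equivalence can be reconstructed as follows; note the identical last two columns:

A      B      ¬A     ¬A ∨ B   A → B
True   True   False  True     True
True   False  False  False    False
False  True   True   True     True
False  False  True   True     True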
Properties of Operators:
o Commutativity:
o P ∧ Q = Q ∧ P, or
o P ∨ Q = Q ∨ P.
o Associativity:
o (P ∧ Q) ∧ R = P ∧ (Q ∧ R),
o (P ∨ Q) ∨ R = P ∨ (Q ∨ R)
o Identity element:
o P ∧ True = P,
o P ∨ True = True.
o Distributivity:
o P ∧ (Q ∨ R) = (P ∧ Q) ∨ (P ∧ R).
o P ∨ (Q ∧ R) = (P ∨ Q) ∧ (P ∨ R).
o De Morgan's Laws:
o ¬(P ∧ Q) = (¬P) ∨ (¬Q)
o ¬(P ∨ Q) = (¬P) ∧ (¬Q).
o Double-negation elimination:
o ¬(¬P) = P.
Inference rules:
Inference rules are templates for generating valid arguments. Inference rules
are applied to derive proofs in artificial intelligence, where a proof is a
sequence of conclusions that leads to the desired goal.
In inference rules, the implication and its variants play an important role.
The following terminology is related to inference rules:
o Implication: P → Q.
o Converse: the converse of the implication, Q → P.
o Contrapositive: the negation of the converse, i.e., ¬Q → ¬P.
o Inverse: the negation of the implication, i.e., ¬P → ¬Q.
Some of these compound statements are equivalent to each other, which we can
prove using a truth table: P → Q is equivalent to its contrapositive ¬Q → ¬P,
and Q → P is equivalent to ¬P → ¬Q.
1. Modus Ponens:
The Modus Ponens rule states that if P → Q is true and P is true, then Q will
also be true. It can be represented as:
(P → Q, P) ⊢ Q
Example:
Statement-1: "If I am sleepy then I go to bed" ==> P → Q
Statement-2: "I am sleepy" ==> P
Conclusion: "I go to bed." ==> Q
2. Modus Tollens:
The Modus Tollens rule states that if P → Q is true and ¬Q is true, then ¬P
will also be true. It can be represented as:
(P → Q, ¬Q) ⊢ ¬P
Statement-1: "If I am sleepy then I go to bed" ==> P → Q
Statement-2: "I do not go to the bed." ==> ¬Q
Conclusion: "I am not sleepy" ==> ¬P
3. Hypothetical Syllogism:
The Hypothetical Syllogism rule states that if P → Q and Q → R are true, then
P → R will also be true. It can be represented with the following notation:
(P → Q, Q → R) ⊢ (P → R)
Example:
Statement-1: If you have my home key then you can unlock my home. P→Q
Statement-2: If you can unlock my home then you can take my money. Q→R
Conclusion: If you have my home key then you can take my money. P→R
4. Disjunctive Syllogism:
The Disjunctive Syllogism rule states that if P ∨ Q is true and ¬P is true,
then Q will be true. It can be represented as:
(P ∨ Q, ¬P) ⊢ Q
Example:
Statement-1: "Today is Sunday or Monday." ==> P ∨ Q
Statement-2: "Today is not Sunday." ==> ¬P
Conclusion: "Today is Monday." ==> Q
5. Addition:
The Addition rule is one of the common inference rules; it states that if P is
true, then P ∨ Q will be true. It can be represented as:
P ⊢ (P ∨ Q)
Example: from "I have a vanilla ice-cream" (P), we can conclude "I have a
vanilla or a chocolate ice-cream" (P ∨ Q).
6. Simplification:
The Simplification rule states that if P ∧ Q is true, then P and Q will each
also be true. It can be represented as:
(P ∧ Q) ⊢ P and (P ∧ Q) ⊢ Q
7. Resolution:
The Resolution rule states that if P ∨ Q and ¬P ∨ R are true, then Q ∨ R will
also be true. It can be represented as:
(P ∨ Q, ¬P ∨ R) ⊢ (Q ∨ R)
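Since the individual truth-table proofs are not written out above, the short Python check below verifies any of these rules by enumerating every truth assignment; the valid and implies helpers are illustrative, not from any library.

# Verify inference rules by brute-force truth-table enumeration: a rule
# is valid if its conclusion is true whenever all premises are true.
from itertools import product

def implies(a, b):
    return (not a) or b

def valid(rule, n_vars):
    for values in product([True, False], repeat=n_vars):
        premises, conclusion = rule(*values)
        if all(premises) and not conclusion:
            return False
    return True

# Modus Ponens: from P -> Q and P, infer Q.
print(valid(lambda p, q: ([implies(p, q), p], q), 2))              # True
# Modus Tollens: from P -> Q and not Q, infer not P.
print(valid(lambda p, q: ([implies(p, q), not q], not p), 2))      # True
# Resolution: from P v Q and (not P) v R, infer Q v R.
print(valid(lambda p, q, r: ([p or q, (not p) or r], q or r), 3))  # True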
First-Order logic:
o First-order logic is another way of knowledge representation in artificial
intelligence. It is an extension to propositional logic.
o FOL is sufficiently expressive to represent natural language statements in
a concise way.
o First-order logic is also known as predicate logic or first-order
predicate logic. It is a powerful language that expresses information
about objects in a natural way and can also express the relationships
between those objects.
o First-order logic (like natural language) does not only assume that the world
contains facts like propositional logic but also assumes the following things in
the world:
o Objects: A, B, people, numbers, colors, wars, theories, squares, pits,
wumpus, ......
o Relations: These can be unary relations such as red, round, or
adjacent, or n-ary relations such as the sister of, brother of, has
color, comes between.
o Functions: father of, best friend, third inning of, end of, ......
o Like natural language, first-order logic also has two main parts:
a. Syntax
b. Semantics
Syntax of First-Order logic:
The syntax of FOL determines which collections of symbols constitute logical
expressions in first-order logic. The basic syntactic elements of first-order
logic are symbols, and we write statements in shorthand notation in FOL. Some
of the basic elements are:
Element       Symbols
Variables     x, y, z, a, b, ...
Connectives   ∧, ∨, ¬, ⇒, ⇔
Equality      =
Quantifiers   ∀, ∃
Atomic sentences:
o Atomic sentences are the most basic sentences of first-order logic. These
sentences are formed from a predicate symbol followed by a parenthesis with
a sequence of terms.
o We can represent atomic sentences as Predicate(term1, term2, ..., termN).
Complex Sentences:
o Complex sentences are made by combining atomic sentences using
connectives.
Consider the statement "x is an integer." It consists of two parts: the first
part, x, is the subject of the statement, and the second part, "is an
integer," is known as the predicate.
Universal Quantifier:
A universal quantifier is a symbol of logical representation which specifies
that the statement within its range is true for everything or for every
instance of a particular thing. It is denoted by the symbol ∀, which resembles
an inverted A. If x is a variable, then ∀x is read as:
o For all x
o For each x
o For every x.
Example:
All men drink coffee.
Let the variable x range over men in the universe of discourse (UOD); then the
statement can be represented as ∀x man(x) → drink(x, coffee).
It will be read as: for all x, if x is a man, then x drinks coffee.
Existential Quantifier:
Existential quantifiers are the type of quantifiers, which express that the statement
within its scope is true for at least one instance of something.
It is denoted by the logical operator ∃, which resembles an inverted E. When
it is used with a predicate variable, it is called an existential quantifier.
If x is a variable, then the existential quantifier is written ∃x or ∃(x),
and it is read as:
o There exists an x
o For some x
o For at least one x
Example:
Some boys are intelligent.
This can be represented as ∃x boys(x) ∧ intelligent(x), and it is read as:
there exists some x such that x is a boy and x is intelligent.
Points to remember:
o The main connective for universal quantifier ∀ is implication →.
o The main connective for existential quantifier ∃ is and ∧.
Properties of Quantifiers:
o In universal quantifier, ∀x∀y is similar to ∀y∀x.
o In Existential quantifier, ∃x∃y is similar to ∃y∃x.
o ∃x∀y is not similar to ∀y∃x.
Substitution:
Substitution is a fundamental operation performed on terms and formulas, and
it occurs in all inference systems in first-order logic. For example, F[a/x]
denotes the result of substituting the constant a for the variable x in F.
Note: First-order logic is capable of expressing facts about some or all objects in the
universe.
Equality:
First-order logic does not only use predicates and terms for making atomic
sentences; it can also use equality. Equality symbols specify that two terms
refer to the same object. For example: Brother(John) = Smith.
In this example, the object referred to by Brother(John) is the same as the
object referred to by Smith. The equality symbol can also be used with
negation to state that two terms are not the same object, e.g., ¬(x = y).
1. Universal Generalization:
Universal generalization is a valid inference rule which states that if
premise P(c) is true for any arbitrary element c in the universe of discourse,
then we can conclude ∀x P(x). It can be represented as: P(c) ⊢ ∀x P(x).
Example: Let P(c) be "A byte contains 8 bits"; since this holds for an
arbitrary byte c, we can conclude ∀x P(x): "All bytes contain 8 bits."
2. Universal Instantiation:
Universal instantiation (UI) is a valid inference rule which states that from
∀x P(x) we can infer P(c) for any ground term c; that is, we can instantiate a
universally quantified sentence with any object of the universe of discourse.
It can be represented as: ∀x P(x) ⊢ P(c).
Example:
Let's take a famous example: "All kings who are greedy are evil." Our
knowledge base contains this detail in the form of FOL:
∀x King(x) ∧ Greedy(x) → Evil(x)
From this information, we can infer any of the following statements using
universal instantiation:
King(John) ∧ Greedy(John) → Evil(John)
King(Richard) ∧ Greedy(Richard) → Evil(Richard)
3. Existential Instantiation:
Existential instantiation (also called existential elimination) is a valid
inference rule which states that from ∃x P(x) we can infer P(c) for a new
constant symbol c (a Skolem constant) that does not appear elsewhere in the
knowledge base.
Example: From ∃x Crown(x) ∧ OnHead(x, John), we can infer
Crown(K1) ∧ OnHead(K1, John), as long as K1 does not appear anywhere else in
the knowledge base.
4. Existential introduction
o An existential introduction is also known as an existential generalization,
which is a valid inference rule in first-order logic.
o This rule states that if there is some element c in the universe of
discourse which has a property P, then we can infer that there exists
something in the universe which has the property P. It can be represented
as: P(c) ⊢ ∃x P(x).
Generalized Modus Ponens can be summarized as, " P implies Q and P is asserted to
be true, therefore Q must be True."
According to Generalized Modus Ponens, for atomic sentences pi, pi', and q,
where there is a substitution θ such that SUBST(θ, pi') = SUBST(θ, pi) for all
i, it can be represented as:
(p1', p2', ..., pn', (p1 ∧ p2 ∧ ... ∧ pn → q)) ⊢ SUBST(θ, q)
Example:
We will use this rule for "kings who are greedy are evil": we find some x such
that x is a king and x is greedy, and then we can infer that x is evil.
What is Unification?
o Unification is a process of making two different logical atomic expressions
identical by finding a substitution. Unification depends on the substitution
process.
o It takes two literals as input and makes them identical using substitution.
o Let Ψ1 and Ψ2 be two atomic sentences and 𝜎 be a unifier such that
Ψ1𝜎 = Ψ2𝜎; then this is expressed as UNIFY(Ψ1, Ψ2).
o Example: Find the MGU for Unify{King(x), King(John)}. Here 𝜎 = {John/x}
is the most general unifier.
o The UNIFY algorithm is used for unification, which takes two atomic
sentences and returns a unifier for those sentences (If any exist).
o Unification is a key component of all first-order inference algorithms.
o It returns fail if the expressions do not match with each other.
o The substitution variables are called Most General Unifier or MGU.
E.g. Let's say there are two different expressions, P(x, y), and P(a, f(z)).
In this example, we need to make both above statements identical to each other.
For this, we will perform the substitution.
o Substitute x with a, and y with f(z) in the first expression, and it will be
represented as a/x and f(z)/y.
o With both the substitutions, the first expression will be identical to the
second expression and the substitution set will be: [a/x, f(z)/y].
Unification Algorithm:
Algorithm: Unify(Ψ1, Ψ2)
For each pair of the following atomic sentences, find the most general unifier
(if it exists).
Example: Find the MGU of {Q(a, g(x, a), f(y)), Q(a, g(f(b), a), x)}
SUBST θ = {f(b)/x, b/y}
S1 => {Q(a, g(f(b), a), f(b)); Q(a, g(f(b), a), f(b))}. Successfully unified.
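A compact Python sketch of unification is shown below; terms are encoded as nested tuples such as ('f', 'b'), the set of variable names is declared explicitly, and the occurs-check is omitted for brevity, so this is a simplified illustration rather than a full implementation.

# A minimal unification sketch. Compound terms are tuples like
# ('Q', 'a', ('g', 'x', 'a'), ('f', 'y')); returns a substitution dict
# (the MGU) or None on failure.

VARIABLES = {"x", "y", "z", "w"}  # symbols treated as variables

def is_variable(t):
    return t in VARIABLES

def substitute(t, theta):
    # Apply the substitution theta to term t, recursively.
    if is_variable(t):
        return substitute(theta[t], theta) if t in theta else t
    if isinstance(t, tuple):
        return tuple(substitute(arg, theta) for arg in t)
    return t

def unify(a, b, theta=None):
    if theta is None:
        theta = {}
    a, b = substitute(a, theta), substitute(b, theta)
    if a == b:
        return theta
    if is_variable(a):
        theta[a] = b
        return theta
    if is_variable(b):
        theta[b] = a
        return theta
    if isinstance(a, tuple) and isinstance(b, tuple) and len(a) == len(b):
        for ai, bi in zip(a, b):
            theta = unify(ai, bi, theta)
            if theta is None:
                return None
        return theta
    return None  # mismatched constants or arities: unification fails

# Unify Q(a, g(x, a), f(y)) with Q(a, g(f(b), a), x):
mgu = unify(("Q", "a", ("g", "x", "a"), ("f", "y")),
            ("Q", "a", ("g", ("f", "b"), "a"), "x"))
print(mgu)  # -> {'x': ('f', 'b'), 'y': 'b'}, i.e. {f(b)/x, b/y}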
Resolution in FOL
Resolution
Resolution is a theorem-proving technique that proceeds by building refutation
proofs, i.e., proofs by contradiction. It was invented by the mathematician
John Alan Robinson in 1965.
Resolution is used when various statements are given and we need to prove a
conclusion from those statements. Unification is a key concept in proofs by
resolution. Resolution is a single inference rule which can efficiently
operate on the conjunctive normal form (CNF), or clausal form.
Note: To better understand this topic, first learn FOL in AI.
This rule is also called the binary resolution rule because it only resolves exactly
two literals.
Example:
We can resolve the two clauses given below:
[Animal(g(x)) ∨ Loves(f(x), x)] and [¬Loves(a, b) ∨ ¬Kills(a, b)]
where the two complementary literals are Loves(f(x), x) and ¬Loves(a, b).
These literals can be unified with the unifier θ = [a/f(x), b/x], and
resolution will generate the resolvent clause:
[Animal(g(x)) ∨ ¬Kills(f(x), x)]
The steps for resolution are:
1. Conversion of facts into first-order logic.
2. Conversion of FOL statements into CNF.
3. Negation of the statement which needs to be proved (proof by contradiction).
4. Drawing the resolution graph (using unification).
To better understand all the above steps, we will take an example in which we
will apply resolution.
Example:
a. John likes all kinds of food.
b. Apples and vegetables are food.
c. Anything anyone eats and is not killed by is food.
d. Anil eats peanuts and is still alive.
e. Harry eats everything that Anil eats.
Prove by resolution that:
f. John likes peanuts.
In the first step, we convert all the given statements into first-order logic.
For resolution in first-order logic, the FOL must then be converted into CNF,
as CNF makes resolution proofs easier.
o Eliminate all implications (→) and rewrite:
a. ∀x ¬ food(x) V likes(John, x)
b. food(Apple) Λ food(vegetables)
c. ∀x ∀y ¬ [eats(x, y) Λ ¬ killed(x)] V food(y)
d. eats (Anil, Peanuts) Λ alive(Anil)
e. ∀x ¬ eats(Anil, x) V eats(Harry, x)
f. ∀x¬ [¬ killed(x) ] V alive(x)
g. ∀x ¬ alive(x) V ¬ killed(x)
h. likes(John, Peanuts).
o Move negation (¬) inwards and rewrite:
a. ∀x ¬ food(x) V likes(John, x)
b. food(Apple) Λ food(vegetables)
c. ∀x ∀y ¬ eats(x, y) V killed(x) V food(y)
d. eats (Anil, Peanuts) Λ alive(Anil)
e. ∀x ¬ eats(Anil, x) V eats(Harry, x)
f. ∀x killed(x) V alive(x)
g. ∀x ¬ alive(x) V ¬ killed(x)
h. likes(John, Peanuts).
o Rename variables or standardize variables
a. ∀x ¬ food(x) V likes(John, x)
b. food(Apple) Λ food(vegetables)
c. ∀y ∀z ¬ eats(y, z) V killed(y) V food(z)
d. eats (Anil, Peanuts) Λ alive(Anil)
e. ∀w¬ eats(Anil, w) V eats(Harry, w)
f. ∀g killed(g) V alive(g)
g. ∀k ¬ alive(k) V ¬ killed(k)
h. likes(John, Peanuts).
o Eliminate existential quantifiers by Skolemization.
In this step, we eliminate the existential quantifier ∃; this process is
known as Skolemization. In this example problem there is no existential
quantifier, so all the statements remain the same in this step.
o Drop universal quantifiers.
In this step we drop all universal quantifiers, since all the remaining
variables are implicitly universally quantified, so the quantifiers are no
longer needed.
a. ¬ food(x) V likes(John, x)
b. food(Apple)
c. food(vegetables)
d. ¬ eats(y, z) V killed(y) V food(z)
e. eats (Anil, Peanuts)
f. alive(Anil)
g. ¬ eats(Anil, w) V eats(Harry, w)
h. killed(g) V alive(g)
i. ¬ alive(k) V ¬ killed(k)
j. likes(John, Peanuts).
In this step, we apply negation to the statement to be proved; the negated
conclusion is written as ¬likes(John, Peanuts).
Now in this step, we solve the problem using a resolution tree with
substitution. For the above problem, the refutation proceeds as follows:
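In text form, the resolution refutation can be written as the following sequence of steps over clauses (a)-(j) above:
1. ¬likes(John, Peanuts) and clause (a) ¬food(x) V likes(John, x) resolve with {Peanuts/x}, giving ¬food(Peanuts).
2. ¬food(Peanuts) and clause (d) ¬eats(y, z) V killed(y) V food(z) resolve with {Peanuts/z}, giving ¬eats(y, Peanuts) V killed(y).
3. That clause and clause (e) eats(Anil, Peanuts) resolve with {Anil/y}, giving killed(Anil).
4. killed(Anil) and clause (i) ¬alive(k) V ¬killed(k) resolve with {Anil/k}, giving ¬alive(Anil).
5. ¬alive(Anil) and clause (f) alive(Anil) resolve to the empty clause, a contradiction.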
Hence the negation of the conclusion leads to a contradiction with the given
set of statements, which proves that John likes peanuts.
Inference engine:
The inference engine is the component of an intelligent system in artificial
intelligence which applies logical rules to the knowledge base to infer new
information from known facts. The first inference engines were parts of expert
systems. An inference engine commonly proceeds in two modes, which are:
a. Forward chaining
b. Backward chaining
Horn clauses and definite clauses are forms of sentences which enable the
knowledge base to use a more restricted and efficient inference algorithm.
Logical inference algorithms use forward and backward chaining approaches,
which require the KB to be in the form of first-order definite clauses.
Definite clause: A clause which is a disjunction of literals with exactly one
positive literal is known as a definite clause or strict Horn clause.
Horn clause: A clause which is a disjunction of literals with at most one
positive literal is known as a Horn clause. Hence all definite clauses are
Horn clauses.
Example: (¬p ∨ ¬q ∨ k) has only one positive literal, k. It is equivalent to
p ∧ q → k.
A. Forward Chaining
Forward chaining is also known as forward deduction or forward reasoning when
using an inference engine. Forward chaining is a form of reasoning which
starts with the atomic sentences in the knowledge base and applies inference
rules (Modus Ponens) in the forward direction to extract more data until a
goal is reached.
The forward-chaining algorithm starts from known facts, triggers all rules
whose premises are satisfied, and adds their conclusions to the known facts.
This process repeats until the problem is solved.
Properties of Forward-Chaining:
o It is a bottom-up approach, as it moves from bottom to top.
o It is a process of making conclusions based on known facts or data,
starting from the initial state and reaching the goal state.
o It is also called a data-driven approach, as we reach the goal using the
available data.
Consider the following famous example which we will use in both approaches:
Example:
"As per the law, it is a crime for an American to sell weapons to hostile
nations. Country A, an enemy of America, has some missiles, and all the
missiles were sold to it by Robert, who is an American citizen."
To solve the above problem, first we will convert all the above facts into
first-order definite clauses, and then we will use a forward-chaining
algorithm to reach the goal:
1. It is a crime for an American to sell weapons to hostile nations:
American(p) ∧ Weapon(q) ∧ Sells(p, q, r) ∧ Hostile(r) → Criminal(p)
2. Country A has some missiles; by existential instantiation we introduce
the constant T1: Owns(A, T1)
3. ...and Missile(T1)
4. All of the missiles were sold to country A by Robert:
Missile(p) ∧ Owns(A, p) → Sells(Robert, p, A)
5. Missiles are weapons: Missile(p) → Weapon(p)
6. An enemy of America counts as hostile: Enemy(p, America) → Hostile(p)
7. Robert is American: American(Robert)
8. Country A is an enemy of America: Enemy(A, America)
Step-1:
In the first step we start with the known facts and choose the sentences which
do not have implications: American(Robert), Enemy(A, America), Owns(A, T1),
and Missile(T1).
Step-2:
In the second step, we infer new facts from the available facts whose rule
premises are satisfied.
Rule (1) does not yet have its premises satisfied, so it does not fire in the
first iteration.
Rule (4) is satisfied with the substitution {p/T1}, so Sells(Robert, T1, A) is
added, inferred from the conjunction of facts (2) and (3). Rule (5) is
satisfied with {p/T1}, adding Weapon(T1), and Rule (6) is satisfied with
{p/A}, adding Hostile(A).
Step-3:
Now Rule (1) is satisfied with the substitution {p/Robert, q/T1, r/A}, so
Criminal(Robert) is added and the goal is reached.
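The sketch below runs this forward-chaining loop in Python; the rules are hand-instantiated as ground (premises, conclusion) pairs for this specific problem, a simplification of the first-order algorithm, which would perform the substitutions via unification.

# A minimal forward-chaining sketch over ground facts: repeatedly fire
# every rule whose premises are all known, adding its conclusion, until
# the goal appears or nothing new can be inferred.

facts = {"American(Robert)", "Enemy(A, America)",
         "Owns(A, T1)", "Missile(T1)"}

rules = [  # (premises, conclusion), already instantiated for this problem
    ({"Missile(T1)", "Owns(A, T1)"}, "Sells(Robert, T1, A)"),
    ({"Missile(T1)"}, "Weapon(T1)"),
    ({"Enemy(A, America)"}, "Hostile(A)"),
    ({"American(Robert)", "Weapon(T1)",
      "Sells(Robert, T1, A)", "Hostile(A)"}, "Criminal(Robert)"),
]

goal = "Criminal(Robert)"
changed = True
while changed and goal not in facts:
    changed = False
    for premises, conclusion in rules:
        if premises <= facts and conclusion not in facts:
            facts.add(conclusion)
            changed = True

print(goal in facts)  # -> True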
B. Backward Chaining:
Backward-chaining is also known as a backward deduction or backward reasoning
method when using an inference engine. A backward chaining algorithm is a form of
reasoning, which starts with the goal and works backward, chaining through rules
to find known facts that support the goal.
Example:
In backward-chaining, we will use the same above example, and will rewrite all the
rules.
Backward-Chaining proof:
In Backward chaining, we will start with our goal predicate, which
is Criminal(Robert), and then infer further rules.
Step-1:
In the first step, we take the goal fact, infer other facts from it, and
finally prove those facts true. Our goal fact is "Robert is a criminal," and
its predicate form is Criminal(Robert).
Step-2:
In the second step, we infer other facts from the goal fact which satisfy the
rules. As we can see in Rule (1), the goal predicate Criminal(Robert) is
present with substitution {Robert/p}, so we add all the conjunctive facts
below the first level and replace p with Robert.
Step-3:
At step-3, we extract the further fact Missile(q), which is inferred from
Weapon(q), as it satisfies Rule (5). Weapon(q) is also true with the
substitution of the constant T1 for q.
Step-4:
At step-4, we can infer the facts Missile(T1) and Owns(A, T1) from
Sells(Robert, T1, r), which satisfies Rule (4) with the substitution of A in
place of r. So these two statements are proved here.
Step-5:
At step-5, we can infer the fact Enemy(A, America) from Hostile(A), which
satisfies Rule (6). Hence all the statements are proved true using backward
chaining.
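Under the same hypothetical ground-fact encoding used in the forward-chaining sketch, backward chaining can be sketched as a recursive prover:

# A minimal backward-chaining sketch: to prove a goal, either find it
# among the facts, or find a rule concluding it and prove each premise.

facts = {"American(Robert)", "Enemy(A, America)",
         "Owns(A, T1)", "Missile(T1)"}

rules = [
    ({"Missile(T1)", "Owns(A, T1)"}, "Sells(Robert, T1, A)"),
    ({"Missile(T1)"}, "Weapon(T1)"),
    ({"Enemy(A, America)"}, "Hostile(A)"),
    ({"American(Robert)", "Weapon(T1)",
      "Sells(Robert, T1, A)", "Hostile(A)"}, "Criminal(Robert)"),
]

def prove(goal):
    if goal in facts:
        return True
    for premises, conclusion in rules:
        if conclusion == goal and all(prove(p) for p in premises):
            return True
    return False

print(prove("Criminal(Robert)"))  # -> True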
Difference between forward chaining and backward chaining:
o Forward chaining, as the name suggests, starts from the known facts and
moves forward by applying inference rules to extract more data, continuing
until it reaches the goal, whereas backward chaining starts from the goal
and moves backward through inference rules to determine the facts that
satisfy the goal.
o Forward chaining is called a data-driven inference technique, whereas
backward chaining is called a goal-driven inference technique.
o Forward chaining is known as the bottom-up approach, whereas backward
chaining is known as a top-down approach.
o Forward chaining uses breadth-first search strategy, whereas backward
chaining uses depth-first search strategy.
o Forward and backward chaining both apply the Modus Ponens inference rule.
o Forward chaining can be used for tasks such as planning, design process
monitoring, diagnosis, and classification, whereas backward chaining
can be used for classification and diagnosis tasks.
o Forward chaining can be like an exhaustive search, whereas backward
chaining tries to avoid the unnecessary path of reasoning.
o In forward-chaining there can be various ASK questions from the knowledge
base, whereas in backward chaining there can be fewer ASK questions.
o Forward chaining is slow as it checks for all the rules, whereas backward
chaining is fast as it checks few required rules only.
Reasoning:
Reasoning is the mental process of deriving logical conclusions and making
predictions from available knowledge, facts, and beliefs. We can also say:
"Reasoning is a way to infer facts from existing data." It is a general
process of thinking rationally to find valid conclusions.
In artificial intelligence, reasoning is essential so that a machine can also
think rationally like a human brain and perform like a human.
Types of Reasoning
In artificial intelligence, reasoning can be divided into the following categories:
o Deductive reasoning
o Inductive reasoning
o Abductive reasoning
o Common Sense Reasoning
o Monotonic Reasoning
o Non-monotonic Reasoning
Note: Inductive and deductive reasoning are the forms of propositional logic.
1. Deductive reasoning:
Deductive reasoning is deducing new information from logically related known
information. It is a form of valid reasoning: an argument's conclusion must be
true whenever its premises are true.
Deductive reasoning is a type of propositional logic in AI, and it requires
various rules and facts. It is sometimes referred to as top-down reasoning,
and it is the opposite of inductive reasoning.
In deductive reasoning, the truth of the premises guarantees the truth of the
conclusion. Deductive reasoning usually starts from general premises and moves
to a specific conclusion, as the example below shows.
Example:
Premise-1: All men are mortal.
Premise-2: Marcus is a man.
Conclusion: Marcus is mortal.
2. Inductive Reasoning:
Inductive reasoning is a form of reasoning that arrives at a conclusion from a
limited set of facts by generalization. It starts with a series of specific
facts or data and reaches a general statement or conclusion.
Example:
Premise: All of the pigeons we have seen in the zoo are white.
Conclusion: Therefore, we can expect all pigeons to be white.
3. Abductive reasoning:
Abductive reasoning is a form of logical reasoning which starts with one or
more observations and then seeks the most likely explanation or conclusion for
them. It extends deductive reasoning, but in abductive reasoning the premises
do not guarantee the conclusion.
Example:
Implication: The cricket ground is wet if it is raining.
Axiom: The cricket ground is wet.
Conclusion: It is raining.
4. Common Sense Reasoning:
Common sense reasoning simulates the human ability to make presumptions about
events which occur every day. It relies on good judgment rather than exact
logic and operates on heuristic knowledge and heuristic rules.
Example:
1. One person can be at only one place at a time.
2. If I put my hand in a fire, it will burn.
The above two statements are examples of common sense reasoning, which a human
mind can easily understand and assume.
5. Monotonic Reasoning:
In monotonic reasoning, once a conclusion is drawn, it remains the same even
if we add more information to the existing information in our knowledge base.
In monotonic reasoning, adding knowledge does not decrease the set of
propositions that can be derived.
To solve monotonic problems, we can derive valid conclusions from the
available facts only, and they will not be affected by new facts.
Monotonic reasoning is not useful for real-time systems because, in real time,
facts change, so we cannot use monotonic reasoning there.
Example: "The Earth revolves around the Sun." This is a true fact, and the
conclusion cannot be changed even if we add other sentences to the knowledge
base, such as "The Moon revolves around the Earth."
6. Non-monotonic Reasoning
In Non-monotonic reasoning, some conclusions may be invalidated if we add some
more information to our knowledge base.
A logic is said to be non-monotonic if some conclusions can be invalidated by
adding more knowledge to the knowledge base. Human perception of various
things in daily life is a general example of non-monotonic reasoning.
Example: Let's suppose the knowledge base contains the following knowledge:
o Birds can fly.
o Penguins cannot fly.
o Pitty is a bird.
From the above sentences, we can conclude that Pitty can fly. However, if we
later add "Pitty is a penguin" to the knowledge base, the conclusion "Pitty
can fly" is invalidated.
So to represent uncertain knowledge, where we are not sure about the predicates,
we need uncertain reasoning or probabilistic reasoning.
Causes of uncertainty:
Following are some leading causes of uncertainty in the real world, for
example: information obtained from unreliable sources, experimental errors,
equipment faults, and environmental variation.
Probabilistic reasoning:
Probabilistic reasoning is a way of knowledge representation where we apply the
concept of probability to indicate the uncertainty in knowledge. In probabilistic
reasoning, we combine probability theory with logic to handle the uncertainty.
In the real world, there are lots of scenarios, where the certainty of something is
not confirmed, such as "It will rain today," "behavior of someone for some
situations," "A match between two teams or two players." These are probable
sentences for which we can assume that it will happen but not sure about it, so
here we use probabilistic reasoning.
In probabilistic reasoning, there are two ways to solve problems with
uncertain knowledge:
o Bayes' rule
o Bayesian statistics
We can find the probability of an uncertain event by using the formula:
Probability of occurrence = (Number of desired outcomes) / (Total number of outcomes)
where 0 ≤ P(A) ≤ 1, and P(¬A) = 1 − P(A).
Sample space: The collection of all possible events is called sample space.
Random variables: Random variables are used to represent the events and
objects in the real world.
Conditional probability:
Conditional probability is the probability of an event occurring when another
event has already happened.
Let's suppose we want to calculate the probability of event A when event B has
already occurred, "the probability of A under the condition of B". It can be
written as:
P(A|B) = P(A ⋀ B) / P(B)
where P(A ⋀ B) is the joint probability of A and B, and P(B) is the marginal
probability of B.
If the probability of A is given and we need to find the probability of B
given A, then it will be given as:
P(B|A) = P(A ⋀ B) / P(A)
It can be explained using a Venn diagram: once B has occurred, the sample
space is reduced to the set B, and we can calculate event A given event B by
dividing the probability P(A ⋀ B) by P(B).
Example:
In a class, 70% of the students like English and 40% of the students like both
English and mathematics. What percentage of the students who like English also
like mathematics?
Solution:
P(Mathematics | English) = P(English ⋀ Mathematics) / P(English)
= 0.4 / 0.7 ≈ 0.57
Hence, about 57% of the students who like English also like mathematics.
Bayes' theorem:
Bayes' theorem is also known as Bayes' rule, Bayes' law, or Bayesian
reasoning, which determines the probability of an event with uncertain knowledge.
Bayes' theorem was named after the British mathematician Thomas Bayes.
The Bayesian inference is an application of Bayes' theorem, which is fundamental
to Bayesian statistics.
Bayes' theorem can be derived using the product rule and the conditional
probability of event A with known event B. From the product rule,
P(A ⋀ B) = P(A|B) P(B), and likewise P(A ⋀ B) = P(B|A) P(A). Equating the two
and dividing by P(B) gives:
P(A|B) = P(B|A) P(A) / P(B)    ...(a)
This equation is Bayes' rule; it shows the simple relationship between joint
and conditional probabilities. Here, P(B|A) is called the likelihood: we
assume the hypothesis is true and calculate the probability of the evidence.
P(A) is the prior probability and P(B) is the marginal probability of the
evidence.
In equation (a), in general, we can write P(B) = Σi P(Ai) P(B|Ai); hence
Bayes' rule can be written as:
P(Ai|B) = P(B|Ai) P(Ai) / Σk P(Ak) P(B|Ak)
Where A1, A2, A3,........, An is a set of mutually exclusive and exhaustive events.
Example-1:
A doctor is aware that the disease meningitis causes a patient to have a stiff
neck 80% of the time. He is also aware of some more facts, which are given as
follows:
o The known probability that a patient has meningitis is 1/30000.
o The known probability that any patient has a stiff neck is 2%.
Let a be the proposition that the patient has a stiff neck and b the
proposition that the patient has meningitis. Then we can calculate the
following:
P(a|b) = 0.8
P(b) = 1/30000
P(a) = 0.02
P(b|a) = P(a|b) P(b) / P(a) = (0.8 × 1/30000) / 0.02 ≈ 0.00133 = 1/750
Hence, we can assume that about 1 patient out of 750 patients with a stiff
neck has meningitis.
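The meningitis calculation can be checked with a few lines of Python:

# Bayes' rule check for the meningitis example:
# P(b|a) = P(a|b) * P(b) / P(a)
p_a_given_b = 0.8        # P(stiff neck | meningitis)
p_b = 1 / 30000          # P(meningitis)
p_a = 0.02               # P(stiff neck)

p_b_given_a = p_a_given_b * p_b / p_a
print(p_b_given_a)              # ~0.001333
print(round(1 / p_b_given_a))   # -> 750, i.e. 1 patient in 750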
Applications of Bayes' theorem in artificial intelligence:
o It is used to calculate the next step of a robot when the already executed
step is given.
o Bayes' theorem is helpful in weather forecasting.
o It can solve the Monty Hall problem.
Bayesian Belief Network:
A Bayesian network is a probabilistic graphical model which represents a set
of variables and their conditional dependencies. Bayesian networks are
probabilistic because they are built from a probability distribution and use
probability theory for prediction and anomaly detection.
Real world applications are probabilistic in nature, and to represent the relationship
between multiple events, we need a Bayesian network. It can also be used in
various tasks including prediction, anomaly detection, diagnostics, automated
insight, reasoning, time series prediction, and decision making under
uncertainty.
A Bayesian network can be used for building models from data and experts'
opinions, and it consists of two parts:
o Directed acyclic graph
o Table of conditional probabilities
The generalized form of a Bayesian network that represents and solves decision
problems under uncertain knowledge is known as an influence diagram.
Note: The Bayesian network graph does not contain any cycles. Hence, it is
known as a directed acyclic graph, or DAG.
A Bayesian network mainly has two components:
o Causal component
o Actual numbers
Each node in the Bayesian network has a conditional probability distribution
P(Xi | Parent(Xi)), which determines the effect of the parents on that node.
The joint probability distribution P[x1, x2, ..., xn] can be written as a
product of conditional probabilities:
P[x1, x2, ..., xn] = P[x1 | x2, ..., xn] P[x2 | x3, ..., xn] ... P[xn-1 | xn] P[xn]
In general, for each variable Xi, we can write the equation as:
P(Xi | Xi-1, ..., X1) = P(Xi | Parents(Xi))
Example: Harry installed a new burglar alarm at his home to detect burglary.
The alarm reliably responds to a burglary but also responds to minor
earthquakes. Harry has two neighbors, David and Sophia, who have taken
responsibility for informing Harry at work when they hear the alarm. David
always calls Harry when he hears the alarm, but sometimes he gets confused
with the phone ringing and calls then too. On the other hand, Sophia likes to
listen to loud music, so sometimes she misses the alarm. Here we would like to
compute the probability of the burglar alarm.
Problem:
Calculate the probability that the alarm has sounded, but neither a burglary
nor an earthquake has occurred, and both David and Sophia called Harry.
Solution:
o The Bayesian network for the above problem is given below. The network
structure shows that Burglary and Earthquake are the parent nodes of Alarm
and directly affect the probability of the alarm going off, while David's
and Sophia's calls depend on the alarm probability.
o The network represents the assumptions that the neighbors do not perceive
the burglary directly, do not notice minor earthquakes, and do not confer
with each other before calling.
o The conditional distribution for each node is given as a conditional
probabilities table, or CPT.
o Each row in a CPT must sum to 1, because the entries in the row represent
an exhaustive set of cases for the variable.
o In a CPT, a Boolean variable with k Boolean parents contains 2^k
probabilities. Hence, if there are two parents, the CPT will contain 4
probability values.
List of all events occurring in this network:
o Burglary (B)
o Earthquake(E)
o Alarm(A)
o David Calls(D)
o Sophia calls(S)
We can write the events of the problem statement in the form of the
probability P[D, S, A, B, E]. Using the joint probability distribution and the
network structure, this can be rewritten as:
P[D, S, A, B, E] = P(D | A) · P(S | A) · P(A | B, E) · P(B) · P(E)
Let's take the observed probability for the Burglary and Earthquake
components. For example:
P(E = False) = 0.999, which is the probability that an earthquake has not
occurred; equivalently, P(E = True) = 1 − 0.999 = 0.001.
The conditional probability of David calling depends on the probability of the
alarm.
The conditional probability of Sophia calling depends on its parent node,
Alarm.
From the formula of the joint distribution, we can write the problem statement
in the form of a probability distribution:
P(S, D, A, ¬B, ¬E) = P(S | A) · P(D | A) · P(A | ¬B ⋀ ¬E) · P(¬B) · P(¬E)
Hence, a Bayesian network can answer any query about the domain by using the
joint distribution.
There are two ways to understand the semantics of a Bayesian network, which
are given below:
1. To understand the network as a representation of the joint probability
distribution; this is helpful in understanding how to construct the network.
2. To understand the network as an encoding of a collection of conditional
independence statements; this is helpful in designing inference procedures.
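The burglary network can be sketched in Python as follows; the CPT numbers below are illustrative placeholders, since the original tables are not reproduced in these notes (only P(E = False) = 0.999 is given above), and the joint is computed with the factorisation P[D, S, A, B, E] = P(B) P(E) P(A|B,E) P(D|A) P(S|A).

# A sketch of the burglary Bayesian network. CPT values other than P(E)
# are assumed purely for illustration.
p_B = {True: 0.002, False: 0.998}   # P(Burglary) -- assumed
p_E = {True: 0.001, False: 0.999}   # P(Earthquake), from the text

# P(Alarm | Burglary, Earthquake) -- assumed values
p_A = {(True, True): 0.94, (True, False): 0.95,
       (False, True): 0.31, (False, False): 0.001}

p_D = {True: 0.91, False: 0.05}  # P(David calls | Alarm) -- assumed
p_S = {True: 0.75, False: 0.02}  # P(Sophia calls | Alarm) -- assumed

def joint(d, s, a, b, e):
    # P[D, S, A, B, E] = P(B) P(E) P(A|B,E) P(D|A) P(S|A)
    pa = p_A[(b, e)] if a else 1 - p_A[(b, e)]
    pd = p_D[a] if d else 1 - p_D[a]
    ps = p_S[a] if s else 1 - p_S[a]
    return p_B[b] * p_E[e] * pa * pd * ps

# Alarm sounded, no burglary, no earthquake, both David and Sophia called:
print(joint(d=True, s=True, a=True, b=False, e=False))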