The Architecture of Knowledge-Based Agent


UNIT 3

o An intelligent agent needs knowledge about the real world in order to make decisions and reason efficiently.
o Knowledge-based agents are agents that can maintain an internal state of knowledge, reason over that knowledge, update it after new observations, and take actions. These agents represent the world in some formal representation and act intelligently.
o Knowledge-based agents are composed of two main parts:
o Knowledge-base and
o Inference system.

A knowledge-based agent must be able to do the following:

o An agent should be able to represent states, actions, etc.
o An agent should be able to incorporate new percepts.
o An agent should be able to update the internal representation of the world.
o An agent should be able to deduce hidden properties of the world.
o An agent should be able to deduce appropriate actions.

The architecture of knowledge-based agent:


The above diagram represents a generalized architecture for a knowledge-based agent. The knowledge-based agent (KBA) takes input by perceiving the environment. The input goes to the inference engine of the agent, which communicates with the KB to decide on an action according to the knowledge stored in the KB. The learning element of the KBA regularly updates the KB by learning new knowledge.
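The perceive / update-KB / infer-action cycle described above can be sketched in Python. This is a minimal illustration only; the names (KnowledgeBase, tell, ask, make_percept_sentence, kb_agent) are placeholders of my own, not a specific library API, and the "inference" is reduced to a membership test.

class KnowledgeBase:
    """A toy KB that stores sentences as plain strings."""
    def __init__(self):
        self.sentences = set()

    def tell(self, sentence):
        # add a sentence to the KB
        self.sentences.add(sentence)

    def ask(self, query):
        # a real inference engine would derive an answer from the stored
        # sentences; here we only check for literal membership
        return query if query in self.sentences else None


def make_percept_sentence(percept, t):
    return f"Percept({percept}, {t})"

def make_action_query(t):
    return f"BestAction? at time {t}"

def kb_agent(kb, percept, t):
    """One cycle of the agent: TELL the percept, ASK for an action, TELL the action."""
    kb.tell(make_percept_sentence(percept, t))
    action = kb.ask(make_action_query(t)) or "NoOp"   # fall back when nothing is derivable
    kb.tell(f"Action({action}, {t})")
    return action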

Knowledge base: The knowledge base is a central component of a knowledge-based agent; it is also known as the KB. It is a collection of sentences (here 'sentence' is a technical term and is not identical to a sentence in English). These sentences are expressed in a language called a knowledge representation language. The knowledge base of a KBA stores facts about the world.

Why use a knowledge base?


A knowledge base is required so that an agent can update its knowledge, learn from experience, and act according to that knowledge.

Inference system
Inference means deriving new sentences from old ones. The inference system allows us to add a new sentence to the knowledge base. A sentence is a proposition about the world. The inference system applies logical rules to the KB to deduce new information and generate new facts so that the agent can update the KB. An inference system works mainly with two methods, which are given below (a minimal sketch of forward chaining follows the list):

o Forward chaining
o Backward chaining
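The following is a minimal Python sketch of forward chaining over propositional if-then rules; the rule format and example facts are illustrative, not taken from the text.

def forward_chain(rules, facts):
    """Rules are (premises, conclusion) pairs; facts is a set of known strings."""
    facts = set(facts)
    changed = True
    while changed:                       # keep firing rules until nothing new is added
        changed = False
        for premises, conclusion in rules:
            if conclusion not in facts and all(p in facts for p in premises):
                facts.add(conclusion)    # rule fires: add the derived fact
                changed = True
    return facts

rules = [({"Sunny", "Warm"}, "GoOutside"),
         ({"GoOutside"}, "WearSunscreen")]
print(forward_chain(rules, {"Sunny", "Warm"}))
# -> {'Sunny', 'Warm', 'GoOutside', 'WearSunscreen'}  (set order may vary)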

o Knowledge representation and reasoning (KR, KRR) is the part of Artificial Intelligence that is concerned with how AI agents think and how thinking contributes to their intelligent behavior.
o It is responsible for representing information about the real world so that a computer can understand it and can utilize this knowledge to solve complex real-world problems, such as diagnosing a medical condition or communicating with humans in natural language.
o It is also a way which describes how we can represent knowledge in artificial
intelligence. Knowledge representation is not just storing data into some
database, but it also enables an intelligent machine to learn from that knowledge
and experiences so that it can behave intelligently like a human.

What to Represent:
Following are the kind of knowledge which needs to be represented in AI systems:

o Object: All the facts about objects in our world domain. E.g., guitars contain strings, trumpets are brass instruments.
o Events: Events are the actions which occur in our world.
o Performance: It describes behavior which involves knowledge about how to do things.
o Meta-knowledge: It is knowledge about what we know.
o Facts: Facts are the truths about the real world and what we represent.
o Knowledge-Base: The central component of knowledge-based agents is the knowledge base, represented as KB. The knowledge base is a group of sentences (here, 'sentence' is used as a technical term and is not identical to an English sentence).
Knowledge: Knowledge is awareness or familiarity gained by experiences of facts, data,
and situations. Following are the types of knowledge in artificial intelligence:

Types of knowledge
Following are the various types of knowledge:

1. Declarative Knowledge:

o Declarative knowledge is knowing about something.
o It includes concepts, facts, and objects.
o It is also called descriptive knowledge and is expressed in declarative sentences.
o It is simpler than procedural knowledge.

2. Procedural Knowledge

o It is also known as imperative knowledge.


o Procedural knowledge is a type of knowledge which is responsible for knowing
how to do something.
o It can be directly applied to any task.
o It includes rules, strategies, procedures, agendas, etc.
o Procedural knowledge depends on the task on which it can be applied.

3. Meta-knowledge:

o Knowledge about the other types of knowledge is called Meta-knowledge.

4. Heuristic knowledge:

o Heuristic knowledge represents the knowledge of experts in a field or subject.
o Heuristic knowledge consists of rules of thumb based on previous experience and awareness of approaches which usually work well but are not guaranteed.

5. Structural knowledge:

o Structural knowledge is basic knowledge to problem-solving.


o It describes relationships between various concepts such as kind of, part of, and
grouping of something.
o It describes the relationship that exists between concepts or objects.

The relation between knowledge and intelligence:


Knowledge of the real world plays a vital role in intelligence, and the same holds for creating artificial intelligence. Knowledge plays an important role in demonstrating intelligent behavior in AI agents. An agent is only able to act accurately on some input when it has some knowledge or experience about that input.

Suppose you meet a person who speaks a language you don't know; you will not be able to act on what they say. The same applies to the intelligent behavior of agents.

As we can see in the diagram below, there is a decision maker which acts by sensing the environment and using knowledge. But if the knowledge part is not present, it cannot display intelligent behavior.
AI knowledge cycle:
An Artificial intelligence system has the following components for displaying intelligent
behavior:

o Perception
o Learning
o Knowledge Representation and Reasoning
o Planning
o Execution
The above diagram shows how an AI system can interact with the real world and which components help it to show intelligence. The AI system has a Perception component through which it retrieves information from its environment; this can be visual, audio, or another form of sensory input. The Learning component is responsible for learning from the data captured by the Perception component. In the complete cycle, the main components are Knowledge Representation and Reasoning; these two components are what make a machine show intelligent, human-like behavior. They are distinct from each other but closely coupled. Planning and Execution depend on the analysis of Knowledge Representation and Reasoning.

Approaches to knowledge representation:


There are mainly four approaches to knowledge representation, which are givenbelow:

1. Simple relational knowledge:

o It is the simplest way of storing facts which uses the relational method, and each fact about a set of objects is set out systematically in columns.
o This approach of knowledge representation is famous in database systems where
the relationship between different entities is represented.
o This approach has little opportunity for inference.

Example: The following is the simple relational knowledge representation.


Player Weight Age

Player1 65 23

Player2 58 18

Player3 75 24
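As a minimal illustration (not part of the original text), the same relational facts can be stored as rows and queried directly, which shows why this approach offers little room for inference:

players = [
    {"Player": "Player1", "Weight": 65, "Age": 23},
    {"Player": "Player2", "Weight": 58, "Age": 18},
    {"Player": "Player3", "Weight": 75, "Age": 24},
]

# We can only look facts up or filter them; no new knowledge is derived.
adults = [p["Player"] for p in players if p["Age"] >= 21]
print(adults)   # ['Player1', 'Player3']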

2. Inheritable knowledge:

o In the inheritable knowledge approach, all data must be stored into a hierarchy of
classes.
o All classes should be arranged in a generalized form or a hierarchal manner.
o In this approach, we apply inheritance property.
o Elements inherit values from other members of a class.
o This approach contains inheritable knowledge which shows a relation between
instance and class, and it is called instance relation.
o Every individual frame can represent the collection of attributes and its value.
o In this approach, objects and values are represented in Boxed nodes.
o We use Arrows which point from objects to their values.
o Example:
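The figure for this example is not reproduced here. As a minimal sketch (using a hypothetical Person / Player / Batsman hierarchy of my own, not the original figure), the inheritance property can be illustrated with a small class hierarchy in which an instance inherits attribute values from its classes:

class Person:
    legs = 2                 # default attribute value for every Person

class Player(Person):        # "Player is-a Person": class-to-class relation
    team = "Unknown"

class Batsman(Player):       # a deeper level of the hierarchy
    role = "Batting"

p = Batsman()                # instance relation: p is an instance of Batsman
print(p.legs, p.team, p.role)   # 2 Unknown Batting  (legs and team are inherited)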
3. Inferential knowledge:

o Inferential knowledge approach represents knowledge in the form of formal


logics.
o This approach can be used to derive more facts.
o It guarantees correctness.
o Example: Let's suppose there are two statements:
a. Marcus is a man
b. All men are mortal

Then it can be represented as:

man(Marcus)
∀x man(x) → mortal(x)

4. Procedural knowledge:

o The procedural knowledge approach uses small programs and codes which describe how to do specific things and how to proceed.
o In this approach, one important rule is used, which is the If-Then rule.
o For this kind of knowledge, we can use various coding languages such as LISP and Prolog.
o We can easily represent heuristic or domain-specific knowledge using this approach.
o However, it is not possible to represent all kinds of knowledge with this approach.

Requirements for knowledge Representation system:


A good knowledge representation system must possess the following properties.

1. Representational Accuracy: A KR system should have the ability to represent all kinds of required knowledge.

2. Inferential Adequacy: A KR system should have the ability to manipulate the representational structures to produce new knowledge corresponding to the existing structure.

3. Inferential Efficiency: The ability to direct the inference mechanism in the most productive directions by storing appropriate guides.

4. Acquisitional Efficiency: The ability to acquire new knowledge easily using automatic methods.

Techniques of knowledge representation


There are mainly four ways of knowledge representation which are given as follows:

1. Logical Representation
2. Semantic Network Representation
3. Frame Representation
4. Production Rules
1. Logical Representation
Logical representation is a language with some concrete rules which deals with propositions and has no ambiguity in representation. Logical representation means drawing conclusions based on various conditions. This representation lays down some important communication rules. It consists of precisely defined syntax and semantics which support sound inference. Each sentence can be translated into logic using syntax and semantics.

Syntax:

o Syntax consists of the rules which decide how we can construct legal sentences in the logic.
o It determines which symbols we can use in knowledge representation and how to write those symbols.

Semantics:

o Semantics are the rules by which we can interpret the sentence in the logic.
o Semantic also involves assigning a meaning to each sentence.

Logical representation can be categorised into mainly two logics:


a. Propositional Logics
b. Predicate logics

Note: We will discuss Propositional Logic and Predicate Logic in later chapters.

Advantages of logical representation:

1. Logical representation enables us to do logical reasoning.


2. Logical representation is the basis for programming languages.

Disadvantages of logical Representation:

1. Logical representations have some restrictions and are challenging to work with.
2. Logical representation technique may not be very natural, and inference may not be so
efficient.

Note: Do not confuse logical representation with logical reasoning: logical representation is a representation language, whereas reasoning is the process of thinking logically.

2. Semantic Network Representation


Semantic networks are an alternative to predicate logic for knowledge representation. In semantic networks, we represent our knowledge in the form of graphical networks. A network consists of nodes representing objects and arcs which describe the relationships between those objects. Semantic networks can categorize objects in different forms and can also link those objects. Semantic networks are easy to understand and can be easily extended.

This representation consists of mainly two types of relations:

a. IS-A relation (Inheritance)

b. Kind-of-relation

Example: Following are some statements which we need to represent in the form of
nodes and arcs.

Statements:
a) Jerry is a cat.
b) Jerry is a mammal.
c) Jerry is owned by Priya.
d) Jerry is brown colored.
e) All mammals are animals.

In the above diagram, we have represented the different types of knowledge in the form of nodes and arcs. Each object is connected to another object by some relation.
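The network diagram itself is not reproduced here. As a minimal sketch (illustrative only), the same statements can be stored as labelled arcs of the form (subject, relation, object) and then followed:

edges = [
    ("Jerry", "is-a", "Cat"),
    ("Jerry", "is-a", "Mammal"),
    ("Jerry", "owned-by", "Priya"),
    ("Jerry", "has-color", "Brown"),
    ("Mammal", "is-a", "Animal"),
]

def objects_of(node, relation):
    """Follow arcs with a given label starting from a node."""
    return [o for s, r, o in edges if s == node and r == relation]

print(objects_of("Jerry", "is-a"))   # ['Cat', 'Mammal']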

Drawbacks in Semantic representation:

1. Semantic networks take more computational time at runtime, as we need to traverse the complete network tree to answer a question. In the worst case, we may traverse the entire tree only to find that the solution does not exist in the network.
2. Semantic networks try to model human-like memory (which has on the order of 10^15 neurons and links) to store information, but in practice it is not possible to build such a vast semantic network.
3. These types of representations are inadequate as they do not have any equivalent
quantifier, e.g., for all, for some, none, etc.
4. Semantic networks do not have any standard definition for the link names.
5. These networks are not intelligent and depend on the creator of the system.
Advantages of Semantic network:

1. Semantic networks are a natural representation of knowledge.


2. Semantic networks convey meaning in a transparent manner.
3. These networks are simple and easily understandable.

3. Frame Representation
A frame is a record-like structure which consists of a collection of attributes and their values to describe an entity in the world. Frames are an AI data structure which divides knowledge into substructures by representing stereotyped situations. A frame consists of a collection of slots and slot values. These slots may be of any type and size. Slots have names and values, which are called facets.

Facets: The various aspects of a slot are known as facets. Facets are features of frames which enable us to put constraints on the frames. Example: IF-NEEDED facets are invoked when the data of a particular slot is needed. A frame may consist of any number of slots, a slot may include any number of facets, and a facet may have any number of values. A frame is also known as slot-filler knowledge representation in artificial intelligence.

Frames are derived from semantic networks and later evolved into our modern-day classes and objects. A single frame is not very useful on its own; a frame system consists of a collection of connected frames. In a frame, knowledge about an object or event can be stored together in the knowledge base. Frames are a type of technology which is widely used in various applications, including natural language processing and machine vision.

Example: 1
Let's take an example of a frame for a book

Slots Fillers

Title Artificial Intelligence

Genre Computer Science


Author Peter Norvig

Edition Third Edition

Year 1996

Page 1152

Example 2:
Let's suppose we are taking an entity, Peter. Peter is a doctor by profession, his age is 25, he is single, and his weight is 78. The following is the frame representation for this:

Slots Fillers

Name Peter

Profession Doctor

Age 25

Marital status Single

Weight 78
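As a minimal sketch (illustrative only, with the same data as the tables above), the two frames can be written as Python dictionaries mapping slot names to fillers; facets could be modelled as nested dictionaries.

book_frame = {
    "Title":   "Artificial Intelligence",
    "Genre":   "Computer Science",
    "Author":  "Peter Norvig",
    "Edition": "Third Edition",
    "Year":    1996,
    "Page":    1152,
}

peter_frame = {
    "Name": "Peter",
    "Profession": "Doctor",
    "Age": 25,
    "Marital status": "Single",
    "Weight": 78,
}

print(peter_frame["Profession"])   # Doctor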

Advantages of frame representation:

1. The frame knowledge representation makes the programming easier by grouping the
related data.
2. The frame representation is comparably flexible and used by many applications in AI.
3. It is very easy to add slots for new attributes and relations.
4. It is easy to include default data and to search for missing values.
5. Frame representation is easy to understand and visualize.

Disadvantages of frame representation:

1. In a frame system, the inference mechanism is not easily processed.
2. The inference mechanism cannot proceed smoothly with frame representation.
3. Frame representation has a very generalized approach.

4. Production Rules
A production rules system consists of (condition, action) pairs, which means "If condition then action". It has mainly three parts:

o The set of production rules


o Working Memory
o The recognize-act-cycle

In a production system, the agent checks for the condition, and if the condition holds, the corresponding production rule fires and its action is carried out. The condition part of a rule determines which rule may be applied to a problem, and the action part carries out the associated problem-solving steps. This complete process is called the recognize-act cycle.

The working memory contains a description of the current state of problem solving, and a rule can write knowledge into the working memory. This knowledge may then match and fire other rules.

If a new situation (state) is generated, multiple production rules may be triggered together; this set of rules is called the conflict set. In this situation, the agent needs to select one rule from the set, which is called conflict resolution.

Example:

o IF (at bus stop AND bus arrives) THEN action (get into the bus)
o IF (on the bus AND paid AND empty seat) THEN action (sit down).
o IF (on bus AND unpaid) THEN action (pay charges).
o IF (bus arrives at destination) THEN action (get down from the bus).
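A minimal sketch of the recognize-act cycle, using the bus rules above; the function and variable names are my own, and conflict resolution is simplified to "pick the first matching rule".

rules = [
    ({"at bus stop", "bus arrives"},        "get into the bus"),
    ({"on the bus", "paid", "empty seat"},  "sit down"),
    ({"on the bus", "unpaid"},              "pay charges"),
    ({"bus arrives at destination"},        "get down from the bus"),
]

def recognize_act(working_memory):
    """One recognize-act cycle: match rules, resolve conflicts, fire one rule."""
    conflict_set = [(cond, act) for cond, act in rules
                    if cond <= working_memory]          # all conditions satisfied
    if not conflict_set:
        return None
    _, action = conflict_set[0]                         # simplistic conflict resolution
    return action

print(recognize_act({"on the bus", "unpaid"}))          # pay charges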
Advantages of Production rule:

1. The production rules are expressed in natural language.


2. The production rules are highly modular, so we can easily remove, add or modify an
individual rule.

Disadvantages of Production rule:

1. A production rule system does not exhibit any learning capabilities, as it does not store the results of problems for future use.
2. During the execution of the program, many rules may be active; hence rule-based production systems can be inefficient.

Propositional logic in Artificial intelligence


Propositional logic (PL) is the simplest form of logic where all the statements are made
by propositions. A proposition is a declarative statement which is either true or false. It is
a technique of knowledge representation in logical and mathematical form.

Example:

1. a) It is Sunday.
2. b) The Sun rises from West (False proposition)
3. c) 3+3= 7(False proposition)
4. d) 5 is a prime number.

Following are some basic facts about propositional logic:

o Propositional logic is also called Boolean logic as it works on 0 and 1.


o In propositional logic, we use symbolic variables to represent the logic, and we can use any symbol to represent a proposition, such as A, B, C, P, Q, R, etc.
o A proposition can be either true or false, but it cannot be both.
o Propositional logic consists of propositions and logical connectives.
o These connectives are also called logical operators.
o The propositions and connectives are the basic elements of propositional logic.
o A connective is a logical operator which connects two sentences.
o A proposition formula which is always true is called a tautology; it is also called a valid sentence.
o A proposition formula which is always false is called a contradiction.
o A proposition formula which can take both true and false values is called a contingency.
o Statements which are questions, commands, or opinions are not propositions; for example, "Where is Rohini?", "How are you?", and "What is your name?" are not propositions.

Syntax of propositional logic:


The syntax of propositional logic defines the allowable sentences for the knowledge
representation. There are two types of Propositions:

a. Atomic Propositions
b. Compound propositions

o Atomic Proposition: Atomic propositions are simple propositions. They consist of a single proposition symbol. These are the sentences which must be either true or false.

Example:

1. a) "2+2 is 4" is an atomic proposition, as it is a true fact.
2. b) "The Sun is cold" is also an atomic proposition, as it is a false fact.
o Compound proposition: Compound propositions are constructed by combining simpler
or atomic propositions, using parenthesis and logical connectives.

Example:

1. a) "It is raining today, and street is wet."


2. b) "Ankit is a doctor, and his clinic is in Mumbai."

Logical Connectives:
Logical connectives are used to connect two simpler propositions or to represent a sentence logically. We can create compound propositions with the help of logical connectives. There are mainly five connectives, which are given as follows:
1. Negation: A sentence such as ¬P is called the negation of P. A literal can be either a positive literal or a negative literal.
2. Conjunction: A sentence which has ∧ connective such as, P ∧ Q is called a conjunction.
Example: Rohan is intelligent and hardworking. It can be written as,
P= Rohan is intelligent,
Q= Rohan is hardworking. → P∧ Q.
3. Disjunction: A sentence which has a ∨ connective, such as P ∨ Q, is called a disjunction, where P and Q are propositions.
Example: "Ritika is a doctor or an engineer."
Here P = Ritika is a doctor, Q = Ritika is an engineer, so we can write it as P ∨ Q.
4. Implication: A sentence such as P → Q, is called an implication. Implications are also
known as if-then rules. It can be represented as
If it is raining, then the street is wet.
Let P= It is raining, and Q= Street is wet, so it is represented as P → Q
5. Biconditional: A sentence such as P⇔ Q is a Biconditional sentence, example If I am
breathing, then I am alive
P= I am breathing, Q= I am alive, it can be represented as P ⇔ Q.

Following is the summarized table for propositional logic connectives:

Connective   Technical term   Example
∧            Conjunction      P ∧ Q
∨            Disjunction      P ∨ Q
→            Implication      P → Q
⇔            Biconditional    P ⇔ Q
¬            Negation         ¬P

Truth Table:
In propositional logic, we need to know the truth values of propositions in all possible scenarios. We can combine all the possible combinations with logical connectives, and the representation of these combinations in a tabular format is called a truth table. Following are the truth tables for all logical connectives:

Truth table with three propositions:
We can build a proposition composed of three propositions P, Q, and R. This truth table is made up of 8 rows (2³ combinations), as we have taken three proposition symbols.
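The table itself is not reproduced here; the following minimal Python sketch (illustrative only) enumerates all 2³ = 8 rows for an arbitrary formula over P, Q, and R.

from itertools import product

def truth_table(formula, symbols=("P", "Q", "R")):
    """Return a list of (assignment, value) pairs for the given formula."""
    rows = []
    for values in product([True, False], repeat=len(symbols)):
        env = dict(zip(symbols, values))
        rows.append((env, formula(**env)))
    return rows

# Example formula: (P AND Q) OR (NOT R)
for env, result in truth_table(lambda P, Q, R: (P and Q) or (not R)):
    print(env, "->", result)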

Precedence of connectives:
Just like arithmetic operators, there is a precedence order for propositional connectors
or logical operators. This order should be followed while evaluating a propositional
problem. Following is the list of the precedence order for operators:

Precedence Operators

First Precedence Parenthesis

Second Precedence Negation

Third Precedence Conjunction(AND)

Fourth Precedence Disjunction(OR)

Fifth Precedence Implication

Sixth Precedence Biconditional

Note: For better understanding, use parentheses to make sure of the correct interpretation. For example, ¬R ∨ Q can be interpreted as (¬R) ∨ Q.

Logical equivalence:
Logical equivalence is one of the features of propositional logic. Two propositions are
said to be logically equivalent if and only if the columns in the truth table are identical
to each other.

Let's take two propositions A and B; if they are logically equivalent we can write it as A ⇔ B. In the truth table below, we can see that the columns for ¬A ∨ B and A → B are identical; hence ¬A ∨ B is equivalent to A → B.
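The truth table itself is not reproduced here; the following minimal Python check (illustrative only) confirms the equivalence by comparing the two columns over all truth-value assignments.

from itertools import product

def implies(a, b):
    # truth table of A -> B: false only when A is true and B is false
    return not (a and not b)

equivalent = all(((not A) or B) == implies(A, B)
                 for A, B in product([True, False], repeat=2))
print(equivalent)   # True: the columns for (NOT A) OR B and A -> B are identical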

Properties of Operators:

o Commutativity:
o P∧ Q= Q ∧ P, or
o P ∨ Q = Q ∨ P.
o Associativity:
o (P ∧ Q) ∧ R= P ∧ (Q ∧ R),
o (P ∨ Q) ∨ R= P ∨ (Q ∨ R)
o Identity element:
o P ∧ True = P,
o P ∨ True= True.
o Distributive:
o P∧ (Q ∨ R) = (P ∧ Q) ∨ (P ∧ R).
o P ∨ (Q ∧ R) = (P ∨ Q) ∧ (P ∨ R).
o DE Morgan's Law:
o ¬ (P ∧ Q) = (¬P) ∨ (¬Q)
o ¬ (P ∨ Q) = (¬ P) ∧ (¬Q).
o Double-negation elimination:
o ¬ (¬P) = P.
Limitations of Propositional logic:

o We cannot represent relations like ALL, some, or none with propositional logic. Example:
a. All the girls are intelligent.
b. Some apples are sweet.
o Propositional logic has limited expressive power.
o In propositional logic, we cannot describe statements in terms of their properties or
logical relationships.

Rules of Inference in Artificial intelligence


Inference:
In artificial intelligence, we need intelligent computers which can create new logic from old logic or from evidence, so generating conclusions from evidence and facts is termed inference.

Inference rules:
Inference rules are the templates for generating valid arguments. Inference rules are
applied to derive proofs in artificial intelligence, and the proof is a sequence of the
conclusion that leads to the desired goal.

In inference rules, the implication among all the connectives plays an important role.
Following are some terminologies related to inference rules:

o Implication: It is one of the logical connectives which can be represented as P → Q. It is


a Boolean expression.
o Converse: The converse of implication, which means the right-hand side proposition
goes to the left-hand side and vice-versa. It can be written as Q → P.
o Contrapositive: The negation of converse is termed as contrapositive, and it can be
represented as ¬ Q → ¬ P.
o Inverse: The negation of implication is called inverse. It can be represented as ¬ P → ¬
Q.
From the above term some of the compound statements are equivalent to each other,
which we can prove using truth table:

Hence from the above truth table, we can prove that P → Q is equivalent to ¬ Q → ¬ P,
and Q→ P is equivalent to ¬ P → ¬ Q.

Types of Inference rules:


1. Modus Ponens:
The Modus Ponens rule is one of the most important rules of inference. It states that if P → Q is true and P is true, then we can infer that Q is true. It can be represented as:

P → Q, P ∴ Q

Example:

Statement-1: "If I am sleepy then I go to bed" ==> P→ Q

Statement-2: "I am sleepy" ==> P

Conclusion: "I go to bed." ==> Q.

Hence, we can say that, if P→ Q is true and P is true then Q will be true.

Proof by Truth table:
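The truth-table figure is not reproduced here; the following minimal brute-force check (illustrative only) confirms that ((P → Q) ∧ P) → Q is a tautology, which is exactly what the truth-table proof shows.

from itertools import product

def implies(a, b):
    return not (a and not b)   # truth table of implication

valid = all(implies(implies(P, Q) and P, Q)
            for P, Q in product([True, False], repeat=2))
print(valid)    # True: Modus Ponens is valid in every row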


2. Modus Tollens:
The Modus Tollens rule states that if P → Q is true and ¬Q is true, then ¬P will also be true. It can be represented as:

P → Q, ¬Q ∴ ¬P

Statement-1: "If I am sleepy then I go to bed" ==> P→ Q

Statement-2: "I do not go to the bed."==> ~Q

Statement-3: Which infers that "I am not sleepy" => ~P

Proof by Truth table:

3. Hypothetical Syllogism:
The Hypothetical Syllogism rule states that if P → Q is true and Q → R is true, then P → R is true. It can be represented with the following notation:

P → Q, Q → R ∴ P → R

Example:

Statement-1: If you have my home key then you can unlock my home. P→Q
Statement-2: If you can unlock my home then you can take my money. Q→R
Conclusion: If you have my home key then you can take my money. P→R
Proof by truth table:

4. Disjunctive Syllogism:
The Disjunctive Syllogism rule states that if P ∨ Q is true and ¬P is true, then Q will be true. It can be represented as:

P ∨ Q, ¬P ∴ Q

Example:

Statement-1: Today is Sunday or Monday. ==>P∨Q

Statement-2: Today is not Sunday. ==> ¬P

Conclusion: Today is Monday. ==> Q

Proof by truth-table:
5. Addition:
The Addition rule is one of the common inference rules. It states that if P is true, then P ∨ Q will be true. It can be represented as:

P ∴ P ∨ Q

Example:

Statement: I have a vanilla ice-cream. ==> P

Let Q: I have chocolate ice-cream.

Conclusion: I have vanilla or chocolate ice-cream. ==> (P ∨ Q)

Proof by Truth-Table:

6. Simplification:
The Simplification rule states that if P ∧ Q is true, then P will also be true (and likewise Q). It can be represented as:

P ∧ Q ∴ P

Proof by Truth-Table:
7. Resolution:
The Resolution rule states that if P ∨ Q and ¬P ∨ R are true, then Q ∨ R will also be true. It can be represented as:

P ∨ Q, ¬P ∨ R ∴ Q ∨ R

Proof by Truth-Table:

The Wumpus World in Artificial intelligence


Wumpus world:
The Wumpus world is a simple world example used to illustrate the worth of a knowledge-based agent and to demonstrate knowledge representation. It was inspired by the video game Hunt the Wumpus by Gregory Yob from 1973.

The Wumpus world is a cave with 4×4 rooms connected by passageways, so there are a total of 16 rooms connected to each other. We have a knowledge-based agent who will move forward in this world. The cave has a room with a beast called the Wumpus, who eats anyone who enters that room. The Wumpus can be shot by the agent, but the agent has only a single arrow. In the Wumpus world there are some pit rooms which are bottomless, and if the agent falls into a pit, it will be stuck there forever. The exciting thing about this cave is that in one room there is a possibility of finding a heap of gold. So the agent's goal is to find the gold and climb out of the cave without falling into a pit or being eaten by the Wumpus. The agent gets a reward if it comes out with the gold, and a penalty if it is eaten by the Wumpus or falls into a pit.
Note: Here Wumpus is static and cannot move.

Following is a sample diagram for representing the Wumpus world. It is showing some
rooms with Pits, one room with Wumpus and one agent at (1, 1) square location of the
world.

There are also some components which can help the agent to navigate the cave.
These components are given as follows:

a. The rooms adjacent to the Wumpus room are smelly, so the agent will perceive a stench there.
b. The rooms adjacent to a pit have a breeze, so if the agent is near a pit it will perceive the breeze.
c. There will be glitter in a room if and only if the room has the gold.
d. The Wumpus can be killed by the agent if the agent is facing it, and the Wumpus will then emit a horrible scream which can be heard anywhere in the cave.

PEAS description of Wumpus world:


To explain the Wumpus world we have given PEAS description as below:
Performance measure:

o +1000 reward points if the agent comes out of the cave with the gold.
o -1000 points penalty for being eaten by the Wumpus or falling into the pit.
o -1 for each action, and -10 for using an arrow.
o The game ends if either the agent dies or comes out of the cave.

Environment:

o A 4*4 grid of rooms.


o The agent is initially in room square [1, 1], facing toward the right.
o Location of Wumpus and gold are chosen randomly except the first square [1,1].
o Each square of the cave can be a pit with probability 0.2 except the first square.

Actuators:

o Left turn,
o Right turn
o Move forward
o Grab
o Release
o Shoot.

Sensors:

o The agent will perceive the stench if he is in the room adjacent to the Wumpus. (Not
diagonally).
o The agent will perceive breeze if he is in the room directly adjacent to the Pit.
o The agent will perceive the glitter in the room where the gold is present.
o The agent will perceive the bump if he walks into a wall.
o When the Wumpus is shot, it emits a horrible scream which can be perceived anywhere
in the cave.
o These percepts can be represented as a five-element list, in which there is a different indicator for each sensor.
o For example, if the agent perceives stench and breeze, but no glitter, no bump, and no scream, then it can be represented as:

[Stench, Breeze, None, None, None].
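A minimal sketch (illustrative only) of building such a five-element percept list:

def make_percept(stench=False, breeze=False, glitter=False, bump=False, scream=False):
    """Build the five-element percept list; None means the indicator is absent."""
    names = ["Stench", "Breeze", "Glitter", "Bump", "Scream"]
    flags = [stench, breeze, glitter, bump, scream]
    return [name if flag else None for name, flag in zip(names, flags)]

print(make_percept(stench=True, breeze=True))
# ['Stench', 'Breeze', None, None, None]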

The Wumpus world Properties:


o Partially observable: The Wumpus world is partially observable because the agent can
only perceive the close environment such as an adjacent room.
o Deterministic: It is deterministic, as the result and outcome of the world are already
known.
o Sequential: The order is important, so it is sequential.
o Static: It is static as Wumpus and Pits are not moving.
o Discrete: The environment is discrete.
o One agent: The environment is a single agent as we have one agent only and Wumpus
is not considered as an agent.

Exploring the Wumpus world:


Now we will explore the Wumpus world and will determine how the agent will find its
goal by applying logical reasoning.

Agent's First step:

Initially, the agent is in the first room or on the square [1,1], and we already know that
this room is safe for the agent, so to represent on the below diagram (a) that room is
safe we will add symbol OK. Symbol A is used to represent agent, symbol B for the
breeze, G for Glitter or gold, V for the visited room, P for pits, W for Wumpus.

At Room [1,1] agent does not feel any breeze or any Stench which means the adjacent
squares are also OK.
Agent's second Step:

Now the agent needs to move forward, so it will move either to [1,2] or to [2,1]. Let's suppose the agent moves to the room [2,1]. In this room the agent perceives some breeze, which means a pit is around this room. The pit can be in [3,1] or [2,2], so we will add the symbol P? to mark these as possible pit rooms.

Now the agent will stop and think, and will not make any harmful move. The agent will go back to the room [1,1]. The rooms [1,1] and [2,1] have been visited by the agent, so we will use the symbol V to represent the visited squares.

Agent's third step:

At the third step, the agent will move to the room [1,2], which is OK. In the room [1,2] the agent perceives a stench, which means there must be a Wumpus nearby. But the Wumpus cannot be in the room [1,1] by the rules of the game, and also not in [2,2] (the agent did not detect any stench when it was at [2,1]). Therefore the agent infers that the Wumpus is in the room [1,3]. In the current state there is no breeze, which means that in [2,2] there is no pit and no Wumpus, so it is safe; we will mark it OK, and the agent moves further to [2,2].
Agent's fourth step:

At room [2,2] there is no stench and no breeze, so let's suppose the agent decides to move to [2,3]. At room [2,3] the agent perceives glitter, so it should grab the gold and climb out of the cave.

First-Order Logic in Artificial intelligence


In the topic of propositional logic, we have seen how to represent statements using propositional logic. But unfortunately, in propositional logic we can only represent facts which are either true or false. PL is not sufficient to represent complex sentences or natural language statements. Propositional logic has very limited expressive power. Consider the following sentences, which we cannot represent using PL:

o "Some humans are intelligent", or


o "Sachin likes cricket."

To represent the above statements, PL logic is not sufficient, so we required some more
powerful logic, such as first-order logic.

First-Order logic:
o First-order logic is another way of knowledge representation in artificial intelligence. It is
an extension to propositional logic.
o FOL is sufficiently expressive to represent the natural language statements in a concise
way.
o First-order logic is also known as predicate logic or first-order predicate logic. First-order logic is a powerful language that expresses information about objects in a more natural way and can also express the relationships between those objects.
o First-order logic (like natural language) does not only assume that the world contains facts, as propositional logic does, but also assumes the following things in the world:
o Objects: A, B, people, numbers, colors, wars, theories, squares, pits, wumpus, ......
o Relations: These can be unary relations such as: red, round, is adjacent, or n-ary relations such as: the sister of, brother of, has color, comes between
o Functions: father of, best friend, third inning of, end of, ......
o Like natural language, first-order logic also has two main parts:
a. Syntax
b. Semantics

Syntax of First-Order logic:


The syntax of FOL determines which collection of symbols is a logical expression in first-
order logic. The basic syntactic elements of first-order logic are symbols. We write
statements in short-hand notation in FOL.

Basic Elements of First-order logic:


Following are the basic elements of FOL syntax:

Constant 1, 2, A, John, Mumbai, cat,....

Variables x, y, z, a, b,....

Predicates Brother, Father, >,....


Function sqrt, LeftLegOf, ....

Connectives ∧, ∨, ¬, ⇒, ⇔

Equality ==

Quantifier ∀, ∃

Atomic sentences:

o Atomic sentences are the most basic sentences of first-order logic. These sentences are
formed from a predicate symbol followed by a parenthesis with a sequence of terms.
o We can represent atomic sentences as Predicate (term1, term2, ......, term n).

Example: Ravi and Ajay are brothers: => Brothers(Ravi, Ajay)

Chinky is a cat: => cat (Chinky).

Complex Sentences:

o Complex sentences are made by combining atomic sentences using connectives.

First-order logic statements can be divided into two parts:

o Subject: Subject is the main part of the statement.


o Predicate: A predicate can be defined as a relation, which binds two atoms together in a
statement.

Consider the statement: "x is an integer.", it consists of two parts, the first part x is
the subject of the statement and second part "is an integer," is known as a predicate.
Quantifiers in First-order logic:
o A quantifier is a language element which generates quantification, and quantification specifies the quantity of specimens in the universe of discourse.
o These are the symbols that permit us to determine or identify the range and scope of the variable in a logical expression. There are two types of quantifiers:
a. Universal Quantifier, (for all, everyone, everything)
b. Existential quantifier, (for some, at least one).

Universal Quantifier:
Universal quantifier is a symbol of logical representation, which specifies that the
statement within its range is true for everything or every instance of a particular thing.

The Universal quantifier is represented by a symbol ∀, which resembles an inverted A.

Note: In universal quantifier we use implication "→".

If x is a variable, then ∀x is read as:

o For all x
o For each x
o For every x.

Example:
All men drink coffee.

Let x be a variable that refers to a man, so the statement can be represented in the UOD as below:
∀x man(x) → drink(x, coffee).

It will be read as: For all x, where x is a man, x drinks coffee.

Existential Quantifier:
Existential quantifiers are the type of quantifiers, which express that the statement within
its scope is true for at least one instance of something.

It is denoted by the logical operator ∃, which resembles a reversed E. When it is used with a predicate variable, it is called an existential quantifier.

Note: In Existential quantifier we always use AND or Conjunction symbol ( ∧).

If x is a variable, then existential quantifier will be ∃x or ∃(x). And it will be read as:

o There exists a 'x.'


o For some 'x.'
o For at least one 'x.'

Example:
Some boys are intelligent.

∃x: boys(x) ∧ intelligent(x)

It will be read as: There exists some x such that x is a boy and x is intelligent.

Points to remember:
o The main connective for universal quantifier ∀ is implication →.
o The main connective for existential quantifier ∃ is and ∧.

Properties of Quantifiers:
o In universal quantifier, ∀x∀y is similar to ∀y∀x.
o In Existential quantifier, ∃x∃y is similar to ∃y∃x.
o ∃x∀y is not similar to ∀y∃x.

Some Examples of FOL using quantifier:

a) All birds fly.

In this question the predicate is "fly(bird)."


Since all birds fly, it will be represented as follows:

∀x bird(x) → fly(x).

b) Every man respects his parent.

In this question, the predicate is "respect(x, y)," where x=man, and y= parent.

Since there is every man so will use ∀, and it will be represented as follows:

∀x man(x) → respects (x, parent).

c) Some boys play cricket.

In this question, the predicate is "play(x, y)," where x= boys, and y= game. Since
there are some boys so we will use ∃, and it will be represented as:

∃x boys(x) ∧ play(x, cricket).

d) Not all students like both Mathematics and Science.

In this question, the predicate is "like(x, y)," where x= student, and y= subject.
Since there are not all students, so we will use ∀ with negation, so following
representation for this:

¬∀ (x) [ student(x) → like(x, Mathematics) ∧ like(x, Science)].

e) Only one student failed in Mathematics.

In this question, the predicate is "failed(x, y)," where x= student, and y= subject.
Since there is only one student who failed in Mathematics, so we will use following
representation for this:

∃(x) [ student(x) ∧ failed(x, Mathematics) ∧ ∀(y) [¬(x==y) ∧ student(y) → ¬failed(y, Mathematics)]].

Free and Bound Variables:


The quantifiers interact with variables which appear in a suitable way. There are two
types of variables in First-order logic which are given below:
Free Variable: A variable is said to be a free variable in a formula if it occurs outside the
scope of the quantifier.

Example: ∀x ∃(y)[P (x, y, z)], where z is a free variable.

Bound Variable: A variable is said to be a bound variable in a formula if it occurs within


the scope of the quantifier.

Example: ∀x ∀y [A(x) ∧ B(y)], here x and y are the bound variables.

Knowledge Engineering in First-order logic


What is knowledge-engineering?
The process of constructing a knowledge base in first-order logic is called knowledge engineering. In knowledge engineering, someone who investigates a particular domain, learns the important concepts of that domain, and generates a formal representation of the objects is known as a knowledge engineer.

In this topic, we will understand the knowledge-engineering process in an electronic circuit domain, which is already familiar. This approach is mainly suitable for creating a special-purpose knowledge base.

The knowledge-engineering process:


Following are the main steps of the knowledge-engineering process. Using these steps, we will develop a knowledge base which will allow us to reason about a digital circuit (a one-bit full adder), which is given below.
1. Identify the task:
The first step of the process is to identify the task, and for the digital circuit, there are
various reasoning tasks.

At the first level or highest level, we will examine the functionality of the circuit:

o Does the circuit add properly?


o What will be the output of gate A2, if all the inputs are high?

At the second level, we will examine the circuit structure details such as:

o Which gate is connected to the first input terminal?


o Does the circuit have feedback loops?

2. Assemble the relevant knowledge:


In the second step, we will assemble the relevant knowledge which is required for digital
circuits. So for digital circuits, we have the following required knowledge:

o Logic circuits are made up of wires and gates.


o Signal flows through wires to the input terminal of the gate, and each gate produces the
corresponding output which flows further.
o In this logic circuit, there are four types of gates used: AND, OR, XOR, and NOT.
o All these gates have one output terminal and two input terminals (except NOT gate, it
has one input terminal).

3. Decide on vocabulary:
The next step of the process is to select functions, predicate, and constants to represent
the circuits, terminals, signals, and gates. Firstly we will distinguish the gates from each
other and from other objects. Each gate is represented as an object which is named by a
constant, such as, Gate(X1). The functionality of each gate is determined by its type,
which is taken as constants such as AND, OR, XOR, or NOT. Circuits will be identified
by a predicate: Circuit (C1).

For the terminal, we will use predicate: Terminal(x).


For gate input, we will use the function In(1, X1) for denoting the first input terminal of
the gate, and for output terminal we will use Out (1, X1).

The function Arity(c, i, j) is used to denote that circuit c has i inputs and j outputs.

The connectivity between gates can be represented by predicate Connect(Out(1, X1),


In(1, X1)).

We use a unary predicate On (t), which is true if the signal at a terminal is on.

4. Encode general knowledge about the domain:


To encode the general knowledge about the logic circuit, we need some following rules:

o If two terminals are connected then they have the same input signal, it can be
represented as:

1. ∀ t1, t2 Terminal (t1) ∧ Terminal (t2) ∧ Connect (t1, t2) → Signal (t1) = Signal (t2).
o Signal at every terminal will have either value 0 or 1, it will be represented as:

1. ∀ t Terminal (t) →Signal (t) = 1 ∨Signal (t) = 0.


o Connect predicates are commutative:

1. ∀ t1, t2 Connect(t1, t2) → Connect (t2, t1).


o Representation of types of gates:

1. ∀ g Gate(g) ∧ r = Type(g) → r = OR ∨r = AND ∨r = XOR ∨r = NOT.


o Output of AND gate will be zero if and only if any of its input is zero.

1. ∀ g Gate(g) ∧ Type(g) = AND →Signal (Out(1, g))= 0 ⇔ ∃n Signal (In(n, g))= 0.


o Output of OR gate is 1 if and only if any of its input is 1:

1. ∀ g Gate(g) ∧ Type(g) = OR → Signal (Out(1, g))= 1 ⇔ ∃n Signal (In(n, g))= 1


o Output of XOR gate is 1 if and only if its inputs are different:

1. ∀ g Gate(g) ∧ Type(g) = XOR → Signal (Out(1, g)) = 1 ⇔ Signal (In(1, g)) ≠ Signal (In(2, g)).
o Output of NOT gate is invert of its input:
1. ∀ g Gate(g) ∧ Type(g) = NOT → Signal (In(1, g)) ≠ Signal (Out(1, g)).
o All the gates in the above circuit have two inputs and one output (except NOT gate).

1. ∀ g Gate(g) ∧ Type(g) = NOT → Arity(g, 1, 1)


2. ∀ g Gate(g) ∧ r =Type(g) ∧ (r= AND ∨r= OR ∨r= XOR) → Arity (g, 2, 1).
o All gates are logic circuits:

1. ∀ g Gate(g) → Circuit (g).

5. Encode a description of the problem instance:


Now we encode the problem of circuit C1: first we categorize the circuit and its gate components. This step is easy if the ontology of the problem has already been thought out. This step involves writing simple atomic sentences about instances of concepts, which is known as the ontology.

For the given circuit C1, we can encode the problem instance in atomic sentences as
below:

Since in the circuit there are two XOR, two AND, and one OR gate so atomic sentences
for these gates will be:

1. For XOR gate: Type(x1)= XOR, Type(X2) = XOR


2. For AND gate: Type(A1) = AND, Type(A2)= AND
3. For OR gate: Type (O1) = OR.

And then represent the connections between all the gates.

Note: Ontology defines a particular theory of the nature of existence.

6. Pose queries to the inference procedure and get answers:


In this step, we will find all the possible values of all the terminals for the adder circuit. The first query will be:

What should be the combination of inputs which would make the first output of circuit C1 be 0 and the second output be 1?

1. ∃ i1, i2, i3 Signal (In(1, C1))=i1 ∧ Signal (In(2, C1))=i2 ∧ Signal (In(3, C1))= i3
2. ∧ Signal (Out(1, C1)) =0 ∧ Signal (Out(2, C1))=1

7. Debug the knowledge base:


Now we will debug the knowledge base; this is the last step of the complete process. In this step, we will try to fix any issues in the knowledge base.

In the knowledge base, we may have omitted assertions like 1 ≠ 0.

Inference in First-Order Logic


Inference in First-Order Logic is used to deduce new facts or sentences from existing
sentences. Before understanding the FOL inference rule, let's understand some basic
terminologies used in FOL.

Substitution:

Substitution is a fundamental operation performed on terms and formulas. It occurs in all inference systems in first-order logic. Substitution is complex in the presence of quantifiers in FOL. If we write F[a/x], it refers to substituting the constant "a" in place of the variable "x" in F.

Note: First-order logic is capable of expressing facts about some or all objects in the
universe.

Equality:

First-Order logic does not only use predicate and terms for making atomic sentences
but also uses another way, which is equality in FOL. For this, we can use equality
symbols which specify that the two terms refer to the same object.

Example: Brother (John) = Smith.

As in the above example, the object referred to by Brother(John) is the same as the object referred to by Smith. The equality symbol can also be used with negation to represent that two terms are not the same object.

Example: ¬(x=y) which is equivalent to x ≠y.

FOL inference rules for quantifier:


As propositional logic we also have inference rules in first-order logic, so following are
some basic inference rules in FOL:

o Universal Generalization
o Universal Instantiation
o Existential Instantiation
o Existential introduction

1. Universal Generalization:

o Universal generalization is a valid inference rule which states that if premise P(c) is true
for any arbitrary element c in the universe of discourse, then we can have a conclusion as
∀ x P(x).

o It can be represented as: P(c) ∴ ∀x P(x).


o This rule can be used if we want to show that every element has a similar property.
o In this rule, x must not appear as a free variable.

Example: Let's represent, P(c): "A byte contains 8 bits", so for ∀ x P(x) "All bytes
contain 8 bits.", it will also be true.

2. Universal Instantiation:

o Universal instantiation, also called universal elimination or UI, is a valid inference rule. It can be applied multiple times to add new sentences.
o The new KB is logically equivalent to the previous KB.
o As per UI, we can infer any sentence obtained by substituting a ground term for the
variable.
o The UI rule state that we can infer any sentence P(c) by substituting a ground term c (a
constant within domain x) from ∀ x P(x) for any object in the universe of discourse.

o It can be represented as: ∀x P(x) ∴ P(c).

Example: 1.
If "Every person likes ice-cream" => ∀x P(x), then we can infer that
"John likes ice-cream" => P(c).

Example: 2.

Let's take a famous example,

"All kings who are greedy are Evil." So let our knowledge base contains this detail as in
the form of FOL:

∀x king(x) ∧ greedy (x) → Evil (x),

So from this information, we can infer any of the following statements using Universal
Instantiation:

o King(John) ∧ Greedy (John) → Evil (John),


o King(Richard) ∧ Greedy (Richard) → Evil (Richard),
o King(Father(John)) ∧ Greedy (Father(John)) → Evil (Father(John)),

3. Existential Instantiation:

o Existential instantiation is also called Existential Elimination, which is a valid inference rule in first-order logic.
o It can be applied only once to replace the existential sentence.
o The new KB is not logically equivalent to old KB, but it will be satisfiable if old KB was
satisfiable.
o This rule states that one can infer P(c) from the formula given in the form of ∃x P(x) for a
new constant symbol c.
o The restriction with this rule is that c used in the rule must be a new term for which P(c )
is true.

o It can be represented as: ∃x P(x) ∴ P(c), for a new constant symbol c.

Example:

From the given sentence: ∃x Crown(x) ∧ OnHead(x, John),


So we can infer: Crown(K) ∧ OnHead( K, John), as long as K does not appear in the
knowledge base.

o The above used K is a constant symbol, which is called Skolem constant.


o The Existential instantiation is a special case of Skolemization process.

4. Existential Generalization

o Existential generalization is also known as existential introduction, which is a valid inference rule in first-order logic.
o This rule states that if there is some element c in the universe of discourse which has a
property P, then we can infer that there exists something in the universe which has the
property P.

o It can be represented as: P(c) ∴ ∃x P(x).


o Example: Let's say that,

"Priyanka got good marks in English."

"Therefore, someone got good marks in English."

Generalized Modus Ponens Rule:


For the inference process in FOL, we have a single inference rule which is called Generalized Modus Ponens. It is a lifted version of Modus Ponens.

Generalized Modus Ponens can be summarized as, " P implies Q and P is asserted to be
true, therefore Q must be True."

According to Generalized Modus Ponens, for atomic sentences pi, pi', and q, where there is a substitution θ such that SUBST(θ, pi') = SUBST(θ, pi) for all i, it can be represented as:

p1', p2', ..., pn', (p1 ∧ p2 ∧ ... ∧ pn → q) ∴ SUBST(θ, q)

Example:
We will use this rule for the statement "All greedy kings are evil", so we will find some x such that x is a king and x is greedy, and then we can infer that x is evil.

1. Here let say, p1' is king(John) p1 is king(x)


2. p2' is Greedy(y) p2 is Greedy(x)
3. θ is {x/John, y/John} q is evil(x)
4. SUBST(θ,q).

What is Unification?
o Unification is a process of making two different logical atomic expressions identical by
finding a substitution. Unification depends on the substitution process.
o It takes two literals as input and makes them identical using substitution.
o Let Ψ1 and Ψ2 be two atomic sentences and 𝜎 be a unifier such that, Ψ1𝜎 = Ψ2𝜎, then it
can be expressed as UNIFY(Ψ1, Ψ2).
o Example: Find the MGU for Unify{King(x), King(John)}

Let Ψ1 = King(x), Ψ2 = King(John),

Substitution θ = {John/x} is a unifier for these atoms and applying this substitution,
and both expressions will be identical.

o The UNIFY algorithm is used for unification, which takes two atomic sentences and
returns a unifier for those sentences (If any exist).
o Unification is a key component of all first-order inference algorithms.
o It returns fail if the expressions do not match with each other.
o The substitution produced by unification is called the Most General Unifier (MGU).

E.g. Let's say there are two different expressions, P(x, y), and P(a, f(z)).

In this example, we need to make both above statements identical to each other. For
this, we will perform the substitution.

P(x, y)......... (i)


P(a, f(z))......... (ii)

o Substitute x with a, and y with f(z) in the first expression, and it will be represented
as a/x and f(z)/y.
o With both the substitutions, the first expression will be identical to the second expression
and the substitution set will be: [a/x, f(z)/y].

Conditions for Unification:


Following are some basic conditions for unification:

o The predicate symbols must be the same; atoms or expressions with different predicate symbols can never be unified.
o The number of arguments in both expressions must be identical.
o Unification will fail if the same variable would need to be bound to two different terms, or to a term that contains that variable itself (the occurs check).

Unification Algorithm:
Algorithm: Unify(Ψ1, Ψ2)

Step. 1: If Ψ1 or Ψ2 is a variable or constant, then:


a) If Ψ1 or Ψ2 are identical, then return NIL.
b) Else if Ψ1is a variable,
a. then if Ψ1 occurs in Ψ2, then return FAILURE
b. Else return { (Ψ2/ Ψ1)}.
c) Else if Ψ2 is a variable,
a. If Ψ2 occurs in Ψ1 then return FAILURE,
b. Else return {( Ψ1/ Ψ2)}.
d) Else return FAILURE.
Step.2: If the initial Predicate symbol in Ψ1 and Ψ2 are not same, then
return FAILURE.
Step. 3: IF Ψ1 and Ψ2 have a different number of arguments, then return
FAILURE.
Step. 4: Set Substitution set(SUBST) to NIL.
Step. 5: For i=1 to the number of elements in Ψ1.
a) Call Unify function with the ith element of Ψ1 and ith element of
Ψ2, and put the result into S.
b) If S = failure then returns Failure
c) If S ≠ NIL then do,
a. Apply S to the remainder of both Ψ1 and Ψ2.
b. SUBST = APPEND(S, SUBST).
Step.6: Return SUBST.
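A minimal Python sketch of this algorithm follows. The term representation is my own convention, not the book's: a compound term is a tuple whose first element is the predicate or function symbol, a variable is a string starting with "?", and anything else is a constant.

def is_variable(t):
    return isinstance(t, str) and t.startswith("?")

def occurs(var, term, subst):
    """Occurs check: does var appear inside term under the current substitution?"""
    if term == var:
        return True
    if is_variable(term) and term in subst:
        return occurs(var, subst[term], subst)
    if isinstance(term, tuple):
        return any(occurs(var, arg, subst) for arg in term[1:])
    return False

def unify_var(var, term, subst):
    if var in subst:
        return unify(subst[var], term, subst)
    if is_variable(term) and term in subst:
        return unify(var, subst[term], subst)
    if occurs(var, term, subst):               # e.g. unifying ?x with f(?x) must fail
        return None
    new_subst = dict(subst)
    new_subst[var] = term
    return new_subst

def unify(x, y, subst=None):
    """Return a most general unifier as a dict, or None on failure."""
    if subst is None:
        subst = {}
    if x == y:
        return subst
    if is_variable(x):
        return unify_var(x, y, subst)
    if is_variable(y):
        return unify_var(y, x, subst)
    if isinstance(x, tuple) and isinstance(y, tuple):
        if x[0] != y[0] or len(x) != len(y):   # predicate symbol and arity must match
            return None
        for a, b in zip(x[1:], y[1:]):
            subst = unify(a, b, subst)
            if subst is None:
                return None
        return subst
    return None                                # two different constants

# Corresponds to worked example 6 below: knows(Richard, x) with knows(Richard, John)
print(unify(("knows", "Richard", "?x"), ("knows", "Richard", "John")))
# {'?x': 'John'}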

Implementation of the Algorithm


Step.1: Initialize the substitution set to be empty.

Step.2: Recursively unify atomic sentences:

a. Check for Identical expression match.


b. If one expression is a variable vi, and the other is a term ti which does not contain
variable vi, then:
a. Substitute ti / vi in the existing substitutions
b. Add ti /vi to the substitution setlist.
c. If both the expressions are functions, then function name must be similar, and the
number of arguments must be the same in both the expression.

For each pair of the following atomic sentences find the most general unifier (If exist).

1. Find the MGU of {p(f(a), g(Y)) and p(X, X)}

Sol: S0 => Here, Ψ1 =


p(f(a), g(Y)), and Ψ2 = p(X, X)
SUBST θ= {f(a) / X}
S1 => Ψ1 = p(f(a), g(Y)), and Ψ2 = p(f(a), f(a))
Now g(Y) and f(a) have different function symbols, so unification fails.

Unification is not possible for these expressions.

2. Find the MGU of {p(b, X, f(g(Z))) and p(Z, f(Y), f(Y))}

Here, Ψ1 = p(b, X, f(g(Z))) , and Ψ2 = p(Z, f(Y), f(Y))


S0 => { p(b, X, f(g(Z))); p(Z, f(Y), f(Y))}
SUBST θ={b/Z}

S1 => { p(b, X, f(g(b))); p(b, f(Y), f(Y))}


SUBST θ={f(Y) /X}

S2 => { p(b, f(Y), f(g(b))); p(b, f(Y), f(Y))}


SUBST θ= {g(b) /Y}

S3 => { p(b, f(g(b)), f(g(b))); p(b, f(g(b)), f(g(b)))} Unified successfully.


And Unifier = { b/Z, f(Y) /X , g(b) /Y}.

3. Find the MGU of {p (X, X), and p (Z, f(Z))}


Here, Ψ1 = {p (X, X), and Ψ2 = p (Z, f(Z))
S0 => {p (X, X), p (Z, f(Z))}
SUBST θ= {X/Z}
S1 => {p (Z, Z), p (Z, f(Z))}
SUBST θ= {f(Z) / Z}, Unification Failed.

Hence, unification is not possible for these expressions.

4. Find the MGU of UNIFY(prime (11), prime(y))


Here, Ψ1 = {prime(11) , and Ψ2 = prime(y)}


S0 => {prime(11) , prime(y)}
SUBST θ= {11/y}

S1 => {prime(11) , prime(11)} , Successfully unified.


Unifier: {11/y}.

5. Find the MGU of Q(a, g(x, a), f(y)), Q(a, g(f(b), a), x)}

Here, Ψ1 = Q(a, g(x, a), f(y)), and Ψ2 = Q(a, g(f(b), a), x)

S0 => {Q(a, g(x, a), f(y)); Q(a, g(f(b), a), x)}

SUBST θ= {f(b)/x}

S1 => {Q(a, g(f(b), a), f(y)); Q(a, g(f(b), a), f(b))}

SUBST θ= {b/y}

S2 => {Q(a, g(f(b), a), f(b)); Q(a, g(f(b), a), f(b))}, Successfully Unified.

Unifier: {f(b)/x, b/y}.

6. UNIFY(knows(Richard, x), knows(Richard, John))

Here, Ψ1 = knows(Richard, x), and Ψ2 = knows(Richard, John)

S0 => { knows(Richard, x); knows(Richard, John)}


SUBST θ= {John/x}

S1 => { knows(Richard, John); knows(Richard, John)}, Successfully Unified.

Unifier: {John/x}.

Resolution in FOL
Resolution
Resolution is a theorem-proving technique that proceeds by building refutation proofs,
i.e., proofs by contradiction. It was introduced by the mathematician John Alan Robinson
in 1965.

Resolution is used when several statements are given and we need to prove a conclusion
from those statements. Unification is a key concept in proofs by resolution. Resolution is
a single inference rule which can efficiently operate on sentences in conjunctive normal
form or clausal form.

Clause: A disjunction of literals (atomic sentences or their negations) is called a clause.
A clause containing exactly one literal is known as a unit clause.

Conjunctive Normal Form: A sentence represented as a conjunction of clauses is said
to be in conjunctive normal form or CNF.

Note: To better understand this topic, first review first-order logic (FOL) in AI.

The resolution inference rule:


The resolution rule for first-order logic is simply a lifted version of the propositional rule.
Resolution can resolve two clauses if they contain complementary literals, which are
assumed to be standardized apart so that they share no variables. From the clauses

(l1 V l2 V ... V lk) and (m1 V m2 V ... V mn), with UNIFY(li, ¬mj) = θ,

the rule infers the resolvent

SUBST(θ, l1 V ... V li-1 V li+1 V ... V lk V m1 V ... V mj-1 V mj+1 V ... V mn),

where li and mj are complementary literals.


This rule is also called the binary resolution rule because it only resolves exactly two
literals.

Example:
We can resolve two clauses which are given below:

[Animal(g(x)) V Loves(f(x), x)] and [¬Loves(a, b) V ¬Kills(a, b)]

where the two complementary literals are: Loves(f(x), x) and ¬Loves(a, b)

These literals can be unified with the unifier θ = {f(x)/a, x/b}, and this will generate the
resolvent clause:

[Animal(g(x)) V ¬Kills(f(x), x)].
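
The binary resolution step itself can be sketched on top of the unify and substitute helpers from the unification section. This is an illustrative sketch only: clauses are represented as Python lists of literals, a positive literal is a tuple such as ('Loves', ('f', '?x'), '?x'), and a negative literal is wrapped as ('not', atom).

def atom_of(lit):
    # The atom of a literal, stripping the leading 'not' if present.
    return lit[1] if lit[0] == 'not' else lit

def is_negated(lit):
    # True if the literal is a negation (assumes 'not' is never used as a predicate symbol).
    return lit[0] == 'not'

def resolve(c1, c2):
    # Yield every binary resolvent of clauses c1 and c2 (lists of literals),
    # using the unify/substitute sketch defined in the unification section.
    for li in c1:
        for mj in c2:
            if is_negated(li) != is_negated(mj):          # complementary signs
                theta = unify(atom_of(li), atom_of(mj))
                if theta is not None:                     # the atoms unify
                    rest = [l for l in c1 if l is not li] + [m for m in c2 if m is not mj]
                    yield [substitute(l, theta) for l in rest]

c1 = [('Animal', ('g', '?x')), ('Loves', ('f', '?x'), '?x')]           # Animal(g(x)) V Loves(f(x), x)
c2 = [('not', ('Loves', '?a', '?b')), ('not', ('Kills', '?a', '?b'))]  # ¬Loves(a, b) V ¬Kills(a, b)
print(list(resolve(c1, c2)))
# -> [[('Animal', ('g', '?b')), ('not', ('Kills', ('f', '?b'), '?b'))]]
#    i.e. Animal(g(x)) V ¬Kills(f(x), x), up to renaming of the variable.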

Steps for Resolution:


1. Conversion of facts into first-order logic.
2. Convert FOL statements into CNF
3. Negate the statement to be proved (proof by contradiction)
4. Draw resolution graph (unification).

To better understand all the above steps, we will take an example in which we will apply
resolution.

Example:

a. John likes all kinds of food.
b. Apples and vegetables are food.
c. Anything anyone eats and is not killed by is food.
d. Anil eats peanuts and is still alive.
e. Harry eats everything that Anil eats.
Prove by resolution that:

f. John likes peanuts.

Step-1: Conversion of Facts into FOL

In the first step we will convert all the given statements into first-order logic:

a. ∀x food(x) → likes(John, x)
b. food(Apple) Λ food(vegetables)
c. ∀x ∀y [eats(x, y) Λ ¬ killed(x)] → food(y)
d. eats(Anil, Peanuts) Λ alive(Anil)
e. ∀x eats(Anil, x) → eats(Harry, x)
f. ∀x ¬ killed(x) → alive(x) (added predicate relating killed and alive)
g. ∀x alive(x) → ¬ killed(x) (added predicate relating killed and alive)
h. likes(John, Peanuts) (conclusion to be proved)

Step-2: Conversion of FOL into CNF

In first-order logic resolution, it is required to convert the FOL statements into CNF, as
the CNF form makes resolution proofs easier.

o Eliminate all implications (→) and rewrite

a. ∀x ¬ food(x) V likes(John, x)
b. food(Apple) Λ food(vegetables)
c. ∀x ∀y ¬ [eats(x, y) Λ ¬ killed(x)] V food(y)
d. eats (Anil, Peanuts) Λ alive(Anil)
e. ∀x ¬ eats(Anil, x) V eats(Harry, x)
f. ∀x ¬[¬ killed(x)] V alive(x)
g. ∀x ¬ alive(x) V ¬ killed(x)
h. likes(John, Peanuts).
o Move negation (¬) inwards and rewrite
a. ∀x ¬ food(x) V likes(John, x)
b. food(Apple) Λ food(vegetables)
c. ∀x ∀y ¬ eats(x, y) V killed(x) V food(y)
d. eats(Anil, Peanuts) Λ alive(Anil)
e. ∀x ¬ eats(Anil, x) V eats(Harry, x)
f. ∀x killed(x) V alive(x)
g. ∀x ¬ alive(x) V ¬ killed(x)
h. likes(John, Peanuts).
o Rename variables or standardize variables
a. ∀x ¬ food(x) V likes(John, x)
b. food(Apple) Λ food(vegetables)
c. ∀y ∀z ¬ eats(y, z) V killed(y) V food(z)
d. eats(Anil, Peanuts) Λ alive(Anil)
e. ∀w ¬ eats(Anil, w) V eats(Harry, w)
f. ∀g killed(g) V alive(g)
g. ∀k ¬ alive(k) V ¬ killed(k)
h. likes(John, Peanuts).
o Eliminate existential quantifiers (Skolemization).
In this step, we eliminate the existential quantifier ∃; this process is known
as Skolemization. In this example there is no existential quantifier, so
all the statements remain the same in this step.
o Drop universal quantifiers.

In this step we drop all universal quantifiers, since all the remaining variables are
implicitly universally quantified, so the quantifiers are no longer needed.

a. ¬ food(x) V likes(John, x)
b. food(Apple)
c. food(vegetables)
d. ¬ eats(y, z) V killed(y) V food(z)
e. eats(Anil, Peanuts)
f. alive(Anil)
g. ¬ eats(Anil, w) V eats(Harry, w)
h. killed(g) V alive(g)
i. ¬ alive(k) V ¬ killed(k)
j. likes(John, Peanuts).
Note: Statements "food(Apple) Λ food(vegetables)" and "eats (Anil, Peanuts) Λ
alive(Anil)" can be written in two separate statements.
o Distribute conjunction ∧ over disjunction ∨.

This step will not make any change in this problem.

Step-3: Negate the statement to be proved

In this step, we apply negation to the conclusion statement to be proved, which will be
written as ¬ likes(John, Peanuts).

Step-4: Draw Resolution graph:

Now, in this step, we will solve the problem using a resolution tree with substitution.
For the above problem, the resolution steps are described below.

Hence the negation of the conclusion has been proved as a complete contradiction with
the given set of statements.

Explanation of Resolution graph:


o In the first step of resolution graph, ¬likes(John, Peanuts) , and likes(John, x) get
resolved(canceled) by substitution of {Peanuts/x}, and we are left with ¬ food(Peanuts)
o In the second step of the resolution graph, ¬ food(Peanuts) , and food(z) get resolved
(canceled) by substitution of { Peanuts/z}, and we are left with ¬ eats(y, Peanuts) V
killed(y) .
o In the third step of the resolution graph, ¬ eats(y, Peanuts) and eats(Anil,
Peanuts) get resolved by substitution {Anil/y}, and we are left with killed(Anil).
o In the fourth step of the resolution graph, killed(Anil) and ¬ killed(k) get resolved by
substitution {Anil/k}, and we are left with ¬ alive(Anil).
o In the last step of the resolution graph ¬ alive(Anil) and alive(Anil) get resolved.
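
The five steps above can be replayed mechanically with the resolve and unify sketches from the earlier sections (same representation: '?'-prefixed variables and 'not'-wrapped negative literals). The clause variables below name the relevant CNF clauses from Step-2.

neg_goal = [('not', ('likes', 'John', 'Peanuts'))]                            # negated conclusion
clause_a = [('not', ('food', '?x')), ('likes', 'John', '?x')]                 # ¬food(x) V likes(John, x)
clause_d = [('not', ('eats', '?y', '?z')), ('killed', '?y'), ('food', '?z')]  # ¬eats(y, z) V killed(y) V food(z)
clause_e = [('eats', 'Anil', 'Peanuts')]                                      # eats(Anil, Peanuts)
clause_i = [('not', ('alive', '?k')), ('not', ('killed', '?k'))]              # ¬alive(k) V ¬killed(k)
clause_f = [('alive', 'Anil')]                                                # alive(Anil)

step = neg_goal
for clause in (clause_a, clause_d, clause_e, clause_i, clause_f):
    step = next(resolve(step, clause))   # each step happens to have a single resolvent
    print(step)
# [('not', ('food', 'Peanuts'))]
# [('not', ('eats', '?y', 'Peanuts')), ('killed', '?y')]
# [('killed', 'Anil')]
# [('not', ('alive', 'Anil'))]
# []   <- the empty clause: a contradiction, so likes(John, Peanuts) is proved.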

Forward Chaining and Backward Chaining in AI
Inference engine:
The inference engine is the component of an intelligent system in artificial intelligence
which applies logical rules to the knowledge base to infer new information from known
facts. The first inference engines were part of expert systems. An inference engine
commonly proceeds in two modes, which are:

o Forward chaining
o Backward chaining

Horn Clause and Definite clause:

Horn clauses and definite clauses are forms of sentences which enable the knowledge
base to use a more restricted and efficient inference algorithm. Logical inference
algorithms use forward and backward chaining approaches, which require the KB to be in
the form of first-order definite clauses.

Definite clause: A clause which is a disjunction of literals with exactly one positive
literal is known as a definite clause or strict horn clause.

Horn clause: A clause which is a disjunction of literals with at most one positive
literal is known as a horn clause. Hence all definite clauses are horn clauses.
Example: (¬ p V ¬ q V k). It has only one positive literal, k.

It is equivalent to p ∧ q → k.
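
As a quick added check (an illustration, not part of the source text), the equivalence between the horn clause (¬ p V ¬ q V k) and the implication p ∧ q → k can be verified over all eight truth assignments:

from itertools import product

# Check that (¬p V ¬q V k) and (p ∧ q → k) agree for every assignment of truth values.
for p, q, k in product([False, True], repeat=3):
    clause = (not p) or (not q) or k          # ¬p V ¬q V k
    implication = (not (p and q)) or k        # p ∧ q → k, using A → B ≡ ¬A V B
    assert clause == implication
print("The clause and the implication are logically equivalent.")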

A. Forward Chaining
Forward chaining is also known as forward deduction or the forward reasoning method
when using an inference engine. Forward chaining is a form of reasoning which starts
with atomic sentences in the knowledge base and applies inference rules (Modus
Ponens) in the forward direction to extract more data until a goal is reached.

The forward-chaining algorithm starts from known facts, triggers all rules whose
premises are satisfied, and adds their conclusions to the known facts. This process
repeats until the problem is solved.

Properties of Forward-Chaining:

o It is a bottom-up approach, as it moves from the bottom (facts) to the top (goal).
o It is a process of making a conclusion based on known facts or data, starting from the
initial state and reaching the goal state.
o The forward-chaining approach is also called data-driven, as we reach the goal using the
available data.
o The forward-chaining approach is commonly used in expert systems, such as CLIPS,
business rule systems, and production rule systems.

Consider the following famous example which we will use in both approaches:

Example:
"As per the law, it is a crime for an American to sell weapons to hostile nations.
Country A, an enemy of America, has some missiles, and all the missiles were sold
to it by Robert, who is an American citizen."

Prove that "Robert is criminal."

To solve the above problem, first, we will convert all the above facts into first-order
definite clauses, and then we will use a forward-chaining algorithm to reach the goal.

Facts Conversion into FOL:


o It is a crime for an American to sell weapons to hostile nations. (Let's say p, q, and r are
variables)
American(p) ∧ Weapon(q) ∧ Sells(p, q, r) ∧ Hostile(r) → Criminal(p) ...(1)
o Country A has some missiles: ∃p Owns(A, p) ∧ Missile(p). It can be written as two
definite clauses by using Existential Instantiation, introducing a new constant T1.

Owns(A, T1) ......(2)

Missile(T1) .......(3)

o All of the missiles were sold to country A by Robert.

∀p Missile(p) ∧ Owns(A, p) → Sells(Robert, p, A) ......(4)

o Missiles are weapons.

Missile(p) → Weapon(p) .......(5)

o Enemy of America is known as hostile.

Enemy(p, America) →Hostile(p) ........(6)

o Country A is an enemy of America.

Enemy (A, America) .........(7)

o Robert is American

American(Robert). ..........(8)

Forward chaining proof:


Step-1:

In the first step we will start with the known facts and will choose the sentences which
do not have implications, such as: American(Robert), Enemy(A, America), Owns(A,
T1), and Missile(T1). These facts form the first level of the forward-chaining proof.
Step-2:

At the second step, we will see those facts which infer from available facts and with
satisfied premises.

The premises of Rule-(1) are not satisfied yet, so it will not be used in the first iteration.

Facts (2) and (3) are already added to the set of known facts.

Rule-(4) is satisfied with the substitution {T1/p}, so Sells(Robert, T1, A) is added, which
is inferred from the conjunction of facts (2) and (3).

Rule-(6) is satisfied with the substitution {A/p}, so Hostile(A) is added, which is inferred
from fact (7).

Step-3:

At step-3, as we can check, Rule-(1) is satisfied with the substitution {Robert/p, T1/q,
A/r}, so we can add Criminal(Robert), which is inferred from all the available facts. And
hence we reach our goal statement.
Hence it is proved that Robert is a criminal using the forward-chaining approach.
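
The forward-chaining proof above can also be reproduced with a small Python sketch. This is an illustrative implementation with names chosen for the sketch (match, apply_subst, satisfy, forward_chain): the rules encode definite clauses (1), (4), (5), and (6), the facts encode (2), (3), (7), and (8), and variables are '?'-prefixed strings as in the earlier unification sketch.

def match(pattern, fact, theta):
    # One-way pattern matching: extend theta so that pattern equals the ground fact.
    if isinstance(pattern, str) and pattern.startswith('?'):        # a variable
        if pattern in theta:
            return theta if theta[pattern] == fact else None
        return {**theta, pattern: fact}
    if isinstance(pattern, tuple) and isinstance(fact, tuple) and len(pattern) == len(fact):
        for p, f in zip(pattern, fact):
            theta = match(p, f, theta)
            if theta is None:
                return None
        return theta
    return theta if pattern == fact else None

def apply_subst(term, theta):
    # Instantiate a term with the bindings collected so far.
    if isinstance(term, tuple):
        return tuple(apply_subst(t, theta) for t in term)
    return theta.get(term, term)

def satisfy(premises, facts, theta):
    # Generate every substitution that makes all premises match known facts.
    if not premises:
        yield theta
        return
    for fact in facts:
        t = match(premises[0], fact, theta)
        if t is not None:
            yield from satisfy(premises[1:], facts, t)

def forward_chain(facts, rules):
    # Repeatedly fire every rule whose premises are satisfied until no new fact appears.
    facts = set(facts)
    while True:
        new = {apply_subst(conclusion, theta)
               for premises, conclusion in rules
               for theta in satisfy(premises, facts, {})} - facts
        if not new:
            return facts
        facts |= new

rules = [
    ((('American', '?p'), ('Weapon', '?q'), ('Sells', '?p', '?q', '?r'), ('Hostile', '?r')),
     ('Criminal', '?p')),                                                        # rule (1)
    ((('Missile', '?p'), ('Owns', 'A', '?p')), ('Sells', 'Robert', '?p', 'A')),  # rule (4)
    ((('Missile', '?p'),), ('Weapon', '?p')),                                    # rule (5)
    ((('Enemy', '?p', 'America'),), ('Hostile', '?p')),                          # rule (6)
]
facts = {('Owns', 'A', 'T1'), ('Missile', 'T1'),                                 # facts (2), (3)
         ('Enemy', 'A', 'America'), ('American', 'Robert')}                      # facts (7), (8)

print(('Criminal', 'Robert') in forward_chain(facts, rules))                     # True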

B. Backward Chaining:
Backward-chaining is also known as a backward deduction or backward reasoning
method when using an inference engine. A backward chaining algorithm is a form of
reasoning, which starts with the goal and works backward, chaining through rules to
find known facts that support the goal.

Properties of backward chaining:

o It is known as a top-down approach.


o Backward chaining is based on the Modus Ponens inference rule.
o In backward chaining, the goal is broken into sub-goals to prove the facts true.
o It is called a goal-driven approach, as a list of goals decides which rules are selected and
used.
o The backward-chaining algorithm is used in game theory, automated theorem-proving tools,
inference engines, proof assistants, and various AI applications.
o The backward-chaining method mostly uses a depth-first search strategy for proofs.

Example:
In backward chaining, we will use the same example as above and rewrite all the rules.

o American(p) ∧ Weapon(q) ∧ Sells(p, q, r) ∧ Hostile(r) → Criminal(p) ...(1)
o Owns(A, T1) ........(2)
o Missile(T1) .......(3)
o ∀p Missile(p) ∧ Owns(A, p) → Sells(Robert, p, A) ......(4)
o Missile(p) → Weapon(p) .......(5)
o Enemy(p, America) → Hostile(p) ........(6)
o Enemy(A, America) .........(7)
o American(Robert). ..........(8)

Backward-Chaining proof:
In Backward chaining, we will start with our goal predicate, which is Criminal(Robert),
and then infer further rules.

Step-1:

At the first step, we will take the goal fact, and from the goal fact we will infer other
facts, and at last we will prove those facts true. So our goal fact is "Robert is criminal,"
and the corresponding goal predicate is Criminal(Robert).

Step-2:

At the second step, we will infer other facts from the goal fact which satisfy the rules. As
we can see in Rule-(1), the goal predicate Criminal(Robert) is present with the substitution
{Robert/p}. So we will add all the conjunctive facts below the first level and replace p
with Robert.

Here we can see American (Robert) is a fact, so it is proved here.


Step-3: At step-3, we will extract the further fact Missile(q), which is inferred from
Weapon(q), as it satisfies Rule-(5). Weapon(q) is also true with the substitution of the
constant T1 for q.

Step-4:

At step-4, we can infer the facts Missile(T1) and Owns(A, T1) from Sells(Robert, T1, r),
which satisfies Rule-(4) with the substitution of A in place of r. So these two statements
are proved here.
Step-5:

At step-5, we can infer the fact Enemy(A, America) from Hostile(A), which satisfies
Rule-(6). And hence all the statements are proved true using backward chaining.
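
A goal-driven counterpart can be sketched as well (again only an illustration). It reuses the unify and substitute helpers from the unification sketch and the rules and facts defined in the forward-chaining sketch, working backward from the goal and standardizing rule variables apart before each use.

import itertools

_fresh = itertools.count()

def standardize(rule):
    # Rename the variables of a rule so that it shares none with the current goal.
    premises, conclusion = rule
    suffix = '_' + str(next(_fresh))
    def rename(t):
        if isinstance(t, tuple):
            return tuple(rename(x) for x in t)
        return t + suffix if isinstance(t, str) and t.startswith('?') else t
    return tuple(rename(p) for p in premises), rename(conclusion)

def backward_chain(goal, theta=None):
    # Yield every substitution under which the goal can be proved from facts and rules.
    if theta is None:
        theta = {}
    for fact in facts:                        # case 1: the goal matches a known fact
        t = unify(goal, fact, theta)
        if t is not None:
            yield t
    for rule in rules:                        # case 2: the goal matches a rule conclusion
        premises, conclusion = standardize(rule)
        t = unify(goal, conclusion, theta)
        if t is not None:
            yield from prove_all(premises, t)  # then prove every premise as a sub-goal

def prove_all(goals, theta):
    if not goals:
        yield theta
        return
    for t in backward_chain(goals[0], theta):
        yield from prove_all(goals[1:], t)

print(any(True for _ in backward_chain(('Criminal', 'Robert'))))   # True: Robert is a criminal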
Difference between backward chaining and
forward chaining
Following are the differences between forward chaining and backward chaining:

o Forward chaining, as the name suggests, starts from the known facts and moves forward by
applying inference rules to extract more data, and it continues until it reaches the
goal, whereas backward chaining starts from the goal and moves backward by using inference
rules to determine the facts that satisfy the goal.
o Forward chaining is called a data-driven inference technique, whereas backward
chaining is called a goal-driven inference technique.
o Forward chaining is known as a bottom-up approach, whereas backward chaining is
known as a top-down approach.
o Forward chaining uses a breadth-first search strategy, whereas backward chaining
uses a depth-first search strategy.
o Forward and backward chaining both apply the Modus Ponens inference rule.
o Forward chaining can be used for tasks such as planning, design process monitoring,
diagnosis, and classification, whereas backward chaining can be used for classification
and diagnosis tasks.
o Forward chaining can be like an exhaustive search, whereas backward chaining tries to
avoid unnecessary paths of reasoning.
o In forward chaining there can be various ASK questions to the knowledge base,
whereas in backward chaining there can be fewer ASK questions.
o Forward chaining is slow as it checks all the rules, whereas backward chaining is fast
as it checks only the few required rules.

S.No. | Forward Chaining | Backward Chaining

1. | Forward chaining starts from known facts and applies inference rules to extract more data until it reaches the goal. | Backward chaining starts from the goal and works backward through inference rules to find the required facts that support the goal.

2. | It is a bottom-up approach. | It is a top-down approach.

3. | Forward chaining is known as a data-driven inference technique, as we reach the goal using the available data. | Backward chaining is known as a goal-driven technique, as we start from the goal and divide it into sub-goals to extract the facts.

4. | Forward chaining reasoning applies a breadth-first search strategy. | Backward chaining reasoning applies a depth-first search strategy.

5. | Forward chaining tests all the available rules. | Backward chaining tests only the few required rules.

6. | Forward chaining is suitable for planning, monitoring, control, and interpretation applications. | Backward chaining is suitable for diagnostic, prescription, and debugging applications.

7. | Forward chaining can generate an infinite number of possible conclusions. | Backward chaining generates a finite number of possible conclusions.

8. | It operates in the forward direction. | It operates in the backward direction.

9. | Forward chaining is aimed at any conclusion. | Backward chaining is aimed only at the required data.
