AI Complete Notes
Artificial Intelligence is one of the booming technologies of computer science, ready to create a new revolution in the world by making intelligent machines. Artificial Intelligence is now all around us. It is currently applied in a variety of subfields, ranging from general to specific, such as self-driving cars, playing chess, proving theorems, playing music, painting, etc.
AI is one of the fascinating and universal fields of computer science, and it has great scope in the future. AI aims to make a machine work like a human.
Artificial Intelligence is composed of two words, Artificial and Intelligence, where Artificial means "man-made" and Intelligence means "thinking power"; hence AI means "a man-made thinking power." So, we can define AI as:
"A branch of computer science by which we can create intelligent machines which can behave like a human, think like humans, and be able to make decisions."
Artificial Intelligence exists when a machine has human-like skills such as learning, reasoning, and solving problems.
With Artificial Intelligence you do not need to preprogram a machine to do some work; instead, you can create a machine with programmed algorithms which can work with its own intelligence, and that is the awesomeness of AI.
It is believed that AI is not a new idea: according to Greek myth, there were mechanical men in early days which could work and behave like humans.
4. Building a machine which can perform tasks that require human intelligence.
5. Creating some system which can exhibit intelligent behavior, learn new things by itself, demonstrate, explain, and advise its user.
o High accuracy with fewer errors: AI machines or systems make fewer errors and achieve high accuracy, as they take decisions based on prior experience or information.
o High speed: AI systems can be very fast in decision-making; because of this, AI systems can beat a chess champion in the game of chess.
o High reliability: AI machines are highly reliable and can perform the same action many times with high accuracy.
o Useful for risky areas: AI machines can be helpful in situations such as defusing a bomb or exploring the ocean floor, where employing a human can be risky.
o Digital assistant: AI can be very useful as a digital assistant to users; for example, AI technology is currently used by various e-commerce websites to show products as per customer requirements.
o Useful as a public utility: AI can be very useful for public utilities, such as a self-driving car which can make our journey safer and hassle-free, facial recognition for security purposes, natural language processing to communicate with humans in human language, etc.
o High cost: The hardware and software requirements of AI are very costly, as AI systems require lots of maintenance to meet current world requirements.
o Can't think out of the box: Even though we are making smarter machines with AI, they still cannot work out of the box; a robot will only do the work for which it is trained or programmed.
o No feelings and emotions: An AI machine can be an outstanding performer, but it does not have feelings, so it cannot form any kind of emotional attachment with humans, and may sometimes be harmful to users if proper care is not taken.
o Increased dependency on machines: With the advance of technology, people are becoming more dependent on devices, and hence losing some of their mental capabilities.
o No original creativity: Humans are highly creative and can imagine new ideas, but AI machines cannot match this power of human intelligence and cannot be creative and imaginative.
Top 4 Techniques of Artificial Intelligence
Artificial Intelligence can be divided into different categories based on the machine's capacity to use past experiences to predict future decisions, its memory, and its self-awareness.
1. Machine Learning
Machine Learning gives machines the ability to learn: they are not explicitly programmed to perform certain tasks; rather, they learn and improve from experience automatically. Deep Learning is a subset of machine learning based on artificial neural networks for predictive analysis. There are various kinds of machine learning algorithms. In Unsupervised Learning, the algorithm acts on unclassified information without any guidance. In Supervised Learning, the algorithm deduces a function from the training data, which consists of a set of input objects and the desired outputs. Reinforcement Learning is used by machines to take suitable actions to increase the reward, in order to find the best possibility which should be taken into account.
2. Natural Language Processing
Natural Language Processing is the interaction between computers and human language, where the computers use NLP to obtain meaning from human languages. In NLP, the audio of human speech is captured by the machine, then audio-to-text conversion occurs, and the text is processed; where a spoken response is needed, the data is converted back into audio, and the machine uses the audio to respond to humans. Applications of Natural Language Processing can be found in IVR (Interactive Voice Response) applications used in call centres, language translation applications, and word processors such as Microsoft Word, which check the accuracy of grammar in text. However, the nature of human languages makes Natural Language Processing difficult, because the rules involved in passing information using natural language are not easy for computers to understand. So NLP uses algorithms to recognize and abstract the rules of natural languages, through which unstructured data from human language can be converted to a form the computer understands.
3. Automation and Robotics
The purpose of automation is to get monotonous and repetitive tasks done by machines, which also improves productivity and yields cost-effective and more efficient results. Many organizations use machine learning, neural networks, and graphs in automation. Such automation can prevent fraud during financial transactions and can perform high-volume repetitive tasks while adapting to changing circumstances.
4. Machine Vision
Machines can capture visual information and then analyze it. Cameras are used to capture the visual information, analogue-to-digital conversion is used to convert the image to digital data, and digital signal processing is employed to process the data. The resulting data is then fed to a computer. In machine vision, two vital aspects are sensitivity, which is the ability of the machine to perceive weak impulses, and resolution, the range to which the machine can distinguish objects. Uses of machine vision can be found in signature identification, pattern recognition, medical image analysis, etc.
• Initial State: The state from which the AI agent starts towards the specified goal. It initializes the problem by defining the problem domain.
• Action: This stage of problem formulation defines, as a function, all the possible actions available from a given state.
• Transition: This stage of problem formulation integrates the action chosen at the previous stage and returns the state that results from performing it, forwarding that state to the next stage.
• Goal test: This stage determines whether the specified goal has been achieved by the integrated transition model; when the goal is achieved, the actions stop and the process moves on to determining the cost of achieving the goal.
• Path costing: This component of problem solving assigns a numerical cost to achieving the goal. It accounts for all hardware, software, and human working costs. A minimal sketch of these five components follows below.
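The five components above can be made concrete with a small problem class. Below is a minimal Python sketch; the toy problem (counting up to a target number) and every name in it are illustrative assumptions, not part of the formal material above:

# A sketch of the five problem-formulation components as a Python class.
class CountingProblem:
    def __init__(self, initial, target):
        self.initial = initial                 # initial state
        self.target = target

    def actions(self, state):                  # actions available in a state
        return ['+1', '+2']

    def result(self, state, action):           # transition model
        return state + int(action)

    def goal_test(self, state):                # goal test
        return state == self.target

    def path_cost(self, cost, action):         # path cost: each step costs 1
        return cost + 1

problem = CountingProblem(initial=0, target=5)
state, cost = problem.initial, 0
while not problem.goal_test(state):
    state = problem.result(state, '+1')        # always apply '+1' for simplicity
    cost = problem.path_cost(cost, '+1')
print(state, cost)                             # prints: 5 5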
Problem-solving agents:
In Artificial Intelligence, Search techniques are universal problem-solving methods.
Rational agents or problem-solving agents in AI mostly use these search strategies or algorithms to solve a specific problem and provide the best result. Problem-solving agents are goal-based agents and use atomic representation. In this topic, we will learn various problem-solving search algorithms.
Search Algorithm Terminologies:
o Search: Searching is a step-by-step procedure to solve a search problem in a given search space. A search problem can have three main factors:
a. Search Space: Search space represents the set of possible solutions which a system may have.
b. Start State: The state from where the agent begins the search.
c. Goal test: A function which observes the current state and returns whether the goal state is achieved or not.
o Search tree: A tree representation of a search problem is called a search tree. The root of the search tree is the root node, which corresponds to the initial state.
o Actions: It gives the description of all the available actions to the agent.
o Transition model: A description of what each action does, represented as a transition model.
o Solution: An action sequence which leads from the start node to the goal node.
o Optimal Solution: A solution which has the lowest cost among all solutions.
Following are the four essential properties of search algorithms, used to compare their efficiency:
Completeness: A search algorithm is said to be complete if it is guaranteed to return a solution whenever at least one solution exists.
Optimality: If the solution found by an algorithm is guaranteed to be the best (lowest path cost) among all solutions, it is called an optimal solution.
Time Complexity: Time complexity is a measure of the time an algorithm takes to complete its task.
Space Complexity: The maximum storage space required at any point during the search, as a function of the complexity of the problem.
Types of search algorithms
Based on the search problems, we can classify search algorithms into uninformed (blind) search and informed (heuristic) search algorithms.
Uninformed/Blind Search:
Uninformed search does not use any domain knowledge, such as closeness or the location of the goal. It operates in a brute-force way, as it only includes information about how to traverse the tree and how to identify leaf and goal nodes. Uninformed search traverses the search tree without any information about the search space (such as the initial state, operators, and test for the goal), so it is also called blind search. It examines each node of the tree until it reaches the goal node.
• Breadth-first search
• Depth-first search
• Depth-limited search
• Uniform cost search
• Iterative deepening depth-first search
• Bidirectional search
Informed Search
Informed search algorithms use domain knowledge. In an informed search, problem
information is available which can guide the search. Informed search strategies can find a
solution more efficiently than an uninformed search strategy. Informed search is also called
a Heuristic search.
A heuristic is a technique which is not always guaranteed to find the best solution, but is guaranteed to find a good solution in reasonable time.
Informed search can solve much more complex problems which could not be solved in another way.
1. Greedy Search
2. A* Search
1. Breadth-first Search
2. Depth-first Search
3. Depth-limited Search
4. Uniform cost Search
5. Iterative deepening depth-first Search
6. Bidirectional Search
1. Breadth-first Search:
o Breadth-first search is the most common search strategy for traversing a tree or graph. This algorithm starts searching from the root node of the tree and expands all successor nodes at the current level before moving to nodes of the next level.
Advantages:
o BFS will provide a solution if any solution exists.
o If there is more than one solution for a given problem, then BFS will provide the minimal solution, i.e. the one requiring the least number of steps.
Disadvantages:
o It requires lots of memory, since each level of the tree must be saved in memory to expand the next level.
o BFS needs lots of time if the solution is far away from the root node.
Example:
In the below tree structure, we have shown the traversing of the tree using the BFS algorithm from root node S to goal node K. BFS traverses in layers, so it will follow the path shown by the dotted arrow, and the traversed path will be:
S ---> A ---> B ---> C ---> D ---> G ---> H ---> E ---> F ---> I ---> K
Time Complexity: The time complexity of BFS can be obtained from the number of nodes traversed until the shallowest node: T(b) = 1 + b^2 + b^3 + ... + b^d = O(b^d), where d is the depth of the shallowest solution and b is the branching factor (the number of successors at every state).
Space Complexity: The space complexity of BFS is given by the memory size of the frontier, which is O(b^d).
Completeness: BFS is complete, which means if the shallowest goal node is at some finite
depth, then BFS will find a solution.
Optimality: BFS is optimal if path cost is a non-decreasing function of the depth of the
node.
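To make the layer-by-layer behaviour concrete, here is a minimal BFS sketch in Python. The graph, node names, and goal below are illustrative assumptions and do not reproduce the figure above:

from collections import deque

graph = {
    'S': ['A', 'B'],
    'A': ['C', 'D'],
    'B': ['E', 'F'],
    'C': [], 'D': [], 'E': [], 'F': [],
}

def bfs(start, goal):
    frontier = deque([[start]])          # FIFO queue of paths
    visited = {start}
    while frontier:
        path = frontier.popleft()        # shallowest path is expanded first
        node = path[-1]
        if node == goal:
            return path
        for succ in graph[node]:
            if succ not in visited:
                visited.add(succ)
                frontier.append(path + [succ])
    return None

print(bfs('S', 'F'))                     # prints: ['S', 'B', 'F']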
2. Depth-first Search
o Depth-first search is a recursive algorithm for traversing a tree or graph data structure.
o It is called depth-first search because it starts from the root node and follows each path to its greatest depth node before moving to the next path.
o DFS uses a stack data structure for its implementation.
Advantage:
o DFS requires much less memory, as it only needs to store a stack of the nodes on the path from the root node to the current node.
o It takes less time to reach the goal node than the BFS algorithm (if it traverses the right path).
Disadvantage:
o There is the possibility that many states keep re-occurring, and there is no guarantee of finding a solution.
o The DFS algorithm goes for deep-down searching, and it may sometimes go into an infinite loop.
Example:
In the below search tree, we have shown the flow of depth-first search, and it will follow this order:
It will start searching from root node S and traverse A, then B, then D and E; after traversing E, it will backtrack the tree, as E has no other successor and the goal node has still not been found. After backtracking, it will traverse node C and then G, and here it will terminate as it has found the goal node.
Completeness: DFS search algorithm is complete within finite state space as it will expand
every node within a limited search tree.
Time Complexity: The time complexity of DFS is equivalent to the number of nodes traversed by the algorithm. It is given by:
T(n) = 1 + n^2 + n^3 + ... + n^m = O(n^m)
where m = the maximum depth of any node, which can be much larger than d (the depth of the shallowest solution).
Space Complexity: The DFS algorithm needs to store only a single path from the root node, hence the space complexity of DFS is equivalent to the size of the fringe set, which is O(bm).
Optimal: The DFS algorithm is non-optimal, as it may generate a large number of steps or a high cost to reach the goal node.
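A minimal recursive DFS sketch in Python follows; the graph mirrors the spirit of the example above (explore the A-branch fully, backtrack, then find G under C), but its exact structure is an illustrative assumption:

graph = {
    'S': ['A', 'C'],
    'A': ['B'], 'B': ['D', 'E'], 'D': [], 'E': [],
    'C': ['G'], 'G': [],
}

def dfs(node, goal, visited=None):
    if visited is None:
        visited = set()
    visited.add(node)
    if node == goal:
        return [node]
    for succ in graph[node]:
        if succ not in visited:
            rest = dfs(succ, goal, visited)   # follow one path to its greatest depth
            if rest:
                return [node] + rest
    return None                               # dead end: backtrack

print(dfs('S', 'G'))   # prints: ['S', 'C', 'G'] after backtracking from the A-branch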
3. Depth-limited Search:
A depth-limited search algorithm is similar to depth-first search with a predetermined depth limit ℓ; the node at the depth limit is treated as if it has no successors. Depth-limited search can terminate with two conditions of failure:
o Standard failure value: It indicates that the problem does not have any solution.
o Cutoff failure value: It indicates that there is no solution for the problem within the given depth limit.
Advantages:
o Depth-limited search is memory efficient.
Disadvantages:
o Depth-limited search also has the disadvantage of incompleteness.
o It may not be optimal if the problem has more than one solution.
Completeness: The DLS algorithm is complete if the solution is above the depth limit.
Optimal: Depth-limited search can be viewed as a special case of DFS, and it is not optimal even if ℓ > d.
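The two failure values can be seen directly in a small sketch; the graph and limits below are illustrative assumptions:

graph = {'S': ['A', 'B'], 'A': ['C'], 'B': [], 'C': ['G'], 'G': []}

def dls(node, goal, limit):
    if node == goal:
        return [node]
    if limit == 0:
        return 'cutoff'                      # cutoff failure value
    cutoff_occurred = False
    for succ in graph[node]:
        result = dls(succ, goal, limit - 1)
        if result == 'cutoff':
            cutoff_occurred = True
        elif result is not None:
            return [node] + result
    return 'cutoff' if cutoff_occurred else None   # None = standard failure value

print(dls('S', 'G', 2))   # prints: cutoff   (G lies at depth 3)
print(dls('S', 'G', 3))   # prints: ['S', 'A', 'C', 'G']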
4. Uniform-cost Search:
Uniform-cost search is used for traversing a weighted tree or graph; it expands the node with the lowest cumulative path cost g(n) and is implemented using a priority queue.
Advantages:
o Uniform-cost search is optimal, because at every state the path with the least cost is chosen.
Disadvantages:
o It does not care about the number of steps involved in the search; it is only concerned with path cost, due to which this algorithm may get stuck in an infinite loop.
Completeness:
Uniform-cost search is complete: if there is a solution, UCS will find it.
Time Complexity:
Let C* be the cost of the optimal solution, and ε the cost of each step toward the goal node. Then the number of steps is C*/ε + 1; we add +1 because we start from state 0 and end at C*/ε.
Hence, the worst-case time complexity of uniform-cost search is O(b^(1 + [C*/ε])).
Space Complexity:
By the same logic, the worst-case space complexity of uniform-cost search is O(b^(1 + [C*/ε])).
Optimal:
Uniform-cost search is always optimal as it only selects a path with the lowest path cost.
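A minimal uniform-cost search sketch using a priority queue follows; the weighted graph is an illustrative assumption:

import heapq

graph = {
    'S': [('A', 1), ('B', 4)],
    'A': [('C', 2), ('G', 12)],
    'B': [('G', 5)],
    'C': [('G', 3)],
    'G': [],
}

def ucs(start, goal):
    frontier = [(0, start, [start])]         # priority queue ordered by path cost g(n)
    explored = set()
    while frontier:
        cost, node, path = heapq.heappop(frontier)
        if node == goal:
            return cost, path                # lowest-cost path is reached first
        if node in explored:
            continue
        explored.add(node)
        for succ, step in graph[node]:
            if succ not in explored:
                heapq.heappush(frontier, (cost + step, succ, path + [succ]))
    return None

print(ucs('S', 'G'))   # prints: (6, ['S', 'A', 'C', 'G'])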
5. Iterative Deepening Depth-first Search:
This algorithm performs depth-first search up to a certain "depth limit", and keeps increasing the depth limit after each iteration until the goal node is found.
This search algorithm combines the benefits of breadth-first search's fast search and depth-first search's memory efficiency.
Iterative deepening is a useful uninformed search when the search space is large and the depth of the goal node is unknown.
Advantages:
o It combines the benefits of the BFS and DFS algorithms in terms of fast search and memory efficiency.
Disadvantages:
o The main drawback of IDDFS is that it repeats all the work of the previous phase.
Example:
The following tree structure shows the iterative deepening depth-first search. The IDDFS algorithm performs iterations until it finds the goal node. The iterations performed by the algorithm are:
1st Iteration: A
2nd Iteration: A, B, C
3rd Iteration: A, B, D, E, C, F, G
4th Iteration: A, B, D, H, I, E, C, F, K, G
In the fourth iteration, the algorithm will find the goal node.
Completeness:
This algorithm is complete if the branching factor is finite.
Time Complexity:
If b is the branching factor and d the depth of the goal, then the worst-case time complexity is O(b^d).
Space Complexity:
The space complexity of IDDFS is O(bd).
Optimal:
IDDFS algorithm is optimal if path cost is a non- decreasing function of the depth of the
node.
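The following sketch builds IDDFS from a depth-limited search; the tree matches the four-iteration example above, with goal node K at depth 3:

graph = {
    'A': ['B', 'C'], 'B': ['D', 'E'], 'C': ['F', 'G'],
    'D': ['H', 'I'], 'E': [], 'F': ['K'], 'G': [], 'H': [], 'I': [], 'K': [],
}

def dls(node, goal, limit):
    if node == goal:
        return [node]
    if limit == 0:
        return None
    for succ in graph[node]:
        result = dls(succ, goal, limit - 1)
        if result:
            return [node] + result
    return None

def iddfs(start, goal, max_depth=10):
    for limit in range(max_depth + 1):       # deepen the limit one level per iteration
        result = dls(start, goal, limit)
        if result:
            return result
    return None

print(iddfs('A', 'K'))   # prints: ['A', 'C', 'F', 'K']  (found in the 4th iteration)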
6. Bidirectional Search:
Bidirectional search runs two simultaneous searches, one forward from the initial state and one backward from the goal, hoping that the two searches meet; it can use search techniques such as BFS, DFS, DLS, etc.
Advantages:
o Bidirectional search is fast.
o Bidirectional search requires less memory.
Disadvantages:
o Implementation of the bidirectional search tree is difficult.
o In bidirectional search, one should know the goal state in advance.
Heuristic function: A heuristic is a function used in informed search to find the most promising path. It takes the current state of the agent as input and produces an estimate of how close the agent is to the goal. The heuristic method might not always give the best solution, but it is guaranteed to find a good solution in reasonable time. The heuristic function estimates how close a state is to the goal. It is represented by h(n), and it estimates the cost of an optimal path between the pair of states. The value of the heuristic function is always positive.
Admissibility of the heuristic function is given as: h(n) ≤ h*(n). Here h(n) is the heuristic cost and h*(n) is the actual optimal cost; hence the heuristic cost should be less than or equal to the actual cost.
On each iteration, the node n with the lowest heuristic value is expanded, all its successors are generated, and n is placed in the closed list. The algorithm continues until a goal state is found.
In informed search we will discuss the two main algorithms, which are given below:
1. Greedy Best-first Search Algorithm
2. A* Search Algorithm
1. Greedy Best-first Search:
Greedy best-first search always expands the node which appears closest to the goal, as estimated by the heuristic function alone, i.e. f(n) = h(n).
Best-first search algorithm:
o Step 1: Place the starting node in the OPEN list.
o Step 2: If the OPEN list is empty, stop and return failure.
o Step 3: Remove the node n from the OPEN list which has the lowest value of h(n), and place it in the CLOSED list.
o Step 4: Expand node n and generate its successors.
o Step 5: Check each successor of node n, and find whether any node is a goal node or not. If any successor node is a goal node, then return success and terminate the search; else proceed to Step 6.
o Step 6: For each successor node, the algorithm checks the evaluation function f(n) and then checks whether the node is in the OPEN or CLOSED list. If the node is in neither list, add it to the OPEN list.
o Step 7: Return to Step 2.
Advantages:
o Best-first search can switch between BFS and DFS, gaining the advantages of both algorithms.
o This algorithm is more efficient than the BFS and DFS algorithms.
Disadvantages:
o It can behave as an unguided depth-first search in the worst-case scenario.
o It can get stuck in a loop, like DFS.
Example:
Consider the below search problem, which we will traverse using greedy best-first search. At each iteration, each node is expanded using the evaluation function f(n) = h(n), which is given in the below table.
In this search example, we are using two lists: OPEN and CLOSED. Following are the iterations for traversing the above example.
Time Complexity: The worst-case time complexity of greedy best-first search is O(b^m).
Space Complexity: The worst-case space complexity of greedy best-first search is O(b^m), where m is the maximum depth of the search space.
Complete: Greedy best-first search is also incomplete, even if the given state space is
finite.
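A minimal greedy best-first search sketch follows; it orders the OPEN list purely by h(n). The graph and heuristic values are illustrative assumptions, not the table from the example above:

import heapq

graph = {'S': ['A', 'B'], 'A': ['C', 'D'], 'B': ['E'],
         'C': [], 'D': ['G'], 'E': ['G'], 'G': []}
h = {'S': 7, 'A': 4, 'B': 5, 'C': 6, 'D': 2, 'E': 3, 'G': 0}

def greedy_best_first(start, goal):
    open_list = [(h[start], start, [start])]    # OPEN list ordered by h(n) only
    closed = set()
    while open_list:
        _, node, path = heapq.heappop(open_list)
        if node == goal:
            return path
        closed.add(node)                        # move n to the CLOSED list
        for succ in graph[node]:
            if succ not in closed:
                heapq.heappush(open_list, (h[succ], succ, path + [succ]))
    return None

print(greedy_best_first('S', 'G'))   # prints: ['S', 'A', 'D', 'G']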
2. A* Search Algorithm:
In the A* search algorithm, we use the search heuristic as well as the cost to reach the node. Hence we can combine both costs as f(n) = g(n) + h(n), where g(n) is the cost to reach node n from the start state and h(n) is the estimated cost from n to the goal; this sum is called the fitness number.
At each point in the search space, only the node with the lowest value of f(n) is expanded, and the algorithm terminates when the goal node is found.
Algorithm of A* search:
Step 1: Place the starting node in the OPEN list.
Step 2: Check if the OPEN list is empty or not; if the list is empty, then return failure and stop.
Step 3: Select the node from the OPEN list which has the smallest value of the evaluation function (g + h). If node n is the goal node, then return success and stop; otherwise:
Step 4: Expand node n, generate all of its successors, and put n into the closed list. For each successor n', check whether n' is already in the OPEN or CLOSED list; if not, compute its evaluation function and place it into the OPEN list.
Step 5: Else, if node n' is already in OPEN or CLOSED, it should be attached to the back pointer which reflects the lowest g(n') value.
Step 6: Return to Step 2.
Advantages:
o The A* search algorithm performs better than other search algorithms.
o The A* search algorithm is optimal and complete.
o This algorithm can solve very complex problems.
Disadvantages:
o It does not always produce the shortest path, as it is mostly based on heuristics and approximation.
o The A* search algorithm has some complexity issues.
o The main drawback of A* is its memory requirement: it keeps all generated nodes in memory, so it is not practical for various large-scale problems.
Example:
In this example, we will traverse the given graph using the A* algorithm. The heuristic value of all states is given in the below table, so we will calculate f(n) for each state using the formula f(n) = g(n) + h(n), where g(n) is the cost to reach a node from the start state.
Iteration 1: {(S --> A, 4), (S --> G, 10)}
Iteration 2: {(S --> A --> C, 4), (S --> A --> B, 7), (S --> G, 10)}
Iteration 3: {(S --> A --> C --> G, 6), (S --> A --> C --> D, 11), (S --> A --> B, 7), (S --> G, 10)}
Iteration 4 gives the final result: S ---> A ---> C ---> G, which provides the optimal path with cost 6.
Points to remember:
o The A* algorithm returns the path which occurs first, and it does not search for all remaining paths.
o The efficiency of the A* algorithm depends on the quality of the heuristic.
o The A* algorithm expands all nodes which satisfy the condition f(n) < C*, where C* is the cost of the optimal solution.
Complete: The A* algorithm is complete as long as:
o the branching factor is finite, and
o the cost of every action is fixed.
Optimal: The A* search algorithm is optimal if it satisfies two conditions:
o Admissible: the first condition required for optimality is that h(n) should be an admissible heuristic for A* tree search. An admissible heuristic is optimistic in nature.
o Consistency: the second required condition is consistency, needed for A* graph search.
If the heuristic function is admissible, then A* tree search will always find the least-cost path.
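The worked example above can be reproduced with a short A* sketch. The edge costs and heuristic table below are chosen so that they match the iterations listed (S->A = 1, S->G = 10, A->B = 2, A->C = 1, C->D = 3, C->G = 4); treat them as a reconstruction of the missing figure, not the original data:

import heapq

graph = {
    'S': [('A', 1), ('G', 10)],
    'A': [('B', 2), ('C', 1)],
    'B': [],
    'C': [('D', 3), ('G', 4)],
    'D': [],
    'G': [],
}
h = {'S': 5, 'A': 3, 'B': 4, 'C': 2, 'D': 6, 'G': 0}

def a_star(start, goal):
    open_list = [(h[start], 0, start, [start])]    # ordered by f(n) = g(n) + h(n)
    closed = set()
    while open_list:
        f, g, node, path = heapq.heappop(open_list)
        if node == goal:
            return g, path
        if node in closed:
            continue
        closed.add(node)
        for succ, step in graph[node]:
            if succ not in closed:
                g2 = g + step
                heapq.heappush(open_list, (g2 + h[succ], g2, succ, path + [succ]))
    return None

print(a_star('S', 'G'))   # prints: (6, ['S', 'A', 'C', 'G'])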
Production System in AI
A production system (popularly known as a production rule system) is a kind of cognitive architecture that is used to implement search algorithms and replicate human problem-solving skills. This problem-solving knowledge is encoded in the system in the form of little quanta popularly known as productions.
Each production consists of two components: a condition and an action. The condition recognizes the situation, and the action part holds the knowledge of how to deal with it. In simpler words, the production system in AI contains a set of rules defined by a left side and a right side: the left side contains the set of things to watch for (the condition), and the right side contains the things to do (the action).
What are the Elements of a Production System?
An AI production system has three main elements which are as follows:
• Global Database: The primary database which contains all the
information necessary to successfully complete a task. It is further
broken down into two parts: temporary and permanent. The
temporary part contains information relevant to the current
situation only whereas the permanent part contains information
about the fixed actions.
• A set of Production Rules: A set of rules that operates on the
global database. Each rule consists of a precondition and
postcondition that the global database either meets or not. For
example, if a condition is met by the global database, then the
production rule is applied successfully.
• Control System: A control system that acts as the decision-maker,
decides which production rule should be applied. The Control
system stops computation or processing when a termination
condition is met on the database.
UNIT-II
Heuristics
A heuristic is a technique used to solve a problem faster than classic methods, or to find an approximate solution when classic methods cannot find an exact one. Heuristics are problem-solving techniques that result in practical and quick solutions.
Heuristics are strategies derived from past experience with similar problems. Heuristics use practical methods and shortcuts to produce solutions that may or may not be optimal, but are sufficient within a given, limited timeframe.
Simple Hill Climbing:
It is also called greedy local search, as it only looks to its good immediate neighbour state and not beyond that. The steps of a simple hill-climbing algorithm are listed below:
Step 1: Evaluate the initial state. If it is the goal state, then return success and stop.
Step 2: Loop until a solution is found or there is no new operator left to apply.
Step 3: Select and apply an operator to the current state.
Step 4: Check the new state:
a. If it is the goal state, then return success and quit.
b. Else, if it is better than the current state, then assign the new state as the current state.
c. Else, if it is not better than the current state, then return to Step 2.
Step 5: Exit.
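The loop above can be sketched in a few lines of Python. For brevity this sketch evaluates all neighbours and picks the best (a steepest-ascent variant rather than strictly simple hill climbing); the objective function and neighbourhood are illustrative assumptions:

def objective(x):
    return -(x - 7) ** 2 + 50          # a single peak at x = 7

def neighbours(x):
    return [x - 1, x + 1]              # operators: move one step left or right

def hill_climbing(start):
    current = start
    while True:
        best = max(neighbours(current), key=objective)
        if objective(best) <= objective(current):
            return current             # no better neighbour: stop at the peak
        current = best                 # a better neighbour becomes the current state

print(hill_climbing(0))   # prints: 7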
Depth-first branch-and-bound search keeps the cost of the best solution found so far as a bound, and prunes any partial path whose cost plus heuristic estimate is not below that bound:
procedure DF_branch_and_bound(G, s, goal, h, bound0)
  Inputs:
    G: graph with nodes N and arcs A
    s: start node
    goal: Boolean function on nodes
    h: heuristic function on nodes
    bound0: initial depth bound
  Output: a least-cost path from s to a goal node with cost less than bound0, if one exists; otherwise ⊥
  Local:
    best_path: path or ⊥
    bound: non-negative real
    procedure cbsearch(⟨n0, ..., nk⟩)
      if cost(⟨n0, ..., nk⟩) + h(nk) < bound then
        if goal(nk) then
          best_path := ⟨n0, ..., nk⟩
          bound := cost(⟨n0, ..., nk⟩)
        else
          for each arc ⟨nk, n⟩ in A do
            cbsearch(⟨n0, ..., nk, n⟩)
  best_path := ⊥
  bound := bound0
  cbsearch(⟨s⟩)
  return best_path
And-Or graph
PROBLEM REDUCTION:
So far we have considered search strategies for OR graphs, through which we want to find a single path to a goal. Such a structure represents the fact that we will know how to get from a node to a goal state if we can discover how to get from that node to a goal state along any one of the branches leaving it.
AND-OR GRAPHS
The AND-OR graph (or tree) is useful for representing the solution of problems that can be solved by decomposing them into a set of smaller problems, all of which must then be solved. This decomposition, or reduction, generates arcs that we call AND arcs. One AND arc may point to any number of successor nodes, all of which must be solved in order for the arc to point to a solution. Just as in an OR graph, several arcs may emerge from a single node, indicating a variety of ways in which the original problem might be solved. This is why the structure is called not simply an AND graph but rather an AND-OR graph (which also happens to be an AND-OR tree).
The depth-first search and breadth-first search given earlier for OR trees or graphs can be easily adapted to AND-OR graphs. The main difference lies in the way termination conditions are determined, since all goals following an AND node must be realized, whereas a single goal node following an OR node will do. For this purpose, we use the AO* algorithm.
Like the A* algorithm, here we will use two lists and one heuristic function:
OPEN: It contains the nodes that have been traversed but not yet marked solvable or unsolvable.
CLOSE: It contains the nodes that have already been processed.
h(n): The estimated distance from the current node to the goal node.
Step 1: Place the starting node into OPEN.
Step 2: Compute the most promising solution tree, say T0.
Step 3: Select a node n that is both on OPEN and a member of T0. Remove it from OPEN and place it in CLOSE.
Step 4: If n is a terminal goal node, then label n as solved and label all its ancestors as solved. If the starting node is marked as solved, then return success and exit.
Step 5: If n is not a solvable node, then mark n as unsolvable. If the starting node is marked as unsolvable, then return failure and exit.
Step 6: Expand n. Find all its successors, find their h(n) values, and push them into OPEN.
Step 7: Return to Step 2.
Step 8: Exit.
We have encountered a wide variety of methods, including adversarial search and local search, for addressing various problems. Every problem-solving method has a single purpose in mind: to find a solution that enables the achievement of the objective. However, there were no restrictions on the agents' capability to resolve problems and arrive at answers in adversarial search and local search, respectively.
Constraint Satisfaction Problems in Artificial Intelligence
This section examines constraint satisfaction, another form of real-world problem-solving method. As the name implies, constraint satisfaction means that such a problem must be solved while adhering to a set of restrictions or guidelines.
For a constraint satisfaction problem (CSP), three components must be defined: a set of variables, a domain of possible values for each variable, and a set of constraints. A state in the search space is defined by assigning values to some or all of the variables, and assignments are classified as follows:
1. Consistent or Legal Assignment: An assignment is called consistent or legal if it does not violate any constraint.
2. Complete Assignment: An assignment in which every variable has a value assigned to it, and which is consistent with the CSP. Such an assignment is referred to as a complete assignment.
3. Partial Assignment: An assignment that gives values to only some of the variables. Assignments of this nature are referred to as incomplete assignments.
o Discrete Domain: An infinite domain in which a single state can involve numerous variables; for instance, every variable may receive an endless number of starting states.
o Continuous Domain: A finite domain with continuous states that describe just one area for one particular variable. It is also called a constant domain.
Types of Constraints in CSP
Basically, there are three different categories of constraints with respect to the variables:
o Unary constraints: the simplest kind of constraint, because they restrict the value of only one variable.
o Binary constraints: these constraints relate two variables; for example, the value of a variable x2 may be required to lie between x1 and x3.
o Global constraints: this kind of constraint involves an arbitrary number of variables.
The main kinds of constraints are handled using certain kinds of solution methodologies:
o Linear constraints: frequently used in linear programming, where every variable carrying an integer value occurs only in linear equations.
o Non-linear constraints: used in non-linear programming, where variables (with integer values) appear in non-linear form.
Note: The preference constraint is a special kind of constraint that operates in the real world.
Think of a Sudoku puzzle where some of the squares have initial fills of certain integers. You must complete the empty squares with numbers between 1 and 9, making sure that no row, column, or block contains a repeated integer. This constraint satisfaction problem is quite elementary: a problem must be solved while taking certain restrictions into consideration.
The integer range (1-9) that can occupy the empty spaces is referred to as the domain, while the empty spaces themselves are referred to as variables. The values of the variables are drawn from the domain, and constraints are the rules that determine which values from the domain a variable may take.
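The variables/domain/constraints structure can be shown with a problem smaller than Sudoku. The following backtracking sketch solves a tiny map-colouring CSP; the regions, colours, and adjacency are illustrative assumptions:

variables = ['WA', 'NT', 'SA', 'Q']
domains = {v: ['red', 'green', 'blue'] for v in variables}
neighbours = {'WA': ['NT', 'SA'], 'NT': ['WA', 'SA', 'Q'],
              'SA': ['WA', 'NT', 'Q'], 'Q': ['NT', 'SA']}

def consistent(var, value, assignment):
    # Binary constraint: adjacent regions must not share a colour.
    return all(assignment.get(n) != value for n in neighbours[var])

def backtrack(assignment):
    if len(assignment) == len(variables):
        return assignment                    # complete, consistent assignment
    var = next(v for v in variables if v not in assignment)
    for value in domains[var]:
        if consistent(var, value, assignment):
            assignment[var] = value          # extend the partial assignment
            result = backtrack(assignment)
            if result:
                return result
            del assignment[var]              # undo the assignment and backtrack
    return None

print(backtrack({}))
# prints: {'WA': 'red', 'NT': 'green', 'SA': 'blue', 'Q': 'red'}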
Mini-Max Algorithm:
o In this algorithm two players play the game; one is called MAX and the other is called MIN.
o The two players fight it out, each trying to ensure that the opponent gets the minimum benefit while they get the maximum benefit.
o Both players are opponents of each other: MAX will select the maximized value and MIN will select the minimized value.
o The minimax algorithm performs a depth-first search for the exploration of the complete game tree.
o The minimax algorithm proceeds all the way down to the terminal nodes of the tree, then backtracks up the tree via recursion.
o Maximizer will try to get the Maximum possible score, and Minimizer will try to get the minimum
possible score.
o This algorithm applies DFS, so in this game-tree, we have to go all the way through the leaves to
reach the terminal nodes.
o At the terminal nodes, the terminal values are given, so we compare those values and backtrack the tree until the initial state is reached. Following are the main steps involved in solving the two-player game tree:
Step 1: In the first step, the algorithm generates the entire game tree and applies the utility function to get the utility values for the terminal states. In the below tree diagram, let's take A as the initial state of the tree. Suppose the maximizer takes the first turn, which has a worst-case initial value of -∞, and the minimizer takes the next turn, which has a worst-case initial value of +∞.
Step 2: Now, first we find the utility values for the maximizer. Its initial value is -∞, so we compare each terminal value with the initial value of the maximizer and determine the higher node values. The algorithm finds the maximum among them all:
o For node D: max(-1, -∞) => max(-1, 4) = 4
o For node E: max(2, -∞) => max(2, 6) = 6
o For node F: max(-3, -∞) => max(-3, -5) = -3
o For node G: max(0, -∞) => max(0, 7) = 7
Step 3: In the next step, it is the minimizer's turn, so it compares all node values with +∞ and finds the third-layer node values:
o For node B: min(4, 6) = 4
o For node C: min(-3, 7) = -3
Step 4: Now it is the maximizer's turn, and it will again choose the maximum of all node values and find the value of the root node. In this game tree there are only 4 layers, so we reach the root node immediately, but in real games there will be more than 4 layers:
o For node A: max(4, -3) = 4
That was the complete workflow of the minimax two-player game.
o Time Complexity- As it performs DFS over the game tree, the time complexity of the mini-max algorithm is O(b^m), where b is the branching factor and m is the maximum depth of the tree.
o Space Complexity- The space complexity of the mini-max algorithm is also similar to DFS, which is O(bm).
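The four steps above correspond to a short recursive sketch. The tree and terminal values below match the worked example (D: -1, 4; E: 2, 6; F: -3, -5; G: 0, 7):

tree = {
    'A': ['B', 'C'], 'B': ['D', 'E'], 'C': ['F', 'G'],
    'D': [-1, 4], 'E': [2, 6], 'F': [-3, -5], 'G': [0, 7],
}

def minimax(node, maximizing):
    if isinstance(node, int):                # terminal value reached
        return node
    values = [minimax(child, not maximizing) for child in tree[node]]
    return max(values) if maximizing else min(values)

# A is a MAX node, B and C are MIN nodes, D..G are MAX nodes over the terminals.
print(minimax('A', True))   # prints: 4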
UNIT-III
First-Order Logic in Artificial intelligence
In the topic of propositional logic, we have seen how to represent statements using propositional logic. But unfortunately, in propositional logic we can only represent facts, which are either true or false. PL is not sufficient to represent complex sentences or natural language statements; propositional logic has very limited expressive power. Consider a sentence such as "Some humans are intelligent", which we cannot represent using PL. To represent such statements, PL is not sufficient, so we require a more powerful logic, such as first-order logic.
First-Order logic:
o First-order logic is another way of knowledge representation in artificial intelligence. It is an extension of propositional logic.
o FOL is sufficiently expressive to represent natural language statements in a concise way.
First-order logic is also known as predicate logic or first-order predicate logic. First-order logic is a powerful language that expresses information about objects in a more natural way, and can also express the relationships between those objects.
o First-order logic (like natural language) does not only assume that the world contains facts, as propositional logic does, but also assumes the following things in the world:
o Objects: A, B, people, numbers, colors, wars, theories, squares, pits, wumpus, ...
o Relations: unary relations such as red, round, is adjacent, or n-ary relations such as the sister of, brother of, has color, comes between
o Functions: father of, best friend, ...
As in a natural language, first-order logic has two main parts:
a. Syntax
b. Semantics
The basic elements of FOL syntax are:
Variables: x, y, z, a, b, ...
Predicates: Brother, Father, >, ...
Connectives: ∧, ∨, ¬, ⇒, ⇔
Equality: ==
Quantifiers: ∀, ∃
Atomic sentences:
o Atomic sentences are the most basic sentences of first-order logic. These sentences are formed
from a predicate symbol followed by a parenthesis with a sequence of terms.
o We can represent atomic sentences as Predicate (term1, term2, ......, term n).
Complex Sentences:
o Complex sentences are made by combining atomic sentences using connectives.
o Predicate: A predicate can be defined as a relation which binds two atoms together in a statement.
Consider the statement "x is an integer": it consists of two parts; the first part, x, is the subject of the statement, and the second part, "is an integer", is known as the predicate.
Quantifiers in First-order logic:
o A quantifier is a language element which generates quantification, and quantification specifies the quantity of specimens in the universe of discourse.
These are the symbols that permit us to determine or identify the range and scope of the variable in a logical expression. There are two types of quantifiers:
Universal Quantifier:
The universal quantifier is a symbol of logical representation which specifies that the statement within its range is true for everything or for every instance of a particular thing. It is denoted by the logical operator ∀, which resembles an inverted A. If x is a variable, then ∀x is read as: For all x, For each x, or For every x.
Example:
All men drink coffee: ∀x man(x) → drink(x, coffee).
Existential Quantifier:
Existential quantifiers are the type of quantifiers, which express that the statement within
its scope is true for at least one instance of something.
It is denoted by the logical operator ∃, which resembles an inverted E. When it is used with a predicate variable, it is called an existential quantifier.
Note: With the existential quantifier we always use the AND or conjunction symbol (∧).
If x is a variable, then the existential quantifier will be ∃x or ∃(x), and it will be read as:
o There exists an 'x.'
o For some 'x.'
o For at least one 'x.'
Example:
Some boys are intelligent: ∃x boys(x) ∧ intelligent(x).
Points to remember:
o The main connective for the universal quantifier ∀ is implication →.
o The main connective for the existential quantifier ∃ is conjunction ∧.
Properties of Quantifiers:
o In the universal quantifier, ∀x∀y is similar to ∀y∀x.
o In the existential quantifier, ∃x∃y is similar to ∃y∃x.
o ∃x∀y is not similar to ∀y∃x.
Free Variable: A variable is said to be a free variable in a formula if it occurs outside the
scope of the quantifier.
Resolution in FOL
Resolution
Resolution is a theorem-proving technique that proceeds by building refutation proofs, i.e., proofs by contradiction. It was invented by the mathematician John Alan Robinson in 1965.
Resolution is used when various statements are given and we need to prove a conclusion from those statements. Unification is a key concept in proofs by resolution. Resolution is a single inference rule which can efficiently operate on conjunctive normal form or clausal form.
Clause: A disjunction of literals (atomic sentences) is called a clause. A clause containing a single literal is known as a unit clause.
Note: To better understand this topic, first learn FOL in AI.
The resolution inference rule:
The resolution rule for first-order logic is simply a lifted version of the propositional rule.
Resolution can resolve two clauses if they contain complementary literals, which are
assumed to be standardized apart so that they share no variables.
This rule is also called the binary resolution rule because it only resolves exactly two
literals.
Where l1, ..., lk and m1, ..., mn are literals and UNIFY(li, ¬mj) = θ, the rule infers, from the clauses l1 ∨ ... ∨ lk and m1 ∨ ... ∨ mn, the resolvent:
SUBST(θ, l1 ∨ ... ∨ li-1 ∨ li+1 ∨ ... ∨ lk ∨ m1 ∨ ... ∨ mj-1 ∨ mj+1 ∨ ... ∨ mn)
What is Unification?
o Unification is the process of making two different logical atomic expressions identical by finding a substitution. Unification depends on the substitution process.
o It takes two literals as input and makes them identical using substitution.
o Let Ψ1 and Ψ2 be two atomic sentences and σ be a unifier such that Ψ1σ = Ψ2σ; this is expressed as UNIFY(Ψ1, Ψ2).
o The UNIFY algorithm is used for unification; it takes two atomic sentences and returns a unifier for those sentences (if one exists).
o Unification is a key component of all first-order inference algorithms.
o It returns fail if the expressions do not match with each other.
Conditions for unification:
o The predicate symbols must be the same; atoms or expressions with different predicate symbols can never be unified.
o The number of arguments in both expressions must be identical.
o Unification will fail if there are two similar variables present in the same expression.
o If one expression is a variable vi and the other is a term ti which does not contain vi, then unification succeeds with the substitution {ti / vi}.
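A small sketch of the UNIFY idea in Python follows; the term representation (tuples for atoms, lowercase strings for variables) is an illustrative assumption, and the occurs-check is omitted for brevity:

def is_variable(t):
    return isinstance(t, str) and t[0].islower()

def substitute(term, theta):
    if is_variable(term):
        return substitute(theta[term], theta) if term in theta else term
    if isinstance(term, tuple):
        return tuple(substitute(t, theta) for t in term)
    return term

def unify(x, y, theta=None):
    if theta is None:
        theta = {}
    x, y = substitute(x, theta), substitute(y, theta)
    if x == y:
        return theta
    if is_variable(x):
        theta[x] = y
        return theta
    if is_variable(y):
        theta[y] = x
        return theta
    if isinstance(x, tuple) and isinstance(y, tuple) and len(x) == len(y):
        for xi, yi in zip(x, y):             # predicate symbol, then each argument
            theta = unify(xi, yi, theta)
            if theta is None:
                return None                  # fail: expressions do not match
        return theta
    return None

print(unify(('Knows', 'John', 'x'), ('Knows', 'y', 'Bill')))
# prints: {'y': 'John', 'x': 'Bill'}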
Substitution:
Substitution means replacing one variable with another term.
Note: First-order logic is capable of expressing facts about some or all objects in the
universe.
Equality:
First-order logic does not only use predicates and terms for making atomic sentences; it also provides another way, which is equality. We can use the equality symbol to specify that two terms refer to the same object, as in Brother(John) = Smith.
Here, the object referred to by Brother(John) is the same as the object referred to by Smith. The equality symbol can also be used with negation to represent that two terms are not the same object.
o Universal Generalization
o Universal Instantiation
o Existential Instantiation
o Existential Introduction
1. Universal Generalization:
o Universal generalization is a valid inference rule which states that if premise P(c) is true for any arbitrary element c in the universe of discourse, then we can conclude ∀x P(x).
o This rule can be used if we want to show that every element has a similar property.
o Example: Let P(c) be "A byte contains 8 bits"; then ∀x P(x), "All bytes contain 8 bits", is also true.
2. Universal Instantiation:
o As per UI, we can infer any sentence obtained by substituting a ground term for the variable.
o The UI rule states that we can infer any sentence P(c) by substituting a ground term c (a constant within domain x) into ∀x P(x), for any object in the universe of discourse.
o Example: From "Every person likes ice-cream", ∀x P(x), we can infer "John likes ice-cream", P(John).
3. Existential Instantiation:
o Existential instantiation is also called Existential Elimination; it is a valid inference rule in first-order logic.
o It can be applied only once to replace the existential sentence.
o The new KB is not logically equivalent to the old KB, but it is satisfiable if the old KB was satisfiable.
o This rule states that one can infer P(c) from a formula of the form ∃x P(x), for a new constant symbol c.
o The restriction with this rule is that the c used in the rule must be a new term for which P(c) is true.
Example:
From the sentence ∃x Crown(x) ∧ OnHead(x, John),
we can infer: Crown(K) ∧ OnHead(K, John), as long as K does not appear elsewhere in the knowledge base.
o The K used above is a constant symbol, which is called a Skolem constant.
o Existential instantiation is a special case of the Skolemization process.
4. Existential introduction
o This rule states that if there is some element c in the universe of discourse which has property P, then we can infer that there exists something in the universe which has property P.
Example:
For the statement "Greedy kings are evil": if we find some x such that x is a king and x is greedy (and therefore evil), we can infer that there exists something which is evil, i.e. ∃x Evil(x).
Inference engine:
The inference engine is the component of an intelligent system in artificial intelligence which applies logical rules to the knowledge base to infer new information from known facts. The first inference engine was part of an expert system. An inference engine commonly proceeds in two modes, which are:
a. Forward chaining
b. Backward chaining
Horn clauses and definite clauses are forms of sentences which enable the knowledge base to use a more restricted and efficient inference algorithm. Logical inference algorithms use forward and backward chaining approaches, which require the KB to be in the form of first-order definite clauses.
Definite clause: A clause which is a disjunction of literals with exactly one positive literal
is known as a definite clause or strict horn clause.
Horn clause: A clause which is a disjunction of literals with at most one positive literal is known
as horn clause. Hence all the definite clauses are horn clauses.
Example: (¬p ∨ ¬q ∨ k) has only one positive literal, k.
It is equivalent to p ∧ q → k.
A. Forward Chaining
Forward chaining is also known as forward deduction or forward reasoning when using an inference engine. Forward chaining is a form of reasoning which starts with atomic sentences in the knowledge base and applies inference rules (Modus Ponens) in the forward direction to extract more data until a goal is reached.
The forward-chaining algorithm starts from known facts, triggers all rules whose premises are satisfied, and adds their conclusions to the known facts. This process repeats until the problem is solved.
Properties of Forward-Chaining:
o It is a process of making conclusions based on known facts or data, starting from the initial state and reaching the goal state.
o The forward-chaining approach is also called data-driven, as we reach the goal using available data.
o The forward-chaining approach is commonly used in expert systems, such as CLIPS, business rule systems, and production rule systems.
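A minimal forward-chaining sketch over propositional definite clauses follows; the rule base (a symbolic reduction of the classic "Robert is a criminal" style example) is an illustrative assumption:

rules = [
    ({'american', 'weapon', 'sells', 'hostile'}, 'criminal'),
    ({'missile'}, 'weapon'),
    ({'missile', 'owns'}, 'sells'),
    ({'enemy'}, 'hostile'),
]
facts = {'american', 'missile', 'owns', 'enemy'}

def forward_chain(facts, rules, goal):
    changed = True
    while changed:                       # repeat until no rule adds a new fact
        changed = False
        for premises, conclusion in rules:
            if premises <= facts and conclusion not in facts:
                facts.add(conclusion)    # fire the rule: add its conclusion
                changed = True
                if conclusion == goal:
                    return True
    return goal in facts

print(forward_chain(facts, rules, 'criminal'))   # prints: True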
B. Backward Chaining:
Backward chaining is also known as backward deduction or backward reasoning when using an inference engine. A backward-chaining algorithm is a form of reasoning which starts with the goal and works backward, chaining through rules to find known facts that support the goal.
o Backward chaining is based on the Modus Ponens inference rule.
o In backward chaining, the goal is broken into sub-goals to prove the facts true.
o The backward-chaining algorithm is used in game theory, automated theorem-proving tools, inference engines, proof assistants, and various AI applications.
o The backward-chaining method mostly uses a depth-first search strategy for proof.
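For contrast, here is the same illustrative rule base queried by a minimal backward-chaining sketch, which starts from the goal and recursively proves each premise as a sub-goal:

rules = [
    ({'american', 'weapon', 'sells', 'hostile'}, 'criminal'),
    ({'missile'}, 'weapon'),
    ({'missile', 'owns'}, 'sells'),
    ({'enemy'}, 'hostile'),
]
facts = {'american', 'missile', 'owns', 'enemy'}

def backward_chain(goal):
    if goal in facts:
        return True                          # a known fact proves the goal
    for premises, conclusion in rules:
        if conclusion == goal:
            if all(backward_chain(p) for p in premises):
                return True                  # every sub-goal proved
    return False

print(backward_chain('criminal'))   # prints: True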
There are mainly four ways of knowledge representation, which are given as follows:
1. Logical Representation
2. Semantic Network Representation
3. Frame Representation
4. Production Rules
1. Logical Representation
Logical representation is a language with some concrete rules which deals with propositions and has no ambiguity in representation. Logical representation means drawing conclusions based on various conditions. This representation lays down some important communication rules. It consists of precisely defined syntax and semantics which support sound inference. Each sentence can be translated into logic using syntax and semantics.
Syntax:
o Syntaxes are the rules which decide how we can construct legal sentences in the logic.
o It determines which symbols we can use in knowledge representation, and how to write those symbols.
Semantics:
o Semantics are the rules by which we can interpret the sentences in the logic.
o Semantics also involve assigning a meaning to each sentence.
Logical representation can be categorized into mainly two logics:
a. Propositional logic
b. Predicate logic
Note: We will discuss propositional logic and predicate logic in later chapters.
Disadvantages of logical representation:
1. Logical representations have some restrictions and are challenging to work with.
2. The logical representation technique may not be very natural, and inference may not be very efficient.
Note: Do not be confused with logical representation and logical reasoning as logical
representation is a representation language and reasoning is a process of thinking
logically.
2. Semantic Network Representation
Semantic networks are an alternative to predicate logic for knowledge representation. In semantic networks, we can represent our knowledge in the form of graphical networks. Such a network consists of nodes representing objects and arcs describing the relationships between those objects. Semantic networks can categorize objects in different forms and can also link those objects. Semantic networks are easy to understand and can be easily extended.
This representation consists of mainly two types of relations:
a. IS-A relation (Inheritance)
b. Kind-of relation
Example: Following are some statements which we need to represent in the form of nodes
and arcs.
Statements:
a. Jerry is a cat.
b. Jerry is a mammal.
In the above diagram, we have represented different types of knowledge in the form of nodes and arcs. Each object is connected to another object by some relation.
Drawbacks of semantic representation:
1. Semantic networks take more computational time at runtime, as we need to traverse the complete network tree to answer some questions.
2. Semantic networks try to model human-like memory (which has about 10^15 neurons and links) to store the information, but in practice it is not possible to build such a vast semantic network.
3. These types of representations are inadequate, as they do not have any equivalent quantifiers, e.g., for all, for some, none, etc.
4. Semantic networks do not have any standard definition for the link names.
5. These networks are not intelligent, and depend on the creator of the system.
3. Frame Representation
A frame is a record-like structure which consists of a collection of attributes and their values to describe an entity in the world. Frames are the AI data structure which divides knowledge into substructures by representing stereotyped situations. A frame consists of a collection of slots and slot values. These slots may be of any type and size. Slots have names and values, which are called facets.
Facets: The various aspects of a slot are known as facets. Facets are features of frames which enable us to put constraints on the frames. Example: IF-NEEDED facets are called when the data of a particular slot is needed. A frame may consist of any number of slots, a slot may include any number of facets, and a facet may have any number of values.
A frame is also known as slot-filler knowledge representation in artificial intelligence.
Frames are derived from semantic networks and later evolved into our modern-day classes and objects. A single frame is not very useful; a frame system consists of a collection of connected frames. In a frame, knowledge about an object or event can be stored together in the knowledge base. The frame is a type of technology which is widely used in various applications, including natural language processing and machine vision.
Example: 1
Let's take an example of a frame for a book
Slots: Fillers
Title: Artificial Intelligence
Year: 1996
Page: 1152
Advantages of frame representation:
1. The frame knowledge representation makes programming easier by grouping related data.
2. The frame representation is comparably flexible and is used by many applications in AI.
4. Production Rules
A production rules system consists of (condition, action) pairs, which mean "if condition, then action". It has mainly three parts:
o The set of production rules
o Working memory
o The recognize-act cycle
In production rules, the agent checks for the condition, and if the condition holds, the production rule fires and the corresponding action is carried out. The condition part of the rule determines which rule may be applied to a problem, and the action part carries out the associated problem-solving steps. This complete process is called a recognize-act cycle.
The working memory contains the description of the current state of problem solving, and rules can write knowledge to the working memory. This knowledge may then match and fire other rules.
When a new situation (state) is generated, multiple production rules may be triggered together; this set of rules is called the conflict set. In this situation, the agent needs to select a rule from the set, and this selection is called conflict resolution.
Example:
o IF (at bus stop AND bus arrives) THEN action (get into the bus)
o IF (on the bus AND paid AND empty seat) THEN action (sit down)
o IF (on the bus AND unpaid) THEN action (pay charges)
Advantages of production rules:
1. The production rules are expressed in natural language.
2. The production rules are highly modular, so we can easily remove, add, or modify an individual rule.
Disadvantages of production rules:
1. A production rule system does not exhibit any learning capability, as it does not store the results of problems for future use.
2. During the execution of the program, many rules may be active, hence rule-based production systems are inefficient.
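The recognize-act cycle and conflict resolution can be sketched with the bus rules above; working memory is a set of facts, conflict resolution is trivially "first applicable rule wins", and the facts each action adds are illustrative assumptions:

working_memory = {'at bus stop', 'bus arrives'}

rules = [
    ({'at bus stop', 'bus arrives'}, 'get into the bus'),
    ({'on the bus', 'paid', 'empty seat'}, 'sit down'),
    ({'on the bus', 'unpaid'}, 'pay charges'),
]

def recognize_act(memory):
    fired = set()
    while True:
        # Recognize: collect the conflict set of applicable, not-yet-fired rules.
        conflict_set = [(c, a) for c, a in rules if c <= memory and a not in fired]
        if not conflict_set:
            return fired
        condition, action = conflict_set[0]   # conflict resolution: first rule wins
        print('firing:', action)
        fired.add(action)
        # Act: the action writes new knowledge into working memory.
        if action == 'get into the bus':
            memory |= {'on the bus', 'unpaid'}
        elif action == 'pay charges':
            memory |= {'paid', 'empty seat'}

recognize_act(working_memory)
# prints: firing: get into the bus / firing: pay charges / firing: sit down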
Scripts
Scripts are used in natural language understanding systems to organize a knowledge base in terms of the situations that the system should understand. Scripts use a frame-like structure to represent commonly occurring experiences, such as going to the movies, eating in a restaurant, shopping in a supermarket, or visiting an ophthalmologist.
Thus, a script is a structure that prescribes a set of circumstances that could be expected to follow on
from one another.
Components of a script
The components of a script include:
Entry conditions: These are the basic conditions which must be fulfilled before events in the script can occur.
Roles: These are the actions that the individual participants perform.
Track: Variations on the script. Different tracks may share components of the same script.
Conceptual Dependency in Artificial Intelligence
Conceptual Dependency:
In 1977, Roger C. Schank developed the Conceptual Dependency structure. Conceptual Dependency is used to represent knowledge in Artificial Intelligence. It should be powerful enough to represent the concepts in a natural language sentence, and it states that different sentences which have the same meaning should have a single, unique representation. There are 5 types of states in Conceptual Dependency:
1. Entities
2. Actions
3. Conceptual cases
4. Conceptual dependencies
5. Conceptual tense
Languages widely used to build artificial intelligence applications include:
o Python o R o Java o Prolog o Lisp o Julia o C++
2. Java
Java has many features which make it well suited in industry for developing artificial intelligence applications:
3. Prolog
o Supports basic mechanisms such as pattern matching, tree-based data structuring, and automatic backtracking.
4. Lisp
o Programs can be easily modified, similar to data.
6. Julia
o Common numeric data types.
o Arbitrary-precision values.
o Robust mathematical libraries.
o Ability to work for both parallel and distributed computing.
o Macros and metaprogramming.
7. C++
o C++ is one of the fastest languages, and it can be used in statistical techniques.
o It can be used with ML algorithms for fast execution.
o Most of the libraries and packages available for machine learning and AI are written in C++.
UNIT-IV
What is NLP?
NLP stands for Natural Language Processing, a field at the intersection of computer science, human language, and artificial intelligence. It is the technology used by machines to understand, analyse, manipulate, and interpret human languages. It helps developers organize knowledge for performing tasks such as translation, automatic summarization, Named Entity Recognition (NER), speech recognition, relationship extraction, and topic segmentation.
History of NLP
(1940-1960) - Focused on Machine Translation (MT)
1948 - The first recognisable NLP application was introduced at Birkbeck College, London.
1950s - In the 1950s, there was a conflicting view between linguistics and computer science. Chomsky then published his first book, Syntactic Structures, and claimed that language is generative in nature.
In 1957, Chomsky also introduced the idea of Generative Grammar, which consists of rule-based descriptions of syntactic structures.
Advantages of NLP
o NLP helps users ask questions about any subject and get a direct response within seconds.
o NLP offers exact answers to questions; it does not offer unnecessary and unwanted information.
o NLP helps computers communicate with humans in their own languages.
o It is very time efficient.
o Most of the companies use NLP to improve the efficiency of documentation processes, accuracy
of documentation, and identify the information from large databases.
Disadvantages of NLP
A list of disadvantages of NLP is given below:
o NLP is unable to adapt to new domains, and it has limited functionality; this is why NLP is built for a single, specific task only.
Components of NLP
There are the following two components of NLP:
1. Natural Language Understanding (NLU)
2. Natural Language Generation (NLG)
NLU is mainly used in business applications to understand the customer's problem in both spoken and written language.
NLU: The process of reading and interpreting language.
NLG: The process of writing or generating language.
Language Models determine the probability of the next word by analyzing the text in
data. These models interpret the data by feeding it through algorithms.
The algorithms are responsible for creating rules for the context in natural language. The
models are prepared for the prediction of words by learning the features and
characteristics of a language. With this learning, the model prepares itself for
understanding phrases and predicting the next words in sentences.
For training a language model, a number of probabilistic approaches are used. These
approaches vary on the basis of the purpose for which a language model is created. The
amount of text data to be analyzed and the math applied for analysis makes a difference
in the approach followed for creating and training a language model.
For example, a language model used for predicting the next word in a search query will be
absolutely different from those used in predicting the next word in a long document
(such as Google Docs). The approach followed to train the model would be unique in both
cases.
Statistical models include the development of probabilistic models that are able to
predict the next word in the sequence, given the words that precede it. A number of
statistical language models are in use already.
Let’s take a look at some of those popular models:
N-Gram: This is one of the simplest approaches to language modelling. Here, a probability distribution over sequences of 'n' words is created, where 'n' can be any number and defines the size of the gram (the sequence of words being assigned a probability). If n = 4, a gram may look like: "can you help me". Basically, 'n' is the amount of context that the model is trained to consider. There are different types of N-gram models, such as unigrams, bigrams, trigrams, etc.
Unigram: The unigram is the simplest type of language model. It doesn't look at any
conditioning context in its calculations. It evaluates each word or term independently.
Unigram models commonly handle language processing tasks such as information
retrieval. The unigram is the foundation of a more specific model variant called the query
likelihood model, which uses information retrieval to examine a pool of documents and
match the most relevant one to a specific query.
Bidirectional: Unlike n-gram models, which analyze text in one direction (backwards),
bidirectional models analyze text in both directions, backwards and forwards. These
models can predict any word in a sentence or body of text by using every other word in
the text. Examining text bidirectionally increases result accuracy. This type is often
utilized in machine learning and speech generation applications. For example, Google
uses a bidirectional model to process search queries.
Exponential: This type of statistical model evaluates text by using an equation which combines n-grams and feature functions. Here the features and parameters of the desired results are already specified. The model is based on the principle of maximum entropy, which states that the probability distribution with the most entropy, subject to the constraints of the feature functions, is the best choice. Exponential models make fewer statistical assumptions, which means the chances of having accurate results are higher.
Continuous Space: In this type of model, words are represented as a non-linear combination of weights in a neural network. The process of assigning a weight vector to a word is known as word embedding. This type of model proves helpful in scenarios where the vocabulary continues to grow and includes unique words.
In cases where the data set is large and contains many rare or unique words, linear models such as n-grams do not work well. This is because, as the vocabulary grows, the number of possible word sequences grows, and the patterns predicting the next word become weaker. A tiny embedding sketch is given below.
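As a hedged sketch of the embedding idea, the following toy code looks up dense vectors for words and ranks other words by cosine similarity; the three-dimensional vectors and tiny vocabulary are fabricated for illustration (real models learn hundreds of dimensions):

import numpy as np

# Hypothetical word embeddings, invented for the example.
embeddings = {
    "king":  np.array([0.9, 0.1, 0.4]),
    "queen": np.array([0.8, 0.2, 0.5]),
    "apple": np.array([0.1, 0.9, 0.2]),
}

def most_similar(word):
    """Rank the other words by cosine similarity to the given word's vector."""
    v = embeddings[word]
    scores = {}
    for other, u in embeddings.items():
        if other != word:
            scores[other] = float(v @ u / (np.linalg.norm(v) * np.linalg.norm(u)))
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

print(most_similar("king"))  # "queen" scores highest: its vector is closest to "king"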
Such continuous space models are based on neural networks and are often considered an advanced approach to executing NLP tasks. Neural language models overcome the shortcomings of classical models such as the n-gram and are used for complex tasks such as speech recognition or machine translation.
Language is significantly complex and keeps on evolving. Therefore, the more complex the language model is, the better it will be at performing NLP tasks. Compared to the n-gram model, an exponential or continuous space model proves to be a better option for NLP tasks because these models are designed to handle ambiguity and language variation.
Grammar
Grammar is defined as the rules for forming well-structured sentences.
The theory of formal languages is applicable not only here but also in other fields of Computer Science, mainly programming languages and data structures.
A Context-Free Grammar (CFG) consists of a finite set of grammar rules having the following four components:
• Set of Non-Terminals
• Set of Terminals
• Set of Productions
• Start Symbol
Set of Non-terminals
It is represented by V. The non-terminals are syntactic variables that denote sets of strings, which help in defining the language generated by the grammar.
Set of Terminals
Terminals are also known as tokens and are represented by Σ. Strings are formed from these basic symbols.
Set of Productions
It is represented by P. The set describes how the terminals and non-terminals can be combined. Every production consists of the following components:
• a non-terminal on the left side,
• an arrow,
• a sequence of terminals and/or non-terminals on the right side.
The left side of a production is always a single non-terminal, while the right side may contain both terminals and non-terminals.
Start Symbol
Derivation begins from the start symbol, which is represented by the symbol S. A non-terminal symbol is always designated as the start symbol. A small worked example follows.
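As a hedged illustration of these four components, the sketch below encodes a tiny CFG in Python and expands the start symbol S into a sentence; the grammar and word lists are invented for the example:

import random

# Hypothetical toy grammar: uppercase keys are non-terminals (V),
# lowercase strings are terminals (Σ), the dict entries are productions (P),
# and "S" is the start symbol.
productions = {
    "S":  [["NP", "VP"]],
    "NP": [["the", "N"]],
    "VP": [["V", "NP"]],
    "N":  [["dog"], ["cat"]],
    "V":  [["chased"], ["saw"]],
}

def derive(symbol):
    """Recursively expand a symbol using a randomly chosen production."""
    if symbol not in productions:  # a terminal appears in the sentence as-is
        return [symbol]
    rhs = random.choice(productions[symbol])
    words = []
    for s in rhs:
        words.extend(derive(s))
    return words

print(" ".join(derive("S")))  # e.g. "the dog chased the cat"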
Constituency Grammar (CG)
It is also known as phrase structure grammar. It is called constituency grammar because it is based on the constituency relation, and it is the opposite of dependency grammar.
Before diving deep into the discussion of CG, let’s see some fundamental points about constituency grammar and the constituency relation.
• All the related frameworks view sentence structure in terms of the constituency relation.
• The constituency relation is derived from the subject-predicate division of Latin and Greek grammar.
• Here we study the clause structure in terms of the noun phrase (NP) and the verb phrase (VP).
Dependency Grammar (DG)
It is the opposite of constituency grammar and is based on the dependency relation. Before diving deep into the discussion of DG, let’s see some fundamental points about dependency grammar and the dependency relation.
• In DG, the verb is taken to be the centre of the clause structure.
• Every other syntactic unit is connected to the verb by a directed link. These syntactic units are called dependencies.
Regular expression
o A regular expression is a sequence of symbols that defines a search pattern for strings. It is used to denote regular languages.
o It is also used to match character combinations in strings. String-searching algorithms use such patterns to perform operations on strings.
o In regular expressions, x* means zero or more occurrences of x. It can generate {ε, x, xx, xxx, xxxx, .....}
o In regular expressions, x+ means one or more occurrences of x. It can generate {x, xx, xxx, xxxx, .....}
Union: If L and M are two regular languages, then their union L ∪ M is also a regular language.
1. L ∪ M = {s | s is in L or s is in M}
Intersection: If L and M are two regular languages, then their intersection L ∩ M is also a regular language.
1. L ∩ M = {s | s is in L and s is in M}
Kleene closure: If L is a regular language, then its Kleene closure L* is also a regular language.
1. L* = {s1 s2 ... sk | k ≥ 0 and each si is in L}
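As a hedged, minimal illustration of these operators in practice, the sketch below uses Python's re module; the patterns and test strings are made up for the example:

import re

# x* : zero or more occurrences of x; x+ : one or more occurrences.
print(re.findall(r"ax*", "a ax axx b"))       # ['a', 'ax', 'axx']
print(re.findall(r"ax+", "a ax axx b"))       # ['ax', 'axx']

# Union of two patterns is written with '|' in regex syntax.
print(re.findall(r"cat|dog", "cat dog cow"))  # ['cat', 'dog']

# fullmatch checks whether the whole string belongs to the language.
print(bool(re.fullmatch(r"(ab)*", "ababab"))) # True: "ababab" is in {ab}*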
Finite Automata
o Finite automata are used to recognize patterns.
o A finite automaton takes a string of symbols as input and changes its state accordingly. When the desired symbol is found, the transition occurs.
o At the time of transition, the automaton can either move to the next state or stay in the same state.
o There are two possible outcomes for an input string: accept or reject. When the input string is processed successfully and the automaton reaches a final state, the string is accepted; otherwise it is rejected.
Formal Definition of FA
A finite automaton is a collection of 5-tuple (Q, ∑, δ, q0, F), where:
• Q is a finite set of states,
• ∑ is a finite set of input symbols (the alphabet),
• δ is the transition function,
• q0 is the initial state (q0 ∈ Q),
• F is the set of final (accept) states (F ⊆ Q).
Input tape: It is a linear tape having some number of cells. Each input symbol is placed in
each cell.
Finite control: The finite control decides the next state on receiving particular input from
input tape. The tape reader reads the cells one by one from left to right, and at a time only
one input symbol is read.
Types of Automata:
There are two types of finite automata:
1. DFA
DFA stands for deterministic finite automaton. Deterministic refers to the uniqueness of the computation: for a particular input character, the machine goes to exactly one state. A DFA does not allow the null (ε) move.
2. NFA
NFA stands for non-deterministic finite automaton. For a particular input, it can transition to any number of states. It can include the null (ε) move.
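A minimal, assumed DFA simulator in Python is sketched below; the example machine accepts binary strings ending in '1' and is invented for illustration:

# The DFA as a 5-tuple: states Q, alphabet, transition function delta,
# start state q0, and accept states F.
states = {"q0", "q1"}
alphabet = {"0", "1"}
delta = {
    ("q0", "0"): "q0", ("q0", "1"): "q1",
    ("q1", "0"): "q0", ("q1", "1"): "q1",
}
start = "q0"
accept = {"q1"}

def dfa_accepts(string):
    """Run the DFA on the string; accept iff it ends in an accept state."""
    state = start
    for symbol in string:
        state = delta[(state, symbol)]  # exactly one next state: deterministic
    return state in accept

print(dfa_accepts("0101"))  # True: the string ends in '1'
print(dfa_accepts("0110"))  # False: the string ends in '0'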
Challenges in POS Tagging
Part-of-speech (POS) tagging assigns a grammatical category (noun, verb, adjective, etc.) to each word in a text. Some common challenges are:
• Ambiguity: Some words can have multiple POS tags depending on the context in which
they appear, making it difficult to determine their correct tag. For example, the word
“bass” can be a noun (a type of fish) or an adjective (having a low frequency or pitch).
• Out-of-vocabulary (OOV) words: Words that are not present in the training data of a
POS tagger can be difficult to tag accurately, especially if they are rare or specific to a
particular domain.
• Complex grammatical structures: Languages with complex grammatical structures,
such as languages with many inflections or free word order, can be more challenging to
tag accurately.
• Lack of annotated training data: Some languages or domains may have limited
annotated training data, making it difficult to train a high-performing POS tagger.
• Inconsistencies in annotated data: Annotated data can sometimes contain errors or
inconsistencies, which can negatively impact the performance of a POS tagger.
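As a hedged illustration (assuming the nltk library is installed and its 'punkt' tokenizer and 'averaged_perceptron_tagger' resources have been downloaded), tagging a sentence might look like this; the example sentence is invented:

import nltk

# One-time downloads, assumed already available:
# nltk.download("punkt"); nltk.download("averaged_perceptron_tagger")

sentence = "He caught a bass near the dock"
tokens = nltk.word_tokenize(sentence)
print(nltk.pos_tag(tokens))
# Expected output (approximately):
# [('He', 'PRP'), ('caught', 'VBD'), ('a', 'DT'), ('bass', 'NN'), ...]
# Note how 'bass' is tagged as a noun here; in "a bass voice" it would
# be an adjective, illustrating the ambiguity challenge listed above.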
What is Semantics?
Semantics is simply the branch of linguistics that concerns the study of the meanings of words as well as their
meanings within a sentence. Thus, it is the study of linguistic meaning, or more precisely, the study of the relation
between linguistic expressions and their meaning. It considers the meaning of a sentence without
paying attention to its context.
To explain further what semantics means in linguistics, it can be denoted that “it is the study of the interpretation
of signs or symbols used by agents or communities within particular circumstances and contexts”. Hence,
according to this, sounds, facial expressions, body language, and proxemics have semantic (meaningful) content,
and each of these comprises several branches of study. Moreover, in written language, things like paragraph
structure and punctuation bear semantic content; other forms of language bear other semantic content.
Thus, semantics focuses on three basic aspects: “the relations of words to the objects denoted by them, the
relations of words to the interpreters of them, and, in symbolic logic, the formal relations of signs to one another
(syntax)”. Therefore, semantics also looks at the ways in which the meanings of words can be related to each
other.
What is Pragmatics?
Pragmatics is another branch of linguistics. Similar to semantics, pragmatics also studies the meanings of words,
but it places emphasis on their context. In other words, pragmatics is “the study of the use of linguistic signs,
words, and sentences, in actual situations.”
Thus, it looks beyond the literal meaning of an utterance or a sentence, considering how the context shapes the
meaning that is constructed, as well as the implied meanings.
Therefore, unlike semantics, pragmatics concerns the context of particular words and how that context
impacts their meaning.
UNIT-V
What is an Expert System?
An expert system is a computer program that is designed to solve complex problems and
to provide decision-making ability like a human expert. It performs this by extracting
knowledge from its knowledge base using the reasoning and inference rules according to
the user queries.
The expert system is a part of AI, and the first ES was developed in the year 1970, which
was one of the first successful approaches in artificial intelligence. It solves the most complex issues
as an expert would, by drawing on the knowledge stored in its knowledge base. The system helps
in decision making for complex problems using both facts and heuristics, like a human
expert. It is called an expert system because it contains the expert knowledge of a specific domain and
can solve any complex problem of that particular domain. These systems are designed for
a specific domain, such as medicine, science, etc.
The performance of an expert system is based on the expert's knowledge stored in its
knowledge base. The more knowledge stored in the KB, the more that system improves
its performance. One of the common examples of an ES is a suggestion of spelling errors
while typing in the Google search box.
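To make the knowledge-base-plus-inference-rules idea concrete, here is a minimal, hypothetical forward-chaining sketch in Python; the facts and rules are invented for the example and are far simpler than a real expert system:

# Knowledge base: known facts plus if-then rules (premises -> conclusion).
facts = {"fever", "cough"}
rules = [
    ({"fever", "cough"}, "flu_suspected"),
    ({"flu_suspected"}, "recommend_rest"),
]

def forward_chain(facts, rules):
    """Repeatedly fire any rule whose premises are all known facts."""
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if premises <= derived and conclusion not in derived:
                derived.add(conclusion)  # the rule's conclusion becomes a new fact
                changed = True
    return derived

print(forward_chain(facts, rules))
# {'fever', 'cough', 'flu_suspected', 'recommend_rest'}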
Machine learning algorithms are commonly divided into the following categories:
• Supervised Learning
• Unsupervised Learning
• Semi-supervised Learning
• Reinforcement Learning
Of these, there are three main ways in which a machine can actually learn: Supervised Learning, Unsupervised Learning and Reinforcement Learning, described below.
• Supervised Learning: The machine has a “teacher” who guides it by providing sample inputs along
with the desired outputs. The machine then learns to map the inputs to the outputs. This is similar to how we
teach very young children with picture books. According to Yann LeCun, all of the AI machines we have
today have used this form of learning (from speech recognition to self-driving cars).
• Reinforcement Learning: Yann LeCun believes this plays a relatively minor role in training AI and is
similar to training an animal: when the animal displays a desired behavior, it is given a reward.
According to the Wikipedia entry on machine learning, in reinforcement learning “a
computer program interacts with a dynamic environment in which it must perform a certain goal (such
as driving a vehicle), without a teacher explicitly telling it whether it has come close to its goal.”
• Unsupervised Learning: This is the most important and most difficult type of learning, and would be
better titled Predictive Learning. In this case the machine is not given any labels for its inputs and needs
to “figure out” the structure on its own. This is similar to how babies learn early in life; for example,
they learn that if an object in space is not supported, it will fall.
However, the most commonly used ones are supervised and unsupervised learning.
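As a hedged sketch of the supervised workflow just described (assuming scikit-learn is installed; the toy data is invented for the example):

from sklearn.linear_model import LinearRegression

# Toy labelled data: sample inputs X with the desired outputs y (the "teacher").
X = [[1], [2], [3], [4]]   # e.g. hours studied
y = [2, 4, 6, 8]           # e.g. marks obtained

model = LinearRegression()
model.fit(X, y)            # the machine learns to map inputs to outputs

print(model.predict([[5]]))  # roughly [10.]: the learned mapping generalizes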