
Unit 1aiml


Artificial Intelligence (AI) is the way of making computing hardware and software think
intelligently, in a manner similar to how humans use natural intelligence. Alternatively, AI is a
method of teaching a computer, a computer-controlled robot, or software to think critically and
creatively like a human mind.
According to John McCarthy, known as the father of Artificial Intelligence, AI is “the science
and engineering of making intelligent machines, especially intelligent computer programs”.

Artificial Intelligence Techniques


Artificial Intelligence (AI) techniques are revolutionizing the way humans interact with technology.
AI refers to the development of computer systems that can perform tasks that typically require human
intelligence, such as visual perception, speech recognition, decision-making, and language translation.
These systems are designed to learn from experience and improve their performance over time.
The main Artificial Intelligence techniques are discussed below.

Machine Learning (ML)


Machine Learning is a subset of AI that uses statistical methods to enable machines to learn from
data. It involves the creation of algorithms that can identify patterns, make predictions, and improve
their performance over time without explicit programming. Some popular ML methods include:

 Supervised Learning: The algorithm is trained on a labeled dataset, where the input-output
pairs are provided. The algorithm learns the relationship between input and output and applies
this knowledge to unseen data. Examples: Linear Regression, Support Vector Machines
(SVM), and Neural Networks.
 Unsupervised Learning: The algorithm is provided with an unlabeled dataset, and it
identifies patterns or structures in the data without guidance. Examples: Clustering (e.g.,
K-means), Dimensionality Reduction (e.g., Principal Component Analysis), and Association
Rule Learning.
 Reinforcement Learning: The algorithm learns from its actions and interactions with an
environment to maximize a reward signal. It’s particularly useful in decision-making and
control tasks. Examples: Q-learning, Deep Q-Network (DQN), and Policy Gradient methods.
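
As a rough illustration of supervised learning, the sketch below fits a straight line to a small labeled dataset by ordinary least squares and then predicts the output for an unseen input. The data values and the names fit_line and predict are assumptions made up for this example; they do not come from the notes.

```python
# Hypothetical labeled dataset: each input x is paired with a known output y.
# The values follow y = 2x + 1 so the learned relationship is easy to check.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [3.0, 5.0, 7.0, 9.0, 11.0]


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance-like and variance-like sums (the common 1/n factor cancels).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = fit_line(xs, ys)


def predict(x):
    """Apply the learned relationship to an input that was not in the training data."""
    return slope * x + intercept


print(slope, intercept, predict(6.0))
```

The training step only sees the input-output pairs; the prediction for x = 6 is an application of the learned relationship to unseen data, which is the defining feature of supervised learning described above.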

NLP (Natural Language Processing)


Natural Language Processing involves programming computers to process human languages to
facilitate interactions between humans and computers.
Machine Learning is a reliable technology for extracting meaning from human languages. In a
voice-based NLP system, the machine captures the audio of a person speaking and converts it to
text. The text is processed, and the response is converted back into audio, which the machine
uses to reply to the person.
Applications of Natural Language Processing can be found in IVR (Interactive Voice Response)
applications used in call centers, language translation applications like Google Translate, and word
processors such as Microsoft Word to check the accuracy of grammar in text.
However, the nature of human languages makes Natural Language Processing difficult, because
the rules involved in passing information using natural language are challenging for computers
to understand. NLP leverages algorithms to recognize and abstract the rules of natural languages,
converting unstructured human language data into a computer-understandable format. NLP
can also be found in content optimization, such as paraphrasing applications, which help to
improve the readability of complex text.
Automation and Robotics

Automation aims to improve productivity and efficiency by having machines perform monotonous
and repetitive tasks, resulting in cost-effective outcomes.

Many organizations use machine learning, neural networks, and graphs in automation.

Using CAPTCHA technology, such automation can prevent fraud issues during online financial
transactions. Programmers create robotic process automation to perform high-volume repetitive tasks
that can adapt to changes in different circumstances.

Machine Vision
Machines can capture visual information and then analyze it.

This process involves using cameras to capture visual information, converting the analog image to
digital data, and processing the data through digital signal processing. Then the resulting data is fed to
a computer.
In machine vision, two vital aspects are sensitivity, the ability to perceive weak impulses, and
resolution, the extent to which the machine can distinguish objects.

The usage of machine vision can be found in signature identification, pattern recognition, medical
image analysis, etc.

Deep Learning

Deep Learning takes ML to a higher level by employing neural networks with multiple layers to
learn complex data representations. It has propelled AI achievements such as beating human
champions at the game of Go and enhancing image and speech recognition systems.

Types of Artificial Intelligence:

Artificial Intelligence can be divided into various types. There are two main
categorizations: one based on the capabilities of AI and one based on its functionality. The
following flow diagram explains the types of AI.

AI type-1: Based on Capabilities

1. Weak AI or Narrow AI:


 Narrow AI is a type of AI which is able to perform a dedicated task with intelligence. It is the
most common and currently available form of AI in the world of Artificial Intelligence.
 Narrow AI cannot perform beyond its field or limitations, as it is only trained for one specific
task. Hence it is also termed weak AI. Narrow AI can fail in unpredictable ways if it goes
beyond its limits.
 Apple's Siri is a good example of Narrow AI; it operates within a limited, pre-defined range of
functions.
 IBM's Watson supercomputer also comes under Narrow AI, as it uses an expert system
approach combined with machine learning and natural language processing.
 Some examples of Narrow AI are playing chess, purchase suggestions on e-commerce sites,
self-driving cars, speech recognition, and image recognition.
2. General AI:
 General AI is a type of intelligence which could perform any intellectual task as efficiently
as a human.
 The idea behind general AI is to make a system which could be smarter and think like a
human on its own.
 Currently, no system exists which could come under general AI and perform
any task as well as a human.
 Researchers worldwide are now focused on developing machines with General AI.
 Systems with general AI are still under research, and it will take a lot of effort and time to
develop such systems.
3. Super AI:
 Super AI is a level of intelligence of systems at which machines could surpass human
intelligence and perform any task better than a human, with cognitive properties. It is an
outcome of general AI.
 Some key characteristics of strong AI include the ability to think, reason, solve puzzles,
make judgments, plan, learn, and communicate on its own.
 Super AI is still a hypothetical concept of Artificial Intelligence. Developing such
systems in reality would be a world-changing task.

Artificial Intelligence type-2: Based on functionality

1. Reactive Machines

 Purely reactive machines are the most basic types of Artificial Intelligence.
 Such AI systems do not store memories or past experiences for future actions.
 These machines focus only on the current scenario and react to it with the best possible action.
 IBM's Deep Blue system is an example of reactive machines.
 Google's AlphaGo is also an example of reactive machines.

2. Limited Memory

 Limited memory machines can store past experiences or some data for a short period of time.
 These machines can use stored data for a limited time period only.
 Self-driving cars are one of the best examples of Limited Memory systems. These cars can
store recent speed of nearby cars, the distance of other cars, speed limit, and other information
to navigate the road.

3. Theory of Mind

 Theory of Mind AI should understand human emotions, people, and beliefs, and be able to
interact socially like humans.
 This type of AI machine has not yet been developed, but researchers are making considerable
efforts to develop such machines.
4. Self-Awareness

 Self-awareness AI is the future of Artificial Intelligence. These machines will be super
intelligent, and will have their own consciousness, sentiments, and self-awareness.
 These machines will be smarter than the human mind.
 Self-Awareness AI does not yet exist in reality; it is a hypothetical concept.

Knowledge Representation and Reasoning (KR, KRR)

It represents information from the real world for a computer to understand and then utilize this
knowledge to solve complex real-life problems like communicating with human beings in natural
language.

Knowledge representation in AI is not just about storing data in a database; it also allows a
machine to learn from that knowledge and behave intelligently like a human being.

Following are the kinds of knowledge which need to be represented in AI systems:

o Object: All the facts about objects in our world domain. E.g., guitars contain strings,
trumpets are brass instruments.
o Events: Events are the actions which occur in our world.
o Performance: It describes behavior which involves knowledge about how to do things.
o Meta-knowledge: It is knowledge about what we know.
o Facts: Facts are the truths about the real world and what we represent.
o Knowledge-Base: The central component of knowledge-based agents is the knowledge
base, represented as KB. The knowledge base is a group of sentences (here, "sentence"
is used as a technical term and is not identical to a sentence in the English language).

Types of knowledge

Declarative Knowledge:
 Declarative knowledge is knowing about something.
 It includes concepts, facts, and objects.
 It is also called descriptive knowledge and is expressed in declarative sentences.
 It is simpler than procedural knowledge.
Procedural Knowledge
 It is also known as imperative knowledge.
 Procedural knowledge is a type of knowledge which is responsible for knowing how to do
something.
 It can be directly applied to any task.
 It includes rules, strategies, procedures, agendas, etc.
 Procedural knowledge depends on the task on which it can be applied.
Meta-knowledge:

Knowledge about the other types of knowledge is called Meta-knowledge.


Heuristic knowledge:
 Heuristic knowledge represents the knowledge of experts in a field or subject.
 Heuristic knowledge consists of rules of thumb based on previous experience and awareness of
approaches, which tend to work well but are not guaranteed.
Structural knowledge:
 Structural knowledge is basic knowledge for problem-solving.
 It describes relationships between various concepts, such as kind of, part of, and grouping of
something.
 It describes the relationship that exists between concepts or objects.

AI knowledge cycle:
An Artificial Intelligence system has the following components for displaying intelligent behavior:

The above diagram shows how an AI system can interact with the real world and which
components help it to show intelligence.
 An AI system has a Perception component by which it retrieves information from its environment.
This can be visual, audio, or another form of sensory input.
 The Learning component is responsible for learning from data captured by the Perception
component.
 In the complete cycle, the main components are Knowledge Representation and Reasoning.
These two components are involved in showing human-like intelligence in machines.
They are independent of each other but also coupled together.
 Planning and execution depend on the analysis of Knowledge Representation and Reasoning.

There are four main ways of knowledge representation, which are given as follows:

 Logical Representation
 Semantic Network Representation
 Frame Representation
 Production Rules

1. Logical Representation

Logical representation is a language with some concrete rules which deals
with propositions and has no ambiguity in representation. Logical representation means drawing a
conclusion based on various conditions.

Logical representation can be categorised into mainly two logics:


 Propositional Logics
 Predicate logics
Propositional logic (PL) is the simplest form of logic where all the statements are made by
propositions. A proposition is a declarative statement which is either true or false.
Example:
a) It is Sunday.
b) The Sun rises in the West. (False proposition)
c) 3 + 3 = 7. (False proposition)
d) 5 is a prime number.

 Propositional logic consists of proposition symbols and logical connectives.
 These connectives are also called logical operators.
 A propositional formula which is always true is called a tautology; it is also called a valid
sentence.
 A propositional formula which is always false is called a contradiction.

Syntax of propositional logic:

The syntax of propositional logic defines the allowable sentences for the knowledge representation.
There are two types of Propositions:
Atomic proposition: Atomic propositions are simple propositions. Each consists of a single
proposition symbol. These are sentences which must be either true or false.
Example:
a) "2 + 2 is 4" is an atomic proposition, as it is a true fact.
b) "The Sun is cold" is also an atomic proposition, as it is a false fact.

Compound proposition: Compound propositions are constructed by combining simpler or atomic
propositions, using parentheses and logical connectives.
Example:
a) "It is raining today, and the street is wet."
b) "Ankit is a doctor, and his clinic is in Mumbai."

Logical Connectives:
Logical connectives are used to connect two simpler propositions or to represent a sentence logically.
We can create compound propositions with the help of logical connectives. There are mainly five
connectives, which are given as follows:

Negation: A sentence such as ¬P is called the negation of P.

Conjunction: A sentence which has the ∧ connective, such as P ∧ Q, is called a conjunction.
Example: Rohan is intelligent and hardworking. It can be written as
P = Rohan is intelligent,
Q = Rohan is hardworking, so P ∧ Q.
Disjunction: A sentence which has the ∨ connective, such as P ∨ Q, is called a disjunction.
Example: "Ritika is a doctor or an engineer."
Here P = Ritika is a doctor and Q = Ritika is an engineer, so we can write it as P ∨ Q.
Implication: A sentence such as P → Q is called an implication.
Implications are also known as if-then rules. Example:
If it is raining, then the street is wet.
Let P = It is raining and Q = The street is wet, so it is represented as P → Q.
Biconditional: A sentence such as P ⇔ Q is a biconditional sentence.
Example: I am breathing if and only if I am alive.
P = I am breathing, Q = I am alive, so it can be represented as P ⇔ Q.
Truth Table:
In propositional logic, we need to know the truth values of propositions in all possible scenarios. We
can evaluate each logical connective for every possible combination of truth values, and the
representation of these combinations in a tabular format is called a truth table. The truth tables for
all logical connectives are as follows:

Properties of Operators:

Commutativity:
 P∧ Q= Q ∧ P, or
 P ∨ Q = Q ∨ P.
Associativity:
 (P ∧ Q) ∧ R= P ∧ (Q ∧ R),
 (P ∨ Q) ∨ R= P ∨ (Q ∨ R)
Identity element:
 P ∧ True = P,
 P ∨ False = P.
Distributive:
 P∧ (Q ∨ R) = (P ∧ Q) ∨ (P ∧ R).
 P ∨ (Q ∧ R) = (P ∨ Q) ∧ (P ∨ R).
De Morgan's Laws:
 ¬ (P ∧ Q) = (¬P) ∨ (¬Q)
 ¬ (P ∨ Q) = (¬ P) ∧ (¬Q).
Double-negation elimination:
 ¬ (¬P) = P.
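
Because a propositional formula has only finitely many truth assignments, equivalences like these can be checked by brute force. The sketch below tests several of the listed properties over all eight assignments of P, Q, and R; the dictionary name laws and its labels are illustrative only.

```python
from itertools import product

# Each entry maps a label to a function that returns True when both sides
# of the equivalence agree for the given truth values of P, Q, and R.
laws = {
    "commutativity (and)": lambda p, q, r: (p and q) == (q and p),
    "commutativity (or)": lambda p, q, r: (p or q) == (q or p),
    "associativity (and)": lambda p, q, r: ((p and q) and r) == (p and (q and r)),
    "associativity (or)": lambda p, q, r: ((p or q) or r) == (p or (q or r)),
    "distributivity (and over or)": lambda p, q, r: (p and (q or r)) == ((p and q) or (p and r)),
    "distributivity (or over and)": lambda p, q, r: (p or (q and r)) == ((p or q) and (p or r)),
    "De Morgan (and)": lambda p, q, r: (not (p and q)) == ((not p) or (not q)),
    "De Morgan (or)": lambda p, q, r: (not (p or q)) == ((not p) and (not q)),
    "double negation": lambda p, q, r: (not (not p)) == p,
}

# An equivalence holds if both sides agree under every assignment.
results = {
    name: all(law(p, q, r) for p, q, r in product([True, False], repeat=3))
    for name, law in laws.items()
}

for name, holds in results.items():
    print(f"{name}: {holds}")
```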

Limitations of Propositional logic:

 We cannot represent quantified statements such as all, some, or none with propositional logic. Example:
All the girls are intelligent.
Some apples are sweet.
 Propositional logic has limited expressive power.
 In propositional logic, we cannot describe statements in terms of their properties or logical
relationships.

Predicate logic

First-order logic is also known as Predicate logic or First-order predicate logic. First-order logic is
a powerful language that expresses information about objects more easily and can also
express the relationships between those objects.

First-order logic (like natural language) does not only assume that the world contains facts, as
propositional logic does, but also assumes the following things in the world:
Objects: A, B, people, numbers, colors, wars, theories, squares, pits, wumpus, ......
Relations: These can be unary relations such as red, round, or is adjacent, or n-ary relations such as
the sister of, brother of, has color, or comes between.
Functions: father of, best friend, third inning of, end of, ......

As a natural language, first-order logic also has two main parts:

 Syntax
 Semantics

Syntax of First-Order logic:

The syntax of FOL determines which collection of symbols is a logical expression in first-order logic.
The basic syntactic elements of first-order logic are symbols. We write statements in short-hand
notation in FOL.
Basic Elements of First-order logic:
Constants: 1, 2, A, John, Mumbai, cat, ....
Variables: x, y, z, a, b, ....
Predicates: Brother, Father, >, ....
Functions: sqrt, LeftLegOf, ....
Connectives: ∧, ∨, ¬, ⇒, ⇔
Equality: ==
Quantifiers: ∀, ∃

Atomic sentences:

Atomic sentences are the most basic sentences of first-order logic. These sentences are formed from a
predicate symbol followed by a parenthesis with a sequence of terms.
We can represent atomic sentences as Predicate (term1, term2, ......, term n).
Example: Ravi and Ajay are brothers: => Brothers(Ravi, Ajay).
Chinky is a cat: => cat (Chinky).

Complex Sentences:
Complex sentences are made by combining atomic sentences using connectives.
First-order logic statements can be divided into two parts:
Subject: Subject is the main part of the statement.
Predicate: A predicate can be defined as a relation, which binds two atoms together in a statement.

Consider the statement: "x is an integer.", it consists of two parts, the first part x is the subject of
the statement and second part "is an integer," is known as a predicate.

Quantifiers in First-order logic:


A quantifier is a language element which generates quantification, and quantification specifies the
quantity of specimens in the universe of discourse.
Quantifiers are the symbols that determine or identify the range and scope of a variable in a
logical expression. There are two types of quantifier:

 Universal quantifier (for all, everyone, everything)
 Existential quantifier (for some, at least one)

Universal Quantifier:
The universal quantifier is a symbol of logical representation which specifies that the statement
within its range is true for everything or every instance of a particular thing.
The universal quantifier is represented by the symbol ∀, which resembles an inverted A.
With the universal quantifier we use implication "→".

If x is a variable, then ∀x is read as:

 For all x
 For each x
 For every x

Example:
All men drink coffee.
∀x man(x) → drink(x, coffee).

Existential Quantifier:

Existential quantifiers express that the statement within their scope is
true for at least one instance of something.
The existential quantifier is denoted by the logical operator ∃, which resembles an inverted E. When
it is used with a predicate variable, it is called an existential quantifier.

Note: With the existential quantifier we always use the AND or conjunction symbol (∧).

If x is a variable, then the existential quantifier will be ∃x or ∃(x), and it will be read as:
 There exists an x.
 For some x.
 For at least one x.

Example:
Some boys are intelligent.
∃x: boys(x) ∧ intelligent(x)
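
Over a finite universe of discourse, the two quantifiers correspond to Python's built-in all() and any(). The sketch below evaluates both example sentences on a small, made-up universe; the individuals, their attributes, and the predicate function names are assumptions for illustration only. It also shows why implication, not conjunction, is paired with the universal quantifier.

```python
# Hypothetical finite universe of discourse; every entry is invented for illustration.
universe = [
    {"name": "Ravi", "man": True, "boy": False, "intelligent": False, "drinks": {"coffee"}},
    {"name": "Ajay", "man": True, "boy": False, "intelligent": True, "drinks": {"coffee", "tea"}},
    {"name": "Rohan", "man": False, "boy": True, "intelligent": True, "drinks": {"milk"}},
]


def man(x):
    return x["man"]


def boy(x):
    return x["boy"]


def intelligent(x):
    return x["intelligent"]


def drink(x, beverage):
    return beverage in x["drinks"]


# "All men drink coffee": for every x, man(x) -> drink(x, coffee).
# The implication A -> B is written as (not A) or B.
all_men_drink_coffee = all((not man(x)) or drink(x, "coffee") for x in universe)

# "Some boys are intelligent": there exists x with boy(x) and intelligent(x).
some_boys_intelligent = any(boy(x) and intelligent(x) for x in universe)

# Using conjunction with the universal quantifier would wrongly claim that
# everything in the universe is a man who drinks coffee.
wrong_universal = all(man(x) and drink(x, "coffee") for x in universe)

print(all_men_drink_coffee, some_boys_intelligent, wrong_universal)
```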

Advantages of logical representation:


 Logical representation enables us to do logical reasoning.
 Logical representation is the basis for the programming languages.
Disadvantages of logical Representation:
 Logical representations have some restrictions and are challenging to work with.
 Logical representation technique may not be very natural, and inference may not be so
efficient.

2. Semantic Network Representation


 Semantic networks are an alternative to predicate logic for knowledge representation.
 In semantic networks, we can represent our knowledge in the form of graphical networks.
 This network consists of nodes representing objects and arcs which describe the relationships
between those objects.
 Semantic networks can categorize objects in different forms and can also link those
objects. Semantic networks are easy to understand and can be easily extended.
 This representation consists of mainly two types of relations:
IS-A relation (inheritance)
Kind-of relation
Statements:
a. Jerry is a cat.
b. Jerry is a mammal.
c. Jerry is owned by Priya.
d. Jerry is brown colored.
e. All mammals are animals.

In the above diagram, we have represented different types of knowledge in the form of nodes and
arcs. Each object is connected with another object by some relation.
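
The statements a to e can be stored as labeled arcs between nodes, and questions can then be answered by following the arcs. The sketch below is one possible encoding: the relation labels ("is-a", "owned-by", "has-color") and the function is_a are naming assumptions, since the notes do not prescribe link names.

```python
# Each arc is (source node, relation label, target node), taken from statements a-e.
edges = [
    ("Jerry", "is-a", "Cat"),
    ("Jerry", "is-a", "Mammal"),
    ("Jerry", "owned-by", "Priya"),
    ("Jerry", "has-color", "Brown"),
    ("Mammal", "is-a", "Animal"),
]


def is_a(node, category):
    """Follow is-a arcs transitively to decide whether node belongs to category."""
    stack, seen = [node], set()
    while stack:
        current = stack.pop()
        if current == category:
            return True
        if current in seen:
            continue
        seen.add(current)
        # Push every node reachable from the current one through an is-a arc.
        stack.extend(t for s, rel, t in edges if s == current and rel == "is-a")
    return False


print(is_a("Jerry", "Animal"))   # inherited through Mammal
print(is_a("Priya", "Animal"))   # no is-a path exists in this network
```

Answering a question means traversing arcs, which is why lookup cost grows with the size of the network.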

Drawbacks in Semantic representation:


 Semantic networks take more computational time at runtime, as we need to traverse
the complete network to answer some questions. In the worst case, after traversing
the entire network, we may find that the solution does
not exist in this network.
 Semantic networks try to model human-like memory (which has about 10^15 neurons and
links) to store information, but in practice it is not possible to build such a vast
semantic network.
 Semantic networks do not have any standard definition for the link names.
Advantages of Semantic network:
 Semantic networks are a natural representation of knowledge.
 Semantic networks convey meaning in a transparent manner.
 These networks are simple and easily understandable.

3. Frame Representation

1. A frame is a record-like structure which consists of a collection of attributes and their values to
describe an entity in the world.
2. Frames are an AI data structure which divides knowledge into substructures by representing
stereotyped situations.
3. A frame consists of a collection of slots and slot values. These slots may be of any type and size.
Slots have names and values, which are called facets.
4. Facets: The various aspects of a slot are known as facets. Facets are features of frames which
enable us to put constraints on the frames. Example: IF-NEEDED facets are called when the data
of a particular slot is needed. A frame may consist of any number of slots, a slot may
include any number of facets, and a facet may have any number of values. A frame is also
known as slot-filler knowledge representation in artificial intelligence.

Frames are derived from semantic networks and later evolved into our modern-day classes and
objects. A single frame is not very useful on its own. A frame system consists of a collection of
connected frames. In a frame, knowledge about an object or event can be stored together in the
knowledge base. Frames are widely used in various applications, including natural
language processing and machine vision.
Example

Let's suppose we are taking an entity, Peter. Peter is an engineer by profession, his age is 25, he
lives in the city of London, and the country is England. The following is the frame representation
for this:

Slots | Fillers
Name | Peter
Profession | Engineer
Age | 25
Marital status | Single
Weight | 78
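
A frame maps naturally onto a nested dictionary of slots and facets. The sketch below encodes the Peter example from the prose above; the parent frame person, the facet names "value" and "if_needed", the address slot, and the get_slot helper are all assumptions introduced for illustration, not a standard frame API.

```python
def get_slot(frame, slot):
    """Look up a slot, falling back to parent frames for inherited defaults."""
    current = frame
    while current is not None:
        facets = current["slots"].get(slot)
        if facets is not None:
            if "value" in facets:
                return facets["value"]
            if "if_needed" in facets:
                # IF-NEEDED facet: compute the value only when the slot is read.
                return facets["if_needed"](frame)
        current = current["parent"]
    return None


# Hypothetical generic frame that supplies a default slot value.
person = {"parent": None, "slots": {"species": {"value": "human"}}}

# Frame for the entity described in the text.
peter = {
    "parent": person,
    "slots": {
        "name": {"value": "Peter"},
        "profession": {"value": "engineer"},
        "age": {"value": 25},
        "city": {"value": "London"},
        "country": {"value": "England"},
        "address": {
            "if_needed": lambda f: f"{get_slot(f, 'city')}, {get_slot(f, 'country')}"
        },
    },
}

print(get_slot(peter, "profession"))  # stored directly in the frame
print(get_slot(peter, "species"))     # inherited from the parent frame
print(get_slot(peter, "address"))     # computed by the IF-NEEDED facet
```

The parent link is what makes a collection of frames more useful than a single frame: missing slots are filled from a more general frame, in the same way that classes inherit attributes.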

Advantages of frame representation:

1. The frame knowledge representation makes the programming easier by grouping the
related data.
2. The frame representation is comparably flexible and used by many applications in AI.
3. It is very easy to add slots for new attribute and relations.
4. It is easy to include default data and to search for missing values.
5. Frame representation is easy to understand and visualize.

Disadvantages of frame representation:

1. In a frame system, the inference mechanism is not easily processed.
2. Inference cannot proceed smoothly with frame representation.
3. Frame representation has a very generalized approach.

4. Production Rules

A production rules system consists of (condition, action) pairs, which mean "If condition then
action". It has mainly three parts:

o The set of production rules
o Working memory
o The recognize-act cycle

Example:

o IF (at bus stop AND bus arrives) THEN action (get into the bus)
o IF (on the bus AND paid AND empty seat) THEN action (sit down).
o IF (on bus AND unpaid) THEN action (pay charges).
o IF (bus arrives at destination) THEN action (get down from the bus).
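
The three parts can be sketched in a few lines: rules are condition/action pairs, working memory is a set of facts, and the recognize-act cycle repeatedly fires the first rule whose conditions are all present. The "add" and "remove" effects on working memory, the initial facts, and the function name run are assumptions added so the cycle can make progress; the notes only give the IF and THEN parts.

```python
# Production rules from the bus example. The "add"/"remove" effects are assumed.
rules = [
    {"if": {"at bus stop", "bus arrives"}, "then": "get into the bus",
     "add": {"on bus", "unpaid"}, "remove": {"at bus stop", "bus arrives"}},
    {"if": {"on bus", "paid", "empty seat"}, "then": "sit down",
     "add": {"seated"}, "remove": {"empty seat"}},
    {"if": {"on bus", "unpaid"}, "then": "pay charges",
     "add": {"paid"}, "remove": {"unpaid"}},
    {"if": {"bus arrives at destination"}, "then": "get down from the bus",
     "add": set(), "remove": {"on bus", "bus arrives at destination"}},
]


def run(rules, memory, max_cycles=10):
    """Recognize-act cycle: match rules against working memory, fire one, repeat."""
    memory = set(memory)
    fired = []
    for _ in range(max_cycles):
        # Recognize: pick the first rule whose conditions are all in working memory.
        match = next((r for r in rules if r["if"] <= memory), None)
        if match is None:
            break
        # Act: record the action and update working memory.
        fired.append(match["then"])
        memory -= match["remove"]
        memory |= match["add"]
    return fired, memory


actions, final_memory = run(rules, {"at bus stop", "bus arrives", "empty seat"})
print(actions)
print(sorted(final_memory))
```

Choosing the first matching rule is the simplest conflict-resolution strategy; real production systems may use priorities or recency instead.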

Advantages of Production rule:

1. The production rules are expressed in natural language.


2. The production rules are highly modular, so we can easily remove, add or modify an
individual rule.

Disadvantages of Production rule:

1. Production rule system does not exhibit any learning capabilities, as it does not store
the result of the problem for the future uses.
2. During the execution of the program, many rules may be active hence rule-based
production systems are inefficient.

Procedural VS Declarative Knowledge

S.NO | Procedural Knowledge | Declarative Knowledge
1. | It is also known as interpretive knowledge. | It is also known as descriptive knowledge.
2. | Procedural knowledge means how a particular thing can be accomplished. | Declarative knowledge means basic knowledge about something.
3. | Procedural knowledge is generally not used, meaning it is less popular. | Declarative knowledge is more popular.
4. | Procedural knowledge can't be easily communicated. | Declarative knowledge can be easily communicated.
5. | Procedural knowledge is generally process-oriented in nature. | Declarative knowledge is data-oriented in nature.
6. | In procedural knowledge, debugging and validation are not easy. | In declarative knowledge, debugging and validation are easy.
7. | Procedural knowledge is less effective in competitive programming. | Declarative knowledge is more effective in competitive programming.

Applications of Artificial Intelligence

1. AI in Astronomy

Artificial Intelligence can be very useful for solving complex problems about the universe. AI
technology can help us understand the universe, such as how it works and its origin.

2. AI in Healthcare

o In the last five to ten years, AI has become more advantageous for the healthcare industry and
is going to have a significant impact on it.
o Healthcare industries are applying AI to make better and faster diagnoses than humans. AI
can help doctors with diagnoses and can alert them when patients are worsening, so that medical
help can reach the patient before hospitalization.

3. AI in Gaming

o AI can be used for gaming purposes. AI machines can play strategic games like chess,
where the machine needs to think about a large number of possible positions.

4. AI in Finance

o AI and the finance industry are a good match for each other. The finance industry is
implementing automation, chatbots, adaptive intelligence, algorithmic trading, and machine
learning in financial processes.

5. AI in Data Security

o The security of data is crucial for every company, and cyber-attacks are growing very rapidly
in the digital world. AI can be used to make your data safer and more secure. Examples
such as AEG bot and the AI2 Platform are used to detect software bugs and cyber-attacks
more effectively.

6. AI in Social Media
o Social media sites such as Facebook, Twitter, and Snapchat contain billions of user profiles,
which need to be stored and managed in a very efficient way. AI can organize and manage
massive amounts of data. AI can analyze lots of data to identify the latest trends, hashtags, and
requirements of different users.

7. AI in Travel & Transport

o AI is in high demand in the travel industry. AI is capable of doing various travel-related
tasks, from making travel arrangements to suggesting hotels, flights, and the
best routes to customers. Travel companies are using AI-powered chatbots which can hold
human-like conversations with customers for better and faster responses.

8. AI in Automotive Industry

o Some automotive companies are using AI to provide virtual assistants to their users for better
performance. For example, Tesla has introduced TeslaBot, an intelligent virtual assistant.
o Various companies are currently working on developing self-driving cars which can make your
journey safer and more secure.

9. AI in Robotics:

o Artificial Intelligence has a remarkable role in robotics. Usually, general robots are
programmed so that they can perform some repetitive task, but with the help of AI, we can
create intelligent robots which can perform tasks based on their own experience without being
pre-programmed.
o Humanoid robots are the best examples of AI in robotics. Recently, intelligent humanoid
robots named Erica and Sophia have been developed which can talk and behave like humans.

10. AI in Entertainment

o We are currently using some AI based applications in our daily life with some entertainment
services such as Netflix or Amazon. With the help of ML/AI algorithms, these services show
the recommendations for programs or shows.

11. AI in Agriculture

o Agriculture is an area which requires various resources, labor, money, and time for the best results.
Nowadays agriculture is becoming digital, and AI is emerging in this field. Agriculture is
applying AI in the form of agricultural robotics, soil and crop monitoring, and predictive analysis.
AI in agriculture can be very helpful for farmers.

12. AI in E-commerce

o AI is providing a competitive edge to the e-commerce industry, and it is increasingly in
demand in the e-commerce business. AI is helping shoppers to discover associated
products with a recommended size, color, or even brand.

13. AI in education:

o AI can automate grading so that the tutor has more time to teach. An AI chatbot can
communicate with students as a teaching assistant.
o In the future, AI could work as a personal virtual tutor for students, which will be easily
accessible at any time and any place.

…………………………………………………………………………………………………
…………………………………………………

Statistics is a core component of data analytics and machine learning. It helps you analyze
and visualize data to find unseen patterns.

What Is Statistics?
Statistics is a branch of mathematics that deals with collecting, analyzing, interpreting, and visualizing
empirical data.

Descriptive statistics and inferential statistics are the two major areas of statistics.

Descriptive statistics are for describing the properties of sample and population data (what has
happened).

Inferential statistics use those properties to test hypotheses, reach conclusions, and make predictions
(what can you expect).
Use of Statistics in Machine Learning

 Asking questions about the data
 Cleaning and preprocessing the data
 Selecting the right features
 Model evaluation
 Model prediction

With this basic understanding, it’s time to dive deep into the crucial concepts
related to statistics for machine learning.

Population and Sample

Population: In statistics, the population comprises all observations (data points) about the subject
under study.
Sample: In statistics, a sample is a subset of the population. It is a small portion of the total observed
population.

Measures of Central Tendency

Measures of central tendency are the measures that are used to describe the distribution of
data using a single value. Mean, Median and Mode are the three measures of central
tendency.

Mean: The arithmetic mean is the average of all the data points. If there are n
observations and x_i is the i-th observation, then the mean is:

\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i

Consider the data frame below that has the names of seven employees and their salaries.
To find the mean or the average salary of the employees, you can use the mean() function in
Python.

Median: The median is the middle value that divides the data into two equal parts once the data
is sorted in ascending order.

 If the total number of data points (n) is odd, the median is the value at position (n+1)/2.
 When the total number of observations (n) is even, the median is the average value of
observations at n/2 and (n+2)/2 positions.

The median() function in Python can help you find the median value of a column. From the
above data frame, you can find the median salary as:

Mode: The mode is the observation (value) that occurs most frequently in the data set. There can
be more than one mode in a dataset.
Given below are the heights of students (in cm) in a class: 155, 157, 160, 159, 162, 160, 161,
165, 160, 158. Mode = 160 cm.
The mode salary from the data frame can be calculated as:
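
The salary data frame is not reproduced in these notes, so the sketch below computes the three measures on the student heights listed above instead, using Python's standard statistics module rather than a data frame library. The variable names are illustrative; with a data frame, the equivalent step would be calling the corresponding method on the salary column.

```python
import statistics

# Student heights in cm, as listed in the notes.
heights = [155, 157, 160, 159, 162, 160, 161, 165, 160, 158]

mean_height = statistics.mean(heights)      # sum of values divided by n
median_height = statistics.median(heights)  # n is even, so average of the two middle values
mode_height = statistics.mode(heights)      # most frequent value
all_modes = statistics.multimode(heights)   # every value tied for most frequent

print(mean_height, median_height, mode_height, all_modes)
```

statistics.multimode is useful when a dataset has more than one mode, since it returns all of them in a list.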

Variance and Standard Deviation

Variance is used to measure the variability in the data from the mean.

Consider the below dataset.

To calculate the variance of the Grade, use the following:

Standard deviation in statistics is the square root of the variance. Variance and standard
deviation represent the measures of fit, meaning how well the mean represents the data.
You can find the standard deviation using the std() function in Python.
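
The Grade dataset is not reproduced in these notes, so the sketch below uses a small hypothetical list of grades. It shows both the population form, which divides the sum of squared deviations by n, and the sample form, which divides by n - 1. Data frame libraries such as pandas use the sample form by default in std(), so it is worth checking which one a given tool reports.

```python
import math
import statistics

# Hypothetical grades, invented for illustration (mean is 80).
grades = [60, 70, 80, 90, 100]

mean_grade = statistics.mean(grades)

# Population variance: average squared deviation from the mean (divide by n).
population_variance = statistics.pvariance(grades)
# Sample variance: divide by n - 1 instead of n.
sample_variance = statistics.variance(grades)

# Standard deviation is the square root of the matching variance.
population_std = statistics.pstdev(grades)
sample_std = statistics.stdev(grades)

# The same population variance computed by hand from the definition.
manual_variance = sum((g - mean_grade) ** 2 for g in grades) / len(grades)

print(population_variance, sample_variance, population_std, sample_std)
```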

Range and Interquartile Range

Range : The Range in statistics is the difference between the maximum and the minimum value of the
dataset.

Interquartile Range (IQR): The IQR is the distance between the 1st quartile (Q1,
25%) and the 3rd quartile (Q3, 75%), that is, IQR = Q3 - Q1.
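
Both measures can be computed with the standard library, here on the student heights listed earlier in these notes. Quartiles can be defined in several slightly different ways; this sketch assumes the "inclusive" interpolation method of statistics.quantiles, and other tools may report slightly different values for Q1 and Q3.

```python
import statistics

# Student heights in cm, reused from the mode example in the notes.
heights = [155, 157, 160, 159, 162, 160, 161, 165, 160, 158]

# Range: difference between the largest and smallest observation.
data_range = max(heights) - min(heights)

# Quartiles: the three cut points that split the sorted data into four parts.
q1, q2, q3 = statistics.quantiles(heights, n=4, method="inclusive")

# Interquartile range: spread of the middle half of the data.
iqr = q3 - q1

print(data_range, q1, q2, q3, iqr)
```

Unlike the range, the IQR ignores the extreme values, so a single unusually tall or short student would change the range but have little effect on the IQR.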

Skewness :
Skewness measures the shape of the distribution. A distribution is symmetrical when the proportion of
data at an equal distance from the mean (or median) is equal. If the values extend to the right, it is
right-skewed, and if the values extend left, it is left-skewed.

Gaussian Distribution

In statistics and probability, the Gaussian (normal) distribution is a popular
continuous probability distribution for a random variable. It is characterized by 2 parameters (mean
μ and standard deviation σ). Many natural phenomena approximately follow a normal distribution,
such as the heights of people and IQ scores.

Properties of Gaussian Distribution:

 The mean, median, and mode are the same
 It has a symmetrical bell shape
 About 68% of the data lies within 1 standard deviation of the mean
 About 95% of the data lies within 2 standard deviations of the mean
 About 99.7% of the data lies within 3 standard deviations of the mean
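
The three percentages can be checked numerically from the cumulative distribution function of a normal distribution. The sketch below uses statistics.NormalDist from the standard library with the standard normal (mean 0, standard deviation 1); the variable names are illustrative, and the same proportions hold for any mean and standard deviation.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
standard_normal = NormalDist(mu=0.0, sigma=1.0)

# Probability mass within k standard deviations of the mean, for k = 1, 2, 3.
within = {
    k: standard_normal.cdf(k) - standard_normal.cdf(-k)
    for k in (1, 2, 3)
}

for k, probability in within.items():
    print(f"within {k} standard deviation(s): {probability:.4f}")
```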
