Naïve Bayes Classifier Algorithm

The document provides information about the Naive Bayes classifier algorithm. It begins by explaining that Naive Bayes is a supervised learning algorithm based on Bayes' theorem used for classification problems. It then discusses that Naive Bayes assumes independence between features. The document proceeds to give an example of how Naive Bayes works using a weather dataset to classify whether to "play" or not based on weather conditions. It concludes by discussing the advantages, disadvantages, applications, and types of Naive Bayes models.

Uploaded by

amir

Naïve Bayes Classifier Algorithm

o The Naïve Bayes algorithm is a supervised learning algorithm based on Bayes'
theorem and used for solving classification problems.
o It is mainly used in text classification, which involves high-dimensional training
datasets.
o The Naïve Bayes classifier is a simple and effective classification algorithm
that helps build fast machine learning models that can make quick
predictions.
o It is a probabilistic classifier, which means it predicts on the basis of the
probability that an object belongs to each class.
o Some popular applications of the Naïve Bayes algorithm are spam filtering,
sentiment analysis, and article classification.

Why is it called Naïve Bayes?

The name Naïve Bayes is made up of two words, Naïve and Bayes, which can be
described as:

o Naïve: It is called Naïve because it assumes that the occurrence of a certain feature
is independent of the occurrence of other features. For example, if a fruit is identified
on the basis of color, shape, and taste, then a red, spherical, and sweet fruit is
recognized as an apple. Each feature individually contributes to identifying it as
an apple, without depending on the others.
o Bayes: It is called Bayes because it depends on the principle of Bayes' theorem.
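
The independence assumption can be illustrated with a minimal sketch: the likelihood of seeing several features together is taken to be the product of the per-feature likelihoods. The fruit names, feature values, and probabilities below are hypothetical numbers chosen for illustration, and equal class priors are assumed so that comparing likelihoods is enough.

```python
from math import prod

# Hypothetical per-feature likelihoods P(feature | class). The numbers are
# made up for illustration and do not come from any dataset.
likelihoods = {
    "apple": {"red": 0.8, "spherical": 0.9, "sweet": 0.7},
    "banana": {"red": 0.05, "spherical": 0.05, "sweet": 0.8},
}


def joint_likelihood(fruit, features):
    """P(features | fruit) under the naive independence assumption:
    the product of the individual P(feature | fruit) terms."""
    return prod(likelihoods[fruit][f] for f in features)


observed = ["red", "spherical", "sweet"]
scores = {fruit: joint_likelihood(fruit, observed) for fruit in likelihoods}

# With equal priors (an assumption), the class with the largest likelihood wins.
best = max(scores, key=scores.get)
print(best, scores)
```

Each feature contributes its own factor, and no factor looks at the value of any other feature, which is exactly what makes the method "naive".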

Bayes' Theorem:
o Bayes' theorem, also known as Bayes' rule or Bayes' law, is used to determine the
probability of a hypothesis given prior knowledge. It depends on conditional
probability.
o The formula for Bayes' theorem is given as:

P(A|B) = P(B|A) * P(A) / P(B)

Where,

P(A|B) is the posterior probability: the probability of hypothesis A given the observed event B.

P(B|A) is the likelihood: the probability of the evidence B given that hypothesis A is true.

P(A) is the prior probability: the probability of the hypothesis before observing the evidence.

P(B) is the marginal probability: the probability of the evidence.
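
The four terms defined above combine in a single line of arithmetic. The sketch below is a small helper, with a hypothetical function name and made-up input values, that computes the posterior from the likelihood, prior, and marginal probability.

```python
def posterior(likelihood, prior, evidence):
    """Return P(A|B) = P(B|A) * P(A) / P(B).

    likelihood is P(B|A), prior is P(A), evidence is P(B).
    """
    if evidence <= 0:
        raise ValueError("P(B) must be positive")
    return likelihood * prior / evidence


# Hypothetical values, for illustration only.
p = posterior(likelihood=0.9, prior=0.2, evidence=0.3)
print(round(p, 2))
```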

Working of Naïve Bayes' Classifier:

The working of the Naïve Bayes classifier can be understood with the help of the example below:

Suppose we have a dataset of weather conditions and a corresponding target variable
"Play". Using this dataset, we need to decide whether or not to play on a
particular day according to the weather conditions. To solve this problem, we
follow the steps below:

1. Convert the given dataset into frequency tables.
2. Generate a likelihood table by finding the probabilities of the given features.
3. Use Bayes' theorem to calculate the posterior probability.

Problem: If the weather is sunny, should the player play or not?

Solution: To solve this, first consider the dataset below:

      Outlook    Play
0     Rainy      Yes
1     Sunny      Yes
2     Overcast   Yes
3     Overcast   Yes
4     Sunny      No
5     Rainy      Yes
6     Sunny      Yes
7     Overcast   Yes
8     Rainy      No
9     Sunny      No
10    Sunny      Yes
11    Rainy      No
12    Overcast   Yes
13    Overcast   Yes
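
Step 1, converting the dataset into frequency counts, can be sketched with the standard library. The rows are copied from the dataset above; the variable names are illustrative choices.

```python
from collections import Counter

# (Outlook, Play) pairs copied from the dataset above, rows 0 to 13.
data = [
    ("Rainy", "Yes"), ("Sunny", "Yes"), ("Overcast", "Yes"), ("Overcast", "Yes"),
    ("Sunny", "No"), ("Rainy", "Yes"), ("Sunny", "Yes"), ("Overcast", "Yes"),
    ("Rainy", "No"), ("Sunny", "No"), ("Sunny", "Yes"), ("Rainy", "No"),
    ("Overcast", "Yes"), ("Overcast", "Yes"),
]

pair_counts = Counter(data)                       # counts of each (outlook, play) pair
play_counts = Counter(play for _, play in data)   # class totals

# Print one row per outlook value, then the column totals.
for outlook in sorted({o for o, _ in data}):
    print(outlook, pair_counts[(outlook, "Yes")], pair_counts[(outlook, "No")])
print("Total", play_counts["Yes"], play_counts["No"])
```

Counting directly from the rows is a useful cross-check on any hand-built frequency table.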

Frequency table for the weather conditions:

Weather     Yes   No
Overcast    5     0
Rainy       2     2
Sunny       3     2
Total       10    4

Likelihood table for the weather conditions:

Weather     No            Yes            P(Weather)
Overcast    0             5              5/14 = 0.36
Rainy       2             2              4/14 = 0.29
Sunny       2             3              5/14 = 0.36
All         4/14 = 0.29   10/14 = 0.71

Applying Bayes' theorem:

P(Yes|Sunny) = P(Sunny|Yes)*P(Yes)/P(Sunny)

P(Sunny|Yes) = 3/10 = 0.3

P(Sunny) = 5/14 = 0.36

P(Yes) = 10/14 = 0.71

So P(Yes|Sunny) = (3/10)*(10/14)/(5/14) = 3/5 = 0.60

P(No|Sunny) = P(Sunny|No)*P(No)/P(Sunny)

P(Sunny|No) = 2/4 = 0.5

P(No) = 4/14 = 0.29

P(Sunny) = 5/14 = 0.36

So P(No|Sunny) = (2/4)*(4/14)/(5/14) = 2/5 = 0.40

From the above calculation, P(Yes|Sunny) > P(No|Sunny).

Hence, on a sunny day, the player can play the game.
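
The same calculation can be reproduced from the raw rows. The sketch below uses exact fractions so that no rounding creeps in; with exact arithmetic the two posteriors for a sunny day sum to 1, whereas rounding the intermediate probabilities can shift the second decimal place. The function and variable names are illustrative.

```python
from collections import Counter
from fractions import Fraction

# (Outlook, Play) pairs copied from the weather dataset, rows 0 to 13.
data = [
    ("Rainy", "Yes"), ("Sunny", "Yes"), ("Overcast", "Yes"), ("Overcast", "Yes"),
    ("Sunny", "No"), ("Rainy", "Yes"), ("Sunny", "Yes"), ("Overcast", "Yes"),
    ("Rainy", "No"), ("Sunny", "No"), ("Sunny", "Yes"), ("Rainy", "No"),
    ("Overcast", "Yes"), ("Overcast", "Yes"),
]

n = len(data)
pair_counts = Counter(data)
outlook_counts = Counter(o for o, _ in data)
play_counts = Counter(p for _, p in data)


def posterior(play, outlook):
    """P(play | outlook) = P(outlook | play) * P(play) / P(outlook)."""
    likelihood = Fraction(pair_counts[(outlook, play)], play_counts[play])
    prior = Fraction(play_counts[play], n)
    evidence = Fraction(outlook_counts[outlook], n)
    return likelihood * prior / evidence


p_yes = posterior("Yes", "Sunny")
p_no = posterior("No", "Sunny")
print(p_yes, p_no)  # prints: 3/5 2/5
```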

Advantages of Naïve Bayes Classifier:

o Naïve Bayes is a fast and easy ML algorithm for predicting the class of a data point.
o It can be used for binary as well as multi-class classification.
o It often performs well in multi-class prediction compared with other algorithms.
o It is a popular choice for text classification problems.

Disadvantages of Naïve Bayes Classifier:

o Naive Bayes assumes that all features are independent or unrelated, so it cannot
learn the relationship between features.

Applications of Naïve Bayes Classifier:

o It is used for credit scoring.
o It is used in medical data classification.
o It can be used in real-time predictions because the Naïve Bayes classifier is an eager
learner.
o It is used in text classification, such as spam filtering and sentiment analysis.

Types of Naïve Bayes Model:

There are three types of Naïve Bayes model, which are given below:

o Gaussian: The Gaussian model assumes that features follow a normal
distribution. This means that if predictors take continuous values instead of discrete ones,
the model assumes that these values are sampled from a Gaussian
distribution.
o Multinomial: The Multinomial Naïve Bayes classifier is used when the data is
multinomially distributed. It is primarily used for document classification problems,
that is, deciding which category a particular document belongs to, such as Sports,
Politics, Education, etc.
The classifier uses the frequency of words as the predictors.
o Bernoulli: The Bernoulli classifier works similarly to the Multinomial classifier, but
the predictor variables are independent Boolean variables, such as whether a
particular word is present in a document or not. This model is also well known for
document classification tasks.
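
The difference between the Multinomial and Bernoulli variants is easiest to see in the features each one consumes. The sketch below turns one document into both representations; the vocabulary and the sentence are hypothetical, and splitting on whitespace is a simplifying assumption rather than a recommended tokenizer.

```python
from collections import Counter

# Hypothetical vocabulary and document, for illustration only.
vocabulary = ["goal", "match", "vote", "election"]
document = "the match ended after a late goal and another goal"

# Naive whitespace tokenization (a simplifying assumption).
counts = Counter(document.split())

# Multinomial-style features: how many times each vocabulary word occurs.
multinomial_features = [counts[w] for w in vocabulary]

# Bernoulli-style features: whether each vocabulary word occurs at all.
bernoulli_features = [int(counts[w] > 0) for w in vocabulary]

print(multinomial_features)  # [2, 1, 0, 0]
print(bernoulli_features)    # [1, 1, 0, 0]
```

The Gaussian variant, by contrast, takes continuous values such as age or salary directly and models each one with a per-class mean and variance.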

Python Implementation of the Naïve Bayes algorithm:

Now we will implement the Naïve Bayes algorithm using Python. For this, we will use the
"user_data" dataset, which we have used in our other classification models, so that we
can easily compare the Naïve Bayes model with the other models.

Steps to implement:

o Data pre-processing step
o Fitting Naive Bayes to the training set
o Predicting the test result
o Testing the accuracy of the result (creation of the confusion matrix)
o Visualizing the test set result

1) Data Pre-processing step:

In this step, we will pre-process the data so that we can use it efficiently in our
code. This is similar to what we did in data pre-processing. The code is given below:

# Importing the libraries

import numpy as nm
import matplotlib.pyplot as mtp
import pandas as pd

# Importing the dataset

dataset = pd.read_csv('user_data.csv')
x = dataset.iloc[:, [2, 3]].values
y = dataset.iloc[:, 4].values

# Splitting the dataset into the Training set and Test set
from sklearn.model_selection import train_test_split
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size = 0.25, random_state = 0)

# Feature Scaling
from sklearn.preprocessing import StandardScaler
sc = StandardScaler()
x_train = sc.fit_transform(x_train)
x_test = sc.transform(x_test)
In the above code, we have loaded the dataset into our program using "dataset =
pd.read_csv('user_data.csv')". The loaded dataset is divided into training and test sets, and
then the feature variables are scaled.
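
To make the scaling step concrete, here is a minimal pure-Python sketch of standardization for a single feature: the mean and population standard deviation are learned from the training values only and then reused to transform the test values, mirroring the fit_transform / transform split in the code above. The helper names and the age values are hypothetical.

```python
from statistics import fmean, pstdev


def fit_scaler(column):
    """Learn the mean and population standard deviation of one feature."""
    return fmean(column), pstdev(column)


def transform(column, mean, std):
    """Standardize values as (value - mean) / std."""
    return [(v - mean) / std for v in column]


# Hypothetical ages, for illustration only.
train_age = [20, 30, 40, 50]
test_age = [35, 60]

mean, std = fit_scaler(train_age)               # statistics come from the training data
train_scaled = transform(train_age, mean, std)
test_scaled = transform(test_age, mean, std)    # test data reuses the training statistics

print(train_scaled)
print(test_scaled)
```

Reusing the training statistics on the test set keeps information about the test data from leaking into the pre-processing.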

The output for the dataset is given as:

2) Fitting Naive Bayes to the Training Set:
After the pre-processing step, now we will fit the Naive Bayes model to the Training set.
Below is the code for it:

# Fitting Naive Bayes to the Training set
from sklearn.naive_bayes import GaussianNB
classifier = GaussianNB()
classifier.fit(x_train, y_train)

In the above code, we have used the GaussianNB classifier to fit it to the training dataset.
We can also use other classifiers as per our requirement.

Output:

Out[6]: GaussianNB(priors=None, var_smoothing=1e-09)
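
For intuition about what fitting a Gaussian Naïve Bayes model involves, the following is a simplified from-scratch sketch, not scikit-learn's implementation. It estimates a prior plus a per-feature mean and variance for each class, then predicts the class with the highest log-posterior score. The function names, the tiny dataset, and the `eps` constant added to each variance are all assumptions made for illustration; scikit-learn handles variance smoothing differently.

```python
import math
from collections import defaultdict


def fit_gaussian_nb(X, y, eps=1e-9):
    """Estimate (prior, means, variances) for each class label."""
    grouped = defaultdict(list)
    for row, label in zip(X, y):
        grouped[label].append(row)
    model = {}
    for label, rows in grouped.items():
        columns = list(zip(*rows))
        means = [sum(c) / len(c) for c in columns]
        # eps keeps every variance strictly positive (illustrative choice).
        variances = [
            sum((v - m) ** 2 for v in c) / len(c) + eps
            for c, m in zip(columns, means)
        ]
        model[label] = (len(rows) / len(X), means, variances)
    return model


def predict_one(model, row):
    """Return the class maximizing log prior + sum of log Gaussian densities."""
    def log_score(prior, means, variances):
        score = math.log(prior)
        for v, m, var in zip(row, means, variances):
            score += -0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
        return score
    return max(model, key=lambda label: log_score(*model[label]))


# Hypothetical, already-scaled two-feature data with two classes.
X = [[-1.0, -0.8], [-0.9, -1.1], [-1.2, -0.7], [1.0, 0.9], [1.1, 1.2], [0.8, 1.0]]
y = [0, 0, 0, 1, 1, 1]
model = fit_gaussian_nb(X, y)
print(predict_one(model, [-1.0, -1.0]), predict_one(model, [1.0, 1.0]))
```

Summing log densities instead of multiplying raw densities avoids numerical underflow when there are many features.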

3) Prediction of the test set result:

Now we will predict the test set result. For this, we will create a new predictor
variable y_pred, and will use the predict function to make the predictions.

# Predicting the Test set results
y_pred = classifier.predict(x_test)

Output:
The above output shows the result for the prediction vector y_pred and the real vector y_test. We
can see that some predictions differ from the real values; these are the incorrect
predictions.

4) Creating Confusion Matrix:

Now we will check the accuracy of the Naive Bayes classifier using the Confusion matrix.
Below is the code for it:

# Making the Confusion Matrix
from sklearn.metrics import confusion_matrix
cm = confusion_matrix(y_test, y_pred)

Output:

As we can see in the above confusion matrix output, there are 7 + 3 = 10 incorrect
predictions and 65 + 25 = 90 correct predictions.
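
The sketch below shows how a 2x2 confusion matrix is tallied from label lists and how accuracy follows from it. The helper names and the short label lists are hypothetical. For the counts quoted in the text, which off-diagonal cell holds the 7 and which holds the 3 is an assumption; the accuracy is the same either way.

```python
def confusion_matrix_2x2(y_true, y_pred):
    """Rows are true labels (0, 1); columns are predicted labels (0, 1)."""
    cm = [[0, 0], [0, 0]]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def accuracy(cm):
    """Fraction of predictions on the diagonal (correct predictions)."""
    correct = cm[0][0] + cm[1][1]
    total = sum(sum(row) for row in cm)
    return correct / total


# Hypothetical labels, for illustration only.
y_true = [0, 0, 1, 1, 1, 0]
y_pred = [0, 1, 1, 0, 1, 0]
cm_small = confusion_matrix_2x2(y_true, y_pred)
print(cm_small)  # [[2, 1], [1, 2]]

# Counts quoted in the text; the placement of 3 and 7 is an assumption.
cm_text = [[65, 3], [7, 25]]
print(accuracy(cm_text))  # 0.9
```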
5) Visualizing the training set result:
Next we will visualize the training set result using Naïve Bayes Classifier. Below is the code
for it:

# Visualising the Training set results
from matplotlib.colors import ListedColormap
x_set, y_set = x_train, y_train
X1, X2 = nm.meshgrid(nm.arange(start = x_set[:, 0].min() - 1, stop = x_set[:, 0].max() + 1, step = 0.01),
                     nm.arange(start = x_set[:, 1].min() - 1, stop = x_set[:, 1].max() + 1, step = 0.01))
mtp.contourf(X1, X2, classifier.predict(nm.array([X1.ravel(), X2.ravel()]).T).reshape(X1.shape),
             alpha = 0.75, cmap = ListedColormap(('purple', 'green')))
mtp.xlim(X1.min(), X1.max())
mtp.ylim(X2.min(), X2.max())
for i, j in enumerate(nm.unique(y_set)):
    mtp.scatter(x_set[y_set == j, 0], x_set[y_set == j, 1],
                c = ListedColormap(('purple', 'green'))(i), label = j)
mtp.title('Naive Bayes (Training set)')
mtp.xlabel('Age')
mtp.ylabel('Estimated Salary')
mtp.legend()
mtp.show()

Output:

In the above output we can see that the Naïve Bayes classifier has separated the data
points with a fine boundary. The boundary is curved, as expected from the GaussianNB classifier
used in our code.
6) Visualizing the Test set result:
# Visualising the Test set results
from matplotlib.colors import ListedColormap
x_set, y_set = x_test, y_test
X1, X2 = nm.meshgrid(nm.arange(start = x_set[:, 0].min() - 1, stop = x_set[:, 0].max() + 1, step = 0.01),
                     nm.arange(start = x_set[:, 1].min() - 1, stop = x_set[:, 1].max() + 1, step = 0.01))
mtp.contourf(X1, X2, classifier.predict(nm.array([X1.ravel(), X2.ravel()]).T).reshape(X1.shape),
             alpha = 0.75, cmap = ListedColormap(('purple', 'green')))
mtp.xlim(X1.min(), X1.max())
mtp.ylim(X2.min(), X2.max())
for i, j in enumerate(nm.unique(y_set)):
    mtp.scatter(x_set[y_set == j, 0], x_set[y_set == j, 1],
                c = ListedColormap(('purple', 'green'))(i), label = j)
mtp.title('Naive Bayes (test set)')
mtp.xlabel('Age')
mtp.ylabel('Estimated Salary')
mtp.legend()
mtp.show()

Output:

The above output is the final output for the test set data. As we can see, the classifier has created
a curved boundary to divide the "purchased" and "not purchased" classes. There are some
wrong predictions, which we counted in the confusion matrix, but it is still a pretty good
classifier.
