
Pattern Recognition Lab

The document contains details about experiments conducted as part of a Pattern Recognition Lab course. It includes 8 programs: 1. Reads images and calculates basic statistics like mean, mode, standard deviation. 2. Implements naive Bayesian classifier on a CSV dataset and computes accuracy. 3. Constructs a Bayesian network using medical data to diagnose heart patients. 4. Implements Bayes' theorem and formula. 5. Performs data analysis on a given dataset. 6. Implements KNN on an image dataset. 7. Implements K-means clustering. 8. Implements PCA (principal component analysis).

Uploaded by

Prashant Kumar
Copyright
© © All Rights Reserved

Pattern Recognition Lab

CAL – 302

B.Tech 3rd Year


SEMESTER: 6th
Session: 2021-2022

DEPARTMENT OF COMPUTER SCIENCE & ENGINEERING


SHARDA UNIVERSITY, GREATER NOIDA

Submitted By: Harsh Tiwari


Roll No: 190101124
System ID: 2019644021
Group: 1
Submitted to: Dr. Vijendra Singh
INDEX
S.No  Title of the Experiment  Date  Signature

1  Assuming a set of images that need to be classified, read the images and calculate basic statistics such as mean, mode, standard deviation, etc.
2  Write a program to implement the naïve Bayesian classifier for a sample training data set stored as a .CSV file. Compute the accuracy of the classifier, considering few test data sets.
3  Write a program to construct a Bayesian network considering medical data. Use this model to demonstrate the diagnosis of heart patients.
PROGRAM 1

Experiment 1: Assuming a set of images that need to be classified, read the images and calculate basic statistics such as mean, mode, standard deviation, etc.

Code:
from google.colab import files
uploaded = files.upload()
!pwd
import os
!pip install albumentations==0.4.6

import numpy as np
import pandas as pd
import torch
import torchvision
from torch.utils.data import Dataset,DataLoader
import albumentations as A
from albumentations.pytorch import ToTensorV2
import cv2
from tqdm import tqdm
import matplotlib.pyplot as plt
%matplotlib inline
device = torch.device('cpu')
import requests
os.environ['KAGGLE_CONFIG_DIR'] = "/content"
!kaggle competitions download -c cassava-leaf-disease-classification
from zipfile import ZipFile
# specifying the zip file name
file_name = "cassava-leaf-disease-classification.zip"

# opening the zip file in READ mode
with ZipFile(file_name, 'r') as zf:
    # printing all the contents of the zip file
    zf.printdir()

    # extracting all the files
    print('Extracting all the files now...')
    zf.extractall()
    print('Done!')

# load the image list; assumed here to be the competition's train.csv,
# since the code below expects an 'image_id' column
df2 = pd.read_csv('train.csv')
df2.head()
class LeafData(Dataset):

    def __init__(self,
                 data,
                 directory,
                 transform = None):
        self.data = data
        self.directory = directory
        self.transform = transform

    def __len__(self):
        return len(self.data)

    def __getitem__(self, idx):

        # import
        path = os.path.join(self.directory,
                            self.data.iloc[idx]['image_id'])
        # cv2.imread returns BGR, so convert to RGB in a separate step
        image = cv2.imread(path)
        image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)

        # augmentations
        if self.transform is not None:
            image = self.transform(image = image)['image']
        return image

num_workers = 4
image_size = 512
batch_size = 8

augs = A.Compose([A.Resize(height = image_size,
                           width = image_size),
                  A.Normalize(mean = (0, 0, 0),
                              std = (1, 1, 1)),
                  ToTensorV2()])
# dataset
image_dataset = LeafData(data = df2,
                         directory = 'train_images/',
                         transform = augs)

# data loader
image_loader = DataLoader(image_dataset,
                          batch_size = batch_size,
                          shuffle = False,
                          num_workers = num_workers,
                          pin_memory = True)

# display images
for batch_idx, inputs in enumerate(image_loader):
    fig = plt.figure(figsize = (14, 7))
    for i in range(8):
        ax = fig.add_subplot(2, 4, i + 1, xticks = [], yticks = [])
        plt.imshow(inputs[i].numpy().transpose(1, 2, 0))
    break

psum = torch.tensor([0.0, 0.0, 0.0])
psum_sq = torch.tensor([0.0, 0.0, 0.0])

# loop through images
for inputs in tqdm(image_loader):
    psum += inputs.sum(axis = [0, 2, 3])
    psum_sq += (inputs ** 2).sum(axis = [0, 2, 3])

# pixel count per channel
count = len(df2) * image_size * image_size

# mean and std
total_mean = psum / count
total_var = (psum_sq / count) - (total_mean ** 2)
total_std = torch.sqrt(total_var)
# output
print('mean: ' + str(total_mean))
print('std: ' + str(total_std))
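
The experiment statement also asks for the mode, which the running-sum approach above cannot provide. The following is a minimal sketch, using only the standard library and a small hypothetical list of pixel values (not the cassava images), of how the same running sums give the mean and standard deviation and how the mode could be computed alongside them.

```python
import math
import statistics

# Hypothetical 8-bit pixel values from one channel (illustrative data only).
pixels = [12, 200, 200, 35, 90, 200, 35, 12]

# Running sums, mirroring psum and psum_sq in the program above.
psum = 0.0
psum_sq = 0.0
for p in pixels:
    psum += p
    psum_sq += p ** 2

count = len(pixels)
mean = psum / count
var = psum_sq / count - mean ** 2      # E[x^2] - (E[x])^2
std = math.sqrt(var)

# The mode (most frequent value) cannot be recovered from running sums,
# so it is computed from the values directly.
mode = statistics.mode(pixels)

print('mean:', mean)
print('std:', std)
print('mode:', mode)
```

For a full image set the mode would more likely be taken from a per-channel histogram accumulated batch by batch, since holding every pixel in memory is impractical.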
PROGRAM 2
Experiment 2: Write a program to implement the naïve Bayesian
classifier for a sample training data set stored as a .CSV file. Compute
the accuracy of the classifier, considering few test data sets.

Code:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from sklearn import datasets
wine=datasets.load_wine()
print(wine)

print(wine.feature_names)

print(wine.target_names)

X=pd.DataFrame(wine['data'])
print(X.head())

y=wine.target
print(y)
from sklearn.model_selection import train_test_split
X_train,X_test,y_train,y_test=train_test_split(wine.data,wine.target,test_size=0.30,random_state=100)

from sklearn.naive_bayes import GaussianNB


gnb = GaussianNB()
gnb.fit(X_train,y_train)
y_pred = gnb.predict(X_test)
print(y_pred)

from sklearn import metrics


print(metrics.accuracy_score(y_test,y_pred))

from sklearn.metrics import confusion_matrix


cm=np.array(confusion_matrix(y_test,y_pred))
cm
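
The experiment statement refers to training data stored as a .CSV file, while the program above uses scikit-learn's built-in wine data. As a rough sketch of the CSV route, the example below reads a tiny hypothetical CSV (held in a string so the example is self-contained; the column names `f1`, `f2`, `label` are made up), fits a Gaussian naive Bayes model by hand, and reports accuracy on held-out rows. It illustrates the idea and is not a substitute for `GaussianNB`.

```python
import csv
import io
import math
from collections import defaultdict

# Hypothetical CSV contents; in practice this would come from open('train.csv').
CSV_TEXT = """f1,f2,label
1.0,2.1,a
1.2,1.9,a
0.8,2.0,a
1.1,2.2,a
3.0,0.5,b
3.2,0.4,b
2.9,0.6,b
3.1,0.7,b
"""

def load_rows(text):
    reader = csv.DictReader(io.StringIO(text))
    return [([float(r['f1']), float(r['f2'])], r['label']) for r in reader]

def fit(rows):
    """Estimate a prior and per-feature mean/variance for each class."""
    by_class = defaultdict(list)
    for features, label in rows:
        by_class[label].append(features)
    model = {}
    for label, items in by_class.items():
        n = len(items)
        means = [sum(col) / n for col in zip(*items)]
        # the small constant keeps the variance away from zero
        variances = [sum((v - m) ** 2 for v in col) / n + 1e-9
                     for col, m in zip(zip(*items), means)]
        model[label] = (n / len(rows), means, variances)
    return model

def predict(model, features):
    """Return the class with the highest log posterior."""
    best_label, best_score = None, -math.inf
    for label, (prior, means, variances) in model.items():
        score = math.log(prior)
        for x, m, v in zip(features, means, variances):
            score += -0.5 * math.log(2 * math.pi * v) - (x - m) ** 2 / (2 * v)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

rows = load_rows(CSV_TEXT)
train, test = rows[::2], rows[1::2]   # simple alternating split
model = fit(train)
correct = sum(predict(model, f) == y for f, y in test)
accuracy = correct / len(test)
print('accuracy:', accuracy)
```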
PROGRAM 3
Write a program to construct a Bayesian network considering medical
data. Use this model to demonstrate the diagnosis of heart patients.

Code:
!pip install pgmpy

import pandas as pd
data = pd.read_csv('heart.csv')
from pgmpy.models import BayesianNetwork
names = "A,B,C,D,E,F,G,H,I,J,K,L,M,RESULT"
names = names.split(",")
len(names)

data.head()

# each tuple is a directed edge (parent, child) of the network
model = BayesianNetwork([('age','sex'),('trestbps','chol'),('restecg','thalach'),('exang','target')])
model.fit(data)
from pgmpy.inference import VariableElimination
infer = VariableElimination(model)
print(infer)

q=infer.query(variables=['target'],evidence={'age':28})

print(q)
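
For a discrete network, fitting an edge such as ('exang', 'target') comes down to estimating a conditional probability table from the data. The sketch below shows maximum-likelihood estimation by counting, on a handful of hypothetical (exang, target) records rather than heart.csv. It is meant to illustrate the idea, not to reproduce pgmpy's internals.

```python
from collections import Counter

# Hypothetical (exang, target) records; the real values come from heart.csv.
records = [(0, 1), (0, 1), (0, 0), (0, 1), (1, 0), (1, 0), (1, 1), (1, 0)]

pair_counts = Counter(records)
parent_counts = Counter(exang for exang, _ in records)

def p_target_given_exang(target, exang):
    """Maximum-likelihood estimate of P(target | exang) from counts."""
    return pair_counts[(exang, target)] / parent_counts[exang]

for exang in (0, 1):
    for target in (0, 1):
        print(f'P(target={target} | exang={exang}) =',
              p_target_given_exang(target, exang))
```

Each column of the resulting table sums to 1, which is a quick sanity check on any estimated CPT.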
PROGRAM – 4

Write a program to implement Bayes Theorem and its Formula.

CODE:
def bayes_theorem(p_b, p_g_given_b, p_g_given_not_b):

    # calculate P(not B)
    not_b = 1 - p_b

    # calculate P(G)
    p_g = p_g_given_b * p_b + p_g_given_not_b * not_b

    # calculate P(B|G)
    p_b_given_g = (p_g_given_b * p_b) / p_g
    return p_b_given_g

#P(B)
p_b = 1/7

# P(G|B)
p_g_given_b = 1

# P(G|notB)
p_g_given_not_b = 2/3

# calculate P(B|G)
result = bayes_theorem(p_b, p_g_given_b, p_g_given_not_b)

# print result
print('P(B|G) = %.2f%%' % (result * 100))

import pandas as pd

df = pd.read_csv('cereal.csv')

df
col1 = df.calories
col2 = df.potass
col3 = df.sodium
for i in range(1,21,2):
    p_a = col1[i]/1000
    p_b_given_a = col2[i]/1000
    p_b_not_a = col3[i]/1000
    result = bayes_theorem(p_a, p_b_given_a, p_b_not_a)
    print(result)
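
As a check on the first call in the program above, the same inputs (P(B) = 1/7, P(G|B) = 1, P(G|not B) = 2/3) can be worked through with exact fractions. This is a small sketch using the standard `fractions` module; it repeats the two steps of the function, the law of total probability followed by Bayes' theorem.

```python
from fractions import Fraction

p_b = Fraction(1, 7)              # P(B)
p_g_given_b = Fraction(1)         # P(G|B)
p_g_given_not_b = Fraction(2, 3)  # P(G|not B)

# law of total probability: P(G) = P(G|B)P(B) + P(G|not B)P(not B)
p_g = p_g_given_b * p_b + p_g_given_not_b * (1 - p_b)

# Bayes' theorem: P(B|G) = P(G|B)P(B) / P(G)
p_b_given_g = p_g_given_b * p_b / p_g

print(p_g)          # 5/7
print(p_b_given_g)  # 1/5
```

The exact result 1/5 agrees with the 20.00% printed by the floating-point version.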
PROGRAM – 5

Write a program to perform Data Analysis on a given Dataset

CODE:

from google.colab import files


uploaded = files.upload()

import numpy as np
import matplotlib.pyplot as plt
import pandas as pd

df = pd.read_csv("cereal.csv")

df

print(df.weight)
print(df.columns)

x =df.calories
print(x.var())
print(x.std())
df[['weight']].idxmax()
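# numeric_only skips any text columns, which would otherwise raise an error in recent pandas
df.mean(axis = 0, numeric_only = True)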
PROGRAM – 6

Write a program to implement KNN on an image dataset

CODE:

from sklearn import datasets


digits = datasets.load_digits()
x = digits.data
y = digits.target
import pandas as pd
df = pd.DataFrame(data = y, columns = ['targets'])
df

x.shape
y.shape
digits.images.shape
digits.images[0]

import matplotlib.pyplot as plt


plt.imshow(digits.images[0],cmap = plt.cm.gray_r)
plt.axis('off')
plt.title('Number: ' + str(y[0]))
None

figure,axes = plt.subplots(3,10,figsize = (15,6))
for ax,image,number in zip(axes.ravel(),digits.images,y):
    ax.axis('off')
    ax.imshow(image,cmap = plt.cm.gray_r)
    ax.set_title('Number: '+ str(number))
image = digits.images[0]
print('original image data = ')
print(image)
print()
image_flattened = image.ravel()
print("flattened image = ")
print(image_flattened)

print('feature data for a sample= ')


print(x[0])
print()

print('Feature data for all samples is a two-dimensional array with one row of 64 values per image')


print(x)
from sklearn.model_selection import train_test_split
x_train,x_test,y_train,y_test = train_test_split(x,y,test_size=0.33,random_state=99,stratify = y)

from sklearn.neighbors import KNeighborsClassifier


knn = KNeighborsClassifier(n_neighbors=3)
knn.fit(x_train,y_train)

y_red = knn.predict(x_test)
y_red

from sklearn.metrics import classification_report


report = classification_report(y_test,y_red)
print(report)
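import seaborn as sns
from sklearn.metrics import confusion_matrix
# confusion matrix of true vs. predicted digit labels
confusion = confusion_matrix(y_test,y_red)
s = sns.heatmap(confusion,annot = True, cmap = 'nipy_spectral_r')
s.set_title('Confusion matrix for the digits dataset')
None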
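
To make the classifier's decision rule concrete, the following is a minimal sketch of k-nearest-neighbours prediction written with the standard library only. The two-feature training points are hypothetical (the digits data has 64 features per image), and the helper name `knn_predict` is made up for this example. It uses Euclidean distance and a plain majority vote with k = 3.

```python
import math
from collections import Counter

def knn_predict(train_x, train_y, query, k=3):
    """Classify query by majority vote among its k nearest training points."""
    distances = sorted(
        (math.dist(x, query), label) for x, label in zip(train_x, train_y)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]

# Hypothetical 2-feature samples forming two groups.
train_x = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (1.0, 1.0), (0.9, 1.1), (1.1, 0.9)]
train_y = [0, 0, 0, 1, 1, 1]

print(knn_predict(train_x, train_y, (0.15, 0.15)))
print(knn_predict(train_x, train_y, (0.95, 1.0)))
```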
PROGRAM – 7

Write a program to implement K-Means Clustering.

CODE:

import numpy as np
import pandas as pd
from matplotlib import pyplot as plt
from sklearn.datasets import make_blobs
from sklearn.cluster import KMeans

X, y = make_blobs(n_samples=300, centers=4, cluster_std=0.60, random_state=0)
plt.scatter(X[:,0], X[:,1])

# elbow method: within-cluster sum of squares (WCSS) for k = 1..10
wcss = []
for i in range(1, 11):
    kmeans = KMeans(n_clusters=i, init='k-means++', max_iter=300, n_init=10, random_state=0)
    kmeans.fit(X)
    wcss.append(kmeans.inertia_)
plt.plot(range(1, 11), wcss)
plt.title('Elbow Method')
plt.xlabel('Number of clusters')
plt.ylabel('WCSS')
plt.show()
kmeans = KMeans(n_clusters=4, init='k-means++', max_iter=300, n_init=10, random_state=0)
pred_y = kmeans.fit_predict(X)
plt.scatter(X[:,0], X[:,1])
plt.scatter(kmeans.cluster_centers_[:, 0], kmeans.cluster_centers_[:, 1], s=300, c='red')
plt.show()
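
The sketch below shows, under simplifying assumptions, the assign-and-update loop behind K-means (Lloyd's algorithm) and the WCSS quantity plotted in the elbow curve. It uses the standard library only, six hypothetical points, and hand-picked starting centers instead of k-means++. The function names `kmeans` and `wcss` here are illustrative and are not the scikit-learn API.

```python
import math

def kmeans(points, centers, iterations=10):
    """Plain Lloyd's algorithm: assign points, then recompute centers."""
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)),
                          key=lambda i: math.dist(p, centers[i]))
            clusters[nearest].append(p)
        # keep the old center if a cluster ends up empty
        centers = [
            tuple(sum(c) / len(cluster) for c in zip(*cluster)) if cluster else old
            for cluster, old in zip(clusters, centers)
        ]
    return centers, clusters

def wcss(centers, clusters):
    """Within-cluster sum of squared distances to the cluster center."""
    return sum(math.dist(p, c) ** 2
               for c, cluster in zip(centers, clusters) for p in cluster)

# Hypothetical data: two well-separated groups.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
# Initial centers are picked by hand here, not by k-means++.
centers, clusters = kmeans(points, centers=[(0.0, 0.0), (10.0, 10.0)])
print(centers)
print(wcss(centers, clusters))
```

With well-separated groups like these, the centers settle on the group means after the first update and further iterations leave them unchanged.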
PROGRAM – 8

Write a program to implement PCA (Principal Component Analysis)

CODE:

import pandas as pd
url = "https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data"
# load dataset into Pandas DataFrame
df = pd.read_csv(url, names=['sepal length','sepal width','petal length','petal width','target'])

df.head()

from sklearn.preprocessing import StandardScaler


features = ['sepal length', 'sepal width', 'petal length', 'petal width']
# Separating out the features
x = df.loc[:, features].values
# Separating out the target
y = df.loc[:,['target']].values
# Standardizing the features
x = StandardScaler().fit_transform(x)
from sklearn.decomposition import PCA
pca = PCA(n_components=2)
principalComponents = pca.fit_transform(x)
principalDf = pd.DataFrame(data = principalComponents
             , columns = ['principal component 1', 'principal component 2'])
finalDf = pd.concat([principalDf, df[['target']]], axis = 1)
import matplotlib.pyplot as plt
fig = plt.figure(figsize = (8,8))
ax = fig.add_subplot(1,1,1)
ax.set_xlabel('Principal Component 1', fontsize = 15)
ax.set_ylabel('Principal Component 2', fontsize = 15)
ax.set_title('2 component PCA', fontsize = 20)
targets = ['Iris-setosa', 'Iris-versicolor', 'Iris-virginica']
colors = ['r', 'g', 'b']
# one scatter call per class so each gets its own colour and legend entry
for target, color in zip(targets,colors):
    indicesToKeep = finalDf['target'] == target
    ax.scatter(finalDf.loc[indicesToKeep, 'principal component 1']
               , finalDf.loc[indicesToKeep, 'principal component 2']
               , c = color
               , s = 50)
ax.legend(targets)
ax.grid()

pca.explained_variance_ratio_
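
The explained variance ratio is each eigenvalue of the covariance matrix of the standardized data divided by the sum of all eigenvalues. The sketch below works this out by hand for two hypothetical, perfectly correlated features, using the closed-form eigenvalues of a symmetric 2x2 matrix. It uses the standard library only and is an illustration of the quantity, not the scikit-learn implementation.

```python
import math

# Hypothetical two-feature data, perfectly correlated for illustration.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 6.0, 8.0, 10.0]

def standardize(values):
    """Rescale to zero mean and unit variance (population standard deviation)."""
    mean = sum(values) / len(values)
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
    return [(v - mean) / std for v in values]

zx, zy = standardize(xs), standardize(ys)
n = len(zx)

# entries of the 2x2 covariance matrix [[a, b], [b, c]]
a = sum(v * v for v in zx) / (n - 1)
c = sum(v * v for v in zy) / (n - 1)
b = sum(p * q for p, q in zip(zx, zy)) / (n - 1)

# closed-form eigenvalues of a symmetric 2x2 matrix
mid = (a + c) / 2
half_gap = math.sqrt(((a - c) / 2) ** 2 + b ** 2)
eigenvalues = [mid + half_gap, mid - half_gap]

explained_variance_ratio = [ev / sum(eigenvalues) for ev in eigenvalues]
print(explained_variance_ratio)
```

Because the two features carry the same information, essentially all of the variance falls on the first component and the second ratio is zero up to rounding.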
