Ai/Ml Lab-4: Name: Pratik Jadhav PRN: 20190802050
AI/ML LAB-4
Name: Pratik Jadhav
PRN: 20190802050
Q1. Write a program to implement k-Nearest Neighbour algorithm to classify the iris data
set. Print both correct and wrong predictions. Java/Python ML library classes can be used
for this problem.
In [1]:
%matplotlib inline
import numpy as np
import pandas as pd
In [2]:
iris_data = pd.read_csv("Iris.csv")
iris_data.head()
In [3]:
len(iris_data)
Out[3]:
150
In [4]:
iris_data.isna().sum()
Out[4]:
Id 0
SepalLengthCm 0
SepalWidthCm 0
PetalLengthCm 0
PetalWidthCm 0
Species 0
dtype: int64
localhost:8888/nbconvert/html/20190802050_DS_Lab4.ipynb?download=false 1/5
10/8/21, 1:09 PM 20190802050_DS_Lab4
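In [5]:
# Features: every column except the label (line reconstructed to match In [17])
X = iris_data.drop("Species", axis=1)
y = iris_data["Species"]
len(X), len(y)
Out[5]:
(150, 150)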
In [6]:
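from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import train_test_split
# Hold out 20% of the rows for testing; the fixed seed keeps the split reproducible
X_train, X_test, y_train, y_test = train_test_split(X, y,
                                                    test_size=0.2,
                                                    random_state=1)
clf = KNeighborsClassifier(n_neighbors=3)
clf.fit(X_train, y_train)
clf.score(X_test, y_test)
Out[6]:
0.9666666666666667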
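To make the classifier above less of a black box, here is a minimal sketch of the idea behind k-nearest neighbours: measure the distance from a query point to every training point, keep the k closest, and take a majority vote. It assumes plain Euclidean distance and unweighted votes, which match the scikit-learn defaults. The function name `knn_predict` and the six sample measurements are hypothetical and are not rows from Iris.csv.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Return the majority label among the k training points closest to query."""
    # Euclidean distance from the query to every training point
    distances = [
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    ]
    # Keep the k smallest distances and let their labels vote
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical (petal length, petal width) measurements, for illustration only
train_points = [(1.4, 0.2), (1.5, 0.2), (1.3, 0.3),
                (4.7, 1.4), (4.5, 1.5), (4.9, 1.5)]
train_labels = ["Iris-setosa"] * 3 + ["Iris-versicolor"] * 3

prediction = knn_predict(train_points, train_labels, query=(1.6, 0.25), k=3)
print(prediction)
```

An odd k such as 3 avoids tied votes when there are two classes; with three classes a tie is still possible, and this sketch then returns whichever tied label was counted first.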
In [7]:
y_preds = clf.predict(X_test)
y_preds[:10]
Out[7]:
'Iris-virginica', 'Iris-versicolor', 'Iris-virginica',
In [8]:
y_preds_proba = clf.predict_proba(X_test)
y_preds_proba[:10]
Out[8]:
[0., 1., 0.],
In [9]:
from sklearn.metrics import accuracy_score, confusion_matrix, classification_report
accuracy 0.97 30
Confusion Matrix:
[[11  0  0]
 [ 0 12  0]
 [ 0  1  6]]
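The reported accuracy can be checked by hand from the confusion matrix. The sketch below copies the matrix from the output above and assumes the scikit-learn convention that rows are true classes and columns are predicted classes. Classes are referred to by position (0, 1, 2) because the surviving output does not show the label order.

```python
# Confusion matrix copied from the output above
# (rows: true class, columns: predicted class)
matrix = [
    [11, 0, 0],
    [0, 12, 0],
    [0, 1, 6],
]

total = sum(sum(row) for row in matrix)
# Correct predictions sit on the diagonal
correct = sum(matrix[i][i] for i in range(len(matrix)))
accuracy = correct / total
print(f"accuracy = {correct}/{total} = {accuracy:.4f}")

for i in range(len(matrix)):
    predicted_as_i = sum(row[i] for row in matrix)  # column sum
    actually_i = sum(matrix[i])                     # row sum
    precision = matrix[i][i] / predicted_as_i
    recall = matrix[i][i] / actually_i
    print(f"class {i}: precision={precision:.2f}, recall={recall:.2f}")
```

The single off-diagonal entry is one sample of the third class predicted as the second class, which accounts for the one wrong prediction out of 30.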
In [10]:
from sklearn.model_selection import cross_val_score
cvs = cross_val_score(clf, X, y)
print(cvs)
[0.66666667 1. 1. 1. 0.7 ]
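Five scores are printed because `cross_val_score` uses 5-fold cross-validation by default. A small sketch of how the fold scores might be summarised, using the values printed above; the variable name `fold_scores` is only illustrative.

```python
from statistics import mean, stdev

# Fold accuracies copied from the cross_val_score output above
fold_scores = [0.66666667, 1.0, 1.0, 1.0, 0.7]

print(f"mean accuracy: {mean(fold_scores):.3f}")
print(f"std across folds: {stdev(fold_scores):.3f}")
```

The fold average is noticeably lower than the single hold-out score of about 0.967, so the hold-out figure alone may give an optimistic picture of this model.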
In [11]:
y_testing = pd.Series(y_test).reset_index().drop("index",axis=1)
y_predictions = pd.Series(y_preds)
In [12]:
predictions_df = pd.DataFrame(data={
"Species": y_testing["Species"],
})
In [13]:
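predicts = []
# Label each test row by comparing the true species with the prediction
# at the same position (loop header reconstructed)
for index, i in enumerate(y_test):
    if i == y_preds[index]:
        predicts.append("Correct")
    else:
        predicts.append("Wrong")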
In [14]:
predictions_df["Correct or Wrong"] = pd.Series(predicts)
predictions_df.head()
In [15]:
print(f"Total Correct or Wrong Predictions:\n\
{predictions_df['Correct or Wrong'].value_counts()}")
Correct 29
Wrong 1
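The same tally can be produced without building a DataFrame. The sketch below pairs true and predicted labels with `zip`, prints each comparison, and counts the outcomes. The two short label lists are hypothetical stand-ins for `y_test` and `y_preds`, so the counts differ from the 29/1 result above.

```python
from collections import Counter

# Hypothetical labels standing in for y_test and y_preds
y_true = ["Iris-setosa", "Iris-versicolor", "Iris-virginica", "Iris-virginica"]
y_pred = ["Iris-setosa", "Iris-versicolor", "Iris-versicolor", "Iris-virginica"]

# One verdict per test sample
verdicts = ["Correct" if t == p else "Wrong" for t, p in zip(y_true, y_pred)]

# Print every prediction, correct and wrong, as the question asks
for t, p, v in zip(y_true, y_pred, verdicts):
    print(f"{v:8s} true={t:16s} predicted={p}")

counts = Counter(verdicts)
print(counts)
```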
Q2. Write a program to implement the naïve Bayesian classifier for a sample training data
set stored as a .CSV file. Compute the accuracy of the classifier, considering a few test data
sets.
In [16]:
iris_data = pd.read_csv("Iris.csv")
iris_data.head()
In [17]:
X = iris_data.drop("Species", axis=1)
y = iris_data["Species"]
len(X), len(y)
Out[17]:
(150, 150)
In [18]:
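from sklearn.naive_bayes import GaussianNB
from sklearn.model_selection import train_test_split
# Hold out 30% of the rows for testing; the fixed seed keeps the split reproducible
X_train, X_test, y_train, y_test = train_test_split(X, y,
                                                    test_size=0.3,
                                                    random_state=1)
gnb = GaussianNB()
gnb.fit(X_train, y_train)
gnb.score(X_test, y_test)
Out[18]:
1.0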
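For intuition about what a Gaussian naive Bayes model computes, here is a minimal sketch: per class it stores a prior plus the mean and variance of each feature, then picks the class with the highest log prior plus summed Gaussian log-likelihoods. The function names, the tiny `1e-9` variance floor, and the sample measurements are assumptions for illustration; scikit-learn applies its own variance smoothing, so this is not a reimplementation of `GaussianNB`.

```python
import math
from collections import defaultdict


def fit_gaussian_nb(rows, labels):
    """Estimate a log prior and per-feature mean/variance for each class."""
    grouped = defaultdict(list)
    for row, label in zip(rows, labels):
        grouped[label].append(row)

    model = {}
    for label, members in grouped.items():
        n = len(members)
        columns = list(zip(*members))
        means = [sum(col) / n for col in columns]
        # Small floor keeps variances nonzero (an assumption of this sketch)
        variances = [
            sum((x - m) ** 2 for x in col) / n + 1e-9
            for col, m in zip(columns, means)
        ]
        model[label] = (math.log(n / len(rows)), means, variances)
    return model


def predict_gaussian_nb(model, row):
    """Return the class with the highest log posterior for one sample."""
    def log_posterior(label):
        log_prior, means, variances = model[label]
        # "Naive" step: features are treated as independent, so logs add up
        log_likelihood = sum(
            -0.5 * math.log(2 * math.pi * v) - (x - m) ** 2 / (2 * v)
            for x, m, v in zip(row, means, variances)
        )
        return log_prior + log_likelihood

    return max(model, key=log_posterior)


# Hypothetical (petal length, petal width) measurements, for illustration only
rows = [(1.4, 0.2), (1.5, 0.2), (1.3, 0.3),
        (4.7, 1.4), (4.5, 1.5), (4.9, 1.5)]
labels = ["Iris-setosa"] * 3 + ["Iris-versicolor"] * 3

model = fit_gaussian_nb(rows, labels)
print(predict_gaussian_nb(model, (1.5, 0.2)))
print(predict_gaussian_nb(model, (4.6, 1.4)))
```

Working in log space avoids numerical underflow when many small likelihoods are multiplied together.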
In [19]:
y_preds = gnb.predict(X_test)
y_preds[:10]
Out[19]:
'Iris-virginica', 'Iris-versicolor', 'Iris-virginica',
In [20]:
from sklearn.metrics import accuracy_score
In [21]:
from sklearn.model_selection import cross_val_score
cvs = cross_val_score(gnb, X, y)
print(cvs)
[0.96666667 1. 1. 1. 1. ]
In [22]:
from sklearn.metrics import accuracy_score, confusion_matrix, classification_report
accuracy 1.00 45
Confusion Matrix:
[[14  0  0]
 [ 0 18  0]
 [ 0  0 13]]