Data Analysis in Python_ML
Anaconda is a distribution of Python and R for scientific computing that aims to simplify package management and deployment. The distribution includes data-science packages suitable for Windows, Linux, and macOS. It can be downloaded from
https://www.anaconda.com/products/individual
After installing Anaconda, launch it by searching for it and clicking its icon. When it opens, find and launch Jupyter Notebook, or type jupyter notebook in a terminal. The home screen opens in a browser after the terminal is displayed for a few seconds.
Go to New and select Python 3 from the drop-down list.
A new notebook will open; you can rename it by clicking the text box displaying "Untitled".
You may enter your commands in the empty cells. A green border indicates the active cell (edit mode), while a blue border indicates an inactive cell (command mode). Press the Esc key to make a cell inactive; hover the mouse over an inactive cell and click, or press the Enter key, to make it active again.
To create a new cell, click the + icon at the top, or, with a cell inactive, press B to add a cell below or A to add a cell above the current cell.
# A hash (#) starts a comment
A = 1
Variable names are case-sensitive and cannot start with a number. They can contain letters, numbers, and underscores.
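A brief, illustrative example of these rules (the names used here are arbitrary):
my_value_2 = 10 # valid: letters, numbers, and underscores
My_value_2 = 20 # a different variable, because names are case-sensitive
# 2_value = 30 would raise a SyntaxError, since a name cannot start with a number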
a = "hello "
b = "world"
print (a + b)
• Strings: single (''), double ("") or triple quotes (""" or ''') can be used; for example, "python" and 'python' are the same string. The other kind of quote can appear unmatched inside a string:
"datatype's"
• Tuple ( ): an ordered collection that can hold items of different types. Tuples are "immutable", i.e., they cannot be modified after creation.
myTuple = ('abc', 2.5, A)
myTuple[2] # returns the value at index position 2, i.e. the third element (indexing starts from 0)
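A quick sketch illustrating immutability; the try/except simply shows the error that is raised when a tuple element is reassigned:
try:
    myTuple[0] = 'xyz' # attempting to modify a tuple element
except TypeError as e:
    print(e) # tuples do not support item assignment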
• List [ ]: an ordered collection, similar to a tuple but "mutable", i.e., it can be modified after creation.
myList = ['abc', 2.5, A]
myList.append('klm') # adds 'klm' to the end of the list
myList
myList2 = [1,2,3]
myList3 = [4,5,6]
myList2 + myList3 # concatenates the two lists
• Array [ ]: vectors (1-D) and matrices (>1-D) for numerical data manipulation are defined in NumPy, so we need to import numpy into our Python session.
import numpy as np # importing NumPy as "np" is the community convention; functions are then accessed as np.array, etc. Any alias may be used, or we may simply write "import numpy", but then the functions must be accessed as numpy.array, etc.
Lists are containers for elements of differing data types, whereas arrays are containers for elements of the same data type.
myArray2 = np.array(myList2)
myArray3 = np.array(myList3)
myArray2 + myArray3 # element-wise addition
myArray2.dot(myArray3) # dot (inner) product
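For the arrays above, the element-wise sum is array([5, 7, 9]) and the dot product is 1*4 + 2*5 + 3*6 = 32. A small sketch showing a 2-D array (matrix) built from the same lists:
myMatrix = np.array([myList2, myList3]) # a 2x3 matrix
myMatrix.shape # (2, 3): 2 rows, 3 columns
myMatrix.T # transpose: a 3x2 matrix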
import pandas as pd # the pandas library is designed for quick and easy data manipulation, reading, aggregation, and visualization
import numpy as np # NumPy is used to process arrays that store values of the same data type; it facilitates math operations on arrays and their vectorization
import matplotlib.pyplot as plt # to plot histograms and other statistical graphs
import os # the os module provides functions for creating and removing directories (folders), fetching their contents, and changing and identifying the current working directory
# Importing the libraries like this is the right way to access them; in other Python IDEs the commands above might not work unless the packages are installed first.
os.chdir("C:/Desktop/DataAnalysis/") # changes the current working directory to the required directory
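The DataFrame combined_PRAD used below is not loaded in this excerpt; a minimal sketch, assuming the data sit in a CSV file (the file name "combined_PRAD.csv" is a hypothetical placeholder):
combined_PRAD = pd.read_csv("combined_PRAD.csv") # hypothetical file name; adjust to your data
combined_PRAD.head() # shows the first five rows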
# .shape tells the shape of the data: how many rows and columns are present
combined_PRAD.shape
# Assign the numerical data to an "X" variable and the labels column to a "y" variable; these will be used in the next steps
X = combined_PRAD.iloc[:,:-1] # all columns except the last
y = combined_PRAD["labels"] # the class labels
# importing train_test_split
from sklearn.model_selection import train_test_split
X_train, X_test, Y_train, Y_test = train_test_split(X, y, test_size=0.30, random_state=42)
from sklearn.preprocessing import StandardScaler
sc = StandardScaler()
X_train = sc.fit_transform(X_train) # fit the scaler on the training data and transform it
X_test = sc.transform(X_test) # apply the same scaling (without refitting) to the test data
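The evaluation code below relies on several scikit-learn metric functions and on a trained K-Nearest Neighbors model (KNN), none of which are defined in this excerpt. A minimal sketch, assuming a binary classification task and default hyperparameters:
from sklearn.metrics import (accuracy_score, precision_score, recall_score,
                             f1_score, confusion_matrix, roc_auc_score, roc_curve)
from sklearn.neighbors import KNeighborsClassifier
KNN = KNeighborsClassifier(n_neighbors=5) # assumed hyperparameter (scikit-learn default)
KNN.fit(X_train, Y_train) # fit on the scaled training data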
# Assuming you have trained your KNN model and have X_test and Y_test
# KNN is your trained K Nearest Neighbors model
# Get predictions from the KNN model
Y_pred = KNN.predict(X_test)
# Create the confusion matrix
cm = confusion_matrix(Y_test, Y_pred)
KNN_probs = KNN.predict_proba(X_test)
KNN_probs = KNN_probs[:, 1] # keep only the probabilities of the positive class
KNN_auc = roc_auc_score(Y_test, KNN_probs)
KNN_fpr, KNN_tpr, _ = roc_curve(Y_test, KNN_probs)
#SVC_linear
from sklearn.svm import SVC
svm_linear = SVC(kernel='linear', probability=True, random_state=40)
svm_linear.fit(X_train, Y_train)
prediction = svm_linear.predict(X_test)
Accuracy = accuracy_score(Y_test,prediction)
print('accuracy',Accuracy)
print(precision_score(Y_test,prediction))
print(recall_score(Y_test,prediction))
print(f1_score(Y_test,prediction))
# Assuming you have trained your SVM linear model and have X_test and Y_test
# svm_linear is your trained Support Vector Machine with linear kernel model
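The ROC quantities for the linear-kernel SVM are not computed in this excerpt, although svm_linear_probs and svm_linear_auc are used in the final comparison; a sketch following the same pattern as the other kernels:
svm_linear_probs = svm_linear.predict_proba(X_test)
svm_linear_probs = svm_linear_probs[:, 1] # probabilities of the positive class
svm_linear_auc = roc_auc_score(Y_test, svm_linear_probs)
svm_linear_fpr, svm_linear_tpr, _ = roc_curve(Y_test, svm_linear_probs)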
#SVC_poly
from sklearn.svm import SVC
# Training an SVM classifier with a polynomial kernel
svm_poly = SVC(kernel='poly', probability=True, random_state=40)
svm_poly.fit(X_train, Y_train)
prediction = svm_poly.predict(X_test)
Accuracy = accuracy_score(Y_test,prediction)
print('accuracy',Accuracy)
print(precision_score(Y_test,prediction))
print(recall_score(Y_test,prediction))
print(f1_score(Y_test,prediction))
Y_pred_svm_poly = svm_poly.predict(X_test)
svm_poly_probs = svm_poly.predict_proba(X_test)
svm_poly_probs = svm_poly_probs[:, 1]
svm_poly_auc = roc_auc_score(Y_test, svm_poly_probs)
svm_poly_fpr, svm_poly_tpr, _ = roc_curve(Y_test, svm_poly_probs)
#SVC_RBF
from sklearn.svm import SVC
# Training an SVM classifier with an RBF kernel
svm_rbf = SVC(kernel='rbf', probability=True, random_state=40)
svm_rbf.fit(X_train, Y_train)
prediction = svm_rbf.predict(X_test)
Accuracy = accuracy_score(Y_test,prediction)
print('accuracy',Accuracy)
print(precision_score(Y_test,prediction))
print(recall_score(Y_test,prediction))
print(f1_score(Y_test,prediction))
# Assuming you have trained your SVM RBF model and have X_test and Y_test
# svm_rbf is your trained SVM with RBF kernel model
svm_rbf_probs = svm_rbf.predict_proba(X_test)
svm_rbf_probs = svm_rbf_probs[:, 1]
svm_rbf_auc = roc_auc_score(Y_test, svm_rbf_probs)
svm_rbf_fpr, svm_rbf_tpr, _ = roc_curve(Y_test, svm_rbf_probs)
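The logistic regression model (LR) evaluated below is not trained anywhere in this excerpt; a minimal sketch, assuming default hyperparameters (max_iter raised to help convergence on the scaled data):
#Logistic_Regression
from sklearn.linear_model import LogisticRegression
LR = LogisticRegression(max_iter=1000, random_state=40) # assumed settings
LR.fit(X_train, Y_train)
prediction = LR.predict(X_test)
Accuracy = accuracy_score(Y_test, prediction)
print('accuracy', Accuracy)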
print(precision_score(Y_test,prediction))
print(recall_score(Y_test,prediction))
print(f1_score(Y_test,prediction))
# Assuming you have trained your logistic regression model and have X_test and Y_test
# LR is your trained logistic regression model
LR_probs = LR.predict_proba(X_test)
LR_probs = LR_probs[:, 1]
LR_auc = roc_auc_score(Y_test, LR_probs)
LR_fpr, LR_tpr, _ = roc_curve(Y_test, LR_probs)
#Naive_Bayes
from sklearn.naive_bayes import GaussianNB
NB = GaussianNB()
NB.fit(X_train, Y_train)
prediction = NB.predict(X_test)
Accuracy = accuracy_score(Y_test,prediction)
print('accuracy',Accuracy)
print(precision_score(Y_test,prediction))
print(recall_score(Y_test,prediction))
print(f1_score(Y_test,prediction))
# Assuming you have trained your Naive Bayes model and have X_test and Y_test
# NB is your trained Gaussian Naive Bayes model
NB_probs = NB.predict_proba(X_test)
NB_probs = NB_probs[:, 1]
NB_auc = roc_auc_score(Y_test, NB_probs)
NB_fpr, NB_tpr, _ = roc_curve(Y_test, NB_probs)
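The decision tree model (DT) evaluated below is likewise not trained in this excerpt; a minimal sketch, assuming default hyperparameters:
#Decision_Tree
from sklearn.tree import DecisionTreeClassifier
DT = DecisionTreeClassifier(random_state=40) # assumed settings
DT.fit(X_train, Y_train)
prediction = DT.predict(X_test)
Accuracy = accuracy_score(Y_test, prediction)
print('accuracy', Accuracy)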
print(precision_score(Y_test,prediction))
print(recall_score(Y_test,prediction))
print(f1_score(Y_test,prediction))
# Assuming you have trained your decision tree model and have X_test and Y_test
# DT is your trained decision tree model
DT_probs = DT.predict_proba(X_test)
DT_probs = DT_probs[:, 1]
DT_auc = roc_auc_score(Y_test, DT_probs)
DT_fpr, DT_tpr, _ = roc_curve(Y_test, DT_probs)
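The multi-layer perceptron (MLP) evaluated below is not trained in this excerpt; a minimal sketch, assuming default hyperparameters (max_iter raised to help convergence):
#MLP
from sklearn.neural_network import MLPClassifier
MLP = MLPClassifier(max_iter=1000, random_state=40) # assumed settings
MLP.fit(X_train, Y_train)
prediction = MLP.predict(X_test)
Accuracy = accuracy_score(Y_test, prediction)
print('accuracy', Accuracy)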
print(precision_score(Y_test,prediction))
print(recall_score(Y_test,prediction))
print(f1_score(Y_test,prediction))
# Assuming you have trained your MLP model and have X_test and Y_test
# MLP is your trained multi-layer perceptron model
MLP_probs = MLP.predict_proba(X_test)
MLP_probs = MLP_probs[:,1]
MLP_auc = roc_auc_score(Y_test, MLP_probs)
MLP_fpr, MLP_tpr, _ = roc_curve(Y_test, MLP_probs)
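The AdaBoost model (AB) used in the final comparison is not trained in this excerpt; a minimal sketch, assuming default hyperparameters, placed before the metrics block that appears to belong to it:
#AdaBoost
from sklearn.ensemble import AdaBoostClassifier
AB = AdaBoostClassifier(random_state=40) # assumed settings
AB.fit(X_train, Y_train)
prediction = AB.predict(X_test)
Accuracy = accuracy_score(Y_test, prediction)
print('accuracy', Accuracy)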
print(precision_score(Y_test,prediction))
print(recall_score(Y_test,prediction))
print(f1_score(Y_test,prediction))
# Assuming you have trained your AdaBoost model and have X_test and Y_test
# AB is your trained AdaBoost model
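The random forest model (RF), whose probabilities are computed just below, is also not trained in this excerpt; a minimal sketch, assuming default hyperparameters:
#Random_Forest
from sklearn.ensemble import RandomForestClassifier
RF = RandomForestClassifier(random_state=40) # assumed settings
RF.fit(X_train, Y_train)
prediction = RF.predict(X_test)
Accuracy = accuracy_score(Y_test, prediction)
print('accuracy', Accuracy)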
print(precision_score(Y_test,prediction))
print(recall_score(Y_test,prediction))
print(f1_score(Y_test,prediction))
# Assuming you have trained your random forest model and have X_test and Y_test
# RF is your trained random forest model
RF_probs = RF.predict_proba(X_test)
RF_probs = RF_probs[:,1]
RF_auc = roc_auc_score(Y_test, RF_probs)
RF_fpr, RF_tpr, _ = roc_curve(Y_test, RF_probs)
# Collect the positive-class probabilities from each trained model (roc_auc_score expects a 1-D array for binary labels)
KNN_probs = KNN.predict_proba(X_test)[:, 1]
AB_probs = AB.predict_proba(X_test)[:, 1]
DT_probs = DT.predict_proba(X_test)[:, 1]
LR_probs = LR.predict_proba(X_test)[:, 1]
RF_probs = RF.predict_proba(X_test)[:, 1]
NB_probs = NB.predict_proba(X_test)[:, 1]
MLP_probs = MLP.predict_proba(X_test)[:, 1]
svm_linear_probs = svm_linear.predict_proba(X_test)[:, 1]
svm_rbf_probs = svm_rbf.predict_proba(X_test)[:, 1]
svm_poly_probs = svm_poly.predict_proba(X_test)[:, 1]
# calculate scores
KNN_auc = roc_auc_score(Y_test, KNN_probs)
AB_auc = roc_auc_score(Y_test, AB_probs)
DT_auc = roc_auc_score(Y_test, DT_probs)
LR_auc = roc_auc_score(Y_test, LR_probs)
NB_auc = roc_auc_score(Y_test, NB_probs)
RF_auc = roc_auc_score(Y_test, RF_probs)
MLP_auc = roc_auc_score(Y_test, MLP_probs)
svm_poly_auc = roc_auc_score(Y_test, svm_poly_probs)
svm_linear_auc = roc_auc_score(Y_test, svm_linear_probs)
svm_rbf_auc = roc_auc_score(Y_test, svm_rbf_probs)
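To compare the classifiers visually, the ROC curves can be drawn on one figure with the matplotlib import from earlier; a minimal sketch using the positive-class probabilities and AUC scores computed above:
plt.figure(figsize=(8, 6))
models = [('KNN', KNN_probs, KNN_auc), ('AdaBoost', AB_probs, AB_auc),
          ('Decision Tree', DT_probs, DT_auc), ('Logistic Regression', LR_probs, LR_auc),
          ('Random Forest', RF_probs, RF_auc), ('Naive Bayes', NB_probs, NB_auc),
          ('MLP', MLP_probs, MLP_auc), ('SVM linear', svm_linear_probs, svm_linear_auc),
          ('SVM poly', svm_poly_probs, svm_poly_auc), ('SVM RBF', svm_rbf_probs, svm_rbf_auc)]
for name, probs, auc in models:
    fpr, tpr, _ = roc_curve(Y_test, probs) # ROC curve for each classifier
    plt.plot(fpr, tpr, label='%s (AUC = %.3f)' % (name, auc))
plt.plot([0, 1], [0, 1], linestyle='--', color='grey') # chance line
plt.xlabel('False Positive Rate')
plt.ylabel('True Positive Rate')
plt.legend()
plt.show()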