
Machine Learning Lab Assignment CSE-716: S. M. Shafkat Raihan ID: 16701041 SESSION: 2015-16

The document describes implementing an artificial neural network classification method using Python. It involves:
1. Loading training and test data from CSV files and preprocessing the data using label encoding.
2. Splitting the data into input and output columns and building a sequential neural network model with multiple dense layers.
3. Compiling and training the model, then making predictions on the test data and evaluating them using the accuracy score and a confusion matrix.
The output shows an accuracy of 75% on the test data, with the confusion matrix showing one sample correctly classified as "No" and two samples correctly classified as "Yes".

Uploaded by

Enaqkat Painav
Copyright
© All Rights Reserved

MACHINE LEARNING LAB ASSIGNMENT

CSE-716
S. M. SHAFKAT RAIHAN
ID: 16701041
SESSION: 2015-16
PROBLEM DESCRIPTION

Implementing the Artificial Neural Network Classification Method Using the Python Language
FULL DATASET
age          income  isStudent  credit_rating  Buys_Computer
youth        high    No         Fair           No
youth        high    No         Excellent      No
middle_aged  high    No         Fair           Yes
senior       medium  No         Fair           Yes
senior       low     Yes        Fair           Yes
senior       low     Yes        Excellent      No
middle_aged  low     Yes        Excellent      Yes
youth        medium  No         Fair           No
youth        low     Yes        Fair           Yes
senior       medium  Yes        Fair           Yes
youth        medium  Yes        Excellent      Yes
middle_aged  medium  No         Excellent      Yes
middle_aged  high    Yes        Fair           Yes
senior       medium  No         Excellent      No
TRAINING DATA

age          income  isStudent  credit_rating  class
youth        high    No         Fair           No
youth        high    No         Excellent      No
middle_aged  high    No         Fair           Yes
senior       medium  No         Fair           Yes
senior       low     Yes        Fair           Yes
senior       low     Yes        Excellent      No
middle_aged  low     Yes        Excellent      Yes
youth        medium  No         Fair           No
youth        low     Yes        Fair           Yes
senior       medium  Yes        Fair           Yes
TEST DATA

age          income  isStudent  credit_rating  class
youth        medium  Yes        Excellent      Yes
middle_aged  medium  No         Excellent      Yes
middle_aged  high    Yes        Fair           Yes
senior       medium  No         Excellent      No


CODE & EXPLANATION
import pandas as pd  # Data analysis library; provides read_csv() for loading the CSV files
from sklearn.preprocessing import LabelEncoder  # Converts categorical text data into integer codes the model can use
from sklearn.metrics import confusion_matrix, accuracy_score  # For computing the accuracy and the confusion matrix
import numpy  # Mathematical operations on multidimensional arrays
from keras.models import Sequential  # Keras is a Python deep learning library; Sequential builds a linear stack of layers of neurons
from keras.layers import Dense  # Dense is a layer class that implements a densely (fully) connected layer
numpy.random.seed(7)  # Seed NumPy's random number generator with the value 7 so that runs are repeatable

train = pd.read_csv('C:/Users/USER/Desktop/Testing_ML_Lab_Files/data/Training2.csv')  # Read the training data using the pandas function read_csv()
test = pd.read_csv('C:/Users/USER/Desktop/Testing_ML_Lab_Files/data/UnknownData.csv')  # Read the test data using the pandas function read_csv()

# LabelEncoder object that converts categorical text data into numeric codes.
number = LabelEncoder()
# Convert the categorical values of every training column to numeric codes.
for i in train:
    # fit() learns the distinct labels present in the column.
    # transform() replaces each label with its integer code.
    # fit_transform() joins these two operations.
    # astype('str') converts all attribute values to strings first.
    train[i] = number.fit_transform(train[i].astype('str'))
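The encoding step can be illustrated without any library. The sketch below assumes that the encoder assigns integer codes to the distinct labels in sorted order, which is how scikit-learn's LabelEncoder is understood to behave; the helper names fit_label_codes and encode are hypothetical and not part of the assignment. It uses the income column from the training and test tables to show that codes learned from one file need not match codes learned from another.

```python
def fit_label_codes(values):
    """Map each distinct label to an integer code, in sorted order."""
    return {label: code for code, label in enumerate(sorted(set(values)))}


def encode(values, codes):
    """Replace every label with its integer code."""
    return [codes[v] for v in values]


# income column of the training table and of the test table
train_income = ["high", "high", "high", "medium", "low",
                "low", "low", "medium", "low", "medium"]
test_income = ["medium", "medium", "high", "medium"]

train_codes = fit_label_codes(train_income)  # mapping learned from training data
refit_codes = fit_label_codes(test_income)   # mapping learned from test data alone

print(train_codes)                       # {'high': 0, 'low': 1, 'medium': 2}
print(refit_codes)                       # {'high': 0, 'medium': 1}
print(encode(test_income, train_codes))  # [2, 2, 0, 2]
print(encode(test_income, refit_codes))  # [1, 1, 0, 1]
```

Because "low" never occurs in the test file, refitting on the test column gives "medium" the code 1 instead of 2. Reusing the mapping learned from the training data keeps the codes consistent between the two files.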
CODE & EXPLANATION
# Split input and output columns; x = input columns, y = output column.
x_train = train.iloc[:, :-1]  # train.iloc[:, :-1] selects all rows and every column except the last
y_train = train.iloc[:, -1]   # train.iloc[:, -1] selects all rows and only the last column
# Do the same for the test dataset.
# Note: the encoder is refitted on the test values alone, so a column whose test values
# differ from the training values (income has no "low" in the test data) gets different codes.
for i in test:
    test[i] = number.fit_transform(test[i].astype('str'))
x_test = test.iloc[:, :-1]  # Same selection as for x_train
y_test = test.iloc[:, -1]   # Same selection as for y_train

model = Sequential()  # Create a sequential ANN model.

# Add the first layer: dense, with input shape (*, 4) and output shape (*, 10).
# The rectified linear unit (ReLU) is used as the activation function.
model.add(Dense(10, input_dim=4, activation='relu'))
model.add(Dense(4, activation='relu'))  # Add the second layer; output shape = (*, 4)
# Add the output layer; output shape = (*, 1) for an output between 0 and 1.
# The sigmoid (logistic) activation function is used.
model.add(Dense(1, activation='sigmoid'))
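To make the layer shapes concrete, the following is a minimal sketch of the same 4-10-4-1 architecture in plain Python. It is not the Keras implementation: the weights are drawn at random from an arbitrary range, the biases start at zero, and the names count_parameters, dense and random_layer are hypothetical. It counts the trainable parameters of each layer and pushes one encoded sample through the ReLU, ReLU and sigmoid layers.

```python
import math
import random

LAYER_SIZES = [4, 10, 4, 1]  # inputs, first layer, second layer, output layer


def count_parameters(sizes):
    """Weights plus biases of each fully connected layer."""
    return [n_in * n_out + n_out for n_in, n_out in zip(sizes, sizes[1:])]


def relu(z):
    return max(0.0, z)


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def dense(inputs, weights, biases, activation):
    """One dense layer: activation(w . x + b) for every output unit."""
    return [activation(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]


def random_layer(n_in, n_out, rng):
    """Random weights in [-0.5, 0.5] (an arbitrary choice) and zero biases."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


rng = random.Random(7)
layers = [random_layer(n_in, n_out, rng)
          for n_in, n_out in zip(LAYER_SIZES, LAYER_SIZES[1:])]
activations = [relu, relu, sigmoid]

# First training row (youth, high, No, Fair), assuming sorted label codes
sample = [2.0, 0.0, 0.0, 1.0]
out = sample
for (weights, biases), activation in zip(layers, activations):
    out = dense(out, weights, biases, activation)

params = count_parameters(LAYER_SIZES)
print(params, sum(params))  # [50, 44, 5] 99
print(out)                  # one value strictly between 0 and 1
```

The sigmoid in the last layer is what confines the output to the interval (0, 1), so it can be read as the estimated probability of the class "Yes".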
CODE & EXPLANATION
# compile() configures the model for training.
# The loss function 'binary_crossentropy' is used as there are just 2 class labels.
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])

# Train the model: epochs = 150 is the number of passes over the training data,
# batch_size = 10 is the number of training examples used per weight update.
model.fit(x_train, y_train, epochs=150, batch_size=10)
predictions = model.predict(x_test)  # Predict on the test data using the trained model
# The results are rounded and converted to integers because the y_test values are integers.
predicted = [int(round(x[0])) for x in predictions]

# Build the confusion matrix (rows = actual value, columns = predicted value):
#                 Predicted No      Predicted Yes
#   Actual No     True Negative     False Positive
#   Actual Yes    False Negative    True Positive
cfm = confusion_matrix(y_test, predicted)

# Calculate the accuracy: Accuracy = (TP + TN) / total number of samples
acc = accuracy_score(y_test, predicted)

# Print the accuracy and the confusion matrix
print('Accuracy:', acc)
print('Prediction No Yes')
print(' No {} {}'.format(cfm[0][0], cfm[0][1]))
print(' Yes {} {}'.format(cfm[1][0], cfm[1][1]))
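The training settings can be checked with a little arithmetic. The sketch below is only an illustration: it writes out the binary cross-entropy formula by hand and counts weight updates on the assumption that one update happens per batch, with a final partial batch still counting. The probability lists are hypothetical values, not outputs of the trained model, and the function names are made up for this example.

```python
import math


def binary_cross_entropy(y_true, y_prob, eps=1e-7):
    """Mean of -[y*log(p) + (1 - y)*log(1 - p)] over all samples."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)  # clip so that log(0) never occurs
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)


def updates_per_epoch(n_samples, batch_size):
    """Number of batches, and so weight updates, in one pass over the data."""
    return math.ceil(n_samples / batch_size)


y_true = [1, 1, 1, 0]              # test classes: Yes, Yes, Yes, No
confident = [0.9, 0.8, 0.7, 0.2]   # hypothetical predicted probabilities
unsure = [0.6, 0.5, 0.4, 0.5]      # hypothetical predicted probabilities

loss_confident = binary_cross_entropy(y_true, confident)
loss_unsure = binary_cross_entropy(y_true, unsure)
print(loss_confident < loss_unsure)  # True: better probabilities give a lower loss

n_train, batch_size, epochs = 10, 10, 150
steps = updates_per_epoch(n_train, batch_size)
print(steps, steps * epochs)  # 1 150
```

With 10 training rows and a batch size of 10, each epoch consists of a single batch, so 150 epochs correspond to 150 weight updates.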


OUTPUT & EXPLANATION
Output

Accuracy: 0.75

Prediction No Yes

No 1 0
Yes 1 2

Explanation
Total number of samples in the test set = 4
Adding the elements of the principal diagonal of the confusion matrix gives the number of correctly identified negative and positive samples: TN + TP = 1 + 2 = 3
Accuracy = (TP + TN) / Total = 3/4 = 0.75
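The same calculation can be reproduced by counting label pairs directly. This is a minimal sketch with a hypothetical helper confusion_counts; it assumes the classes are coded 0 = No and 1 = Yes. The predicted list is hypothetical as well: the output reports only that one actual "Yes" was predicted as "No", not which sample it was.

```python
def confusion_counts(actual, predicted):
    """Return [[TN, FP], [FN, TP]] for labels coded 0 = No, 1 = Yes."""
    matrix = [[0, 0], [0, 0]]
    for a, p in zip(actual, predicted):
        matrix[a][p] += 1  # row = actual class, column = predicted class
    return matrix


actual = [1, 1, 1, 0]     # test classes: Yes, Yes, Yes, No
predicted = [1, 0, 1, 0]  # hypothetical order; one Yes is predicted as No

cfm = confusion_counts(actual, predicted)
(tn, fp), (fn, tp) = cfm
accuracy = (tp + tn) / len(actual)

print(cfm)       # [[1, 0], [1, 2]]
print(accuracy)  # 0.75
```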
