
Assignment 11 – Quantum Learning

Sandip University – Zeeshan Mawani

This notebook builds a neural network model to predict diamond prices. It loads the diamond price data, cleans and preprocesses it, splits it into training and test sets, and builds a network with an input layer, three hidden layers, and one output layer. The model is trained for 20 epochs and evaluated on the training and test sets using mean absolute error (MAE), reaching an MAE of about 318 on the training set and 323 on the test set.

In [1]: # import libraries and load the dataset
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn import preprocessing
from tensorflow.keras.layers import Input, Dense, Dropout, Activation
from tensorflow.keras.models import Model

reg_dataset = pd.read_csv(r'C:\Users\Zeeshan Mawani\Desktop\Jupyter\Class stuff\diamond_price.csv')

In [2]: reg_dataset.head()

Out[2]:
   S.No  carat      cut color clarity  depth  table  price     x     y     z
0     1   0.23    Ideal     E     SI2   61.5   55.0    326  3.95  3.98  2.43
1     2   0.21  Premium     E     SI1   59.8   61.0    326  3.89  3.84  2.31
2     3   0.23     Good     E     VS1   56.9   65.0    327  4.05  4.07  2.31
3     4   0.29  Premium     I     VS2   62.4   58.0    334  4.20  4.23  2.63
4     5   0.31     Good     J     SI2   63.3   58.0    335  4.34  4.35  2.75


In [3]: reg_dataset.drop('S.No', axis=1, inplace=True)
reg_dataset.head()

Out[3]:
   carat      cut color clarity  depth  table  price     x     y     z
0   0.23    Ideal     E     SI2   61.5   55.0    326  3.95  3.98  2.43
1   0.21  Premium     E     SI1   59.8   61.0    326  3.89  3.84  2.31
2   0.23     Good     E     VS1   56.9   65.0    327  4.05  4.07  2.31
3   0.29  Premium     I     VS2   62.4   58.0    334  4.20  4.23  2.63
4   0.31     Good     J     SI2   63.3   58.0    335  4.34  4.35  2.75

In [4]: reg_dataset.shape

Out[4]: (53940, 10)

In [5]: reg_dataset.describe()

Out[5]:
              carat         depth         table         price             x             y             z
count  53940.000000  53940.000000  53940.000000  53940.000000  53940.000000  53940.000000  53940.000000
mean       0.797940     61.749405     57.457184   3932.799722      5.731157      5.734526      3.538734
std        0.474011      1.432621      2.234491   3989.439738      1.121761      1.142135      0.705699
min        0.200000     43.000000     43.000000    326.000000      0.000000      0.000000      0.000000
25%        0.400000     61.000000     56.000000    950.000000      4.710000      4.720000      2.910000
50%        0.700000     61.800000     57.000000   2401.000000      5.700000      5.710000      3.530000
75%        1.040000     62.500000     59.000000   5324.250000      6.540000      6.540000      4.040000
max        5.010000     79.000000     95.000000  18823.000000     10.740000     58.900000     31.800000
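The summary statistics are worth a second look: the minimum of x, y, and z is 0 (a diamond cannot have a zero dimension), and the maxima of y (58.9) and z (31.8) are far outside the plausible range given that x tops out at 10.74, so some rows are measurement errors. The notebook keeps these rows; a minimal cleaning sketch, if one wanted to drop the zero-dimension rows, could look like this:

# Sketch only (not applied in this notebook's own flow): drop rows where any
# of the physical dimensions x, y, z is 0, since those are recording errors.
zero_dims = (reg_dataset[['x', 'y', 'z']] == 0).any(axis=1)
print('rows with a zero dimension:', zero_dims.sum())
cleaned = reg_dataset[~zero_dims]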


In [6]: # plot price vs. carat
sns.pairplot(reg_dataset, x_vars=['carat'], y_vars = ['price'])
# plot carat vs other Cs
sns.pairplot(reg_dataset, x_vars=['cut', 'clarity', 'color'], y_vars = ['carat'])
plt.show()

In [7]: sns.heatmap(reg_dataset[["carat","cut","color","clarity","depth","table"]].corr(), annot=True)

Out[7]: <AxesSubplot:>
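Note that cut, color, and clarity are string columns: older pandas versions silently drop them from .corr(), so the heatmap above in fact only shows the numeric columns, and pandas 2.x raises a TypeError instead. A hedged sketch, assuming the standard quality orderings for the three grading columns (an assumption introduced here, not part of the original notebook), that makes them usable in the correlation matrix:

# Ordinal encodings from worst to best grade, so the categorical columns can
# participate in the correlation matrix instead of being silently dropped.
cut_order = {'Fair': 0, 'Good': 1, 'Very Good': 2, 'Premium': 3, 'Ideal': 4}
color_order = {c: i for i, c in enumerate('JIHGFED')}  # J (worst) .. D (best)
clarity_order = {'I1': 0, 'SI2': 1, 'SI1': 2, 'VS2': 3, 'VS1': 4,
                 'VVS2': 5, 'VVS1': 6, 'IF': 7}

encoded = reg_dataset.assign(
    cut=reg_dataset.cut.map(cut_order),
    color=reg_dataset.color.map(color_order),
    clarity=reg_dataset.clarity.map(clarity_order),
)
sns.heatmap(encoded[['carat', 'cut', 'color', 'clarity', 'depth', 'table']].corr(),
            annot=True)
plt.show()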

In [8]: list(reg_dataset.cut.unique())

Out[8]: ['Ideal', 'Premium', 'Good', 'Very Good', 'Fair']

In [9]: numeric_data = reg_dataset.drop(['cut', 'color', 'clarity'], axis=1)

In [10]: cut_onehot = pd.get_dummies(reg_dataset.cut).iloc[:, 1:]
color_onehot = pd.get_dummies(reg_dataset.color).iloc[:, 1:]
clarity_onehot = pd.get_dummies(reg_dataset.clarity).iloc[:, 1:]
In [11]: cut_onehot.head()

Out[11]:
   Good  Ideal  Premium  Very Good
0     0      1        0          0
1     0      0        1          0
2     1      0        0          0
3     0      0        1          0
4     1      0        0          0
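The iloc[:, 1:] slice drops the first dummy column of each encoding, which avoids keeping a set of perfectly collinear one-hot columns (the dummy-variable trap). pandas can do the same in a single call; a small equivalent sketch (note this variant prefixes each dummy with its source column name, e.g. cut_Good, unlike the unprefixed columns above):

# One-call equivalent of the three get_dummies(...).iloc[:, 1:] lines above;
# drop_first=True removes the first category level of each column.
onehot = pd.get_dummies(reg_dataset[['cut', 'color', 'clarity']], drop_first=True)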

In [12]: reg_dataset = pd.concat([numeric_data, cut_onehot, color_onehot, clarity_onehot], axis=1)
reg_dataset.head()

Out[12]:
   carat  depth  table  price     x     y     z  Good  Ideal  Premium  ...  H  I  J  IF  SI1  SI2  VS1  VS2  VVS1  VVS2
0   0.23   61.5   55.0    326  3.95  3.98  2.43     0      1        0  ...  0  0  0   0    0    1    0    0     0     0
1   0.21   59.8   61.0    326  3.89  3.84  2.31     0      0        1  ...  0  0  0   0    1    0    0    0     0     0
2   0.23   56.9   65.0    327  4.05  4.07  2.31     1      0        0  ...  0  0  0   0    0    0    1    0     0     0
3   0.29   62.4   58.0    334  4.20  4.23  2.63     0      0        1  ...  0  1  0   0    0    0    0    1     0     0
4   0.31   63.3   58.0    335  4.34  4.35  2.75     1      0        0  ...  0  0  1   0    0    1    0    0     0     0

5 rows × 24 columns

In [13]: features = reg_dataset.drop(['price'], axis=1).values
labels = reg_dataset['price'].values

In [14]: from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(features, labels, test_size=0.2, random_state=40)

In [15]: from sklearn.preprocessing import StandardScaler
sc = StandardScaler()
X_train = sc.fit_transform(X_train)  # fit the scaler on the training set only
X_test = sc.transform(X_test)        # reuse training statistics to avoid test-set leakage

In [16]: input_layer = Input(shape=(features.shape[1],))
l1 = Dense(100, activation='relu')(input_layer)
l2 = Dense(50, activation='relu')(l1)
l3 = Dense(25, activation='relu')(l2)
output = Dense(1)(l3)  # linear activation: the model predicts a continuous price

In [17]: model = Model(inputs=input_layer, outputs=output)
model.compile(loss="mean_absolute_error", optimizer="adam", metrics=["mean_absolute_error"])
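The same architecture can also be written with Keras's Sequential API; a minimal equivalent sketch (seq_model is not used anywhere else in the notebook):

# Same stack as the functional definition above: three ReLU hidden layers
# and a single linear output unit for the regression target.
from tensorflow.keras.models import Sequential

seq_model = Sequential([
    Input(shape=(features.shape[1],)),
    Dense(100, activation='relu'),
    Dense(50, activation='relu'),
    Dense(25, activation='relu'),
    Dense(1),
])
seq_model.compile(loss='mean_absolute_error', optimizer='adam',
                  metrics=['mean_absolute_error'])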

In [18]: print(model.summary())

Model: "model"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
input_1 (InputLayer)         [(None, 23)]              0
dense (Dense)                (None, 100)               2400
dense_1 (Dense)              (None, 50)                5050
dense_2 (Dense)              (None, 25)                1275
dense_3 (Dense)              (None, 1)                 26
=================================================================
Total params: 8,751
Trainable params: 8,751
Non-trainable params: 0
_________________________________________________________________
None
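The parameter counts follow from (inputs + 1) × units for each Dense layer: the 23 scaled features feed 100 units (23 × 100 + 100 = 2,400), then 100 × 50 + 50 = 5,050, then 50 × 25 + 25 = 1,275, and finally 25 × 1 + 1 = 26, for 8,751 trainable parameters in total.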

In [19]: history = model.fit(X_train, y_train, batch_size=2, epochs=20, verbose=1, validation_split=0.2)

Epoch 14/20
17261/17261 [==============================] - 39s 2ms/step - loss: 324.2164 - mean_absolute_error: 324.2164 - val_loss: 352.5723 - val_mean_absolute_error: 352.5723
Epoch 15/20
17261/17261 [==============================] - 39s 2ms/step - loss: 322.4259 - mean_absolute_error: 322.4259 - val_loss: 363.4964 - val_mean_absolute_error: 363.4964
Epoch 16/20
17261/17261 [==============================] - 39s 2ms/step - loss: 321.3640 - mean_absolute_error: 321.3640 - val_loss: 351.7753 - val_mean_absolute_error: 351.7753
Epoch 17/20
17261/17261 [==============================] - 39s 2ms/step - loss: 318.6007 - mean_absolute_error: 318.6007 - val_loss: 351.6654 - val_mean_absolute_error: 351.6654
Epoch 18/20
17261/17261 [==============================] - 39s 2ms/step - loss: 317.5522 - mean_absolute_error: 317.5522 - val_loss: 360.9181 - val_mean_absolute_error: 360.9181
Epoch 19/20
17261/17261 [==============================] - 39s 2ms/step - loss: 315.9409 - mean_absolute_error: 315.9409 - val_loss: 362.9208 - val_mean_absolute_error: 362.9208
Epoch 20/20
17261/17261 [==============================] - 39s 2ms/step - loss: 313.7580 - mean_absolute_error: 313.7580 - val_loss: 350.9129 - val_mean_absolute_error: 350.9129
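The History object returned by fit is captured but never inspected; a short sketch for plotting the training and validation MAE curves it recorded:

# Plot training vs. validation MAE per epoch from the captured History object.
plt.plot(history.history['mean_absolute_error'], label='train MAE')
plt.plot(history.history['val_mean_absolute_error'], label='validation MAE')
plt.xlabel('epoch')
plt.ylabel('mean absolute error')
plt.legend()
plt.show()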

In [20]: from sklearn.metrics import mean_absolute_error
from math import sqrt

pred_train = model.predict(X_train)
print(mean_absolute_error(y_train, pred_train))

pred = model.predict(X_test)
print(mean_absolute_error(y_test, pred))

318.7477203046652
323.489523630563
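sqrt is imported in In [20] but never used, which suggests a root-mean-squared-error check was also intended; a hedged sketch of that follow-up evaluation:

# RMSE weights large errors more heavily than MAE. This follow-up is an
# assumption based on the unused sqrt import, not part of the original run.
from sklearn.metrics import mean_squared_error

print('train RMSE:', sqrt(mean_squared_error(y_train, pred_train)))
print('test RMSE: ', sqrt(mean_squared_error(y_test, pred)))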
