KRAI Lab Manual
Lab Journal
By
Student Name
Professor Name
College Name
INDEX
Sr. No. Title (Page No.)
1. Find the correlation matrix. (3)
2. Plot the correlation plot on dataset and visualize giving an overview of relationships among data on iris data. (5)
3. Analysis of covariance: variance (ANOVA), if data have categorical variables on iris data. (12)
4. Apply linear regression Model techniques to predict the data on any dataset.
5. Apply logistic regression Model techniques to predict the data on any dataset. (14)
6. Clustering algorithms for unsupervised classification. (22)
7. Association algorithms for supervised classification on any dataset. (28)
8. Developing and implementing Decision Tree model on the dataset. (33)
9. Bayesian classification on any dataset. (35)
10. SVM classification on any dataset. (38)
11. Text Mining algorithms on unstructured dataset.
12. Plot the cluster data using python visualizations. (47)
13. Creating & Visualizing Neural Network for the given data. (Use python) (52)
14. Recognize optical character using ANN. (55)
15. Write a program to implement CNN. (61)
16. Write a program to implement RNN. (68)
17. Write a program to implement GAN.
18. Web scraping experiments (by using tools). (74)
1. Find the correlation matrix
import scipy.stats as st
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
%matplotlib inline
In [4]:
data=pd.read_excel("E:/MCA/Sem3/AI_ML/Practical/Sample - Superstore.xls")
In [5]:
np.corrcoef(data['Sales'],data['Profit'])
Out[6]:
array([[1. , 0.47906435],
[0.47906435, 1. ]])
In [7]:
data.corr()
Out[7]:
(correlation matrix table omitted in the transcript)
In [8]:
sns.heatmap(data.corr())
Out[8]:
<AxesSubplot:>
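On recent pandas releases (2.0 and later), DataFrame.corr no longer silently drops non-numeric columns, so data.corr() on the full Superstore frame raises an error. A version-proof sketch of the same heatmap, assuming the data frame loaded above:
numeric = data.select_dtypes(include='number')   # keep only numeric columns before computing Pearson correlations
sns.heatmap(numeric.corr(), annot=True)          # annot prints each coefficient inside its cell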
2. Plot the correlation plot on dataset and visualize giving an overview of
relationships among data on iris data
In [2]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
import seaborn as sns
from sklearn import metrics
sns.set()
In [12]:
iris_data=pd.read_csv('E:/MCA/Sem3/AI_ML/Practical/iris.csv')
iris_data
Out[12]:
sepal_length sepal_width petal_length petal_width species
(150 rows x 5 columns)
iris_data.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 150 entries, 0 to 149
Data columns (total 5 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 sepal_length 150 non-null float64
1 sepal_width 150 non-null float64
2 petal_length 150 non-null float64
3 petal_width 150 non-null float64
4 species 150 non-null object
dtypes: float64(4), object(1)
memory usage: 6.0+ KB
In [5]:
iris_data.describe()
Out[5]:
sepal_length sepal_width petal_length petal_width
(summary statistics rows omitted in the transcript)
In [6]:
iris_data[iris_data.duplicated()]
Out[6]:
sepal_length sepal_width petal_length petal_width species
In [8]:
iris_data['species'].value_counts()
Out[8]:
setosa 50
versicolor 50
virginica 50
Name: species, dtype: int64
In [15]:
sns.pairplot(iris_data,hue='species',height=4)
Out[15]:
<seaborn.axisgrid.PairGrid at 0x20b16a147f0>
In [21]:
plt.figure(figsize=(10,11))
sns.heatmap(iris_data.corr(),annot=True)
plt.plot()
Out[21]:
[]
In [22]:
iris_data.groupby('species').agg(['mean','median'])
Out[22]:
(per-species mean and median table omitted in the transcript)
import numpy as np
import pandas as pd
In [4]:
df=pd.read_csv('E:/MCA/Sem3/AI_ML/Practical/iris.csv')
In [5]:
df.head()
Out[5]:
sepal_length sepal_width petal_length petal_width species
5. Apply logistic regression Model techniques to predict the data on any dataset
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import MinMaxScaler
In [40]:
df=pd.read_csv('E:/MCA/Sem3/AI_ML/Practical/Social_Network_Ads.csv')
In [36]:
X=df[['Age','EstimatedSalary']]
y=df['Purchased']
In [37]:
model=LogisticRegression()
In [12]:
model.fit(X,y)
Out[12]:
LogisticRegression()
In [13]:
Scaled_Age=(df['Age']-df['Age'].min()) / (df['Age'].max()-df['Age'].min())
Scaled_Salary=(df['EstimatedSalary']-df['EstimatedSalary'].min()) / (df['EstimatedSalary'].max()-df['EstimatedSalary'].min())
In [14]:
X=pd.concat([Scaled_Age,Scaled_Salary],axis=1)
y=df['Purchased']
In [15]:
model_scaled = LogisticRegression()
model_scaled.fit(X,y)
Out[15]:
LogisticRegression()
In [16]:
def get_scaled(pt):
    age,sal = pt[0],pt[1]
    sc_age=(age-df['Age'].min()) / (df['Age'].max()-df['Age'].min())
    sc_sal=(sal-df['EstimatedSalary'].min()) / (df['EstimatedSalary'].max()-df['EstimatedSalary'].min())
    return sc_age,sc_sal
In [17]:
q1=get_scaled([52,130000])
q2=get_scaled([25,40000])
In [18]:
model_scaled.predict([q1])
Out[18]:
array([1], dtype=int64)
In [19]:
model_scaled.predict([q2])
Out[19]:
array([0], dtype=int64)
In [20]:
X = df[['Age','EstimatedSalary']]
scaler = MinMaxScaler()
scaler.fit(X)
X_scaled = scaler.transform(X)
In [22]:
X_scaled
Out[22]:
array([[0.02380952, 0.02962963],
[0.4047619 , 0.03703704],
[0.19047619, 0.20740741],
[0.21428571, 0.31111111],
[0.02380952, 0.45185185],
[0.21428571, 0.31851852],
[0.21428571, 0.51111111],
[0.33333333, 1. ],
[0.16666667, 0.13333333],
[0.4047619 , 0.37037037],
[0.19047619, 0.48148148],
[0.19047619, 0.27407407],
[0.04761905, 0.52592593],
[0.33333333, 0.02222222],
[0. , 0.4962963 ],
[0.26190476, 0.48148148],
[0.69047619, 0.07407407],
[0.64285714, 0.08148148],
[0.66666667, 0.0962963 ],
[0.71428571, 0.1037037 ],
[0.64285714, 0.05185185],
[0.69047619, 0.25185185],
[0.71428571, 0.19259259],
[0.64285714, 0.05185185],
[0.66666667, 0.05925926],
[0.69047619, 0.03703704],
[0.73809524, 0.0962963 ],
[0.69047619, 0.11111111],
[0.26190476, 0.20740741],
[0.30952381, 0.02222222],
[0.30952381, 0.43703704],
[0.21428571, 0.9037037 ],
[0.07142857, 0.00740741],
[0.23809524, 0.21481481],
[0.21428571, 0.55555556],
[0.4047619 , 0.08888889],
[0.35714286, 0.0962963 ],
.
.
.
.
[0.57142857, 0.36296296],
[0.71428571, 0.13333333],
[0.61904762, 0.91851852],
[0.73809524, 0.0962963 ],
[0.92857143, 0.13333333],
[0.9047619 , 0.33333333],
[0.73809524, 0.17777778],
[0.5 , 0.41481481],
[0.69047619, 0.14074074],
[0.71428571, 0.14814815],
[0.71428571, 0.13333333],
[0.69047619, 0.05925926],
[0.64285714, 0.22222222],
[1. , 0.2 ],
[0.5 , 0.32592593],
[0.66666667, 0.19259259],
[0.78571429, 0.05925926],
[0.76190476, 0.03703704],
[0.42857143, 0.13333333],
[0.73809524, 0.15555556]])
In [23]:
model = LogisticRegression()
model.fit(X_scaled,df['Purchased'])
Out[23]:
LogisticRegression()
In [24]:
model.score(X_scaled,df['Purchased'])
Out[24]:
0.83
In [25]:
y_pre=model.predict(X_scaled)
In [26]:
y_act=df['Purchased']
In [27]:
y_pre
Out[27]:
array([0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 1, 1, 0, 1, 0, 1, 1, 1, 0, 1, 1, 1, 0, 0, 1, 1, 0, 1, 1,
0, 0, 1, 1, 0, 0, 1, 1, 0, 0, 1, 0, 1, 1, 1, 1, 0, 0, 1, 1, 1, 0,
1, 1, 0, 1, 0, 1, 0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 1, 1, 0, 1, 1, 0,
1, 1, 0, 0, 1, 0, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 0, 0, 1, 1, 0,
0, 1, 0, 0, 1, 1, 1, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 0, 0, 0, 1, 1,
1, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 1, 1, 0, 0, 1, 0, 0, 0, 0, 1,
0, 1, 0, 0, 1, 0, 1, 0, 0, 1, 1, 0, 0, 1, 1, 0, 1, 1, 0, 0, 1, 0,
1, 0, 0, 1, 1, 0, 0, 0, 1, 1, 0, 0, 1, 1, 1, 1, 0, 1, 1, 1, 0, 1,
0, 0, 1, 0, 0, 1, 0, 0, 1, 0, 1, 1, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0,
0, 0, 0, 0], dtype=int64)
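Accuracy alone (0.83 above) does not show which class the errors fall in; a confusion matrix over these predictions gives that split. A minimal sketch using sklearn.metrics and the y_act and y_pre arrays above:
from sklearn.metrics import confusion_matrix
print(confusion_matrix(y_act, y_pre))   # rows: actual 0/1, columns: predicted 0/1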
In [28]:
from sklearn.model_selection import train_test_split
# the split cell is missing from the transcript; the default 75/25 split is assumed
X_train,X_test,y_train,y_test = train_test_split(X,y,random_state=0)
scaler = MinMaxScaler()
scaler.fit(X_train)
X_train_scaled=scaler.transform(X_train)
In [31]:
model = LogisticRegression()
model.fit(X_train_scaled,y_train)
Out[31]:
LogisticRegression()
In [32]:
train_score=model.score(X_train_scaled,y_train)
train_score
Out[32]:
0.81
In [33]:
X_test_scaled=scaler.transform(X_test)
test_score=model.score(X_test_scaled,y_test)
test_score
Out[33]:
0.88
6. Clustering algorithms for unsupervised classification
In [1]:
import pandas as pd
import matplotlib.pyplot as plt
%matplotlib inline
from sklearn.cluster import KMeans
In [4]:
df=pd.read_csv('E:/MCA/Sem3/AI_ML/Practical/Mall_Customers_dataset.csv')
In [5]:
df.head()
Out[5]:
CustomerID Genre Age Annual Income (k$) Spending Score (1-100)
0 1 Male 19 15 39
1 2 Male 21 15 81
2 3 Female 20 16 6
3 4 Female 23 16 77
4 5 Female 31 17 40
In [6]:
X=df[['Annual Income (k$)','Spending Score (1-100)']]   # selection cell reconstructed; only these two columns appear below
X
Out[6]:
Annual Income (k$) Spending Score (1-100)
0 15 39
1 15 81
2 16 6
3 16 77
4 17 40
... ... ...
195 120 79
196 126 28
197 126 74
198 137 18
199 137 83
[200 rows x 2 columns]
In [7]:
plt.scatter(X['Annual Income (k$)'],X['Spending Score (1-100)'])   # scatter cell reconstructed from Out[7]
Out[7]:
<matplotlib.collections.PathCollection at 0x27923480310>
In [8]:
model = KMeans(n_clusters=5)
model.fit(X)
Out[9]:
KMeans(n_clusters=5)
In [10]:
model.cluster_centers_
Out[10]:
array([[86.53846154, 82.12820513],
[26.30434783, 20.91304348],
[55.2962963 , 49.51851852],
[88.2 , 17.11428571],
[25.72727273, 79.36363636]])
In [11]:
cluster_number = model.predict(X)
In [12]:
len(cluster_number)
Out[12]:
200
In [13]:
c0 = X[cluster_number==0]
c1 = X[cluster_number==1]
c2 = X[cluster_number==2]
c3 = X[cluster_number==3]
c4 = X[cluster_number==4]
In [14]:
# cluster plotting cell reconstructed; the original code is missing from the transcript
plt.scatter(c0['Annual Income (k$)'],c0['Spending Score (1-100)'])
plt.scatter(c1['Annual Income (k$)'],c1['Spending Score (1-100)'])
plt.scatter(c2['Annual Income (k$)'],c2['Spending Score (1-100)'])
plt.scatter(c3['Annual Income (k$)'],c3['Spending Score (1-100)'])
plt.scatter(c4['Annual Income (k$)'],c4['Spending Score (1-100)'])
Out[14]:
<matplotlib.collections.PathCollection at 0x27926235f70>
In [15]:
model.inertia_
Out[15]:
44448.45544793369
In [19]:
WCSS =[]
for i in range(1,11):
    model = KMeans(n_clusters=i)
    model.fit(X)
    WCSS.append(model.inertia_)
In [17]:
WCSS
Out[17]:
[269981.28000000014,
181363.59595959607,
106348.37306211119,
73679.78903948837,
44448.45544793369,
37265.86520484345,
30259.657207285458,
25011.920255473764,
21818.11458845217,
19657.783608703947]
In [18]:
plt.plot(range(1,11),WCSS,marker = 'x')
Out[18]:
[<matplotlib.lines.Line2D at 0x279263080d0>]
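The elbow at k = 5 is read off the plot by eye; the silhouette score offers a numeric cross-check. A minimal sketch over the same X (sklearn.metrics.silhouette_score is a standard helper; the range of k values is an arbitrary choice):
from sklearn.metrics import silhouette_score
for k in range(2,11):
    labels = KMeans(n_clusters=k).fit_predict(X)
    print(k, silhouette_score(X, labels))   # higher is better; it should peak near the natural cluster count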
3. Analysis of covariance: variance (ANOVA), if data have categorical variables
import numpy as np
import pandas as pd
import scipy.stats as stats
from statsmodels.formula.api import ols
np.random.seed(12)
races = ["asian","black","hispanic","other","white"]
# generate 1000 synthetic voter ages from a shifted Poisson distribution
voter_age = stats.poisson.rvs(loc=18,
                              mu=30,
                              size=1000)
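The cells that build the categorical race column and voter_frame, and the stats.f_oneway call that the "alternate method" comment below refers to, are missing from the transcript; a minimal reconstruction, with the sampling weights an assumption:
voter_race = np.random.choice(a=races,
                              p=[0.05, 0.15, 0.25, 0.05, 0.5],   # assumed group weights
                              size=1000)
voter_frame = pd.DataFrame({"race": voter_race, "age": voter_age})
# one-way ANOVA: do mean ages differ across the five groups?
groups = [voter_frame[voter_frame["race"] == r]["age"] for r in races]
stats.f_oneway(*groups)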
# Alternate method
model = ols('age ~ race', # Model formula
data = voter_frame).fit()
8. Developing and implementing Decision Tree model on the dataset
import pandas as pd
data=pd.read_csv('E:/MCA/Sem3/AI_ML/Practical/Salary_Data.csv')
In [3]:
data.head()
Out[3]:
YearsExperience Salary
0 1.1 39343.0
1 1.3 46205.0
2 1.5 37731.0
3 2.0 43525.0
4 2.2 39891.0
In [4]:
X=data[['YearsExperience']]
y=data['Salary']
In [5]:
from sklearn.tree import DecisionTreeRegressor
regressor = DecisionTreeRegressor(random_state=0)
In [6]:
regressor.fit(X,y)
Out[6]:
DecisionTreeRegressor(random_state=0)
In [7]:
regressor.predict([[6.5]])
Out[7]:
array([91738.])
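The manual stops at the prediction, but the fitted tree itself can be drawn with scikit-learn's plot_tree helper; a minimal sketch (the figure size is an arbitrary choice):
from sklearn.tree import plot_tree
import matplotlib.pyplot as plt
plt.figure(figsize=(20,10))
plot_tree(regressor, feature_names=['YearsExperience'], filled=True)   # draws the fitted regression tree
plt.show()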
9. Bayesian classification on any dataset
import numpy as np
import pandas as pd
df=pd.read_csv('E:/MCA/Sem3/AI_ML/Practical/iris.csv')
In [7]:
df.columns=['sepal_length','sepal_width','petal_length','petal_width','species']
In [8]:
col_names=list(df.columns)
predictors=col_names[0:4]
target=col_names[4]
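The cells that split the data and fit the Gaussian model are missing from the transcript; a minimal reconstruction follows. A 70/30 split is assumed because the training accuracy below is a fraction of 105 rows (70% of 150); the name Gmodel and the random_state are assumptions:
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
train,test = train_test_split(df,test_size=0.3,random_state=0)   # exact seed not shown in the transcript
Gmodel = GaussianNB()
Gmodel.fit(train[predictors],train[target])
train_Gpred = Gmodel.predict(train[predictors])
test_Gpred = Gmodel.predict(test[predictors])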
In [11]:
train_acc_gau=np.mean(train_Gpred==train[target])
test_acc_gau=np.mean(test_Gpred==test[target])
In [12]:
train_acc_gau
Out[12]:
0.9428571428571428
In [13]:
test_acc_gau
Out[13]:
1.0
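The multinomial variant's fitting cells are likewise missing; the same reconstruction with MultinomialNB (Mmodel is an assumed name):
from sklearn.naive_bayes import MultinomialNB
Mmodel = MultinomialNB()
Mmodel.fit(train[predictors],train[target])
train_Mpred = Mmodel.predict(train[predictors])
test_Mpred = Mmodel.predict(test[predictors])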
train_acc_multi=np.mean(train_Mpred==train[target])
test_acc_multi=np.mean(test_Mpred==test[target])
In [16]:
train_acc_multi
Out[16]:
0.7047619047619048
In [17]:
test_acc_multi
Out[17]:
0.6
10. SVM classification on any dataset
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.svm import SVC
df=pd.read_csv('E:/MCA/Sem3/AI_ML/Practical/Social_Network_Ads.csv')
In [3]:
df.head()
Out[3]:
User ID Gender Age EstimatedSalary Purchased
In [4]:
X=df[['Age','EstimatedSalary']]
y=df['Purchased']
In [5]:
from sklearn.model_selection import train_test_split
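The split and scaling cells are not in the transcript, although X_train_scaled and scaler are used below. A minimal reconstruction: the test scores below are fractions with denominator 46, so a 46-row test set is assumed (the random_state is a guess):
from sklearn.preprocessing import MinMaxScaler
X_train,X_test,y_train,y_test = train_test_split(X,y,test_size=46,random_state=0)
scaler = MinMaxScaler()
scaler.fit(X_train)
X_train_scaled = scaler.transform(X_train)
X_test_scaled = scaler.transform(X_test)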
In [6]:
model_lin = SVC(kernel='linear')
model_lin.fit(X_train_scaled,y_train)
model_lin.score(X_test_scaled,y_test)
Out[9]:
0.8043478260869565
In [10]:
model_poly = SVC(kernel='poly')
model_poly.fit(X_train_scaled,y_train)
model_poly.score(X_test_scaled,y_test)
Out[10]:
0.8913043478260869
In [11]:
model_rbf = SVC(kernel='rbf')
model_rbf.fit(X_train_scaled,y_train)
model_rbf.score(X_test_scaled,y_test)
Out[11]:
0.8913043478260869
In [12]:
#Actual data
class_0_act = X_test[y_test==0]
class_1_act = X_test[y_test==1]
plt.scatter(class_0_act['Age'],class_0_act['EstimatedSalary'],c='red')
plt.scatter(class_1_act['Age'],class_1_act['EstimatedSalary'],c='blue')
Out[12]:
<matplotlib.collections.PathCollection at 0x2996f7e3ee0>
In [14]:
#Plot points according to predicted values of polynomial kernel
y_pre = model_poly.predict(X_test_scaled)
class_0_pre = X_test[y_pre==0]
class_1_pre = X_test[y_pre==1]
plt.scatter(class_0_pre['Age'],class_0_pre['EstimatedSalary'],c='red')
plt.scatter(class_1_pre['Age'],class_1_pre['EstimatedSalary'],c='blue')
plt.title('Polynomial Kernel')
Out[14]:
Text(0.5, 1.0, 'Polynomial Kernel')
In [15]:
#Plot points according to predicted values of RBF kernel (cell reconstructed from the polynomial-kernel cell above)
y_pre = model_rbf.predict(X_test_scaled)
class_0_pre = X_test[y_pre==0]
class_1_pre = X_test[y_pre==1]
plt.scatter(class_0_pre['Age'],class_0_pre['EstimatedSalary'],c='red')
plt.scatter(class_1_pre['Age'],class_1_pre['EstimatedSalary'],c='blue')
plt.title('RBF Kernel')
Out[15]:
Text(0.5, 1.0, 'RBF Kernel')
In [16]:
import numpy as np
In [17]:
plot_data = []
for x in range(0,100,1):
    for y in range(0,100,1):
        plot_data.append([x,y])
plot_data=np.array(plot_data)/100
In [18]:
plot_data
Out[18]:
array([[0. , 0. ],
[0. , 0.01],
[0. , 0.02],
...,
[0.99, 0.97],
[0.99, 0.98],
[0.99, 0.99]])
In [19]:
plot_data.shape
Out[19]:
(10000, 2)
In [20]:
y_plot = model_lin.predict(plot_data)
class_0 = plot_data[y_plot==0]
class_1 = plot_data[y_plot==1]
plt.scatter(class_0[:,0],class_0[:,1],c='red')
plt.scatter(class_1[:,0],class_1[:,1],c='blue')
plt.title('Linear Kernel')
Out[20]:
Text(0.5, 1.0, 'Linear Kernel')
In [21]:
y_plot = model_poly.predict(plot_data)
class_0 = plot_data[y_plot==0]
class_1 = plot_data[y_plot==1]
plt.scatter(class_0[:,0],class_0[:,1],c='red')
plt.scatter(class_1[:,0],class_1[:,1],c='blue')
plt.title('Poly Kernel')
Out[21]:
Text(0.5, 1.0, 'Poly Kernel')
In [22]:
y_plot = model_rbf.predict(plot_data)
class_0 = plot_data[y_plot==0]
class_1 = plot_data[y_plot==1]
plt.scatter(class_0[:,0],class_0[:,1],c='red')
plt.scatter(class_1[:,0],class_1[:,1],c='blue')
plt.title('rbf Kernel')
Out[22]:
Text(0.5, 1.0, 'rbf Kernel')
In [23]:
pts = np.array([[25,60000],[50,120000]])
pts_scaled = scaler.transform(pts)
In [25]:
pts_scaled
Out[25]:
array([[0.16666667, 0.33333333],
[0.76190476, 0.77777778]])
In [26]:
y = model_rbf.predict(pts_scaled)
y
Out[26]:
array([0, 1], dtype=int64)
12. Plot the cluster data using Python visualizations
In [1]:
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans
import numpy as np
import matplotlib.pyplot as plt
In [2]:
data = load_digits().data
pca = PCA(2)
In [3]:
df = pca.fit_transform(data)
In [4]:
df.shape
Out[4]:
(1797, 2)
In [5]:
kmeans = KMeans(n_clusters=10)   # the cell defining kmeans is missing; 10 clusters assumed for the ten digit classes
label = kmeans.fit_predict(df)
print(label)
[4 1 9 ... 9 8 0]
In [8]:
filtered_label0 = df[label == 0]
plt.scatter(filtered_label0[:,0] , filtered_label0[:,1])
plt.show()
In [9]:
filtered_label2 = df[label == 2]
filtered_label8 = df[label == 8]
plt.scatter(filtered_label2[:,0] , filtered_label2[:,1] , color = 'red')
plt.scatter(filtered_label8[:,0] , filtered_label8[:,1] , color = 'black')
plt.show()
In [10]:
u_labels = np.unique(label)
for i in u_labels:
    plt.scatter(df[label == i , 0] , df[label == i , 1] , label = i)
plt.legend()
plt.show()
In [11]:
centroids = kmeans.cluster_centers_
u_labels = np.unique(label)
for i in u_labels:
    plt.scatter(df[label == i , 0] , df[label == i , 1] , label = i)
plt.scatter(centroids[:,0] , centroids[:,1] , s = 80, color = 'k')
plt.legend()
plt.show()
13. Creating and Visualizing Neural Network for the given data
In [2]:
import tensorflow as tf
---------------------------------------------------------------------------
ModuleNotFoundError Traceback (most recent call last)
C:\Users\MICROS~1\AppData\Local\Temp/ipykernel_13700/3793406994.py in <module>
----> 1 import tensorflow as tf
ModuleNotFoundError: No module named 'tensorflow'
(TensorFlow was not installed in this environment; running pip install tensorflow resolves the import error.)
In [16]:
from tensorflow import keras
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, LeakyReLU
model = Sequential()
model.add(Conv2D(32, kernel_size=(3, 3),activation='linear',input_shape=(28,28,1),padding='same'))
model.add(LeakyReLU(alpha=0.1))
model.add(MaxPooling2D((2, 2),padding='same'))
model.add(Conv2D(64, (3, 3), activation='linear',padding='same'))
model.add(LeakyReLU(alpha=0.1))
model.add(MaxPooling2D(pool_size=(2, 2),padding='same'))
model.add(Conv2D(128, (3, 3), activation='linear',padding='same'))
model.add(LeakyReLU(alpha=0.1))
model.add(MaxPooling2D(pool_size=(2, 2),padding='same'))
model.add(Flatten())
model.add(Dense(128, activation='linear'))
model.add(LeakyReLU(alpha=0.1))
model.add(Dense(500, activation='softmax'))
In [23]:
model.compile(loss=keras.losses.categorical_crossentropy,
optimizer=keras.optimizers.Adam(),metrics=['accuracy'])
In [26]:
model.summary()
Model: "sequential_1"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
conv2d (Conv2D) (None, 28, 28, 32) 320
_________________________________________________________________
leaky_re_lu (LeakyReLU) (None, 28, 28, 32) 0
_________________________________________________________________
max_pooling2d (MaxPooling2D) (None, 14, 14, 32) 0
_________________________________________________________________
conv2d_1 (Conv2D) (None, 14, 14, 64) 18496
_________________________________________________________________
leaky_re_lu_1 (LeakyReLU) (None, 14, 14, 64) 0
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 7, 7, 64) 0
_________________________________________________________________
conv2d_2 (Conv2D) (None, 7, 7, 128) 73856
_________________________________________________________________
leaky_re_lu_2 (LeakyReLU) (None, 7, 7, 128) 0
_________________________________________________________________
max_pooling2d_2 (MaxPooling2 (None, 4, 4, 128) 0
_________________________________________________________________
flatten (Flatten) (None, 2048) 0
_________________________________________________________________
dense_2 (Dense) (None, 128) 262272
_________________________________________________________________
leaky_re_lu_3 (LeakyReLU) (None, 128) 0
_________________________________________________________________
dense_3 (Dense) (None, 500) 64500
=================================================================
Total params: 419,444
Trainable params: 419,444
Non-trainable params: 0
14. Recognize optical character using ANN
from tensorflow.keras.datasets import mnist
import matplotlib.pyplot as plt
(x_train,y_train),(x_test,y_test)=mnist.load_data()
Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/mnist.npz
11493376/11490434 [==============================] - 1s 0us/step
11501568/11490434 [==============================] - 1s 0us/step
In [19]:
x_train.shape
Out[19]:
(60000, 28, 28)
In [20]:
X_train=x_train.reshape(60000,784)
X_test=x_test.reshape(10000,784)
In [21]:
from tensorflow.keras.utils import to_categorical
y_train=to_categorical(y_train,num_classes=10)
y_test=to_categorical(y_test,num_classes=10)
In [23]:
X_train=X_train/255
X_test=X_test/255
In [24]:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
In [25]:
model=Sequential()
model.add(Dense(50,activation='relu',input_shape=(784,)))
model.add(Dense(50,activation='relu'))
model.add(Dense(10,activation='softmax'))
In [26]:
model.summary()
Model: "sequential"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
dense (Dense) (None, 50) 39250
=================================================================
Total params: 42,310
Trainable params: 42,310
Non-trainable params: 0
_________________________________________________________________
In [27]:
model.compile(loss='categorical_crossentropy',metrics=['accuracy'])
In [28]:
model.fit(X_train,y_train,batch_size=64,epochs=10,validation_data=(X_test,y_test))
Epoch 1/10
938/938 [==============================] - 3s 3ms/step - loss: 0.3354 - accuracy: 0.9043 - val_loss: 0.1931 - val_accuracy: 0.9422
Epoch 2/10
938/938 [==============================] - 2s 2ms/step - loss: 0.1624 - accuracy: 0.9517 - val_loss: 0.1451 - val_accuracy: 0.9548
Epoch 3/10
938/938 [==============================] - 2s 2ms/step - loss: 0.1209 - accuracy: 0.9638 - val_loss: 0.1142 - val_accuracy: 0.9675
Epoch 4/10
938/938 [==============================] - 2s 2ms/step - loss: 0.1001 - accuracy: 0.9703 - val_loss: 0.1110 - val_accuracy: 0.9685
Epoch 5/10
938/938 [==============================] - 2s 2ms/step - loss: 0.0854 - accuracy: 0.9745 - val_loss: 0.1027 - val_accuracy: 0.9697
Epoch 6/10
938/938 [==============================] - 2s 2ms/step - loss: 0.0745 - accuracy: 0.9780 - val_loss: 0.0963 - val_accuracy: 0.9724
Epoch 7/10
938/938 [==============================] - 2s 2ms/step - loss: 0.0658 - accuracy: 0.9800 - val_loss: 0.1030 - val_accuracy: 0.9718
Epoch 8/10
938/938 [==============================] - 2s 2ms/step - loss: 0.0586 - accuracy: 0.9825 - val_loss: 0.1076 - val_accuracy: 0.9713
Epoch 9/10
938/938 [==============================] - 2s 2ms/step - loss: 0.0542 - accuracy: 0.9835 - val_loss: 0.0909 - val_accuracy: 0.9755
Epoch 10/10
938/938 [==============================] - 2s 3ms/step - loss: 0.0476 - accuracy: 0.9855 - val_loss: 0.0950 - val_accuracy: 0.9748
Out[28]:
<keras.callbacks.History at 0x1ef21ab46a0>
In [29]:
import numpy as np
In [30]:
X_train
Out[30]:
array([[0., 0., 0., ..., 0., 0., 0.],
[0., 0., 0., ..., 0., 0., 0.],
[0., 0., 0., ..., 0., 0., 0.],
...,
[0., 0., 0., ..., 0., 0., 0.],
[0., 0., 0., ..., 0., 0., 0.],
[0., 0., 0., ..., 0., 0., 0.]])
In [31]:
y_train[:5,:]
Out[31]:
array([[0., 0., 0., 0., 0., 1., 0., 0., 0., 0.],
[1., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
[0., 0., 0., 0., 1., 0., 0., 0., 0., 0.],
[0., 1., 0., 0., 0., 0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0., 0., 0., 0., 0., 1.]], dtype=float32)
In [32]:
img0 = np.array(X_train[0]).reshape(1,784)
In [33]:
model.predict(img0).argmax()
Out[33]:
5
In [34]:
y_train[0].argmax()
Out[34]:
5
In [35]:
def recognise(img):
    img=np.array(img).reshape(1,784)
    return model.predict(img).argmax()
In [36]:
y_pre=model.predict(X_test).argmax(axis=1)
In [37]:
y_pre
Out[37]:
array([7, 2, 1, ..., 4, 5, 6], dtype=int64)
In [38]:
len(y_pre)
Out[38]:
10000
In [39]:
y_test.argmax(axis=1)
Out[39]:
array([7, 2, 1, ..., 4, 5, 6], dtype=int64)
In [40]:
sum(y_pre==y_test.argmax(axis=1))
Out[40]:
9748
In [41]:
9748/10000
Out[41]:
0.9748
In [42]:
plt.imshow(np.array(X_test[560]).reshape(28,28))
Out[43]:
<matplotlib.image.AxesImage at 0x1ef22238640>
In [44]:
recognise(X_test[560])
Out[44]:
9
15. Write a program to implement CNN
import os
for dirname, _, filenames in os.walk('E:/MCA/Sem3/AI_ML/Practical/kagglecatsanddogs_3367a/PetImages/Dog'):
    for filename in filenames:
        print(os.path.join(dirname, filename))
E:/MCA/Sem3/AI_ML/Practical/kagglecatsanddogs_3367a/PetImages/Dog\0.jpg
E:/MCA/Sem3/AI_ML/Practical/kagglecatsanddogs_3367a/PetImages/Dog\1.jpg
E:/MCA/Sem3/AI_ML/Practical/kagglecatsanddogs_3367a/PetImages/Dog\10.jpg
E:/MCA/Sem3/AI_ML/Practical/kagglecatsanddogs_3367a/PetImages/Dog\100.jpg
E:/MCA/Sem3/AI_ML/Practical/kagglecatsanddogs_3367a/PetImages/Dog\11.jpg
E:/MCA/Sem3/AI_ML/Practical/kagglecatsanddogs_3367a/PetImages/Dog\12.jpg
E:/MCA/Sem3/AI_ML/Practical/kagglecatsanddogs_3367a/PetImages/Dog\13.jpg
E:/MCA/Sem3/AI_ML/Practical/kagglecatsanddogs_3367a/PetImages/Dog\14.jpg
E:/MCA/Sem3/AI_ML/Practical/kagglecatsanddogs_3367a/PetImages/Dog\15.jpg
E:/MCA/Sem3/AI_ML/Practical/kagglecatsanddogs_3367a/PetImages/Dog\16.jpg
.
.
.
.
.
E:/MCA/Sem3/AI_ML/Practical/kagglecatsanddogs_3367a/PetImages/Dog\93.jpg
E:/MCA/Sem3/AI_ML/Practical/kagglecatsanddogs_3367a/PetImages/Dog\94.jpg
E:/MCA/Sem3/AI_ML/Practical/kagglecatsanddogs_3367a/PetImages/Dog\95.jpg
E:/MCA/Sem3/AI_ML/Practical/kagglecatsanddogs_3367a/PetImages/Dog\96.jpg
E:/MCA/Sem3/AI_ML/Practical/kagglecatsanddogs_3367a/PetImages/Dog\97.jpg
E:/MCA/Sem3/AI_ML/Practical/kagglecatsanddogs_3367a/PetImages/Dog\98.jpg
E:/MCA/Sem3/AI_ML/Practical/kagglecatsanddogs_3367a/PetImages/Dog\99.jpg
In [188]:
filenames=os.listdir('E:/MCA/Sem3/AI_ML/Practical/kagglecatsanddogs_3367a/PetImages/Dog')
In [191]:
len(filenames)
Out[191]:
101
In [192]:
filenames[:5]
Out[192]:
['0.jpg', '1.jpg', '10.jpg', '100.jpg', '11.jpg']
In [193]:
import pandas as pd
df=pd.DataFrame({'filename':filenames})
df.head()
Out[193]:
filename
0 0.jpg
1 1.jpg
2 10.jpg
3 100.jpg
4 11.jpg
In [194]:
df['class']=df['filename'].apply(lambda X:X[:3])
In [195]:
df.head()
Out[195]:
filename class
0 0.jpg 0.j
1 1.jpg 1.j
2 10.jpg 10.
3 100.jpg 100
4 11.jpg 11.
In [196]:
from tensorflow.keras.preprocessing.image import ImageDataGenerator
data_gen=ImageDataGenerator(zoom_range=0.2,shear_range=0.2,horizontal_flip=True,rescale=1/255)
In [198]:
train_data=data_gen.flow_from_dataframe(df,'E:/MCA/Sem3/AI_ML/Practical/kagglecatsanddogs_3367a/PetImages/Dog',x_col='filename',y_col='class',target_size=(224,224))
Found 101 validated image filenames belonging to 101 classes.
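The "101 classes" message exposes a bug: slicing the first three characters of each filename produces one pseudo-class per file instead of the two classes that the Dense(2, activation='softmax') head below expects. A sketch of the intended labelling, assuming both PetImages/Cat and PetImages/Dog folders are present:
import os
import pandas as pd
base = 'E:/MCA/Sem3/AI_ML/Practical/kagglecatsanddogs_3367a/PetImages'
rows = []
for label in ['Cat','Dog']:
    for fname in os.listdir(os.path.join(base,label)):
        rows.append({'filename': os.path.join(label,fname), 'class': label})
df = pd.DataFrame(rows)   # 'class' now holds the real label, so flow_from_dataframe(df, base, ...) reports 2 classes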
In [199]:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPool2D, Flatten, Dense
model=Sequential()
model.add(Conv2D(16,(3,3),activation='relu',input_shape=(224,224,3)))
model.add(MaxPool2D())
model.add(Conv2D(32,(3,3),activation='relu'))
model.add(MaxPool2D())
model.add(Conv2D(64,(3,3),activation='relu'))
model.add(MaxPool2D())
model.add(Conv2D(64,(5,5),activation='relu'))
model.add(MaxPool2D())
model.add(Conv2D(128,(3,3),activation='relu'))
model.add(MaxPool2D())
model.add(Flatten())
model.add(Dense(2,activation='softmax'))
In [201]:
model.summary()
Model: "sequential_7"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
conv2d_35 (Conv2D) (None, 222, 222, 16) 448
=================================================================
Total params: 204,002
Trainable params: 204,002
Non-trainable params: 0
_________________________________________________________________
In [202]:
model.compile(optimizer='adam',loss='categorical_crossentropy',metrics=['accuracy'])
In [203]:
#model.fit(train_data,epochs=5)   # training left commented out in the original; fit_generator is deprecated in favour of fit
In [204]:
import cv2
def get_class(img_path):
    img=cv2.imread(img_path)
    img=cv2.resize(img,(224,224))
    img=img/255
    op=model.predict(img.reshape(1,224,224,3)).argmax()
    return 'Cat' if op==0 else 'Dog'
In [205]:
train_data.class_mode
Out[205]:
'categorical'
In [207]:
get_class('E:/MCA/Sem3/AI_ML/Practical/kagglecatsanddogs_3367a/PetImages/Dog/10.jpg')
Out[207]:
'Dog'
16. Write a program to implement RNN
from tensorflow.keras.datasets import imdb
(X_train,y_train),(X_test,y_test)=imdb.load_data(num_words=20000)
In [54]:
X_train.shape,X_test.shape
Out[54]:
((25000,), (25000,))
In [55]:
len(X_train[0]),len(X_train[1]),len(X_train[2]),len(X_train[3]),len(X_train[4])
Out[55]:
(218, 189, 141, 550, 147)
In [56]:
y_train[:5]
Out[56]:
array([1, 0, 0, 1, 0], dtype=int64)
In [57]:
X_train[0]
Out[57]:
[1,
14,
22,
16,
43,
530,
973,
1622,
1385,
65,
.
.
.
.
.
103,
32,
15,
16,
5345,
19,
178,
32]
In [58]:
import numpy as np
In [59]:
np.array(X_train[0])
Out[59]:
array([ 1, 14, 22, 16, 43, 530, 973, 1622, 1385,
65, 458, 4468, 66, 3941, 4, 173, 36, 256,
5, 25, 100, 43, 838, 112, 50, 670, 2,
9, 35, 480, 284, 5, 150, 4, 172, 112,
167, 2, 336, 385, 39, 4, 172, 4536, 1111,
17, 546, 38, 13, 447, 4, 192, 50, 16,
6, 147, 2025, 19, 14, 22, 4, 1920, 4613,
469, 4, 22, 71, 87, 12, 16, 43, 530,
38, 76, 15, 13, 1247, 4, 22, 17, 515,
17, 12, 16, 626, 18, 19193, 5, 62, 386,
12, 8, 316, 8, 106, 5, 4, 2223, 5244,
16, 480, 66, 3785, 33, 4, 130, 12, 16,
38, 619, 5, 25, 124, 51, 36, 135, 48,
25, 1415, 33, 6, 22, 12, 215, 28, 77,
52, 5, 14, 407, 16, 82, 10311, 8, 4,
107, 117, 5952, 15, 256, 4, 2, 7, 3766,
5, 723, 36, 71, 43, 530, 476, 26, 400,
317, 46, 7, 4, 12118, 1029, 13, 104, 88,
4, 381, 15, 297, 98, 32, 2071, 56, 26,
141, 6, 194, 7486, 18, 4, 226, 22, 21,
134, 476, 26, 480, 5, 144, 30, 5535, 18,
51, 36, 28, 224, 92, 25, 104, 4, 226,
65, 16, 38, 1334, 88, 12, 16, 283, 5,
16, 4472, 113, 103, 32, 15, 16, 5345, 19,
178, 32])
In [60]:
from tensorflow.keras.preprocessing.sequence import pad_sequences
X=pad_sequences(X_train,maxlen=200)
X_val=pad_sequences(X_test,maxlen=200)
In [62]:
len(X[0])
Out[62]:
200
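The cell defining the network is missing from the transcript; a minimal sketch of a typical IMDB sentiment RNN. The layer sizes are assumptions; the single sigmoid unit matches the binary_crossentropy loss compiled below:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, LSTM, Dense
model = Sequential()
model.add(Embedding(20000, 128, input_length=200))   # vocabulary size and sequence length from the cells above
model.add(LSTM(64))                                  # recurrent layer; 64 units is an assumption
model.add(Dense(1, activation='sigmoid'))            # binary sentiment output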
In [63]:
model.compile(loss='binary_crossentropy',optimizer='adam',metrics=['accuracy'])
In [*]:
model.fit(X,y_train,validation_data=(X_val,y_test),epochs=5,batch_size=64)
Epoch 1/5
391/391 [==============================] - 479s 1s/step - loss: 0.3931 - accuracy: 0.8199 - val_loss: 0.3054 - val_accuracy: 0.8726
Epoch 2/5
391/391 [==============================] - 457s 1s/step - loss: 0.2015 - accuracy: 0.9246 - val_loss: 0.3744 - val_accuracy: 0.8514
Epoch 3/5
391/391 [==============================] - 466s 1s/step - loss: 0.1295 - accuracy: 0.9542 - val_loss: 0.3851 - val_accuracy: 0.8615
Epoch 4/5
391/391 [==============================] - 478s 1s/step - loss: 0.0785 - accuracy: 0.9737 - val_loss: 0.4942 - val_accuracy: 0.8522
Epoch 5/5
391/391 [==============================] - 533s 1s/step - loss: 0.0627 - accuracy: 0.9784 - val_loss: 0.5105 - val_accuracy: 0.8246
Out[66]:
<keras.callbacks.History at 0x1da72672280>
18. Web scraping experiments (by using tools)
In [126]:
import requests
from bs4 import BeautifulSoup
import csv
URL = 'http://www.values.com/inspirational-quotes'
r = requests.get(URL)
soup = BeautifulSoup(r.content, 'html5lib')   # soup was never created in the transcript; the html5lib parser is assumed
quotes=[]
In [127]:
soup.find('div', attrs = {'id':'all_quotes'})
Out[127]:
<div class="row" id="all_quotes">
<div class="col-6 col-lg-3 text-center margin-30px-bottom sm-
margin-30px-top">
<a href="/inspirational-quotes/3331-wherever-we-are-it-is-our-
friends-that-make"><img alt="Wherever we are, it is our friends that make our
world. #<Author:0x00007f188bf2e298>" class="margin-10px-bottom shadow"
height="310"
src="https://assets.passiton.com/quotes/quote_artwork/3331/medium/20220210_th
ursday_quote_updated.jpg?1644000474" width="310"/></a>
<h5 class="value_on_red"><a href="/inspirational-quotes/3331-
wherever-we-are-it-is-our-friends-that-make">FRIENDSHIP</a></h5>
<a href="/inspirational-quotes/8303-find-a-group-of-people-who-
challenge-and"><img alt="Find a group of people who challenge and inspire
........................
<a href="/inspirational-quotes/7182-the-bond-that-links-your-true-
family-is-not-one"><img alt="The bond that links your true family is not one
of blood, but of respect and joy in each other's life. Rarely do members of
one family grow up under the same roof. #<Author:0x00007f18843a51f8>"
class="margin-10px-bottom shadow" height="310"
src="https://assets.passiton.com/quotes/quote_artwork/7182/medium/20211229_we
dnesday_quote.jpg?1640015962" width="310"/></a>
<h5 class="value_on_red"><a href="/inspirational-quotes/7182-the-
bond-that-links-your-true-family-is-not-one">FAMILY</a></h5>
</div>
</div>
In [128]:
for row in soup.find('div', attrs = {'id':'all_quotes'}).find_all('div', attrs = {'class': 'col-6 col-lg-3 text-center margin-30px-bottom sm-margin-30px-top'}):
    quote = {}
    quote['theme'] = row.h5.text
    quote['url'] = row.a['href']
    quote['img'] = row.img['src']
    quote['lines'] = row.img['alt'].split(" #")[0]
    quote['author'] = row.img['alt'].split(" #")[1]
    quotes.append(quote)
In [131]:
filename = 'inspirational_quotes.csv'
with open(filename, 'w', newline='') as f:
    w = csv.DictWriter(f,['theme','url','img','lines','author'])
    w.writeheader()
    for quote in quotes:
        w.writerow(quote)
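A quick way to verify the export, assuming pandas is available in the environment:
import pandas as pd
pd.read_csv('inspirational_quotes.csv').head()   # shows the first few scraped quotes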