
KRAI Lab Manual


A

Lab Journal

Of

“Knowledge Representation & Artificial Intelligence - ML, DL”

By
Student Name

Submitted in partial fulfillment of


Second Year (Sem-III) Master in Computer Application
Savitribai Phule Pune University

Under The Guidance

Of
Professor Name

College Name
INDEX

Subject: Python Programming

1. Find the correlation matrix.
2. Plot the correlation plot on dataset and visualize giving an overview of relationships among data on iris data.
3. Analysis of covariance: variance (ANOVA), if data have categorical variables on iris data.
4. Apply linear regression Model techniques to predict the data on any dataset.
5. Apply logistic regression Model techniques to predict the data on any dataset.
6. Clustering algorithms for unsupervised classification.
7. Association algorithms for supervised classification on any dataset.
8. Developing and implementing Decision Tree model on the dataset.
9. Bayesian classification on any dataset.
10. SVM classification on any dataset.
11. Text Mining algorithms on unstructured dataset.
12. Plot the cluster data using python visualizations.
13. Creating & Visualizing Neural Network for the given data. (Use python)
14. Recognize optical character using ANN.
15. Write a program to implement CNN.
16. Write a program to implement RNN.
17. Write a program to implement GAN.
18. Web scraping experiments (by using tools).

1. Find the correlation matrix


In [1]:

import scipy.stats as st
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
%matplotlib inline
In [4]:

data=pd.read_excel("E:/MCA/Sem3/AI_ML/Practical/Sample - Superstore.xls")
In [5]:

!pip install xlrd


Requirement already satisfied: xlrd in c:\users\microsoft\anaconda3\lib\site-packages (2.0.1)
In [6]:

np.corrcoef(data['Sales'],data['Profit'])
Out[6]:
array([[1. , 0.47906435],
[0.47906435, 1. ]])
In [7]:
data.corr()
Out[7]:

Row ID Postal Code Sales Quantity Discount Profit

Row ID 1.000000 0.009671 -0.001359 -0.004016 0.013480 0.012497

Postal Code 0.009671 1.000000 -0.023854 0.012761 0.058443 -0.029961

Sales -0.001359 -0.023854 1.000000 0.200795 -0.028190 0.479064

Quantity -0.004016 0.012761 0.200795 1.000000 0.008623 0.066253

Discount 0.013480 0.058443 -0.028190 0.008623 1.000000 -0.219487

Profit 0.012497 -0.029961 0.479064 0.066253 -0.219487 1.000000


In [8]:

sns.heatmap(data.corr())
Out[8]:
<AxesSubplot:>
2. Plot the correlation plot on dataset and visualize giving an overview of relationships among data on iris data
In [2]:

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
import seaborn as sns
from sklearn import metrics
sns.set()
In [12]:

iris_data=pd.read_csv('E:/MCA/Sem3/AI_ML/Practical/iris.csv')
iris_data
Out[12]:
sepal_length sepal_width petal_length petal_width species

0 5.1 3.5 1.4 0.2 setosa

1 4.9 3.0 1.4 0.2 setosa

2 4.7 3.2 1.3 0.2 setosa

3 4.6 3.1 1.5 0.2 setosa

4 5.0 3.6 1.4 0.2 setosa

... ... ... ... ... ...

145 6.7 3.0 5.2 2.3 virginica

146 6.3 2.5 5.0 1.9 virginica

147 6.5 3.0 5.2 2.0 virginica

148 6.2 3.4 5.4 2.3 virginica

149 5.9 3.0 5.1 1.8 virginica

150 rows × 5 columns


In [4]:

iris_data.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 150 entries, 0 to 149
Data columns (total 5 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 sepal_length 150 non-null float64
1 sepal_width 150 non-null float64
2 petal_length 150 non-null float64
3 petal_width 150 non-null float64
4 species 150 non-null object
dtypes: float64(4), object(1)
memory usage: 6.0+ KB
In [5]:
iris_data.describe()
Out[5]:
sepal_length sepal_width petal_length petal_width

count 150.000000 150.000000 150.000000 150.000000

mean 5.843333 3.057333 3.758000 1.199333

std 0.828066 0.435866 1.765298 0.762238

min 4.300000 2.000000 1.000000 0.100000

25% 5.100000 2.800000 1.600000 0.300000

50% 5.800000 3.000000 4.350000 1.300000

75% 6.400000 3.300000 5.100000 1.800000

max 7.900000 4.400000 6.900000 2.500000

In [6]:

iris_data[iris_data.duplicated()]
Out[6]:
sepal_length sepal_width petal_length petal_width species

142 5.8 2.7 5.1 1.9 virginica

In [8]:

iris_data['species'].value_counts()
Out[8]:
setosa 50
versicolor 50
virginica 50
Name: species, dtype: int64
In [15]:
sns.pairplot(iris_data,hue='species',height=4)
Out[15]:
<seaborn.axisgrid.PairGrid at 0x20b16a147f0>
In [21]:

plt.figure(figsize=(10,11))
sns.heatmap(iris_data.corr(),annot=True)
plt.plot()
Out[21]:
[]
In [22]:

iris_data.groupby('species').agg(['mean','median'])
Out[22]:

sepal_length sepal_width petal_length petal_width

mean median mean median mean median mean median

species

setosa 5.006 5.0 3.428 3.4 1.462 1.50 0.246 0.2

versicolor 5.936 5.9 2.770 2.8 4.260 4.35 1.326 1.3

virginica 6.588 6.5 2.974 3.0 5.552 5.55 2.026 2.0


In [25]:

fig, axes = plt.subplots(2, 2, figsize=(16,9))


sns.boxplot( y='petal_width', x= 'species', data=iris_data, orient='v' , ax=axes[0, 0])
sns.boxplot( y='petal_length', x= 'species', data=iris_data, orient='v' , ax=axes[0, 1])
sns.boxplot( y='sepal_length', x= 'species', data=iris_data, orient='v' , ax=axes[1, 0])
sns.boxplot( y='sepal_width', x= 'species', data=iris_data, orient='v', ax=axes[1, 1])
plt.show()
In [24]:

fig, axes = plt.subplots(2, 2, figsize=(16,9))


sns.violinplot( y='petal_width', x= 'species', data=iris_data, orient='v' , ax=axes[0, 0])
sns.violinplot( y='petal_length', x= 'species', data=iris_data, orient='v' , ax=axes[0, 1])
sns.violinplot( y='sepal_length', x= 'species', data=iris_data, orient='v' , ax=axes[1, 0])
sns.violinplot( y='sepal_width', x= 'species', data=iris_data, orient='v', ax=axes[1, 1])
plt.show()
3. Analysis of covariance: variance (ANOVA), if data has categorical variables in iris data
In [1]:

import numpy as np
import pandas as pd
In [4]:

df=pd.read_csv('E:/MCA/Sem3/AI_ML/Practical/iris.csv')
In [5]:

df.head()
Out[5]:
sepal_length sepal_width petal_length petal_width species

0 5.1 3.5 1.4 0.2 setosa

1 4.9 3.0 1.4 0.2 setosa

2 4.7 3.2 1.3 0.2 setosa

3 4.6 3.1 1.5 0.2 setosa

4 5.0 3.6 1.4 0.2 setosa
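
With the iris data loaded, the ANOVA itself can be run on the categorical species column. A minimal sketch, assuming the same statsmodels ols/anova_lm API used in practical 7 below; sepal_length is chosen here only as an illustration:

In [ ]:

import statsmodels.api as sm
from statsmodels.formula.api import ols

# One-way ANOVA: does mean sepal_length differ across the three species?
anova_model = ols('sepal_length ~ species', data=df).fit()
print(sm.stats.anova_lm(anova_model, typ=2))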


5. Apply logistic regression Model techniques to predict the data on any dataset
In [1]:

import pandas as pd
import matplotlib.pyplot as plt
In [40]:

df=pd.read_csv('E:/MCA/Sem3/AI_ML/Practical/Social_Network_Ads.csv')
In [36]:

X=df[['Age','EstimatedSalary']]
y=df['Purchased']
In [37]:

from sklearn.linear_model import LogisticRegression


In [38]:

model=LogisticRegression()
In [12]:

model.fit(X,y)
Out[12]:
LogisticRegression()
In [13]:

Scaled_Age=(df['Age']-df['Age'].min()) / (df['Age'].max()-df['Age'].min())
Scaled_Salary=(df['EstimatedSalary']-df['EstimatedSalary'].min()) / (df['EstimatedSalary'].max()-df['EstimatedSalary'].min())
In [14]:

X=pd.concat([Scaled_Age,Scaled_Salary],axis=1)
y=df['Purchased']
In [15]:

model_scaled = LogisticRegression()
model_scaled.fit(X,y)
Out[15]:
LogisticRegression()
In [16]:

def get_scaled(pt):
    age,sal = pt[0],pt[1]
    sc_age=(age-df['Age'].min()) / (df['Age'].max()-df['Age'].min())
    sc_sal=(sal-df['EstimatedSalary'].min()) / (df['EstimatedSalary'].max()-df['EstimatedSalary'].min())
    return sc_age,sc_sal
In [17]:
q1=get_scaled([52,130000])
q2=get_scaled([25,40000])
In [18]:

model_scaled.predict([q1])
Out[18]:
array([1], dtype=int64)
In [19]:

model_scaled.predict([q2])
Out[19]:
array([0], dtype=int64)
In [20]:

from sklearn.preprocessing import MinMaxScaler


In [21]:

X = df[['Age','EstimatedSalary']]
scaler = MinMaxScaler()
scaler.fit(X)
X_scaled = scaler.transform(X)
In [22]:

X_scaled
Out[22]:
array([[0.02380952, 0.02962963],
[0.4047619 , 0.03703704],
[0.19047619, 0.20740741],
[0.21428571, 0.31111111],
[0.02380952, 0.45185185],
[0.21428571, 0.31851852],
[0.21428571, 0.51111111],
[0.33333333, 1. ],
[0.16666667, 0.13333333],
[0.4047619 , 0.37037037],
[0.19047619, 0.48148148],
[0.19047619, 0.27407407],
[0.04761905, 0.52592593],
[0.33333333, 0.02222222],
[0. , 0.4962963 ],
[0.26190476, 0.48148148],
[0.69047619, 0.07407407],
[0.64285714, 0.08148148],
[0.66666667, 0.0962963 ],
[0.71428571, 0.1037037 ],
[0.64285714, 0.05185185],
[0.69047619, 0.25185185],
[0.71428571, 0.19259259],
[0.64285714, 0.05185185],
[0.66666667, 0.05925926],
[0.69047619, 0.03703704],
[0.73809524, 0.0962963 ],
[0.69047619, 0.11111111],
[0.26190476, 0.20740741],
[0.30952381, 0.02222222],
[0.30952381, 0.43703704],
[0.21428571, 0.9037037 ],
[0.07142857, 0.00740741],
[0.23809524, 0.21481481],
[0.21428571, 0.55555556],
[0.4047619 , 0.08888889],
[0.35714286, 0.0962963 ],

...
[0.57142857, 0.36296296],
[0.71428571, 0.13333333],
[0.61904762, 0.91851852],
[0.73809524, 0.0962963 ],
[0.92857143, 0.13333333],
[0.9047619 , 0.33333333],
[0.73809524, 0.17777778],
[0.5 , 0.41481481],
[0.69047619, 0.14074074],
[0.71428571, 0.14814815],
[0.71428571, 0.13333333],
[0.69047619, 0.05925926],
[0.64285714, 0.22222222],
[1. , 0.2 ],
[0.5 , 0.32592593],
[0.66666667, 0.19259259],
[0.78571429, 0.05925926],
[0.76190476, 0.03703704],
[0.42857143, 0.13333333],
[0.73809524, 0.15555556]])
In [23]:

model = LogisticRegression()
model.fit(X_scaled,df['Purchased'])
Out[23]:
LogisticRegression()
In [24]:

model.score(X_scaled,df['Purchased'])
Out[24]:
0.83
In [25]:

y_pre=model.predict(X_scaled)
In [26]:

y_act=df['Purchased']
In [27]:

y_pre
Out[27]:
array([0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 1, 1, 0, 1, 0, 1, 1, 1, 0, 1, 1, 1, 0, 0, 1, 1, 0, 1, 1,
0, 0, 1, 1, 0, 0, 1, 1, 0, 0, 1, 0, 1, 1, 1, 1, 0, 0, 1, 1, 1, 0,
1, 1, 0, 1, 0, 1, 0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 1, 1, 0, 1, 1, 0,
1, 1, 0, 0, 1, 0, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 0, 0, 1, 1, 0,
0, 1, 0, 0, 1, 1, 1, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 0, 0, 0, 1, 1,
1, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 1, 1, 0, 0, 1, 0, 0, 0, 0, 1,
0, 1, 0, 0, 1, 0, 1, 0, 0, 1, 1, 0, 0, 1, 1, 0, 1, 1, 0, 0, 1, 0,
1, 0, 0, 1, 1, 0, 0, 0, 1, 1, 0, 0, 1, 1, 1, 1, 0, 1, 1, 1, 0, 1,
0, 0, 1, 0, 0, 1, 0, 0, 1, 0, 1, 1, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0,
0, 0, 0, 0], dtype=int64)
In [28]:

from sklearn.model_selection import train_test_split


In [29]:
X_train, X_test, y_train, y_test=train_test_split(X,y,test_size=0.25)
In [30]:

scaler = MinMaxScaler()
scaler.fit(X_train)
X_train_scaled=scaler.transform(X_train)
In [31]:

model = LogisticRegression()
model.fit(X_train_scaled,y_train)
Out[31]:
LogisticRegression()
In [32]:

train_score=model.score(X_train_scaled,y_train)
train_score
Out[32]:
0.81
In [33]:

X_test_scaled=scaler.transform(X_test)
test_score=model.score(X_test_scaled,y_test)
test_score
Out[33]:
0.88
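
A hedged follow-up (not part of the original run), assuming sklearn.metrics.confusion_matrix: the matrix shows how the test errors split between false positives and false negatives, which the single accuracy number hides.

In [ ]:

from sklearn.metrics import confusion_matrix

# Rows are actual classes, columns are predicted classes on the scaled test split
y_test_pre = model.predict(X_test_scaled)
print(confusion_matrix(y_test, y_test_pre))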
6. Clustering algorithm for unsupervised classification
In [1]:

import pandas as pd
In [2]:
import matplotlib.pyplot as plt
%matplotlib inline
In [4]:

df=pd.read_csv('E:/MCA/Sem3/AI_ML/Practical/Mall_Customers_dataset.csv')
In [5]:

df.head()
Out[5]:
CustomerID Genre Age Annual Income (k$) Spending Score (1-100)

0 1 Male 19 15 39

1 2 Male 21 15 81

2 3 Female 20 16 6

3 4 Female 23 16 77

4 5 Female 31 17 40

In [6]:

X = df[['Annual Income (k$)','Spending Score (1-100)']]


X
Out[6]:
Annual Income (k$) Spending Score (1-100)

0 15 39

1 15 81

2 16 6

3 16 77

4 17 40

... ... ...

195 120 79

196 126 28

197 126 74

198 137 18

199 137 83

200 rows × 2 columns


In [7]:

plt.scatter(X['Annual Income (k$)'],X['Spending Score (1-100)'])

Out[7]:
<matplotlib.collections.PathCollection at 0x27923480310>
In [8]:

from sklearn.cluster import KMeans


In [9]:

model = KMeans(n_clusters=5)
model.fit(X)
Out[9]:
KMeans(n_clusters=5)
In [10]:

model.cluster_centers_

Out[10]:
array([[86.53846154, 82.12820513],
[26.30434783, 20.91304348],
[55.2962963 , 49.51851852],
[88.2 , 17.11428571],
[25.72727273, 79.36363636]])
In [11]:

cluster_number = model.predict(X)
In [12]:

len(cluster_number)
Out[12]:
200
In [13]:

c0 = X[cluster_number==0]
c1 = X[cluster_number==1]
c2 = X[cluster_number==2]
c3 = X[cluster_number==3]
c4 = X[cluster_number==4]
In [14]:

plt.scatter(c0['Annual Income (k$)'],c0['Spending Score (1-100)'],c='red')


plt.scatter(c1['Annual Income (k$)'],c1['Spending Score (1-100)'],c='blue')
plt.scatter(c2['Annual Income (k$)'],c2['Spending Score (1-100)'],c='yellow')
plt.scatter(c3['Annual Income (k$)'],c3['Spending Score (1-100)'],c='cyan')
plt.scatter(c4['Annual Income (k$)'],c4['Spending Score (1-100)'],c='green')

Out[14]:
<matplotlib.collections.PathCollection at 0x27926235f70>
In [15]:

model.inertia_
Out[15]:
44448.45544793369
In [19]:

WCSS = []
for i in range(1,11):
    model = KMeans(n_clusters=i)
    model.fit(X)
    WCSS.append(model.inertia_)
In [17]:

WCSS
Out[17]:
[269981.28000000014,
181363.59595959607,
106348.37306211119,
73679.78903948837,
44448.45544793369,
37265.86520484345,
30259.657207285458,
25011.920255473764,
21818.11458845217,
19657.783608703947]
In [18]:

plt.plot(range(1,11),WCSS,marker = 'x')
Out[18]:
[<matplotlib.lines.Line2D at 0x279263080d0>]
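
The elbow at k=5 in the WCSS curve supports the choice of five clusters. As a hedged cross-check (assuming sklearn.metrics.silhouette_score; not part of the original run), silhouette scores can be compared across candidate k, where higher is better:

In [ ]:

from sklearn.metrics import silhouette_score

# Silhouette score for each candidate number of clusters
for k in range(2, 11):
    labels = KMeans(n_clusters=k).fit_predict(X)
    print(k, silhouette_score(X, labels))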

7. Association algorithms for supervised classification on any dataset


In [ ]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import scipy.stats as stats
In [2]:

np.random.seed(12)

races = ["asian","black","hispanic","other","white"]

# Generate random data


voter_race = np.random.choice(a= races,
p = [0.05, 0.15 ,0.25, 0.05, 0.5],
size=1000)

voter_age = stats.poisson.rvs(loc=18,
mu=30,
size=1000)

# Group age data by race


voter_frame = pd.DataFrame({"race":voter_race,"age":voter_age})
groups = voter_frame.groupby("race").groups

# Extract individual groups


asian = voter_age[groups["asian"]]
black = voter_age[groups["black"]]
hispanic = voter_age[groups["hispanic"]]
other = voter_age[groups["other"]]
white = voter_age[groups["white"]]

# Perform the ANOVA


stats.f_oneway(asian, black, hispanic, other, white)
Out[2]:
F_onewayResult(statistic=1.7744689357329695, pvalue=0.13173183201930463)
In [3]:
import statsmodels.api as sm
from statsmodels.formula.api import ols

model = ols('age ~ race',           # Model formula
            data = voter_frame).fit()

anova_result = sm.stats.anova_lm(model, typ=2)


print (anova_result)
sum_sq df F PR(>F)
race 199.369 4.0 1.774469 0.131732
Residual 27948.102 995.0 NaN NaN
In [4]:

np.random.seed(12)

# Generate random data


voter_race = np.random.choice(a= races,
p = [0.05, 0.15 ,0.25, 0.05, 0.5],
size=1000)

# Use a different distribution for white ages


white_ages = stats.poisson.rvs(loc=18,
mu=32,
size=1000)

voter_age = stats.poisson.rvs(loc=18,
mu=30,
size=1000)

voter_age = np.where(voter_race=="white", white_ages, voter_age)

# Group age data by race


voter_frame = pd.DataFrame({"race":voter_race,"age":voter_age})
groups = voter_frame.groupby("race").groups

# Extract individual groups


asian = voter_age[groups["asian"]]
black = voter_age[groups["black"]]
hispanic = voter_age[groups["hispanic"]]
other = voter_age[groups["other"]]
white = voter_age[groups["white"]]

# Perform the ANOVA


stats.f_oneway(asian, black, hispanic, other, white)
Out[4]:
F_onewayResult(statistic=10.164699828386366, pvalue=4.5613242113994585e-08)
In [5]:

# Alternate method
model = ols('age ~ race', # Model formula
data = voter_frame).fit()

anova_result = sm.stats.anova_lm(model, typ=2)


print (anova_result)
sum_sq df F PR(>F)
race 1284.123213 4.0 10.1647 4.561324e-08
Residual 31424.995787 995.0 NaN NaN
In [6]:

# Get all race pairs


race_pairs = []

for race1 in range(4):
    for race2 in range(race1+1,5):
        race_pairs.append((races[race1], races[race2]))

# Conduct t-test on each pair


for race1, race2 in race_pairs:
    print(race1, race2)
    print(stats.ttest_ind(voter_age[groups[race1]],
                          voter_age[groups[race2]]))
asian black
Ttest_indResult(statistic=0.838644690974798, pvalue=0.4027281369339345)
asian hispanic
Ttest_indResult(statistic=-0.42594691924932293, pvalue=0.6704669004240726)
asian other
Ttest_indResult(statistic=0.9795284739636, pvalue=0.3298877500095151)
asian white
Ttest_indResult(statistic=-2.318108811252288, pvalue=0.020804701566400217)
black hispanic
Ttest_indResult(statistic=-1.9527839210712925, pvalue=0.05156197171952594)
black other
Ttest_indResult(statistic=0.28025754367057176, pvalue=0.7795770111117659)
black white
Ttest_indResult(statistic=-5.379303881281835, pvalue=1.039421216662395e-07)
hispanic other
Ttest_indResult(statistic=1.5853626170340225, pvalue=0.11396630528484335)
hispanic white
Ttest_indResult(statistic=-3.5160312714115376, pvalue=0.0004641298649066684)
other white
Ttest_indResult(statistic=-3.763809322077872, pvalue=0.00018490576317593065)
In [7]:

from statsmodels.stats.multicomp import pairwise_tukeyhsd

tukey = pairwise_tukeyhsd(endog=voter_age,   # Data
                          groups=voter_race, # Groups
                          alpha=0.05)        # Significance level

tukey.plot_simultaneous()  # Plot group confidence intervals

plt.vlines(x=49.57, ymin=-0.5, ymax=4.5, color="red")

tukey.summary()  # See test summary


Out[7]:
Multiple Comparison of Means - Tukey HSD, FWER=0.05

group1 group2 meandiff p-adj lower upper reject

asian black -0.8032 0.9 -3.4423 1.836 False

asian hispanic 0.4143 0.9 -2.1011 2.9297 False

asian other -1.0645 0.8852 -4.2391 2.11 False

asian white 1.9547 0.175 -0.4575 4.3668 False


black hispanic 1.2175 0.2318 -0.386 2.821 False

black other -0.2614 0.9 -2.7757 2.253 False

black white 2.7579 0.001 1.3217 4.194 True

hispanic other -1.4789 0.4391 -3.863 0.9053 False

hispanic white 1.5404 0.004 0.3468 2.734 True

other white 3.0192 0.0028 0.7443 5.2941 True

8. Developing and implementing Decision Tree Model on the dataset


In [1]:
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
In [2]:

data=pd.read_csv('E:/MCA/Sem3/AI_ML/Practical/Salary_Data.csv')
In [3]:

data.head()
Out[3]:

YearsExperience Salary

0 1.1 39343.0

1 1.3 46205.0

2 1.5 37731.0

3 2.0 43525.0

4 2.2 39891.0

In [4]:

X=data[['YearsExperience']]
y=data['Salary']
In [5]:
from sklearn.tree import DecisionTreeRegressor
regressor = DecisionTreeRegressor(random_state=0)

In [6]:

regressor.fit(X,y)
Out[6]:
DecisionTreeRegressor(random_state=0)
In [7]:

regressor.predict([[6.5]])
Out[7]:
array([91738.])
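
To see how the regressor partitioned YearsExperience, the fitted tree can be drawn. A hedged sketch assuming sklearn.tree.plot_tree (available in scikit-learn 0.21+); max_depth here only limits the drawing, not the model:

In [ ]:

from sklearn.tree import plot_tree

plt.figure(figsize=(16, 8))
plot_tree(regressor, feature_names=['YearsExperience'], max_depth=3, filled=True)
plt.show()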

9. Bayesian Classification on any dataset


In [5]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
In [6]:

df=pd.read_csv('E:/MCA/Sem3/AI_ML/Practical/iris.csv')
In [7]:

df.columns=['sepal_length','sepal_width','petal_length','petal_width','species']
In [8]:

col_names=list(df.columns)
predictors=col_names[0:4]
target=col_names[4]
In [9]:

from sklearn.model_selection import train_test_split


train,test=train_test_split(df,test_size=0.3,random_state=0)
Gaussian Naive Bayes
In [10]:

from sklearn.naive_bayes import GaussianNB


Gmodel=GaussianNB()
Gmodel.fit(train[predictors],train[target])
train_Gpred=Gmodel.predict(train[predictors])
test_Gpred=Gmodel.predict(test[predictors])

In [11]:

train_acc_gau=np.mean(train_Gpred==train[target])
test_acc_gau=np.mean(test_Gpred==test[target])
In [12]:

train_acc_gau

Out[12]:

0.9428571428571428
In [13]:

test_acc_gau
Out[13]:
1.0

Multinomial Naive Bayes


In [14]:

from sklearn.naive_bayes import MultinomialNB


Mmodel=MultinomialNB()
Mmodel.fit(train[predictors],train[target])
train_Mpred=Mmodel.predict(train[predictors])
test_Mpred=Mmodel.predict(test[predictors])
In [15]:

train_acc_multi=np.mean(train_Mpred==train[target])
test_acc_multi=np.mean(test_Mpred==test[target])
In [16]:

train_acc_multi
Out[16]:

0.7047619047619048
In [17]:

test_acc_multi
Out[17]:
0.6
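
Printed side by side, Gaussian NB clearly suits these continuous petal/sepal measurements better than Multinomial NB, which models count-like features; a short summary cell using the accuracies computed above:

In [ ]:

# Accuracy comparison of the two Naive Bayes variants fitted above
print('Gaussian    train/test:', train_acc_gau, test_acc_gau)
print('Multinomial train/test:', train_acc_multi, test_acc_multi)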

10. SVM classification on any dataset


In [1]:
import pandas as pd
import matplotlib.pyplot as plt
%matplotlib inline
In [2]:

df=pd.read_csv('E:/MCA/Sem3/AI_ML/Practical/Social_Network_Ads.csv')
In [3]:

df.head()
Out[3]:
User ID Gender Age EstimatedSalary Purchased

0 15624510 Male 19 19000 0

1 15810944 Male 35 20000 0

2 15668575 Female 26 43000 0

3 15603246 Female 27 57000 0

4 15804002 Male 19 76000 0

In [4]:

X=df[['Age','EstimatedSalary']]
y=df['Purchased']
In [5]:
from sklearn.model_selection import train_test_split
In [6]:

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.23, random_state=91)


In [7]:

from sklearn.preprocessing import MinMaxScaler


scaler=MinMaxScaler()
scaler.fit(X_train)
X_train_scaled=scaler.transform(X_train)
X_test_scaled=scaler.transform(X_test)
In [8]:

from sklearn.svm import SVC


In [9]:

model_lin = SVC(kernel='linear')
model_lin.fit(X_train_scaled,y_train)
model_lin.score(X_test_scaled,y_test)
Out[9]:
0.8043478260869565
In [10]:

model_poly = SVC(kernel='poly')
model_poly.fit(X_train_scaled,y_train)
model_poly.score(X_test_scaled,y_test)
Out[10]:
0.8913043478260869
In [11]:

model_rbf = SVC(kernel='rbf')
model_rbf.fit(X_train_scaled,y_train)
model_rbf.score(X_test_scaled,y_test)
Out[11]:
0.8913043478260869
In [12]:

#Actual data
class_0_act = X_test[y_test==0]
class_1_act = X_test[y_test==1]
plt.scatter(class_0_act['Age'],class_0_act['EstimatedSalary'],c='red')
plt.scatter(class_1_act['Age'],class_1_act['EstimatedSalary'],c='blue')

Out[12]:
<matplotlib.collections.PathCollection at 0x2996f7e3ee0>
In [13]:

#Plot points according to predicted values of linear kernel


y_pre = model_lin.predict(X_test_scaled)
class_0_pre = X_test[y_pre==0]
class_1_pre = X_test[y_pre==1]
plt.scatter(class_0_pre['Age'],class_0_pre['EstimatedSalary'],c='red')
plt.scatter(class_1_pre['Age'],class_1_pre['EstimatedSalary'],c='blue')
plt.title('Linear Kernel')
Out[13]:
Text(0.5, 1.0, 'Linear Kernel')

In [14]:
#Plot points according to predicted values of polynomial kernel
y_pre = model_poly.predict(X_test_scaled)
class_0_pre = X_test[y_pre==0]
class_1_pre = X_test[y_pre==1]
plt.scatter(class_0_pre['Age'],class_0_pre['EstimatedSalary'],c='red')
plt.scatter(class_1_pre['Age'],class_1_pre['EstimatedSalary'],c='blue')
plt.title('Polynomial Kernel')
Out[14]:
Text(0.5, 1.0, 'Polynomial Kernel')

In [15]:

#Plot points according to predicted values of rbf kernel


y_pre = model_rbf.predict(X_test_scaled)
class_0_pre = X_test[y_pre==0]
class_1_pre = X_test[y_pre==1]
plt.scatter(class_0_pre['Age'],class_0_pre['EstimatedSalary'],c='red')
plt.scatter(class_1_pre['Age'],class_1_pre['EstimatedSalary'],c='blue')
plt.title('RBF Kernel')

Out[15]:
Text(0.5, 1.0, 'RBF Kernel')

In [16]:

import numpy as np
In [17]:

plot_data = []
for x in range(0,100,1):
    for y in range(0,100,1):
        plot_data.append([x,y])
plot_data = np.array(plot_data)/100
In [18]:

plot_data
Out[18]:
array([[0. , 0. ],
[0. , 0.01],
[0. , 0.02],
...,
[0.99, 0.97],
[0.99, 0.98],
[0.99, 0.99]])
In [19]:

plot_data.shape
Out[19]:
(10000, 2)
In [20]:

y_plot = model_lin.predict(plot_data)
class_0 = plot_data[y_plot==0]
class_1 = plot_data[y_plot==1]
plt.scatter(class_0[:,0],class_0[:,1],c='red')
plt.scatter(class_1[:,0],class_1[:,1],c='blue')
plt.title('Linear Kernel')
Out[20]:
Text(0.5, 1.0, 'Linear Kernel')

In [21]:
y_plot = model_poly.predict(plot_data)
class_0 = plot_data[y_plot==0]
class_1 = plot_data[y_plot==1]
plt.scatter(class_0[:,0],class_0[:,1],c='red')
plt.scatter(class_1[:,0],class_1[:,1],c='blue')
plt.title('Poly Kernel')
Out[21]:
Text(0.5, 1.0, 'Poly Kernel')

In [22]:

y_plot = model_rbf.predict(plot_data)
class_0 = plot_data[y_plot==0]
class_1 = plot_data[y_plot==1]
plt.scatter(class_0[:,0],class_0[:,1],c='red')
plt.scatter(class_1[:,0],class_1[:,1],c='blue')
plt.title('rbf Kernel')

Out[22]:
Text(0.5, 1.0, 'rbf Kernel')

In [23]:

pts = np.array([[25,60000],[50,120000]])
pts_scaled = scaler.transform(pts)
In [25]:

pts_scaled
Out[25]:
array([[0.16666667, 0.33333333],
[0.76190476, 0.77777778]])
In [26]:

y = model_rbf.predict(pts_scaled)
y
Out[26]:
array([0, 1], dtype=int64)
12. Plot the cluster data using Python visualization
In [1]:
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans
import numpy as np
In [2]:

data = load_digits().data
pca = PCA(2)
In [3]:

df = pca.fit_transform(data)
In [4]:

df.shape
Out[4]:
(1797, 2)
In [5]:

from sklearn.cluster import KMeans


In [6]:

kmeans = KMeans(n_clusters= 10)


In [7]:

label = kmeans.fit_predict(df)

print(label)
[4 1 9 ... 9 8 0]
In [8]:

import matplotlib.pyplot as plt

filtered_label0 = df[label == 0]

plt.scatter(filtered_label0[:,0] , filtered_label0[:,1])
plt.show()

In [9]:

filtered_label2 = df[label == 2]

filtered_label8 = df[label == 8]
plt.scatter(filtered_label2[:,0] , filtered_label2[:,1] , color = 'red')
plt.scatter(filtered_label8[:,0] , filtered_label8[:,1] , color = 'black')
plt.show()

In [10]:

u_labels = np.unique(label)

for i in u_labels:
    plt.scatter(df[label == i , 0] , df[label == i , 1] , label = i)
plt.legend()
plt.show()
In [11]:

centroids = kmeans.cluster_centers_
u_labels = np.unique(label)

for i in u_labels:
    plt.scatter(df[label == i , 0] , df[label == i , 1] , label = i)
plt.scatter(centroids[:,0] , centroids[:,1] , s = 80, color = 'k')
plt.legend()
plt.show()

13. Creating and Visualizing Neural Network for the given data
In [2]:

import tensorflow as tf
---------------------------------------------------------------------------
ModuleNotFoundError Traceback (most recent call last)
C:\Users\MICROS~1\AppData\Local\Temp/ipykernel_13700/3793406994.py in <module>
----> 1 import tensorflow as tf
ModuleNotFoundError: No module named 'tensorflow'

In [16]:

from tensorflow import keras


In [17]:

from matplotlib.pyplot import title


In [21]:

from tensorflow.keras.models import Sequential,Model


from tensorflow.keras.layers import Dense, Dropout, Flatten
from tensorflow.keras.layers import Conv2D, MaxPooling2D
from tensorflow.keras.layers import LeakyReLU
In [22]:

model = Sequential()
model.add(Conv2D(32, kernel_size=(3, 3),activation='linear',input_shape=(28,28,1),padding='same'))
model.add(LeakyReLU(alpha=0.1))
model.add(MaxPooling2D((2, 2),padding='same'))
model.add(Conv2D(64, (3, 3), activation='linear',padding='same'))
model.add(LeakyReLU(alpha=0.1))
model.add(MaxPooling2D(pool_size=(2, 2),padding='same'))
model.add(Conv2D(128, (3, 3), activation='linear',padding='same'))
model.add(LeakyReLU(alpha=0.1))
model.add(MaxPooling2D(pool_size=(2, 2),padding='same'))
model.add(Flatten())
model.add(Dense(128, activation='linear'))
model.add(LeakyReLU(alpha=0.1))
model.add(Dense(500, activation='softmax'))
In [23]:

model.compile(loss=keras.losses.categorical_crossentropy,
optimizer=keras.optimizers.Adam(),metrics=['accuracy'])
In [26]:

model.summary()
Model: "sequential_1"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
conv2d (Conv2D) (None, 28, 28, 32) 320
_________________________________________________________________
leaky_re_lu (LeakyReLU) (None, 28, 28, 32) 0
_________________________________________________________________
max_pooling2d (MaxPooling2D) (None, 14, 14, 32) 0
_________________________________________________________________
conv2d_1 (Conv2D) (None, 14, 14, 64) 18496
_________________________________________________________________
leaky_re_lu_1 (LeakyReLU) (None, 14, 14, 64) 0
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 7, 7, 64) 0
_________________________________________________________________
conv2d_2 (Conv2D) (None, 7, 7, 128) 73856
_________________________________________________________________
leaky_re_lu_2 (LeakyReLU) (None, 7, 7, 128) 0
_________________________________________________________________
max_pooling2d_2 (MaxPooling2 (None, 4, 4, 128) 0
_________________________________________________________________
flatten (Flatten) (None, 2048) 0
_________________________________________________________________
dense_2 (Dense) (None, 128) 262272
_________________________________________________________________
leaky_re_lu_3 (LeakyReLU) (None, 128) 0
_________________________________________________________________
dense_3 (Dense) (None, 500) 64500
=================================================================
Total params: 419,444
Trainable params: 419,444
Non-trainable params: 0
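
Since the practical also asks for visualizing the network, the layer graph can be rendered to an image. A hedged sketch using keras.utils.plot_model, which additionally requires the optional pydot and graphviz packages:

In [ ]:

from tensorflow.keras.utils import plot_model

# Writes the layer graph (with tensor shapes) to model.png
plot_model(model, to_file='model.png', show_shapes=True)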

14. Recognize optical character using ANN


In [17]:

from tensorflow.keras.datasets import mnist


In [18]:

(x_train,y_train),(x_test,y_test)=mnist.load_data()
Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/mnist.npz
11493376/11490434 [==============================] - 1s 0us/step
11501568/11490434 [==============================] - 1s 0us/step
In [19]:

x_train.shape
Out[19]:
(60000, 28, 28)
In [20]:

X_train=x_train.reshape(60000,784)
X_test=x_test.reshape(10000,784)
In [21]:

from tensorflow.keras.utils import to_categorical


In [22]:

y_train=to_categorical(y_train,num_classes=10)
y_test=to_categorical(y_test,num_classes=10)
In [23]:

X_train=X_train/255
X_test=X_test/255
In [24]:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
In [25]:

model=Sequential()
model.add(Dense(50,activation='relu',input_shape=(784,)))
model.add(Dense(50,activation='relu'))
model.add(Dense(10,activation='softmax'))
In [26]:

model.summary()
Model: "sequential"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
dense (Dense) (None, 50) 39250

dense_1 (Dense) (None, 50) 2550

dense_2 (Dense) (None, 10) 510

=================================================================
Total params: 42,310
Trainable params: 42,310
Non-trainable params: 0
_________________________________________________________________
In [27]:

model.compile(loss='categorical_crossentropy',metrics=['accuracy'])
In [28]:
model.fit(X_train,y_train,batch_size=64,epochs=10,validation_data=(X_test,y_test))
Epoch 1/10
938/938 [==============================] - 3s 3ms/step - loss: 0.3354 - accuracy: 0.9043 - val_loss: 0.1931 - val_accuracy: 0.9422
Epoch 2/10
938/938 [==============================] - 2s 2ms/step - loss: 0.1624 - accuracy: 0.9517 - val_loss: 0.1451 - val_accuracy: 0.9548
Epoch 3/10
938/938 [==============================] - 2s 2ms/step - loss: 0.1209 - accuracy: 0.9638 - val_loss: 0.1142 - val_accuracy: 0.9675
Epoch 4/10
938/938 [==============================] - 2s 2ms/step - loss: 0.1001 - accuracy: 0.9703 - val_loss: 0.1110 - val_accuracy: 0.9685
Epoch 5/10
938/938 [==============================] - 2s 2ms/step - loss: 0.0854 - accuracy: 0.9745 - val_loss: 0.1027 - val_accuracy: 0.9697
Epoch 6/10
938/938 [==============================] - 2s 2ms/step - loss: 0.0745 - accuracy: 0.9780 - val_loss: 0.0963 - val_accuracy: 0.9724
Epoch 7/10
938/938 [==============================] - 2s 2ms/step - loss: 0.0658 - accuracy: 0.9800 - val_loss: 0.1030 - val_accuracy: 0.9718
Epoch 8/10
938/938 [==============================] - 2s 2ms/step - loss: 0.0586 - accuracy: 0.9825 - val_loss: 0.1076 - val_accuracy: 0.9713
Epoch 9/10
938/938 [==============================] - 2s 2ms/step - loss: 0.0542 - accuracy: 0.9835 - val_loss: 0.0909 - val_accuracy: 0.9755
Epoch 10/10
938/938 [==============================] - 2s 3ms/step - loss: 0.0476 - accuracy: 0.9855 - val_loss: 0.0950 - val_accuracy: 0.9748
Out[28]:
<keras.callbacks.History at 0x1ef21ab46a0>
In [29]:

import numpy as np
In [30]:

X_train
Out[30]:
array([[0., 0., 0., ..., 0., 0., 0.],
[0., 0., 0., ..., 0., 0., 0.],
[0., 0., 0., ..., 0., 0., 0.],
...,
[0., 0., 0., ..., 0., 0., 0.],
[0., 0., 0., ..., 0., 0., 0.],
[0., 0., 0., ..., 0., 0., 0.]])
In [31]:

y_train[:5,:]
Out[31]:
array([[0., 0., 0., 0., 0., 1., 0., 0., 0., 0.],
[1., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
[0., 0., 0., 0., 1., 0., 0., 0., 0., 0.],
[0., 1., 0., 0., 0., 0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0., 0., 0., 0., 0., 1.]], dtype=float32)
In [32]:

img0 = np.array(X_train[0]).reshape(1,784)
In [33]:

model.predict(img0).argmax()
Out[33]:
5
In [34]:
y_train[0].argmax()
Out[34]:
5
In [35]:

def recognise(img):
img=np.array(img).reshape(1,784)
return model.predict(img).argmax()
In [36]:

y_pre=model.predict(X_test).argmax(axis=1)
In [37]:

y_pre
Out[37]:
array([7, 2, 1, ..., 4, 5, 6], dtype=int64)
In [38]:

len(y_pre)
Out[38]:
10000
In [39]:
y_test.argmax(axis=1)
Out[39]:
array([7, 2, 1, ..., 4, 5, 6], dtype=int64)
In [40]:

sum(y_pre==y_test.argmax(axis=1))
Out[40]:
9748
In [41]:

9748/10000
Out[41]:
0.9748
In [42]:

import matplotlib.pyplot as plt


In [43]:

plt.imshow(np.array(X_test[560]).reshape(28,28))
Out[43]:
<matplotlib.image.AxesImage at 0x1ef22238640>
In [44]:

recognise(X_test[560])
Out[44]:
9
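
A hedged follow-up for inspecting mistakes, using only arrays already computed above: collect the test indices where the ANN's prediction disagrees with the label; any of them can then be viewed with plt.imshow as done for index 560.

In [ ]:

# Indices of misclassified test digits
wrong = np.where(y_pre != y_test.argmax(axis=1))[0]
print(len(wrong), wrong[:10])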

15. Write a program to implement CNN


In [187]:

# This Python 3 environment comes with many helpful analytics libraries installed
# It is defined by the kaggle/python Docker image: https://github.com/kaggle/docker-python
# For example, here's several helpful packages to load

import numpy as np # linear algebra


import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)
# Input data files are available in the read-only "../input/" directory
# For example, running this (by clicking run or pressing Shift+Enter) will list all files under the input directory

import os
for dirname, _, filenames in os.walk('E:/MCA/Sem3/AI_ML/Practical/kagglecatsanddogs_3367a/PetImages/Dog'):
    for filename in filenames:
        print(os.path.join(dirname, filename))

# You can write up to 20GB to the current directory (/kaggle/working/) that gets preserved
# as output when you create a version using "Save & Run All"
# You can also write temporary files to /kaggle/temp/, but they won't be saved outside of the current session
E:/MCA/Sem3/AI_ML/Practical/kagglecatsanddogs_3367a/PetImages/Dog\0.jpg
E:/MCA/Sem3/AI_ML/Practical/kagglecatsanddogs_3367a/PetImages/Dog\1.jpg
E:/MCA/Sem3/AI_ML/Practical/kagglecatsanddogs_3367a/PetImages/Dog\10.jpg
E:/MCA/Sem3/AI_ML/Practical/kagglecatsanddogs_3367a/PetImages/Dog\100.jpg
E:/MCA/Sem3/AI_ML/Practical/kagglecatsanddogs_3367a/PetImages/Dog\11.jpg
E:/MCA/Sem3/AI_ML/Practical/kagglecatsanddogs_3367a/PetImages/Dog\12.jpg
E:/MCA/Sem3/AI_ML/Practical/kagglecatsanddogs_3367a/PetImages/Dog\13.jpg
E:/MCA/Sem3/AI_ML/Practical/kagglecatsanddogs_3367a/PetImages/Dog\14.jpg
E:/MCA/Sem3/AI_ML/Practical/kagglecatsanddogs_3367a/PetImages/Dog\15.jpg
E:/MCA/Sem3/AI_ML/Practical/kagglecatsanddogs_3367a/PetImages/Dog\16.jpg
...
E:/MCA/Sem3/AI_ML/Practical/kagglecatsanddogs_3367a/PetImages/Dog\93.jpg
E:/MCA/Sem3/AI_ML/Practical/kagglecatsanddogs_3367a/PetImages/Dog\94.jpg
E:/MCA/Sem3/AI_ML/Practical/kagglecatsanddogs_3367a/PetImages/Dog\95.jpg
E:/MCA/Sem3/AI_ML/Practical/kagglecatsanddogs_3367a/PetImages/Dog\96.jpg
E:/MCA/Sem3/AI_ML/Practical/kagglecatsanddogs_3367a/PetImages/Dog\97.jpg
E:/MCA/Sem3/AI_ML/Practical/kagglecatsanddogs_3367a/PetImages/Dog\98.jpg
E:/MCA/Sem3/AI_ML/Practical/kagglecatsanddogs_3367a/PetImages/Dog\99.jpg
In [188]:

##Link for dataset: https://www.kaggle.com/biaiscience/dogs-vs-cats


In [189]:
os.listdir('E:/MCA/Sem3/AI_ML/Practical/kagglecatsanddogs_3367a/PetImages/Dog')
Out[189]:
['0.jpg',
'1.jpg',
'10.jpg',
'100.jpg',
'11.jpg',
'12.jpg',
'13.jpg',
'14.jpg',
'15.jpg',
'16.jpg',
'17.jpg',
'18.jpg',
'19.jpg',
'2.jpg',
'20.jpg',
...
'95.jpg',
'96.jpg',
'97.jpg',
'98.jpg',
'99.jpg']
In [190]:

filenames=os.listdir('E:/MCA/Sem3/AI_ML/Practical/kagglecatsanddogs_3367a/PetImages/Dog')
In [191]:
len(filenames)
Out[191]:
101
In [192]:

filenames[:5]
Out[192]:
['0.jpg', '1.jpg', '10.jpg', '100.jpg', '11.jpg']
In [193]:

df=pd.DataFrame({'filename':filenames})
df.head()
Out[193]:

filename

0 0.jpg

1 1.jpg

2 10.jpg

3 100.jpg

4 11.jpg

In [194]:
df['class']=df['filename'].apply(lambda X:X[:3])
In [195]:

df.head()
Out[195]:

filename class

0 0.jpg 0.j

1 1.jpg 1.j

2 10.jpg 10.

3 100.jpg 100

4 11.jpg 11.

In [196]:

from tensorflow.keras.preprocessing.image import ImageDataGenerator


In [197]:

data_gen=ImageDataGenerator(zoom_range=0.2,shear_range=0.2,horizontal_flip=True,rescale=1/255)
In [198]:

train_data=data_gen.flow_from_dataframe(df,'E:/MCA/Sem3/AI_ML/Practical/kagglecatsanddogs_3367a/PetImages/Dog',x_col='filename',y_col='class',target_size=(224,224))
Found 101 validated image filenames belonging to 101 classes.
In [199]:

from tensorflow.keras.models import Sequential


from tensorflow.keras.layers import Conv2D,MaxPool2D,Flatten,Dense
In [200]:

model=Sequential()
model.add(Conv2D(16,(3,3),activation='relu',input_shape=(224,224,3)))
model.add(MaxPool2D())
model.add(Conv2D(32,(3,3),activation='relu'))
model.add(MaxPool2D())
model.add(Conv2D(64,(3,3),activation='relu'))
model.add(MaxPool2D())
model.add(Conv2D(64,(5,5),activation='relu'))
model.add(MaxPool2D())
model.add(Conv2D(128,(3,3),activation='relu'))
model.add(MaxPool2D())
model.add(Flatten())
model.add(Dense(2,activation='softmax'))
In [201]:

model.summary()
Model: "sequential_7"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
conv2d_35 (Conv2D) (None, 222, 222, 16) 448

max_pooling2d_35 (MaxPoolin (None, 111, 111, 16) 0


g2D)

conv2d_36 (Conv2D) (None, 109, 109, 32) 4640


max_pooling2d_36 (MaxPoolin (None, 54, 54, 32) 0
g2D)

conv2d_37 (Conv2D) (None, 52, 52, 64) 18496

max_pooling2d_37 (MaxPoolin (None, 26, 26, 64) 0


g2D)

conv2d_38 (Conv2D) (None, 22, 22, 64) 102464

max_pooling2d_38 (MaxPoolin (None, 11, 11, 64) 0


g2D)

conv2d_39 (Conv2D) (None, 9, 9, 128) 73856

max_pooling2d_39 (MaxPoolin (None, 4, 4, 128) 0


g2D)

flatten_7 (Flatten) (None, 2048) 0

dense_7 (Dense) (None, 2) 4098

=================================================================
Total params: 204,002
Trainable params: 204,002
Non-trainable params: 0
_________________________________________________________________
In [202]:

model.compile(optimizer='adam',loss='categorical_crossentropy',metrics=['accuracy'])
In [203]:

#model.fit_generator(train_data,epochs=5)
In [204]:
import cv2

def get_class(img_path):
    img=cv2.imread(img_path)
    img=cv2.resize(img,(224,224))
    img=img/255
    op=model.predict(img.reshape(1,224,224,3)).argmax()
    return 'Cat' if op==0 else 'Dog'
In [205]:

train_data.class_mode
Out[205]:
'categorical'
In [207]:

get_class('E:/MCA/Sem3/AI_ML/Practical/kagglecatsanddogs_3367a/PetImages/Dog/10.jpg')
Out[207]:
'Dog'

16. Write a program to implement RNN


In [52]:

from tensorflow.keras.datasets import imdb


In [53]:

(X_train,y_train),(X_test,y_test)=imdb.load_data(num_words=20000)
In [54]:

X_train.shape,X_test.shape
Out[54]:
((25000,), (25000,))
In [55]:

len(X_train[0]),len(X_train[1]),len(X_train[2]),len(X_train[3]),len(X_train[4])
Out[55]:
(218, 189, 141, 550, 147)
In [56]:

y_train[:5]
Out[56]:
array([1, 0, 0, 1, 0], dtype=int64)
In [57]:

X_train[0]
Out[57]:
[1,
14,
22,
16,
43,
530,
973,
1622,
1385,
65,
...
103,
32,
15,
16,
5345,
19,
178,
32]
In [58]:

import numpy as np
In [59]:

np.array(X_train[0])

Out[59]:
array([ 1, 14, 22, 16, 43, 530, 973, 1622, 1385,
65, 458, 4468, 66, 3941, 4, 173, 36, 256,
5, 25, 100, 43, 838, 112, 50, 670, 2,
9, 35, 480, 284, 5, 150, 4, 172, 112,
167, 2, 336, 385, 39, 4, 172, 4536, 1111,
17, 546, 38, 13, 447, 4, 192, 50, 16,
6, 147, 2025, 19, 14, 22, 4, 1920, 4613,
469, 4, 22, 71, 87, 12, 16, 43, 530,
38, 76, 15, 13, 1247, 4, 22, 17, 515,
17, 12, 16, 626, 18, 19193, 5, 62, 386,
12, 8, 316, 8, 106, 5, 4, 2223, 5244,
16, 480, 66, 3785, 33, 4, 130, 12, 16,
38, 619, 5, 25, 124, 51, 36, 135, 48,
25, 1415, 33, 6, 22, 12, 215, 28, 77,
52, 5, 14, 407, 16, 82, 10311, 8, 4,
107, 117, 5952, 15, 256, 4, 2, 7, 3766,
5, 723, 36, 71, 43, 530, 476, 26, 400,
317, 46, 7, 4, 12118, 1029, 13, 104, 88,
4, 381, 15, 297, 98, 32, 2071, 56, 26,
141, 6, 194, 7486, 18, 4, 226, 22, 21,
134, 476, 26, 480, 5, 144, 30, 5535, 18,
51, 36, 28, 224, 92, 25, 104, 4, 226,
65, 16, 38, 1334, 88, 12, 16, 283, 5,
16, 4472, 113, 103, 32, 15, 16, 5345, 19,
178, 32])
In [60]:

from tensorflow.keras.preprocessing.sequence import pad_sequences


In [61]:

X=pad_sequences(X_train,maxlen=200)
X_val=pad_sequences(X_test,maxlen=200)
In [62]:

len(X[0])
Out[62]:
200
In [63]:

from tensorflow.keras.models import Sequential


from tensorflow.keras.layers import LSTM,Dense,Embedding
In [64]:
model=Sequential()
model.add(Embedding(20000,128,input_shape=(200,)))
model.add(LSTM(100,return_sequences=True))
model.add(LSTM(100))
model.add(Dense(1,activation='sigmoid'))
In [65]:

model.compile(loss='binary_crossentropy',optimizer='adam',metrics=['accuracy'])
In [66]:

model.fit(X,y_train,validation_data=(X_val,y_test),epochs=5,batch_size=64)

Epoch 1/5
391/391 [==============================] - 479s 1s/step - loss: 0.3931 - accuracy: 0.8199 - val_loss: 0.3054 - val_accuracy: 0.8726
Epoch 2/5
391/391 [==============================] - 457s 1s/step - loss: 0.2015 - accuracy: 0.9246 - val_loss: 0.3744 - val_accuracy: 0.8514
Epoch 3/5
391/391 [==============================] - 466s 1s/step - loss: 0.1295 - accuracy: 0.9542 - val_loss: 0.3851 - val_accuracy: 0.8615
Epoch 4/5
391/391 [==============================] - 478s 1s/step - loss: 0.0785 - accuracy: 0.9737 - val_loss: 0.4942 - val_accuracy: 0.8522
Epoch 5/5
391/391 [==============================] - 533s 1s/step - loss: 0.0627 - accuracy: 0.9784 - val_loss: 0.5105 - val_accuracy: 0.8246
Out[66]:
<keras.callbacks.History at 0x1da72672280>
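
A hedged usage sketch: the sigmoid output is the probability that a review is positive, so a single padded validation review can be scored and thresholded at 0.5:

In [ ]:

prob = model.predict(X_val[:1])[0][0]
print('positive' if prob > 0.5 else 'negative', prob)
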
17. Web scraping experiments (by using tools)
In [126]:

import requests
from bs4 import BeautifulSoup
import csv

URL = 'http://www.values.com/inspirational-quotes'

r = requests.get(URL)

soup = BeautifulSoup(r.content, 'html5lib')

quotes=[]
In [127]:
soup.find('div', attrs = {'id':'all_quotes'})
Out[127]:
<div class="row" id="all_quotes">
<div class="col-6 col-lg-3 text-center margin-30px-bottom sm-
margin-30px-top">

<a href="/inspirational-quotes/3331-wherever-we-are-it-is-our-
friends-that-make"><img alt="Wherever we are, it is our friends that make our
world. #&lt;Author:0x00007f188bf2e298&gt;" class="margin-10px-bottom shadow"
height="310"
src="https://assets.passiton.com/quotes/quote_artwork/3331/medium/20220210_th
ursday_quote_updated.jpg?1644000474" width="310"/></a>
<h5 class="value_on_red"><a href="/inspirational-quotes/3331-
wherever-we-are-it-is-our-friends-that-make">FRIENDSHIP</a></h5>

</div><div class="col-6 col-lg-3 text-center margin-30px-bottom sm-margin-


30px-top">

<a href="/inspirational-quotes/8303-find-a-group-of-people-who-
challenge-and"><img alt="Find a group of people who challenge and inspire
........................

<a href="/inspirational-quotes/7182-the-bond-that-links-your-true-
family-is-not-one"><img alt="The bond that links your true family is not one
of blood, but of respect and joy in each other's life. Rarely do members of
one family grow up under the same roof. #&lt;Author:0x00007f18843a51f8&gt;"
class="margin-10px-bottom shadow" height="310"
src="https://assets.passiton.com/quotes/quote_artwork/7182/medium/20211229_we
dnesday_quote.jpg?1640015962" width="310"/></a>
<h5 class="value_on_red"><a href="/inspirational-quotes/7182-the-
bond-that-links-your-true-family-is-not-one">FAMILY</a></h5>

</div>
</div>
In [128]:
for row in soup.find_all_next('div', attrs = {'class': 'col-6 col-lg-3 text-center margin-30px-bottom sm-margin-30px-top'}):
    quote = {}
    quote['theme'] = row.h5.text
    quote['url'] = row.a['href']
    quote['img'] = row.img['src']
    quote['lines'] = row.img['alt'].split(" #")[0]
    quote['author'] = row.img['alt'].split(" #")[1]
    quotes.append(quote)
In [131]:

filename = 'inspirational_quotes.csv'
with open(filename, 'w', newline='') as f:
    w = csv.DictWriter(f,['theme','url','img','lines','author'])
    w.writeheader()
    for quote in quotes:
        w.writerow(quote)
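
A hedged sanity check, reading the file back with pandas to confirm the rows were written:

In [ ]:

import pandas as pd

print(pd.read_csv('inspirational_quotes.csv').head())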
