3) Code For ID3 Algorithm Implementation

The document loads and analyzes Iris flower data using Python libraries like Pandas and Seaborn. It uploads an Iris CSV file, loads the data into a Pandas dataframe, then performs various visualizations and analyses. These include scatter plots of features colored by species, box plots, density plots, and pair plots to understand relationships between features and species. It also fits a decision tree classifier to the data and plots the tree.

Uploaded by

Prajith Sprinťèř


import pandas as pd

from google.colab import files
uploaded = files.upload()

Saving Iris.csv to Iris (1).csv

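import io
# files.upload() returns a dict keyed by the saved file name. Because the file
# was saved as "Iris (1).csv", the key may differ from "Iris.csv", so read the
# first (only) uploaded entry instead of hard-coding the name.
Iris = pd.read_csv(io.BytesIO(next(iter(uploaded.values()))))
Iris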

      Id  SepalLengthCm  SepalWidthCm  PetalLengthCm  PetalWidthCm         Species
0      1            5.1           3.5            1.4           0.2     Iris-setosa
1      2            4.9           3.0            1.4           0.2     Iris-setosa
2      3            4.7           3.2            1.3           0.2     Iris-setosa
3      4            4.6           3.1            1.5           0.2     Iris-setosa
4      5            5.0           3.6            1.4           0.2     Iris-setosa
..   ...            ...           ...            ...           ...             ...
145  146            6.7           3.0            5.2           2.3  Iris-virginica
146  147            6.3           2.5            5.0           1.9  Iris-virginica
147  148            6.5           3.0            5.2           2.0  Iris-virginica
148  149            6.2           3.4            5.4           2.3  Iris-virginica
149  150            5.9           3.0            5.1           1.8  Iris-virginica

150 rows × 6 columns

import pandas as pd

# We'll also import seaborn, a Python graphing library
import warnings # current version of seaborn generates a bunch of warnings that we'll ignore
warnings.filterwarnings("ignore")
import seaborn as sns
import matplotlib.pyplot as plt
sns.set(style="white", color_codes=True)

Iris.head()

   Id  SepalLengthCm  SepalWidthCm  PetalLengthCm  PetalWidthCm      Species
0   1            5.1           3.5            1.4           0.2  Iris-setosa
1   2            4.9           3.0            1.4           0.2  Iris-setosa
2   3            4.7           3.2            1.3           0.2  Iris-setosa
3   4            4.6           3.1            1.5           0.2  Iris-setosa
4   5            5.0           3.6            1.4           0.2  Iris-setosa

Iris["Species"].value_counts()

Iris-setosa        50
Iris-versicolor    50
Iris-virginica     50
Name: Species, dtype: int64

# The first way we can plot things is using the .plot extension from Pandas dataframes
# We'll use this to make a scatterplot of the Iris features.
Iris.plot(kind="scatter", x="SepalLengthCm", y="SepalWidthCm")

*c* argument looks like a single numeric RGB or RGBA sequence, which should be avoide
<matplotlib.axes._subplots.AxesSubplot at 0x7fb55c046750>

# We can also use the seaborn library to make a similar plot
# A seaborn jointplot shows bivariate scatterplots and univariate histograms in the same figure
# (seaborn 0.9 renamed the `size` argument to `height`)
sns.jointplot(x="SepalLengthCm", y="SepalWidthCm", data=Iris, height=10)
<seaborn.axisgrid.JointGrid at 0x7fb55bf27150>

# One piece of information missing in the plots above is what species each plant is
# We'll use seaborn's FacetGrid to color the scatterplot by species
# (seaborn 0.9 renamed the `size` argument to `height`)
sns.FacetGrid(Iris, hue="Species", height=5) \
.map(plt.scatter, "SepalLengthCm", "SepalWidthCm") \
.add_legend()

<seaborn.axisgrid.FacetGrid at 0x7fb55bd81710>

# We can look at an individual feature in Seaborn through a boxplot
sns.boxplot(x="Species", y="PetalLengthCm", data=Iris)
<matplotlib.axes._subplots.AxesSubplot at 0x7fb55bc8ced0>

# A final seaborn plot useful for looking at univariate relations is the kdeplot,
# which creates and visualizes a kernel density estimate of the underlying feature
# (seaborn 0.9 renamed the `size` argument to `height`)
sns.FacetGrid(Iris, hue="Species", height=6) \
.map(sns.kdeplot, "SepalLengthCm") \
.add_legend()

<seaborn.axisgrid.FacetGrid at 0x7fb5657b0350>

# Another useful seaborn plot is the pairplot, which shows the bivariate relation
# between each pair of features
#
# From the pairplot, we'll see that the Iris-setosa species is separated from the other
# two across all feature combinations
# (seaborn 0.9 renamed the `size` argument to `height`)
sns.pairplot(Iris.drop("Id", axis=1), hue="Species", height=3)
<seaborn.axisgrid.PairGrid at 0x7fb56579ae50>
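
The separation of Iris-setosa noted in the pairplot comment can also be checked numerically. The sketch below is an illustration rather than part of the original notebook: it assumes scikit-learn's bundled copy of the dataset (loaded with load_iris, where class 0 is setosa) holds the same measurements as the uploaded Iris.csv, and it only tests the two petal features, which is enough to separate setosa in any feature pair that includes one of them.

```python
from sklearn.datasets import load_iris

iris = load_iris()
X, y = iris.data, iris.target  # class 0 is setosa in this copy of the data

# Column indices in scikit-learn's copy: 2 = petal length (cm), 3 = petal width (cm).
for col in (2, 3):
    setosa_max = X[y == 0, col].max()   # largest value among setosa samples
    others_min = X[y != 0, col].min()   # smallest value among the other two species
    print(iris.feature_names[col], setosa_max, others_min, setosa_max < others_min)
```

If the last printed value is True for a feature, a single threshold on that feature isolates setosa, which matches the first split the decision tree makes later in the notebook.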

from sklearn.datasets import load_iris
from sklearn import tree
iris = load_iris()
X, y = iris.data, iris.target
clf = tree.DecisionTreeClassifier()
clf = clf.fit(X, y)
clf

DecisionTreeClassifier(ccp_alpha=0.0, class_weight=None, criterion='gini',
                       max_depth=None, max_features=None, max_leaf_nodes=None,
                       min_impurity_decrease=0.0, min_impurity_split=None,
                       min_samples_leaf=1, min_samples_split=2,
                       min_weight_fraction_leaf=0.0, presort='deprecated',
                       random_state=None, splitter='best')
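
The output above shows criterion='gini', so this tree is scored with Gini impurity, whereas ID3 chooses splits by information gain (entropy reduction). The sketch below is a rough approximation and not a true ID3 implementation: it assumes that switching scikit-learn's classifier to criterion="entropy" is close enough for illustration, although the library still builds binary splits on numeric thresholds. The helper functions entropy, gini, and the variable names are hypothetical and introduced only for this example.

```python
import math

from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier


def entropy(counts):
    """Shannon entropy, in bits, of a list of class counts."""
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)


def gini(counts):
    """Gini impurity of a list of class counts."""
    total = sum(counts)
    return 1.0 - sum((c / total) ** 2 for c in counts)


iris = load_iris()
X, y = iris.data, iris.target

# Same data as the notebook, but splits are scored by entropy instead of Gini.
entropy_clf = DecisionTreeClassifier(criterion="entropy", random_state=0).fit(X, y)

root_counts = [50, 50, 50]       # class counts at the root node
left, right = [50, 0, 0], [0, 50, 50]  # children of the split that isolates setosa

# Information gain = parent entropy minus the size-weighted child entropies.
info_gain = entropy(root_counts) - (
    sum(left) / 150 * entropy(left) + sum(right) / 150 * entropy(right)
)

print(round(gini(root_counts), 3))      # matches the gini value at the root of the plotted tree
print(round(entropy(root_counts), 3))   # log2(3) for three equally frequent classes
print(round(info_gain, 3))
print(entropy_clf.tree_.impurity[0])    # root impurity as stored by scikit-learn
```

The hand-computed root entropy should agree with the value scikit-learn stores in tree_.impurity[0], which is a quick way to confirm the criterion took effect.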

tree.plot_tree(clf)

[Text(167.4, 199.32, 'X[2] <= 2.45\ngini = 0.667\nsamples = 150\nvalue = [50, 50, 50]
Text(141.64615384615385, 163.07999999999998, 'gini = 0.0\nsamples = 50\nvalue = [50,
Text(193.15384615384616, 163.07999999999998, 'X[3] <= 1.75\ngini = 0.5\nsamples = 10
Text(103.01538461538462, 126.83999999999999, 'X[2] <= 4.95\ngini = 0.168\nsamples =
Text(51.50769230769231, 90.6, 'X[3] <= 1.65\ngini = 0.041\nsamples = 48\nvalue = [0,
Text(25.753846153846155, 54.359999999999985, 'gini = 0.0\nsamples = 47\nvalue = [0,
Text(77.26153846153846, 54.359999999999985, 'gini = 0.0\nsamples = 1\nvalue = [0, 0,
Text(154.52307692307693, 90.6, 'X[3] <= 1.55\ngini = 0.444\nsamples = 6\nvalue = [0,
Text(128.76923076923077, 54.359999999999985, 'gini = 0.0\nsamples = 3\nvalue = [0, 0
Text(180.27692307692308, 54.359999999999985, 'X[0] <= 6.95\ngini = 0.444\nsamples =
Text(154.52307692307693, 18.119999999999976, 'gini = 0.0\nsamples = 2\nvalue = [0, 2
Text(206.03076923076924, 18.119999999999976, 'gini = 0.0\nsamples = 1\nvalue = [0, 0
Text(283.2923076923077, 126.83999999999999, 'X[2] <= 4.85\ngini = 0.043\nsamples = 4
Text(257.53846153846155, 90.6, 'X[1] <= 3.1\ngini = 0.444\nsamples = 3\nvalue = [0,
Text(231.7846153846154, 54.359999999999985, 'gini = 0.0\nsamples = 2\nvalue = [0, 0,
Text(283.2923076923077, 54.359999999999985, 'gini = 0.0\nsamples = 1\nvalue = [0, 1,
Text(309.04615384615386, 90.6, 'gini = 0.0\nsamples = 43\nvalue = [0, 0, 43]')]
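
The list of Text objects returned by plot_tree is hard to read once the figure is not available. As a possible alternative, scikit-learn's export_text (available since version 0.21) prints the fitted tree as indented rules. This is a sketch under the assumption that the same bundled dataset is used; random_state=0 is an added assumption so the output is repeatable, and the variable name rules is hypothetical.

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

iris = load_iris()
clf = DecisionTreeClassifier(random_state=0).fit(iris.data, iris.target)

# Render the fitted tree as plain-text rules, labeling splits with feature names.
rules = export_text(clf, feature_names=list(iris.feature_names))
print(rules)
```

Each indented line is one test on a feature, and lines starting with "class:" are leaves, so the nesting mirrors the structure of the plotted tree.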

import graphviz

dot_data = tree.export_graphviz(clf, out_file=None)

graph = graphviz.Source(dot_data)

graph.render("iris")

'iris.pdf'

dot_data = tree.export_graphviz(clf, out_file=None,
                                feature_names=iris.feature_names,
                                class_names=iris.target_names,
                                filled=True, rounded=True,
                                special_characters=True)

graph = graphviz.Source(dot_data)

graph

[Figure: rendered decision tree, truncated in the source. Visible nodes: a leaf on the True branch with gini = 0.0, samples = 50, value = [50, 0, 0], class = setosa; and an internal node splitting on petal length (cm) with gini = 0.168, samples = 54.]
