Task 1 Iris Flower Classification Using Machine Learning

Iris flower classification with the help of a machine learning algorithm in Python

Uploaded by

jadhavvikram863
7/13/23, 11:06 AM    Iris Flower Project Task 1 - Jupyter Notebook
localhost:8888/notebooks/Desktop/Final Project/Iris Flower Project Task 1.ipynb#Submitted-By-Mr-Omkar-Balwant-Jadhav

Project Report On Iris Flower Classification Using Machine Learning

Submitted By, Mr. Omkar Balwant Jadhav

Introduction

The purpose of this project is to perform classification on the Iris flower dataset. The Iris dataset is widely used in machine learning and consists of measurements of four features (Sepal Length, Sepal Width, Petal Length, and Petal Width) for three species of Iris flowers (Setosa, Versicolor, and Virginica).

Iris Flower Classification

- Defining the problem statement
- Collecting the data
- Filtering Data
- Exploratory data analysis
- Feature engineering
- Data Visualization
- Machine Learning Stages

1. Defining the problem statement

In this project we study the Iris flower data, which is stored in tabular format, using libraries such as numpy, pandas and matplotlib together with machine learning algorithms. We examine the columns of the table, correlate them with one another, and look for relationships between pairs of columns. We then analyze the key factors, such as species and petal length, that help classify the Iris data.

2. Collecting Data

In [2]:
    import pandas as pd
    iris_data = pd.read_csv("C:\\Users\\Onkar\\Desktop\\Iris.csv")
    iris_data

Out[2]:
          Id  SepalLengthCm  SepalWidthCm  PetalLengthCm  PetalWidthCm         Species
    0      1            5.1           3.5            1.4           0.2     Iris-setosa
    1      2            4.9           3.0            1.4           0.2     Iris-setosa
    2      3            4.7           3.2            1.3           0.2     Iris-setosa
    3      4            4.6           3.1            1.5           0.2     Iris-setosa
    4      5            5.0           3.6            1.4           0.2     Iris-setosa
    ...
    145  146            6.7           3.0            5.2           2.3  Iris-virginica
    146  147            6.3           2.5            5.0           1.9  Iris-virginica
    147  148            6.5           3.0            5.2           2.0  Iris-virginica
    148  149            6.2           3.4            5.4           2.3  Iris-virginica
    149  150            5.9           3.0            5.1           1.8  Iris-virginica

    150 rows × 6 columns

3. Filtering Data

In [3]:
    iris_data.shape

Out[3]:
    (150, 6)

In [4]:
    iris_data["Species"].value_counts()

Out[4]:
    Iris-setosa        50
    Iris-versicolor    50
    Iris-virginica     50
    Name: Species, dtype: int64

4. Exploratory data analysis

Exploratory Data Analysis (EDA) is the process of performing initial investigations on data in order to discover patterns, spot anomalies, test hypotheses, and check assumptions with the help of summary statistics and graphical representations. It is good practice to understand the data first and to gather as many insights from it as possible. EDA is all about making sense of the data in hand.

In [5]:
    iris_data.info()

    RangeIndex: 150 entries, 0 to 149
    Data columns (total 6 columns):
     #   Column         Non-Null Count  Dtype
     0   Id             150 non-null    int64
     1   SepalLengthCm  150 non-null    float64
     2   SepalWidthCm   150 non-null    float64
     3   PetalLengthCm  150 non-null    float64
     4   PetalWidthCm   150 non-null    float64
     5   Species        150 non-null    object
    dtypes: float64(4), int64(1), object(1)
    memory usage: 7.2+ KB

In [6]:
    iris_data.describe()

Out[6]:
                   Id  SepalLengthCm  SepalWidthCm  PetalLengthCm  PetalWidthCm
    count  150.000000     150.000000    150.000000     150.000000    150.000000
    mean    75.500000       5.843333      3.054000       3.758667      1.198667
    std     43.445368       0.828066      0.433594       1.764420      0.763161
    min      1.000000       4.300000      2.000000       1.000000      0.100000
    25%     38.250000       5.100000      2.800000       1.600000      0.300000
    50%     75.500000       5.800000      3.000000       4.350000      1.300000
    75%    112.750000       6.400000      3.300000       5.100000      1.800000
    max    150.000000       7.900000      4.400000       6.900000      2.500000

In [8]:
    iris_data.isnull().sum()

Out[8]:
    Id               0
    SepalLengthCm    0
    SepalWidthCm     0
    PetalLengthCm    0
    PetalWidthCm     0
    Species          0
    dtype: int64

5. Feature Engineering

What is a feature, and why do we need to engineer it? All machine learning algorithms use some input data to create outputs. This input data comprises features, which are usually in the form of structured columns.
Algorithms require features with specific characteristics to work properly, and this is where the need for feature engineering arises. I think feature engineering efforts mainly have two goals: [list not legible in the source]

6. Data Visualization

Out[12]:
    [Figure: plot of one measurement for each Species (Iris-setosa, Iris-versicolor, Iris-virginica); the code of cells In [9] and In [12] and the y-axis label are not legible in the source]

In [15]:
    sns.stripplot(data=iris_data, x='Species', y='PetalWidthCm')

Out[15]:
    [Figure: strip plot of PetalWidthCm (y-axis, 0.0 to 2.5) against Species (x-axis: Iris-setosa, Iris-versicolor, Iris-virginica)]

7. Machine Learning Stages

Stage 1: Importing Libraries and Loading the Dataset

In [17]:
    from sklearn.model_selection import train_test_split

    # Load the dataset
    data = pd.read_csv("C:\\Users\\Onkar\\Desktop\\Iris.csv")

    # Split the dataset into features (X) and labels (y)
    X = data.drop('Species', axis=1)
    y = data['Species']

    # Split the dataset into training and testing sets
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, …

Stage 2: Data Preprocessing

In [18]:
    from sklearn.preprocessing import StandardScaler

    # Scale the features using StandardScaler
    scaler = StandardScaler()
    X_train_scaled = scaler.fit_transform(X_train)
    X_test_scaled = scaler.transform(X_test)

Stage 3: Model Training

In [19]:
    from sklearn.svm import SVC

    # Create an SVM classifier
    classifier = SVC()

    # Train the classifier on the scaled training data
    classifier.fit(X_train_scaled, y_train)

Out[19]:
    SVC()

Stage 4: Model Evaluation

In [20]:
    from sklearn.metrics import accuracy_score

    # Predict the labels for the test set
    y_pred = classifier.predict(X_test_scaled)

    # Calculate the accuracy of the classifier
    accuracy = accuracy_score(y_test, y_pred)
    print("Accuracy:", accuracy)

    Accuracy: 1.0

Stage 5: Hyperparameter Tuning

In [21]:
    from sklearn.model_selection import GridSearchCV

    # Define the hyperparameters to tune
    param_grid = {'C': [0.1, 1, 10, 100], 'gamma': [0.1, 1, 10, 100], 'kernel': …

    # Create a GridSearchCV object and fit it to the training data
    grid_search = GridSearchCV(classifier, param_grid, cv=5)
    grid_search.fit(X_train_scaled, y_train)

    # Get the best hyperparameters and retrain the classifier
    best_params = grid_search.best_params_
    classifier = SVC(**best_params)
    classifier.fit(X_train_scaled, y_train)

Out[21]:
    SVC(C=1, gamma=0.1, kernel='linear')

Stage 6: Feature Selection (Optional)

In [22]:
    from sklearn.feature_selection import SelectKBest, f_classif

    # Perform feature selection using the ANOVA F-value
    selector = SelectKBest(f_classif, k=3)
    X_train_selected = selector.fit_transform(X_train_scaled, y_train)
    X_test_selected = selector.transform(X_test_scaled)

Stage 7: Model Training with Selected Features (Optional)

In [24]:
    # Retrain the classifier on the selected features
    classifier.fit(X_train_selected, y_train)

    # Predict the labels for the test set with selected features
    y_pred_selected = classifier.predict(X_test_selected)

    # Calculate the accuracy with selected features
    accuracy_selected = accuracy_score(y_test, y_pred_selected)
    print("Accuracy with selected features:", accuracy_selected)

    Accuracy with selected features: 1.0

Stage 8: Final Predictions

In [28]:
    # Scale the entire dataset
    X_scaled = scaler.transform(X)

    # Retrain the classifier on the entire dataset
    classifier.fit(X_scaled, y)

    # Make predictions on new data
    new_data = pd.DataFrame({'SepalLengthCm': [5.2, 6.1, 4.9],
                             'SepalWidthCm': [3.1, 2.8, 3.5],
                             'PetalLengthCm': [1.7, 4.7, 1.5],
                             'PetalWidthCm': [0.5, 1.6, 0.4]})
    new_data_scaled = scaler.transform(new_data)
    predictions = classifier.predict(new_data_scaled)
    print("Predictions:", predictions)

    Predictions: ['Iris-setosa' 'Iris-versicolor' 'Iris-setosa']

Conclusion

In this project, we performed classification on the Iris flower dataset using machine learning techniques. The trained classifier achieved an accuracy of 1.0 on the test set, indicating its effectiveness in predicting the species of Iris flowers from their measurements. The project demonstrates the steps involved in solving a classification problem and provides insights into feature analysis, data preprocessing, model training, and evaluation.
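The scaling step in Stage 2 can be illustrated without any library. The sketch below is only an approximation of what StandardScaler computes for a single column: it learns a mean and a population standard deviation from the training values, then applies the same two numbers to unseen values. The helper names `fit_standardizer` and `standardize` are hypothetical, and the training values are simply the first five SepalLengthCm entries from the table in section 2, not the notebook's actual training split.

```python
from statistics import fmean, pstdev


def fit_standardizer(values):
    """Learn the mean and population standard deviation (ddof=0) from training values."""
    return fmean(values), pstdev(values)


def standardize(values, mean, std):
    """Apply z = (x - mean) / std using statistics learned elsewhere."""
    return [(v - mean) / std for v in values]


# Illustrative data only: first five SepalLengthCm values from the dataset preview.
train = [5.1, 4.9, 4.7, 4.6, 5.0]
# SepalLengthCm values of the new samples used in Stage 8.
new = [5.2, 6.1, 4.9]

# Fit on the training values only, then reuse the same statistics for new data.
mean, std = fit_standardizer(train)
train_scaled = standardize(train, mean, std)
new_scaled = standardize(new, mean, std)

print("mean:", mean, "std:", std)
print("scaled new values:", new_scaled)
```

Fitting on the training values and reusing those statistics for new values mirrors the notebook's use of `fit_transform` on the training set and `transform` on the test set.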
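One detail in the stages above deserves care: in Stage 1 only the Species column is dropped, so X appears to keep the Id column, while the new samples in Stage 8 contain just the four measurements. A scaler fitted on five columns would normally reject four-column input. The following is a minimal sketch of one way to keep the columns consistent, under several assumptions that are not in the original notebook: it uses scikit-learn's bundled copy of the Iris data (which has no Id column and uses the labels setosa, versicolor, virginica) instead of Iris.csv, it chains the scaler and the SVM with `make_pipeline`, and it picks `random_state=0` and `stratify=y` arbitrarily.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Assumption: scikit-learn's bundled Iris data stands in for Iris.csv.
# It holds only the four measurement columns, so there is no Id column to leak.
iris = load_iris()
X, y = iris.data, iris.target

# Assumption: random_state=0 and stratify=y are illustrative choices.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y
)

# The pipeline fits the scaler on the training data only and reapplies
# the same scaling automatically whenever the model predicts.
model = make_pipeline(StandardScaler(), SVC())
model.fit(X_train, y_train)
test_accuracy = model.score(X_test, y_test)

# New samples in the same column order: sepal length, sepal width,
# petal length, petal width (values taken from Stage 8).
new_data = [[5.2, 3.1, 1.7, 0.5], [6.1, 2.8, 4.7, 1.6], [4.9, 3.5, 1.5, 0.4]]
predicted_names = iris.target_names[model.predict(new_data)]

print("Test accuracy:", test_accuracy)
print("Predictions:", list(predicted_names))
```

Because the pipeline owns the scaler, the training data, the test data, and any new samples all pass through the same four-column transformation, which avoids a mismatch between the fitted feature set and the data passed in at prediction time.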