
SUBMITTED BY : Afshan Rehman

SUBMITTED TO : Sir Aoan Shah

SUBJECT : Machine learning

TOPIC : Implement supervised ML models on the same dataset for the same task

DEPARTMENT : BSCS (5th)

ROLL NO : 02

INFORMATICS GROUP OF COLLEGES


PAINSRA
MACHINE LEARNING

Machine learning is a field of study that uses data and algorithms to imitate the way humans learn, allowing machines to improve over time and become increasingly accurate at making predictions, classifying data, or uncovering data-driven insights. It rests on three basic components: a decision process, in which the algorithm uses the input data to produce a prediction or classification; an error function, which evaluates how accurate that prediction is; and an optimization process, which adjusts the model's parameters so that it fits the data better.
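As a minimal sketch of these three components, the toy example below fits a straight line with gradient descent: the model produces predictions (decision process), a mean-squared-error loss scores them (error function), and gradient updates shrink that loss (optimization). The data and learning rate are made up for illustration.

```python
import numpy as np

# Toy data: y is roughly 2*x + 1 plus noise (made-up values for illustration)
rng = np.random.default_rng(0)
x = rng.uniform(0, 10, size=50)
y = 2.0 * x + 1.0 + rng.normal(0, 1, size=50)

w, b = 0.0, 0.0   # model parameters, starting from zero
lr = 0.01         # learning rate (assumed, not tuned)

for step in range(1000):
    y_pred = w * x + b          # 1. decision process: make predictions
    error = y_pred - y
    loss = np.mean(error ** 2)  # 2. error function: mean squared error
    w -= lr * np.mean(2 * error * x)  # 3. optimization: gradient descent
    b -= lr * np.mean(2 * error)

print(f"learned w={w:.2f}, b={b:.2f}, final MSE={loss:.3f}")
```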

Apply supervised ML models on the same dataset

Applying supervised machine learning models to the same dataset for the same task involves several key steps, from data preprocessing through model selection, training, evaluation, and refinement. Here is a detailed walkthrough of each step:
1. Understand the Task and Data
The first step is to understand the task you're trying to solve. In supervised learning, the task typically involves predicting an output variable y based on input features X. There are two common types of supervised learning problems:
 Classification: Predict a categorical outcome (e.g., spam detection, disease classification).
 Regression: Predict a continuous outcome (e.g., house prices, stock prices).
For this explanation, assume you are working on a classification task, such as predicting whether an
email is spam or not based on a set of features extracted from the email (e.g., word counts, sender,
subject line).
The dataset consists of features X = {x1, x2, ..., xn} and a target variable y (the label).

2. Data Preprocessing
Before applying any machine learning model, it's essential to preprocess the data. Preprocessing steps
ensure that the data is in a format suitable for machine learning algorithms.
 Handling Missing Values: Missing data can be handled by imputing values (mean, median,
mode) or removing rows/columns with missing values.
 Feature Encoding: If the features include categorical data, encode them numerically using
methods like one-hot encoding or label encoding.
 Feature Scaling: Many machine learning algorithms perform better when features are on a
similar scale. Use techniques like Standardization (Z-score normalization) or Min-Max scaling
to scale the features.
 Feature Selection: Identify and remove irrelevant or redundant features. Techniques such as
Recursive Feature Elimination (RFE), correlation matrices, or domain knowledge can help
select important features.
 Train-Test Split: Split the dataset into a training set and a test set (commonly an 80/20 or 70/30 split) so that performance can be measured on data the model has never seen, which exposes overfitting; see the sketch after this list.
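A sketch of these preprocessing steps with scikit-learn (the file name and column names are hypothetical placeholders, not from a real dataset):

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Hypothetical spam dataset with a 'label' target column
df = pd.read_csv("emails.csv")
X = df.drop(columns=["label"])
y = df["label"]

numeric_cols = ["word_count", "link_count"]  # assumed numeric features
categorical_cols = ["sender_domain"]         # assumed categorical feature

# Impute missing values, scale numeric features, one-hot encode categoricals
preprocess = ColumnTransformer([
    ("num", Pipeline([("impute", SimpleImputer(strategy="median")),
                      ("scale", StandardScaler())]), numeric_cols),
    ("cat", Pipeline([("impute", SimpleImputer(strategy="most_frequent")),
                      ("encode", OneHotEncoder(handle_unknown="ignore",
                                               sparse_output=False))]),
     categorical_cols),
])

# 80/20 train-test split, stratified to preserve class proportions
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42)

X_train_prep = preprocess.fit_transform(X_train)  # fit on training data only
X_test_prep = preprocess.transform(X_test)        # no test-set leakage
```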

3. Choose Supervised Learning Models
Once the data is ready, you can apply different supervised learning models. Since we are performing
classification, some popular algorithms are:

A. Logistic Regression
 Use Case: Suitable for binary classification problems (e.g., spam vs. non-spam).
 How it works: Logistic regression computes the probability that a given input point belongs to
a certain class using the logistic function (sigmoid function). It is efficient and interpretable but
may not work well with non-linear relationships.

B. Decision Trees
 Use Case: Can be used for both classification and regression. Decision trees are good for
capturing non-linear relationships.
 How it works: A decision tree recursively splits the data into subsets based on feature values, aiming to create homogeneous subsets. It's easy to interpret but prone to overfitting.

C. Random Forest
 Use Case: A more powerful ensemble method that reduces the overfitting risk of decision trees.
 How it works: Random forest builds multiple decision trees (an ensemble) and averages their
results (in regression) or takes the majority vote (in classification). It’s robust and can handle
both small and large datasets well.

D. Support Vector Machines (SVM)
 Use Case: Effective for both binary and multi-class classification, especially when the data is
high-dimensional.
 How it works: SVM tries to find the hyperplane that maximizes the margin between two
classes. It can also work in non-linear spaces using kernel tricks like the radial basis function
(RBF) kernel.

E. K-Nearest Neighbors (KNN)
 Use Case: Simple and effective for small datasets, though less efficient with larger datasets.
 How it works: KNN makes predictions based on the majority class of the nearest neighbors in
the feature space. It’s easy to understand and implement but can be computationally expensive.
F. Naive Bayes
 Use Case: Particularly suited for text classification problems (e.g., spam detection).
 How it works: Based on Bayes' theorem, this classifier assumes that the features are conditionally independent given the class. It's simple and effective for certain types of problems, especially text classification.
G. Gradient Boosting Machines (GBM) and XGBoost
 Use Case: Powerful and scalable machine learning algorithms suitable for both classification
and regression tasks.
 How it works: GBM and XGBoost are ensemble techniques that build trees sequentially. Each
new tree corrects errors made by the previous ones, allowing the model to learn complex
patterns. They are highly accurate but computationally intensive.
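A sketch of how these candidates might be set up side by side in scikit-learn, keyed by name for the comparison step later (hyperparameter values are illustrative defaults, not tuned choices; XGBoost lives in the separate xgboost package):

```python
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# One instance of each candidate classifier
models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Decision Tree": DecisionTreeClassifier(max_depth=5),
    "Random Forest": RandomForestClassifier(n_estimators=200),
    "SVM (RBF kernel)": SVC(kernel="rbf", probability=True),
    "KNN": KNeighborsClassifier(n_neighbors=5),
    # GaussianNB suits scaled numeric features; MultinomialNB is the usual
    # choice for raw word counts in text classification
    "Naive Bayes": GaussianNB(),
    "Gradient Boosting": GradientBoostingClassifier(),
}
# XGBoost, if installed:
# from xgboost import XGBClassifier
# models["XGBoost"] = XGBClassifier()
```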

4. Model Training
After selecting the models, you need to train them on the training data.
 Training: Use the training dataset to fit the model, adjusting the model’s internal parameters
(like coefficients in logistic regression or tree splits in decision trees).
 Hyperparameter Tuning: Some models, such as decision trees, SVMs, or random forests, have
hyperparameters (e.g., tree depth, number of trees, learning rate). Use techniques like Grid
Search or Random Search with cross-validation to tune these hyperparameters for optimal
performance.
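For example, a cross-validated grid search over a random forest's hyperparameters might look like this (grid values are illustrative; X_train_prep and y_train carry over from the preprocessing sketch, and a binary 0/1 label is assumed for F1 scoring):

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

# Candidate hyperparameter values (illustrative, not exhaustive)
param_grid = {
    "n_estimators": [100, 300],
    "max_depth": [None, 10, 20],
    "min_samples_leaf": [1, 5],
}

# 5-fold cross-validated grid search, scored by F1
search = GridSearchCV(RandomForestClassifier(random_state=42),
                      param_grid, cv=5, scoring="f1", n_jobs=-1)
search.fit(X_train_prep, y_train)

print("Best hyperparameters:", search.best_params_)
print("Best cross-validated F1:", search.best_score_)
best_model = search.best_estimator_  # refit on the full training set
```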

5. Model Evaluation
After training, evaluate the models to assess their performance. Common metrics for classification
problems include:
 Accuracy: Proportion of correctly classified instances over the total instances. It’s simple but
not always ideal, especially with imbalanced data.
 Precision: Proportion of true positive predictions over all positive predictions made. Important
when the cost of false positives is high (e.g., email spam detection).
 Recall (Sensitivity): Proportion of true positives over all actual positives. Important when the
cost of false negatives is high (e.g., identifying cancer cases).
 F1-Score: Harmonic mean of precision and recall. Useful when the class distribution is
imbalanced.
 ROC Curve and AUC: The ROC curve plots true positive rate vs. false positive rate, and AUC
(Area Under Curve) measures how well the model distinguishes between classes.
 Confusion Matrix: A table showing the counts of true positives, false positives, true negatives,
and false negatives.
You can use cross-validation (e.g., k-fold cross-validation) to evaluate model performance more
robustly, especially if the dataset is small.
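All of these metrics are available in scikit-learn. A sketch of evaluating the tuned model on the held-out test set (variables carried over from the earlier sketches):

```python
from sklearn.metrics import (accuracy_score, classification_report,
                             confusion_matrix, roc_auc_score)

y_pred = best_model.predict(X_test_prep)
y_proba = best_model.predict_proba(X_test_prep)[:, 1]  # positive-class probability

print("Accuracy:", accuracy_score(y_test, y_pred))
print("ROC AUC: ", roc_auc_score(y_test, y_proba))
print(confusion_matrix(y_test, y_pred))       # counts of true/false positives and negatives
print(classification_report(y_test, y_pred))  # precision, recall, F1 per class
```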
6. Model Comparison
Since you're applying multiple models to the same dataset for the same task, it's important to compare
their performance. This could involve:
 Comparing metrics: Accuracy, precision, recall, F1-score, ROC AUC, etc.
 Comparing training time and prediction time: Some models are faster or more efficient than
others.
 Robustness: How well the models generalize to unseen data (test set performance).
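A compact way to run this comparison is to cross-validate each candidate in a loop, recording both score and wall-clock time (a sketch reusing the models dictionary and preprocessed training data from above):

```python
import time

from sklearn.model_selection import cross_val_score

for name, model in models.items():
    start = time.perf_counter()
    # 5-fold cross-validated F1 on the training data
    scores = cross_val_score(model, X_train_prep, y_train, cv=5, scoring="f1")
    elapsed = time.perf_counter() - start
    print(f"{name:20s} F1={scores.mean():.3f} (+/- {scores.std():.3f}) "
          f"time={elapsed:.1f}s")
```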

7. Model Refinement and Deployment
 If some models perform better than others, you may decide to select the best one.
 You can refine the model by adjusting hyperparameters, feature engineering, or using ensemble
methods to combine models (e.g., stacking, bagging).
Once you have a final model, the last step is deployment. This might involve:
 Saving the trained model (e.g., using libraries like joblib or pickle in Python).
 Integrating the model into a production system.
 Continuously monitoring the model’s performance over time to ensure it remains effective.
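For instance, saving the preprocessing pipeline and the final model together with joblib, then reloading them for prediction (new_emails is an assumed DataFrame with the same feature columns as the training data):

```python
import joblib

# Persist the preprocessing pipeline and trained model as one bundle
joblib.dump({"preprocess": preprocess, "model": best_model},
            "spam_model.joblib")

# Later, in the production system:
bundle = joblib.load("spam_model.joblib")
features = bundle["preprocess"].transform(new_emails)  # assumed new data
predictions = bundle["model"].predict(features)
```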
The process of applying supervised machine learning models to the same dataset for the same task involves carefully preprocessing the data, selecting models, tuning hyperparameters, evaluating performance, and refining the models. The choice of model depends on factors like dataset size, complexity, and the trade-off between bias and variance. By evaluating multiple models, you can select the one that best balances performance and computational efficiency for your specific task.