Lesson 2 - Introduction to ML

Introduction to machine learning

Uploaded by

star
Copyright: © All Rights Reserved

Intro to ML

Beginner Level
What is ML?
Machine Learning Types
Machine Learning Key Info #1

Supervised Learning vs Unsupervised Learning vs Reinforcement Learning
Supervised vs Unsupervised

Supervised Learning vs Unsupervised Learning
MAIN DIFFERENCE: Presence of Labels
What is a label?

A label is like the ‘right answer’ provided to the model so that it knows whether it made the correct prediction.

Example labels: Cat, Dog
Types of ML Problems

Classification (class-based prediction),
Regression (continuous-value prediction),
Clustering (grouping similar data points, without labels),
Etc…

WE FOCUS ON CLASSIFICATION AND REGRESSION TODAY!
Machine Learning Pipeline

01 Data Preparation & Exploratory Data Analysis
02 Feature Engineering & Feature Selection
03 Model Selection & Testing
04 Hyper-parameter Tuning & Overall Evaluation
05 Deployment
Machine Learning Pipeline: What we are doing this week…

01 Data Preparation & Exploratory Data Analysis
02 Feature Engineering & Feature Selection
03 Model Selection & Testing
05 Model Evaluation


Step by Step Process (Oversimplified…)

Preprocessing -> Model Training -> Evaluation
Data Preparation and EDA

IMPORTANT to ensure we understand the data, e.g. how many columns there are, whether there are any NA (missing) values, what each column's distribution looks like, etc…
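
The checks above can be sketched with pandas. This is a minimal example, not a full EDA: the DataFrame and its column names (age, city, income) are made up for illustration and are not from the lesson's dataset.

```python
import numpy as np
import pandas as pd

# Hypothetical toy dataset, invented for illustration only.
df = pd.DataFrame({
    "age": [22, 35, np.nan, 41, 29],
    "city": ["A", "B", "A", None, "C"],
    "income": [3000.0, 5200.0, 4100.0, np.nan, 3900.0],
})

# How many rows and columns are there?
n_rows, n_cols = df.shape

# Are there NA (missing) values, and in which columns?
na_counts = df.isna().sum()

# What do the column distributions look like?
summary = df.describe()                   # numeric columns: mean, std, quartiles
city_counts = df["city"].value_counts()   # categorical column: frequency per value

print(f"{n_rows} rows, {n_cols} columns")
print(na_counts)
print(summary)
print(city_counts)
```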
Preprocessing Principle

Create/Make data understandable to the Model
Preprocessing Key Takeaway #1: Feature Engineering

Data Transformation into machine-readable data (NUMBERS)

One Hot Encoding - Transform a categorical column into one column per distinct value, with 1/True where that value is present and 0/False otherwise
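
One way to try one hot encoding is pandas' get_dummies. The sketch below uses a made-up "pet" column, chosen only for illustration.

```python
import pandas as pd

# Hypothetical categorical column, invented for illustration only.
df = pd.DataFrame({"pet": ["cat", "dog", "cat", "bird"]})

# One new 0/1 column is created per distinct value of "pet".
encoded = pd.get_dummies(df, columns=["pet"], dtype=int)

print(encoded)
```

Each row has exactly one 1 across the new columns, marking which value was present in the original column.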

!BEWARE of the Curse of Dimensionality!
One hot encoding creates one new column per distinct value, so we don’t want to use it on columns with too many distinct values

Scaling - Converting numerical features to a similar magnitude
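
A minimal sketch of scaling with scikit-learn, assuming two made-up features on very different scales. StandardScaler and MinMaxScaler are two common choices; the slides do not say which one the lesson uses.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler, StandardScaler

# Hypothetical features on very different scales (illustration only).
X = np.array([
    [1.0, 1000.0],
    [2.0, 2000.0],
    [3.0, 3000.0],
])

# Standardization: each column ends up with mean 0 and standard deviation 1.
standardized = StandardScaler().fit_transform(X)

# Min-max scaling: each column is rescaled to the range [0, 1].
min_maxed = MinMaxScaler().fit_transform(X)

print(standardized)
print(min_maxed)
```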
Preprocessing Key Takeaway #2: Feature Selection

Intuition/Guess based on Domain Knowledge!

Correlation Matrix
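
A correlation matrix can be computed with pandas' DataFrame.corr. The sketch below uses synthetic data (the names x1, x2 and target are assumptions for illustration) where only x1 is related to the target, then ranks features by absolute correlation with the target.

```python
import numpy as np
import pandas as pd

# Synthetic data for illustration: x1 drives the target, x2 is unrelated noise.
rng = np.random.default_rng(0)
x1 = rng.normal(size=200)
df = pd.DataFrame({
    "x1": x1,
    "x2": rng.normal(size=200),
    "target": 3 * x1 + rng.normal(scale=0.5, size=200),
})

# Pairwise Pearson correlation between all numeric columns.
corr = df.corr()

# Rank features by the strength of their correlation with the target.
target_corr = corr["target"].drop("target").abs().sort_values(ascending=False)

print(corr)
print(target_corr)
```

Note that correlation only captures linear relationships, so a low value does not prove a feature is useless.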

Chi-Square Test (For Categorical Variables) - Test for independence of each feature against the target (what we want to predict)
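
A minimal sketch of a chi-square test of independence, using scipy's chi2_contingency on a contingency table built with pandas. The "colour" feature, the "target" values and the counts are made up for illustration.

```python
import pandas as pd
from scipy.stats import chi2_contingency

# Hypothetical categorical feature and target, invented for illustration.
df = pd.DataFrame({
    "colour": ["red"] * 30 + ["blue"] * 30,
    "target": ["yes"] * 25 + ["no"] * 5 + ["yes"] * 5 + ["no"] * 25,
})

# Contingency table: counts of each (feature value, target value) pair.
table = pd.crosstab(df["colour"], df["target"])

# Null hypothesis: the feature and the target are independent.
chi2_stat, p_value, dof, expected = chi2_contingency(table)

print(table)
print(f"chi2={chi2_stat:.2f}, p={p_value:.4g}, dof={dof}")
```

A small p-value (0.05 is a common threshold) suggests the feature and the target are not independent, so the feature may be worth keeping.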

AND many more…, including information gain, Fisher’s score, variance threshold, etc…

These are Filter Based Feature Selection methods (there are other types of feature selection algorithms also, e.g. Wrapper Based and Embedded)
Preprocessing Key Takeaway #3: Train Test Split
Model Training Principle

Plug the Xs (features) and Ys (labels) into the model and let it run (Use packages! ~ Sklearn, Tensorflow, Pytorch etc…)

Key Note - Always do a train test split to prevent data leakage! Train models on the training set only, and use the test set for evaluation (identifying how accurate and consistent a model is).
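
A train test split can be sketched with scikit-learn's train_test_split. The data, the 80/20 split ratio and the random_state below are assumptions for illustration. The scaler is fitted on the training set only, which is one common way to avoid leaking test information into preprocessing.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Hypothetical features (X) and labels (y), invented for illustration.
X = np.arange(20, dtype=float).reshape(10, 2)
y = np.arange(10)

# Hold out 20% of the rows as a test set (the ratio is an assumption).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Fit preprocessing on the training set only, then apply it to both sets.
scaler = StandardScaler().fit(X_train)
X_train_scaled = scaler.transform(X_train)
X_test_scaled = scaler.transform(X_test)

print(X_train.shape, X_test.shape)
```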
Model Training Key Takeaway #1: Packages

Scikit-Learn https://scikit-learn.org/stable/
Model Training Key Takeaway #2: Linear Regression Model

A simple model that learns a best-fit line from training data. This line is used to predict target values for unseen data.
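
A minimal sketch of fitting a best-fit line with scikit-learn's LinearRegression. The data is synthetic: it is generated from y = 2x + 1 plus noise, an assumption made only so the learned line can be compared with a known one.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Synthetic data for illustration: y = 2x + 1 plus some random noise.
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(100, 1))
y = 2.0 * X[:, 0] + 1.0 + rng.normal(scale=0.5, size=100)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Learn the best-fit line from the training data only.
model = LinearRegression().fit(X_train, y_train)
slope = model.coef_[0]
intercept = model.intercept_

# Use the learned line to predict targets for unseen data.
y_pred = model.predict(X_test)

print(f"slope={slope:.2f}, intercept={intercept:.2f}")
```

The learned slope and intercept should land close to the 2 and 1 used to generate the data.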
Model Training Key Takeaway #3: Many more models!

Naive Bayes, K-Nearest Neighbours, Support Vector Machines, K-Means Clustering, Decision Trees, Ensemble Methods, Neural Networks etc…
Evaluation Principle

Checking how well the model is performing ~ Different problem types usually require different evaluation metrics
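Evaluation Key Takeaway #1: Overfitting and Underfitting (Bias vs Variance Tradeoff)

Bias - How accurate is the model? High bias means the model is too simple and misses the pattern (underfitting).

Variance - How consistent is the model? High variance means the model changes a lot with the training data and fits its noise (overfitting).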
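
One way to see underfitting and overfitting is to compare training and test scores as a model gets more flexible. The sketch below uses decision trees of different depths on synthetic noisy data. The data, the depths and the choice of model are all assumptions for illustration.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

# Synthetic data for illustration: a sine curve plus random noise.
rng = np.random.default_rng(0)
X = rng.uniform(0, 6, size=(200, 1))
y = np.sin(X[:, 0]) + rng.normal(scale=0.3, size=200)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

# Depth 1 is very simple, depth None (unlimited) is very flexible.
scores = {}
for depth in (1, 3, None):
    tree = DecisionTreeRegressor(max_depth=depth, random_state=0)
    tree.fit(X_train, y_train)
    # Store (train R^2, test R^2) for each depth.
    scores[depth] = (tree.score(X_train, y_train), tree.score(X_test, y_test))

for depth, (train_r2, test_r2) in scores.items():
    print(f"max_depth={depth}: train R^2={train_r2:.2f}, test R^2={test_r2:.2f}")
```

A low score on both sets points to underfitting. A near-perfect training score with a clearly lower test score points to overfitting.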
Evaluation Key Takeaway #2: Classification Evaluation - Confusion Matrix
Evaluation Key Takeaway #2: Classification Evaluation - Accuracy

Accuracy = (TP + TN) / (TP + TN + FP + FN)

where TP/TN are true positives/negatives and FP/FN are false positives/negatives.

Essentially, the fraction of all predictions that the model got right!
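
A minimal sketch of the confusion matrix and the accuracy formula for a binary problem, using scikit-learn. The true and predicted labels are made up for illustration.

```python
from sklearn.metrics import accuracy_score, confusion_matrix

# Hypothetical true labels and model predictions (1 = positive, 0 = negative).
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

# For binary labels, ravel() yields the counts in the order TN, FP, FN, TP.
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()

# Accuracy computed directly from the formula.
accuracy = (tp + tn) / (tp + tn + fp + fn)

# The same value from scikit-learn's helper.
accuracy_sklearn = accuracy_score(y_true, y_pred)

print(f"TP={tp}, TN={tn}, FP={fp}, FN={fn}, accuracy={accuracy}")
```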
Evaluation Key Takeaway #3: Regression Evaluation - Mean Square Error
Evaluation Key Takeaway #3: Regression Evaluation - Mean Absolute Error
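
The slide formulas did not survive extraction, so the sketch below uses the standard definitions: Mean Square Error is the average of the squared differences between true and predicted values, and Mean Absolute Error is the average of the absolute differences. The numbers are made up for illustration.

```python
import numpy as np
from sklearn.metrics import mean_absolute_error, mean_squared_error

# Hypothetical true values and predictions, invented for illustration.
y_true = np.array([3.0, 5.0, 2.0, 7.0])
y_pred = np.array([2.0, 5.0, 4.0, 8.0])

errors = y_true - y_pred  # [1, 0, -2, -1]

# Mean Square Error: average of squared errors, (1 + 0 + 4 + 1) / 4.
mse = np.mean(errors ** 2)

# Mean Absolute Error: average of absolute errors, (1 + 0 + 2 + 1) / 4.
mae = np.mean(np.abs(errors))

# The same metrics from scikit-learn.
mse_sklearn = mean_squared_error(y_true, y_pred)
mae_sklearn = mean_absolute_error(y_true, y_pred)

print(f"MSE={mse}, MAE={mae}")
```

Because the errors are squared, MSE penalizes large errors more heavily than MAE does.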
Exercise
