03 Logistic Regression

Objectives

 Define Logistic Regression
 Differentiate Logistic Regression from Linear Regression
 Differentiate the different types of Logistic Regression
 Differentiate between MLE and OLS
 Build a Logistic Regression model
 Perform prediction using the model
 Evaluate the performance of the model
Introduction

 Logistic regression is a statistical method for predicting binary classes.
 The outcome or target variable is dichotomous in nature.
 It predicts the probability of occurrence of a binary event using a logistic (sigmoid) function.
 For example, it can be used to classify whether a tumor is benign or malignant, whether an email is spam or not, or whether a student will pass or fail a course.
The Sigmoid Function

 $p = \frac{1}{1 + e^{-y}}$
 $y = w_0 + w_1 X_1 + w_2 X_2 + \cdots + w_n X_n$
 $p = \frac{1}{1 + e^{-(w_0 + w_1 X_1 + w_2 X_2 + \cdots + w_n X_n)}}$
The Sigmoid Function
 The sigmoid function, also called the logistic function, gives an 'S'-shaped curve that can take any real-valued number and map it to a value between 0 and 1.
 As the input goes to positive infinity, the predicted probability approaches 1; as the input goes to negative infinity, the predicted probability approaches 0.
 If the output of the sigmoid function is more than 0.5, we can classify the outcome as 1 or YES; if it is less than 0.5, we can classify it as 0 or NO.
 For example, if the output is 0.75, we can say in terms of probability: there is a 75 percent chance that the patient will suffer from cancer.
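
As a quick illustration of the formulas above, here is a minimal NumPy sketch (not part of the original slides); the weights and feature values are made-up numbers chosen only to show the mapping and the 0.5 threshold.

import numpy as np

def sigmoid(y):
    # Map any real-valued input to a probability between 0 and 1
    return 1.0 / (1.0 + np.exp(-y))

# Hypothetical weights w0..w2 and one observation; the leading 1.0 in X pairs with the intercept w0
w = np.array([0.5, 1.2, -0.7])
X = np.array([1.0, 2.0, 3.0])

y = np.dot(w, X)                      # y = w0 + w1*X1 + w2*X2
p = sigmoid(y)                        # predicted probability of the positive class
print(p, "-> class", int(p > 0.5))    # threshold at 0.5, as described above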
Properties

 The dependent variable in logistic regression follows a Bernoulli distribution.
 Estimation is done through maximum likelihood.
 There is no R-squared; model fitness is instead assessed through measures such as concordance and the KS statistic.
Logistic Regression VS Linear Regression

 Linear regression gives you a continuous output, while logistic regression gives a probabilistic output bounded between 0 and 1.
 Linear regression is estimated using Ordinary Least Squares (OLS), while logistic regression is estimated using the Maximum Likelihood Estimation (MLE) approach.
MLE VS OLS

 Maximum Likelihood Estimation (MLE) is a "likelihood" maximization method, while Ordinary Least Squares (OLS) is a distance-minimizing approximation method.
 Maximizing the likelihood function determines the parameters that are most likely to have produced the observed data.
 From a statistical point of view, MLE treats quantities such as the mean and variance as the parameters of a given model and chooses the specific values under which the observed data are most probable; for a normal distribution, the fitted mean and variance can then be used for prediction.
MLE VS OLS

 Ordinary Least Squares estimates are computed by fitting a regression line to the given data points such that the sum of squared deviations (the least squares error) is minimized.
 Both can be used to estimate the parameters of a linear regression model.
 MLE assumes a joint probability mass (or density) function, while OLS does not require any stochastic assumptions for minimizing distance.
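
To make the contrast concrete, below is a small illustrative sketch (an addition of this write-up, not from the original slides) on synthetic data: OLS finds coefficients by minimizing squared distance in closed form, while MLE for logistic regression finds coefficients by climbing the Bernoulli log-likelihood.

import numpy as np

rng = np.random.default_rng(0)
X = np.column_stack([np.ones(200), rng.normal(size=200)])   # intercept column + one feature
true_w = np.array([-0.5, 2.0])                              # made-up "true" coefficients
y = (rng.random(200) < 1 / (1 + np.exp(-X @ true_w))).astype(float)

# OLS: closed-form, distance-minimizing fit (what linear regression uses)
w_ols, *_ = np.linalg.lstsq(X, y, rcond=None)

# MLE: gradient ascent on the Bernoulli log-likelihood (what logistic regression uses)
w_mle = np.zeros(2)
for _ in range(5000):
    p = 1 / (1 + np.exp(-X @ w_mle))
    w_mle += 0.1 * X.T @ (y - p) / len(y)   # gradient of the average log-likelihood

print("OLS coefficients:", w_ols)   # fits y as if it were a continuous outcome
print("MLE coefficients:", w_mle)   # fits the probability that y = 1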
Types of Logistic Regression

 Binary Logistic Regression: the target variable has only two possible outcomes, such as Spam or Not Spam, Cancer or No Cancer.
 Multinomial Logistic Regression: the target variable has three or more nominal categories, such as predicting the type of wine.
 Ordinal Logistic Regression: the target variable has three or more ordinal categories, such as a restaurant or product rating from 1 to 5.
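
As an illustration (assuming scikit-learn is available; this sketch is not from the original slides), the same LogisticRegression estimator handles both the binary and the multinomial case using scikit-learn's bundled datasets, while ordinal logistic regression usually needs a separate package such as statsmodels' OrderedModel or mord.

from sklearn.datasets import load_breast_cancer, load_wine
from sklearn.linear_model import LogisticRegression

# Binary: two possible outcomes (malignant vs benign tumors)
Xb, yb = load_breast_cancer(return_X_y=True)
binary_model = LogisticRegression(max_iter=5000).fit(Xb, yb)

# Multinomial: three nominal wine classes
Xm, ym = load_wine(return_X_y=True)
multinomial_model = LogisticRegression(max_iter=5000).fit(Xm, ym)

print(binary_model.classes_)        # two classes
print(multinomial_model.classes_)   # three classes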
Advantages

 does not require high computational power
 easy to implement
 easily interpretable
 widely used by data analysts and scientists
 does not require feature scaling
Disadvantages

 not able to handle a large number of categorical features/variables
 vulnerable to over-fitting
 cannot solve non-linear problems on its own, which is why non-linear features must first be transformed
 will not perform well with independent variables that are not correlated with the target variable or that are very similar (highly correlated) to each other
Demo: Build a Logistic Regression Model
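
The live demo itself is not reproduced in this text. As a stand-in, here is a minimal scikit-learn sketch of the kind of model the demo builds, using the bundled breast-cancer dataset (an assumed choice) for the tumor example from the introduction.

from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression

# Binary target: malignant vs benign tumors
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=42)

# Build (fit) the logistic regression model on the training split
model = LogisticRegression(max_iter=5000)
model.fit(X_train, y_train)

# Perform prediction using the model
y_pred = model.predict(X_test)                 # hard 0/1 class labels
y_proba = model.predict_proba(X_test)[:, 1]    # probability of the positive class
print(y_pred[:5], y_proba[:5].round(3))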
Evaluation Metrics for Logistic Regression

 Accuracy Score
 Confusion Matrix
 Precision
 Recall
 F1 Score
 Receiver Operating Characteristic (ROC) Curve
Accuracy Score

 It is the total number of correct predictions divided by the total number of predictions.
 Not suitable for imbalanced datasets.
Confusion Matrix

 A confusion matrix is a table that is used to evaluate the performance of a classification model.
 It also lets you visualize the performance of an algorithm.
 The fundamental idea is that the numbers of correct and incorrect predictions are summed up class-wise.
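
For instance, with a hand-made set of labels (illustrative values, assuming scikit-learn), the class-wise counts can be tabulated directly:

from sklearn.metrics import confusion_matrix

y_true = [1, 0, 1, 1, 0, 1, 0, 0]   # actual classes (made-up)
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]   # predicted classes (made-up)

# scikit-learn lays the table out as:
# [[TN, FP],
#  [FN, TP]]
print(confusion_matrix(y_true, y_pred))   # [[3 1]
                                          #  [1 3]]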
Precision

 Precision is the measure of how many of the observations the model predicted as positive were actually positive, out of all its positive predictions (correct and incorrect).
 Precision is about being precise, i.e., how exact the model's positive predictions are.
 In other words, when the model predicts the positive class, how often is it correct?
 $Precision = \frac{TP}{TP + FP}$
Recall

 Recall is the measure of how many of the actual positive observations the model correctly identified, out of the total number of actual positives.
 $Recall = \frac{TP}{TP + FN}$
F1 Score

 If we focus on only one of these two scores, we may end up neglecting the other.
 To combat this, we can use the F1 Score, which strikes a balance between the Precision and Recall scores.
 $F_1\,Score = 2 \times \frac{Precision \times Recall}{Precision + Recall}$
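
For example, if a model's precision is 0.75 and its recall is 0.60 (made-up values), then $F_1 = 2 \times \frac{0.75 \times 0.60}{0.75 + 0.60} = \frac{0.90}{1.35} \approx 0.67$: the F1 Score sits between the two but is pulled toward the lower one.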
ROC Curve

 The Receiver Operating Characteristic (ROC) curve is a plot of the true positive rate against the false positive rate.
 It shows the tradeoff between sensitivity and specificity.
Demo: Evaluate a LR Model
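
The evaluation demo is likewise not reproduced here; a compact sketch of the metrics listed above, repeating the assumed breast-cancer setup from the earlier build sketch, might look like this:

from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (accuracy_score, confusion_matrix, precision_score,
                             recall_score, f1_score, roc_auc_score, roc_curve)

# Same assumed setup as the build sketch
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=42)
model = LogisticRegression(max_iter=5000).fit(X_train, y_train)

y_pred = model.predict(X_test)                 # hard class predictions for the count-based metrics
y_proba = model.predict_proba(X_test)[:, 1]    # probabilities for the ROC curve

print("Accuracy :", accuracy_score(y_test, y_pred))
print("Confusion matrix:\n", confusion_matrix(y_test, y_pred))
print("Precision:", precision_score(y_test, y_pred))
print("Recall   :", recall_score(y_test, y_pred))
print("F1 score :", f1_score(y_test, y_pred))
print("ROC AUC  :", roc_auc_score(y_test, y_proba))

# Points of the ROC curve: false positive rate vs true positive rate at each threshold
fpr, tpr, thresholds = roc_curve(y_test, y_proba)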