Datamining Lecture6
LECTURE 6
Linear Regression
Logistic Regression
Neural Networks
A Little Bit of Deep Learning
CSC177
Dr. Victor Chen
Regression Problem
• The problem of predicting continuous values is called a regression problem.
• General approach: find a continuous function that models the observed data points.
Linear Regression with one input:
y = α + β·x
Linear Regression with k inputs:
y = α + β1·x1 + β2·x2 + … + βk·xk
β (slope) = Δy/Δx
α (y-intercept)
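As a minimal sketch of fitting the one-input model y = α + β·x, ordinary least squares with NumPy (the data values below are made up for illustration):

```python
import numpy as np

# Hypothetical data generated exactly from y = 2 + 3x, so the fit is exact
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = 2.0 + 3.0 * x

# Stack a column of ones so the first fitted coefficient is the intercept alpha
X = np.column_stack([np.ones_like(x), x])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
alpha, beta = coef
print(alpha, beta)  # alpha ≈ 2.0 (y-intercept), beta ≈ 3.0 (slope Δy/Δx)
```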
Sum of Squares of Error (SSE) is the sum of all the squared errors: the squared differences between each observed value of the dependent variable and its predicted value.
Sum of Squares of Regression (SSR) is the sum of squared differences between each predicted value and the data mean.
R-Squared score
• The R-Squared score can be used as a single summary number to measure the quality of a linear regression model.
• The value of R² can range between 0 and 1.
• The higher the R², the more accurate the regression model is.
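One common way to compute R² is 1 − SSE/SST, where SST is the total sum of squares around the data mean. A small sketch with made-up predictions:

```python
import numpy as np

y_true = np.array([3.0, 5.0, 7.0, 9.0])
y_pred = np.array([2.8, 5.1, 7.2, 8.9])  # hypothetical model predictions

sse = np.sum((y_true - y_pred) ** 2)          # Sum of Squares of Error
sst = np.sum((y_true - y_true.mean()) ** 2)   # total squared deviation from the data mean
r2 = 1.0 - sse / sst                          # R-squared: 1 means a perfect fit
print(round(r2, 4))  # 0.995
```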
Nonlinear Regression
Nonlinear functions can also be fit as
regressions
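For example, a polynomial can be fit by least squares; this is still linear regression in the coefficients, applied to the features [1, x, x²]. A sketch with noise-free synthetic data:

```python
import numpy as np

# Quadratic data: y = 1 + 2x + 0.5x^2 (synthetic, noise-free for clarity)
x = np.array([-2.0, -1.0, 0.0, 1.0, 2.0, 3.0])
y = 1.0 + 2.0 * x + 0.5 * x ** 2

# polyfit does a least-squares polynomial fit; degree 2 recovers the curve.
coeffs = np.polyfit(x, y, deg=2)  # highest degree first: [0.5, 2.0, 1.0]
print(np.round(coeffs, 6))
```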
• Haiquan Chen, Wei-Shinn Ku, Haixun Wang, Liang Tang, Min-Te Sun: Scaling Up Markov Logic Probabilistic Inference for Social Graphs. IEEE Trans. Knowl. Data Eng. 29(2): 433-445 (2017)
Experimental validation
Now Logistic Regression…
Linear Regression Doesn’t Work
• A linear function/regression is not a good fit for classification.
• It may produce probabilities beyond [0, 1].
P(C|x) = 1 / (1 + e^(−(α + β⋅x)))
Q: What is the logistic regression model for more than one dimension?
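For more than one dimension, β⋅x becomes the dot product β1·x1 + … + βk·xk. A minimal sketch of the model (the parameter values below are made up):

```python
import numpy as np

def logistic(x, alpha, beta):
    """P(C|x) = 1 / (1 + exp(-(alpha + beta . x))).
    In more than one dimension, beta . x is the dot product
    beta1*x1 + ... + betak*xk."""
    return 1.0 / (1.0 + np.exp(-(alpha + np.dot(beta, x))))

# One-dimensional case: at alpha + beta*x = 0 the probability is exactly 0.5
print(logistic(np.array([0.0]), alpha=0.0, beta=np.array([1.0])))  # 0.5

# Two-dimensional case: beta has one weight per feature (hypothetical values)
p = logistic(np.array([1.0, 2.0]), alpha=0.5, beta=np.array([0.3, -0.2]))
print(0.0 <= p <= 1.0)  # the output always stays within [0, 1]
```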
Logistic Regression
• For a 2-class problem, the probability threshold is set to 0.5:
• If the predicted probability >= 0.5, predict “y = 1”;
• If the predicted probability < 0.5, predict “y = 0”.
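The 0.5 decision rule can be sketched as:

```python
def predict_class(p, threshold=0.5):
    # 2-class decision rule: predict y = 1 when the predicted
    # probability reaches the threshold, otherwise predict y = 0.
    return 1 if p >= threshold else 0

print(predict_class(0.73))  # 1
print(predict_class(0.31))  # 0
```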
How β affects the model
Compare Two Models In One Dimension
Coefficients
𝛽1 = −1.9
𝛽2 = −0.4
𝛼 = 13.04
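Using the coefficients above, the two one-dimensional models can be compared directly; the more negative slope (β1 = −1.9) produces a sharper transition from P ≈ 1 to P ≈ 0 than the flatter one (β2 = −0.4). A sketch (the evaluation point x = 9 is chosen only for illustration):

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

alpha = 13.04
beta1, beta2 = -1.9, -0.4   # slopes of the two models above

x = 9.0
p1 = sigmoid(alpha + beta1 * x)   # steeper model: already near 0 here
p2 = sigmoid(alpha + beta2 * x)   # flatter model: still near 1 here
print(round(p1, 3), round(p2, 3))
```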
Estimating the coefficients
• Use the gradient descent algorithm to find near-optimal coefficients for linear/logistic regression.
Gradient descent for two parameters
Gradient descent implementation
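A minimal sketch of gradient descent for the two parameters (α, β) of y = α + β·x, minimizing SSE; the data, learning rate, and iteration count below are illustrative choices:

```python
import numpy as np

# Synthetic data from y = 1 + 2x, so the true parameters are known
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = 1.0 + 2.0 * x

alpha, beta = 0.0, 0.0   # start from zero
lr = 0.02                # fixed learning rate
for _ in range(5000):
    err = (alpha + beta * x) - y
    # Partial derivatives of SSE (up to a constant factor) w.r.t. alpha and beta
    grad_alpha = err.sum()
    grad_beta = (err * x).sum()
    alpha -= lr * grad_alpha
    beta -= lr * grad_beta

print(round(alpha, 3), round(beta, 3))  # converges close to 1.0 and 2.0
```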
Sklearn implementation
• http://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LinearRegression.html
• http://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html
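A short usage sketch of both sklearn estimators on tiny made-up datasets (assumes scikit-learn is installed):

```python
import numpy as np
from sklearn.linear_model import LinearRegression, LogisticRegression

# Linear regression: data generated exactly from y = 1 + 2x
X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = np.array([1.0, 3.0, 5.0, 7.0])
lin = LinearRegression().fit(X, y)
print(lin.intercept_, lin.coef_)    # ≈ 1.0 and [2.0]

# Logistic regression: a well-separated 2-class toy problem
Xc = np.array([[0.0], [1.0], [4.0], [5.0]])
yc = np.array([0, 0, 1, 1])
log = LogisticRegression().fit(Xc, yc)
print(log.predict([[0.5], [4.5]]))  # [0 1]
print(log.predict_proba([[4.5]]))   # class-membership probabilities
```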
Logistic/Linear Regression Advantages
• Linear regression produces a continuous output.
• Logistic regression produces class membership with a predicted probability.
• The coefficients can be used to understand feature importance.
• Both work well on relatively large datasets.
Neural networks
• Logistic regression can be considered the simplest form of a neural network, which is a collection of perceptrons.
• A perceptron can be seen as an analogy to a biological neuron.
Perceptron (Neuron)
2-class classification with one neuron
2-class classification with two neurons,
one for each class
Putting multiple neurons in parallel we can predict multiple classes
sigmoid vs softmax
• The sigmoid function is used for two-class logistic regression.
• The softmax function is used for multiclass logistic regression.
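The two are closely related: for two classes, softmax over the scores [z, 0] equals sigmoid(z). A sketch verifying this (the score z = 1.7 is an arbitrary example value):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def softmax(z):
    e = np.exp(z - np.max(z))   # subtract the max for numerical stability
    return e / e.sum()

# Two-class case: softmax([z, 0])[0] = e^z / (e^z + 1) = sigmoid(z)
z = 1.7
print(round(sigmoid(z), 6), round(softmax(np.array([z, 0.0]))[0], 6))
```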
Softmax implementation
What is softmax([1, 2, 3])?
• y1 = e¹ / (e¹ + e² + e³) ≈ 0.09
• y2 = e² / (e¹ + e² + e³) ≈ 0.24
• y3 = e³ / (e¹ + e² + e³) ≈ 0.67
Output: [0.09, 0.24, 0.67]
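The worked example above can be reproduced with a small softmax implementation:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - np.max(z))  # shift by the max for numerical stability
    return e / e.sum()

probs = softmax(np.array([1.0, 2.0, 3.0]))
print(np.round(probs, 2))  # [0.09 0.24 0.67]
print(probs.sum())         # the outputs sum to 1, so they form a distribution
```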