Linear Regression Python Programming
Informática
Petia Georgieva
(petia@ua.pt)
LINEAR REGRESSION - Outline
1. Univariate linear regression
- Cost (loss) function - Mean Squared Error (MSE)
- Cost function convergence
- Gradient descent algorithm
2. Multivariate linear regression
3. Polynomial regression
4. Overfitting and regularized linear regression (Ridge, Lasso)
Supervised Learning -
CLASSIFICATION vs REGRESSION
Classification - the label is a discrete (integer) value
(e.g. 0, 1 for binary classification)
• Weather forecast (rain / no rain)
Regression - the label is a continuous value
• House price prediction
Supervised Learning –
univariate regression
Problem: Learning to predict the housing price (output, predicted variable) as a
function of the living area (input, feature, predictor)
Mean Square Error (MSE)
The cost (loss) function measures the mean squared error of the model predictions over the m training examples:

J(\theta) = \frac{1}{2m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2

Goal => \min_\theta J(\theta)
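A minimal NumPy sketch of this cost computation (illustrative names: X is the design matrix with a leading x0 = 1 column, y the target vector, theta the parameter vector):

import numpy as np

def compute_cost(X, y, theta):
    """MSE cost: J(theta) = 1/(2m) * sum((h_theta(x) - y)^2)."""
    m = len(y)                    # number of training examples
    errors = X @ theta - y       # prediction errors h_theta(x) - y
    return (errors @ errors) / (2 * m)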
Linear Regression – iterative gradient
descent algorithm (summary)
Initialize the model parameters (e.g. θ = 0)
Repeat until J converges {

\theta_j := \theta_j - \alpha \frac{\partial J(\theta)}{\partial \theta_j} = \theta_j - \alpha \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) x_j^{(i)} \quad \text{(simultaneously for all } j\text{)}

}
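A compact sketch of the full loop under the same illustrative names (the alpha and num_iters defaults are hypothetical):

def gradient_descent(X, y, theta, alpha=0.01, num_iters=1500):
    """Batch gradient descent for linear regression."""
    m = len(y)
    for _ in range(num_iters):
        gradient = X.T @ (X @ theta - y) / m   # dJ/dtheta for all j at once
        theta = theta - alpha * gradient       # simultaneous parameter update
    return theta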
Batch / mini-batch / stochastic
gradient descent for parameter update
Batch gradient descent uses all m training examples for every parameter update; stochastic gradient descent uses a single example per update; mini-batch gradient descent uses a small subset (e.g. 32 examples), as sketched below.
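A hedged sketch of one training epoch covering all three schemes (batch_size is an illustrative parameter: batch_size = m gives batch GD, batch_size = 1 gives SGD):

import numpy as np

def gd_epoch(X, y, theta, alpha, batch_size):
    """One epoch of mini-batch gradient descent."""
    m = len(y)
    idx = np.random.permutation(m)             # shuffle the examples each epoch
    for start in range(0, m, batch_size):
        batch = idx[start:start + batch_size]
        Xb, yb = X[batch], y[batch]
        grad = Xb.T @ (Xb @ theta - yb) / len(batch)
        theta = theta - alpha * grad
    return theta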
Lin Reg Cost function – local minimum
Quiz: Suppose θ1 is at a local optimum of J, as shown in the figure. What will one step of gradient descent do?
1) Leave θ1 unchanged
2) Change θ1 in a random direction
3) Decrease θ1
4) Move θ1 in the direction of the global minimum of J
(Answer: 1 - at a local optimum the gradient ∂J/∂θ1 is zero, so the update θ1 := θ1 − α · 0 leaves θ1 unchanged.)
Cost function convergence
changing the learning rate (α), 100 vs. 400 iterations
[Figure: two panels of Cost J (scale ×10^7) vs. number of iterations, for α = 0.01, 0.03, 0.1, 0.3; the first panel spans 100 iterations, the second 400.]
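Curves like these can be reproduced by recording J at every iteration for each learning rate; a self-contained sketch on synthetic data (all names and values illustrative):

import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
X = np.c_[np.ones(50), rng.uniform(0, 2, 50)]       # bias column + one feature
y = 1.0 + 3.0 * X[:, 1] + rng.normal(0, 0.1, 50)    # synthetic targets

def cost_history(alpha, num_iters=100):
    """Run batch gradient descent, recording J(theta) at every iteration."""
    theta = np.zeros(X.shape[1])
    m = len(y)
    history = []
    for _ in range(num_iters):
        errors = X @ theta - y
        theta = theta - alpha * X.T @ errors / m
        history.append((errors @ errors) / (2 * m))
    return history

for alpha in (0.01, 0.03, 0.1, 0.3):
    plt.plot(cost_history(alpha), label=f"alpha = {alpha}")
plt.xlabel("Number of iterations")
plt.ylabel("Cost J")
plt.legend()
plt.show()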
LinReg Cost function convergence -
learning rate variation (α)
[Figure: Cost J (scale ×10^8) vs. number of iterations (0-20), for α = 0.01, 0.03, 0.1, 1.4; with α = 1.4 the cost grows instead of decreasing, i.e. too large a learning rate makes gradient descent diverge.]
Univariate Regression
Given the house area, what is the most likely house price?
If the univariate linear regression model is not sufficiently good, add more
features (e.g. number of bedrooms).
Multivariate Regression
Problem: Learning to predict the housing price as a function of living area and
number of bedrooms.
With the convention x_0 = 1:

h_\theta(x) = \theta_0 + \theta_1 x_1 + \theta_2 x_2 = [\theta_0 \;\; \theta_1 \;\; \theta_2] \begin{bmatrix} x_0 \\ x_1 \\ x_2 \end{bmatrix} = \theta^T x
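In NumPy this vectorized hypothesis is a single matrix product; a sketch with illustrative numbers (one row [1, area, bedrooms] per house):

import numpy as np

X = np.array([[1.0, 2104.0, 3.0],    # [x0 = 1, living area, # bedrooms]
              [1.0, 1600.0, 3.0],
              [1.0, 2400.0, 4.0]])
theta = np.array([50.0, 0.1, 20.0])  # illustrative parameter values

predictions = X @ theta              # h_theta(x) = theta^T x for every house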
Polynomial Regression
If the univariate linear regression model is not a good model, try a polynomial
model.
The univariate (x1 = size) housing price problem is transformed into a multivariate
(still linear !!!) regression model with x = [x1 = size, x2 = size^2, x3 = size^3].
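A sketch of this feature transformation (size is an illustrative array of living areas):

import numpy as np

size = np.array([1000.0, 1500.0, 2000.0])   # illustrative living areas
X_poly = np.c_[np.ones_like(size),          # x0 = 1
               size,                        # x1 = size
               size ** 2,                   # x2 = size^2
               size ** 3]                   # x3 = size^3
# Ordinary multivariate linear regression now applies to X_poly.

In practice these columns are usually mean-normalized before running gradient descent, since size^3 dwarfs size.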
Overfitting problem
Overfitting: If we have too many features (e.g. a high-order polynomial
model), the learned hypothesis may fit the training set very well but
fail to generalize to new examples (predict prices on new examples).

h_\theta(x) = \theta_0 + \theta_1 x \;\text{(underfit)} \qquad h_\theta(x) = \theta_0 + \theta_1 x + \theta_2 x^2 + \theta_3 x^3 \;\text{(good fit)} \qquad h_\theta(x) = \theta_0 + \theta_1 x + \theta_2 x^2 + \dots + \theta_{16} x^{16} \;\text{(overfit)}
Overfitting problem
Overfitting: If we have too many features (x1, … x100), the learned
model may fit the training data very well but fail to generalize to new
examples.

h_\theta(x) = \theta_0 + \theta_1 x_1 + \theta_2 x_2 + \dots + \theta_n x_n = \theta^T x
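An illustrative demonstration of the effect on synthetic data (numpy.polyfit stands in for fitting the polynomial model; all values are made up):

import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0, 1, 10)
y = x + rng.normal(0, 0.05, 10)              # nearly linear data + noise
x_test = rng.uniform(0, 1, 10)
y_test = x_test + rng.normal(0, 0.05, 10)    # unseen examples

for degree in (1, 9):
    coeffs = np.polyfit(x, y, degree)        # fit polynomial of this degree
    train_mse = np.mean((np.polyval(coeffs, x) - y) ** 2)
    test_mse = np.mean((np.polyval(coeffs, x_test) - y_test) ** 2)
    print(f"degree {degree}: train MSE {train_mse:.5f}, test MSE {test_mse:.5f}")

The degree-9 fit drives the training error to (almost) zero; on unseen inputs its error is typically much larger, because it fits the noise rather than the trend.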
How to deal with the overfitting problem ?
1. Reduce the number of features (manually select which features to keep, or use a model selection algorithm).
2. Regularization: keep all the features, but reduce the magnitude of the parameters θj. Works well when there are many features, each contributing a little to predicting y.
Regularized Linear Regression
(cost function)
The MSE cost is extended with a penalty on the parameter magnitudes (by convention the bias θ0 is not penalized):

J(\theta) = \frac{1}{2m} \left[ \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2 + \lambda \sum_{j=1}^{n} \theta_j^2 \right]
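A sketch of the regularized cost under the earlier illustrative names (note theta[0] is excluded from the penalty):

def compute_cost_reg(X, y, theta, lam):
    """Regularized MSE cost; the bias theta[0] is not penalized."""
    m = len(y)
    errors = X @ theta - y
    penalty = lam * (theta[1:] @ theta[1:])   # lambda * sum(theta_j^2), j >= 1
    return (errors @ errors + penalty) / (2 * m)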
Regularized Linear Regression
(cost function gradient)
Unregularized cost function gradients =>

\frac{\partial J(\theta)}{\partial \theta_j} = \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) x_j^{(i)}

Regularized cost function gradients =>

\frac{\partial J(\theta)}{\partial \theta_0} = \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) x_0^{(i)}, \qquad \frac{\partial J(\theta)}{\partial \theta_j} = \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) x_j^{(i)} + \frac{\lambda}{m} \theta_j, \quad j \ge 1
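The corresponding gradient as a sketch:

def gradient_reg(X, y, theta, lam):
    """Gradient of the regularized cost; theta[0] gets no penalty term."""
    m = len(y)
    grad = X.T @ (X @ theta - y) / m      # unregularized gradient, all j
    grad[1:] += (lam / m) * theta[1:]     # add the penalty gradient for j >= 1
    return grad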
Regularized Linear Regression
What if lambda is set to an extremely large value ?
All parameters θ1, …, θn are driven towards zero and the hypothesis reduces to h_\theta(x) ≈ θ0, a flat line that underfits the data. λ must therefore be chosen to balance fitting the training set against keeping the parameters small.
Regularization: Lasso Regression
Lasso replaces the squared (L2) penalty of Ridge Regression with an absolute-value (L1) penalty:

J(\theta) = \frac{1}{2m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2 + \frac{\lambda}{2m} \sum_{j=1}^{n} |\theta_j|

Ridge Regression shrinks θ towards zero, but never exactly to zero => all
features are included in the model no matter how small their coefficients
are. The L1 penalty of Lasso can drive some coefficients exactly to zero => Lasso performs feature selection.
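For completeness, a sketch with scikit-learn's Ridge and Lasso estimators (their alpha parameter plays the role of λ; the data is synthetic and illustrative):

import numpy as np
from sklearn.linear_model import Lasso, Ridge

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 10))                          # 10 features, only 2 relevant
y = 3 * X[:, 0] - 2 * X[:, 1] + rng.normal(0, 0.1, 100)

ridge = Ridge(alpha=1.0).fit(X, y)   # L2 penalty: coefficients shrink, none exactly 0
lasso = Lasso(alpha=0.1).fit(X, y)   # L1 penalty: irrelevant coefficients become 0

print(np.round(ridge.coef_, 3))
print(np.round(lasso.coef_, 3))      # most entries are exactly 0.0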