Linear Regression Python Programming
Informática
Petia Georgieva
(petia@ua.pt)
LINEAR REGRESSION - Outline
1. Univariate linear regression
- Cost (loss) function - Mean Squared Error (MSE)
- Cost function convergence
- Gradient descent algorithm
2. Multivariate linear regression
3. Polynomial regression
4. Overfitting and regularized linear regression (Ridge, Lasso)
Supervised Learning -
CLASSIFICATION vs REGRESSION
Classification - the label is a discrete (integer) value
(e.g. 0, 1 for binary classification)
• Weather forecast (rain / no rain)
Regression - the label is a continuous value
• House price prediction
Supervised Learning –
univariate regression
Problem: Learning to predict the housing price (output, predicted variable) as a
function of the living area (input, feature, predictor)
Mean Square Error (MSE)
The cost (loss) function measures the mean squared error of the model predictions over the m training examples:

J(\theta) = \frac{1}{2m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2

Goal => \min_\theta J(\theta)
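A minimal NumPy sketch of this cost computation (illustrative names: X is the design matrix with a leading x0 = 1 column, y the target vector, theta the parameter vector):

import numpy as np

def compute_cost(X, y, theta):
    """MSE cost: J(theta) = 1/(2m) * sum((h_theta(x) - y)^2)."""
    m = len(y)                    # number of training examples
    errors = X @ theta - y       # prediction errors h_theta(x) - y
    return (errors @ errors) / (2 * m)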
Linear Regression – iterative gradient
descent algorithm (summary)
Initialize the model parameters (e.g. θ = 0)
Repeat until J converges {

\theta_j := \theta_j - \alpha \frac{\partial J(\theta)}{\partial \theta_j} = \theta_j - \alpha \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) x_j^{(i)} \quad \text{(simultaneously for all } j\text{)}

}
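A compact sketch of the full loop under the same illustrative names (the alpha and num_iters defaults are hypothetical):

def gradient_descent(X, y, theta, alpha=0.01, num_iters=1500):
    """Batch gradient descent for linear regression."""
    m = len(y)
    for _ in range(num_iters):
        gradient = X.T @ (X @ theta - y) / m   # dJ/dtheta for all j at once
        theta = theta - alpha * gradient       # simultaneous parameter update
    return theta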
Batch / mini-batch / stochastic
gradient descent for parameter update
Batch gradient descent uses all m training examples for every parameter update; stochastic gradient descent uses a single example per update; mini-batch gradient descent uses a small subset (e.g. 32 examples), as sketched below.
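A hedged sketch of one training epoch covering all three schemes (batch_size is an illustrative parameter: batch_size = m gives batch GD, batch_size = 1 gives SGD):

import numpy as np

def gd_epoch(X, y, theta, alpha, batch_size):
    """One epoch of mini-batch gradient descent."""
    m = len(y)
    idx = np.random.permutation(m)             # shuffle the examples each epoch
    for start in range(0, m, batch_size):
        batch = idx[start:start + batch_size]
        Xb, yb = X[batch], y[batch]
        grad = Xb.T @ (Xb @ theta - yb) / len(batch)
        theta = theta - alpha * grad
    return theta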
Lin Reg Cost function – local minimum
Quiz: Suppose θ1 is at a local optimum of J, as shown in the figure. What will one step of gradient descent do?
1) Leave θ1 unchanged
2) Change θ1 in a random direction
3) Decrease θ1
4) Move θ1 in the direction of the global minimum of J
(Answer: 1 - at a local optimum the gradient ∂J/∂θ1 is zero, so the update θ1 := θ1 − α · 0 leaves θ1 unchanged.)
Cost function convergence
changing the learning rate (α), 100 vs. 400 iterations
[Figure: two panels of Cost J (scale ×10^7) vs. number of iterations, for α = 0.01, 0.03, 0.1, 0.3; the first panel spans 100 iterations, the second 400.]
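Curves like these can be reproduced by recording J at every iteration for each learning rate; a self-contained sketch on synthetic data (all names and values illustrative):

import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
X = np.c_[np.ones(50), rng.uniform(0, 2, 50)]       # bias column + one feature
y = 1.0 + 3.0 * X[:, 1] + rng.normal(0, 0.1, 50)    # synthetic targets

def cost_history(alpha, num_iters=100):
    """Run batch gradient descent, recording J(theta) at every iteration."""
    theta = np.zeros(X.shape[1])
    m = len(y)
    history = []
    for _ in range(num_iters):
        errors = X @ theta - y
        theta = theta - alpha * X.T @ errors / m
        history.append((errors @ errors) / (2 * m))
    return history

for alpha in (0.01, 0.03, 0.1, 0.3):
    plt.plot(cost_history(alpha), label=f"alpha = {alpha}")
plt.xlabel("Number of iterations")
plt.ylabel("Cost J")
plt.legend()
plt.show()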
LinReg Cost function convergence -
learning rate variation (α)
[Figure: Cost J (scale ×10^8) vs. number of iterations (0-20), for α = 0.01, 0.03, 0.1, 1.4; with α = 1.4 the cost grows instead of decreasing, i.e. too large a learning rate makes gradient descent diverge.]
Univariate Regression
Given the house area, what is the most likely house price?
If the univariate linear regression model is not sufficiently good, add more
features (e.g. number of bedrooms).
Multivariate Regression
Problem: Learning to predict the housing price as a function of living area and
number of bedrooms.
With the convention x_0 = 1:

h_\theta(x) = \theta_0 + \theta_1 x_1 + \theta_2 x_2 = [\theta_0 \;\; \theta_1 \;\; \theta_2] \begin{bmatrix} x_0 \\ x_1 \\ x_2 \end{bmatrix} = \theta^T x
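In NumPy this vectorized hypothesis is a single matrix product; a sketch with illustrative numbers (one row [1, area, bedrooms] per house):

import numpy as np

X = np.array([[1.0, 2104.0, 3.0],    # [x0 = 1, living area, # bedrooms]
              [1.0, 1600.0, 3.0],
              [1.0, 2400.0, 4.0]])
theta = np.array([50.0, 0.1, 20.0])  # illustrative parameter values

predictions = X @ theta              # h_theta(x) = theta^T x for every house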
Polynomial Regression
If the univariate linear regression model is not a good model, try a polynomial
model.
The univariate (x1 = size) housing price problem is transformed into a multivariate
(still linear !!!) regression model with x = [x1 = size, x2 = size^2, x3 = size^3].
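A sketch of this feature transformation (size is an illustrative array of living areas):

import numpy as np

size = np.array([1000.0, 1500.0, 2000.0])   # illustrative living areas
X_poly = np.c_[np.ones_like(size),          # x0 = 1
               size,                        # x1 = size
               size ** 2,                   # x2 = size^2
               size ** 3]                   # x3 = size^3
# Ordinary multivariate linear regression now applies to X_poly.

In practice these columns are usually mean-normalized before running gradient descent, since size^3 dwarfs size.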
Overfitting problem
Overfitting: If we have too many features (e.g. a high-order polynomial
model), the learned hypothesis may fit the training set very well but
fail to generalize to new examples (predict prices on new examples).

h_\theta(x) = \theta_0 + \theta_1 x \;\text{(underfit)} \qquad h_\theta(x) = \theta_0 + \theta_1 x + \theta_2 x^2 + \theta_3 x^3 \;\text{(good fit)} \qquad h_\theta(x) = \theta_0 + \theta_1 x + \theta_2 x^2 + \dots + \theta_{16} x^{16} \;\text{(overfit)}
Overfitting problem
Overfitting: If we have too many features (x1, … x100), the learned
model may fit the training data very well but fail to generalize to new
examples.

h_\theta(x) = \theta_0 + \theta_1 x_1 + \theta_2 x_2 + \dots + \theta_n x_n = \theta^T x
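An illustrative demonstration of the effect on synthetic data (numpy.polyfit stands in for fitting the polynomial model; all values are made up):

import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0, 1, 10)
y = x + rng.normal(0, 0.05, 10)              # nearly linear data + noise
x_test = rng.uniform(0, 1, 10)
y_test = x_test + rng.normal(0, 0.05, 10)    # unseen examples

for degree in (1, 9):
    coeffs = np.polyfit(x, y, degree)        # fit polynomial of this degree
    train_mse = np.mean((np.polyval(coeffs, x) - y) ** 2)
    test_mse = np.mean((np.polyval(coeffs, x_test) - y_test) ** 2)
    print(f"degree {degree}: train MSE {train_mse:.5f}, test MSE {test_mse:.5f}")

The degree-9 fit drives the training error to (almost) zero; on unseen inputs its error is typically much larger, because it fits the noise rather than the trend.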
How to deal with the overfitting problem ?
1. Reduce the number of features (manually select which features to keep, or use a model selection algorithm).
2. Regularization: keep all the features, but reduce the magnitude of the parameters θj. Works well when there are many features, each contributing a little to predicting y.
Regularized Linear Regression
(cost function)
The MSE cost is extended with a penalty on the parameter magnitudes (by convention the bias θ0 is not penalized):

J(\theta) = \frac{1}{2m} \left[ \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2 + \lambda \sum_{j=1}^{n} \theta_j^2 \right]
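A sketch of the regularized cost under the earlier illustrative names (note theta[0] is excluded from the penalty):

def compute_cost_reg(X, y, theta, lam):
    """Regularized MSE cost; the bias theta[0] is not penalized."""
    m = len(y)
    errors = X @ theta - y
    penalty = lam * (theta[1:] @ theta[1:])   # lambda * sum(theta_j^2), j >= 1
    return (errors @ errors + penalty) / (2 * m)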
Regularized Linear Regression
(cost function gradient)
Unregularized cost function gradients =>

\frac{\partial J(\theta)}{\partial \theta_j} = \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) x_j^{(i)}

Regularized cost function gradients =>

\frac{\partial J(\theta)}{\partial \theta_0} = \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) x_0^{(i)}, \qquad \frac{\partial J(\theta)}{\partial \theta_j} = \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) x_j^{(i)} + \frac{\lambda}{m} \theta_j, \quad j \ge 1
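The corresponding gradient as a sketch:

def gradient_reg(X, y, theta, lam):
    """Gradient of the regularized cost; theta[0] gets no penalty term."""
    m = len(y)
    grad = X.T @ (X @ theta - y) / m      # unregularized gradient, all j
    grad[1:] += (lam / m) * theta[1:]     # add the penalty gradient for j >= 1
    return grad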
Regularized Linear Regression
What if lambda is set to an extremely large value ?
All parameters θ1, …, θn are driven towards zero and the hypothesis reduces to h_\theta(x) ≈ θ0, a flat line that underfits the data. λ must therefore be chosen to balance fitting the training set against keeping the parameters small.
Regularization: Lasso Regression
Lasso replaces the squared (L2) penalty of Ridge Regression with an absolute-value (L1) penalty:

J(\theta) = \frac{1}{2m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2 + \frac{\lambda}{2m} \sum_{j=1}^{n} |\theta_j|

Ridge Regression shrinks θ towards zero, but never exactly to zero => all
features are included in the model no matter how small their coefficients
are. The L1 penalty of Lasso can drive some coefficients exactly to zero => Lasso performs feature selection.
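For completeness, a sketch with scikit-learn's Ridge and Lasso estimators (their alpha parameter plays the role of λ; the data is synthetic and illustrative):

import numpy as np
from sklearn.linear_model import Lasso, Ridge

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 10))                          # 10 features, only 2 relevant
y = 3 * X[:, 0] - 2 * X[:, 1] + rng.normal(0, 0.1, 100)

ridge = Ridge(alpha=1.0).fit(X, y)   # L2 penalty: coefficients shrink, none exactly 0
lasso = Lasso(alpha=0.1).fit(X, y)   # L1 penalty: irrelevant coefficients become 0

print(np.round(ridge.coef_, 3))
print(np.round(lasso.coef_, 3))      # most entries are exactly 0.0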