
AI VIETNAM – All-in-One Course

Machine Learning

Nonlinear Regression
Nguyen Quoc Thai

Year 2023
CONTENT

(1) – Linear Regression
(2) – Nonlinear Regression
(3) – Polynomial Regression
(4) – Multivariable Polynomial Regression
(5) – Summary
1 – Linear Regression
! Problem

Data (learning):        Prediction:
Level   Salary          Level   Salary
0       8               3.5     ???
1       15              10      ???
2       18
3       22
4       26
5       30
6       38
7       47
1 – Linear Regression
! Problem

Data Visualization: the Level/Salary points above, plotted together with the line y = 6x + 7, where y = f(x) is a linear function.
1 – Linear Regression
! Linear Regression

Modeling: y = ax + b. Find a and b to fit the data; for this table the fitted line is y = 6x + 7.
1 – Linear Regression
! Linear Regression using Gradient Descent

Modeling: y = ax + b
Init θ: a = 2, b = 2 (the line y = 2x + 2), lr = 0.1
Data point: x = [1, 2] (bias term and level), y = 18

Prediction: ŷ = 2·2 + 2 = 6
Loss: L = (ŷ − y)² = (6 − 18)² = 144
(the squared difference between the predicted and actual value)
Gradient: L′ = [∂L/∂b, ∂L/∂a] = [2(ŷ − y), 2(ŷ − y)·x] = [−24, −48]

Update: θ ← θ − lr · L′ = [2, 2] − 0.1 · [−24, −48] = [4.4, 6.8]
Updated model: y = 6.8x + 4.4 (previously y = 2x + 2)
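The single gradient-descent step above can be reproduced in a few lines. A minimal NumPy sketch (variable names are illustrative, not from the slides):

```python
import numpy as np

# One data point from the table: level x = 2 (with a bias entry), salary y = 18
x = np.array([1.0, 2.0])        # [bias, level]
y = 18.0

theta = np.array([2.0, 2.0])    # init: b = 2, a = 2, i.e. the line y = 2x + 2
lr = 0.1

y_hat = theta @ x               # prediction: 2*1 + 2*2 = 6
loss = (y_hat - y) ** 2         # (6 - 18)^2 = 144
grad = 2 * (y_hat - y) * x      # [-24, -48]
theta = theta - lr * grad       # [2, 2] - 0.1 * [-24, -48] = [4.4, 6.8]

print(y_hat, loss, grad, theta) # 6.0 144.0 [-24. -48.] [4.4 6.8]
```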
1 – Linear Regression
! Linear Regression using Gradient Descent

Data: the Level/Salary table above. Inputs / Features and Target:

$$X = \begin{bmatrix} 1 & \varphi_1(1) & \cdots & \varphi_{m-1}(1) \\ \vdots & \vdots & \ddots & \vdots \\ 1 & \varphi_1(N) & \cdots & \varphi_{m-1}(N) \end{bmatrix}, \qquad Y = \begin{bmatrix} y(1) \\ \vdots \\ y(N) \end{bmatrix}$$

Weight and Prediction:

$$\theta = \begin{bmatrix} \theta_0 \\ \vdots \\ \theta_{m-1} \end{bmatrix}, \qquad \hat{Y} = X\theta = \begin{bmatrix} \theta_0 + \theta_1 \varphi_1(1) + \cdots + \theta_{m-1}\varphi_{m-1}(1) \\ \vdots \\ \theta_0 + \theta_1 \varphi_1(N) + \cdots + \theta_{m-1}\varphi_{m-1}(N) \end{bmatrix}$$
1 – Linear Regression
! Linear Regression using Gradient Descent

Learning: stack the data into the matrices X and Y defined above.

Modeling: for y = ax + b the weight vector is θ = [b, a]ᵀ; in general, θ = [θ₀, …, θ_{m−1}]ᵀ.
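Looping the earlier update over the whole dataset gives the full training procedure. A minimal vectorized gradient-descent sketch on the Level/Salary data (learning rate and iteration count are illustrative):

```python
import numpy as np

levels = np.arange(8, dtype=float)                    # 0 .. 7
salary = np.array([8, 15, 18, 22, 26, 30, 38, 47], dtype=float)

X = np.column_stack([np.ones_like(levels), levels])   # rows [1, x]
theta = np.zeros(2)                                   # [b, a]
lr = 0.01

for _ in range(20_000):
    y_hat = X @ theta
    theta -= lr * 2 / len(salary) * X.T @ (y_hat - salary)  # gradient of the MSE

print(theta)  # approaches the least-squares line, about y = 5.1x + 7.7 here,
              # in the same ballpark as the y = 6x + 7 line drawn earlier
```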
1 – Linear Regression
! Optimal Learning Rate

[Figure: loss curves for a slow (too small), an optimal, and a too-high learning rate]
1 – Linear Regression
! Limitation

Modeling: y = ax + b

The main disadvantage of this technique is that the model is linear in both the parameters and the features. This is a very restrictive assumption: quite often the data exhibits behaviour that is nonlinear in the features.

Extend this approach to more flexible models…
2 – Nonlinear Regression
! Moving beyond linearity

Linear function:

$$\hat{y}(i) = \theta_0 + \theta_1 \varphi(i)$$

Polynomial function:

$$\hat{y}(i) = \theta_0 + \theta_1 \varphi(i) + \theta_2 \varphi(i)^2$$

Nonlinear regression estimates the output based on a nonlinear function of the input. Notice that the prediction is still linear in the parameters but nonlinear in the features.
2 – Nonlinear Regression
! Moving beyond linearity

[Figure: example nonlinear functions: 2-degree polynomial, 3-degree polynomial, step function, sinusoidal function, exponential function]
3 – Polynomial Regression
! Polynomial Features

2-degree polynomial function:

$$\hat{y}(i) = \theta_0 \cdot 1 + \theta_1 \varphi(i) + \theta_2 \varphi(i)^2$$

Find θ₀, θ₁, θ₂ to fit the data; the fitted curve shown is y = 5x² + 6x + 7.
3 – Polynomial Regression
! Polynomial Features

2-degree polynomial function:

$$\hat{y}(i) = \theta_0 \cdot 1 + \theta_1 \varphi(i) + \theta_2 \varphi(i)^2$$

Create the polynomial feature:

$$\psi(\varphi(i)) = \begin{bmatrix} 1 \\ \varphi(i) \\ \varphi(i)^2 \end{bmatrix}$$

ψ(·) is referred to as a basis function; it can be seen as a function that transforms the input in some way (in this case, a power function).
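As a concrete sketch, this basis expansion amounts to stacking powers of the input (the function name is illustrative):

```python
import numpy as np

def poly_features(x, degree):
    """Map each scalar input to [1, x, x**2, ..., x**degree]."""
    x = np.asarray(x, dtype=float)
    return np.column_stack([x ** d for d in range(degree + 1)])

print(poly_features(np.arange(6), 2))
# [[ 1.  0.  0.]
#  [ 1.  1.  1.]
#  [ 1.  2.  4.]
#  [ 1.  3.  9.]
#  [ 1.  4. 16.]
#  [ 1.  5. 25.]]
```

This reproduces the feature table on the next slide.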
3 – Polynomial Regression
! Polynomial Features

2-degree polynomial function:

$$\hat{y}(i) = \theta_0 \cdot 1 + \theta_1 \varphi(i) + \theta_2 \varphi(i)^2$$

Data with the polynomial features ψ(φ(i)):

Level   Salary    Input   1   φ(i)   φ(i)²
0       45000     0       1   0      0
1       50000     1       1   1      1
2       60000     2       1   2      4
3       80000     3       1   3      9
4       110000    4       1   4      16
5       160000    5       1   5      25
3 – Polynomial Regression
! Polynomial Features

The feature columns [1, φ(i), φ(i)²] form the inputs X, and the Salary column is the target Y.
3 – Polynomial Regression
! Polynomial Features

3-degree polynomial function:

$$\hat{y}(i) = \theta_0 \cdot 1 + \theta_1 \varphi(i) + \theta_2 \varphi(i)^2 + \theta_3 \varphi(i)^3$$

Features                       Target
1   φ(i)   φ(i)²   φ(i)³
1   0      0       0           45000
1   1      1       1           50000
1   2      4       8           60000
1   3      9       27          80000
1   4      16      64          110000
1   5      25      125         160000
3 – Polynomial Regression
! Polynomial Features

The pipeline: raw input → polynomial features [1, φ(i), φ(i)²] → learning algorithm.
3 – Polynomial Regression
! Model

Data: the Level/Salary table above (levels 0–5). Inputs / Features with degree b, and Target:

$$X = \begin{bmatrix} 1 & \varphi_1(1) & \cdots & \varphi_1(1)^b \\ \vdots & \vdots & \ddots & \vdots \\ 1 & \varphi_1(N) & \cdots & \varphi_1(N)^b \end{bmatrix}, \qquad Y = \begin{bmatrix} y(1) \\ \vdots \\ y(N) \end{bmatrix}$$

Weight and Prediction:

$$\theta = \begin{bmatrix} \theta_0 \\ \vdots \\ \theta_b \end{bmatrix}, \qquad \hat{Y} = X\theta = \begin{bmatrix} \theta_0 + \theta_1 \varphi_1(1) + \cdots + \theta_b \varphi_1(1)^b \\ \vdots \\ \theta_0 + \theta_1 \varphi_1(N) + \cdots + \theta_b \varphi_1(N)^b \end{bmatrix}$$
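NumPy's np.vander builds exactly this b-degree design matrix; a minimal sketch (the degree value is illustrative):

```python
import numpy as np

levels = np.arange(6, dtype=float)                    # 0 .. 5
degree = 2

X = np.vander(levels, degree + 1, increasing=True)    # columns: 1, x, x^2, ..., x^b
print(X.shape)                                        # (6, 3): N rows, b + 1 columns
```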
3 – Polynomial Regression
! Model

Both models are linear in the parameters, so both can be trained with gradient descent on the same cost:

$$J(\theta) = \frac{1}{N} \sum_{i=1}^{N} \left( \hat{y}(i) - y(i) \right)^2$$

Nonlinear Regression Model:

$$X = \begin{bmatrix} 1 & \varphi_1(1) & \cdots & \varphi_1(1)^b \\ \vdots & \vdots & \ddots & \vdots \\ 1 & \varphi_1(N) & \cdots & \varphi_1(N)^b \end{bmatrix}, \quad \theta = \begin{bmatrix} \theta_0 \\ \vdots \\ \theta_b \end{bmatrix}, \quad \hat{Y} = \begin{bmatrix} \theta_0 + \theta_1 \varphi_1(1) + \cdots + \theta_b \varphi_1(1)^b \\ \vdots \\ \theta_0 + \theta_1 \varphi_1(N) + \cdots + \theta_b \varphi_1(N)^b \end{bmatrix}$$

Linear Regression Model:

$$X = \begin{bmatrix} 1 & \varphi_1(1) \\ \vdots & \vdots \\ 1 & \varphi_1(N) \end{bmatrix}, \quad \theta = \begin{bmatrix} \theta_0 \\ \theta_1 \end{bmatrix}, \quad \hat{Y} = \begin{bmatrix} \theta_0 + \theta_1 \varphi_1(1) \\ \vdots \\ \theta_0 + \theta_1 \varphi_1(N) \end{bmatrix}$$

In both cases $Y = [y(1), \ldots, y(N)]^\top$.
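Because only the design matrix differs, one gradient-descent routine fits either model. A self-contained sketch: the salary is rescaled to thousands and the raw polynomial features are left unscaled, so the second call needs a small learning rate and many iterations (hyperparameters are illustrative; feature scaling would help in practice):

```python
import numpy as np

def fit_gd(X, y, lr, n_iter):
    """Gradient descent on J(theta) = mean squared error; works for any
    model that is linear in the parameters."""
    theta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        theta -= lr * 2 / len(y) * X.T @ (X @ theta - y)
    return theta

levels = np.arange(6, dtype=float)
salary = np.array([45, 50, 60, 80, 110, 160], dtype=float)   # in thousands

X_lin  = np.vander(levels, 2, increasing=True)   # [1, x]      -> linear model
X_poly = np.vander(levels, 3, increasing=True)   # [1, x, x^2] -> polynomial model

print(fit_gd(X_lin,  salary, lr=0.01,  n_iter=20_000))
print(fit_gd(X_poly, salary, lr=0.004, n_iter=200_000))  # approaches the least-squares fit
```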
3 – Polynomial Regression
! Degree Choice

b-degree polynomial function:

$$\hat{y}(i) = \theta_0 \cdot 1 + \theta_1 \varphi(i) + \theta_2 \varphi(i)^2 + \cdots + \theta_b \varphi(i)^b$$

The choice of the degree of the polynomial is critical and depends on the dataset at hand.

[Figure: fits of increasing degree: 1-degree (too simple / not flexible enough), 2-degree and 3-degree (just right), 9-degree (overfitting)]
3 – Polynomial Regression
! Degree Choice

b-degree polynomial function:

$$\hat{y}(i) = \theta_0 \cdot 1 + \theta_1 \varphi(i) + \theta_2 \varphi(i)^2 + \cdots + \theta_b \varphi(i)^b$$

A good method for choosing the degree is K-fold cross-validation: choose the degree with the lowest out-of-sample error.
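A hedged scikit-learn sketch of this selection procedure on the toy salary table (the cv value and the degree range are illustrative):

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

x = np.arange(6, dtype=float).reshape(-1, 1)
y = np.array([45000, 50000, 60000, 80000, 110000, 160000], dtype=float)

for degree in range(1, 5):
    model = make_pipeline(PolynomialFeatures(degree), LinearRegression())
    scores = cross_val_score(model, x, y, cv=3, scoring="neg_mean_squared_error")
    print(degree, -scores.mean())   # choose the degree with the lowest CV error
```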
3 – Polynomial Regression
! Disadvantages

Increasing the degree of the polynomial always results in a model that is more sensitive to stochastic noise (even if that degree is the best one obtained from validation), especially at the boundaries, where we often have less data.
4 – Multivariable Polynomial Regression
! Extension to the multivariable case

Build a machine learning model to predict the weight of a fish based on body-measurement data for seven fish species.

[Figure: visualization of the fish measurement data]
4 – Multivariable Polynomial Regression
! Simple Approach

With two input variables a and b, expand each independently, with no cross term: (a + b)² ⇒ a² + b² + a + b + 1.

x₁   x₂       1   x₁   x₂   x₁²   x₂²
0    2        1   0    2    0     4
1    1        1   1    1    1     1
2    2        1   2    2    4     4
3    1        1   3    1    9     1
4    2        1   4    2    16    4
5    1        1   5    1    25    1
4 – Multivariable Polynomial Regression
! Advanced Approach

Also include the cross term: (a + b)² ⇒ a² + b² + ab + a + b + 1.

x₁   x₂       1   x₁   x₂   x₁²   x₂²   x₁x₂
0    2        1   0    2    0     4     0
1    1        1   1    1    1     1     1
2    2        1   2    2    4     4     4
3    1        1   3    1    9     1     3
4    2        1   4    2    16    4     8
5    1        1   5    1    25    1     5
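scikit-learn's PolynomialFeatures generates this advanced expansion, cross term included, for any number of inputs; a minimal sketch on the two-variable table above (get_feature_names_out requires sklearn ≥ 1.0):

```python
import numpy as np
from sklearn.preprocessing import PolynomialFeatures

X = np.array([[0, 2], [1, 1], [2, 2], [3, 1], [4, 2], [5, 1]], dtype=float)

poly = PolynomialFeatures(degree=2)   # columns: 1, x1, x2, x1^2, x1*x2, x2^2
print(poly.fit_transform(X))
print(poly.get_feature_names_out(["x1", "x2"]))
```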
4 – Multivariable Polynomial Regression
! Practice: Fish Dataset

(1) – Preprocessing

(2) – EDA (Exploratory Data Analysis)
(3) – Representation + Polynomial Features

The category feature (species) is turned into numbers via one-hot encoding:

Index   Category   One-Hot Encoding
0       Bream      [1 0 0]
1       Parkki     [0 1 0]
2       Perch      [0 0 1]
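A quick pandas sketch of the same one-hot mapping (species names follow the table above):

```python
import pandas as pd

species = pd.Series(["Bream", "Parkki", "Perch"], name="Species")
print(pd.get_dummies(species, dtype=int))
#    Bream  Parkki  Perch
# 0      1       0      0
# 1      0       1      0
# 2      0       0      1
```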
(4) – Modeling
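A hedged end-to-end sketch of the four steps. It assumes the commonly used Fish Market CSV with a Species column and a Weight target; the file name, split parameters, and degree are illustrative:

```python
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

# (1) Preprocessing: load the data and one-hot encode the category feature
df = pd.read_csv("Fish.csv")                     # assumed file name
X = pd.get_dummies(df.drop(columns="Weight"), columns=["Species"], dtype=float)
y = df["Weight"]

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

# (3) + (4) Polynomial features + a model that is linear in the parameters
model = make_pipeline(PolynomialFeatures(degree=2), LinearRegression())
model.fit(X_train, y_train)
print(model.score(X_test, y_test))               # R^2 on held-out data
```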
SUMMARY

(1) – Linear Regression
(2) – Nonlinear Regression
(3) – Polynomial Regression
(4) – Multivariable Polynomial Regression
(5) – Summary
Thanks!
Any questions?
