Linear Regression Tutorial
Linear Regression
October 7, 2022
Q.M. Phan & N.H. Luong (VNU-HCM UIT) Machine Learning October 7, 2022 1 / 40
New Packages
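The package list on this slide did not survive extraction. Based on the code used later in the tutorial, the imports were presumably along these lines (a sketch, not the slide's exact code):

```python
# Packages used throughout this tutorial (inferred from the later slides):
import numpy as np                                    # array math
from sklearn.datasets import make_regression          # synthetic data
from sklearn.linear_model import LinearRegression     # normal-equation solver
from sklearn.preprocessing import PolynomialFeatures  # polynomial features
```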
Generate A Regression Problem
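The generation code on this slide is not recoverable; a minimal sketch with sklearn's `make_regression`, where the sample count, noise level, and seed are assumptions (the seed 42 matches the `random_state=42` used later):

```python
from sklearn.datasets import make_regression

# One-feature regression problem: X has shape (n_samples, 1), y shape (n_samples,)
X, y = make_regression(n_samples=100, n_features=1, noise=10.0,
                       random_state=42)
print(X.shape, y.shape)  # (100, 1) (100,)
```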
Data Visualization
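The plot itself is gone; a sketch of the usual scatter plot of the generated data (the `make_regression` arguments are the same assumed values as above):

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the script runs anywhere
import matplotlib.pyplot as plt
from sklearn.datasets import make_regression

X, y = make_regression(n_samples=100, n_features=1, noise=10.0,
                       random_state=42)

plt.scatter(X, y, edgecolor="k")
plt.xlabel("x")
plt.ylabel("y")
plt.savefig("data.png")
```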
Recall (Linear Regression)
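The body of this recap slide was lost in extraction; the linear model it presumably restates, consistent with the gradient formulas that follow, is:

```latex
\hat{y} = w_0 + w_1 x_1 + \dots + w_n x_n = w_0 + \mathbf{w}^{\top}\mathbf{x}
```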
Minimizing cost function with gradient descent
J(w) = \frac{1}{2} \sum_i \left( y^{(i)} - \hat{y}^{(i)} \right)^2 \quad (1)

\frac{\partial J}{\partial w_j} = -\sum_i \left( y^{(i)} - \hat{y}^{(i)} \right) x_j^{(i)} \quad (4)

\Delta w_j = -\eta \frac{\partial J}{\partial w_j} = \eta \sum_i \left( y^{(i)} - \hat{y}^{(i)} \right) x_j^{(i)} \quad (5)
Minimizing cost function with gradient descent (cont.)
w_j :=
\begin{cases}
w_j + \eta \sum_i \left( y^{(i)} - \hat{y}^{(i)} \right) & j = 0 \\
w_j + \eta \sum_i \left( y^{(i)} - \hat{y}^{(i)} \right) x_j^{(i)} & j \in [1, \dots, n]
\end{cases}
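The piecewise update above translates directly into NumPy; `gd_update` is a hypothetical helper name, with `w[0]` playing the role of the bias weight w_0 (whose input x_0 is implicitly 1):

```python
import numpy as np

def gd_update(w, X, y, eta):
    """One gradient-descent step for linear regression.

    w[0] is the bias weight; X has shape (m, n), y has shape (m,)."""
    y_hat = X @ w[1:] + w[0]        # current predictions
    err = y - y_hat                 # (y(i) - y_hat(i)) for every sample
    w = w.copy()
    w[0] += eta * err.sum()         # j = 0 case (x_0 = 1)
    w[1:] += eta * (X.T @ err)      # j in [1, ..., n]
    return w
```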
Pseudocode of the Training Process
Components
Hyperparameters
eta (float): the learning rate
max_iter (int): the maximum number of iterations
random_state (int): the seed used to initialize the weights reproducibly
Parameters
w (list/array): the weight values
costs (list/array): the cost values recorded over the iterations
Methods
fit(X, y)
predict(X)
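The components listed above can be assembled into a class skeleton. The slide's actual code is not recoverable, so this is a minimal sketch of one possible from-scratch implementation; the class name and constructor signature match the `LinearRegression_GD(eta=0.001, max_iter=20, random_state=42)` call used later, and `w[0]` is used as the bias weight:

```python
import numpy as np

class LinearRegression_GD:
    """From-scratch linear regression trained with batch gradient descent."""

    def __init__(self, eta=0.001, max_iter=20, random_state=42):
        self.eta = eta                    # learning rate
        self.max_iter = max_iter          # number of passes over the data
        self.random_state = random_state  # seed for weight initialization

    def fit(self, X, y):
        rng = np.random.RandomState(self.random_state)
        self.w = rng.normal(scale=0.01, size=X.shape[1] + 1)  # w[0] = bias
        self.costs = []
        for _ in range(self.max_iter):
            err = y - self.predict(X)
            self.w[0] += self.eta * err.sum()         # bias update (x_0 = 1)
            self.w[1:] += self.eta * (X.T @ err)      # weight updates
            self.costs.append(0.5 * (err ** 2).sum()) # cost J(w), Eq. (1)
        return self

    def predict(self, X):
        return X @ self.w[1:] + self.w[0]
```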
Implement (code from scratch)
’fit’ method
’fit’ method (2)
Train Model
Gradient Descent
>> reg_GD = LinearRegression_GD(eta=0.001, max_iter=20, random_state=42)
reg_GD.fit(X, y)
Visualize the trend in the cost values (Gradient Descent)
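The plot on this slide is gone. A self-contained sketch of the same idea, plotting the recorded cost values over iterations; here the costs come from a tiny inline gradient-descent run on toy data (in the tutorial they would come from the fitted model's `costs` attribute):

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # headless backend so the script runs anywhere
import matplotlib.pyplot as plt

# Stand-in for the costs recorded by fit(): a small GD run on y = 2x.
X = np.array([[1.0], [2.0], [3.0]])
y = np.array([2.0, 4.0, 6.0])
w, costs = np.zeros(2), []
for _ in range(20):
    err = y - (X @ w[1:] + w[0])
    w[0] += 0.01 * err.sum()
    w[1:] += 0.01 * (X.T @ err)
    costs.append(0.5 * (err ** 2).sum())

plt.plot(range(1, len(costs) + 1), costs, marker="o")
plt.xlabel("Iteration")
plt.ylabel("Cost J(w)")
plt.savefig("gd_costs.png")
```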
Visualize on Data
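The figure here is also lost; a sketch of overlaying the fitted regression line on the data scatter (using sklearn's solver for brevity, and the same assumed `make_regression` arguments as earlier):

```python
import matplotlib
matplotlib.use("Agg")
import matplotlib.pyplot as plt
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression

X, y = make_regression(n_samples=100, n_features=1, noise=10.0,
                       random_state=42)
reg = LinearRegression().fit(X, y)

order = X[:, 0].argsort()  # sort so the line is drawn left to right
plt.scatter(X, y, edgecolor="k", label="data")
plt.plot(X[order], reg.predict(X)[order], color="red", label="fitted line")
plt.legend()
plt.savefig("fit_on_data.png")
```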
Weight values
Implement (package)
Implement (package) (cont.)
Normal Equation
from sklearn.linear_model import LinearRegression
Parameters
intercept_
coef_
Methods
fit(X, y)
predict(X)
Differences
Gradient Descent (iterative):
w := w + \Delta w, \qquad \Delta w = \eta \sum_i \left( y^{(i)} - \hat{y}^{(i)} \right) x^{(i)}

Normal Equation (closed-form):
w = (X^\top X)^{-1} X^\top y
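The closed-form normal equation can be computed directly with NumPy. A small worked example on noise-free data y = 2x, where a column of ones is prepended so that the first entry of w is the bias:

```python
import numpy as np

X = np.array([[1.0], [2.0], [3.0]])
y = np.array([2.0, 4.0, 6.0])

Xb = np.hstack([np.ones((X.shape[0], 1)), X])  # prepend bias column x_0 = 1
w = np.linalg.inv(Xb.T @ Xb) @ Xb.T @ y        # w = (X^T X)^-1 X^T y
print(w)  # ≈ [0., 2.]  -> bias 0, slope 2
```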
Practice (cont.)
Normal Equation
>> from sklearn.linear_model import LinearRegression
>> reg_NE = LinearRegression()
reg_NE.fit(X, y)
Weight Values Comparisons
Normal Equation
>> w_NE = np.append(reg_NE.intercept_, reg_NE.coef_)
>> w_NE
[-0.97941333, 63.18605572]
Visualize on Data (all)
Performance Evaluation
\mathrm{MAE}(y, \hat{y}) = \frac{1}{n} \sum_i \left| y^{(i)} - \hat{y}^{(i)} \right| \quad (6)

\mathrm{MSE}(y, \hat{y}) = \frac{1}{n} \sum_i \left( y^{(i)} - \hat{y}^{(i)} \right)^2 \quad (7)

R-Squared (R²):

R^2(y, \hat{y}) = 1 - \frac{\sum_i \left( y^{(i)} - \hat{y}^{(i)} \right)^2}{\sum_i \left( y^{(i)} - \bar{y} \right)^2} \quad (8)
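Equations (6)-(8) are one-liners in NumPy. A sketch with hypothetical helper names (`mae`, `mse`, `r2` are illustrative; the later slides call a similarly defined `R2` function), checked on a tiny example:

```python
import numpy as np

def mae(y, y_hat):
    return np.mean(np.abs(y - y_hat))                 # Eq. (6)

def mse(y, y_hat):
    return np.mean((y - y_hat) ** 2)                  # Eq. (7)

def r2(y, y_hat):                                     # Eq. (8)
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - np.mean(y)) ** 2)
    return 1 - ss_res / ss_tot

y     = np.array([1.0, 2.0, 3.0])
y_hat = np.array([1.0, 2.0, 4.0])
print(mae(y, y_hat), mse(y, y_hat), r2(y, y_hat))  # 1/3, 1/3, 0.5
```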
Performance Evaluation
Performance Evaluation (cont.)
R² score
>> print('R2 of GD:', round(R2(y, y_pred_GD), 6))
print('R2 of SGD:', round(R2(y, y_pred_SGD), 6))
print('R2 of NE:', round(R2(y, y_pred_NE), 6))
Run Gradient Descent with lr = 0.005
Polynomial Regression
Example
X = [258.0, 270.0, 294.0, 320.0, 342.0, 368.0, 396.0, 446.0, 480.0, 586.0]
y = [236.4, 234.4, 252.8, 298.6, 314.2, 342.2, 360.8, 368.0, 391.2, 390.8]
Visualize data
Experiment with Linear Regression
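The experiment code on this slide did not survive extraction; a sketch of fitting a straight line to the example data with sklearn and checking its R² score (the plateau at the right end of the data is what a straight line cannot capture):

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Example data from the previous slide
X = np.array([258.0, 270.0, 294.0, 320.0, 342.0, 368.0,
              396.0, 446.0, 480.0, 586.0])[:, np.newaxis]
y = np.array([236.4, 234.4, 252.8, 298.6, 314.2, 342.2,
              360.8, 368.0, 391.2, 390.8])

reg_lin = LinearRegression().fit(X, y)
print(round(reg_lin.score(X, y), 3))  # R² of the straight-line fit
```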
Experiment with Linear Regression (cont.)
Experiment with Polynomial Regression
Syntax
from sklearn.preprocessing import PolynomialFeatures
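`PolynomialFeatures` expands each input column into polynomial terms, after which an ordinary `LinearRegression` can be fitted on the expanded matrix. A small worked example with degree 2 (the degree used here is an assumption):

```python
import numpy as np
from sklearn.preprocessing import PolynomialFeatures

X = np.array([[2.0], [3.0]])
quad = PolynomialFeatures(degree=2)   # adds a bias column and an x² column
X_quad = quad.fit_transform(X)
print(X_quad)
# [[1. 2. 4.]
#  [1. 3. 9.]]
```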
Experiment with Polynomial Regression (cont.)
>> X_test = np.arange(250, 600, 10)[:, np.newaxis]
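A self-contained sketch of the full step this grid is used for: fit a quadratic model on the example data, then predict over the evaluation grid (degree 2 is an assumption):

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures

X = np.array([258.0, 270.0, 294.0, 320.0, 342.0, 368.0,
              396.0, 446.0, 480.0, 586.0])[:, np.newaxis]
y = np.array([236.4, 234.4, 252.8, 298.6, 314.2, 342.2,
              360.8, 368.0, 391.2, 390.8])

# Fit linear regression on quadratic features
quad = PolynomialFeatures(degree=2)
reg_quad = LinearRegression().fit(quad.fit_transform(X), y)

# Predict on an evenly spaced grid for a smooth curve
X_test = np.arange(250, 600, 10)[:, np.newaxis]
y_test_pred = reg_quad.predict(quad.transform(X_test))
print(y_test_pred.shape)  # (35,)
```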
Practice