Slide 4 - Linear Regression With Multiple Variables
Linear Regression with multiple variables
Multiple features
Machine Learning
Multiple features (variables).
Size (feet²)    Price ($1000)
2104            460
1416            232
1534            315
852             178
…               …
Multiple features (variables).
Size (feet²)   Number of bedrooms   Number of floors   Age of home (years)   Price ($1000)
2104           5                    1                  45                    460
1416           3                    2                  40                    232
1534           3                    2                  30                    315
852            2                    1                  36                    178
…              …                    …                  …                     …
Notation:
n = number of features
x^{(i)} = input (features) of the i-th training example
x_j^{(i)} = value of feature j in the i-th training example
For example, from the table above, x^{(2)} = [1416; 3; 2; 40] and x_3^{(2)} = 2.
Hypothesis:
Previously (one feature): h_\theta(x) = \theta_0 + \theta_1 x
Now (multiple features): h_\theta(x) = \theta_0 + \theta_1 x_1 + \theta_2 x_2 + \cdots + \theta_n x_n
For convenience of notation, define x_0 = 1 (so that x_0^{(i)} = 1 for every example).
Then x = [x_0, x_1, \ldots, x_n]^T and \theta = [\theta_0, \theta_1, \ldots, \theta_n]^T are both (n+1)-dimensional vectors, and the hypothesis becomes
h_\theta(x) = \theta_0 x_0 + \theta_1 x_1 + \cdots + \theta_n x_n = \theta^T x,
where \theta^T is a 1-by-(n+1) matrix (a row vector).
This is multivariate linear regression.
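As a concrete illustration, here is a minimal NumPy sketch of this vectorized hypothesis, built from the four training examples in the table above; the variable names and the all-zero initial theta are illustrative choices, not something from the slides.

```python
import numpy as np

# Features from the table: size, number of bedrooms, number of floors, age of home.
X = np.array([
    [2104, 5, 1, 45],
    [1416, 3, 2, 40],
    [1534, 3, 2, 30],
    [ 852, 2, 1, 36],
], dtype=float)
y = np.array([460, 232, 315, 178], dtype=float)   # price in $1000s

m, n = X.shape                        # m = 4 training examples, n = 4 features
X = np.hstack([np.ones((m, 1)), X])   # prepend the x_0 = 1 column -> shape (m, n+1)

theta = np.zeros(n + 1)               # (n+1)-dimensional parameter vector
h = X @ theta                         # h_theta(x^(i)) = theta^T x^(i) for every example at once
print(h)                              # all zeros until theta is learned
```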
Linear Regression with multiple variables
Gradient descent for multiple variables
Hypothesis: h_\theta(x) = \theta^T x = \theta_0 x_0 + \theta_1 x_1 + \cdots + \theta_n x_n, with x_0 = 1.
Parameters: \theta = [\theta_0, \theta_1, \ldots, \theta_n]^T, an (n+1)-dimensional vector.
Cost function: J(\theta) = \frac{1}{2m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2
Gradient descent:
Repeat until convergence {
    \theta_j := \theta_j - \alpha \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) x_j^{(i)}
}
(simultaneously update \theta_j for every j = 0, 1, \ldots, n)
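A compact sketch of this update rule in NumPy; it assumes the design matrix X already contains the x_0 = 1 column, and the default learning rate and iteration count are illustrative rather than values from the slides.

```python
import numpy as np

def gradient_descent(X, y, alpha=0.01, num_iters=400):
    """Batch gradient descent for multivariate linear regression.

    X : (m, n+1) design matrix whose first column is all ones (x_0 = 1)
    y : (m,) vector of target values
    """
    m = len(y)
    theta = np.zeros(X.shape[1])
    J_history = []
    for _ in range(num_iters):
        errors = X @ theta - y                         # h_theta(x^(i)) - y^(i)
        J_history.append((errors @ errors) / (2 * m))  # cost J(theta) before this update
        gradient = (X.T @ errors) / m                  # partial derivatives for every j
        theta = theta - alpha * gradient               # simultaneous update of all theta_j
    return theta, J_history
```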
Linear Regression with multiple variables
Gradient descent in practice I: Feature scaling
Feature scaling (a practical trick for making gradient descent work well)
Idea: make sure features are on a similar scale.
E.g. x_1 = size (0-2000 feet²), x_2 = number of bedrooms (1-5).
The problem with these two features is their very different ranges: the contours of J(\theta) become tall and skinny, and gradient descent can take a long time to reach the minimum.
Scaling the features, e.g. x_1 = size (feet²) / 2000 and x_2 = (number of bedrooms) / 5, makes the contours much rounder and gradient descent converges faster.
Mean normalization: replace x_i with (x_i - \mu_i) / s_i, where \mu_i is the average value of feature i over the training set and s_i is its range (max - min) or its standard deviation. (Do not apply this to x_0 = 1.)
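A small NumPy sketch of mean normalization, assuming X holds the raw feature columns without the x_0 = 1 column; dividing by the range rather than the standard deviation, and the function name itself, are illustrative choices.

```python
import numpy as np

def feature_normalize(X):
    """Mean-normalize every feature column: (x - average value) / (max - min)."""
    mu = X.mean(axis=0)                    # average value of each feature
    s = X.max(axis=0) - X.min(axis=0)      # range; X.std(axis=0) would also work
    return (X - mu) / s, mu, s             # keep mu and s to normalize new inputs the same way

# Size and number-of-bedrooms features from the table above.
X = np.array([[2104, 5], [1416, 3], [1534, 3], [852, 2]], dtype=float)
X_norm, mu, s = feature_normalize(X)
print(X_norm.round(2))
```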
Linear Regression with multiple variables
Gradient descent in practice II: Learning rate
Gradient descent
- "Debugging": how to make sure gradient descent is working correctly.
- How to choose the learning rate \alpha.
Making sure gradient descent is working correctly.
Plot J(\theta) against the number of iterations: if gradient descent is working, J(\theta) should decrease after every iteration.
[Figure: J(\theta) versus no. of iterations (0-400), with the curve flattening out as it converges.]
Example automatic convergence test: declare convergence if J(\theta) decreases by less than 10^{-3} in one iteration. Deciding this threshold can be hard, and the number of iterations gradient descent takes to converge depends on the application, so inspecting the plot is usually more informative.
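A self-contained sketch of this diagnostic, using the mean-normalized size feature from the earlier table; the learning rate, iteration cap, and 10^{-3} threshold are illustrative choices.

```python
import numpy as np
import matplotlib.pyplot as plt

sizes = np.array([2104.0, 1416.0, 1534.0, 852.0])
y = np.array([460.0, 232.0, 315.0, 178.0])
x1 = (sizes - sizes.mean()) / (sizes.max() - sizes.min())  # mean-normalized size
X = np.column_stack([np.ones_like(x1), x1])                # add the x_0 = 1 column

alpha, epsilon = 0.3, 1e-3
theta = np.zeros(2)
J_history = []
for it in range(400):
    errors = X @ theta - y
    J_history.append((errors @ errors) / (2 * len(y)))
    theta -= alpha * (X.T @ errors) / len(y)
    # Automatic convergence test: stop once J decreases by less than epsilon in one iteration.
    if it > 0 and J_history[-2] - J_history[-1] < epsilon:
        break

plt.plot(J_history)
plt.xlabel("No. of iterations")
plt.ylabel("J(theta)")
plt.show()
```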
[Figure: plots of J(\theta) versus no. of iterations in which J(\theta) keeps rising or oscillates up and down.]
If J(\theta) is increasing or repeatedly going up and down as the iterations proceed, gradient descent is not working: use a smaller \alpha.
- For sufficiently small \alpha, J(\theta) should decrease on every iteration.
- But if \alpha is too small, gradient descent can be slow to converge.
Summary:
- If \alpha is too small: slow convergence.
- If \alpha is too large: J(\theta) may not decrease on every iteration and may not converge (slow convergence is also possible).
To choose \alpha, try a range of values spaced roughly 3x apart, e.g. ..., 0.001, 0.003, 0.01, 0.03, 0.1, 0.3, 1, ...
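A quick sketch of that search over learning rates, reusing the normalized size feature from above; running a fixed 100 iterations per value and comparing the resulting costs is an illustrative way to do it, not a prescription from the slides.

```python
import numpy as np

sizes = np.array([2104.0, 1416.0, 1534.0, 852.0])
y = np.array([460.0, 232.0, 315.0, 178.0])
x1 = (sizes - sizes.mean()) / (sizes.max() - sizes.min())
X = np.column_stack([np.ones_like(x1), x1])

# Learning rates spaced roughly 3x apart.
for alpha in [0.001, 0.003, 0.01, 0.03, 0.1, 0.3, 1.0]:
    theta = np.zeros(2)
    for _ in range(100):
        theta -= alpha * (X.T @ (X @ theta - y)) / len(y)
    J = ((X @ theta - y) ** 2).sum() / (2 * len(y))
    print(f"alpha = {alpha:<5}  J(theta) after 100 iterations = {J:.2f}")
```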
Linear Regression with multiple variables
Features and polynomial regression
Housing prices prediction
With two features, frontage and depth, the hypothesis would be h_\theta(x) = \theta_0 + \theta_1 \times \text{frontage} + \theta_2 \times \text{depth}.
But we don't have to use those features directly: we can define a new feature x = frontage \times depth, the land area, and use h_\theta(x) = \theta_0 + \theta_1 x. Sometimes, by defining new features, you might actually get a better model.
Polynomial regression
[Figure: housing data, price (y) versus size (x); a straight line doesn't seem to fit this data very well.]
We could fit a quadratic model, h_\theta(x) = \theta_0 + \theta_1 x + \theta_2 x^2, but a quadratic eventually curves back down, which doesn't make sense for housing prices. A cubic function, h_\theta(x) = \theta_0 + \theta_1 x + \theta_2 x^2 + \theta_3 x^3, can be a better fit.
To fit such a model with the multivariate linear regression machinery, define x_1 = (size), x_2 = (size)^2, x_3 = (size)^3. Feature scaling becomes especially important here, since these features take on very different ranges of values.
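A minimal sketch of this trick, feeding the engineered features x, x², x³ (after mean normalization) into plain batch gradient descent; the data points, learning rate, and iteration count are illustrative.

```python
import numpy as np

sizes = np.array([2104.0, 1416.0, 1534.0, 852.0])
prices = np.array([460.0, 232.0, 315.0, 178.0])

# Engineered features: x1 = size, x2 = size^2, x3 = size^3.
X = np.column_stack([sizes, sizes**2, sizes**3])

# Without scaling these columns span roughly 10^3 to 10^10, so mean-normalize each one.
X = (X - X.mean(axis=0)) / (X.max(axis=0) - X.min(axis=0))
X = np.column_stack([np.ones(len(sizes)), X])     # add the x_0 = 1 column

theta = np.zeros(X.shape[1])
for _ in range(2000):                             # plain batch gradient descent
    theta -= 0.1 * (X.T @ (X @ theta - prices)) / len(prices)
print(theta)
```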
Choice of features
[Figure: price (y) versus size (x).]
Polynomial terms are not the only option. For example, h_\theta(x) = \theta_0 + \theta_1 (\text{size}) + \theta_2 \sqrt{\text{size}} gives a curve that keeps increasing but gradually flattens out, which can be a more natural shape for housing prices.
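The same data can be fitted with that square-root feature; solving the tiny least-squares problem directly with np.linalg.lstsq here is just an illustrative shortcut (gradient descent on the scaled features would work equally well).

```python
import numpy as np

sizes = np.array([2104.0, 1416.0, 1534.0, 852.0])
prices = np.array([460.0, 232.0, 315.0, 178.0])

# h_theta(x) = theta_0 + theta_1 * size + theta_2 * sqrt(size)
X = np.column_stack([np.ones_like(sizes), sizes, np.sqrt(sizes)])
theta, *_ = np.linalg.lstsq(X, prices, rcond=None)
print(theta)
```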
Linear Regression with multiple variables
Normal equation
Gradient descent is an iterative algorithm: it takes many steps, needing multiple iterations to converge to the global minimum.
Normal equation: a method to solve for \theta analytically, in one step. For some linear regression problems, the normal equation gives us a much better way to solve for the optimal value of the parameters \theta.
Intuition: if \theta is 1D (just a scalar value), then J(\theta) = a\theta^2 + b\theta + c is a quadratic function, and we can minimize it by setting the derivative \frac{d}{d\theta} J(\theta) to zero and solving for \theta.
For \theta \in \mathbb{R}^{n+1}, J(\theta_0, \theta_1, \ldots, \theta_n) = \frac{1}{2m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2; set the partial derivative \frac{\partial}{\partial \theta_j} J(\theta) = 0 for every j, and solve for \theta_0, \theta_1, \ldots, \theta_n.
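For concreteness, the 1D case works out as follows (a standard calculus step filled in here, not taken verbatim from the slide):

J(\theta) = a\theta^2 + b\theta + c, \quad a > 0
\frac{d}{d\theta} J(\theta) = 2a\theta + b = 0 \;\Longrightarrow\; \theta = -\frac{b}{2a}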
Example: m = 4 training examples, n = 4 features. Add an extra column for x_0 = 1:

x_0   Size (feet²)   Number of bedrooms   Number of floors   Age of home (years)   Price ($1000)
1     2104           5                    1                  45                    460
1     1416           3                    2                  40                    232
1     1534           3                    2                  30                    315
1     852            2                    1                  36                    178

Put the features (including the x_0 column) into the m-by-(n+1) design matrix X, one row per training example, and the prices into the m-dimensional vector y. Then
\theta = (X^T X)^{-1} X^T y
gives the value of \theta that minimizes J(\theta).
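A minimal NumPy sketch of this computation using the table above; np.linalg.pinv is used instead of a plain matrix inverse so the sketch still runs if X^T X happens to be non-invertible (as it is here, with only 4 examples and 5 parameters).

```python
import numpy as np

# Design matrix with the extra x_0 = 1 column, and the price vector, from the table above.
X = np.array([
    [1, 2104, 5, 1, 45],
    [1, 1416, 3, 2, 40],
    [1, 1534, 3, 2, 30],
    [1,  852, 2, 1, 36],
], dtype=float)
y = np.array([460, 232, 315, 178], dtype=float)

# theta = (X^T X)^(-1) X^T y; pinv gives the pseudoinverse when X^T X is singular.
theta = np.linalg.pinv(X.T @ X) @ X.T @ y
print(theta)
```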
With m training examples and n features:

Gradient descent:
- Need to choose \alpha.
- Needs many iterations.
- Works well even when n is large.

Normal equation:
- No need to choose \alpha.
- No need to do feature scaling.
- Don't need to iterate.
- Need to compute (X^T X)^{-1}, which costs roughly O(n^3).
- Slow if n is very large.
To summarize: so long as the number of features is not too large, the normal equation gives us a great alternative method for solving for the parameter \theta. Concretely, so long as the number of features is less than 1000, the normal equation method can be used rather than gradient descent.
As we get to more complex learning algorithms, for example classification algorithms like logistic regression, the normal equation method actually does not work, and we will have to resort to gradient descent for those algorithms.
So gradient descent is a very useful algorithm to know, both for linear regression with a large number of features and for some of the other algorithms, because for them the normal equation method just doesn't apply. But for this specific model of linear regression, the normal equation can give you an alternative that can be much faster than gradient descent.
So, depending on the details of the problem and how many features you have, both of these algorithms are well worth knowing about.