Deep Learning

Lecture-2
Dr. Abdul Jaleel
Associate Professor
Machine Learning: A New Programming Paradigm
Linear Regression: y = mx + c

But how do we determine the exact values of m and c?


Linear Regression: y = mx + c

There are too many possible lines. Which one is best suited?
Mean Squared Error
Residuals, Error, or Loss

Cost Function

A cost function may be used to compare different hypothetical lines.

There are lots of regression lines, each having some cost. Which one is the best? The one with the minimum MSE cost.
Loss Functions

 Squared Loss: Loss = (y − y_p)²

 Mean Squared Error (MSE): MSE = (1/n) Σᵢ (yᵢ − y_pᵢ)²

 Absolute Loss: Loss = |y − y_p|

 Mean Absolute Error (MAE): MAE = (1/n) Σᵢ |yᵢ − y_pᵢ|
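As a quick illustration (added here, not from the slides), these loss functions could be implemented in plain Python with NumPy roughly as follows:

import numpy as np

def squared_loss(y, y_pred):
    # Per-sample squared loss: (y - y_pred)^2
    return (y - y_pred) ** 2

def mse(y, y_pred):
    # Mean Squared Error over all samples
    return np.mean((y - y_pred) ** 2)

def absolute_loss(y, y_pred):
    # Per-sample absolute loss: |y - y_pred|
    return np.abs(y - y_pred)

def mae(y, y_pred):
    # Mean Absolute Error over all samples
    return np.mean(np.abs(y - y_pred))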
Convex Optimization and Gradient Descent Approach
A real-valued function defined on an n-dimensional interval is called convex if the
line segment between any two points on the graph of the function lies above or on the
graph.
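In symbols (a standard formulation added for clarity, not printed on the slide): f is convex if, for any two points x₁ and x₂ in its domain and any λ ∈ [0, 1],

f(λx₁ + (1 − λ)x₂) ≤ λ f(x₁) + (1 − λ) f(x₂).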
Convex Optimization for a set of 21 data points

Regression Line: y_p = wx + 0
Convex Optimization

Loss function plot of the 21 data points for
Wj = {−1, −0.5, 0, 0.5, 1, 1.5, 2, 2.5, …, 5}

Loss = (1/n) Σᵢ (yᵢ − w·xᵢ)²
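As an illustrative sketch (the 21 data points themselves are not reproduced in this export, so the values below are assumed to lie on y = 2x), the loss for each candidate weight could be computed like this:

import numpy as np

# Hypothetical stand-in for the 21 data points shown on the slide.
x = np.linspace(0, 10, 21)
y = 2 * x

# Sweep the candidate weights and record the MSE of y_p = w*x + 0 for each.
weights = np.arange(-1, 5.5, 0.5)
for w in weights:
    loss = np.mean((y - w * x) ** 2)
    print(f"w = {w:4.1f}  ->  MSE = {loss:8.2f}")

# Plotting MSE against w traces out a convex (bowl-shaped) curve with its minimum at w = 2.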
Convex Optimization

From the range of weight values plotted in the left-side graph, let's estimate the loss for a weight value of zero.
Convex Optimization

y = 0x + 0

The Mean Squared Error loss for weight value zero is calculated from the difference (distance) between the red and green lines plotted on the right side.
Convex Optimization

y = 0.5x + 0
Convex Optimization

y = 1x + 0

Next, let's estimate the loss for a weight value of one.
Convex Optimization

y = 1.5x + 0

Weight value 1.5 decreases the MSE.
Convex Optimization

y = 2x + 0

For weight value 2, the predicted line best fits the data points.
Convex Optimization with bias

y_p = wx + b
Convex Optimization with bias

Loss = (1/n) Σᵢ (yᵢ − (w·xᵢ + b))²
Convex Optimization with bias

Loss = (1/n) Σᵢ (yᵢ − (w·xᵢ + b))²

The loss function's surface plot over (w, b) is converted into a contour plot.
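A rough sketch of how such a surface (and its contour plot) could be computed; the data points and the true line below are assumptions for illustration only:

import numpy as np

# Hypothetical data assumed to lie on some true line y = w_true*x + b_true.
w_true, b_true = 2.0, -1.0
x = np.linspace(0, 10, 21)
y = w_true * x + b_true

# Evaluate the MSE loss on a grid of (w, b) values.
w_grid = np.linspace(-1, 5, 61)
b_grid = np.linspace(-4, 2, 61)
W, B = np.meshgrid(w_grid, b_grid)

# Broadcast: predictions for every (w, b) pair against every data point.
loss = np.mean((y[None, None, :] - (W[..., None] * x + B[..., None])) ** 2, axis=-1)

# `loss` can be rendered as a 3-D surface (plot_surface) or as a contour plot (contourf);
# its single minimum sits at (w, b) = (w_true, b_true).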
Convex Optimization with bias

y_p = wx + b

Convex Optimization with bias

y = 0x − 1 (y_p = wx + b)

Convex Optimization with bias

y = 1x − 1 (y_p = wx + b)

Convex Optimization with bias

y = 2x − 1 (y_p = wx + b)

Convex Optimization with bias

y = 2x + 0 (y_p = wx + b)

Convex Optimization with bias

y = 2x + 1 (y_p = wx + b)
Gradient Descent Approach
Gradient Descent
Slope and Derivative
Result: the derivative of x² is 2x
Derivative vs. Partial Derivative
Partial Derivative
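For the MSE loss used above, the partial derivatives with respect to w and b (a standard derivation, stated here for completeness) are:

∂L/∂w = −(2/n) Σᵢ xᵢ (yᵢ − (w·xᵢ + b))
∂L/∂b = −(2/n) Σᵢ (yᵢ − (w·xᵢ + b))

Gradient descent then repeatedly updates w := w − α·∂L/∂w and b := b − α·∂L/∂b, where α is the learning rate.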
Gradient Descent Approach
Deep Learning
Lecture-3
Dr. Abdul Jaleel
Associate Professor
Gradient Descent

H(x) = Pred_y

 Let's apply gradient descent to coefficient learning: find the values of a function's parameters that minimize the cost function as far as possible.

We have almost reached the best-fit line.
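A sketch of what that loop might look like in plain Python (the data, learning rate, and epoch count below are assumptions, not taken from the lecture):

import numpy as np

def gradient_descent(x, y, lr=0.01, epochs=5000):
    # Learn w and b of y_p = w*x + b by minimizing MSE with gradient descent.
    w, b = 0.0, 0.0
    n = len(x)
    for _ in range(epochs):
        y_pred = w * x + b
        # Partial derivatives of the MSE loss with respect to w and b.
        dw = -(2 / n) * np.sum(x * (y - y_pred))
        db = -(2 / n) * np.sum(y - y_pred)
        # Step against the gradient, scaled by the learning rate.
        w -= lr * dw
        b -= lr * db
    return w, b

# Hypothetical data lying on y = 2x - 1.
x = np.linspace(0, 10, 21)
y = 2 * x - 1
w, b = gradient_descent(x, y)
print(w, b)   # converges close to w = 2, b = -1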
- In neural networks, we apply logistic regression on top of the linear regression best-fit line whose parameters were learned by gradient descent.

- The sigmoid function works as an activation function for the neuron to classify the outcome.
Why do we need a Sigmoid / Logit Function instead of a Step Function for Neuron Activation?
The Linear Equation

Non-Linear Activation Function
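As a small illustration (the feature values, weights, and bias below are hypothetical), the neuron first computes the linear equation and then passes the result through the sigmoid activation:

import math

def sigmoid(z):
    # Squashes any real value into (0, 1), so the output can be read as a probability.
    return 1 / (1 + math.exp(-z))

# Hypothetical inputs, weights, and bias.
w1, w2, b = 1.0, 1.0, -0.5
x1, x2 = 0.3, 0.8

z = w1 * x1 + w2 * x2 + b      # the linear equation
y_pred = sigmoid(z)            # the non-linear activation
print(y_pred)                  # about 0.65, interpreted as the probability of class 1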
How it works for Row 1
Predicted and actual outcome for Row 1: the error is calculated with the log loss function instead of MSE.
Predicted and actual outcome for Row 2: the error is calculated with the log loss function.
Predicted and actual outcome for Row 13: the error is calculated with the log loss function.
https://towardsdatascience.com/why-not-mse-as-a-loss-function-for-logistic-regression-589816b5e03c
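A minimal sketch of how the log loss for a single row could be computed (the predicted and actual values below are assumed, not taken from the slides); unlike MSE, this loss keeps the logistic regression cost function convex:

import math

def log_loss_single(y_true, y_pred, eps=1e-15):
    # Binary cross-entropy for one sample; eps keeps y_pred away from exactly 0 or 1.
    y_pred = min(max(y_pred, eps), 1 - eps)
    return -(y_true * math.log(y_pred) + (1 - y_true) * math.log(1 - y_pred))

# Hypothetical Row 1: actual class 1, predicted probability 0.65.
print(log_loss_single(1, 0.65))   # about 0.43 -- small error
# Hypothetical row where the model is confidently wrong.
print(log_loss_single(1, 0.05))   # about 3.0  -- large error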
The loss is high for w1 = 1, w2 = 1, so we need to apply gradient descent.
Implementation of activation functions in Python
Implementation of loss functions in Python
Now we start implementing gradient descent in plain Python. Again, the goal is to come up with the same w1, w2, and bias that the Keras model calculated. We want to show how Keras/TensorFlow would have computed these values internally using gradient descent.

First, write a couple of helper routines, such as sigmoid and log_loss:
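A plausible version of these helpers (a sketch; the slide's original code is not reproduced in this export):

import numpy as np

def sigmoid_numpy(x):
    # Element-wise sigmoid over a NumPy array of weighted sums.
    return 1 / (1 + np.exp(-x))

def log_loss(y_true, y_pred):
    # Binary cross-entropy; epsilon keeps log() away from 0 and 1.
    eps = 1e-15
    y_pred = np.clip(y_pred, eps, 1 - eps)
    return -np.mean(y_true * np.log(y_pred) + (1 - y_true) * np.log(1 - y_pred))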


Now comes the time to implement our own custom neural network class.
This shows that, in the end, we were able to come up with the same values of w1, w2, and bias using a plain Python implementation of the gradient descent function.
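The slide's own class is not included in this export; a minimal sketch of such a class (a single sigmoid neuron with two input features, trained by batch gradient descent on log loss, reusing the sigmoid_numpy helper above) might look like this:

class MyNN:
    def __init__(self):
        # Start from the same initial guesses as in the earlier example: w1 = w2 = 1, bias = 0.
        self.w1 = 1.0
        self.w2 = 1.0
        self.bias = 0.0

    def fit(self, x1, x2, y_true, epochs, lr=0.5):
        for _ in range(epochs):
            weighted_sum = self.w1 * x1 + self.w2 * x2 + self.bias
            y_pred = sigmoid_numpy(weighted_sum)
            # Gradients of the log loss with respect to each parameter of a sigmoid neuron.
            w1_grad = np.mean(x1 * (y_pred - y_true))
            w2_grad = np.mean(x2 * (y_pred - y_true))
            bias_grad = np.mean(y_pred - y_true)
            self.w1 -= lr * w1_grad
            self.w2 -= lr * w2_grad
            self.bias -= lr * bias_grad
        return self.w1, self.w2, self.bias

    def predict(self, x1, x2):
        return sigmoid_numpy(self.w1 * x1 + self.w2 * x2 + self.bias)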

You can compare the predictions from our own custom model and the TensorFlow model. You will notice that the predictions are almost the same.
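A usage sketch with made-up data (the lecture's actual dataset and the trained Keras model are not included here); it relies on the MyNN class and helpers sketched above:

import numpy as np

# Hypothetical two-feature dataset with a 0/1 label.
rng = np.random.default_rng(0)
x1 = rng.random(200)
x2 = rng.random(200)
y = (x1 + x2 > 1.0).astype(float)

model = MyNN()
print(model.fit(x1, x2, y, epochs=5000, lr=0.5))   # learned w1, w2, bias

custom_preds = model.predict(x1, x2)
# Comparing custom_preds with the Keras model's predict() output on the same data
# should show nearly identical probabilities.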


LINKS
 https://www.analyticsvidhya.com/blog/2021/08/understanding-linear-regression-with-mathematical-insights/
 https://youtu.be/xq7aULLsCtw
 https://youtu.be/1-OGRohmH2s
 https://towardsdatascience.com/why-not-mse-as-a-loss-function-for-logistic-regression-589816b5e03c
 https://www.baeldung.com/cs/cost-function-logistic-regression-logarithmic-expr#:~:text=Mean%20Squared%20Error%2C%20commonly%20used,function%20is%20however%20always%20convex
