Deep Learning: Lectures 2 & 3
Lecture-2
Dr. Abdul Jaleel
Associate Professor
Machine learning: A new Programming Paradigm
Linear Regression: y = mx + c
Cost Function
A cost function may be used to compare different hypothetical lines.
There are many possible regression lines, each with some cost. Which one is the best? The one with the minimum MSE cost.
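As a minimal sketch of the MSE cost (the data points below are hypothetical, for illustration only):

def mse_cost(m, c, xs, ys):
    # Mean squared error of the candidate line y = m*x + c over the data points
    return sum((y - (m * x + c)) ** 2 for x, y in zip(xs, ys)) / len(xs)

xs = [1, 2, 3, 4]                 # hypothetical data lying on y = 2x
ys = [2, 4, 6, 8]
print(mse_cost(2, 0, xs, ys))     # 0.0 -> perfect fit
print(mse_cost(1, 0, xs, ys))     # 7.5 -> worse line, higher cost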
Loss Functions
Convex Optimization and Gradient Descent Approach
A real-valued function defined on an n-dimensional interval is called convex if the
line segment between any two points on the graph of the function lies above or on the
graph.
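As a small numerical sketch of this definition (using f(w) = w², the shape the MSE loss takes as a function of a single weight), the point on the chord is never below the point on the graph:

f = lambda w: w ** 2              # convex function, same shape as MSE vs. weight

w1, w2, t = -1.0, 3.0, 0.4        # any two points and any t in [0, 1]
on_chord = t * f(w1) + (1 - t) * f(w2)    # point on the line segment
on_graph = f(t * w1 + (1 - t) * w2)       # point on the graph
print(on_graph <= on_chord)       # True: the segment lies above or on the graph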
Convex Optimization for a set of 21 Data Points
(Figure: the 21 data points with a candidate Regression Line of the form y_p = wx + 0)
Convex Optimization
Loss = (1/n) Σ (y_i - w·x_i)²
Convex Optimization
From the range of weight values plotted in the graph on the left, let us estimate the loss for a weight value of zero.
Convex Optimization
y_p = 0
y_p = 0.5x + 0
Convex Optimization
y_p = 1x + 0
y_p = 1.5x + 0
y_p = 2x + 0
For weight value 2, the predicted line best fits the data points.
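This slide sequence can be reproduced with a short sketch (the data is hypothetical, generated from y = 2x): sweep the candidate weights for y_p = wx + 0 and the loss is smallest at w = 2.

xs = [1, 2, 3, 4, 5]
ys = [2 * x for x in xs]          # hypothetical data lying on y = 2x

def loss(w):
    # MSE of the candidate line y_p = w*x + 0
    return sum((y - w * x) ** 2 for x, y in zip(xs, ys)) / len(xs)

for w in [0, 0.5, 1, 1.5, 2]:
    print(w, loss(w))             # loss shrinks as w approaches 2
print("best weight:", min([0, 0.5, 1, 1.5, 2], key=loss))   # 2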
Convex Optimization with bias
y_p = wx + b
Convex Optimization with bias
Loss = (1/n) Σ (y_i - (w·x_i + b))²
Convex Optimization with bias: candidate lines of the form y_p = wx + b
y_p = 0x - 1
y_p = 1x - 1
y_p = 2x - 1
y_p = 2x + 0
y_p = 2x + 1
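A similar sketch covers the biased case (hypothetical data generated from y = 2x + 1): evaluating the candidate (w, b) pairs from the slides, only w = 2, b = 1 reaches zero loss.

xs = [1, 2, 3, 4, 5]
ys = [2 * x + 1 for x in xs]      # hypothetical data lying on y = 2x + 1

def loss(w, b):
    # MSE of the candidate line y_p = w*x + b
    return sum((y - (w * x + b)) ** 2 for x, y in zip(xs, ys)) / len(xs)

for w, b in [(0, -1), (1, -1), (2, -1), (2, 0), (2, 1)]:
    print(f"w={w}, b={b}, loss={loss(w, b):.2f}")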
Gradient Descent Approach
Slope and Derivative
Result: the derivative of x² is 2x
Derivative vs. Partial Derivative
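A quick numerical sketch of both ideas (the data inside loss() is hypothetical): the derivative of x² at a point, and the partial derivatives of the MSE loss with respect to w and b taken one at a time.

def numerical_derivative(f, x, h=1e-6):
    # Central-difference approximation of df/dx
    return (f(x + h) - f(x - h)) / (2 * h)

print(numerical_derivative(lambda x: x ** 2, 3.0))          # ~6.0, i.e. 2x at x = 3

def loss(w, b, xs=(1, 2, 3), ys=(3, 5, 7)):                 # hypothetical data
    return sum((y - (w * x + b)) ** 2 for x, y in zip(xs, ys)) / len(xs)

dL_dw = numerical_derivative(lambda w: loss(w, 1.0), 1.0)   # partial w.r.t. w at (1, 1)
dL_db = numerical_derivative(lambda b: loss(1.0, b), 1.0)   # partial w.r.t. b at (1, 1)
print(dL_dw, dL_db)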
Gradient Descent Approach
Deep Learning
Lecture-3
Dr. Abdul Jaleel
Associate Professor
H(x) = y_pred (the hypothesis H(x) denotes the predicted y)
Gradient Descent
Let us apply gradient descent in coefficient learning to find the values of a function's parameters that minimize the cost function as far as possible.
We have almost reached the best-fit line.
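A minimal sketch of that idea in plain Python (hypothetical data generated from y = 2x + 1; the learning rate and epoch count are illustrative): gradient descent repeatedly nudges w and b against the gradient of the MSE loss.

xs = [1, 2, 3, 4, 5]
ys = [2 * x + 1 for x in xs]      # hypothetical data lying on y = 2x + 1

w, b, lr, n = 0.0, 0.0, 0.02, len(xs)
for epoch in range(5000):
    # Gradients of MSE = (1/n) * sum((y - (w*x + b))^2)
    dw = (-2 / n) * sum(x * (y - (w * x + b)) for x, y in zip(xs, ys))
    db = (-2 / n) * sum((y - (w * x + b)) for x, y in zip(xs, ys))
    w -= lr * dw                  # step against the gradient
    b -= lr * db

print(round(w, 3), round(b, 3))   # approaches w = 2, b = 1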
- In Neural Networks, we apply Logistic Regression on the outcome of Gradient Descent.
Why we need a Sigmoid / Logit function instead of a Step Function for Neuron Activation
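A minimal comparison sketch: the step function jumps straight from 0 to 1 and has no useful gradient, while the sigmoid changes smoothly and gives gradient descent a usable slope.

import math

def step(z):
    # Hard threshold: output is only ever 0 or 1
    return 1 if z >= 0 else 0

def sigmoid(z):
    # Smooth, differentiable squashing to (0, 1)
    return 1 / (1 + math.exp(-z))

for z in [-2, -0.5, 0, 0.5, 2]:
    print(z, step(z), round(sigmoid(z), 3))   # sigmoid gives graded values such as 0.119, 0.378, ...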
The Linear Equation and the Non-Linear Activation Function
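Putting the two pieces together as a sketch (the two-feature input and the names w1, w2, b are assumptions for illustration): the neuron computes the linear equation and passes it through the non-linear activation.

import math

def neuron(x1, x2, w1, w2, b):
    z = w1 * x1 + w2 * x2 + b         # the linear equation
    return 1 / (1 + math.exp(-z))     # non-linear activation (sigmoid)

print(neuron(0.5, 1.0, w1=1.0, w2=1.0, b=-1.0))   # a value strictly between 0 and 1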
How it works for Row 1
Predicted and actual outcome for Row 1: error calculated with the LogLoss function instead of MSE
Predicted and actual outcome for Row 2: error calculated with the LogLoss function
Predicted and actual outcome for Row 13: error calculated with the LogLoss function
https://towardsdatascience.com/why-not-mse-as-a-loss-function-for-logistic-regression-589816b5e03c
The loss is high for w1 = 1, w2 = 1, so we need to apply Gradient Descent.
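As a sketch of that situation (the three rows below are hypothetical; only w1 = 1, w2 = 1 come from the slide, and a zero bias is assumed), the mean log loss under these weights is large, which is what motivates gradient descent:

import math

def predict(x1, x2, w1=1.0, w2=1.0, b=0.0):
    # Linear equation followed by the sigmoid activation
    return 1 / (1 + math.exp(-(w1 * x1 + w2 * x2 + b)))

def row_log_loss(y_true, y_pred, eps=1e-15):
    y_pred = min(max(y_pred, eps), 1 - eps)   # avoid log(0)
    return -(y_true * math.log(y_pred) + (1 - y_true) * math.log(1 - y_pred))

rows = [(2.0, 1.5, 0), (0.5, 0.5, 1), (3.0, 2.0, 0)]   # hypothetical (x1, x2, label)
losses = [row_log_loss(y, predict(x1, x2)) for x1, x2, y in rows]
print("mean log loss:", round(sum(losses) / len(losses), 3))   # large -> weights must improve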
Implementation of activation functions in Python
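A minimal sketch of what such implementations might look like (pure Python, scalar inputs):

import math

def sigmoid(z):
    return 1 / (1 + math.exp(-z))

def tanh(z):
    return math.tanh(z)

def relu(z):
    return max(0, z)

print(sigmoid(1), tanh(1), relu(-3), relu(2))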
Implementation of loss functions in Python
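Similarly, a sketch of the two loss functions discussed so far, MSE and log loss (binary cross-entropy):

import math

def mse(y_true, y_pred):
    # Mean squared error over two lists of equal length
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def log_loss(y_true, y_pred, eps=1e-15):
    # Binary cross-entropy; predictions are clipped away from 0 and 1
    total = 0.0
    for t, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1 - eps)
        total += -(t * math.log(p) + (1 - t) * math.log(1 - p))
    return total / len(y_true)

print(mse([1, 0, 1], [0.9, 0.2, 0.8]))        # 0.03
print(log_loss([1, 0, 1], [0.9, 0.2, 0.8]))   # ~0.18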
Now we start implementing gradient descent in plain Python. Again, the goal is to come up with the same w1, w2, and bias that the Keras model calculated.
We want to show how Keras/TensorFlow would have computed these values internally using gradient descent.
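A minimal sketch of that idea with NumPy (the training data below is hypothetical; the real notebook would use its own dataset, and the learning rate and epoch count are illustrative):

import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

X = np.array([[0.1, 0.2], [0.2, 0.1], [0.8, 0.9], [0.9, 0.7]])   # hypothetical features
y = np.array([0, 0, 1, 1])                                       # hypothetical labels

w, b, lr = np.zeros(2), 0.0, 0.5
for epoch in range(5000):
    y_pred = sigmoid(X @ w + b)
    dw = X.T @ (y_pred - y) / len(y)   # gradient of the mean log loss w.r.t. w
    db = np.mean(y_pred - y)           # gradient of the mean log loss w.r.t. b
    w -= lr * dw
    b -= lr * db

print(w, b)   # the learned w1, w2 and bias after the plain gradient-descent updates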
https://www.baeldung.com/cs/cost-function-logistic-regression-logarithmic-expr#:~:text=Mean%20Squared%20Error%2C%20commonly%20used,function%20is%20however%20always%20convex