Lect 5- Non Linear Activation Functions
Problems: not compatible with gradient-descent training via backpropagation, since its derivative is zero (no gradient signal flows back through the unit).
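Assuming this refers to a threshold/binary-step style activation (an assumption, since the slide does not name the function), a minimal NumPy sketch of why backpropagation gets no signal through it:

```python
import numpy as np

def step(x):
    # Binary step activation: 0 for x < 0, 1 for x >= 0.
    return (x >= 0).astype(float)

def step_grad(x):
    # The derivative is 0 everywhere except at x = 0 (where it is undefined),
    # so backpropagation receives no gradient signal through this unit.
    return np.zeros_like(x)

x = np.linspace(-3, 3, 7)
print(step(x))       # [0. 0. 0. 1. 1. 1. 1.]
print(step_grad(x))  # all zeros -> weights feeding this unit never update
```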
Sigmoid (logistic)
Problems: vanishing gradients at the edges, where the sigmoid saturates and its derivative approaches zero.
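A small NumPy sketch of this saturation: the sigmoid derivative sigma(x) * (1 - sigma(x)) peaks at 0.25 and shrinks toward zero for large |x| (the sample points below are illustrative):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_grad(x):
    s = sigmoid(x)
    # sigma'(x) = sigma(x) * (1 - sigma(x)); its maximum is 0.25 at x = 0
    return s * (1.0 - s)

for x in [0.0, 2.0, 5.0, 10.0]:
    print(x, sigmoid_grad(x))
# ~0.25, ~0.105, ~0.0066, ~0.000045 -> gradients vanish at the edges
```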
Hyperbolic Tangent
Its output is zero-centered because its range is between -1 and 1, i.e. -1 < output < 1.
Hence optimization is easier, so in practice it is usually preferred over the sigmoid function.
But it still suffers from the vanishing gradient problem.
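A small NumPy sketch of these tanh properties (the input range is illustrative): outputs stay in (-1, 1) and average near zero, while the derivative 1 - tanh(x)^2 still saturates for large |x|:

```python
import numpy as np

x = np.linspace(-5, 5, 11)
y = np.tanh(x)

print(y.min(), y.max())   # outputs stay strictly between -1 and 1
print(y.mean())           # ~0 for symmetric inputs: zero-centered

# Derivative: 1 - tanh(x)^2 -> still goes to 0 for large |x| (vanishing gradient)
grad = 1.0 - np.tanh(x) ** 2
print(grad[[0, 5, 10]])   # tiny at x = -5 and x = 5, maximal (1.0) at x = 0
```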
Hyperbolic Tangent
What is a zero-centered distribution?
Why is a zero-centered activation function important?
Output is zero-centered.
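A hedged NumPy illustration of why zero-centering matters (the random data and the upstream gradient value are hypothetical): sigmoid outputs are always positive, so every weight gradient of a downstream neuron shares one sign, forcing zig-zag updates; zero-centered outputs like tanh avoid this restriction:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=1000)

sig_out = 1.0 / (1.0 + np.exp(-x))   # always in (0, 1): not zero-centered
tanh_out = np.tanh(x)                # in (-1, 1): roughly zero-centered

print(sig_out.mean())    # ~0.5, all values positive
print(tanh_out.mean())   # ~0.0

# For a downstream neuron, grad_w = delta * inputs. If the inputs (previous
# activations) are all positive, every component of grad_w has the same sign
# as delta, so all of that neuron's weights are pushed in the same direction.
delta = -0.7                            # hypothetical upstream gradient
print(np.sign(delta * sig_out[:5]))     # all the same sign
print(np.sign(delta * tanh_out[:5]))    # mixed signs
```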
ReLU (Rectified Linear Unit)
ReLU Variants - ELU (Exponential Linear Units)
No dead neurons.
Output is (approximately) zero-centered.
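A minimal NumPy sketch of both functions (the alpha = 1.0 default for ELU is an assumption, not stated on the slide):

```python
import numpy as np

def relu(x):
    # max(0, x): cheap and non-saturating for x > 0, but units stuck with
    # x < 0 get exactly zero gradient ("dead" ReLUs)
    return np.maximum(0.0, x)

def elu(x, alpha=1.0):
    # ELU: x for x > 0, alpha * (exp(x) - 1) for x <= 0.
    # Negative inputs give small negative outputs instead of a hard zero,
    # which keeps a gradient flowing and pushes the mean activation toward 0.
    return np.where(x > 0, x, alpha * np.expm1(x))

x = np.array([-3.0, -1.0, 0.0, 2.0])
print(relu(x))  # [0. 0. 0. 2.]
print(elu(x))   # approximately [-0.95 -0.63  0.    2.  ]
```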
Identity function – for output layer
Softmax
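Softmax turns a vector of logits into a probability distribution over classes. A minimal NumPy sketch (subtracting the max for numerical stability is a standard trick added here, not taken from the slide):

```python
import numpy as np

def softmax(z):
    # Subtract the max for numerical stability; the result is unchanged
    # because softmax is invariant to adding a constant to every logit.
    z = z - np.max(z)
    e = np.exp(z)
    return e / e.sum()

logits = np.array([2.0, 1.0, 0.1])
p = softmax(logits)
print(p)        # ~[0.659 0.242 0.099]
print(p.sum())  # 1.0 -> a probability distribution over the classes
```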
Summary
Choosing an Activation Function
Hidden layers: typically use ReLU or its variants (Leaky ReLU, Swish).
Output layer:
Regression: linear activation.
Binary classification: Sigmoid.
Multiclass classification: Softmax.
A minimal end-to-end sketch of these choices follows this list.
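The sketch below ties the recommendations together in NumPy; the layer sizes, random weights, and three-class output are purely illustrative assumptions:

```python
import numpy as np

def relu(x):
    return np.maximum(0.0, x)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def softmax(z):
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

rng = np.random.default_rng(1)
x = rng.normal(size=(4, 8))             # toy batch: 4 examples, 8 features
W1 = rng.normal(size=(8, 16))
W2 = rng.normal(size=(16, 3))

h = relu(x @ W1)                        # hidden layer: ReLU (or a variant)
logits = h @ W2                         # raw outputs, shape (4, 3)

regression_out = logits                 # regression: linear / identity output
binary_out     = sigmoid(logits[:, :1]) # binary classification: sigmoid on one logit
multiclass_out = softmax(logits)        # multiclass classification: softmax

print(regression_out.shape, binary_out.shape, multiclass_out.sum(axis=1))
```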
Summary
Activation functions are essential for deep learning: they enable networks to capture complex patterns, relationships, and features in data.
Sources:
https://www.jeremyjordan.me/neural-networks-activation-functions/
https://missinglink.ai/guides/neural-network-concepts/7-types-neural-network-activation-functions-right/
https://ml-cheatsheet.readthedocs.io/en/latest/activation_functions.html
https://github.com/MlvPrasadOfficial/ineuron.ai/blob/master/IPYNB%20FILES%20DL/Activation%20Functions.ipynb
Sources - Sigmoid function:
https://deepai.org/machine-learning-glossary-and-terms/sigmoidal-nonlinearity
https://www.analyticsvidhya.com/blog/2020/12/beginners-take-how-logistic-regression-is-related-to-linear-regression/
https://towardsdatascience.com/activation-functions-neural-networks-1cbd9f8d91d6
Thank you
Linear function definition
Non-linear function definition