9 DL_ANN_ActivationFunctions
HariBabu KVN
Activation Functions
• Activation Functions are applied over the linear weighted summation of the incoming information to a node (sketched below).
• Convert the linear input signal from the perceptron into a linear or non-linear output signal.
• Decide whether to activate a node or not.
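A minimal NumPy sketch of this idea (the input, weight, and bias values are made-up example numbers): the node first computes the linear weighted summation z, then an activation function maps z to the node's output.

import numpy as np

x = np.array([0.5, -1.0, 2.0])     # incoming information to the node (example values)
w = np.array([0.1, 0.4, -0.3])     # connection weights (example values)
b = 0.2                            # bias

z = np.dot(w, x) + b               # linear weighted summation
output = 1.0 / (1.0 + np.exp(-z))  # activation applied on z (sigmoid chosen here as one option)
print(z, output)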
Activation Functions
• Desirable properties: monotonic, differentiable, and quickly converging.
• Types of Activation Functions:
– Linear
– Non-Linear
Linear Activation Functions
f(x) = ax + b
df(x)/dx = a
• Observations (sketched below):
– Constant gradient
– Gradient does not depend on the change in the input
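A small sketch of these observations (a = 2 and b = 1 are arbitrary example values): the gradient of a linear activation is the constant a, no matter what the input is.

import numpy as np

a, b = 2.0, 1.0                     # example slope and intercept

def linear(x):
    return a * x + b                # f(x) = ax + b

def linear_grad(x):
    return np.full_like(x, a)       # df(x)/dx = a for every input

x = np.array([-5.0, 0.0, 5.0])
print(linear(x))       # [-9.  1. 11.]
print(linear_grad(x))  # [2. 2. 2.] -> the same constant gradient everywhere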
Non-Linear Activation Functions
• Sigmoid (Logistic)
• Hyperbolic Tangent (Tanh)
• Rectified Linear Unit (ReLU)
– Leaky ReLU
– Parametric ReLU
• Exponential Linear Unit (ELU)
Sigmoid
• Observations (sketched below):
– Output: 0 to 1
– Outputs are not zero-centered
– Can saturate and kill (vanish) gradients
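A sketch of the sigmoid and its gradient (the inputs below are example values only), illustrating the (0, 1) output range and how the gradient shrinks toward zero for large |x|, i.e. saturation.

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))   # outputs lie in (0, 1), never negative -> not zero-centered

def sigmoid_grad(x):
    s = sigmoid(x)
    return s * (1.0 - s)              # maximum is 0.25 at x = 0

x = np.array([-10.0, -2.0, 0.0, 2.0, 10.0])
print(sigmoid(x))        # all values squashed into (0, 1)
print(sigmoid_grad(x))   # near 0 at the extremes -> vanishing gradients when saturated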
Tanh
• Observations (sketched below):
– Output: -1 to 1
– Outputs are zero-centered
– Can saturate and kill (vanish) gradients
– Gradient is steeper than the Sigmoid's, resulting in faster convergence
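The same kind of sketch for tanh (example inputs only): outputs span (-1, 1) and are zero-centered, the peak gradient is 1 at x = 0 (versus 0.25 for the sigmoid, which is the steeper gradient noted above), and it still saturates for large |x|.

import numpy as np

def tanh(x):
    return np.tanh(x)                 # outputs lie in (-1, 1), zero-centered

def tanh_grad(x):
    return 1.0 - np.tanh(x) ** 2      # maximum is 1.0 at x = 0

x = np.array([-10.0, -2.0, 0.0, 2.0, 10.0])
print(tanh(x))        # symmetric around 0
print(tanh_grad(x))   # near 0 at the extremes -> still vanishes when saturated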
ReLU
• Observations (sketched below):
– Greatly increases training speed compared to Tanh and Sigmoid
– Reduces likelihood of vanishing gradient
– Issues: dead nodes and blowing up activations
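A ReLU sketch (example inputs only): the gradient is 1 for positive inputs, so it does not vanish there, but it is exactly 0 for negative inputs, which is what produces dead nodes; the positive side is unbounded, so activations can blow up.

import numpy as np

def relu(x):
    return np.maximum(0.0, x)          # f(x) = max(0, x)

def relu_grad(x):
    return (x > 0).astype(float)       # 1 for x > 0, 0 otherwise

x = np.array([-3.0, -0.5, 0.0, 0.5, 3.0])
print(relu(x))        # negatives clipped to 0, positives passed through unchanged
print(relu_grad(x))   # 0 on the negative side -> a node stuck there stops learning (dead node)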
Leaky ReLU
• Observations (sketched below):
– Fixes the dying-ReLU issue by allowing a small, non-zero slope for negative inputs
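A Leaky ReLU sketch (the leak coefficient 0.01 is a commonly used default, used here only as an example): negative inputs get a small non-zero slope instead of 0, so the gradient never dies completely.

import numpy as np

def leaky_relu(x, alpha=0.01):
    return np.where(x > 0, x, alpha * x)   # small slope alpha on the negative side

def leaky_relu_grad(x, alpha=0.01):
    return np.where(x > 0, 1.0, alpha)     # gradient is alpha (not 0) for x <= 0

x = np.array([-3.0, -0.5, 0.0, 0.5, 3.0])
print(leaky_relu(x))       # [-0.03  -0.005  0.     0.5    3.   ]
print(leaky_relu_grad(x))  # [0.01  0.01  0.01  1.    1.  ]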
Parametric ReLU
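Parametric ReLU has the same form as Leaky ReLU, except the negative-side slope alpha is a learnable parameter updated during training rather than a fixed constant. A minimal sketch (the initial alpha value is arbitrary), including the gradient with respect to alpha that would drive its update:

import numpy as np

def prelu(x, alpha):
    return np.where(x > 0, x, alpha * x)   # same shape as Leaky ReLU, but alpha is learned

def prelu_grads(x, alpha):
    dx = np.where(x > 0, 1.0, alpha)       # gradient w.r.t. the input
    dalpha = np.where(x > 0, 0.0, x)       # gradient w.r.t. alpha (only negative inputs contribute)
    return dx, dalpha

alpha = 0.25                               # learnable parameter, example initial value
x = np.array([-2.0, -0.5, 1.0, 3.0])
print(prelu(x, alpha))                     # [-0.5   -0.125  1.     3.   ]
print(prelu_grads(x, alpha))               # dalpha is what gradient descent uses to update alpha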
Multi-Class Classification
Softmax
Binary Classification
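For a vector of raw class scores z = (z_1, ..., z_K), softmax turns the scores into a probability distribution over the K classes:

softmax(z)_i = exp(z_i) / Σ_j exp(z_j)

Each output lies in (0, 1) and the outputs sum to 1, which is why softmax is the usual output activation for multi-class classification; with K = 2 it reduces to the sigmoid, covering the binary classification case. The NumPy implementation below computes this formula.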
Softmax Implementation
import numpy as np

def softmax(z):
    '''Return the softmax output of a vector.'''
    z = np.asarray(z, dtype=float)
    exp_z = np.exp(z - np.max(z))                  # subtract the max for numerical stability
    softmax_z = np.round(exp_z / exp_z.sum(), 3)   # normalize so the outputs sum to (roughly) 1
    return softmax_z
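A quick usage check (the scores are arbitrary example values): every output lies in (0, 1) and they sum to 1, which is what makes softmax suitable for multi-class classification.

scores = np.array([2.0, 1.0, 0.1])   # example raw scores for 3 classes
probs = softmax(scores)
print(probs)          # [0.659 0.242 0.099]
print(probs.sum())    # approximately 1.0 (up to rounding)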