
Activation Functions

Activation Functions
• Activation functions are a crucial component of artificial neural networks, used to introduce non-linearity into the model.
• They determine whether a neuron should be activated (fire) or not, based on the weighted sum of its inputs.
• There are several types of activation functions, each with its own characteristics and use cases.
• To put it in simple terms, an artificial neuron calculates the 'weighted sum' of its inputs and adds a bias; the result is called the net input (a small sketch follows).
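As a minimal sketch (not part of the original slides), the net input of a single neuron can be computed in NumPy as follows; the input, weight, and bias values are purely illustrative.

import numpy as np

# Illustrative values: three inputs, three weights, and a bias
inputs = np.array([0.5, -1.2, 3.0])
weights = np.array([0.4, 0.1, -0.6])
bias = 0.2

# Net input = weighted sum of the inputs plus the bias
net_input = np.dot(weights, inputs) + bias
print(net_input)  # 0.4*0.5 + 0.1*(-1.2) + (-0.6)*3.0 + 0.2 = -1.52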
Activation Functions
• The value of the net input can be anything from -inf to +inf.
• The neuron by itself has no way to bound this value and so cannot decide on a firing pattern.
• Thus the activation function is an important part of an artificial neural network.
• It basically decides whether a neuron should be activated or not.
• In doing so, it bounds the value of the net input.
The activation function is a non-linear transformation applied to the net input before it is sent to the next layer of neurons or finalized as the output.
Types of Activation Functions
• Several different types of activation functions are used in Machine Learning.
• Some of them are explained next.
Step Function
• The Step function is one of the simplest kinds of activation functions.
• We choose a threshold value, and if the net input y is greater than or equal to the threshold, the neuron is activated.
• Mathematically,
f(y) = 1 if y >= threshold, and f(y) = 0 otherwise.
• For example, with a threshold of 0, a net input of 0.7 activates the neuron (output 1), while a net input of -0.3 does not (output 0).
Given below is the graphical representation of the step function.
Sigmoid Function
• The sigmoid function is a widely used activation function. It is defined as:
f(x) = 1 / (1 + e^(-x))
Sigmoid Function
• This is a smooth function and is continuously differentiable.
• The biggest advantage it has over the step and linear functions is that it is non-linear.
• This is an incredibly useful feature of the sigmoid function.
• It essentially means that when multiple neurons use the sigmoid as their activation function, the overall output is non-linear as well (see the sketch that follows).
• The function ranges from 0 to 1 and has an S shape.
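To illustrate the point about non-linearity, here is a small added sketch (not from the original slides) showing that two stacked linear layers collapse into a single linear map, while inserting a sigmoid between them does not; the layer sizes and random seed are arbitrary.

import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

rng = np.random.default_rng(0)
W1 = rng.normal(size=(3, 2))   # first layer weights
W2 = rng.normal(size=(1, 3))   # second layer weights
x = rng.normal(size=(2, 5))    # 5 sample inputs with 2 features each

# Without an activation, two linear layers collapse into one linear layer
linear_stack = W2 @ (W1 @ x)
collapsed = (W2 @ W1) @ x
print(np.allclose(linear_stack, collapsed))  # True: still just a linear model

# With a sigmoid between the layers, the mapping is genuinely non-linear
nonlinear_stack = W2 @ sigmoid(W1 @ x)
print(np.allclose(nonlinear_stack, collapsed))  # False for these weights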
ReLU
• ReLU stands for Rectified Linear Unit.
• It is the most widely used activation function. It is defined as:
f(x) = max(0, x)
ReLU
• Graphically,
ReLU
• The main advantage of the ReLU function over other activation functions is that it does not activate all the neurons at the same time.
• If the input is negative, ReLU converts it to zero and the neuron is not activated (a quick sketch of this follows).
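As a quick added illustration (not from the slides), ReLU zeroes out every negative net input, so typically only a fraction of neurons produce a non-zero output; the random inputs below are just for demonstration.

import numpy as np

def relu(x):
    return np.maximum(0, x)

rng = np.random.default_rng(42)
net_inputs = rng.normal(size=1000)   # simulated net inputs, roughly half negative
activations = relu(net_inputs)

# Fraction of neurons that stay inactive (output exactly zero)
print(np.mean(activations == 0))     # roughly 0.5 for zero-mean inputs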
Leaky ReLU
• The Leaky ReLU function is an improved version of the ReLU function.
• Instead of defining the function as 0 for x less than 0, we define it as a small linear component of x. It can be defined as:
f(x) = x for x >= 0,
f(x) = a * x for x < 0 (where a is a small constant, e.g. 0.01)
Leaky ReLU
• Graphically,
Parametric Rectified Linear Unit (PReLU)
• Similar to Leaky ReLU, but a is learned during training rather than being a fixed constant (an illustrative sketch follows).
• Provides flexibility to adapt the slope of the negative part of the activation function.
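The PReLU code shown later in the Implementation section uses a fixed a. Purely as an added sketch of the "learned during training" idea, the snippet below fits a by gradient descent on a hypothetical squared-error objective; the toy data, target slope, and learning rate are all illustrative choices, not part of the original slides.

import numpy as np

def prelu(x, a):
    return np.where(x >= 0, x, a * x)

# Hypothetical toy data: for these negative inputs the "true" slope is 0.25
x = np.linspace(-3, -0.5, 20)
target = 0.25 * x

a, lr = 0.01, 0.05   # initial slope and learning rate (illustrative)
for _ in range(200):
    y = prelu(x, a)
    # Gradient of the mean squared error with respect to a (all x here are < 0)
    grad = np.mean(2 * (y - target) * x)
    a -= lr * grad

print(round(a, 3))   # converges towards 0.25, the slope that fits the toy targets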
Exponential Linear Unit (ELU)
• Formula:
f(x) = x for x >= 0,
f(x) = a * (e^x - 1) for x < 0 (where a is a positive constant)
• Output range: (-a, ∞)
• Similar to Leaky ReLU, but with a smooth exponential negative part; its negative outputs push mean activations closer to zero.
• It helps address the vanishing gradient problem for negative inputs.
Scaled Exponential Linear Unit (SELU)
• Formula:
f(x) = scale * x for x >= 0,
f(x) = scale * alpha * (e^x - 1) for x < 0
(with fixed constants alpha ≈ 1.67326 and scale ≈ 1.0507)
• Similar to ELU, but with this particular choice of constants and an appropriate scaling (initialization) of the weights.
• Designed to have self-normalizing properties and improve training stability in deep networks.
Hyperbolic Tangent (Tanh)
• Formula: f(x) = (e^(2x) - 1) / (e^(2x) + 1)
• Output range: (-1, 1)
• Similar to the sigmoid but centered at 0.
• It is zero-centered, which helps mitigate the vanishing gradient problem compared to the sigmoid (a small comparison follows).
• Used in hidden layers of neural networks, especially when outputs are normalized.
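As an added sketch illustrating the zero-centered claim, the snippet below compares the mean output of tanh and sigmoid over inputs symmetric around zero; the input range is arbitrary.

import numpy as np

x = np.linspace(-5, 5, 101)           # inputs symmetric around 0

tanh_out = np.tanh(x)
sigmoid_out = 1 / (1 + np.exp(-x))

print(round(np.mean(tanh_out), 4))    # ~0.0: tanh outputs are centered at zero
print(round(np.mean(sigmoid_out), 4)) # ~0.5: sigmoid outputs are not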
Swish
• Formula: f(x) = x / (1 + e^(-x))
• Designed to be smoother than ReLU and has
shown promise in some applications.
Implementation
• Step Function
import numpy as np
import matplotlib.pyplot as plt

def step_function(x):
    # Output 1 where the input is at or above the threshold (0 here), else 0
    return np.where(x >= 0, 1, 0)

x = np.linspace(-5, 5, 100)
y = step_function(x)
plt.plot(x, y)
plt.title("Step Function")
plt.grid()
plt.show()
Sigmoid Function
import numpy as np
import matplotlib.pyplot as plt

def sigmoid(x):
    # Squash the input into the range (0, 1)
    return 1 / (1 + np.exp(-x))

x = np.linspace(-5, 5, 100)
y = sigmoid(x)
plt.plot(x, y)
plt.title("Sigmoid Function")
plt.grid()
plt.show()
Hyperbolic Tangent (Tanh)
import numpy as np
import matplotlib.pyplot as plt

def tanh(x):
    # NumPy's built-in hyperbolic tangent, output in (-1, 1)
    return np.tanh(x)

x = np.linspace(-5, 5, 100)
y = tanh(x)
plt.plot(x, y)
plt.title("Hyperbolic Tangent (Tanh)")
plt.grid()
plt.show()
Rectified Linear Unit (ReLU)
import numpy as np
import matplotlib.pyplot as plt

def relu(x):
    # Pass positive inputs through unchanged; clamp negative inputs to 0
    return np.maximum(0, x)

x = np.linspace(-5, 5, 100)
y = relu(x)
plt.plot(x, y)
plt.title("Rectified Linear Unit (ReLU)")
plt.grid()
plt.show()
Leaky Rectified Linear Unit (Leaky ReLU)
import numpy as np
import matplotlib.pyplot as plt

def leaky_relu(x, alpha=0.01):
    # Small slope alpha for negative inputs instead of a hard zero
    return np.where(x >= 0, x, alpha * x)

x = np.linspace(-5, 5, 100)
y = leaky_relu(x)
plt.plot(x, y)
plt.title("Leaky Rectified Linear Unit (Leaky ReLU)")
plt.grid()
plt.show()
Parametric Rectified Linear Unit (PReLU)
import numpy as np
import matplotlib.pyplot as plt

def prelu(x, a=0.01):
    # Same form as Leaky ReLU, but in practice a would be a learned parameter
    return np.where(x >= 0, x, a * x)

x = np.linspace(-5, 5, 100)
y = prelu(x)
plt.plot(x, y)
plt.title("Parametric Rectified Linear Unit (PReLU)")
plt.grid()
plt.show()
Exponential Linear Unit (ELU)
import numpy as np
import matplotlib.pyplot as plt

def elu(x, alpha=1.0):
    # Identity for x >= 0; smooth exponential curve approaching -alpha for x < 0
    return np.where(x >= 0, x, alpha * (np.exp(x) - 1))

x = np.linspace(-5, 5, 100)
y = elu(x)
plt.plot(x, y)
plt.title("Exponential Linear Unit (ELU)")
plt.grid()
plt.show()
Exponential Linear Unit (ELU)
• In this program, we define the ELU activation
function using the formula
elu(x, alpha) = x for x >= 0,
and elu(x, alpha) = alpha * (exp(x) - 1) for x < 0.
• You can adjust the alpha parameter to control the slope of the negative part of the curve (a small comparison follows).
• The code then creates a range of x values, computes the corresponding y values using the ELU function, and plots the result.
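To make the effect of alpha concrete, here is a small added sketch (not from the slides) that evaluates the negative branch of ELU for two different alpha values; the chosen values are arbitrary.

import numpy as np

def elu(x, alpha=1.0):
    return np.where(x >= 0, x, alpha * (np.exp(x) - 1))

x_neg = -2.0
# A larger alpha makes the negative branch saturate at a deeper value (-alpha)
print(elu(x_neg, alpha=0.5))  # ~ -0.43
print(elu(x_neg, alpha=2.0))  # ~ -1.73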
Scaled Exponential Linear Unit (SELU)
• The Scaled Exponential Linear Unit (SELU) is a
self-normalizing activation function that can
maintain mean activations close to 0 and
standard deviations close to 1 during training.
• Here's a Python implementation of the SELU
activation function:
Scaled Exponential Linear Unit (SELU)
import numpy as np
import matplotlib.pyplot as plt

def selu(x, alpha=1.67326, scale=1.0507):
    # ELU with a fixed alpha, multiplied by a fixed scale factor
    return scale * np.where(x > 0, x, alpha * (np.exp(x) - 1))

x = np.linspace(-5, 5, 100)
y = selu(x)
plt.plot(x, y)
plt.title("Scaled Exponential Linear Unit (SELU)")
plt.grid()
plt.show()
Scaled Exponential Linear Unit (SELU)
• In this implementation, we define the SELU
activation function using the formula
selu(x, alpha, scale) = scale * x for x >= 0,
and selu(x, alpha, scale) = scale * (alpha * (exp(x) - 1)) for
x < 0.
• The alpha and scale parameters are specific constants that are part of the SELU definition.
• The code then creates a range of x values, computes the corresponding y values using the SELU function, and plots the result (a small demonstration of the self-normalizing behaviour follows).
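As an added sketch of the self-normalizing behaviour described above (not part of the original slides), the snippet below pushes standard-normal data through several randomly initialized dense layers with SELU and prints the mean and standard deviation after each layer. The depth, width, and LeCun-normal initialization are illustrative choices in the spirit of the usual SELU setup.

import numpy as np

def selu(x, alpha=1.67326, scale=1.0507):
    return scale * np.where(x > 0, x, alpha * (np.exp(x) - 1))

rng = np.random.default_rng(0)
n = 512
x = rng.normal(size=(n, 1000))        # standard-normal input activations

for layer in range(5):
    # LeCun-normal initialization (std = 1/sqrt(fan_in)), as assumed by SELU
    W = rng.normal(scale=1.0 / np.sqrt(n), size=(n, n))
    x = selu(W @ x)
    # Mean and std stay approximately near 0 and 1 across layers
    print(layer, round(x.mean(), 3), round(x.std(), 3))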
Swish Function
• The Swish activation function is defined as:
swish(x) = x / (1 + exp(-x)), i.e. x multiplied by sigmoid(x)
Swish Function
import numpy as np
import matplotlib.pyplot as plt

def swish(x):
    # x multiplied by sigmoid(x); smooth, and slightly non-monotonic for negative inputs
    return x / (1 + np.exp(-x))

x = np.linspace(-5, 5, 100)
y = swish(x)
plt.plot(x, y)
plt.title("Swish Activation Function")
plt.grid()
plt.show()
Swish Function
• In this code, we define the Swish function
using the formula provided.
• We create a range of x values, calculate the
corresponding y values using the Swish
function, and plot the curve.
• The Swish function is known for being smooth and continuous, which makes it a viable choice as an activation function in neural networks (a small comparison with ReLU follows).
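As a final added sketch (not from the slides), the snippet below numerically compares the slope of Swish and ReLU just to the left and right of zero: ReLU's slope jumps from 0 to 1, while Swish changes smoothly. The finite-difference step size is an arbitrary choice.

import numpy as np

def swish(x):
    return x / (1 + np.exp(-x))

def relu(x):
    return np.maximum(0, x)

h = 1e-4
for f, name in [(relu, "ReLU"), (swish, "Swish")]:
    left = (f(0.0) - f(-h)) / h       # slope just left of zero
    right = (f(h) - f(0.0)) / h       # slope just right of zero
    print(name, round(left, 3), round(right, 3))
# ReLU: 0.0 vs 1.0 (kink at zero); Swish: ~0.5 vs ~0.5 (smooth)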
