Session 6 Machine Learning Algorithms

The document provides an overview of machine learning algorithms, focusing on mathematical principles, linear regression, logistic regression, cost functions, and gradient descent. It introduces the perceptron model as a fundamental building block of neural networks, explaining its components, learning process, and types of perceptron models. Additionally, it discusses the advantages and disadvantages of single-layer and multi-layer perceptrons in solving complex problems.


EIT 4454 Machine Learning

Machine Learning Algorithms


Fundamental Mathematical Formulae
Learning outcomes
By the end of this Session, you should be able to:

 Describe some of the mathematical equations implemented in machine learning algorithms that help in the learning process.
 Demonstrate knowledge and understanding of how machine learning algorithms function.
Introduction
 As a subfield of artificial intelligence, machine learning relies heavily on mathematical principles and formulas to create models that can learn and make predictions from data.
 Understanding the underlying mathematical concepts is crucial for practitioners to effectively apply machine learning algorithms.
 Let's explore some common mathematical formulas used in machine learning.
Linear Regression
 Linear Regression is used to predict the outcome of a
continuous variable by fitting the best line on the data
points.
 The best-fitted line defines a relationship between the
dependent and the independent variable(s).
 The algorithm tries to find the best-fitted line for predicting
the value of the target variable.
 The best-fit line is attained by minimizing the sum of the
squared difference between the data points and the
regression line.
Linear Regression
 The formula for linear regression (with one or more input features) can be expressed as:

y = β₀ + β₁x₁ + β₂x₂ + … + βₙxₙ + ɛ

Where:
 y is the target variable (dependent variable)
 x₁, x₂, …, xₙ are the input features (independent variables)
 β₀ is the intercept, and β₁, β₂, …, βₙ are the coefficients (slopes) to be learned
 ɛ represents the error term
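The best-fit line described above can be computed in closed form with ordinary least squares. A minimal sketch in Python; the data points are invented for illustration:

```python
import numpy as np

# Hypothetical data: a noisy line, roughly y = 2x + 1
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.1, 2.9, 5.2, 7.0, 8.9])

# Design matrix with a column of ones so beta_0 (the intercept) is learned too
X = np.column_stack([np.ones_like(x), x])

# Least squares minimizes the sum of squared residuals sum((y_i - yhat_i)^2)
(b0, b1), *_ = np.linalg.lstsq(X, y, rcond=None)
print(f"y = {b0:.2f} + {b1:.2f}x")  # fitted line, approximately y = 1.08 + 1.97x
```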
Logistic Regression
 Logistic Regression is a Classification algorithm used to estimate
the outcome of a categorical variable based on the
independent variables.
 It predicts the probability of an event occurring by fitting the data
to a logistic function.
 The coefficients of the independent variables in the logistic
function are optimized by maximizing the likelihood function.
 A decision boundary is optimized such that the cost function is
minimal.
 The cost function can be minimized by using Gradient Descent.
Logistic Regression
 Logistic regression is widely used for classification tasks.
 It is useful when the dependent variable is “binary” in nature.
 Logistic regression is usually associated with examining the
association of independent variables with one dichotomous
dependent variable.
 Linear regression is used when the dependent variable is
continuous, and the nature of the line of regression is linear.
Logistic Regression
 Logistic regression is widely used for classification tasks.
 The logistic function, or sigmoid function, plays a pivotal role
in this algorithm.
 It is defined as: σ(z) = 1 / (1 + e^(-z))
Where:
 z represents a linear combination of input features and their
corresponding coefficients.
 The logistic function maps the linear output to a value
between 0 and 1, allowing us to interpret it as a probability.
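The sigmoid can be sketched directly from the definition above:

```python
import math

def sigmoid(z):
    """Logistic (sigmoid) function: maps any real z to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

print(sigmoid(0))    # 0.5: z = 0 sits exactly on the decision boundary
print(sigmoid(4))    # ~0.982: large positive z gives a probability near 1
print(sigmoid(-4))   # ~0.018: large negative z gives a probability near 0
```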
Cost Functions
 Cost functions quantify the error or discrepancy between
the predicted values and the actual values.
1. Mean Squared Error (MSE) is a commonly used cost function
for regression problems, defined as:
MSE = (1 / N) * ∑(yᵢ - ŷᵢ)²
Where:
 N is the number of samples
 yᵢ is the actual value
 ŷᵢ is the predicted value
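A minimal sketch of MSE from the formula above; the actual and predicted values are invented for illustration:

```python
def mse(y_true, y_pred):
    """Mean Squared Error: average squared difference between actual and predicted."""
    n = len(y_true)
    return sum((yi - yhat) ** 2 for yi, yhat in zip(y_true, y_pred)) / n

# Hypothetical actual vs. predicted values
print(mse([3.0, 5.0, 7.0], [2.5, 5.0, 8.0]))  # (0.25 + 0 + 1) / 3, about 0.417
```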
Cost Functions
 For classification problems:
2. Cross-Entropy Loss is often employed, given by:
CE = -∑(yᵢ * log(pᵢ) + (1 - yᵢ) * log(1 - pᵢ))
Where:
 yᵢ is the actual label (0 or 1)
 pᵢ is the predicted probability
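Cross-entropy can be sketched the same way; the labels and probabilities below are invented for illustration:

```python
import math

def cross_entropy(y_true, p_pred):
    """Binary cross-entropy loss, summed over samples."""
    return -sum(y * math.log(p) + (1 - y) * math.log(1 - p)
                for y, p in zip(y_true, p_pred))

# Confident, correct predictions give a small loss ...
low = cross_entropy([1, 0], [0.9, 0.1])
# ... while confident, wrong predictions are penalized heavily
high = cross_entropy([1, 0], [0.1, 0.9])
print(low, high)
```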
Gradient Descent
 Gradient descent is an optimization algorithm used to
minimize the cost function and find the optimal values of the
coefficients.
 The update rule for gradient descent can be expressed as:
θ = θ - α * ∇J(θ)
where
 θ represents the coefficients
 α is the learning rate
 J(θ) is the cost function
 ∇J(θ) is the gradient of the cost function with respect to θ.
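The update rule can be sketched on a toy cost J(θ) = (θ - 3)², whose minimum is at θ = 3; the learning rate and iteration count are chosen for illustration:

```python
def grad_J(theta):
    """Gradient of the toy cost J(theta) = (theta - 3)^2, minimized at theta = 3."""
    return 2 * (theta - 3)

theta = 0.0   # initial coefficient
alpha = 0.1   # learning rate
for _ in range(100):
    theta = theta - alpha * grad_J(theta)   # theta = theta - alpha * grad J(theta)

print(theta)  # converges toward 3.0
```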
Gradient Descent Example
Train me to find the line of best fit (running for 500 iterations):
 y = x * 1.28 + 1.02, Cost: 123.53 (after 100 iterations)
 y = x * 1.39 + 1.03, Cost: 96.20
Gradient Descent
 Gradient Descent is a popular algorithm for solving AI problems.
 A simple Linear Regression Model can be used to demonstrate a
gradient descent.
 The goal of a linear regression is to fit a linear graph to a set of (x,y)
points.
 Though this can be solved with a mathematical formula, a machine
learning algorithm can also solve it.
 This is what the previous example does.
 It starts with a scatter plot and a linear model (y = wx + b).
 Then it trains the model to find a line that fits the plot.
 This is done by altering the weight (slope) and the bias (intercept)
of the line.
Machine Learning Algorithms
Perceptron
Perceptron
 A Perceptron is an artificial neuron and the building block of
an artificial neural network.
 It is inspired by the function of a biological neuron.
 It is the simplest possible neural network.
 The Perceptron is a linear machine learning algorithm used in
supervised learning to build binary classifiers.
 In machine learning and artificial intelligence, the Perceptron
is one of the most commonly used terms for this basic unit.
Perceptron - History
 1928 – 1971 - Frank Rosenblatt invented
a Perceptron program, on an IBM 704 computer at Cornell
Aeronautical Laboratory.
 Scientists had discovered that brain cells (Neurons) receive
input from our senses by electrical signals.
 The Neurons, then again, use electrical signals to store
information, and to make decisions based on previous input.
 Perceptrons are designed to simulate brain principles, with
the ability to learn and make decisions.
Perceptron - History

 It is a first step in learning Machine Learning and Deep
Learning technologies, and it consists of a set of weights,
input values or scores, and a threshold.
Perceptron Model

 The Perceptron, also understood as an artificial neuron or
neural network unit, helps detect certain computations on
input data, for example in business intelligence.
 In machine learning, a binary classifier is a function that
decides whether input data, represented as a vector of
numbers, belongs to some specific class.
 Binary classifiers can be considered as linear classifiers.
Perceptron – examples
Just imagine a perceptron in your brain helping you make a
decision: Will I go to the concert?

Question: Will the perceptron fire, given a threshold of 1.5?


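The concert decision can be sketched as a perceptron. The slide gives only the threshold (1.5); the yes/no inputs and their weights below are assumptions for illustration:

```python
# Hypothetical yes/no answers for "Will I go to the concert?"
# (1 = yes, 0 = no); weights reflect how much each factor matters.
inputs  = [1, 0, 1, 0, 1]           # e.g. artist good, weather good, friend coming, ...
weights = [0.7, 0.6, 0.5, 0.3, 0.4]

threshold = 1.5
weighted_sum = sum(x * w for x, w in zip(inputs, weights))

fires = weighted_sum > threshold    # the perceptron "fires" above the threshold
print(weighted_sum, fires)          # 1.6 True
```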
Basic Components of Perceptron

 The perceptron model contains three main components:


Perceptron Model
1. Input Nodes or Input Layer
 This is the primary component of Perceptron which accepts
the initial data into the system for further processing.
 Each input node contains a real numerical value.
Perceptron Model
2. Weight and Bias
 The weight parameter represents the strength of the connection
between units.
 Weight is directly proportional to the strength of the
associated input neuron in deciding the output.
 A higher value means that the input has a stronger influence
on the output.
 Bias can be considered as the intercept in a linear
equation.
Perceptron Model
2. Weight and Bias
Example on Bias: Sometimes, if both inputs are zero, the
perceptron might produce an incorrect output.
 To avoid this, we give the perceptron an extra input with the
value of 1.
 This is called a bias.
Perceptron Model
3. Activation Function:
 This component determines whether the neuron will fire or not.
 The activation function can be considered primarily as a step
function.
 The activation function is typically accompanied by
a Threshold Value.
 If the result of the activation function exceeds the threshold,
the perceptron fires (outputs 1), otherwise it remains inactive
(outputs 0).
Perceptron Model
3. Activation Function:

Note: The training function guesses the outcome based on the
activation function.
 Every time the guess is wrong, the perceptron should adjust
the weights.
 After many guesses and adjustments, the weights will be
correct.
Perceptron Model
3. Activation Function:
Backpropagation:
• After each guess, the perceptron calculates how wrong the
guess was.
• If the guess is wrong, the perceptron adjusts the bias and the
weights so that the guess will be a little bit more correct the
next time.
• This type of learning is called backpropagation.
• After trying (a few thousand times) the perceptron will
become quite good at guessing.
Perceptron Model
3. Activation Function:
 There are three types of Activation functions:
• Sign function
• Step function
• Sigmoid function
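The three activation functions can be sketched as follows; the threshold is assumed to be 0 for illustration:

```python
import math

def sign_fn(z):
    """Sign function: -1 for negative input, +1 otherwise."""
    return 1 if z >= 0 else -1

def step_fn(z):
    """Step function: 0 below the threshold (assumed 0 here), 1 at or above it."""
    return 1 if z >= 0 else 0

def sigmoid_fn(z):
    """Sigmoid: a smooth, differentiable alternative with outputs in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

for z in (-2.0, 0.0, 2.0):
    print(z, sign_fn(z), step_fn(z), round(sigmoid_fn(z), 3))
```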
Perceptron Model
How Perceptron works:
 In machine learning, the Perceptron is considered a single-layer
neural network that consists of four main parameters:
1. input values (input nodes)
2. weights and bias
3. net sum
4. an activation function
 As shown in the previous diagram, the Perceptron model works
in two important steps, as discussed next:
Perceptron Model
Step-1
 In the first step, multiply all input values with corresponding
weight values and then add them to determine the
weighted sum.
 Mathematically, we can calculate the weighted sum as
follows:
∑wi*xi = x1*w1 + x2*w2 + … + xn*wn
 Next, add a special term called bias 'b' to this weighted sum
to improve the model's performance.
∑wi*xi + b
Perceptron Model
Step-2
 The weighted sum is applied to the activation function 'f' to
obtain the desired output as follows:

Y = f(∑wi*xi + b)

 The output is either in binary form or a continuous value


Perceptron Algorithm - Frank Rosenblatt
 Frank Rosenblatt suggested the following algorithm:
Steps:
1. Set a threshold value
2. Multiply all inputs with its weights
3. Sum all the results
4. Activate the output
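Rosenblatt's four steps can be sketched as a single function; the inputs, weights, and bias values below are assumptions for illustration:

```python
def perceptron_output(inputs, weights, bias, threshold=0.0):
    """Rosenblatt's recipe: weight the inputs, sum them, then activate."""
    weighted_sum = sum(x * w for x, w in zip(inputs, weights)) + bias  # steps 2-3
    return 1 if weighted_sum > threshold else 0                       # step 4

# Hypothetical two-input example (these weights happen to realize an AND gate)
print(perceptron_output([1.0, 1.0], [0.6, 0.6], bias=-1.0))  # 0.2 > 0, so outputs 1
print(perceptron_output([1.0, 0.0], [0.6, 0.6], bias=-1.0))  # -0.4, so outputs 0
```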
Perceptron Learning
 The perceptron can learn from examples through a process
called training.
 The learning process presents the perceptron with labeled
examples, where the desired output is known.
 The perceptron compares its output with the desired output
and adjusts its weights accordingly, aiming to minimize the
error between the predicted and desired outputs.
 This is typically done using a learning algorithm such as the
perceptron learning rule or a backpropagation algorithm.
 The learning process allows the perceptron to learn the
weights that enable it to make accurate predictions for new,
unknown inputs.
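The learning process can be sketched with the perceptron learning rule on the AND function; the learning rate and number of passes are assumptions for illustration:

```python
# Labeled examples for the AND function: (inputs, desired output)
data = [([0, 0], 0), ([0, 1], 0), ([1, 0], 0), ([1, 1], 1)]

w = [0.0, 0.0]
b = 0.0
lr = 0.1  # learning rate

for _ in range(20):  # several passes over the labeled examples
    for x, desired in data:
        guess = 1 if x[0] * w[0] + x[1] * w[1] + b > 0 else 0
        error = desired - guess          # 0 when the guess is right
        w[0] += lr * error * x[0]        # adjust weights toward the desired output
        w[1] += lr * error * x[1]
        b    += lr * error

# After training, the perceptron classifies all four cases correctly
print([1 if x[0] * w[0] + x[1] * w[1] + b > 0 else 0 for x, _ in data])
```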
Types of Perceptron Models
 A decision cannot be made by one neuron alone, thus
other neurons provide more input
 Multi-Layer Perceptrons can be used for more sophisticated
decision making.
 Although perceptrons are limited to learning linearly
separable patterns, this is overcome by stacking multiple
perceptrons together in layers and incorporating non-linear
activation functions
 Neural networks can overcome this limitation and learn
more complex patterns
Types of Perceptron Models
 Perceptron models are divided into two types based on the
layers as follows:

1. Single-layer Perceptron Model


2. Multi-layer Perceptron model
Types of Perceptron Models
1. Single Layer Perceptron Model:
 The simplest type of artificial neural network (ANN).
 Consists of a feed-forward network and includes a threshold
transfer function inside the model.
 The main objective of the single-layer perceptron model is to
analyze linearly separable objects with binary outcomes.
 In a single-layer perceptron model, the algorithm has no prior
recorded data, so it begins with randomly allocated initial
values for the weight parameters.
Types of Perceptron Models
1. Single Layer Perceptron Model:
Take note:
 This model can produce some discrepancies when multiple
input values are fed into it.
 Hence, to reach the desired output and minimize errors, some
changes to the input weights are necessary.
 Single-layer perceptron can learn only linearly separable
patterns.
Types of Perceptron Models
2. Multi-Layered Perceptron Model:
 A multi-layered perceptron model consists of multiple layers of
artificial neurons in which, unlike in a single-layer
perceptron model, the activation function does not remain
linear.
 A multi-layer perceptron model has greater processing
power and can process linear and non-linear patterns.
 Further, it can also implement logic gates such as AND, OR,
XOR, NAND, NOT, XNOR, and NOR.
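The XOR gate, which a single perceptron cannot learn, can be illustrated with a tiny multi-layer perceptron whose weights are hand-picked for illustration rather than learned:

```python
def step(z):
    """Step activation: fires (1) only for strictly positive input."""
    return 1 if z > 0 else 0

def xor_mlp(a, b):
    """A tiny multi-layer perceptron computing XOR with fixed, hand-picked weights.
    Hidden layer: one OR-like unit and one AND-like unit; the output combines them."""
    h_or  = step(a + b - 0.5)            # fires if at least one input is 1
    h_and = step(a + b - 1.5)            # fires only if both inputs are 1
    return step(h_or - 2 * h_and - 0.5)  # "OR but not AND" is exactly XOR

print([xor_mlp(a, b) for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]])  # [0, 1, 1, 0]
```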
Types of Perceptron Models
2. Multi-Layered Perceptron Model:
 Consists of at least three layers: an input layer, one or
more hidden layers, and an output layer
Types of Perceptron Models
2. Multi-Layered Perceptron Model:
 Like a single-layer perceptron model, a multi-layer
perceptron model also has the same model structure but
has a greater number of hidden layers.
 The multi-layer perceptron model is commonly trained with the
Backpropagation algorithm, which executes in two stages
as follows:
a. Forward Stage: Activation functions start from the input
layer in the forward stage and terminate at the output
layer.
Types of Perceptron Models
2. Multi-Layered Perceptron Model:
b. Backward Stage:
 In the backward stage, weight and bias values are modified
as per the model's requirement.
 In this stage, the error between the actual and desired
output propagates backward, starting at the output layer and
ending at the input layer.
Types of Perceptron Models
Advantages of Multi-Layer Perceptron:
1. A multi-layered perceptron model can be used to solve
complex non-linear problems.
2. It works well with both small and large input data.
3. It helps us obtain quick predictions after training.
4. It helps obtain a similar accuracy ratio with large as well
as small data.
Types of Perceptron Models
Disadvantages of Multi-Layer Perceptron:
1. In Multi-layer perceptron, computations are difficult and
time-consuming.
2. In a multi-layer perceptron, it is difficult to predict how much
each independent variable affects the dependent
variable.
Neural Networks
 Perceptrons are often used as the building blocks for more
complex neural networks, such as multi-layer perceptrons
(MLPs) or deep neural networks (DNNs).
 By combining multiple perceptrons in layers and connecting
them in a network structure, these models can learn and
represent complex patterns and relationships in data,
enabling tasks such as image recognition, natural language
processing, and decision making.