4. Learning Algorithms
Artificial Intelligence (AI)
Motivation
Real-world example:
• Fish packing plant: separate sea bass from salmon using optical sensing.
• Features: physical differences such as length, lightness, width, number and shape of fins, position of the mouth.
• Noise: variations in lighting, position of the fish on the conveyor, "static" due to the electronics of the camera itself.
Motivation
The two features of lightness and width for sea bass and salmon
Error is the difference between a single actual value and a single predicted value.
Loss
Mean absolute loss – Python code:

import numpy as np

def mae(predictions, targets):
    # Mean absolute error between predicted and target values
    differences = predictions - targets
    absolute_differences = np.absolute(differences)
    mean_absolute_differences = absolute_differences.mean()
    return mean_absolute_differences
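A quick usage check (the numbers are made up for illustration); this computes $\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n} |y_i - \hat{y}_i|$:

predictions = np.array([2.0, 3.5, 4.0])
targets = np.array([2.5, 3.0, 5.0])
print(mae(predictions, targets))   # 0.666..., the average absolute difference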
Learning algorithms
Learning algorithms
2000s: Kernel Methods and Probabilistic Models
2001: AdaBoost – An adaptive boosting method developed by Yoav Freund and Robert Schapire.
2006: NVIDIA releases CUDA.
2009: Andrew Ng utilizes GPUs to accelerate the training of large neural networks.
2010s: Deep Learning Revolution
2012: AlexNet (Krizhevsky et al.) – A deep convolutional neural network that won the ImageNet competition, leading to breakthroughs in computer vision.
2014: Generative Adversarial Networks (GANs) (Ian Goodfellow et al.) – Introduced a new framework for generating synthetic data through adversarial learning.
2017: Transformers (Vaswani et al.) – Revolutionized natural language processing (NLP) by eliminating the need for recurrent neural networks.
2020s: Scalable AI and Further Innovations
2020: GPT-3 (OpenAI) – A large-scale transformer-based model demonstrating significant progress in language understanding and generation.
Random forest
Adaboost
Weak learners: decision stumps or decision trees
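A minimal usage sketch with scikit-learn (the library choice is an assumption; the slide does not prescribe one). A decision stump is a depth-1 decision tree, which is also AdaBoostClassifier's default weak learner:

from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.tree import DecisionTreeClassifier

# Weak learner: a decision stump (a decision tree of depth 1)
stump = DecisionTreeClassifier(max_depth=1)
# In scikit-learn versions before 1.2 this parameter is named base_estimator
model = AdaBoostClassifier(estimator=stump, n_estimators=50)

X, y = make_classification(n_samples=200, random_state=0)  # toy two-class data
model.fit(X, y)
print(model.score(X, y))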
Adaboost
Adaboost
Weak learners for image recognition
Haar filters
Common Haar features: 160,000+ possible features associated with each 24 × 24 window
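A minimal sketch of evaluating one Haar-like feature via an integral image, the standard trick that makes these rectangle features cheap to compute (the specific two-rectangle layout and the random patch are assumptions for illustration):

import numpy as np

def integral_image(img):
    # Cumulative sums over rows and columns; any rectangle sum then costs O(1)
    return img.cumsum(axis=0).cumsum(axis=1)

def rect_sum(ii, r, c, h, w):
    # Sum of pixels in the rectangle with top-left corner (r, c), height h, width w
    total = ii[r + h - 1, c + w - 1]
    if r > 0:
        total -= ii[r - 1, c + w - 1]
    if c > 0:
        total -= ii[r + h - 1, c - 1]
    if r > 0 and c > 0:
        total += ii[r - 1, c - 1]
    return total

# Two-rectangle (edge-type) Haar feature inside a 24 x 24 window:
# sum of the left half minus sum of the right half
window = np.random.rand(24, 24)          # stand-in for a grayscale image patch
ii = integral_image(window)
feature = rect_sum(ii, 0, 0, 24, 12) - rect_sum(ii, 0, 12, 24, 12)
print(feature)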
Cascade filter
Cascade filter
Prepare data
A proportion of 2:1 or higher between negative and positive samples is considered acceptable.
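Once a cascade has been trained on such positive and negative samples, it is applied window by window; a minimal detection sketch using OpenCV's bundled pre-trained face cascade (the image file name is hypothetical):

import cv2

# Load a pre-trained Haar cascade shipped with OpenCV
cascade = cv2.CascadeClassifier(cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

img = cv2.imread("photo.jpg")                      # hypothetical input image
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# Windows rejected by an early stage of the cascade are discarded immediately;
# only windows that pass every stage are reported as detections
faces = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
print(faces)                                       # (x, y, w, h) boxes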
Cascade filter
Biological Neurons
Biological Neurons
Model of a neuron
◉ Simplest model
inputs → System → outputs
Model of a neuron
Input x → System → outputs
Relationship: $y = \sum_{i=1}^{m} x_i$
But:
• The neuron only fires when it is sufficiently excited
• Firing rate has an upper bound
Model of a neuron
◉ Modified model:
○ b: threshold (bias) → the neuron will not fire until its input is "high" enough.
Model of a neuron
$u_k = \sum_{i=1}^{m} w_i x_i, \qquad v_k = u_k + b_k, \qquad y_k = \varphi(u_k + b_k)$
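A minimal NumPy sketch of this neuron model (the logistic choice for φ is an assumption; any of the activation functions on the following slides could be substituted):

import numpy as np

def neuron(x, w, b):
    u = np.dot(w, x)                 # u_k = sum_i w_i * x_i (weighted sum of the inputs)
    v = u + b                        # v_k = u_k + b_k (add the bias)
    return 1.0 / (1.0 + np.exp(-v))  # y_k = phi(v_k), here a logistic sigmoid

x = np.array([0.5, -1.0, 2.0])       # example inputs (hypothetical values)
w = np.array([0.2, 0.4, 0.1])        # example weights
print(neuron(x, w, b=-0.3))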
Model of a Neuron
Types of Activation (Squash) Functions
◉ Threshold function
(McCulloch-Pitts model -
1943)
◉ Piecewise-linear function
Types of Activation (Squash) Functions
◉ Logistic function:
◉ Hyperbolic tangent
function:
Types of Activation (Squash) Functions
◉ Gaussian functions
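For reference, a NumPy sketch of these activation functions (the slope a, the width σ, and the clipping range of the piecewise-linear form are illustrative defaults, not values taken from the slides):

import numpy as np

def threshold(v):
    # McCulloch-Pitts threshold: 1 when v >= 0, else 0
    return np.where(v >= 0, 1.0, 0.0)

def piecewise_linear(v):
    # One common form: linear on (-0.5, 0.5), clipped to [0, 1] outside
    return np.clip(v + 0.5, 0.0, 1.0)

def logistic(v, a=1.0):
    # Logistic (sigmoid) function with slope parameter a
    return 1.0 / (1.0 + np.exp(-a * v))

def tanh_act(v):
    # Hyperbolic tangent, outputs in (-1, 1)
    return np.tanh(v)

def gaussian(v, sigma=1.0):
    # Gaussian (radial-basis) activation
    return np.exp(-v**2 / (2.0 * sigma**2))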
Network Architectures
Learning in neural networks
Simplest neural network: Perceptron
Perceptron
◉ Goal: To correctly classify the set of externally applied stimuli x1, x2, …, xn into one of two classes, C1 and C2.
Decision boundary
◉ m = 1: ?
◉ m = 2: ?
◉ m = 3: ?
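For reference, the decision boundary is the set of inputs where the induced local field is zero (standard perceptron geometry rather than slide-specific notation):

$\sum_{i=1}^{m} w_i x_i + b = 0$

• m = 1: a single point on the x1 axis
• m = 2: a straight line in the (x1, x2) plane
• m = 3: a plane (a hyperplane in general)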
Selection of weights
Off-line calculation of weights
Example
Truth table of NAND
Three points (0,0), (0,1) and (1,0) belong to one class, and (1,1) belongs to the other class.
w = (1.5, −1, −1)
Is the decision line unique for this problem?
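A quick check of the given weights in Python (assuming the slide's convention that w = (w0, w1, w2), with the first component multiplying a constant input of 1, i.e. the bias term):

import numpy as np

w = np.array([1.5, -1.0, -1.0])            # (bias weight, w1, w2) from the slide

for x1, x2 in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    x = np.array([1.0, x1, x2])            # augmented input: the leading 1 multiplies the bias weight
    v = np.dot(w, x)                       # induced local field
    y = 1 if v > 0 else 0                  # threshold activation
    print((x1, x2), "->", y)               # prints 1, 1, 1, 0: the NAND truth table

Any line that strictly separates (1,1) from the other three points also works, so the decision line is not unique.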
Perceptron Learning
Perceptron Learning
◉ If the correct label is d = 0 (all the labels of the training samples are known), should we update the weights?
◉ If the desired output is d = 1, assume the new weight vector is w′; then we have:
Perceptron Learning
Perceptron Learning
Perceptron Learning
Perceptron Learning
◉ Algorithm Perceptron
    Start with a randomly chosen weight vector w(1);
    while there exist input vectors that are misclassified by w(n) do
        Let x(n) be a misclassified input vector;
        Update the weight vector to obtain w(n + 1);
        Increment n;
    end-while
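A minimal Python sketch of this loop (the update rule w ← w + η(d − y)x is one standard form of the perceptron rule; the learning rate and the reuse of the NAND data are assumptions for illustration):

import numpy as np

def train_perceptron(X, d, eta=1.0, max_epochs=100):
    # X: one input vector per row, each with a leading 1 for the bias; d: desired labels in {0, 1}
    w = np.random.randn(X.shape[1])              # start with a randomly chosen weight vector w(1)
    for _ in range(max_epochs):
        misclassified = 0
        for x, target in zip(X, d):
            y = 1 if np.dot(w, x) > 0 else 0     # current output for x(n)
            if y != target:                      # x(n) is misclassified by w(n)
                w = w + eta * (target - y) * x   # update the weight vector
                misclassified += 1
        if misclassified == 0:                   # no misclassified input vectors remain
            break
    return w

# Example: learn the NAND function from the earlier slide
X = np.array([[1, 0, 0], [1, 0, 1], [1, 1, 0], [1, 1, 1]], dtype=float)
d = np.array([1, 1, 1, 0])
print(train_perceptron(X, d))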
Perceptron Learning
◉ Solution:
Perceptron Learning
Perceptron Convergence Theorem
Multilayer Perceptrons
◉ Multilayer perceptrons (MLPs)
○ Generalization of the single-layer perceptron
◉ Consists of
○ An input layer
○ One or more hidden layers of computation nodes
○ An output layer of computation nodes
◉ Architectural graph of a multilayer perceptron with two hidden layers:
Backpropagation
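For reference, a minimal NumPy sketch of one backpropagation step for a one-hidden-layer MLP with logistic units and squared error (the layer sizes, learning rate, and omission of bias terms are simplifying assumptions for illustration):

import numpy as np

def sigmoid(v):
    return 1.0 / (1.0 + np.exp(-v))

def backprop_step(x, d, W1, W2, eta=0.1):
    # Forward pass through one hidden layer
    h = sigmoid(W1 @ x)                            # hidden activations
    y = sigmoid(W2 @ h)                            # network outputs
    # Backward pass: local gradients (deltas) for the squared error 1/2 * ||d - y||^2
    delta_out = (y - d) * y * (1 - y)              # output-layer delta
    delta_hid = (W2.T @ delta_out) * h * (1 - h)   # error propagated back to the hidden layer
    # Gradient-descent weight updates
    W2 -= eta * np.outer(delta_out, h)
    W1 -= eta * np.outer(delta_hid, x)
    return W1, W2

# Hypothetical sizes: 3 inputs, 4 hidden units, 1 output
rng = np.random.default_rng(0)
W1 = rng.normal(size=(4, 3))
W2 = rng.normal(size=(1, 4))
x = np.array([0.5, -1.0, 2.0])
d = np.array([1.0])
W1, W2 = backprop_step(x, d, W1, W2)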
Data Augmentation
Data augmentation is a technique used to create new artificial data from already existing datasets.
Motivation
Underfitting
The model fails to capture the underlying pattern: it performs poorly on both the training data and the testing data.
Reasons:
• Low variation of the data and a highly biased model.
• The model developed can't handle complex data.
• Small size of the training dataset.
• Training data of poor quality, containing noise.

Overfitting
The model works well with the training data but performs poorly with the testing data: when trained with lots of data, it starts to pick up noise and incorrect data entries.
Reasons:
• High variation of the data and low bias.
• The model created is too complex and advanced.
• The size of the training data is high.
Data Augmentation
Data augmentation methods
Geometric Transformation
• Flipping
• Cropping
• Rotating
• Zooming
Color Transformation
• Brightness
• Darkness
• Sharpness
• Saturation
• Color Augmentation
AI Generative
• Generative Adversarial Networks
• Variational Auto-Encoders
• Neural Style Transfer
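A minimal sketch of a geometric plus color augmentation pipeline using torchvision (an assumed library choice; the parameter values are illustrative, not from the slides):

from torchvision import transforms

augment = transforms.Compose([
    transforms.RandomHorizontalFlip(p=0.5),       # geometric: flipping
    transforms.RandomRotation(degrees=15),        # geometric: rotating
    transforms.RandomResizedCrop(size=224),       # geometric: cropping / zooming
    transforms.ColorJitter(brightness=0.2,        # color: brightness / darkness
                           saturation=0.2),       # color: saturation
])

# new_sample = augment(pil_image)   # apply to a PIL image to produce a new training sample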
Benefits of Neural Networks
Limitations of Neural Networks