Module 3
UNSUPERVISED NETWORKS
CONTENTS
• Unsupervised Networks
• An Example:
• Suppose we have a set of students
• Let us classify them on the basis of their performance
• Each student's score is calculated
UNSUPERVISED LEARNING NETWORKS
CONTD…
• The one whose score is higher than all others will be the
winner
• The same principle is followed for pattern classification in
neural networks
• These nets are called competitive nets
• The extreme form of these nets is called winner-take-all
• In such a case, only one neuron in the competing group will
possess a non-zero output signal at the end of the
competition
UNSUPERVISED LEARNING NETWORKS
CONTD…
• Several ANNs exist under this category:
1. Maxnet
2. Mexican hat
3. Hamming net
4. Kohonen self-organizing feature map
5. Counter propagation net
6. Learning Vector Quantization (LVQ)
7. Adaptive Resonance Theory (ART)
UNSUPERVISED LEARNING NETWORKS
CONTD…
• In the case of these ANNs, the net seeks to find patterns or regularity in the input data by forming clusters
• ART networks are called clustering nets
• In such nets there are as many input units as the input vector has components
• Since each output unit represents a cluster, the number of output units limits the number of clusters that can be formed
• The learning algorithm used in most of these nets is known as Kohonen learning
KOHONEN LEARNING
• In Kohonen (winner-take-all) learning, only the winning cluster unit J updates its weights:
𝑤𝐽(𝑛𝑒𝑤) = 𝑤𝐽(𝑜𝑙𝑑) + 𝛼[𝑥 − 𝑤𝐽(𝑜𝑙𝑑)]
• (b) For the input vector (0.6, 0.6) with learning rate 0.1, find the winning cluster unit and its new weights.
EXAMPLE CONTD…
[Figure: Kohonen self-organizing map with input units 𝑋1, 𝑋2 (signals 𝑥1, 𝑥2) connected to cluster units 𝑌1–𝑌5; the connection weights are shown on the diagram.]
EXAMPLE CONTD…
(a) For the input vector (𝑥1, 𝑥2) = (0.2, 0.4) and learning rate 𝛼 = 0.2, assume the initial weight matrix W (chosen arbitrarily):
W =
0.2  0.9
0.4  0.7
0.6  0.5
0.8  0.3
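As an illustration of how the winning cluster unit is found and updated, here is a minimal Python sketch of one Kohonen learning step. It reuses the 4×2 weight matrix above, interpreted with rows as input units and columns as cluster units; the binary input vector [0 0 1 1] and the learning rate 0.5 are assumptions made only for this sketch, not values taken from the example.

```python
import numpy as np

# Weight matrix from the example above: rows = input units, columns = cluster units
W = np.array([[0.2, 0.9],
              [0.4, 0.7],
              [0.6, 0.5],
              [0.8, 0.3]])

x = np.array([0.0, 0.0, 1.0, 1.0])   # assumed input vector (4 components)
alpha = 0.5                          # assumed learning rate

# Squared Euclidean distance D(j) = sum_i (x_i - w_ij)^2 for each cluster unit j
D = ((x[:, None] - W) ** 2).sum(axis=0)
J = int(np.argmin(D))                # winning cluster unit (smallest distance)

# Kohonen (winner-take-all) update: only the winner's weights move toward x
W[:, J] = W[:, J] + alpha * (x - W[:, J])

print("D(j) =", D, "winner J =", J + 1)
print("updated weights of the winner:", W[:, J])
```

With these assumed values, D(1) = 0.40 and D(2) = 2.04, so 𝑌1 wins and its weights move halfway toward x, becoming (0.1, 0.2, 0.8, 0.9).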
LEARNING VECTOR QUANTIZATION
(LVQ)
LVQ
• It has n input and m output units. The weight from the ith input unit to the jth output unit is given by wij
• Each output unit is associated with a class/cluster/category
• x: training vector (x1, x2, …, xn)
• T: category or class of the training vector x
• wj: weight vector for the jth output unit (w1j, …, wij, …, wnj)
• cj: cluster or class or category associated with the jth output unit
• The (squared) Euclidean distance of the jth output unit is
D(j) = Σ_{i=1}^{n} (x_i − w_{ij})²
TRAINING ALGORITHM
• For each training vector x with class T, compute D(j) for every output unit and find the winner J (minimum distance). If T = cJ, update 𝑤𝐽(𝑛𝑒𝑤) = 𝑤𝐽(𝑜𝑙𝑑) + 𝛼[𝑥 − 𝑤𝐽(𝑜𝑙𝑑)]; if T ≠ cJ, update 𝑤𝐽(𝑛𝑒𝑤) = 𝑤𝐽(𝑜𝑙𝑑) − 𝛼[𝑥 − 𝑤𝐽(𝑜𝑙𝑑)]
Weight matrix after the first input pattern:
0     1
0     0
1.1   0
1     0
Second input pattern [1 1 0 0], T = 1; calculate the Euclidean distance.
D(2) is minimum, so J = 2. Since 𝑡 ≠ 𝐽, the winning unit's weights move away from the input:
0     1
0    −0.1
1.1   0
1     0
Third input pattern [0 1 1 0], T = 1; calculate the Euclidean distance.
D(1) is minimum, so J = 1. Since t = J, the winning unit's weights move toward the input:
0     1
0.1  −0.1
1.09  0
0.9   0
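The same walk-through can be written as a short Python sketch. The initial reference vectors (0, 0, 1, 1) for class 1 and (1, 0, 0, 0) for class 2, the first training pair, and α = 0.1 are assumptions inferred so that the updates reproduce the matrices shown above; the two-branch rule itself is the standard LVQ update.

```python
import numpy as np

# Assumed initial reference (codebook) vectors, one column per output unit,
# chosen to be consistent with the weight matrices shown above.
W = np.array([[0.0, 1.0],
              [0.0, 0.0],
              [1.0, 0.0],
              [1.0, 0.0]])
classes = [1, 2]        # class c_j associated with each output unit
alpha = 0.1             # assumed learning rate

# Assumed training pairs (input vector, target class T)
training = [
    (np.array([0, 0, 0, 1]), 2),
    (np.array([1, 1, 0, 0]), 1),
    (np.array([0, 1, 1, 0]), 1),
]

for x, T in training:
    D = ((x[:, None] - W) ** 2).sum(axis=0)   # D(j) = sum_i (x_i - w_ij)^2
    J = int(np.argmin(D))                      # winning output unit
    if T == classes[J]:
        W[:, J] += alpha * (x - W[:, J])       # correct class: move toward x
    else:
        W[:, J] -= alpha * (x - W[:, J])       # wrong class: move away from x
    print(f"x={x.tolist()}, T={T}, winner J={J + 1}, D={D.round(3).tolist()}")

print(W)
```

Running the loop reproduces the three weight matrices above, ending with w1 = (0, 0.1, 1.09, 0.9) and w2 = (1, −0.1, 0, 0).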
AN EXAMPLE
• Part (a):
• For the given input vector (𝑥1, 𝑥2) = (0.25, 0.25) with 𝛼 = 0.25 and t = 1, we compute the square of the Euclidean distance as follows:
• D(j) = (𝑤1𝑗 − 𝑥1)² + (𝑤2𝑗 − 𝑥2)²
SOLUTION CONTD…
• For j = 1 to 4
• D(1) = 0.005, D(2) = 0.125, D(3) = 0.145 and D(4) = 0.425
• As D(1) is minimum, the winner index J = 1. Also t =1
• So, we use the formula
𝑤𝐽(𝑛𝑒𝑤) = 𝑤𝐽(𝑜𝑙𝑑) + 𝛼[𝑥 − 𝑤𝐽(𝑜𝑙𝑑)]
• The updated weights on the winner unit are
𝑤11(𝑛𝑒𝑤) = 𝑤11(𝑜𝑙𝑑) + 𝛼[𝑥1 − 𝑤11(𝑜𝑙𝑑)] = 0.2 + 0.25(0.25 − 0.2) = 0.2125
𝑤21(𝑛𝑒𝑤) = 𝑤21(𝑜𝑙𝑑) + 𝛼[𝑥2 − 𝑤21(𝑜𝑙𝑑)] = 0.2 + 0.25(0.25 − 0.2) = 0.2125
SOLUTION CONTD…
• The fourth unit is the winner unit, being the closest to the input vector
• Since 𝑡 ≠ 𝐽, the weight update formula to be used is:
• 𝑤𝐽(𝑛𝑒𝑤) = 𝑤𝐽(𝑜𝑙𝑑) − 𝛼[𝑥 − 𝑤𝐽(𝑜𝑙𝑑)]
• Updating the weights on the winner unit, we obtain the new weights
ADAPTIVE RESONANCE THEORY (ART)
• In ART, when the input pattern and a cluster unit's stored prototype match closely enough, the weights are adapted and the network is said to be in a resonant state.
Key Innovation
The key innovation of ART is the use of “expectations.”
– As each input is presented to the network, it is compared with the prototype vector that it most closely matches (the expectation).
– If the match between the prototype and the input vector is NOT satisfactory, a new prototype is selected. In this way, previously learned memories (prototypes) are not destroyed by new learning.
ART NETWORK
There are two versions of it: ART1 and ART2
– ART1 was developed for clustering binary vectors
– ART2 was developed to accept continuous valued vectors
• For each pattern presented to the network, an appropriate cluster unit is chosen and the weights of the cluster unit are adjusted to let that cluster unit learn the pattern
• The network controls the degree of similarity of the patterns placed on the same cluster units
• During training:
• Each training pattern may be presented several times
• A pattern should not be placed on different cluster units on its different presentations
Stability-Plasticity Dilemma (SPD)
• The net must remain plastic enough to learn new patterns, yet stable enough that previously learned patterns are not disturbed; ART is designed to resolve this dilemma
FUNDAMENTAL ARCHITECTURE
• The bottom-up weights are used for the connections from the F1(b) layer to the F2 layer and are represented by 𝑏𝑖𝑗 (ith F1 unit to the jth F2 unit)
• The top-down weights are used for the connections from F2 layer
to F1(b) layer and are represented by 𝑡𝑗𝑖 (jth F2 unit to ith F1 unit)
• The competitive layer in this case is the cluster layer, and the cluster unit with the largest net input becomes the candidate to learn the input pattern
• The interface units combine the data from the input and cluster layer units
• On the basis of the similarity between the top-down weight vector and the input vector, the cluster unit may be allowed to learn the input pattern
• This decision is made by the reset mechanism unit on the basis of the signals it receives from the interface portion and the input portion of the F1 layer
• When a cluster unit is not allowed to learn, it is inhibited and a new cluster unit is selected as the candidate
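To make this reset/vigilance decision concrete, below is a minimal Python sketch of ART1-style candidate selection for a single binary input. The bottom-up weights, top-down prototypes, vigilance value ρ and the input vector are all illustrative assumptions; the test applied, the fraction of the input preserved by the candidate's top-down prototype compared against ρ, is the standard ART1 similarity check.

```python
import numpy as np

rho = 0.7                                  # assumed vigilance parameter
x = np.array([1, 0, 1, 1])                 # assumed binary input vector (F1 layer)

# Assumed bottom-up weights b[i, j] (F1 -> F2) and top-down weights t[j, i] (F2 -> F1)
b = np.array([[0.5, 0.2],
              [0.1, 0.6],
              [0.5, 0.2],
              [0.4, 0.3]])
t = np.array([[1, 0, 1, 0],                # prototype stored by cluster unit 1
              [0, 1, 0, 1]])               # prototype stored by cluster unit 2

inhibited = set()
while True:
    # Net input to each F2 (cluster) unit; inhibited units cannot compete
    y = x @ b
    y[list(inhibited)] = -np.inf
    J = int(np.argmax(y))                  # candidate cluster unit

    # Vigilance test: how much of x survives the candidate's top-down prototype?
    match = np.logical_and(x, t[J]).sum() / x.sum()
    if match >= rho:
        print(f"unit {J + 1} resonates (match = {match:.2f}) and learns the pattern")
        break
    inhibited.add(J)                       # reset: inhibit this unit, try the next one
    if len(inhibited) == b.shape[1]:
        print("no existing cluster matches; a new cluster unit would be created")
        break
```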
FUNDAMENTAL OPERATING PRINCIPLE
• ART accepts an input vector and classifies it into the cluster whose stored prototype it most resembles; if no existing prototype matches within the vigilance limit, a new cluster unit is committed to it
DEEP LEARNING AND RECURRENT NEURAL NETWORKS (RNN)
• The concept of deep learning is not new. It has been around for a couple of years now. It is hyped nowadays because earlier we did not have enough data and the computational power needed to train such networks.
• Similarly, h(1) from the previous step serves as the input, along with X(2), for the next step, and so on.
h_t = tanh(Whh · h_{t−1} + Whx · x_t)
where W is the weight, h is the single hidden vector, Whh is the weight at the previous hidden state, Whx is the weight at the current input state, and tanh is the activation function that implements a non-linearity, squashing the activations to the range [−1, 1].
Output: y_t = Why · h_t
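A minimal NumPy sketch of the forward pass of a vanilla RNN cell using the equations above; the layer sizes, the random weights and the input sequence are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
input_size, hidden_size, output_size = 3, 4, 2   # assumed sizes for illustration

# Weight matrices of a vanilla RNN cell (randomly initialised here)
W_hh = rng.standard_normal((hidden_size, hidden_size)) * 0.1  # previous hidden -> hidden
W_hx = rng.standard_normal((hidden_size, input_size)) * 0.1   # current input   -> hidden
W_hy = rng.standard_normal((output_size, hidden_size)) * 0.1  # hidden          -> output

h = np.zeros(hidden_size)                                     # initial hidden state h(0)
xs = [rng.standard_normal(input_size) for _ in range(5)]      # a short input sequence

for t, x in enumerate(xs, start=1):
    # h_t = tanh(Whh . h_{t-1} + Whx . x_t): tanh squashes activations to [-1, 1]
    h = np.tanh(W_hh @ h + W_hx @ x)
    y = W_hy @ h                                              # output at step t
    print(f"step {t}: h in [{h.min():.2f}, {h.max():.2f}], y = {y.round(3)}")
```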
• By the chain rule, if any one of the gradients approaches 0, all the gradients rush to zero exponentially fast due to the multiplication. Such states no longer help the network learn anything. This is known as the vanishing gradient problem.
• Other RNN architectures, like the LSTM (Long Short-Term Memory) and the GRU (Gated Recurrent Unit), can be used to deal with the vanishing gradient problem.
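A quick numerical sketch of why the chain-rule product vanishes: multiplying many per-step gradient factors whose magnitude is below 1 (as the derivative of tanh typically is) drives the overall gradient toward zero. The factor 0.5 below is an arbitrary illustration.

```python
# Illustrative only: repeated multiplication of per-step gradient factors < 1
factor = 0.5          # assumed magnitude of a single step's gradient factor
gradient = 1.0
for step in range(1, 31):
    gradient *= factor
    if step % 10 == 0:
        print(f"after {step} steps the gradient has shrunk to {gradient:.2e}")
# after 30 steps the product is about 9.3e-10 -- effectively zero for learning
```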