Week 3
Single layer networks cannot be used to solve linearly inseparable problems; they can only be
used to solve linearly separable problems
Single layer networks cannot solve complex problems
Single layer networks cannot be used when a large input-output data set is available
Single layer networks cannot capture the complex information available in the training pairs
Any neural network that has at least one layer in between the input and output layers is
called a Multi-Layer Network
Layers present in between the input and output layers are called Hidden Layers
Input layer neural units just collect the inputs and forward them to the next higher layer
Hidden layer and output layer neural units process the information fed to them and
produce an appropriate output
Multi-layer networks provide optimal solutions for arbitrary classification problems
Multi-layer networks use linear discriminants, where the inputs are nonlinear
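As an illustration of the points above, the short sketch below (not from the original notes) uses hand-chosen, hypothetical weights for two hidden units to reproduce XOR, a linearly inseparable function that no single-layer network can represent.

```python
import numpy as np

step = lambda a: (a > 0).astype(int)           # hard-limit activation

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])

# Hidden unit 1 behaves like OR, hidden unit 2 like AND (hand-chosen weights)
W_hidden = np.array([[1.0, 1.0],
                     [1.0, 1.0]])              # column j = weights into hidden unit j
b_hidden = np.array([-0.5, -1.5])

# Output unit computes OR AND (NOT AND), i.e. XOR
w_out = np.array([1.0, -1.0])
b_out = -0.5

Z = step(X @ W_hidden + b_hidden)              # hidden layer activations
y = step(Z @ w_out + b_out)                    # output layer activation
print(y)                                       # [0 1 1 0] -> XOR reproduced with one hidden layer
```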
2.3 BACK PROPAGATION NETWORKS (BPN)
2.3.1 BPN Algorithm
The BPN algorithm is classified into the following major steps:
Algorithm
I. Initialization of weights
Step 1: Initialize the weights to small random values near zero
Step 2: While the stop condition is false, do steps 3 to 10
Step 3: For each training pair, do steps 4 to 9
Step 4: Each input xi is received and forwarded to the next (hidden) layer
Step 5: Each hidden unit sums its weighted inputs as follows
Zinj = W0j + Σi xi Wij
Applying the activation function
Zj = f(Zinj)
This value is passed to the output layer
Step 6: Each output unit sums its weighted inputs
Yink = V0k + Σj Zj Vjk
Applying the activation function
Yk = f(Yink)
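A minimal NumPy sketch of the feedforward pass in Steps 4 to 6 is given below. The sigmoid activation, the random initialization and the array shapes (W of size inputs x hidden, V of size hidden x outputs) are illustrative assumptions, not taken from the text.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))              # a common choice for the activation f

def feedforward(x, W, w0, V, v0):
    z_in = w0 + x @ W                            # Step 5: Zinj = W0j + sum_i xi*Wij
    z = sigmoid(z_in)                            #         Zj   = f(Zinj)
    y_in = v0 + z @ V                            # Step 6: Yink = V0k + sum_j Zj*Vjk
    y = sigmoid(y_in)                            #         Yk   = f(Yink)
    return z, y

# Step 1: small random weights near zero
rng = np.random.default_rng(0)
n_in, n_hidden, n_out = 3, 4, 2
W = rng.normal(0.0, 0.1, (n_in, n_hidden))       # input -> hidden weights
V = rng.normal(0.0, 0.1, (n_hidden, n_out))      # hidden -> output weights
w0, v0 = np.zeros(n_hidden), np.zeros(n_out)     # biases
z, y = feedforward(np.array([1.0, 0.0, 1.0]), W, w0, V, v0)
```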
V. Updating of Weights & Biases
Step 9 (continued):
The new weights are
Wij(new) = Wij(old) + ΔWij
Vjk(new) = Vjk(old) + ΔVjk
The new biases are
W0j(new) = W0j(old) + ΔW0j
V0k(new) = V0k(old) + ΔV0k
Step 10: Test for Stop Condition
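A sketch of the Step 9 updates, continuing the feedforward sketch above. The error terms delta_k and delta_j belong to the back-propagation-of-error steps that are not listed here, so their standard gradient-descent form (for a sigmoid activation) and the learning rate alpha are assumptions.

```python
import numpy as np

def backprop_update(x, t, z, y, W, w0, V, v0, alpha=0.1):
    # Error terms from the back-propagation-of-error steps (standard sigmoid form, assumed)
    delta_k = (t - y) * y * (1 - y)              # output-layer error term
    delta_j = (delta_k @ V.T) * z * (1 - z)      # hidden-layer error term

    # Step 9: new weight = old weight + delta weight (likewise for the biases)
    V  = V  + alpha * np.outer(z, delta_k)       # Vjk(new) = Vjk(old) + dVjk
    v0 = v0 + alpha * delta_k                    # V0k(new) = V0k(old) + dV0k
    W  = W  + alpha * np.outer(x, delta_j)       # Wij(new) = Wij(old) + dWij
    w0 = w0 + alpha * delta_j                    # W0j(new) = W0j(old) + dW0j
    return W, w0, V, v0
```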
2.3.2 Merits
2.3.3 Demerits
2.4 COUNTER PROPAGATION NETWORKS (CPN)
This network was proposed by Hecht-Nielsen in 1987. It implements both supervised and
unsupervised learning. It is actually a combination of two neural architectures: (a) the Kohonen
layer (unsupervised) and (b) the Grossberg layer (supervised). It provides a good solution where
long training is not tolerated. CPN functions like a look-up table capable of generalization. The
training pairs may be binary or continuous. CPN produces a correct output even when the input
is partially incomplete or incorrect. The main types of CPN are (a) Full Counter Propagation and
(b) Forward-only Counter Propagation. Figure 2 represents the architectural diagram of the CPN
network.
• Forward only nets are the simplified form of Full Counter Propagation networks
• Forward only nets are used for approximation problems
The first layer is the Kohonen layer, which uses the competitive learning law. The procedure
used here is: when an input is provided, the weighted net value is calculated for each node.
Then the node with the maximum output is selected, and the signals from the other neurons are
inhibited. The output from the winning neuron only is provided to the next higher layer, which
is the supervised Grossberg layer. Grossberg processing is similar to that of a normal
supervised algorithm.
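A minimal sketch of this winner-take-all step, assuming the weights are stored with one column per Kohonen node; only the winner's output is passed on to the Grossberg layer and all other outputs are inhibited.

```python
import numpy as np

def kohonen_winner(x, W, w0):
    """W has one column of weights per Kohonen node; w0 holds the node biases."""
    net = w0 + x @ W                    # weighted net value for every node
    k = np.argmax(net)                  # the node with the maximum output wins
    out = np.zeros_like(net)
    out[k] = net[k]                     # all other outputs are inhibited
    return k, out                       # only the winner's output goes to the Grossberg layer
```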
Step 1: Initialize the weights and the learning rate to small random values near zero
Step 2: While the stop condition is false, do steps 3 to 9
Step 3: Set the X input layer activations to vector X
Step 4: Each input xi is received and forwarded to the higher layer (the Kohonen layer)
Step 5: Each Kohonen unit sums its weighted inputs as follows.
The inputs and weights are normalised, then the net value is calculated (in vector form) as
Kinj = W0j + X·W
Applying the activation function
Kj = f(Kinj)
Step 5A: The winning cluster unit is identified. (The node with the maximum output is selected
as the winner. Only this output is forwarded to the next, Grossberg, layer; the outputs of
all other units are inhibited.)
For clustering the inputs, the Euclidean distance norm is used:
Dj = Σi (xi - vij)^2 + Σk (yk - wkj)^2
The winning cluster unit is the one for which Dj is minimum
Step 6: Update the weights of the calculated winner unit Kj
Step 7: Test for the stop condition of Phase I
(Phase I: input X layer to Z cluster layer)
Phase II: Z cluster layer to Y output layer
Step 8: Repeat steps 5, 5A and 6 for the Phase II layers
Step 9: Test for the stop condition of Phase II
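A sketch of one full-CPN training step covering both phases. The winner is chosen with the Euclidean distance norm Dj from above; the update rule (moving the winner's weights a fraction toward the training pair) and the matrix names V, W, U, T and learning rates alpha, beta are illustrative assumptions.

```python
import numpy as np

def cpn_train_step(x, y, V, W, U, T, alpha=0.3, beta=0.1):
    # Phase I: pick the winning cluster j by the minimum Euclidean distance norm
    # Dj = sum_i (xi - vij)^2 + sum_k (yk - wkj)^2
    D = ((x[:, None] - V) ** 2).sum(axis=0) + ((y[:, None] - W) ** 2).sum(axis=0)
    j = np.argmin(D)

    # Phase I update: move the winner's incoming weights toward the training pair
    V[:, j] += alpha * (x - V[:, j])    # X layer -> cluster weights
    W[:, j] += alpha * (y - W[:, j])    # Y layer -> cluster weights

    # Phase II update: move the winner's outgoing (Grossberg) weights toward the targets
    U[j, :] += beta * (y - U[j, :])     # cluster -> Y output weights
    T[j, :] += beta * (x - T[j, :])     # cluster -> X output weights (full CPN)
    return j
```

Phase I trains the cluster (Kohonen) weights and Phase II the outgoing Grossberg weights, matching the phase split listed above.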
2.4.2 Merits
A combination of unsupervised (Phase I) and supervised (Phase II) learning
The network works like a look-up table
Fast and coarse approximation
100 times faster than the BPN model
2.4.3 Demerits
The learning phase requires intensive calculations
Selection of the number of hidden layer neurons is an issue
Selection of the number of hidden layers is also an issue
The network can get trapped in local minima
2.5 BI-DIRECTIONAL ASSOCIATIVE MEMORIES
Figure 3 represents the BAM architecture. BAM contains 'n' neurons in the X layer and
'm' neurons in the Y layer. Both the X and Y layers can act as input or output layers. The
weights from the X layer to the Y layer form the matrix W; the weights from the Y layer back
to the X layer are its transpose, W^T. If binary or bipolar activations are used, it is known as a
Discrete BAM. If continuous activations are used, it comes under the Continuous BAM type.
2.5.2 Algorithm
The following steps explain the procedural flow of the Bidirectional Associative Memory.
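Since only the outline is given here, the following is a minimal sketch of discrete BAM storage and recall with bipolar (+1/-1) patterns. The Hebbian outer-product storage rule is an assumption (it is the standard choice, not quoted from the text); recall runs X to Y through W and Y back to X through W^T until the pair stops changing.

```python
import numpy as np

def bam_store(X, Y):
    """X: (p, n) bipolar patterns for the X layer, Y: (p, m) patterns for the Y layer."""
    return X.T @ Y                                  # weight matrix W of shape (n, m)

def bam_recall(x, W, steps=10):
    """Recall in both directions: X -> Y uses W, Y -> X uses its transpose W.T."""
    sgn = lambda a, prev: np.where(a > 0, 1, np.where(a < 0, -1, prev))
    y = np.ones(W.shape[1], dtype=int)
    for _ in range(steps):                          # iterate until the (x, y) pair stabilizes
        y = sgn(x @ W, y)
        x = sgn(y @ W.T, x)
    return x, y
```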
2.5.3 Merits
2.5.4 Demerits
• Incorrect convergence
• Memory capacity is limited because the number of stored patterns 'm' must be less than the
number of neurons 'n' in the smaller layer
• Sometimes the network learns patterns that were not provided to it
2.5.5 Applications of BAM
• Fault Detection
• Pattern Association
• Real Time Patient Monitoring
• Medical Diagnosis
• Pattern Mapping
• Pattern Recognition systems
• Optimization problems
• Constraint satisfaction problems
2.6 ADAPTIVE RESONANCE THEORY (ART)
The problem is how to learn a new pattern without forgetting the old traces (patterns) and how
to adapt to a changing environment (input). When the patterns change (plasticity), remembering
the previously learned vectors (stability) becomes a problem. ART uses competitive learning (a
self-regulating control) to solve this PLASTICITY-STABILITY dilemma. The simplified ART
diagram is given below in Figure 5.
2.6.2 Comparison Layer: Takes the 1-D input vector and transfers it to the best match in the
recognition field (the best match is the neuron in the recognition unit whose weights most
closely match the input vector).
2.6.3 Recognition Unit: Produces an output proportional to the quality of the match. In this way
the recognition field allows a neuron to represent the category to which the input vectors are
classified.
Vigilance parameter: After the input vectors are classified, a reset module compares the
strength of the match to the vigilance parameter (defined by the user). Higher vigilance produces
fine, detailed memories; a lower vigilance value gives more general memories.
2.6.4 Reset Module: Compares the strength of the recognition match to the vigilance parameter.
When the vigilance threshold is met, training starts; otherwise the recognition neurons are
inhibited until a new input is provided.
There are two sets of weights: (1) bottom-up weights, from the F1 layer to the F2 layer, and
(2) top-down weights, from the F2 layer to the F1 layer.
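A simplified sketch of the recognition/reset cycle described above for binary (ART 1 style) inputs. Here B holds the bottom-up weights, T the top-down weights and rho the user-defined vigilance parameter; the match ratio used for the vigilance test (|x AND tj| / |x|) is the usual ART 1 form and is assumed here.

```python
import numpy as np

def art1_category(x, B, T, rho=0.7):
    """x: binary input vector, B: (n, m) bottom-up weights, T: (m, n) top-down weights."""
    order = np.argsort(-(x @ B))                    # recognition: strongest bottom-up match first
    for j in order:                                 # reset module tries candidates in turn
        match = np.logical_and(x, T[j]).sum() / max(x.sum(), 1)
        if match >= rho:                            # vigilance test passed -> resonance
            return j                                # this F2 node codes the input's category
        # otherwise this node is inhibited and the next candidate is tried
    return None                                     # no existing category is close enough
```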
Fast learning: Used in ART 1. Weight changes are rapid and take place during resonance. The
network is stabilized when a correct match at a cluster unit is reached.
Slow learning: Used in ART 2. Weight changes are slow and do not reach equilibrium in each
learning iteration, so more memory is required to store more input patterns (to reach stability).
Images adapted from Laurene Fausett, “Fundamentals of Neural Networks: Architectures, Algorithms and Applications”, Prentice Hall publications
• Pattern Recognition
• Pattern Restoration
• Pattern Generalization
• Pattern Association
• Speech Recognition
• Image Enhancement
• Image Restoration
• Facial Recognition systems
• Optimization problems
• Used to solve Constraint satisfaction problems
The net is a fully interconnected neural net, in the sense that each unit is connected
to every other unit. The net has symmetric weights with no self-connections: Wij = Wji and
Wii = 0.
Only one unit updates its activation at a time and each unit continues to receive an
external signal in addition to the signal from the other units in the net. The asynchronous
updating of the units allows a function, known as an energy or Lyapunov function, to be found
for the net.
The basic diagram for Hopfield networks is given in Figure 7. Here no learning algorithm is
used, and no hidden units or layers are used. Patterns are simply stored by learning their
energies. The network is similar to the human brain in storing and retrieving memory patterns:
some patterns or images are stored, and when a similar noisy input is provided, the network
recalls the related stored pattern. A neuron can be ON (+1) or OFF (-1), and neurons change
state between +1 and -1 based on the inputs they receive from other neurons. A Hopfield
network is trained to store patterns (memories) and can recognize a previously learned (stored)
pattern from partial (noisy) inputs.
Based on the activation function used, the Hopfield network can be classified into two
types. They are
Discrete Hopfield Network – Uses Discrete Activation Function
Continuous Hopfield Network – Uses Continuous Activation Function
Hopfield networks use a Lyapunov energy function. The energy function guarantees that the
network reaches a stable local minimum energy state, which resembles one of the stored patterns.
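A sketch of a discrete Hopfield network consistent with the description above: symmetric weights with a zero diagonal, +1/-1 states, asynchronous single-unit updates and the Lyapunov energy. The Hebbian outer-product storage rule is assumed.

```python
import numpy as np

def hopfield_store(patterns):
    P = np.array(patterns, dtype=float)              # each row is a bipolar (+1/-1) pattern
    W = P.T @ P                                      # Hebbian outer-product storage (assumed)
    np.fill_diagonal(W, 0.0)                         # Wii = 0; W is symmetric by construction
    return W

def energy(s, W):
    return -0.5 * s @ W @ s                          # Lyapunov energy of state s

def hopfield_recall(s, W, sweeps=5, seed=0):
    s = np.array(s, dtype=float)
    rng = np.random.default_rng(seed)
    for _ in range(sweeps):
        for i in rng.permutation(len(s)):            # asynchronous: one unit at a time
            net = W[i] @ s
            if net != 0:
                s[i] = 1.0 if net > 0 else -1.0      # ON (+1) / OFF (-1) update
    return s                                         # settles in a local energy minimum
```

Calling energy() before and after hopfield_recall() shows the energy never increasing as a noisy state settles toward a stored pattern.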
2.7.4 Algorithm
Demerits
Incorrect convergence
Memory capacity is limited: the number of patterns that can be stored must be much smaller
than the number of neurons 'n' in the network
Sometimes the network recalls patterns (spurious states) that were not provided to it
REFERENCE BOOKS
Laurene Fausett, “Fundamentals of Neural Networks: Architectures, Algorithms, and Applications”, Prentice Hall.