Soft Computing
Perceptron Networks
One of the basic networks used in supervised learning.
The perceptron network consists of three units:
– Sensory unit (input unit)
– Associator unit (hidden unit)
– Response unit (output unit)
Perceptron Networks
• The input unit is connected to the hidden units with fixed weights (1, 0, -1) assigned at random.
• A binary activation function is used in the input and hidden units.
• The output unit uses a (1, 0, -1) activation; the binary step with fixed threshold ϴ is used as the activation function.
• The output of the perceptron is y = f(y_in), where
  f(y_in) = 1    if y_in > ϴ
          = 0    if -ϴ <= y_in <= ϴ
          = -1   if y_in < -ϴ
Perceptron Networks
• Weight updation takes place between the hidden and output units.
• The net checks for error between the target and the calculated output:
  Error = target - calculated output
• Weights are adjusted only in case of an error.
  w_i(new) = w_i(old) + α t x_i
  b(new) = b(old) + α t
• α is the learning rate, ‘t’ is the target which is -1 or 1.
• If there is no error, no weight change is made and training is stopped.
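A minimal Python sketch of this rule (not from the slides; the function name and defaults are illustrative): one training pair is presented, the output is computed with the binary step activation, and the weights change only when the output differs from the target.

```python
# Illustrative sketch of one step of the perceptron learning rule described above.
def perceptron_step(w, b, x, t, alpha=1.0, theta=0.0):
    """Apply one training pair (x, t); return the updated weights and bias."""
    y_in = b + sum(wi * xi for wi, xi in zip(w, x))   # net input
    # binary step with fixed threshold theta (outputs 1, 0 or -1)
    if y_in > theta:
        y = 1
    elif y_in < -theta:
        y = -1
    else:
        y = 0
    if y != t:                                        # update only on error
        w = [wi + alpha * t * xi for wi, xi in zip(w, x)]
        b = b + alpha * t
    return w, b
```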
Single classification perceptron
network
Perceptron Training Algorithm for
Single Output Classes
Step 0: Initialize weights, bias and learning rate α (0 < α <= 1)
Step 1: Perform steps 2-6 while the final stopping condition is false
Step 2: Perform steps 3-5 for each bipolar or binary training pair, indicated by s:t
Step 3: The input layer applies the identity activation function:
  x_i = s_i
Step 4: Calculate the output response of each output unit, j = 1 to m:
  first, the net input is calculated, y_in = b_j + Σ_i x_i w_ij;
  then the activation is applied over the net input to calculate the output response.
Perceptron Training Algorithm for
Single Output Classes
Step 5: Make adjustments in the weights and bias for j = 1 to m and i = 1 to n:
  if y_j ≠ t_j, then w_ij(new) = w_ij(old) + α t_j x_i and b_j(new) = b_j(old) + α t_j;
  otherwise the weights and bias remain unchanged.
Step 6: Test for the stopping condition: if no weight changed in step 5, stop; otherwise continue.
x1   x2    t
 1    1    1
 1   -1   -1
-1    1   -1
-1   -1   -1
• The perceptron network, which uses the perceptron learning rule, is used to train the AND function.
• The network architecture is as shown in Figure.
• The input patterns are presented to the network one by one.
When all the four input patterns are presented, then one
epoch is said to be completed.
• The initial weights and threshold are set to zero:
  w1 = w2 = b = 0 and ϴ = 0. The learning rate α is set equal to 1.
• First input (1, 1, 1): calculate the net input y_in.
x1   x2    t
 1    1   -1
 1   -1    1
-1    1   -1
-1   -1   -1
NETWORK STRUCTURE
[Figure: single-layer perceptron with inputs X1 and X2 (weights w1, w2), a bias input of 1 (weight b), and output unit Y.]
COMPUTATIONS
• Let us take w1 = w2 = 0, α = 1, ϴ = 0 and b = 0.
• First input: (x1, x2) = (1, 1) with target t = -1
  y_in = b + x1·w1 + x2·w2 = 0 + 1·0 + 1·0 = 0
• The output, using the activation function, is
  y = f(y_in) = 1    if y_in > 0
              = 0    if y_in = 0
              = -1   if y_in < 0
COMPUTATIONS
• The output y = 0 does not match the target t = -1.
• So weight updation is necessary:
  w1(new) = w1(old) + α·t·x1 = 0 + 1·(-1)·1 = -1
  w2(new) = w2(old) + α·t·x2 = 0 + 1·(-1)·1 = -1
  b(new) = b(old) + α·t = 0 + 1·(-1) = -1
Input           Target   Net input   Output   Weights after update
x1   x2   b       t        y_in         y       w1   w2    b
 1    1   1      -1          0          0       -1   -1   -1
 1   -1   1       1         -1         -1        0   -2    0
-1    1   1      -1         -2         -1        0   -2    0
-1   -1   1      -1          2          1        1   -1   -1
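The following illustrative Python loop (assumed variable names, not part of the slides) reproduces the epoch-1 table above, starting from w1 = w2 = b = 0 with α = 1 and ϴ = 0.

```python
# Illustrative sketch reproducing the epoch-1 table above.
def activation(y_in, theta=0.0):
    return 1 if y_in > theta else (-1 if y_in < -theta else 0)

patterns = [((1, 1), -1), ((1, -1), 1), ((-1, 1), -1), ((-1, -1), -1)]
w, b, alpha = [0, 0], 0, 1
for (x1, x2), t in patterns:
    y_in = b + x1 * w[0] + x2 * w[1]
    y = activation(y_in)
    if y != t:                                   # update only on error
        w = [w[0] + alpha * t * x1, w[1] + alpha * t * x2]
        b = b + alpha * t
    print(x1, x2, t, y_in, y, w, b)
# The last row gives w = [1, -1], b = -1, matching the table.
```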
EXAMPLE
x1   x2   x3   x4   b    t
 1    1    1    1   1    1
-1    1   -1   -1   1    1
 1    1    1   -1   1   -1
 1   -1   -1    1   1   -1
COMPUTATIONS
• Here we take w1 = w2 = w3 = w4 = 0, b = 0 and ϴ = 0.2. Also, α = 1.
• The activation function is given by
  y = 1    if y_in > 0.2
    = 0    if -0.2 <= y_in <= 0.2
    = -1   if y_in < -0.2
EPOCH-1
Input                     Target   Net input   Output   Weights
x1   x2   x3   x4   b        t       y_in         y      w1   w2   w3   w4   b
 1    1    1    1   1        1         0          0       1    1    1    1   1
-1    1   -1    1   1        1        -1         -1       0    2    0    0   2
 1    1    1   -1   1       -1         4          1      -1    1   -1    1   1
 1   -1   -1    1   1       -1         1          1      -2    2    0    0   0
COMPUTATIONS
EPOCH-2
Input                     Target   Net input   Output   Weights
x1   x2   x3   x4   b        t       y_in         y      w1   w2   w3   w4   b
 1    1    1    1   1        1         0          0      -1    3    1    1   1
-1    1   -1    1   1        1         4          1      -1    3    1    1   1
 1    1    1   -1   1       -1         5          1      -2    2    0    2   0
 1   -1   -1    1   1       -1        -2         -1      -2    2    0    2   0
COMPUTATIONS
EPOCH-3
Input                     Target   Net input   Output   Weights
x1   x2   x3   x4   b        t       y_in         y      w1   w2   w3   w4   b
 1    1    1    1   1        1         2          1      -2    2    0    2   0
-1    1   -1    1   1        1         6          1      -2    2    0    2   0
 1    1    1   -1   1       -1        -2         -1      -2    2    0    2   0
 1   -1   -1    1   1       -1        -2         -1      -2    2    0    2   0
Here the target outputs are equal to the actual outputs. So, we stop.
THE FINAL NET
[Figure: the final perceptron. Inputs X1, X2, X3, X4 and a bias input of 1 feed the output unit Y with weights w1 = -2, w2 = 2, w3 = 0, w4 = 2 and bias b = 0.]
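As a quick check, the sketch below (assumed helper names, not from the slides) verifies that the final weights reproduce every target of the 4-input training set with ϴ = 0.2.

```python
# Illustrative check: the final net classifies all four training patterns correctly.
def activation(y_in, theta=0.2):
    return 1 if y_in > theta else (-1 if y_in < -theta else 0)

w, b = [-2, 2, 0, 2], 0
data = [([1, 1, 1, 1], 1), ([-1, 1, -1, -1], 1),
        ([1, 1, 1, -1], -1), ([1, -1, -1, 1], -1)]
for x, t in data:
    y_in = b + sum(wi * xi for wi, xi in zip(w, x))
    assert activation(y_in) == t   # every pattern matches its target
```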
ADALINE Networks
Adaptive Linear Neuron (Adaline)
• A network with a single linear unit is called an ADALINE (ADAptive LInear NEuron).
• The input-output relationship is linear.
• It uses bipolar activation for its input signals and its target output.
• The weights between the input and the output are adjustable, and there is only one output unit.
• It is trained using the delta rule, also known as the least mean square (LMS) or Widrow-Hoff rule.
Delta Rule
• Delta rule for Single output unit
– Minimize the error over all training patterns.
– Done by reducing the error for each pattern one at a time
• The delta rule for adjusting the ith weight (i = 1 to n) is
  Δw_i = α (t - y_in) x_i
• The delta rule in case of several output units, for adjusting the weight from the ith input unit to the jth output unit, is
  Δw_ij = α (t_j - (y_in)_j) x_i
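A minimal sketch of a single delta-rule update, assuming plain Python lists for the weights (the function name is illustrative, not from the slides).

```python
# Illustrative sketch of one delta-rule (LMS) update for an Adaline unit.
def delta_rule_update(w, b, x, t, alpha):
    y_in = b + sum(wi * xi for wi, xi in zip(w, x))        # linear net input
    w = [wi + alpha * (t - y_in) * xi for wi, xi in zip(w, x)]
    b = b + alpha * (t - y_in)
    return w, b, (t - y_in) ** 2                           # squared error for this pattern
```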
Adaline Model
[Figure: Adaline unit. The bias input x0 = 1 (weight b) and inputs x1 ... xn (weights w1 ... wn) feed a summing unit that forms the net input y_in = Σ x_i w_i; the output is f(y_in). An output error generator computes e = t - y_in, which drives the adaptive algorithm that adjusts the weights.]
Adaline Training Algorithm
Step 0: Weights and bias are set to some random values other than zero. Set the learning rate parameter α.
Step 1: Perform steps 2-6 while the stopping condition is false.
Step 2: Perform steps 3-5 for each bipolar training pair s:t.
Step 3: Set the activations of the input units: x_i = s_i, i = 1 to n.
Step 4: Calculate the net input to the output unit: y_in = b + Σ x_i w_i.
Step 5: Update the weights and bias:
  w_i(new) = w_i(old) + α (t - y_in) x_i
  b(new) = b(old) + α (t - y_in)
Step 6: Calculate the error E_i = Σ (t - y_in)^2 over the epoch. If E_i equals the error of the previous epoch (E_s), stop; otherwise continue training.
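A hedged sketch of this training loop: weights are updated after every pattern, and training stops when the summed squared error stops changing between epochs. The tolerance, the epoch cap and the function name are assumptions, not part of the slides; the bipolar training pairs listed below can be supplied as the data argument.

```python
# Illustrative Adaline training loop following the algorithm above.
def train_adaline(data, alpha=0.1, tol=1e-6, max_epochs=100):
    w, b = [0.1] * len(data[0][0]), 0.1          # small non-zero initial values
    prev_error = None
    for _ in range(max_epochs):
        error = 0.0
        for x, t in data:
            y_in = b + sum(wi * xi for wi, xi in zip(w, x))
            w = [wi + alpha * (t - y_in) * xi for wi, xi in zip(w, x)]
            b = b + alpha * (t - y_in)
            error += (t - y_in) ** 2             # E_i = sum of (t - y_in)^2
        if prev_error is not None and abs(prev_error - error) < tol:
            break                                # error no longer changes: stop
        prev_error = error
    return w, b

# Example usage with the bipolar training pairs tabulated below:
# train_adaline([([1, 1], 1), ([1, -1], 1), ([-1, 1], 1), ([-1, -1], -1)])
```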
x1   x2   1    t
 1    1   1    1
 1   -1   1    1
-1    1   1    1
-1   -1   1   -1

The initial weights are taken to be w1 = w2 = b = 0.1.
BACK PROPAGATION NETWORK
[Figure: architecture of a back-propagation network. Input layer X1 ... Xi ... Xn, hidden layer Z1 ... Zj ... Zp (biases v0j, input-to-hidden weights vij), output layer Y1 ... Yk ... Ym (biases w0k, hidden-to-output weights wjk), with targets t1, t2, ..., tm and bias inputs fixed at 1.]
FLOWCHART DESCRIPTION FOR TRAINING PROCESS
• Updation of weights
• Output of each output unit: y_k = f((y_in)_k),  k = 1, 2, ..., m
TRAINING ALGORITHM (BACK PROPAGATION OF ERROR)
Back-propagation of error (Phase II)
• STEP 6: Each output unit y_k, k = 1, 2, ..., m receives a target pattern corresponding to the input training pattern and computes the error-correction term:
  δ_k = (t_k - y_k) f'((y_in)_k)
  Note: for the binary sigmoid, f'(y_in) = f(y_in)[1 - f(y_in)]
• STEP 7: Each hidden unit z_j, j = 1, 2, ..., p sums its delta inputs from the output units:
  δ_inj = Σ (k = 1 to m) δ_k w_jk
• The term δ_inj is multiplied by the derivative of f((z_in)_j) to calculate the error term:
  δ_j = δ_inj f'((z_in)_j)
Update the changes in weight and bias:
  Δv_ij = α δ_j x_i
  Δv_0j = α δ_j
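A compact Python sketch of Steps 6-7 and the weight/bias changes, assuming one hidden layer and the binary sigmoid (so f' = f(1 - f)). The hidden-to-output changes Δw_jk = α δ_k z_j and Δw_0k = α δ_k, the standard counterparts of the Δv equations above, are included for completeness; all names are illustrative.

```python
# Illustrative back-propagation error pass for a single hidden layer.
def backprop_deltas(x, z, y, t, w, alpha):
    """x: inputs, z: hidden outputs, y: output-unit outputs, t: targets,
    w[j][k]: hidden-to-output weights. Returns the weight and bias changes."""
    # Step 6: error terms of the output units (binary sigmoid derivative y(1 - y))
    delta_k = [(t[k] - y[k]) * y[k] * (1 - y[k]) for k in range(len(y))]
    # Step 7: error terms of the hidden units
    delta_in = [sum(delta_k[k] * w[j][k] for k in range(len(y))) for j in range(len(z))]
    delta_j = [delta_in[j] * z[j] * (1 - z[j]) for j in range(len(z))]
    # Weight and bias changes (hidden-to-output, then input-to-hidden)
    dw = [[alpha * delta_k[k] * z[j] for k in range(len(y))] for j in range(len(z))]
    dw0 = [alpha * delta_k[k] for k in range(len(y))]
    dv = [[alpha * delta_j[j] * x[i] for j in range(len(z))] for i in range(len(x))]
    dv0 = [alpha * delta_j[j] for j in range(len(z))]
    return dw, dw0, dv, dv0
```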
BPN TRAINING ALGORITHM
• STEP 0: Initialize the weights. (The weights are taken from the training phase.)
• STEP 1: Perform steps 2 - 4 for each input vector.
• STEP 2: Set the activation of the input units x_i, i = 1, 2, ..., n.
• STEP 3: Calculate the net input to each hidden unit z_j and its output:
  (z_in)_j = v_0j + Σ (i = 1 to n) x_i v_ij   and   z_j = f((z_in)_j)
• STEP 4: Compute the net input and the output of each unit in the output layer:
  (y_in)_k = w_0k + Σ (j = 1 to p) z_j w_jk   and   y_k = f((y_in)_k)
• USE SIGMOIDAL FUNCTIONS AS ACTIVATION FUNCTIONS
EXAMPLE-1
• Using the Back propagation network, find the new weights
for the net shown below. It is presented with the input
pattern [0,1] and the target output 1. Use a learning rate of
0.25 and binary sigmoidal activation function.
[Figure: 2-2-1 back-propagation network with input units X1, X2, hidden units Z1, Z2 and output unit Y; the initial weights and biases on the connections are those listed below.]
COMPUTATIONS
• The initial weights are:
  v11 = 0.6,  v21 = -0.1,  v01 = 0.3
  v12 = -0.3, v22 = 0.4,   v02 = 0.5
  w1 = 0.4,   w2 = 0.1,    w0 = -0.2
• The learning rate: α = 0.25.
• The activation function is the binary sigmoid, i.e. f(x) = 1 / (1 + e^(-x))
Phase I: The feed-forward of the input training pattern
Calculate the net input for the hidden layer:
• For z1 neuron:
  (z_in)_1 = v01 + v11·x1 + v21·x2 = 0.3 + 0·(0.6) + 1·(-0.1) = 0.2
• For z2 neuron:
  (z_in)_2 = v02 + v12·x1 + v22·x2 = 0.5 + 0·(-0.3) + 1·(0.4) = 0.9
• Outputs:
  z1 = f((z_in)_1) = 1 / (1 + e^(-0.2)) = 0.5498
  z2 = f((z_in)_2) = 1 / (1 + e^(-0.9)) = 0.7109
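A quick numerical check of the feed-forward values above, under the binary sigmoid and input [0, 1]; the variable names are mine, the weights are the ones listed in the example.

```python
import math

# Verify the hidden-layer values and carry the forward pass to the output unit.
f = lambda x: 1.0 / (1.0 + math.exp(-x))   # binary sigmoid
x1, x2 = 0, 1
z1_in = 0.3 + 0.6 * x1 + (-0.1) * x2       # = 0.2
z2_in = 0.5 + (-0.3) * x1 + 0.4 * x2       # = 0.9
z1, z2 = f(z1_in), f(z2_in)                # ~ 0.5498 and 0.7109
y_in = -0.2 + 0.4 * z1 + 0.1 * z2          # net input to the output unit Y
print(round(z1, 4), round(z2, 4), round(f(y_in), 4))
```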
COMPUTATIONS CONTD…
• Initial weights
• Learning rate
• Updation rule
• The initial weights are chosen randomly in the range (-3/oi, 3/oi), where oi is the number of inputs to the unit; values outside this range can lead to overshooting.
• A solution to this problem is to monitor the error on the test set and terminate the training when the error increases.
COMPUTATIONS
• Here, the activation function is the bipolar sigmoidal activation function, that is
  f(x) = (1 - e^(-x)) / (1 + e^(-x))
• The weights are:
  v11 = 0.6,  v21 = -0.1,  v01 = 0.3
  v12 = -0.3, v22 = 0.4,   v02 = 0.5
  w1 = 0.4,   w2 = 0.1,    w0 = -0.2
• The input vector is [-1, 1] and the target is t = 1.
• Learning rate α = 0.25.
COMPUTATIONS
• The net inputs:
• For z1 neuron:
  (z_in)_1 = v01 + v11·x1 + v21·x2 = 0.3 + (-1)·(0.6) + 1·(-0.1) = -0.4
• For z2 neuron:
  (z_in)_2 = v02 + v12·x1 + v22·x2 = 0.5 + (-1)·(-0.3) + 1·(0.4) = 1.2
• Outputs:
  z1 = f((z_in)_1) = (1 - e^(0.4)) / (1 + e^(0.4)) = -0.1974
  z2 = f((z_in)_2) = (1 - e^(-1.2)) / (1 + e^(-1.2)) = 0.537
COMPUTATIONS
• For the output layer:
• Net input:
  y_in = w0 + w1·z1 + w2·z2 = -0.2 + 0.4·(-0.1974) + 0.1·(0.537) = -0.22526
• Output:
  y = f(y_in) = (1 - e^(0.22526)) / (1 + e^(0.22526)) = -0.1122
COMPUTATIONS
• Error at the output neuron:
• We use the gradient-descent formula: δ_k = (t_k - y_k) f'((y_in)_k)
• For the bipolar sigmoid,
  f'(y_in) = 0.5[1 + f(y_in)][1 - f(y_in)] = 0.5[1 + (-0.1122)][1 - (-0.1122)] = 0.4937
  so δ_1 = (t_1 - y_1) f'((y_in)_1) = (1 + 0.1122)(0.4937) = 0.5491
• For the hidden units,
  δ_inj = Σ (k = 1 to m) δ_k w_jk
COMPUTATIONS
• Here, m = 1 (a single output neuron), so δ_inj = δ_1·w_j
• Hence,
  δ_in1 = δ_1·w_1 = 0.5491 × 0.4 = 0.21964
  δ_in2 = δ_1·w_2 = 0.5491 × 0.1 = 0.05491
• Now,
  f'((z_in)_1) = 0.5[1 + f((z_in)_1)][1 - f((z_in)_1)] = 0.5[1 + (-0.1974)][1 - (-0.1974)] = 0.4805
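The following sketch reproduces the numbers of this bipolar-sigmoid example (input [-1, 1], target 1, α = 0.25); the variable names are illustrative, not from the slides.

```python
import math

# Reproduce the worked bipolar-sigmoid example above.
f  = lambda x: (1 - math.exp(-x)) / (1 + math.exp(-x))   # bipolar sigmoid
df = lambda fx: 0.5 * (1 + fx) * (1 - fx)                # derivative, given f(x)

x1, x2, t, alpha = -1, 1, 1, 0.25
z1 = f(0.3 + 0.6 * x1 + (-0.1) * x2)       # f(-0.4)     = -0.1974
z2 = f(0.5 + (-0.3) * x1 + 0.4 * x2)       # f(1.2)      =  0.5370
y  = f(-0.2 + 0.4 * z1 + 0.1 * z2)         # f(-0.22526) = -0.1122

delta1 = (t - y) * df(y)                   # output error term ~ 0.5491
d_in1, d_in2 = delta1 * 0.4, delta1 * 0.1  # delta_in for z1 and z2
dj1, dj2 = d_in1 * df(z1), d_in2 * df(z2)  # hidden error terms
dw1, dw2, dw0 = alpha * delta1 * z1, alpha * delta1 * z2, alpha * delta1
print(round(delta1, 4), round(d_in1, 5), round(d_in2, 5))
```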
where x̂_ji is the center of the RBF unit for the input variables, σ_i is the width of the ith RBF unit, and x_ji is the jth variable of the input pattern.
Step 7: Calculate the output of the neural network.