Soft Computing
Perceptron Networks
One of the basic networks used in supervised learning.
The perceptron network consists of three units:
– Sensory unit (input unit)
– Associator unit (hidden unit)
– Response unit (output unit)
Perceptron Networks
• The input unit is connected to the hidden units with fixed weights (1, 0, -1) assigned at random.
• A binary activation function is used in the input and hidden units.
• The output unit uses a (1, 0, -1) activation; the binary step with fixed threshold ϴ is used as the activation function.
• The output of the perceptron is y = f(y_in), where
  f(y_in) = 1    if y_in > ϴ
          = 0    if -ϴ <= y_in <= ϴ
          = -1   if y_in < -ϴ
Perceptron Networks
• Weight updation takes place between the hidden and output units.
• The net checks for error between the target and the calculated output:
  Error = target - calculated output
• Weights are adjusted only in case of an error.
  w_i(new) = w_i(old) + α t x_i
  b(new) = b(old) + α t
• α is the learning rate, ‘t’ is the target which is -1 or 1.
• If there is no error, no weight change is made and training is stopped.
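A minimal Python sketch of this rule (not from the slides; the function name and defaults are illustrative): one training pair is presented, the output is computed with the binary step activation, and the weights change only when the output differs from the target.

```python
# Illustrative sketch of one step of the perceptron learning rule described above.
def perceptron_step(w, b, x, t, alpha=1.0, theta=0.0):
    """Apply one training pair (x, t); return the updated weights and bias."""
    y_in = b + sum(wi * xi for wi, xi in zip(w, x))   # net input
    # binary step with fixed threshold theta (outputs 1, 0 or -1)
    if y_in > theta:
        y = 1
    elif y_in < -theta:
        y = -1
    else:
        y = 0
    if y != t:                                        # update only on error
        w = [wi + alpha * t * xi for wi, xi in zip(w, x)]
        b = b + alpha * t
    return w, b
```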
Single classification perceptron
network
Perceptron Training Algorithm for
Single Output Classes
Step 0: Initialize weights, bias and learning rate α (0 < α <= 1)
Step 1: Perform steps 2-6 while the final stopping condition is false
Step 2: Perform steps 3-5 for each bipolar or binary training pair, indicated by s:t
Step 3: The input layer applies the identity activation function:
  x_i = s_i
Step 4: Calculate the output response of each output unit, j = 1 to m:
  first, the net input is calculated, y_in = b_j + Σ_i x_i w_ij;
  then the activation is applied over the net input to calculate the output response.
Perceptron Training Algorithm for
Single Output Classes
Step 5: Make adjustments in the weights and bias for j = 1 to m and i = 1 to n:
  if y_j ≠ t_j, then w_ij(new) = w_ij(old) + α t_j x_i and b_j(new) = b_j(old) + α t_j;
  otherwise the weights and bias remain unchanged.
Step 6: Test for the stopping condition: if no weight changed in step 5, stop; otherwise continue.
x1   x2    t
 1    1    1
 1   -1   -1
-1    1   -1
-1   -1   -1
• The perceptron network, which uses the perceptron learning rule, is used to train the AND function.
• The network architecture is as shown in Figure.
• The input patterns are presented to the network one by one.
When all the four input patterns are presented, then one
epoch is said to be completed.
• The initial weights and threshold are set to zero:
  w1 = w2 = b = 0 and ϴ = 0. The learning rate α is set equal to 1.
• First input (1, 1, 1): calculate the net input y_in.
x1   x2    t
 1    1   -1
 1   -1    1
-1    1   -1
-1   -1   -1
NETWORK STRUCTURE
[Figure: single-layer perceptron with inputs X1 and X2 (weights w1, w2), a bias input of 1 (weight b), and output unit Y.]
COMPUTATIONS
• Let us take w1 = w2 = 0, α = 1, ϴ = 0 and b = 0.
• First input: (x1, x2) = (1, 1) with target t = -1
  y_in = b + x1·w1 + x2·w2 = 0 + 1·0 + 1·0 = 0
• The output, using the activation function, is
  y = f(y_in) = 1    if y_in > 0
              = 0    if y_in = 0
              = -1   if y_in < 0
COMPUTATIONS
• The output y = 0 does not match the target t = -1.
• So weight updation is necessary:
  w1(new) = w1(old) + α·t·x1 = 0 + 1·(-1)·1 = -1
  w2(new) = w2(old) + α·t·x2 = 0 + 1·(-1)·1 = -1
  b(new) = b(old) + α·t = 0 + 1·(-1) = -1
Input           Target   Net input   Output   Weights after update
x1   x2   b       t        y_in         y       w1   w2    b
 1    1   1      -1          0          0       -1   -1   -1
 1   -1   1       1         -1         -1        0   -2    0
-1    1   1      -1         -2         -1        0   -2    0
-1   -1   1      -1          2          1        1   -1   -1
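The following illustrative Python loop (assumed variable names, not part of the slides) reproduces the epoch-1 table above, starting from w1 = w2 = b = 0 with α = 1 and ϴ = 0.

```python
# Illustrative sketch reproducing the epoch-1 table above.
def activation(y_in, theta=0.0):
    return 1 if y_in > theta else (-1 if y_in < -theta else 0)

patterns = [((1, 1), -1), ((1, -1), 1), ((-1, 1), -1), ((-1, -1), -1)]
w, b, alpha = [0, 0], 0, 1
for (x1, x2), t in patterns:
    y_in = b + x1 * w[0] + x2 * w[1]
    y = activation(y_in)
    if y != t:                                   # update only on error
        w = [w[0] + alpha * t * x1, w[1] + alpha * t * x2]
        b = b + alpha * t
    print(x1, x2, t, y_in, y, w, b)
# The last row gives w = [1, -1], b = -1, matching the table.
```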
EXAMPLE
x1   x2   x3   x4   b    t
 1    1    1    1   1    1
-1    1   -1   -1   1    1
 1    1    1   -1   1   -1
 1   -1   -1    1   1   -1
COMPUTATIONS
• Here we take w1 = w2 = w3 = w4 = 0, b = 0 and ϴ = 0.2. Also, α = 1.
• The activation function is given by
  y = 1    if y_in > 0.2
    = 0    if -0.2 <= y_in <= 0.2
    = -1   if y_in < -0.2
EPOCH-1
Input                     Target   Net input   Output   Weights
x1   x2   x3   x4   b        t       y_in         y      w1   w2   w3   w4   b
 1    1    1    1   1        1         0          0       1    1    1    1   1
-1    1   -1    1   1        1        -1         -1       0    2    0    0   2
 1    1    1   -1   1       -1         4          1      -1    1   -1    1   1
 1   -1   -1    1   1       -1         1          1      -2    2    0    0   0
COMPUTATIONS
EPOCH-2
Input                     Target   Net input   Output   Weights
x1   x2   x3   x4   b        t       y_in         y      w1   w2   w3   w4   b
 1    1    1    1   1        1         0          0      -1    3    1    1   1
-1    1   -1    1   1        1         4          1      -1    3    1    1   1
 1    1    1   -1   1       -1         5          1      -2    2    0    2   0
 1   -1   -1    1   1       -1        -2         -1      -2    2    0    2   0
COMPUTATIONS
EPOCH-3
Input                     Target   Net input   Output   Weights
x1   x2   x3   x4   b        t       y_in         y      w1   w2   w3   w4   b
 1    1    1    1   1        1         2          1      -2    2    0    2   0
-1    1   -1    1   1        1         6          1      -2    2    0    2   0
 1    1    1   -1   1       -1        -2         -1      -2    2    0    2   0
 1   -1   -1    1   1       -1        -2         -1      -2    2    0    2   0
Here the target outputs are equal to the actual outputs. So, we stop.
THE FINAL NET
[Figure: the final perceptron. Inputs X1, X2, X3, X4 and a bias input of 1 feed the output unit Y with weights w1 = -2, w2 = 2, w3 = 0, w4 = 2 and bias b = 0.]
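As a quick check, the sketch below (assumed helper names, not from the slides) verifies that the final weights reproduce every target of the 4-input training set with ϴ = 0.2.

```python
# Illustrative check: the final net classifies all four training patterns correctly.
def activation(y_in, theta=0.2):
    return 1 if y_in > theta else (-1 if y_in < -theta else 0)

w, b = [-2, 2, 0, 2], 0
data = [([1, 1, 1, 1], 1), ([-1, 1, -1, -1], 1),
        ([1, 1, 1, -1], -1), ([1, -1, -1, 1], -1)]
for x, t in data:
    y_in = b + sum(wi * xi for wi, xi in zip(w, x))
    assert activation(y_in) == t   # every pattern matches its target
```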
ADALINE Networks
Adaptive Linear Neuron (Adaline)
• A network with a single linear unit is called an ADALINE (ADAptive LInear NEuron).
• The input-output relationship is linear.
• It uses bipolar activation for its input signals and its target output.
• The weights between the input and the output are adjustable, and there is only one output unit.
• It is trained using the delta rule, also known as the least mean square (LMS) or Widrow-Hoff rule.
Delta Rule
• Delta rule for Single output unit
– Minimize the error over all training patterns.
– Done by reducing the error for each pattern one at a time
• The delta rule for adjusting the ith weight (i = 1 to n) is
  Δw_i = α (t - y_in) x_i
• The delta rule in case of several output units, for adjusting the weight from the ith input unit to the jth output unit, is
  Δw_ij = α (t_j - (y_in)_j) x_i
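A minimal sketch of a single delta-rule update, assuming plain Python lists for the weights (the function name is illustrative, not from the slides).

```python
# Illustrative sketch of one delta-rule (LMS) update for an Adaline unit.
def delta_rule_update(w, b, x, t, alpha):
    y_in = b + sum(wi * xi for wi, xi in zip(w, x))        # linear net input
    w = [wi + alpha * (t - y_in) * xi for wi, xi in zip(w, x)]
    b = b + alpha * (t - y_in)
    return w, b, (t - y_in) ** 2                           # squared error for this pattern
```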
Adaline Model
[Figure: Adaline unit. The bias input x0 = 1 (weight b) and inputs x1 ... xn (weights w1 ... wn) feed a summing unit that forms the net input y_in = Σ x_i w_i; the output is f(y_in). An output error generator computes e = t - y_in, which drives the adaptive algorithm that adjusts the weights.]
Adaline Training Algorithm
Step 0: Weights and bias are set to some random values other than zero. Set the learning rate parameter α.
Step 1: Perform steps 2-6 while the stopping condition is false.
Step 2: Perform steps 3-5 for each bipolar training pair s:t.
Step 3: Set the activations of the input units: x_i = s_i, i = 1 to n.
Step 4: Calculate the net input to the output unit: y_in = b + Σ x_i w_i.
Step 5: Update the weights and bias:
  w_i(new) = w_i(old) + α (t - y_in) x_i
  b(new) = b(old) + α (t - y_in)
Step 6: Calculate the error E_i = Σ (t - y_in)^2 over the epoch. If E_i equals the error of the previous epoch (E_s), stop; otherwise continue training.
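A hedged sketch of this training loop: weights are updated after every pattern, and training stops when the summed squared error stops changing between epochs. The tolerance, the epoch cap and the function name are assumptions, not part of the slides; the bipolar training pairs listed below can be supplied as the data argument.

```python
# Illustrative Adaline training loop following the algorithm above.
def train_adaline(data, alpha=0.1, tol=1e-6, max_epochs=100):
    w, b = [0.1] * len(data[0][0]), 0.1          # small non-zero initial values
    prev_error = None
    for _ in range(max_epochs):
        error = 0.0
        for x, t in data:
            y_in = b + sum(wi * xi for wi, xi in zip(w, x))
            w = [wi + alpha * (t - y_in) * xi for wi, xi in zip(w, x)]
            b = b + alpha * (t - y_in)
            error += (t - y_in) ** 2             # E_i = sum of (t - y_in)^2
        if prev_error is not None and abs(prev_error - error) < tol:
            break                                # error no longer changes: stop
        prev_error = error
    return w, b

# Example usage with the bipolar training pairs tabulated below:
# train_adaline([([1, 1], 1), ([1, -1], 1), ([-1, 1], 1), ([-1, -1], -1)])
```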
x1   x2   1    t
 1    1   1    1
 1   -1   1    1
-1    1   1    1
-1   -1   1   -1

The initial weights are taken to be w1 = w2 = b = 0.1.
BACK PROPAGATION NETWORK
[Figure: architecture of a back-propagation network. Input layer X1 ... Xi ... Xn, hidden layer Z1 ... Zj ... Zp (biases v0j, input-to-hidden weights vij), output layer Y1 ... Yk ... Ym (biases w0k, hidden-to-output weights wjk), with targets t1, t2, ..., tm and bias inputs fixed at 1.]
FLOWCHART DESCRIPTION FOR TRAINING PROCESS
• Updation of weights
• Output of each output unit: y_k = f((y_in)_k),  k = 1, 2, ..., m
TRAINING ALGORITHM (BACK PROPAGATION OF ERROR)
Back-propagation of error (Phase II)
• STEP 6: Each output unit y_k, k = 1, 2, ..., m receives a target pattern corresponding to the input training pattern and computes the error-correction term:
  δ_k = (t_k - y_k) f'((y_in)_k)
  Note: for the binary sigmoid, f'(y_in) = f(y_in)[1 - f(y_in)]
• STEP 7: Each hidden unit z_j, j = 1, 2, ..., p sums its delta inputs from the output units:
  δ_inj = Σ (k = 1 to m) δ_k w_jk
• The term δ_inj is multiplied by the derivative of f((z_in)_j) to calculate the error term:
  δ_j = δ_inj f'((z_in)_j)
Update the changes in weight and bias:
  Δv_ij = α δ_j x_i
  Δv_0j = α δ_j
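A compact Python sketch of Steps 6-7 and the weight/bias changes, assuming one hidden layer and the binary sigmoid (so f' = f(1 - f)). The hidden-to-output changes Δw_jk = α δ_k z_j and Δw_0k = α δ_k, the standard counterparts of the Δv equations above, are included for completeness; all names are illustrative.

```python
# Illustrative back-propagation error pass for a single hidden layer.
def backprop_deltas(x, z, y, t, w, alpha):
    """x: inputs, z: hidden outputs, y: output-unit outputs, t: targets,
    w[j][k]: hidden-to-output weights. Returns the weight and bias changes."""
    # Step 6: error terms of the output units (binary sigmoid derivative y(1 - y))
    delta_k = [(t[k] - y[k]) * y[k] * (1 - y[k]) for k in range(len(y))]
    # Step 7: error terms of the hidden units
    delta_in = [sum(delta_k[k] * w[j][k] for k in range(len(y))) for j in range(len(z))]
    delta_j = [delta_in[j] * z[j] * (1 - z[j]) for j in range(len(z))]
    # Weight and bias changes (hidden-to-output, then input-to-hidden)
    dw = [[alpha * delta_k[k] * z[j] for k in range(len(y))] for j in range(len(z))]
    dw0 = [alpha * delta_k[k] for k in range(len(y))]
    dv = [[alpha * delta_j[j] * x[i] for j in range(len(z))] for i in range(len(x))]
    dv0 = [alpha * delta_j[j] for j in range(len(z))]
    return dw, dw0, dv, dv0
```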
BPN TRAINING ALGORITHM
• STEP 0: Initialize the weights. (The weights are taken from the training phase.)
• STEP 1: Perform steps 2 - 4 for each input vector.
• STEP 2: Set the activation of the input units x_i, i = 1, 2, ..., n.
• STEP 3: Calculate the net input to each hidden unit z_j and its output:
  (z_in)_j = v_0j + Σ (i = 1 to n) x_i v_ij   and   z_j = f((z_in)_j)
• STEP 4: Compute the net input and the output of each unit in the output layer:
  (y_in)_k = w_0k + Σ (j = 1 to p) z_j w_jk   and   y_k = f((y_in)_k)
• USE SIGMOIDAL FUNCTIONS AS ACTIVATION FUNCTIONS
EXAMPLE-1
• Using the Back propagation network, find the new weights
for the net shown below. It is presented with the input
pattern [0,1] and the target output 1. Use a learning rate of
0.25 and binary sigmoidal activation function.
[Figure: 2-2-1 back-propagation network with input units X1, X2, hidden units Z1, Z2 and output unit Y; the initial weights and biases on the connections are those listed below.]
COMPUTATIONS
• The initial weights are:
  v11 = 0.6,  v21 = -0.1,  v01 = 0.3
  v12 = -0.3, v22 = 0.4,   v02 = 0.5
  w1 = 0.4,   w2 = 0.1,    w0 = -0.2
• The learning rate: α = 0.25.
• The activation function is the binary sigmoid, i.e. f(x) = 1 / (1 + e^(-x))
Phase I: The feed-forward of the input training pattern
Calculate the net input for the hidden layer:
• For z1 neuron:
  (z_in)_1 = v01 + v11·x1 + v21·x2 = 0.3 + 0·(0.6) + 1·(-0.1) = 0.2
• For z2 neuron:
  (z_in)_2 = v02 + v12·x1 + v22·x2 = 0.5 + 0·(-0.3) + 1·(0.4) = 0.9
• Outputs:
  z1 = f((z_in)_1) = 1 / (1 + e^(-0.2)) = 0.5498
  z2 = f((z_in)_2) = 1 / (1 + e^(-0.9)) = 0.7109
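A quick numerical check of the feed-forward values above, under the binary sigmoid and input [0, 1]; the variable names are mine, the weights are the ones listed in the example.

```python
import math

# Verify the hidden-layer values and carry the forward pass to the output unit.
f = lambda x: 1.0 / (1.0 + math.exp(-x))   # binary sigmoid
x1, x2 = 0, 1
z1_in = 0.3 + 0.6 * x1 + (-0.1) * x2       # = 0.2
z2_in = 0.5 + (-0.3) * x1 + 0.4 * x2       # = 0.9
z1, z2 = f(z1_in), f(z2_in)                # ~ 0.5498 and 0.7109
y_in = -0.2 + 0.4 * z1 + 0.1 * z2          # net input to the output unit Y
print(round(z1, 4), round(z2, 4), round(f(y_in), 4))
```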
COMPUTATIONS CONTD…
• Initial weights
• Learning rate
• Updation rule
• The initial weights are chosen randomly in the range (-3/oi, 3/oi), where oi is the number of inputs to the unit; values outside this range can lead to overshooting.
• A solution to this problem is to monitor the error on the test set and terminate the training when the error increases.
COMPUTATIONS
• Here, the activation function is the bipolar sigmoidal activation function, that is
  f(x) = (1 - e^(-x)) / (1 + e^(-x))
• The weights are:
  v11 = 0.6,  v21 = -0.1,  v01 = 0.3
  v12 = -0.3, v22 = 0.4,   v02 = 0.5
  w1 = 0.4,   w2 = 0.1,    w0 = -0.2
• The input vector is [-1, 1] and the target is t = 1.
• Learning rate α = 0.25.
COMPUTATIONS
• The net inputs:
• For z1 neuron:
  (z_in)_1 = v01 + v11·x1 + v21·x2 = 0.3 + (-1)·(0.6) + 1·(-0.1) = -0.4
• For z2 neuron:
  (z_in)_2 = v02 + v12·x1 + v22·x2 = 0.5 + (-1)·(-0.3) + 1·(0.4) = 1.2
• Outputs:
  z1 = f((z_in)_1) = (1 - e^(0.4)) / (1 + e^(0.4)) = -0.1974
  z2 = f((z_in)_2) = (1 - e^(-1.2)) / (1 + e^(-1.2)) = 0.537
COMPUTATIONS
• For the output layer:
• Net input:
  y_in = w0 + w1·z1 + w2·z2 = -0.2 + 0.4·(-0.1974) + 0.1·(0.537) = -0.22526
• Output:
  y = f(y_in) = (1 - e^(0.22526)) / (1 + e^(0.22526)) = -0.1122
COMPUTATIONS
• Error at the output neuron:
• We use the gradient-descent formula: δ_k = (t_k - y_k) f'((y_in)_k)
• For the bipolar sigmoid,
  f'(y_in) = 0.5[1 + f(y_in)][1 - f(y_in)] = 0.5[1 + (-0.1122)][1 - (-0.1122)] = 0.4937
  so δ_1 = (t_1 - y_1) f'((y_in)_1) = (1 + 0.1122)(0.4937) = 0.5491
• For the hidden units,
  δ_inj = Σ (k = 1 to m) δ_k w_jk
COMPUTATIONS
• Here, m = 1 (a single output neuron), so δ_inj = δ_1·w_j
• Hence,
  δ_in1 = δ_1·w_1 = 0.5491 × 0.4 = 0.21964
  δ_in2 = δ_1·w_2 = 0.5491 × 0.1 = 0.05491
• Now,
  f'((z_in)_1) = 0.5[1 + f((z_in)_1)][1 - f((z_in)_1)] = 0.5[1 + (-0.1974)][1 - (-0.1974)] = 0.4805
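The following sketch reproduces the numbers of this bipolar-sigmoid example (input [-1, 1], target 1, α = 0.25); the variable names are illustrative, not from the slides.

```python
import math

# Reproduce the worked bipolar-sigmoid example above.
f  = lambda x: (1 - math.exp(-x)) / (1 + math.exp(-x))   # bipolar sigmoid
df = lambda fx: 0.5 * (1 + fx) * (1 - fx)                # derivative, given f(x)

x1, x2, t, alpha = -1, 1, 1, 0.25
z1 = f(0.3 + 0.6 * x1 + (-0.1) * x2)       # f(-0.4)     = -0.1974
z2 = f(0.5 + (-0.3) * x1 + 0.4 * x2)       # f(1.2)      =  0.5370
y  = f(-0.2 + 0.4 * z1 + 0.1 * z2)         # f(-0.22526) = -0.1122

delta1 = (t - y) * df(y)                   # output error term ~ 0.5491
d_in1, d_in2 = delta1 * 0.4, delta1 * 0.1  # delta_in for z1 and z2
dj1, dj2 = d_in1 * df(z1), d_in2 * df(z2)  # hidden error terms
dw1, dw2, dw0 = alpha * delta1 * z1, alpha * delta1 * z2, alpha * delta1
print(round(delta1, 4), round(d_in1, 5), round(d_in2, 5))
```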
where x̂_ji is the center of the RBF unit for the input variables, σ_i is the width of the ith RBF unit, and x_ji is the jth variable of the input pattern.
Step 7: Calculate the output of the neural network.