Part B
Notation: xi denotes the inputs, yj the outputs, wij the weight from input i to output j (W is the weight matrix and wj the weight vector of unit j), bj the bias of unit j, sj the net input to unit j, uj the threshold, and x the input vector.
Typically the same activation function is used for all neurons in any
particular layer (this is not a requirement).
In a multi-layer network, if the neurons have linear activation functions, the capabilities are no better than those of a single-layer network with a linear activation function (a quick demonstration is sketched below).
Hence in most cases nonlinear activation functions are used.
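A quick way to see this: two linear layers compose into a single linear map, so stacking them adds no representational power. A minimal NumPy sketch (the weight values are made up):

    import numpy as np

    # Two layers with linear (identity) activations; the numbers are illustrative only.
    W1 = np.array([[1.0, 2.0], [0.5, -1.0]])   # first-layer weights
    W2 = np.array([[2.0, 0.0], [1.0, 1.0]])    # second-layer weights
    x = np.array([0.3, -0.7])                  # an arbitrary input vector

    two_layer = W2 @ (W1 @ x)                  # output of the two-layer linear network
    one_layer = (W2 @ W1) @ x                  # a single layer with weights W2 @ W1
    print(np.allclose(two_layer, one_layer))   # True: the two networks compute the same map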
Linear Activation Function
f(x) = x, for all x.
Binary Sigmoid Activation Function
f(x) = 1 / (1 + exp(-x)),
with outputs saturating at 0 and 1.
Bipolar Sigmoid Activation Function
g(x) = (1 - exp(-x)) / (1 + exp(-x)),
with derivative
g'(x) = (1/2)[1 + g(x)][1 - g(x)].
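As a sanity check on these formulas, a small NumPy sketch of both sigmoids and the bipolar derivative identity (the grid of test points is arbitrary):

    import numpy as np

    def binary_sigmoid(x):
        # f(x) = 1 / (1 + exp(-x)); outputs lie between 0 and 1
        return 1.0 / (1.0 + np.exp(-x))

    def bipolar_sigmoid(x):
        # g(x) = (1 - exp(-x)) / (1 + exp(-x)); outputs lie between -1 and +1
        return (1.0 - np.exp(-x)) / (1.0 + np.exp(-x))

    x = np.linspace(-5.0, 5.0, 11)
    g = bipolar_sigmoid(x)
    h = 1e-6
    numeric = (bipolar_sigmoid(x + h) - bipolar_sigmoid(x - h)) / (2 * h)   # central difference
    print(np.allclose(numeric, 0.5 * (1 + g) * (1 - g)))   # True: matches the derivative formula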
Matrix Manipulations
[Figure: single output unit Yj with inputs X1, ..., Xi, ..., Xn, weights w1j, ..., wij, ..., wnj, and a bias bj fed by a constant input of 1.]
The net input to output unit Yj is
    sj = wj x^T = Σ_{i=1}^{n} xi wij.
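The same computation can be done for all output units at once with a single matrix product. A minimal sketch, assuming an n-input, m-output layer with weight matrix W (one column per output unit) and a bias vector b as in the figure; the numbers are made up:

    import numpy as np

    x = np.array([1.0, 0.0, 1.0])            # input vector, n = 3
    W = np.array([[ 0.2, -0.5],
                  [ 0.7,  0.1],
                  [-0.3,  0.4]])             # weights w_ij, one column per output unit (m = 2)
    b = np.array([0.1, -0.2])                # biases b_j

    s = x @ W + b                            # s_j = b_j + sum_i x_i w_ij for every output unit j
    print(s)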
McCulloch-Pitts Neuron
Characteristics
McCulloch-Pitts Neuron
Architecture
[Figure: MCP neuron Y with excitatory inputs X1, ..., Xn of weight w and inhibitory inputs Xn+1, ..., Xn+m of weight -p.]
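A minimal sketch of the usual MCP firing rule, in which any active inhibitory input blocks the neuron and the excitatory inputs must reach the threshold; the function name and default values are illustrative:

    def mcp_neuron(excitatory, inhibitory, w=1, threshold=2):
        # Absolute inhibition: any active inhibitory input forces the output to 0.
        if any(inhibitory):
            return 0
        # Otherwise fire iff the weighted sum of excitatory inputs reaches the threshold.
        return 1 if w * sum(excitatory) >= threshold else 0

    # With w = 1 and threshold = 2 this behaves like AND of two excitatory inputs.
    print(mcp_neuron([1, 1], []), mcp_neuron([1, 0], []), mcp_neuron([1, 1], [1]))   # 1 0 0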
[Figure: MCP neuron Y with inputs X1 and X2 and connection weight 1.]
[Figure: MCP neuron Y with inputs X1 and X2 and connection weight 2.]
[Figure: MCP neuron Y with inputs X1 and X2 and connection weight 2.]
[Figure: two-layer MCP network with inputs X1 and X2, hidden units Z1 and Z2, and output unit Y.]
Example 3.3.5
Note that the threshold function can be converted to the discrete activation function by rewriting the firing condition s > u as s - u > 0. By doing this we convert a threshold into a bias.
We have to mark the region in which the output should turn out to be 1. By looking at the vertices, we determine that it is x1 > 0, x1 < 1, x2 > 0 and x2 < 1.
In order to obtain the required output, the threshold function has to satisfy s - u > 0, that is, w1 x1 + w2 x2 - u > 0. Now compare the above two items to obtain the required w1, w2 and u (a sketch of this check follows the list).
For Exercise 3.3.2, too, we have a region bounded by a triangle in which the same computation can be carried out.
Even when building a two-layer network, the arguments used for MCP networks can still be applied, for example to the question on p. 33.
All of these techniques produce fixed-weight ANNs.
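A small sketch of the s - u > 0 check for a single threshold unit; the weights and threshold below are placeholders, not the values asked for in the example:

    def fires(x1, x2, w1, w2, u):
        # Output 1 when the net input exceeds the threshold: w1*x1 + w2*x2 - u > 0.
        return 1 if w1 * x1 + w2 * x2 - u > 0 else 0

    # Check a few points against an illustrative choice of weights and threshold.
    for x1, x2 in [(0.5, 0.5), (2.0, 2.0), (-0.5, 0.5)]:
        print((x1, x2), fires(x1, x2, w1=1, w2=1, u=1.5))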
With a bias the net input is s = b + Σ_i xi wi; without a bias it is s = Σ_i xi wi.
Essentially the two forms are equivalent in most cases. However, the bias becomes essential for certain problems associated with linear separability.
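To see why, note that without a bias the boundary Σ_i xi wi = 0 always passes through the origin, whereas a bias can shift it. A minimal sketch with made-up weights:

    import numpy as np

    w = np.array([1.0, 1.0])                 # illustrative weights

    def net_input(x, b=0.0):
        # s = b + sum_i x_i w_i  (b = 0 reproduces the no-bias form)
        return b + x @ w

    origin = np.zeros(2)
    print(net_input(origin))                 # 0.0: without a bias the boundary contains the origin
    print(net_input(origin, b=-1.5))         # -1.5: a bias shifts the boundary away from the origin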
[Figure: single neuron with inputs x1 and x2, weights w1 and w2, and bias b.]
For bipolar signals the outputs for the two classes are -1 and +1; for unipolar signals they are 0 and 1.
Depending on the number of inputs, the decision boundary can be a line, a plane or a hyperplane. For example, for two inputs it is a line and for three inputs it is a plane.
If all of the training input vectors for which the correct response is +1 lie on one side of the decision boundary and all those for which the correct response is -1 lie on the other side, we say that the problem is linearly separable (a small check is sketched below).
It has been shown that a single-layer network can only learn linearly separable problems.
The trained weights are not unique.
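A small sketch that tests whether a candidate boundary w.x + b = 0 separates a set of labelled points; the points, weights and bias are made up (an AND-like labelling):

    import numpy as np

    points = np.array([[1, 1], [1, 0], [0, 1], [0, 0]])     # training inputs
    targets = np.array([1, -1, -1, -1])                      # bipolar targets

    w = np.array([1.0, 1.0])                                 # candidate boundary w.x + b = 0
    b = -1.5

    outputs = np.where(points @ w + b > 0, 1, -1)
    print(np.array_equal(outputs, targets))                  # True: this boundary separates the classes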
[Figure: decision boundary in the (x1, x2) plane.]
Importance of Bias
[Figure: decision boundary in the (x1, x2) plane.]
Hebb Algorithm
Step 0 Initialize all weights: wi = 0, i = 1, . . . , n.
Step 1 For each input training vector and target output pair, s : t,
do steps 2-4.
Step 2 Set activations for input units: xi = si , i = 1, . . . , n.
Step 3 Set activation for output unit: y = t.
Step 4 Adjust the weight and bias:
wi (new) = wi (old) + xi y for i = 1, . . . , n
b(new) = b(old) + y .
If the bias is considered to be an input signal that is always 1, the weight change can be written as
w(new) = w(old) + Δw,
where Δw = xy.
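A minimal sketch of these steps in Python with NumPy; the function name hebb_train is mine:

    import numpy as np

    def hebb_train(inputs, targets):
        w = np.zeros(inputs.shape[1])        # Step 0: zero weights
        b = 0.0                              #         and zero bias
        for x, t in zip(inputs, targets):    # Step 1: loop over training pairs s : t
            y = t                            # Steps 2-3: input activations x, output activation y = t
            w = w + x * y                    # Step 4: w_i(new) = w_i(old) + x_i * y
            b = b + y                        #         b(new)   = b(old)   + y
        return w, b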
The truth table for the AND function (binary inputs with a bias input of 1, binary targets) is

    Input (x1, x2, 1)    Target
    (1, 1, 1)            1
    (1, 0, 1)            0
    (0, 1, 1)            0
    (0, 0, 1)            0

The initial values are w(old) = (0, 0) and b(old) = 0.
The first step of the algorithm is

    Input (x1, x2, 1)    Target    Weight Changes (Δw1, Δw2, Δb)    Weights (w1, w2, b)
                                                                    (0, 0, 0)
    (1, 1, 1)            1         (1, 1, 1)                        (1, 1, 1)

The separating line after the first step is
    x2 = -x1 - 1.
Now if we present the second, third and fourth training vectors the
weight change is given by
    Input (x1, x2, 1)    Target    Weight Changes (Δw1, Δw2, Δb)    Weights (w1, w2, b)
                                                                    (1, 1, 1)
    (1, 0, 1)            0         (0, 0, 0)                        (1, 1, 1)
    (0, 1, 1)            0         (0, 0, 0)                        (1, 1, 1)
    (0, 0, 1)            0         (0, 0, 0)                        (1, 1, 1)
We note that in the above problem, when the target value is zero no learning occurs and hence there is no weight change.
From this we can see that the binary targets have resulted in a shortcoming of the learning method.
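Running the hebb_train sketch from the algorithm slide on the AND data with binary targets and then with bipolar targets shows this directly:

    import numpy as np

    binary_x = np.array([[1, 1], [1, 0], [0, 1], [0, 0]])
    print(hebb_train(binary_x, np.array([1, 0, 0, 0])))      # binary targets: only (1, 1) updates the weights
    print(hebb_train(binary_x, np.array([1, -1, -1, -1])))   # bipolar targets: every pattern updates the weights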
The truth table in this case (binary inputs, bipolar targets) is

    Input (x1, x2, 1)    Target
    (1, 1, 1)            +1
    (1, 0, 1)            -1
    (0, 1, 1)            -1
    (0, 0, 1)            -1

Presenting the first input gives

    Input (x1, x2, 1)    Target    Weight Changes (Δw1, Δw2, Δb)    Weights (w1, w2, b)
                                                                    (0, 0, 0)
    (1, 1, 1)            +1        (1, 1, 1)                        (1, 1, 1)

The separating line becomes x2 = -x1 - 1. This is the correct classification for the first input.
[Figure: decision boundary in the (x1, x2) plane.]
[Figure: Decision boundary for the bipolar AND function using the Hebb rule after the third/fourth training pattern.]
Check the system with patterns that are similar but not identical to the training patterns, to see whether it still responds with the correct classification (a sketch of such a check is given below).
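One way to carry out this check, assuming trained weights of w = (2, 2) and b = -2 (what the Hebb updates give for the fully bipolar AND data) and using small random perturbations of the training inputs as the similar patterns:

    import numpy as np

    w, b = np.array([2.0, 2.0]), -2.0        # assumed trained weights for the bipolar AND example

    def classify(x):
        return 1 if x @ w + b > 0 else -1

    rng = np.random.default_rng(0)
    for pattern, target in [((1, 1), 1), ((1, -1), -1), ((-1, 1), -1), ((-1, -1), -1)]:
        noisy = np.array(pattern, dtype=float) + rng.normal(scale=0.2, size=2)   # similar, not identical
        print(noisy.round(2), classify(noisy), target)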
Additional Comments