Linear Separability

Biological Neural Systems

• Neuron switching time: > 10^-3 seconds
• Number of neurons in the human brain: ~10^10
• Connections (synapses) per neuron: ~10^4–10^5
• Face recognition: 0.1 seconds
• High degree of distributed and parallel computation
• Highly fault tolerant
• Highly efficient
• Learning is key

(Excerpt from Russell and Norvig's book)
A Neuron

[Diagram: input links carry activations a_k over weights W_kj into unit j, which computes in_j and sends its output a_j along its output links]

in_j = Σ_k W_kj · a_k
a_j = output(in_j)

• Computation:
  input signals → input function (linear) → activation function (nonlinear) → output signal
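A minimal sketch (not from the slides; function and variable names are assumptions) of this two-stage computation, using a hard threshold as an example of the nonlinear activation:

```python
def neuron_output(inputs, weights, activation):
    """Linear input function in_j = sum_k W_kj * a_k, then nonlinear activation a_j = g(in_j)."""
    in_j = sum(w * a for w, a in zip(weights, inputs))
    return activation(in_j)

# A hard-threshold activation, assumed here purely for illustration.
step = lambda in_j: 1 if in_j >= 0 else 0
print(neuron_output([1.0, 0.5], [0.8, -0.4], step))  # 0.8*1.0 + (-0.4)*0.5 = 0.6 >= 0, prints 1
```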
Part 1. Perceptrons: A Simple NN

[Diagram: inputs x1 … xn with weights w1 … wn feed a single threshold unit producing output y]

Activation: a = Σ_{i=1..n} w_i x_i
Output:     y = 1 if a ≥ θ,  0 if a < θ
The inputs x_i range over [0, 1].
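A minimal sketch (the function name is an assumption) of the perceptron output rule above, with an explicit threshold θ:

```python
def perceptron_output(x, w, theta):
    """y = 1 if sum_i w_i * x_i >= theta, else 0."""
    a = sum(wi * xi for wi, xi in zip(w, x))
    return 1 if a >= theta else 0

print(perceptron_output([1, 1], [1, 1], 1.5))  # a = 2 >= 1.5, prints 1
print(perceptron_output([1, 0], [1, 1], 1.5))  # a = 1 < 1.5, prints 0
```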
Decision Surface of a Perceptron

[Diagram: points in the (x1, x2) plane labelled 1 and 0, split by the decision line w1·x1 + w2·x2 = θ; inputs on one side of the line produce output 1, inputs on the other side produce output 0]
Linear Separability

[Diagram: for AND, the weights w1 = 1, w2 = 1, θ = 1.5 give a line separating the 1-point from the 0-points; for XOR, w1 = ?, w2 = ?, θ = ? (no single straight line separates the classes)]

Logical AND                 Logical XOR
x1  x2  a   y               x1  x2  y
0   0   0   0               0   0   0
0   1   1   0               0   1   1
1   0   1   0               1   0   1
1   1   2   1               1   1   0
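A quick check of the tables above. This only verifies that the AND weights (w1 = w2 = 1, θ = 1.5) reproduce the AND column and that those same weights fail on XOR; the claim that no weights work for XOR is the geometric point of the slide, not something this sketch proves:

```python
def perceptron_output(x, w, theta):
    a = sum(wi * xi for wi, xi in zip(w, x))
    return 1 if a >= theta else 0

AND = {(0, 0): 0, (0, 1): 0, (1, 0): 0, (1, 1): 1}
XOR = {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 0}

for name, table in [("AND", AND), ("XOR", XOR)]:
    ok = all(perceptron_output(x, [1, 1], 1.5) == t for x, t in table.items())
    print(name, "matched" if ok else "not matched")  # AND matched, XOR not matched
```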
Threshold as Weight: w0

[Diagram: the same perceptron, with an extra input fixed at x0 = -1 whose weight is w0 = θ]

• Treat the threshold as just another weight: w0 = θ, with a constant input x0 = -1
• a = Σ_{i=0..n} w_i x_i
• y = 1 if a ≥ 0,  0 if a < 0
• Thus y = sgn(a) = 0 or 1
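A small sketch (values assumed for illustration) showing that comparing a against θ and the x0 = -1 / w0 = θ trick give the same output:

```python
theta = 1.5
w = [1.0, 1.0]   # w1, w2
x = [1, 1]

# Explicit threshold: y = 1 if sum_i w_i x_i >= theta
a_explicit = sum(wi * xi for wi, xi in zip(w, x))
y_explicit = 1 if a_explicit >= theta else 0

# Threshold as weight: prepend w0 = theta and x0 = -1, then compare against 0
a_bias = sum(wi * xi for wi, xi in zip([theta] + w, [-1] + x))
y_bias = 1 if a_bias >= 0 else 0

print(y_explicit, y_bias)  # both 1: the two formulations agree
```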
Training the Perceptron

• Training set S of examples {x, t}
  - x is an input vector and
  - t the desired target output
• Example: logical AND
  S = {((0,0),0), ((0,1),0), ((1,0),0), ((1,1),1)}
• Iterative process
  - Present a training example x, compute the network output y,
    compare the output y with the target t, adjust the weights and threshold
• Learning rule
  - Specifies how to change the weights w and threshold θ of the
    network as a function of the inputs x, output y and target t.
Perceptron Learning Rule
 w’=w +  (t-y) x
wi := wi + wi = wi +  (t-y) xi (i=1..n)
 The parameter  is called the learning rate.
 In Han’s book it is lower case L

It determines the magnitude of weight updates wi .
 If the output is correct (t=y) the weights are not
changed (wi =0).
 If the output is incorrect (t  y) the weights wi are
changed such that the output of the Perceptron for
the new weights w’i is closer/further to the input xi.
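A minimal sketch (weights, input, and learning rate are assumed values) of one application of the rule w' = w + η (t − y) x:

```python
eta = 0.1            # learning rate (eta)
w = [0.2, -0.3]      # current weights
x = [1, 0]           # input vector
t, y = 1, 0          # target vs. actual output: incorrect, so the weights change

w = [wi + eta * (t - y) * xi for wi, xi in zip(w, x)]
print(w)  # [0.3, -0.3]: only the weight on the active input moves, toward producing 1
```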
Perceptron Training Algorithm

Repeat
    for each training vector pair (x, t)
        evaluate the output y when x is the input
        if y ≠ t then
            form a new weight vector w' according to
            w' = w + η (t − y) x
        else
            do nothing
        end if
    end for
Until y = t for all training vector pairs, or # iterations > k
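A runnable sketch of the loop above on the logical AND training set from the earlier slide, using the threshold-as-weight trick (x0 = -1, w0 = θ). The zero initial weights and the learning rate are assumptions for illustration:

```python
def train_perceptron(samples, eta=0.1, max_iters=100):
    w = [0.0, 0.0, 0.0]                        # [w0 (threshold), w1, w2]
    for _ in range(max_iters):
        all_correct = True
        for x, t in samples:
            xa = [-1] + list(x)                # augment with the fixed input x0 = -1
            a = sum(wi * xi for wi, xi in zip(w, xa))
            y = 1 if a >= 0 else 0
            if y != t:                         # apply w' = w + eta (t - y) x
                w = [wi + eta * (t - y) * xi for wi, xi in zip(w, xa)]
                all_correct = False
        if all_correct:                        # y = t for all training vector pairs
            return w
    return w

AND = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
print(train_perceptron(AND))                   # weights (w0 = theta, w1, w2) realizing AND
```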
Perceptron Convergence Theorem

• The algorithm converges to the correct classification
  - if the training data is linearly separable
  - and the learning rate is sufficiently small
• If two classes of vectors X1 and X2 are linearly separable, applying the
  perceptron training algorithm will eventually produce a weight vector w0
  such that w0 defines a perceptron whose decision hyperplane separates
  X1 and X2 (Rosenblatt, 1962).
• The solution w0 is not unique, since if w0 · x = 0 defines a hyperplane,
  so does w'0 = k · w0.
Experiments
Perceptron Learning from Patterns

[Diagram: an input pattern feeds fixed association units; their outputs x1 … xn are weighted by trained weights w1 … wn, summed, and thresholded to give the output]

Association units (A-units) can be assigned arbitrary Boolean
functions of the input pattern.
Part 2. Multi-Layer Networks

[Diagram: a layered feed-forward network: input vector → input nodes → hidden nodes → output nodes → output vector]
Can use multiple layers to learn nonlinear functions

• How to set the weights? No single unit (w1 = ?, w2 = ?, θ = ?) can separate the
  XOR classes, but a network with a hidden layer can (see the sketch after the
  truth table below).

[Diagram: inputs x1 and x2 feed hidden units 3 and 4 (weights such as w23), which feed output unit 5 (weights such as w35)]

Logical XOR
x1  x2  y
0   0   0
0   1   1
1   0   1
1   1   0
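A hand-built sketch showing the point of this slide: a two-layer network of threshold units can compute XOR. The particular hidden-unit weights below are one hand-chosen solution (an assumption for illustration); the back-propagation algorithm mentioned later is how such weights are learned automatically:

```python
def step(a, theta):
    return 1 if a >= theta else 0

def xor_net(x1, x2):
    h3 = step(1 * x1 + 1 * x2, 0.5)    # hidden unit 3 acts as OR(x1, x2)
    h4 = step(1 * x1 + 1 * x2, 1.5)    # hidden unit 4 acts as AND(x1, x2)
    return step(1 * h3 - 1 * h4, 0.5)  # output unit 5: OR but not AND, i.e. XOR

for x1, x2 in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(x1, x2, xor_net(x1, x2))     # reproduces the XOR truth table above
```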
Examples

• Learn the AND gate?
• Learn the OR gate?
• Learn the NOT gate?
• Is x1 ⊕ x2 (XOR) a linear learning problem?
Learning Multi-Layer Networks

• Known as the back-propagation algorithm
• The learning rule is slightly different
• You can consult the textbook for the algorithm, but we need not worry
  about it in this course.
