Lecture 1: Introduction to ANN
Neural Networks
Content
Neuron Models
ANN structures
Learning
Distributed Representations
Conclusions
Introduction to Artificial
Neural Networks
Fundamental
Concepts of ANNs
Applications
Pattern Matching
Pattern Recognition
Associative Memory (Content-Addressable Memory)
Function Approximation
Learning
Optimization
Vector Quantization
Data Clustering
...
Collective behavior.
[Figure: an artificial neuron i with inputs x_1, …, x_m, weights w_i1, …, w_im, integration function f(·), activation function a(·), and output y_i.]

y_i(t+1) = a(f_i)

f_i = Σ_{j=1}^{m} w_ij x_j − θ_i

a(f) = 1 if f ≥ 0; 0 otherwise
The Artificial Neuron: a positive weight w_ij is excitatory, a negative weight is inhibitory, and a zero weight means no connection.
Proposed by McCulloch and Pitts [1943]: the M-P neuron.
A hard limiter; a binary threshold unit; performs hyperspace separation.
f_i = w_1 x_1 + w_2 x_2 − θ

y = 1 if f_i ≥ 0; 0 otherwise

[Figure: the line w_1 x_1 + w_2 x_2 = θ divides the (x_1, x_2) plane into two half-planes, one per output value.]
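As a concrete illustration, here is a minimal sketch of an M-P neuron in Python; the function name and the AND-gate weights are illustrative choices, not from the lecture:

```python
# Minimal sketch of a McCulloch-Pitts neuron: a binary threshold unit.
def mp_neuron(x, w, theta):
    """Fire (output 1) iff the weighted input sum reaches the threshold."""
    f = sum(wj * xj for wj, xj in zip(w, x)) - theta
    return 1 if f >= 0 else 0

# Example: weights (1, 1) and theta = 2 realize a 2-input AND gate.
print(mp_neuron([1, 1], [1, 1], 2))  # 1
print(mp_neuron([1, 0], [1, 1], 2))  # 0
```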
Introduction to Artificial
Neural Networks
Basic Models and Learning Rules
Neuron Models
ANN structures
Learning
Processing Elements
Extensions of M-P neurons:
What integration functions may we have?
What activation functions may we have?
Integration Functions

Quadratic function:
f_i = Σ_{j=1}^{m} w_ij x_j² − θ_i

Spherical function:
f_i = Σ_{j=1}^{m} (x_j − w_ij)² − θ_i

Polynomial function:
f_i = Σ_{j=1}^{m} Σ_{k=1}^{m} (w_ijk x_j x_k + x_j + x_k) − θ_i
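The three integration functions can be sketched directly; this is an illustration with our own variable names, and the polynomial weights w_ijk are passed as an m-by-m matrix for one neuron i:

```python
# Sketches of the alternative integration functions (one neuron i).
def quadratic(x, w, theta):
    # f_i = sum_j w_ij * x_j^2 - theta_i
    return sum(wj * xj ** 2 for wj, xj in zip(w, x)) - theta

def spherical(x, w, theta):
    # f_i = sum_j (x_j - w_ij)^2 - theta_i
    return sum((xj - wj) ** 2 for wj, xj in zip(w, x)) - theta

def polynomial(x, w, theta):
    # f_i = sum_j sum_k (w_ijk * x_j * x_k + x_j + x_k) - theta_i
    m = len(x)
    f = 0.0
    for j in range(m):
        for k in range(m):
            f += w[j][k] * x[j] * x[k] + x[j] + x[k]
    return f - theta
```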
Activation Functions

M-P neuron (step function):
a(f) = 1 if f ≥ 0; 0 otherwise

[Plot: a step from 0 to 1 at f = 0.]
Activation Functions

Hard limiter (threshold function):
a(f) = sgn(f) = +1 if f ≥ 0; −1 if f < 0

[Plot: a step from −1 to +1 at f = 0.]
Activation Functions

Ramp function:
a(f) = 1 if f ≥ 1; f if 0 ≤ f < 1; 0 if f < 0

[Plot: a ramp rising from 0 to 1 over 0 ≤ f ≤ 1.]
Activation Functions

Unipolar sigmoid function:
a(f) = 1 / (1 + e^(−λf))

[Plot: an S-shaped curve rising from 0 to 1, centered at f = 0.]
Activation Functions

Bipolar sigmoid function:
a(f) = 2 / (1 + e^(−λf)) − 1

[Plot: an S-shaped curve rising from −1 to +1, centered at f = 0.]
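The five activation functions above can be sketched in a few lines; lambda_ is the steepness parameter of the sigmoids, and the default value here is an illustrative choice:

```python
import math

# Sketches of the activation functions discussed above.
def step(f):
    return 1 if f >= 0 else 0

def hard_limiter(f):
    return 1 if f >= 0 else -1

def ramp(f):
    return 1.0 if f >= 1 else (f if f >= 0 else 0.0)

def unipolar_sigmoid(f, lambda_=1.0):
    return 1.0 / (1.0 + math.exp(-lambda_ * f))

def bipolar_sigmoid(f, lambda_=1.0):
    return 2.0 / (1.0 + math.exp(-lambda_ * f)) - 1.0
```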
Example: hyperspace separation with M-P neurons

Three M-P neurons L1, L2, L3 realize the decision lines x − 1 = 0, y − 1 = 0, and x + y − 4 = 0 in the (x, y) plane:

L1: weights (1, 0), θ1 = 1 (fires when x ≥ 1)
L2: weights (0, 1), θ2 = 1 (fires when y ≥ 1)
L3: weights (−1, −1), θ3 = −4 (fires when x + y ≤ 4)

The three outputs code each region of the plane: 111 inside the triangle, and 011, 001, 101, 110, 100 in the outer regions.

A fourth neuron L4 with weights (1, 1, 1) and θ4 = 2.5 combines the three outputs:

z = a(y1 + y2 + y3 − 2.5)

With the step activation a(f) = 1 if f ≥ 0, 0 otherwise, z = 1 exactly inside the triangle (code 111) and z = 0 everywhere else.

Replacing the step by the unipolar sigmoid a(f) = 1/(1 + e^(−λf)) with λ = 3, 5, 10 smooths the decision boundary; as λ grows, the output approaches the hard step decision.
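The two-layer network of this example can be sketched directly, using the weights and thresholds as reconstructed above:

```python
# Sketch of the two-layer M-P network from the example: three "line"
# neurons L1-L3 plus a combining neuron L4.
def step(f):
    return 1 if f >= 0 else 0

def neuron(x, w, theta):
    return step(sum(wj * xj for wj, xj in zip(w, x)) - theta)

def z(x, y):
    y1 = neuron([x, y], [1, 0], 1)      # L1 fires when x >= 1
    y2 = neuron([x, y], [0, 1], 1)      # L2 fires when y >= 1
    y3 = neuron([x, y], [-1, -1], -4)   # L3 fires when x + y <= 4
    return neuron([y1, y2, y3], [1, 1, 1], 2.5)  # L4: all three fire

print(z(1.5, 1.5))  # inside the triangle: 1
print(z(0.0, 0.0))  # outside: 0
```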
Introduction to Artificial
Neural Networks
Basic Models and Learning Rules
Neuron Models
ANN structures
Learning
[Figure: a single-layer feedforward network; inputs x_1, …, x_m are fully connected to outputs y_1, …, y_n through weights w_11, …, w_nm.]

[Figure: a multilayer feedforward network with an input layer x_1, …, x_m, a hidden layer, and an output layer y_1, …, y_n.]
Feedforward networks: signals flow from input to output only; used for pattern recognition, learning, classification, and analysis.

Feedback (recurrent) networks: a feedback loop returns outputs to the inputs.

[Figure: a feedback network with outputs y_1, …, y_n fed back toward inputs x_1, …, x_m; a single-layer recurrent network over units x_1, x_2, x_3.]
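A single forward pass through the single-layer network above can be sketched as follows; the weight matrix and thresholds are illustrative values:

```python
# Sketch of one forward pass through a single-layer feedforward network:
# y_i = a(sum_j w_ij x_j - theta_i), one row of W per output neuron.
def step(f):
    return 1 if f >= 0 else 0

def forward(W, theta, x):
    return [step(sum(wij * xj for wij, xj in zip(row, x)) - th)
            for row, th in zip(W, theta)]

# Illustrative 2-input, 2-output network.
W = [[1.0, -1.0],
     [0.5,  0.5]]
theta = [0.0, 1.0]
print(forward(W, theta, [1.0, 0.0]))  # [1, 0]
```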
Introduction to Artificial
Neural Networks
Basic Models and Learning Rules
Neuron Models
ANN structures
Learning
Learning

Consider an ANN with n neurons, each with m adaptive weights. To learn is to adapt the weight matrix:

W = [w_1^T]   [w_11  w_12  …  w_1m]
    [w_2^T] = [w_21  w_22  …  w_2m]
    [  ⋮  ]   [  ⋮    ⋮        ⋮ ]
    [w_n^T]   [w_n1  w_n2  …  w_nm]

How?
Learning Rules
Supervised learning
Reinforcement learning
Unsupervised learning
Supervised Learning

Learning with a teacher; learning by examples.

Training set:
T = {(x^(1), d^(1)), (x^(2), d^(2)), …, (x^(k), d^(k)), …}
Supervised Learning
ANN
W
Error
signal
Generator
Reinforcement Learning

Learning with a critic; learning by comments.

[Figure: the ANN with weights W; a critic-signal generator produces a reinforcement signal that is fed back to adjust W.]
Unsupervised Learning
Self-organizing
Clustering
Unsupervised Learning

[Figure: the ANN with weights W adapts from the inputs alone; no teacher or critic signal is fed back.]
[Figure: neuron i with inputs x_1, …, x_{m−1}, weights w_i1, …, w_i,m−1, threshold θ_i, and output y_i.]

net_i = Σ_{j=1}^{m−1} w_ij x_j − θ_i

Output: y_i = a(net_i)

Folding the threshold into the weights, with w_im = θ_i and a constant input x_m = −1:

net_i = Σ_{j=1}^{m} w_ij x_j

We want to learn the weights w_i (the threshold θ_i included): Δw_i(t) = ?
[Figure: a learning-signal generator takes the input x, the weight vector w_i, and, when available, the desired output d_i, and produces the learning signal r.]

The general weight learning rule:

Δw_i(t) = η r x(t)

where r = f_r(w_i, x, d_i) is the learning signal and η is the learning rate.

In discrete time:

w_i^(t+1) = w_i^(t) + η f_r(w_i^(t), x^(t), d_i^(t)) x^(t)

In continuous time:

dw_i(t)/dt = η r x(t)

Hebbian learning is the special case r = y_i:

Δw_i(t) = η y_i x,  i.e.  Δw_ij = η y_i x_j
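A minimal sketch of the general rule Δw = η r x, specialized to the Hebbian case r = y_i; the names and the numeric values are illustrative, and the threshold is folded into the weight vector as above:

```python
# One Hebbian update step: w_ij <- w_ij + eta * y_i * x_j.
def step(f):
    return 1 if f >= 0 else 0

def hebbian_update(w, x, eta):
    """Compute y_i = a(net_i), then move w toward eta * y_i * x."""
    y = step(sum(wj * xj for wj, xj in zip(w, x)))
    return [wj + eta * y * xj for wj, xj in zip(w, x)], y

w = [0.2, -0.1, 0.0]          # w[-1] plays the role of theta_i, with x[-1] = -1
x = [1.0, 0.5, -1.0]
w, y = hebbian_update(w, x, eta=0.1)
print(y, w)
```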
Introduction to Artificial
Neural Networks
Distributed Representations

Distributed representation: each concept is represented by a pattern of activity over many units, and each unit takes part in representing many concepts.
Local representation: each concept is represented by a single dedicated unit.
Example

        P0 P1 P2 P3 P4 P5 P6 P7 P8 P9 P10 P11 P12 P13 P14 P15
Dog     +  -  +  +  -  -  -  -  +  +  +   +   +   -   -   -
Cat     +  -  +  +  -  -  -  -  +  -  +   -   +   +   -   +
Bread   +  +  -  +  -  +  +  -  +  -  -   +   +   +   +   -
Advantages

Pattern completion: presented with only a few active units of a stored pattern ("What is this?"), the network can fill in the rest by settling to the nearest stored pattern.
Advantages

Generalization: a new pattern, Fido,

        P0 P1 P2 P3 P4 P5 P6 P7 P8 P9 P10 P11 P12 P13 P14 P15
Fido    +  -  -  +  -  -  -  -  +  +  +   +   +   +   -   -

differs from Dog in only two units (P2 and P13), so the network treats Fido much as it treats Dog.
Advantages

Similar concepts get similar codes: a new pattern, Doughnut,

         P0 P1 P2 P3 P4 P5 P6 P7 P8 P9 P10 P11 P12 P13 P14 P15
Doughnut +  +  -  -  -  +  +  -  +  -  -   -   +   +   +   -

overlaps heavily with Bread, so what the network knows about bread carries over to doughnuts.
Disadvantages

Hard to understand: what do the individual units stand for?
Hard to modify: changing one concept's code disturbs the others.
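Pattern completion over the Dog/Cat/Bread codes can be sketched as a nearest-pattern lookup under Hamming distance; the helper names are ours, and '+' is encoded as 1, '-' as 0:

```python
# Sketch of pattern completion by nearest stored pattern (Hamming distance),
# using the Dog/Cat/Bread codes from the table above.
def decode(s):
    return [1 if c == '+' else 0 for c in s.split()]

stored = {
    "Dog":   decode("+ - + + - - - - + + + + + - - -"),
    "Cat":   decode("+ - + + - - - - + - + - + + - +"),
    "Bread": decode("+ + - + - + + - + - - + + + + -"),
}

def nearest(pattern):
    """Concept whose stored code differs from the pattern in fewest units."""
    return min(stored, key=lambda name:
               sum(a != b for a, b in zip(stored[name], pattern)))

fido = decode("+ - - + - - - - + + + + + + - -")
print(nearest(fido))  # prints "Dog": Fido's code is closest to Dog's
```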