Widrow-Hoff Learning
(LMS Algorithm)
1
10 ADALINE Network

[Figure: ADALINE network. Input vector p (R x 1) feeds a layer of S linear neurons with weight matrix W (S x R) and bias vector b (S x 1).]

a = \mathrm{purelin}(Wp + b) = Wp + b

For a single neuron i:

a_i = \mathrm{purelin}(n_i) = \mathrm{purelin}({}_i w^T p + b_i) = {}_i w^T p + b_i,
\qquad {}_i w = \begin{bmatrix} w_{i,1} \\ w_{i,2} \\ \vdots \\ w_{i,R} \end{bmatrix}
2
10 Two-Input ADALINE

[Figure: single two-input ADALINE. Its decision boundary is the line {}_1 w^T p + b = 0, which crosses the p_1 axis at -b / w_{1,1}.]

a = \mathrm{purelin}(n) = \mathrm{purelin}({}_1 w^T p + b) = {}_1 w^T p + b

a = {}_1 w^T p + b = w_{1,1} p_1 + w_{1,2} p_2 + b
3
10 Mean Square Error
Training Set:

\{p_1, t_1\}, \{p_2, t_2\}, \ldots, \{p_Q, t_Q\}

Input: p_q    Target: t_q

Notation (single-neuron case):

x = \begin{bmatrix} {}_1 w \\ b \end{bmatrix}, \qquad
z = \begin{bmatrix} p \\ 1 \end{bmatrix}, \qquad
a = {}_1 w^T p + b = x^T z
4
10 Error Analysis
F(x) = E[e^2] = E[(t - a)^2] = E[(t - x^T z)^2]

F(x) = E[t^2 - 2 t x^T z + x^T z z^T x]

F(x) = E[t^2] - 2 x^T E[t z] + x^T E[z z^T] x

F(x) = c - 2 x^T h + x^T R x

where

c = E[t^2], \qquad h = E[t z], \qquad R = E[z z^T]

This is a quadratic function with d = -2h and A = 2R.
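The identity above can be checked numerically: estimate c, h and R as sample averages over a training set and compare the quadratic form with the directly computed mean square error. A minimal sketch (the training data below is made up for illustration):

```python
import numpy as np

# Columns of Z are the augmented inputs z_q = [p_q; 1]; t holds the targets.
Z = np.array([[1.0, -1.0],
              [2.0, 0.5],
              [1.0, 1.0]])
t = np.array([1.0, -1.0])

Q = Z.shape[1]
c = np.mean(t**2)              # c = E[t^2]
h = (Z * t).sum(axis=1) / Q    # h = E[t z]
R = (Z @ Z.T) / Q              # R = E[z z^T]

x = np.array([0.3, -0.2, 0.1])           # arbitrary weight/bias vector

F_direct = np.mean((t - x @ Z)**2)       # E[(t - x^T z)^2]
F_quad = c - 2 * x @ h + x @ R @ x       # c - 2 x^T h + x^T R x
assert np.isclose(F_direct, F_quad)
```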
5
10 Stationary Point
Hessian Matrix:

\nabla^2 F(x) = A = 2R

\nabla F(x) = \nabla \left( c + d^T x + \frac{1}{2} x^T A x \right) = d + A x = -2h + 2Rx

Setting the gradient to zero:

-2h + 2Rx = 0

If R is positive definite:

x^* = R^{-1} h
6
10 Approximate Steepest Descent
Approximate mean square error (one sample):

\hat{F}(x) = (t(k) - a(k))^2 = e^2(k)

Approximate (stochastic) gradient:

\hat{\nabla} F(x) = \nabla e^2(k)

[\nabla e^2(k)]_j = \frac{\partial e^2(k)}{\partial w_{1,j}} = 2 e(k) \frac{\partial e(k)}{\partial w_{1,j}}, \qquad j = 1, 2, \ldots, R

[\nabla e^2(k)]_{R+1} = \frac{\partial e^2(k)}{\partial b} = 2 e(k) \frac{\partial e(k)}{\partial b}
7
10 Approximate Gradient Calculation
\frac{\partial e(k)}{\partial w_{1,j}} = \frac{\partial [t(k) - a(k)]}{\partial w_{1,j}} = \frac{\partial}{\partial w_{1,j}} \left[ t(k) - ({}_1 w^T p(k) + b) \right]

\frac{\partial e(k)}{\partial w_{1,j}} = \frac{\partial}{\partial w_{1,j}} \left[ t(k) - \left( \sum_{i=1}^{R} w_{1,i} \, p_i(k) + b \right) \right]

\frac{\partial e(k)}{\partial w_{1,j}} = -p_j(k), \qquad \frac{\partial e(k)}{\partial b} = -1

\hat{\nabla} F(x) = \nabla e^2(k) = -2 e(k) z(k)
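The result \nabla e^2(k) = -2 e(k) z(k) can be sanity-checked against finite differences. A minimal sketch with arbitrary made-up values:

```python
import numpy as np

# One arbitrary sample: input p, target t, current weights/bias stacked in x.
p = np.array([0.5, -1.0, 2.0])
t = 0.7
x = np.array([0.1, -0.3, 0.2, 0.05])   # x = [1w; b]
z = np.append(p, 1.0)                  # z = [p; 1]

def sq_err(x):
    e = t - x @ z
    return e * e

e = t - x @ z
analytic = -2.0 * e * z                # -2 e(k) z(k)

# Central finite differences along each coordinate of x.
eps = 1e-6
numeric = np.array([
    (sq_err(x + eps * d) - sq_err(x - eps * d)) / (2 * eps)
    for d in np.eye(len(x))
])
assert np.allclose(analytic, numeric, atol=1e-6)
```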
8
10 LMS Algorithm
x_{k+1} = x_k - \alpha \left. \hat{\nabla} F(x) \right|_{x = x_k}

x_{k+1} = x_k + 2 \alpha e(k) z(k)

{}_1 w(k+1) = {}_1 w(k) + 2 \alpha e(k) p(k)

b(k+1) = b(k) + 2 \alpha e(k)
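The update rule translates directly into code. A minimal single-neuron sketch (the demo target function and learning rate are chosen for illustration only):

```python
import numpy as np

def lms_step(w, b, p, t, alpha):
    """One LMS update for a single linear neuron."""
    a = w @ p + b                  # network output
    e = t - a                      # error e(k) = t(k) - a(k)
    w = w + 2 * alpha * e * p      # 1w(k+1) = 1w(k) + 2 alpha e(k) p(k)
    b = b + 2 * alpha * e          # b(k+1)  = b(k)  + 2 alpha e(k)
    return w, b

# Tiny demo: learn the linear map a = 2*p - 1 from noise-free samples.
rng = np.random.default_rng(0)
w, b = np.zeros(1), 0.0
for _ in range(1000):
    p = rng.uniform(-1, 1, size=1)
    t = 2 * p[0] - 1
    w, b = lms_step(w, b, p, t, alpha=0.1)
assert abs(w[0] - 2) < 1e-2 and abs(b + 1) < 1e-2
```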
9
10 Multiple-Neuron Case
{}_i w(k+1) = {}_i w(k) + 2 \alpha e_i(k) p(k)

b_i(k+1) = b_i(k) + 2 \alpha e_i(k)

Matrix Form:

W(k+1) = W(k) + 2 \alpha \, e(k) \, p^T(k)

b(k+1) = b(k) + 2 \alpha \, e(k)
10
10 Analysis of Convergence
x_{k+1} = x_k + 2 \alpha e(k) z(k)

E[x_{k+1}] = E[x_k] + 2 \alpha E[e(k) z(k)]

E[x_{k+1}] = E[x_k] + 2 \alpha \{ E[t(k) z(k)] - E[(x_k^T z(k)) z(k)] \}

E[x_{k+1}] = E[x_k] + 2 \alpha \{ E[t(k) z(k)] - E[(z(k) z^T(k)) x_k] \}

Assuming x_k is independent of z(k):

E[x_{k+1}] = E[x_k] + 2 \alpha \{ h - R \, E[x_k] \}

E[x_{k+1}] = [I - 2 \alpha R] E[x_k] + 2 \alpha h

Stability requires the eigenvalues of [I - 2 \alpha R] to lie inside the unit circle:

\mathrm{eig}([I - 2 \alpha R]) = 1 - 2 \alpha \lambda_i, \qquad |1 - 2 \alpha \lambda_i| < 1

(where \lambda_i is an eigenvalue of R). Since \lambda_i \ge 0 and \alpha > 0, this reduces to

1 - 2 \alpha \lambda_i > -1 \quad \Rightarrow \quad \alpha < \frac{1}{\lambda_{max}}
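The stability condition can be verified by sweeping the learning rate and checking the eigenvalues of [I - 2 alpha R] directly. A minimal sketch, using the 2x2 correlation matrix from the noise-cancellation example later in this chapter:

```python
import numpy as np

# R from the noise-cancellation example; eigenvalues are 0.36 and 1.08.
R = np.array([[0.72, -0.36],
              [-0.36, 0.72]])
lam_max = np.linalg.eigvalsh(R).max()          # largest eigenvalue of R

# |1 - 2*alpha*lambda_i| < 1 for all i  <=>  alpha < 1/lambda_max
for alpha in (0.2, 0.9, 1.0, 1.5):
    eigs = np.linalg.eigvals(np.eye(2) - 2 * alpha * R)
    stable = bool(np.all(np.abs(eigs) < 1))
    assert stable == (alpha < 1 / lam_max)
```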
12
10 Steady State Response
E[x_{k+1}] = [I - 2 \alpha R] E[x_k] + 2 \alpha h

At steady state:

E[x_{ss}] = [I - 2 \alpha R] E[x_{ss}] + 2 \alpha h

E[x_{ss}] = R^{-1} h = x^*
13
10 Example
Banana: p_1 = \begin{bmatrix} -1 \\ 1 \\ -1 \end{bmatrix}, \; t_1 = -1
\qquad
Apple: p_2 = \begin{bmatrix} 1 \\ 1 \\ -1 \end{bmatrix}, \; t_2 = 1

(No bias is used, so z = p; each input occurs with probability 1/2.)

R = E[p p^T] = \frac{1}{2} p_1 p_1^T + \frac{1}{2} p_2 p_2^T

R = \frac{1}{2} \begin{bmatrix} -1 \\ 1 \\ -1 \end{bmatrix} \begin{bmatrix} -1 & 1 & -1 \end{bmatrix}
  + \frac{1}{2} \begin{bmatrix} 1 \\ 1 \\ -1 \end{bmatrix} \begin{bmatrix} 1 & 1 & -1 \end{bmatrix}
  = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & -1 \\ 0 & -1 & 1 \end{bmatrix}

\alpha < \frac{1}{\lambda_{max}} = \frac{1}{2.0} = 0.5
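The correlation matrix and the learning-rate bound can be reproduced in a few lines:

```python
import numpy as np

# Banana/apple prototype vectors; each is presented with probability 1/2.
p1 = np.array([-1.0, 1.0, -1.0])   # banana, t1 = -1
p2 = np.array([1.0, 1.0, -1.0])    # apple,  t2 = 1

R = 0.5 * np.outer(p1, p1) + 0.5 * np.outer(p2, p2)
assert np.allclose(R, [[1, 0, 0], [0, 1, -1], [0, -1, 1]])

lam_max = np.linalg.eigvalsh(R).max()
assert np.isclose(lam_max, 2.0)    # so alpha must be below 1/2.0 = 0.5
```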
14
10 Iteration One
Banana (first input):

a(0) = W(0) p(0) = W(0) p_1 = \begin{bmatrix} 0 & 0 & 0 \end{bmatrix} \begin{bmatrix} -1 \\ 1 \\ -1 \end{bmatrix} = 0

e(0) = t(0) - a(0) = t_1 - a(0) = -1 - 0 = -1

W(1) = W(0) + 2 \alpha e(0) p^T(0)

W(1) = \begin{bmatrix} 0 & 0 & 0 \end{bmatrix} + 2(0.2)(-1) \begin{bmatrix} -1 & 1 & -1 \end{bmatrix}
     = \begin{bmatrix} 0.4 & -0.4 & 0.4 \end{bmatrix}
15
10 Iteration Two
Apple (second input):

a(1) = W(1) p(1) = W(1) p_2 = \begin{bmatrix} 0.4 & -0.4 & 0.4 \end{bmatrix} \begin{bmatrix} 1 \\ 1 \\ -1 \end{bmatrix} = -0.4

e(1) = t(1) - a(1) = t_2 - a(1) = 1 - (-0.4) = 1.4

W(2) = \begin{bmatrix} 0.4 & -0.4 & 0.4 \end{bmatrix} + 2(0.2)(1.4) \begin{bmatrix} 1 & 1 & -1 \end{bmatrix}
     = \begin{bmatrix} 0.96 & 0.16 & -0.16 \end{bmatrix}
16
10 Iteration Three
a(2) = W(2) p(2) = W(2) p_1 = \begin{bmatrix} 0.96 & 0.16 & -0.16 \end{bmatrix} \begin{bmatrix} -1 \\ 1 \\ -1 \end{bmatrix} = -0.64

e(2) = t(2) - a(2) = t_1 - a(2) = -1 - (-0.64) = -0.36

W(3) = W(2) + 2 \alpha e(2) p^T(2) = \begin{bmatrix} 1.1040 & 0.0160 & -0.0160 \end{bmatrix}

Continuing, the algorithm converges to

W(\infty) = \begin{bmatrix} 1 & 0 & 0 \end{bmatrix}
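These iterations can be reproduced exactly, alternating banana and apple presentations with alpha = 0.2:

```python
import numpy as np

p1, t1 = np.array([-1.0, 1.0, -1.0]), -1.0   # banana
p2, t2 = np.array([1.0, 1.0, -1.0]), 1.0     # apple
alpha = 0.2

W = np.zeros(3)
history = []
for p, t in 2 * [(p1, t1), (p2, t2)]:        # presentation order p1, p2, p1, p2
    e = t - W @ p                            # e(k) = t(k) - a(k)
    W = W + 2 * alpha * e * p                # W(k+1) = W(k) + 2 alpha e(k) p^T(k)
    history.append(W.copy())

assert np.allclose(history[0], [0.4, -0.4, 0.4])        # W(1)
assert np.allclose(history[1], [0.96, 0.16, -0.16])     # W(2)
assert np.allclose(history[2], [1.104, 0.016, -0.016])  # W(3)
```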
17
10 Adaptive Filtering
[Figure: tapped delay line feeding an ADALINE. The input signal y(k) passes through a chain of delays, so the filter inputs are p_1(k) = y(k), p_2(k) = y(k-1), ..., p_R(k) = y(k - R + 1), with weights w_{1,1}, ..., w_{1,R} and bias b producing n(k) and a(k).]

The ADALINE with a tapped delay line is a finite impulse response (FIR) adaptive filter:

a(k) = \sum_{i=1}^{R} w_{1,i} \, y(k - i + 1) + b

[Figure: noise-cancellation example. An EEG signal from a graduate student is contaminated by a 60-Hz noise source acting through a noise path filter; an adaptive filter, driven by the noise source output v, produces a and is adjusted to minimize the error.]

Adaptive filter adjusts to minimize the error (and in doing this removes the 60-Hz noise from the contaminated signal).
19
10 Noise Cancellation Adaptive Filter
[Figure: two-input ADALINE noise canceller. The inputs are v(k) and, through one delay, v(k-1), with weights w_{1,1} and w_{1,2} and no bias; the output a(k) should approximate the filtered noise m(k).]
20
10 Correlation Matrix
R = E[z z^T] \qquad h = E[t z]

z(k) = \begin{bmatrix} v(k) \\ v(k-1) \end{bmatrix}
\qquad
t(k) = s(k) + m(k)

R = \begin{bmatrix} E[v^2(k)] & E[v(k) v(k-1)] \\ E[v(k-1) v(k)] & E[v^2(k-1)] \end{bmatrix}

h = \begin{bmatrix} E[(s(k) + m(k)) v(k)] \\ E[(s(k) + m(k)) v(k-1)] \end{bmatrix}
21
10 Signals
v(k) = 1.2 \sin\left( \frac{2\pi k}{3} \right)
\qquad
m(k) = 1.2 \sin\left( \frac{2\pi k}{3} - \frac{3\pi}{4} \right)

E[v^2(k)] = \frac{1}{3} \sum_{k=1}^{3} \left( 1.2 \sin \frac{2\pi k}{3} \right)^2 = (1.2)^2 (0.5) = 0.72

E[v^2(k-1)] = E[v^2(k)] = 0.72

E[v(k) v(k-1)] = \frac{1}{3} \sum_{k=1}^{3} \left( 1.2 \sin \frac{2\pi k}{3} \right) \left( 1.2 \sin \frac{2\pi (k-1)}{3} \right)
= (1.2)^2 (0.5) \cos\left( \frac{2\pi}{3} \right) = -0.36

R = \begin{bmatrix} 0.72 & -0.36 \\ -0.36 & 0.72 \end{bmatrix}
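Because v(k) has period 3, the expectations are plain averages over k = 1, 2, 3, which makes them easy to verify:

```python
import numpy as np

k = np.arange(1, 4)
v = 1.2 * np.sin(2 * np.pi * k / 3)           # v(k) over one period
v_prev = 1.2 * np.sin(2 * np.pi * (k - 1) / 3)  # v(k-1)

Evv = np.mean(v * v)            # E[v(k)^2]     -> 0.72
Evvp = np.mean(v * v_prev)      # E[v(k)v(k-1)] -> -0.36
R = np.array([[Evv, Evvp], [Evvp, Evv]])
assert np.allclose(R, [[0.72, -0.36], [-0.36, 0.72]])
```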
22
10 Stationary Point
E[(s(k) + m(k)) v(k)] = E[s(k) v(k)] + E[m(k) v(k)]

The signal s(k) is uncorrelated with the noise, so E[s(k) v(k)] = 0 and

E[m(k) v(k)] = \frac{1}{3} \sum_{k=1}^{3} \left( 1.2 \sin\left( \frac{2\pi k}{3} - \frac{3\pi}{4} \right) \right) \left( 1.2 \sin \frac{2\pi k}{3} \right) = -0.51

E[(s(k) + m(k)) v(k-1)] = E[s(k) v(k-1)] + E[m(k) v(k-1)]

Similarly E[s(k) v(k-1)] = 0, so

E[m(k) v(k-1)] = \frac{1}{3} \sum_{k=1}^{3} \left( 1.2 \sin\left( \frac{2\pi k}{3} - \frac{3\pi}{4} \right) \right) \left( 1.2 \sin \frac{2\pi (k-1)}{3} \right) = 0.70

h = \begin{bmatrix} E[(s(k) + m(k)) v(k)] \\ E[(s(k) + m(k)) v(k-1)] \end{bmatrix}
  = \begin{bmatrix} -0.51 \\ 0.70 \end{bmatrix}

x^* = R^{-1} h = \begin{bmatrix} 0.72 & -0.36 \\ -0.36 & 0.72 \end{bmatrix}^{-1} \begin{bmatrix} -0.51 \\ 0.70 \end{bmatrix}
    = \begin{bmatrix} -0.30 \\ 0.82 \end{bmatrix}

To evaluate the minimum mean square error we also need c = E[t^2(k)]:

c = E[s^2(k)] + 2 E[s(k) m(k)] + E[m^2(k)]

With s(k) uniformly distributed on [-0.2, 0.2] and uncorrelated with m(k), the cross term vanishes, and

E[s^2(k)] = \frac{1}{0.4} \int_{-0.2}^{0.2} s^2 \, ds = \frac{1}{3(0.4)} s^3 \Big|_{-0.2}^{0.2} = 0.0133

E[m^2(k)] = \frac{1}{3} \sum_{k=1}^{3} \left( 1.2 \sin\left( \frac{2\pi k}{3} - \frac{3\pi}{4} \right) \right)^2 = 0.72
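The cross-correlation vector and the stationary point can be checked the same way, averaging over one period:

```python
import numpy as np

k = np.arange(1, 4)
v = 1.2 * np.sin(2 * np.pi * k / 3)                    # noise source
v_prev = 1.2 * np.sin(2 * np.pi * (k - 1) / 3)
m = 1.2 * np.sin(2 * np.pi * k / 3 - 3 * np.pi / 4)    # filtered noise

# E[s v] = 0, so h depends only on the noise terms.
h = np.array([np.mean(m * v), np.mean(m * v_prev)])
assert np.allclose(h, [-0.51, 0.70], atol=0.01)

R = np.array([[0.72, -0.36], [-0.36, 0.72]])
x_star = np.linalg.solve(R, h)                         # x* = R^{-1} h
assert np.allclose(x_star, [-0.30, 0.82], atol=0.01)

# c = E[s^2] + E[m^2]; uniform s on [-0.2, 0.2] gives E[s^2] = 0.2^2 / 3.
c = 0.2**2 / 3 + np.mean(m * m)
assert np.isclose(c, 0.0133 + 0.72, atol=0.001)
```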
24
10 LMS Response
[Figure: LMS response for the noise-cancellation example. Left: trajectory of the two filter weights in the (w_{1,1}, w_{1,2}) plane (axes from -2 to 2), converging toward x*. Right: signals versus time, 0 to 0.5 s; as the filter adapts, the error converges toward the EEG signal.]
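The adaptation shown in the figure can be simulated directly. A minimal sketch: for clarity the EEG signal is set to zero here (it only adds a small zero-mean fluctuation), so the target is just the filtered noise m(k) and the weights should settle at x* = (-0.30, 0.82):

```python
import numpy as np

alpha = 0.1
x = np.zeros(2)                     # filter weights [w11, w12]
v_prev = 1.2 * np.sin(0.0)          # v(0)

for k in range(1, 1001):
    v = 1.2 * np.sin(2 * np.pi * k / 3)
    m = 1.2 * np.sin(2 * np.pi * k / 3 - 3 * np.pi / 4)
    z = np.array([v, v_prev])       # z(k) = [v(k); v(k-1)]
    e = m - x @ z                   # contaminated signal minus filter output
    x = x + 2 * alpha * e * z       # LMS update
    v_prev = v

assert np.allclose(x, [-0.30, 0.82], atol=0.01)
```

Since the filtered noise is exactly a linear combination of v(k) and v(k-1), the residual error goes to zero and the weights converge to the stationary point computed above.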
25
10 Echo Cancellation
[Figure: echo cancellation on a telephone circuit. At each end, an adaptive filter sits in parallel with the hybrid: it predicts the echo returned down the transmission line and subtracts it from the received signal.]
26