
Widrow-Hoff Learning: (LMS Algorithm)


10 ADALINE Network
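[Figure: ADALINE network. Input p (R x 1), weight matrix W (S x R), bias b (S x 1), net input n (S x 1), linear neuron output a (S x 1).]

$$\mathbf{a} = \mathrm{purelin}(\mathbf{W}\mathbf{p} + \mathbf{b}) = \mathbf{W}\mathbf{p} + \mathbf{b}$$

$$a_i = \mathrm{purelin}(n_i) = \mathrm{purelin}({}_i\mathbf{w}^T\mathbf{p} + b_i) = {}_i\mathbf{w}^T\mathbf{p} + b_i, \qquad {}_i\mathbf{w} = \begin{bmatrix} w_{i,1} \\ w_{i,2} \\ \vdots \\ w_{i,R} \end{bmatrix}$$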
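The layer equation can be tried out with a few lines of plain Python. This is only a sketch: the function name `adaline_forward` and the numbers (an S = 2, R = 3 layer) are illustrative assumptions, not values from the slides.

```python
def adaline_forward(W, p, b):
    """Return a = purelin(W p + b) = W p + b for an S x R weight matrix W."""
    return [sum(w_ij * p_j for w_ij, p_j in zip(row, p)) + b_i
            for row, b_i in zip(W, b)]

# Hypothetical layer with S = 2 neurons and R = 3 inputs.
W = [[1.0, 0.0, -1.0],
     [0.5, 2.0, 0.0]]
b = [0.5, -1.0]
p = [2.0, 1.0, 3.0]

a = adaline_forward(W, p, b)
print(a)  # [-0.5, 2.0]
```

Because purelin is the identity, each output is just the inner product of one weight row with the input, plus that neuron's bias.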
10 Two-Input ADALINE

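[Figure: two-input ADALINE (inputs p1 and p2, weights w1,1 and w1,2, bias b, net input n, output a) and its decision boundary in the (p1, p2) plane. The boundary is the line 1w^T p + b = 0, which crosses the p1 axis at -b/w1,1 and the p2 axis at -b/w1,2. The weight vector 1w is orthogonal to the boundary and points toward the region a > 0; a < 0 on the other side.]

$$a = \mathrm{purelin}(n) = \mathrm{purelin}({}_1\mathbf{w}^T\mathbf{p} + b) = {}_1\mathbf{w}^T\mathbf{p} + b$$

$$a = {}_1\mathbf{w}^T\mathbf{p} + b = w_{1,1}\,p_1 + w_{1,2}\,p_2 + b$$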
10 Mean Square Error
Training Set:
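$$\{\mathbf{p}_1, t_1\},\ \{\mathbf{p}_2, t_2\},\ \ldots,\ \{\mathbf{p}_Q, t_Q\}$$

Input: $\mathbf{p}_q$    Target: $t_q$

Notation:

$$\mathbf{x} = \begin{bmatrix} {}_1\mathbf{w} \\ b \end{bmatrix}, \qquad \mathbf{z} = \begin{bmatrix} \mathbf{p} \\ 1 \end{bmatrix}, \qquad a = {}_1\mathbf{w}^T\mathbf{p} + b = \mathbf{x}^T\mathbf{z}$$

Mean Square Error:

$$F(\mathbf{x}) = E[e^2] = E[(t - a)^2] = E[(t - \mathbf{x}^T\mathbf{z})^2]$$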
10 Error Analysis
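$$F(\mathbf{x}) = E[e^2] = E[(t - a)^2] = E[(t - \mathbf{x}^T\mathbf{z})^2]$$

$$F(\mathbf{x}) = E[t^2 - 2t\,\mathbf{x}^T\mathbf{z} + \mathbf{x}^T\mathbf{z}\mathbf{z}^T\mathbf{x}]$$

$$F(\mathbf{x}) = E[t^2] - 2\mathbf{x}^T E[t\mathbf{z}] + \mathbf{x}^T E[\mathbf{z}\mathbf{z}^T]\,\mathbf{x}$$

$$F(\mathbf{x}) = c - 2\mathbf{x}^T\mathbf{h} + \mathbf{x}^T\mathbf{R}\mathbf{x}$$

$$c = E[t^2], \qquad \mathbf{h} = E[t\mathbf{z}], \qquad \mathbf{R} = E[\mathbf{z}\mathbf{z}^T]$$

The mean square error for the ADALINE network is a quadratic function:

$$F(\mathbf{x}) = c + \mathbf{d}^T\mathbf{x} + \tfrac{1}{2}\mathbf{x}^T\mathbf{A}\mathbf{x}$$

$$\mathbf{d} = -2\mathbf{h}, \qquad \mathbf{A} = 2\mathbf{R}$$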
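The expansion above can be checked numerically: the mean square error computed directly from the samples should match c - 2 x^T h + x^T R x. The sketch below assumes the expectation is an equal-weight average over a small training set; the data and the function names are hypothetical.

```python
def mse_direct(x, Z, T):
    """Average of (t - x^T z)^2 over the training set."""
    return sum((t - sum(xi * zi for xi, zi in zip(x, z))) ** 2
               for z, t in zip(Z, T)) / len(Z)

def mse_quadratic(x, Z, T):
    """The same quantity computed as c - 2 x^T h + x^T R x."""
    Q, n = len(Z), len(x)
    c = sum(t * t for t in T) / Q                                   # c = E[t^2]
    h = [sum(t * z[i] for z, t in zip(Z, T)) / Q for i in range(n)]  # h = E[t z]
    R = [[sum(z[i] * z[j] for z in Z) / Q for j in range(n)]        # R = E[z z^T]
         for i in range(n)]
    xh = sum(x[i] * h[i] for i in range(n))
    xRx = sum(x[i] * R[i][j] * x[j] for i in range(n) for j in range(n))
    return c - 2.0 * xh + xRx

# Hypothetical data: each z is [p1, p2, 1], the input augmented with a 1.
Z = [[1.0, 2.0, 1.0], [-1.0, 0.5, 1.0], [0.0, -1.0, 1.0]]
T = [1.0, -1.0, 0.5]
x = [0.3, -0.2, 0.1]  # [w11, w12, b]

print(mse_direct(x, Z, T), mse_quadratic(x, Z, T))
```

The two printed values agree up to floating-point rounding, which is all the algebra on the slide claims.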
10 Stationary Point
Hessian Matrix:
A = 2R

The correlation matrix R must be at least positive semidefinite. If
there are any zero eigenvalues, the performance index will either
have a weak minimum or else no stationary point; otherwise
there will be a unique global minimum x*.

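$$\nabla F(\mathbf{x}) = \nabla\left(c + \mathbf{d}^T\mathbf{x} + \tfrac{1}{2}\mathbf{x}^T\mathbf{A}\mathbf{x}\right) = \mathbf{d} + \mathbf{A}\mathbf{x} = -2\mathbf{h} + 2\mathbf{R}\mathbf{x}$$

$$-2\mathbf{h} + 2\mathbf{R}\mathbf{x} = \mathbf{0}$$

If R is positive definite:

$$\mathbf{x}^* = \mathbf{R}^{-1}\mathbf{h}$$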
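As a small illustration of the stationary point, the sketch below solves R x = h for a 2 x 2 case and confirms that the gradient -2h + 2Rx vanishes there. The matrix R, the vector h, and the helper names are hypothetical; Cramer's rule is used only because the system is tiny.

```python
def solve_2x2(R, h):
    """Solve R x = h for a 2 x 2 matrix R by Cramer's rule."""
    det = R[0][0] * R[1][1] - R[0][1] * R[1][0]
    if abs(det) < 1e-12:
        raise ValueError("R is singular: no unique minimum")
    return [(h[0] * R[1][1] - R[0][1] * h[1]) / det,
            (R[0][0] * h[1] - R[1][0] * h[0]) / det]

def gradient(R, h, x):
    """Gradient of F(x) = c - 2 x^T h + x^T R x, which is -2h + 2Rx."""
    return [-2.0 * h[i] + 2.0 * sum(R[i][j] * x[j] for j in range(2))
            for i in range(2)]

# Hypothetical positive definite R and vector h.
R = [[2.0, 1.0], [1.0, 2.0]]
h = [1.0, 0.0]

x_star = solve_2x2(R, h)
print(x_star)                  # close to [2/3, -1/3]
print(gradient(R, h, x_star))  # close to [0, 0]
```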
10 Approximate Steepest Descent
Approximate mean square error (one sample):

$$\hat{F}(\mathbf{x}) = (t(k) - a(k))^2 = e^2(k)$$

Approximate (stochastic) gradient:

$$\hat{\nabla} F(\mathbf{x}) = \nabla e^2(k)$$

$$[\nabla e^2(k)]_j = \frac{\partial e^2(k)}{\partial w_{1,j}} = 2e(k)\,\frac{\partial e(k)}{\partial w_{1,j}}, \qquad j = 1, 2, \ldots, R$$

$$[\nabla e^2(k)]_{R+1} = \frac{\partial e^2(k)}{\partial b} = 2e(k)\,\frac{\partial e(k)}{\partial b}$$
10 Approximate Gradient Calculation

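$$\frac{\partial e(k)}{\partial w_{1,j}} = \frac{\partial [t(k) - a(k)]}{\partial w_{1,j}} = \frac{\partial}{\partial w_{1,j}}\left[t(k) - ({}_1\mathbf{w}^T\mathbf{p}(k) + b)\right]$$

$$\frac{\partial e(k)}{\partial w_{1,j}} = \frac{\partial}{\partial w_{1,j}}\left[t(k) - \left(\sum_{i=1}^{R} w_{1,i}\,p_i(k) + b\right)\right]$$

$$\frac{\partial e(k)}{\partial w_{1,j}} = -p_j(k), \qquad \frac{\partial e(k)}{\partial b} = -1$$

$$\hat{\nabla} F(\mathbf{x}) = \nabla e^2(k) = -2e(k)\,\mathbf{z}(k)$$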
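The closed-form result, gradient = -2 e(k) z(k), can be sanity-checked against finite differences of e^2. The following is a minimal sketch with made-up values for x, z, and t; the function names are assumptions for illustration.

```python
def error(x, z, t):
    """e = t - x^T z for a single sample."""
    return t - sum(xi * zi for xi, zi in zip(x, z))

def lms_gradient(x, z, t):
    """Closed-form stochastic gradient of e^2: -2 e z."""
    e = error(x, z, t)
    return [-2.0 * e * zi for zi in z]

def numeric_gradient(x, z, t, eps=1e-6):
    """Central finite differences of e^2 with respect to each entry of x."""
    grad = []
    for i in range(len(x)):
        x_hi = list(x)
        x_lo = list(x)
        x_hi[i] += eps
        x_lo[i] -= eps
        grad.append((error(x_hi, z, t) ** 2 - error(x_lo, z, t) ** 2) / (2 * eps))
    return grad

# Hypothetical sample: x = [w11, w12, b], z = [p1, p2, 1].
x = [0.5, -0.3, 0.1]
z = [1.0, 2.0, 1.0]
t = 0.7

print(lms_gradient(x, z, t))
print(numeric_gradient(x, z, t))
```

The last entry of each gradient corresponds to the bias, since the last entry of z is the constant 1.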
10 LMS Algorithm
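$$\mathbf{x}_{k+1} = \mathbf{x}_k - \alpha\,\nabla F(\mathbf{x})\Big|_{\mathbf{x} = \mathbf{x}_k}$$

$$\mathbf{x}_{k+1} = \mathbf{x}_k + 2\alpha\,e(k)\,\mathbf{z}(k)$$

$${}_1\mathbf{w}(k+1) = {}_1\mathbf{w}(k) + 2\alpha\,e(k)\,\mathbf{p}(k)$$

$$b(k+1) = b(k) + 2\alpha\,e(k)$$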
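A single LMS update for one neuron might look like the sketch below. The function name `lms_step` and the sample values are assumptions chosen for illustration; the update itself follows the weight and bias rules on this slide.

```python
def lms_step(w, b, p, t, alpha):
    """One LMS update for a single linear neuron.

    Returns the new weights, the new bias, and the error e(k) that was
    measured before the update.
    """
    a = sum(wi * pi for wi, pi in zip(w, p)) + b   # a(k) = w^T p(k) + b
    e = t - a                                      # e(k) = t(k) - a(k)
    w_new = [wi + 2.0 * alpha * e * pi for wi, pi in zip(w, p)]
    b_new = b + 2.0 * alpha * e
    return w_new, b_new, e

# Hypothetical sample and learning rate.
w, b = [0.0, 0.0], 0.0
p, t, alpha = [1.0, 2.0], 1.0, 0.1

w, b, e0 = lms_step(w, b, p, t, alpha)
_, _, e1 = lms_step(w, b, p, t, alpha)
print(w, b, e0, e1)
```

Presenting the same sample twice shows the error magnitude shrinking from the first step to the second, as expected for a stable learning rate.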
10 Multiple-Neuron Case

$${}_i\mathbf{w}(k+1) = {}_i\mathbf{w}(k) + 2\alpha\,e_i(k)\,\mathbf{p}(k)$$

$$b_i(k+1) = b_i(k) + 2\alpha\,e_i(k)$$

Matrix Form:

$$\mathbf{W}(k+1) = \mathbf{W}(k) + 2\alpha\,\mathbf{e}(k)\,\mathbf{p}^T(k)$$

$$\mathbf{b}(k+1) = \mathbf{b}(k) + 2\alpha\,\mathbf{e}(k)$$
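The matrix form is the single-neuron rule applied row by row. A minimal sketch, assuming nested Python lists for W and hypothetical data (S = 2 neurons, R = 2 inputs):

```python
def lms_step_matrix(W, b, p, t, alpha):
    """One LMS update in matrix form: W += 2*alpha*e*p^T, b += 2*alpha*e."""
    a = [sum(w_ij * p_j for w_ij, p_j in zip(row, p)) + b_i
         for row, b_i in zip(W, b)]
    e = [t_i - a_i for t_i, a_i in zip(t, a)]
    W_new = [[w_ij + 2.0 * alpha * e_i * p_j for w_ij, p_j in zip(row, p)]
             for row, e_i in zip(W, e)]            # outer product e p^T
    b_new = [b_i + 2.0 * alpha * e_i for b_i, e_i in zip(b, e)]
    return W_new, b_new, e

# Hypothetical layer and sample.
W = [[0.0, 0.0], [0.0, 0.0]]
b = [0.0, 0.0]
p = [1.0, -1.0]
t = [1.0, -1.0]

W1, b1, e = lms_step_matrix(W, b, p, t, alpha=0.25)
print(W1, b1, e)
```

Each row of W is corrected by its own error e_i, while all rows share the same input vector p(k).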
10 Analysis of Convergence
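$$\mathbf{x}_{k+1} = \mathbf{x}_k + 2\alpha\,e(k)\,\mathbf{z}(k)$$

$$E[\mathbf{x}_{k+1}] = E[\mathbf{x}_k] + 2\alpha\,E[e(k)\,\mathbf{z}(k)]$$

$$E[\mathbf{x}_{k+1}] = E[\mathbf{x}_k] + 2\alpha\left\{E[t(k)\,\mathbf{z}(k)] - E[(\mathbf{x}_k^T\mathbf{z}(k))\,\mathbf{z}(k)]\right\}$$

$$E[\mathbf{x}_{k+1}] = E[\mathbf{x}_k] + 2\alpha\left\{E[t(k)\,\mathbf{z}(k)] - E[(\mathbf{z}(k)\,\mathbf{z}^T(k))\,\mathbf{x}_k]\right\}$$

Assuming x_k is independent of z(k):

$$E[\mathbf{x}_{k+1}] = E[\mathbf{x}_k] + 2\alpha\left\{\mathbf{h} - \mathbf{R}\,E[\mathbf{x}_k]\right\}$$

$$E[\mathbf{x}_{k+1}] = [\mathbf{I} - 2\alpha\mathbf{R}]\,E[\mathbf{x}_k] + 2\alpha\,\mathbf{h}$$

For stability, the eigenvalues of the matrix [I - 2αR] must fall inside the unit circle.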
10 Conditions for Stability

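$$\left|\mathrm{eig}([\mathbf{I} - 2\alpha\mathbf{R}])\right| = |1 - 2\alpha\lambda_i| < 1$$

(where $\lambda_i$ is an eigenvalue of R)

Since $\lambda_i > 0$, $1 - 2\alpha\lambda_i < 1$.

Therefore the stability condition simplifies to

$$1 - 2\alpha\lambda_i > -1$$

$$\alpha < 1/\lambda_i \quad \text{for all } i$$

$$0 < \alpha < 1/\lambda_{max}$$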
10 Steady State Response

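$$E[\mathbf{x}_{k+1}] = [\mathbf{I} - 2\alpha\mathbf{R}]\,E[\mathbf{x}_k] + 2\alpha\,\mathbf{h}$$

If the system is stable, then a steady state condition will be reached.

$$E[\mathbf{x}_{ss}] = [\mathbf{I} - 2\alpha\mathbf{R}]\,E[\mathbf{x}_{ss}] + 2\alpha\,\mathbf{h}$$

The solution to this equation is

$$E[\mathbf{x}_{ss}] = \mathbf{R}^{-1}\mathbf{h} = \mathbf{x}^*$$

This is also the strong minimum of the performance index.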
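The mean-weight recursion can be iterated directly to see both the steady state and the stability limit. In this sketch R and h are hypothetical (R has eigenvalues 1 and 3, so the bound is α < 1/3), and `expected_weights` is an illustrative helper name.

```python
def expected_weights(R, h, alpha, steps, x0=(0.0, 0.0)):
    """Iterate E[x_{k+1}] = (I - 2*alpha*R) E[x_k] + 2*alpha*h for a 2 x 2 R."""
    x = list(x0)
    for _ in range(steps):
        Rx = [R[i][0] * x[0] + R[i][1] * x[1] for i in range(2)]
        x = [x[i] + 2.0 * alpha * (h[i] - Rx[i]) for i in range(2)]
    return x

# Hypothetical R (eigenvalues 1 and 3) and h, for which R^{-1} h = [2/3, -1/3].
R = [[2.0, 1.0], [1.0, 2.0]]
h = [1.0, 0.0]

stable = expected_weights(R, h, alpha=0.1, steps=200)    # alpha < 1/3
unstable = expected_weights(R, h, alpha=0.4, steps=50)   # alpha > 1/3

print(stable)
print(unstable)
```

With α = 0.1 the iterate settles at R^{-1} h; with α = 0.4 the mode belonging to the largest eigenvalue is multiplied by 1 - 2(0.4)(3) = -1.4 at every step and grows without bound.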
10 Example
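Banana: $\mathbf{p}_1 = \begin{bmatrix} -1 \\ 1 \\ -1 \end{bmatrix},\ t_1 = -1$    Apple: $\mathbf{p}_2 = \begin{bmatrix} 1 \\ 1 \\ -1 \end{bmatrix},\ t_2 = 1$

$$\mathbf{R} = E[\mathbf{p}\mathbf{p}^T] = \tfrac{1}{2}\mathbf{p}_1\mathbf{p}_1^T + \tfrac{1}{2}\mathbf{p}_2\mathbf{p}_2^T$$

$$\mathbf{R} = \tfrac{1}{2}\begin{bmatrix} -1 \\ 1 \\ -1 \end{bmatrix}\begin{bmatrix} -1 & 1 & -1 \end{bmatrix} + \tfrac{1}{2}\begin{bmatrix} 1 \\ 1 \\ -1 \end{bmatrix}\begin{bmatrix} 1 & 1 & -1 \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & -1 \\ 0 & -1 & 1 \end{bmatrix}$$

$$\lambda_1 = 1.0, \qquad \lambda_2 = 0.0, \qquad \lambda_3 = 2.0$$

$$\alpha < \frac{1}{\lambda_{max}} = \frac{1}{2.0} = 0.5$$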
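The correlation matrix and the learning-rate bound for this example can be reproduced as follows. The slides do not say how the eigenvalues were obtained; the power iteration used here is simply one convenient way to estimate the largest eigenvalue without extra libraries, and the function names are assumptions.

```python
def correlation_matrix(patterns):
    """R = average of p p^T over equally likely input patterns."""
    n = len(patterns[0])
    return [[sum(p[i] * p[j] for p in patterns) / len(patterns)
             for j in range(n)] for i in range(n)]

def largest_eigenvalue(R, iterations=200):
    """Power iteration for a symmetric positive semidefinite matrix R."""
    n = len(R)
    x = [float(i + 1) for i in range(n)]   # arbitrary nonzero starting vector
    for _ in range(iterations):
        y = [sum(R[i][j] * x[j] for j in range(n)) for i in range(n)]
        norm = sum(c * c for c in y) ** 0.5
        x = [c / norm for c in y]
    Rx = [sum(R[i][j] * x[j] for j in range(n)) for i in range(n)]
    return sum(x[i] * Rx[i] for i in range(n))   # Rayleigh quotient, |x| = 1

banana = [-1.0, 1.0, -1.0]
apple = [1.0, 1.0, -1.0]

R = correlation_matrix([banana, apple])
lam_max = largest_eigenvalue(R)
alpha_max = 1.0 / lam_max
print(R)
print(lam_max, alpha_max)
```

The estimate converges to the largest eigenvalue 2.0, which gives the bound α < 0.5 stated on the slide.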
10 Iteration One
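Banana:

$$a(0) = \mathbf{W}(0)\,\mathbf{p}(0) = \mathbf{W}(0)\,\mathbf{p}_1 = \begin{bmatrix} 0 & 0 & 0 \end{bmatrix}\begin{bmatrix} -1 \\ 1 \\ -1 \end{bmatrix} = 0$$

$$e(0) = t(0) - a(0) = t_1 - a(0) = -1 - 0 = -1$$

$$\mathbf{W}(1) = \mathbf{W}(0) + 2\alpha\,e(0)\,\mathbf{p}^T(0)$$

$$\mathbf{W}(1) = \begin{bmatrix} 0 & 0 & 0 \end{bmatrix} + 2(0.2)(-1)\begin{bmatrix} -1 \\ 1 \\ -1 \end{bmatrix}^T = \begin{bmatrix} 0.4 & -0.4 & 0.4 \end{bmatrix}$$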
10 Iteration Two

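Apple:

$$a(1) = \mathbf{W}(1)\,\mathbf{p}(1) = \mathbf{W}(1)\,\mathbf{p}_2 = \begin{bmatrix} 0.4 & -0.4 & 0.4 \end{bmatrix}\begin{bmatrix} 1 \\ 1 \\ -1 \end{bmatrix} = -0.4$$

$$e(1) = t(1) - a(1) = t_2 - a(1) = 1 - (-0.4) = 1.4$$

$$\mathbf{W}(2) = \begin{bmatrix} 0.4 & -0.4 & 0.4 \end{bmatrix} + 2(0.2)(1.4)\begin{bmatrix} 1 \\ 1 \\ -1 \end{bmatrix}^T = \begin{bmatrix} 0.96 & 0.16 & -0.16 \end{bmatrix}$$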
10 Iteration Three

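$$a(2) = \mathbf{W}(2)\,\mathbf{p}(2) = \mathbf{W}(2)\,\mathbf{p}_1 = \begin{bmatrix} 0.96 & 0.16 & -0.16 \end{bmatrix}\begin{bmatrix} -1 \\ 1 \\ -1 \end{bmatrix} = -0.64$$

$$e(2) = t(2) - a(2) = t_1 - a(2) = -1 - (-0.64) = -0.36$$

$$\mathbf{W}(3) = \mathbf{W}(2) + 2\alpha\,e(2)\,\mathbf{p}^T(2) = \begin{bmatrix} 1.1040 & 0.0160 & -0.0160 \end{bmatrix}$$

$$\mathbf{W}(\infty) = \begin{bmatrix} 1 & 0 & 0 \end{bmatrix}$$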
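The three hand iterations can be replayed in a short loop. This sketch assumes what the worked numbers imply: α = 0.2, no bias, zero initial weights, and the banana and apple patterns presented alternately. The helper name `run_lms` is hypothetical.

```python
def run_lms(prototypes, alpha, iterations):
    """Cycle through (p, t) pairs, applying W <- W + 2*alpha*e*p^T each time."""
    W = [0.0, 0.0, 0.0]        # single neuron, no bias, zero initial weights
    history = []
    for k in range(iterations):
        p, t = prototypes[k % len(prototypes)]
        a = sum(wi * pi for wi, pi in zip(W, p))
        e = t - a
        W = [wi + 2.0 * alpha * e * pi for wi, pi in zip(W, p)]
        history.append((a, e, list(W)))
    return history

prototypes = [([-1.0, 1.0, -1.0], -1.0),   # banana
              ([1.0, 1.0, -1.0], 1.0)]     # apple

first_three = run_lms(prototypes, alpha=0.2, iterations=3)
for a, e, W in first_three:
    print(round(a, 4), round(e, 4), [round(w, 4) for w in W])

W_late = run_lms(prototypes, alpha=0.2, iterations=100)[-1][2]
print([round(w, 6) for w in W_late])
```

The first three rows reproduce W(1), W(2), and W(3) from the slides, and continuing the loop brings the weights to W(∞) = [1 0 0].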
10 Adaptive Filtering
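[Figure: tapped delay line feeding an ADALINE (adaptive filter). The input y(k) passes through a chain of delay blocks D, giving p1(k) = y(k), p2(k) = y(k - 1), ..., pR(k) = y(k - R + 1). These taps are weighted by w1,1, w1,2, ..., w1,R and summed with the bias b to give n(k) and a(k).]

$$a(k) = \mathrm{purelin}(\mathbf{W}\mathbf{p}(k) + b) = \sum_{i=1}^{R} w_{1,i}\,y(k - i + 1) + b$$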
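The tapped-delay-line output is a weighted sum of the current and previous R - 1 input samples. In the sketch below, samples before the start of the sequence are treated as zero; that initial condition, the function name, and the sample values are assumptions not specified in the slides.

```python
def tapped_delay_output(w, b, y, k):
    """a(k) = sum_{i=1..R} w[i-1] * y(k - i + 1) + b.

    Assumes delay elements start at zero, so samples before index 0 count as 0.
    """
    R = len(w)
    p = [y[k - i] if k - i >= 0 else 0.0 for i in range(R)]  # [y(k), y(k-1), ...]
    return sum(wi * pi for wi, pi in zip(w, p)) + b

# Hypothetical three-tap filter and input sequence.
w = [0.5, 0.25, 0.25]
b = 0.0
y = [4.0, 8.0, 0.0, 4.0]

print(tapped_delay_output(w, b, y, 0))  # 2.0
print(tapped_delay_output(w, b, y, 3))  # 4.0
```

With fixed weights this is an ordinary finite impulse response filter; the adaptive version updates w with the LMS rule after each sample.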
10 Example: Noise Cancellation
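[Figure: noise cancellation system. The (random) EEG signal s from the graduate student is contaminated by noise m, which is the 60-Hz noise source v after passing through the noise path filter. The contaminated signal is t = s + m. The adaptive filter takes v as input and produces a, the adaptively filtered noise to cancel. The restored signal is the "error" e = t - a.]

The adaptive filter adjusts to minimize the error, and in doing so removes the 60-Hz noise from the contaminated signal.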
10 Noise Cancellation Adaptive Filter

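[Figure: two-tap adaptive filter. The input v(k) and its delayed copy v(k - 1) are weighted by w1,1 and w1,2 and summed to give n(k) and a(k); there is no bias.]

$$a(k) = w_{1,1}\,v(k) + w_{1,2}\,v(k-1)$$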
10 Correlation Matrix
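$$\mathbf{R} = E[\mathbf{z}\mathbf{z}^T], \qquad \mathbf{h} = E[t\mathbf{z}]$$

$$\mathbf{z}(k) = \begin{bmatrix} v(k) \\ v(k-1) \end{bmatrix}, \qquad t(k) = s(k) + m(k)$$

$$\mathbf{R} = \begin{bmatrix} E[v^2(k)] & E[v(k)\,v(k-1)] \\ E[v(k-1)\,v(k)] & E[v^2(k-1)] \end{bmatrix}$$

$$\mathbf{h} = \begin{bmatrix} E[(s(k) + m(k))\,v(k)] \\ E[(s(k) + m(k))\,v(k-1)] \end{bmatrix}$$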
10 Signals
$$v(k) = 1.2\sin\left(\frac{2\pi k}{3}\right), \qquad m(k) = 1.2\sin\left(\frac{2\pi k}{3} - \frac{3\pi}{4}\right)$$

$$E[v^2(k)] = (1.2)^2\,\frac{1}{3}\sum_{k=1}^{3}\sin^2\left(\frac{2\pi k}{3}\right) = (1.2)^2(0.5) = 0.72$$

$$E[v^2(k-1)] = E[v^2(k)] = 0.72$$

$$E[v(k)\,v(k-1)] = \frac{1}{3}\sum_{k=1}^{3}\left(1.2\sin\frac{2\pi k}{3}\right)\left(1.2\sin\frac{2\pi(k-1)}{3}\right) = (1.2)^2(0.5)\cos\left(\frac{2\pi}{3}\right) = -0.36$$

$$\mathbf{R} = \begin{bmatrix} 0.72 & -0.36 \\ -0.36 & 0.72 \end{bmatrix}$$
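The entries of R can be confirmed by averaging over one period of the noise source (k = 1, 2, 3), as the sums on this slide do. A minimal sketch; the helper name `period_average` is an assumption.

```python
import math

def v(k):
    """60-Hz noise source sampled three times per period."""
    return 1.2 * math.sin(2.0 * math.pi * k / 3.0)

def period_average(f):
    """Average f(k) over one period, k = 1, 2, 3."""
    return sum(f(k) for k in (1, 2, 3)) / 3.0

R = [[period_average(lambda k: v(k) ** 2),
      period_average(lambda k: v(k) * v(k - 1))],
     [period_average(lambda k: v(k - 1) * v(k)),
      period_average(lambda k: v(k - 1) ** 2)]]

print([[round(r, 4) for r in row] for row in R])
```

Rounded to two decimals this reproduces the diagonal entries 0.72 and the off-diagonal entries -0.36.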
10 Stationary Point
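$$E[(s(k) + m(k))\,v(k)] = E[s(k)\,v(k)] + E[m(k)\,v(k)], \qquad E[s(k)\,v(k)] = 0$$

$$E[m(k)\,v(k)] = \frac{1}{3}\sum_{k=1}^{3}\left(1.2\sin\left(\frac{2\pi k}{3} - \frac{3\pi}{4}\right)\right)\left(1.2\sin\frac{2\pi k}{3}\right) = -0.51$$

$$E[(s(k) + m(k))\,v(k-1)] = E[s(k)\,v(k-1)] + E[m(k)\,v(k-1)], \qquad E[s(k)\,v(k-1)] = 0$$

$$E[m(k)\,v(k-1)] = \frac{1}{3}\sum_{k=1}^{3}\left(1.2\sin\left(\frac{2\pi k}{3} - \frac{3\pi}{4}\right)\right)\left(1.2\sin\frac{2\pi(k-1)}{3}\right) = 0.70$$

$$\mathbf{h} = \begin{bmatrix} E[(s(k) + m(k))\,v(k)] \\ E[(s(k) + m(k))\,v(k-1)] \end{bmatrix} = \begin{bmatrix} -0.51 \\ 0.70 \end{bmatrix}$$

$$\mathbf{x}^* = \mathbf{R}^{-1}\mathbf{h} = \begin{bmatrix} 0.72 & -0.36 \\ -0.36 & 0.72 \end{bmatrix}^{-1}\begin{bmatrix} -0.51 \\ 0.70 \end{bmatrix} = \begin{bmatrix} -0.30 \\ 0.82 \end{bmatrix}$$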
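The cross-correlation vector h and the optimal weights can be recomputed from the signal definitions. The sketch below takes R from the previous slide, uses E[s v] = 0 as stated above, and solves the 2 x 2 system by Cramer's rule; the helper names are assumptions.

```python
import math

def v(k):
    return 1.2 * math.sin(2.0 * math.pi * k / 3.0)

def m(k):
    return 1.2 * math.sin(2.0 * math.pi * k / 3.0 - 3.0 * math.pi / 4.0)

def period_average(f):
    """Average f(k) over one period, k = 1, 2, 3."""
    return sum(f(k) for k in (1, 2, 3)) / 3.0

# The terms E[s v] are zero, so only the m(k) terms contribute to h.
h = [period_average(lambda k: m(k) * v(k)),
     period_average(lambda k: m(k) * v(k - 1))]

R = [[0.72, -0.36], [-0.36, 0.72]]   # correlation matrix from the slides
det = R[0][0] * R[1][1] - R[0][1] * R[1][0]
x_star = [(h[0] * R[1][1] - R[0][1] * h[1]) / det,
          (R[0][0] * h[1] - R[1][0] * h[0]) / det]

print([round(c, 2) for c in h])
print([round(c, 2) for c in x_star])
```

Rounded to two decimals, the results match h = [-0.51, 0.70] and x* = [-0.30, 0.82].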
10 Performance Index
$$F(\mathbf{x}) = c - 2\mathbf{x}^T\mathbf{h} + \mathbf{x}^T\mathbf{R}\mathbf{x}$$

$$c = E[t^2(k)] = E[(s(k) + m(k))^2]$$

$$c = E[s^2(k)] + 2E[s(k)\,m(k)] + E[m^2(k)], \qquad E[s(k)\,m(k)] = 0$$

$$E[s^2(k)] = \frac{1}{0.4}\int_{-0.2}^{0.2} s^2\,ds = \frac{1}{3(0.4)}\,s^3\,\Big|_{-0.2}^{0.2} = 0.0133$$

$$E[m^2(k)] = \frac{1}{3}\sum_{k=1}^{3}\left\{1.2\sin\left(\frac{2\pi k}{3} - \frac{3\pi}{4}\right)\right\}^2 = 0.72$$

$$c = 0.0133 + 0.72 = 0.7333$$

$$F(\mathbf{x}^*) = 0.7333 - 2(0.72) + 0.72 = 0.0133$$
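The integral for E[s^2] treats s as uniformly distributed on [-0.2, 0.2]. A quick numerical check of that term and of the arithmetic for c and the minimum error is sketched below; the midpoint-rule helper is an illustrative assumption, and 0.72 is taken from the slide for the remaining terms.

```python
def uniform_second_moment(lo, hi, n=10000):
    """Midpoint-rule estimate of E[s^2] for s uniform on [lo, hi]."""
    width = (hi - lo) / n
    total = sum((lo + (i + 0.5) * width) ** 2 for i in range(n)) * width
    return total / (hi - lo)

E_s2 = uniform_second_moment(-0.2, 0.2)
E_m2 = 0.72                    # from the slide: (1.2)^2 * 0.5
c = E_s2 + E_m2                # the cross term E[s m] is zero
F_min = c - 2 * 0.72 + 0.72    # slide values for x*^T h and x*^T R x*

print(round(E_s2, 4), round(c, 4), round(F_min, 4))
```

The minimum error equals E[s^2]: at the optimum the filter cancels the sinusoidal noise, and only the EEG signal itself remains in the error.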
10 LMS Response

[Figure: LMS response. Panels: "Original and Restored EEG Signals" and "EEG Signal Minus Restored Signal", both plotted against time from 0 to 0.5.]
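A response of this kind can be approximated with a short simulation of the two-tap noise canceller. This is a sketch under stated assumptions: the learning rate α = 0.1, the number of steps, the zero initial weights, and the random seed are not given in the slides; s(k) is drawn uniformly from [-0.2, 0.2] as in the performance index calculation.

```python
import math
import random

def simulate_noise_cancellation(alpha=0.1, steps=2000, seed=0):
    """Run LMS on the two-tap noise canceller and return the final weights."""
    rng = random.Random(seed)
    w = [0.0, 0.0]                                   # [w11, w12], no bias
    for k in range(1, steps + 1):
        v_k = 1.2 * math.sin(2.0 * math.pi * k / 3.0)
        v_km1 = 1.2 * math.sin(2.0 * math.pi * (k - 1) / 3.0)
        s = rng.uniform(-0.2, 0.2)                   # EEG signal (random)
        m = 1.2 * math.sin(2.0 * math.pi * k / 3.0 - 3.0 * math.pi / 4.0)
        t = s + m                                    # contaminated signal
        a = w[0] * v_k + w[1] * v_km1                # filter's noise estimate
        e = t - a                                    # restored signal
        w = [w[0] + 2.0 * alpha * e * v_k,
             w[1] + 2.0 * alpha * e * v_km1]
    return w

w_final = simulate_noise_cancellation()
print(w_final)
```

The final weights should hover near x* = [-0.30, 0.82] rather than settle exactly on it, because each update is driven by a noisy single-sample gradient.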
10 Echo Cancellation

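[Figure: echo cancellation block diagram. At each end of the connection a phone is attached to a hybrid, and an adaptive filter feeds a summing junction (+/-) on the transmission line; there is one transmission line in each direction.]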
