
MATLAB Code of LMS Algorithm

This document summarizes an academic report on implementing acoustic echo cancellation algorithms in MATLAB. It describes using the LMS and NLMS adaptive filtering algorithms to cancel echo in a teleconference system. The report includes an overview of acoustic echo and cancellation techniques, adaptive filter algorithms like LMS and NLMS, double-talk detection methods, and a simulation of the algorithms in MATLAB. Evaluation of the LMS and NLMS results demonstrates echo reduction during testing of the implemented acoustic echo cancellation system.


School of Mathematics and Systems Engineering

Reports from MSI - Rapporter från MSI

Implementation of the LMS and NLMS algorithms
for Acoustic Echo Cancellation in a teleconference system
using MATLAB

Hung Ngoc Nguyen


Majid Dowlatnia
Azhar Sarfraz

December 2009
Växjö University
SE-351 95 VÄXJÖ

MSI Report 09087
ISSN 1650-2647
ISRN VXU/MSI/ED/E/--09087/--SE

Abstract

In hands-free telephony and in teleconference systems, the main aim is to provide good
voice quality when two or more people communicate from different places. A problem
that often arises during the conversation is acoustic echo. Echo degrades the quality of
the voice signal, so that talkers cannot clearly hear the content of the conversation and
may even lose important information. Acoustic echo is noise created by the reflection of
sound waves from the walls of the room and from the other objects in it. The main
objective for engineers is to cancel this acoustic echo and provide an echo-free
environment for speakers during conversation. For this purpose, different adaptive
filter algorithms have been designed. This thesis studies and simulates acoustic echo
cancellation using different adaptive algorithms.

Acknowledgements

We would like to express our sincere gratitude to our supervisor, Professor Sven
Nordebo, for being a constant source of help and inspiration throughout our work. His
timely advice and guidance have helped us through many difficult situations.
Finally, we want to thank our families and friends for their encouragement and
support in accomplishing this Master's thesis.

Table of contents

Abstract ............................................................................................................................. 1
Acknowledgements ............................................................................................................ 2
List of tables ...................................................................................................................... 5
List of figures ..................................................................................................................... 6
CHAPTER I : INTRODUCTION ................................................................................. 7
1.1. Overview ................................................................................................................ 7
1.1.1. Echo ................................................................................................................. 7
1.1.2. Acoustic Echo Cancellation (AEC) ................................................................. 9
1.2. Thesis organization ............................................................................................. 10
CHAPTER II : THEORY OF ACOUSTIC ECHO CANCELLATION ................. 12
2.1. System overview.................................................................................................. 12
2.1.1. Adaptive Filter ............................................................................................... 13
2.1.2. Double-talk detector (DTD) ........................................................................... 13
2.1.3. Nonlinear Processor (NLP) ............................................................................ 13
2.2. Adaptive filter Algorithms ................................................................................. 14
2.2.1. Wiener Filter .................................................................................................. 15
2.2.2. The Steepest Descent Method ......................................................... 18
2.2.3. Least Mean Square (LMS) Algorithm ........................................... 21
2.2.4. Normalized Least Mean Square (NLMS) Algorithm .................... 23
2.2.5. Recursive Least Square (RLS) ....................................................... 25
2.3. Double-talk detector (DTD) ............................................................................... 28
2.3.1. The Geigel algorithm ..................................................................................... 30
2.3.2. The Cross-correlation (Benesty) algorithm ................................................... 30
2.3.3. Normalized cross-correlation (NCC) algorithm ............................ 32
2.4. Frequency-domain acoustic echo cancellation................................. 34
2.4.1. The generic frequency domain echo canceller .............................. 34
2.4.2. Sub-band adaptive filter ................................................................. 37
CHAPTER III : SIMULATION .................................................................................. 40
3.1. General setup of Simulation .............................................................................. 40
3.1.1. Setup of the Simulation .................................................................................. 40
3.1.2. Flowchart of the AEC algorithm.................................................................... 42
3.2. Measure Room Impulse Response (RIR) ......................................................... 44
3.2.1. Why we must measure the RIR ....................................................... 44
3.2.2. Method of measuring RIR ............................................................................. 44

3.2.3. Result ............................................................................................................. 45
3.3. Explanation of MATLAB code.......................................................................... 46
3.3.1. Adaptive filter algorithms .............................................................................. 46
3.3.2. Double-talk Detector algorithm ..................................................................... 49
3.3.3. Calculate other issues ..................................................................................... 50
3.4. Results .................................................................................................................. 52
3.4.1. LMS algorithm ............................................................................... 53
3.4.2. NLMS algorithm ............................................................................ 55
3.5. Evaluation............................................................................................................ 57
CHAPTER IV : CONCLUSION AND FURTHER WORK ..................................... 59
4.1. Conclusion ........................................................................................................... 59
4.2. Further Work ..................................................................................... 60
References ....................................................................................................... 61
Appendix A: MATLAB code of LMS algorithm ............................................................... 62
Appendix B: MATLAB code of NLMS algorithm ............................................................ 65

List of tables
Table III-1: Summary of LMS algorithm............................................................... 47
Table III-2: MATLAB code of LMS algorithm ..................................................... 47
Table III-3: Summary NLMS algorithm ................................................................ 48
Table III-4: MATLAB code of NLMS algorithm .................................................. 48
Table III-5: Summary of NCC double-talk detection algorithm............................ 49
Table III-6: MATLAB code of NCC double-talk detection algorithm .................. 50

List of figures
Figure I-1: A teleconference system with echo paths of room ......................................... 8
Figure I-2: Implement Acoustic Echo Canceller using adaptive filter ........................... 10
Figure II-1: Block diagram of AEC ................................................................................ 12
Figure II-2: The basic model of AEC.............................................................................. 14
Figure II-3: General Wiener filter problem. .................................................................... 15
Figure II-4: Illustration of gradient search of the mean square error surface for the
minimum error point ....................................................................................................... 18
Figure II-5: Feedback model of the variation of coefficient error with time .................. 21
Figure II-6: Double-talk detector with AEC ................................................................... 29
Figure II-7: Frequency domain echo canceller ............................................................... 35
Figure II-8: Sub-band adaptive filtering (SAF) for M sub-bands ................................... 37
Figure II-9: Analysis filter bank ...................................................................................... 38
Figure II-10: Synthesis filter bank .................................................................................. 39
Figure III-1: Far-end and near-end speeches................................................................... 42
Figure III-3: The room impulse response measured with 8000 taps .............................. 45
Figure III-4: The room impulse response (128 taps) ..................................................... 46
Figure III-5: Plot near-end speech. .................................................................................. 52
Figure III-6: Plot the needed signals (LMS algorithm) ................................................... 53
Figure III-7: Double-talk detection of LMS algorithm ................................................... 54
Figure III-8: Evaluation of LMS algorithm .................................................................... 54
Figure III-9: Plot the needed signals (NLMS algorithm) ................................................ 55
Figure III-10: Double-talk detection of NLMS algorithm .............................................. 56
Figure III-11: Evaluation of NLMS algorithm................................................................ 56

CHAPTER I : INTRODUCTION

1.1. OVERVIEW
In hands-free telephony and in teleconference systems, the main aim is to provide good
voice quality when two or more people communicate from different places. A problem
that often arises during the conversation is acoustic echo. Echo degrades the quality of
the voice signal, so that talkers cannot clearly hear the content of the conversation and
may even lose important information. Acoustic echo is noise created by the reflection of
sound waves from the walls of the room and from the other objects in it. The main
objective for engineers is to cancel this acoustic echo and provide an echo-free
environment for speakers during conversation. For this purpose, different adaptive
filter algorithms have been designed. This thesis studies and simulates acoustic echo
cancellation using different adaptive filter algorithms.

1.1.1. Echo
In principle, “Echo is the phenomenon in which a delayed and distorted version of an
original sound or electrical signal is reflected back to the source” [4]. There are two
types of echo:
1. Electrical echo: caused by the impedance mismatch at the hybrid transformers
where the subscriber two-wire lines are connected to the telephone exchange
four-wire lines in telecommunication systems.
2. Acoustic echo: caused by the reflection of sound waves and the acoustic
coupling between the loudspeaker and the microphone.
In a teleconference system (Figure I-1), the far-end speech signal played by the
loudspeaker is, after travelling directly and reflecting from the walls, floor and other
objects inside the room, picked up by the near-end microphone; as a result, an echo is
sent back to the far end. This acoustic echo disturbs the conversation and reduces the
quality of the system. It is a common problem in communication networks.


Figure I-1: A teleconference system with echo paths of room

Two main characteristics of echo are reverberation and latency. Reverberation is the
persistence of sound after the original sound has stopped; it slowly decays because of
absorption by the materials of the environment. Latency, or delay, is the time
difference of the signal between the transmitter and the receiver. In a teleconference
system, where the sound is generated by the loudspeaker and received by the
microphone, the delay can be computed from the distance between them (i.e., the
length of the direct sound path):
Delay = distance / speed of sound
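As a quick numerical illustration of this formula (the 343 m/s speed of sound and the 8 kHz sampling rate below are our own illustrative choices, not values from the thesis), the direct-sound delay can be computed in a few lines of Python:

```python
def acoustic_delay(distance_m, c=343.0):
    """Direct-sound delay in seconds for a given loudspeaker-microphone
    distance, assuming a speed of sound c of about 343 m/s in air."""
    return distance_m / c

# Loudspeaker 2 m from the microphone:
delay = acoustic_delay(2.0)        # about 5.8 ms
samples = round(delay * 8000)      # the same delay in samples at 8 kHz
```

At typical room distances the direct-sound delay is only a few milliseconds; the reverberant tail modelled by the room impulse response is much longer.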

1.1.2. Acoustic Echo Cancellation (AEC)
To handle the acoustic echo problem described above in teleconference systems, one
can use voice switches and directional microphones, but these methods place physical
restrictions on the speaker. The more common and effective method is to implement
Acoustic Echo Cancellation (AEC) to remove the echo. AEC greatly enhances the
quality of the audio signal in hands-free communication systems; with its assistance,
conferences run more smoothly and naturally, keeping the participants more
comfortable.
Several echo cancellation algorithms are used for this purpose. All of them process the
signals following the basic steps below:
1. Estimate the characteristics of the echo path of the room.
2. Create a replica of the echo signal.
3. Subtract the replica from the microphone signal (which contains the near-end
and echo signals) to obtain the desired signal.
An adaptive filter is well suited to producing a good replica because the echo path is
usually unknown and time-varying. In Figure (I-2), AEC with an adaptive filter
follows the same three steps:

1. Estimate the characteristics of the echo path h(n) of the room: ĥ(n)

2. Create a replica of the echo signal: ŷ(n)

3. Subtract the replica from the microphone signal d(n) (near-end plus echo) to
obtain the desired signal: clear signal = d(n) − ŷ(n)
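The three steps can be sketched numerically. The following Python fragment is a toy illustration (the 3-tap echo path and the assumption of a perfect estimate ĥ = h are hypothetical, chosen only to show the structure of the computation), not the thesis code:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal(1000)          # far-end signal x(n)
v = np.zeros(1000)                     # near-end signal v(n) (silent here)
h = np.array([0.5, 0.3, 0.1])          # true (unknown) echo path

d = np.convolve(x, h)[:1000] + v       # microphone signal d(n) = y(n) + v(n)

# Step 1: estimate the echo path (assumed perfect for illustration)
h_hat = h.copy()
# Step 2: create a replica of the echo signal
y_hat = np.convolve(x, h_hat)[:1000]
# Step 3: subtract the replica from the microphone signal
clear = d - y_hat                      # ideally equals v(n)
```

With a perfect estimate the residual is exactly the near-end signal; Chapter II shows how an adaptive filter approaches this estimate iteratively.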

In modern digital communication systems such as the Public Switched Telephone
Network (PSTN), Voice over IP (VoIP), Voice over Packet (VoP) and cellular
networks, AEC is very important and necessary because it brings a better quality of
service, which is a main goal of the communication service providers.


Figure I-2: Implement Acoustic Echo Canceller using adaptive filter

1.2. THESIS ORGANIZATION


This thesis concerns Acoustic Echo Cancellation. It contains four chapters focusing on
two main parts, theory and simulation, which together discuss the two main issues of
acoustic echo cancellation, namely the adaptation algorithms and the control of
adaptation in double-talk situations.
Chapter 1: gives general information and an introduction to the problems and
solutions related to the thesis topic, with brief descriptions of echo theory and the
acoustic echo problem in teleconference and other telecommunication systems.
Chapter 2: presents the theoretical background. The adaptive filter used to model the
acoustic echo path is the central part of the AEC, and much research effort has been
devoted to it. The Least Mean Square (LMS) algorithm is an old, simple and proven
algorithm which works well in comparison with newer, more advanced algorithms. In
this project we use the normalized LMS (NLMS) for the main filter in the AEC, since
NLMS is so far the most popular algorithm in practice owing to its computational
simplicity. After that, the generic double-talk detection scheme is outlined and several
well-known double-talk detectors are discussed. The Geigel algorithm is simple and
works well when the echo of the far-end signal is sufficiently smaller than the
near-end speech; that is, it relies on an assumption about the echo path, so in practice
it is not widely applied to echo cancellation algorithms. The Normalized
Cross-correlation method uses the correlation between the error signal and the
microphone signal, which brings more promising results than the Geigel algorithm.
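The Geigel rule mentioned here can be sketched as follows. This Python fragment is our own illustration, not thesis code: the 0.5 threshold reflects the usual assumption of at least 6 dB of echo-path attenuation, and the window of recent far-end samples is hypothetical:

```python
import numpy as np

def geigel_dtd(d_n, x_recent, threshold=0.5):
    """Declare double-talk when the current microphone sample is large
    compared with the recent far-end samples (Geigel rule)."""
    return abs(d_n) > threshold * np.max(np.abs(x_recent))

x_recent = np.array([1.0, 0.8, 0.9])     # recent far-end samples
echo_only = geigel_dtd(0.3, x_recent)    # attenuated echo only: not flagged
double_talk = geigel_dtd(0.9, x_recent)  # near-end talker active: flagged
```

The rule needs no correlation estimates, which is what makes it cheap, and also what ties its reliability to the assumed echo-path attenuation.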
Chapter 3: is devoted to the evaluation of the algorithms discussed above. Through a
set of recordings and simulations in MATLAB, we try to find out which adaptive
filtering and double-talk detection algorithms best suit a PC application.
Chapter 4: draws the conclusion and presents possible future work.

CHAPTER II : THEORY OF ACOUSTIC
ECHO CANCELLATION

2.1. SYSTEM OVERVIEW


Acoustic echo cancellation is required in different fields of communication to remove
the echo caused by the coupling between the loudspeaker and the microphone. If this
is not done, the coupling results in an undesired acoustic echo which degrades the
quality of the sound.


Figure II-1: Block diagram of AEC

Figure (II-1) shows a block diagram of an AEC system, which consists of the
following three components:
1. Adaptive filter.
2. Double-talk detector.
3. Nonlinear processor.

2.1.1. Adaptive Filter


The adaptive filter is the most important component of an acoustic echo canceller and
plays the key role in acoustic echo cancellation. It estimates the echo path of the room
in order to obtain a replica of the echo signal, and it must update adaptively to follow
environmental changes. Another important property is the convergence rate, which
measures how fast the filter converges to the best estimate of the room acoustic path.

2.1.2. Double-talk detector (DTD)


It is rather difficult to predict when the adaptation of the filter should stop or slow
down, and it is also important to know whether a near-end speech signal exists in the
presence of the far-end signal. The situation when both ends talk (near-end and
far-end) is known as double-talk. In case of double-talk, the error signal contains both
the echo estimation error and the near-end speech signal; if this signal is used to
update the filter coefficients, the filter diverges. As a result, the adaptive filter works
incorrectly and a bad sound signal is produced. To overcome this problem, one uses a
double-talk detector.

2.1.3. Nonlinear Processor (NLP)


The nonlinear processor (NLP) is required to completely or partly cancel the residual
signal in the absence of the near-end speech signal. Removing the residual signal
cancels any remaining acoustic echo. The NLP gradually cancels the signal and
inserts a form of comfort noise to give a natural impression to the far end. The NLP,
as well as the adaptive filter, needs an accurate decision from the DTD to operate
efficiently.

2.2. ADAPTIVE FILTER ALGORITHMS
Adaptive filtering is the process required for echo cancelling in different applications.
An adaptive filter is a filter whose characteristics can be changed to achieve an
optimal desired output: it changes its parameters to minimize the error signal using an
adaptive algorithm. The error is the difference between the desired signal and the
output signal of the filter. The figure below shows the basic model of the adaptive
filter used in AEC.


Figure II-2: The basic model of AEC

The notations used in the figure above and throughout this thesis are:

• Far-end signal: x(n)

• Near-end signal: v(n)

• The true echo path (room impulse response): h

• Echo signal: y(n)

• Microphone signal: d(n) = v(n) + y(n)

• Estimated echo path: ĥ

• Estimated echo signal: ŷ(n)

• Error signal: e(n) = v(n) + y(n) − ŷ(n)

The echo path h of the room normally varies, depending on the room structure and on
objects moving inside it. The estimated echo ŷ(n) is calculated from the reference
input signal x(n) and the adaptive filter ĥ. The near-end signal v(n) and background
noise are added to the echo signal y(n) to create the desired signal d(n),

d(n) = v(n) + y(n)    (2.1)

The signals x(n) and y(n) are correlated. The error signal is

e(n) = d(n) − ŷ(n) = v(n) + y(n) − ŷ(n)    (2.2)

The adaptive filter works to drive the residual echo ( y(n) − ŷ(n) ) to zero, so that in
the perfect case only the near-end signal v(n) remains.

In Acoustic Echo Cancellation (AEC), the adaptive filter plays the main role: it adapts
the filter tap weights in order to overcome the echo problem. Different types of
algorithms are used for this purpose, such as Least Mean Square (LMS), Normalized
Least Mean Square (NLMS), Recursive Least Square (RLS) and the Affine Projection
Algorithm (APA). LMS is a widely used algorithm for adaptive applications such as
channel equalization and echo cancellation, and it is the simplest compared with the
NLMS and RLS algorithms. The normalized least mean square (NLMS) algorithm is
also popular due to its computational simplicity.

2.2.1. Wiener Filter


Figure II-3: General Wiener filter problem.

Wiener filters play a central role in a wide range of applications such as linear
prediction, echo cancellation, signal restoration, channel equalization and system
identification. [5]
The FIR Wiener filter is the signal-processing structure that produces the minimum
mean-square estimate d̂(n) of d(n). The two signals x(n) and d(n) are assumed to be
wide-sense stationary with known autocorrelations rx(k), rd(k) and cross-correlation
rdx(k), and w(n) is the unit sample response of the Wiener filter:

W(z) = ∑_(n=0)^(p−1) w(n) z^(−n)    (2.3)

The output signal d̂(n) of the Wiener filter is the convolution of w(n) and x(n),

d̂(n) = ∑_(l=0)^(p−1) w(l) x(n−l)    (2.4)

The requirement on the filter is to find the coefficients w(k) that minimize the
mean-square error,

ξ = E{|e(n)|²} = E{|d(n) − d̂(n)|²}    (2.5)

Now taking the derivative of both sides with respect to w*(k), for k = 0, 1, ..., p − 1,

∂ξ/∂w*(k) = ∂/∂w*(k) E{e(n)e*(n)} = E{e(n) ∂e*(n)/∂w*(k)}    (2.6)

This derivative must equal zero to minimize ξ for a set of filter coefficients,

E{e(n) ∂e*(n)/∂w*(k)} = 0    (2.7)

where the error signal is

e(n) = d(n) − d̂(n) = d(n) − ∑_(l=0)^(p−1) w(l) x(n−l)    (2.8)

It follows that

∂e*(n)/∂w*(k) = −x*(n−k)

so Equation (2.7) becomes

E{e(n) x*(n−k)} = 0 ;  k = 0, 1, 2, ..., p − 1    (2.9)

This equation is known as the orthogonality principle or the projection theorem.

By substituting e(n) from Equation (2.8) into Equation (2.9), we have

E{d(n) x*(n−k)} − ∑_(l=0)^(p−1) w(l) E{x(n−l) x*(n−k)} = 0    (2.10)

Since x(n) and d(n) are assumed jointly wide-sense stationary,

E{x(n−l) x*(n−k)} = rx(k−l)    (2.11)

E{d(n) x*(n−k)} = rdx(k)    (2.12)

so Equation (2.10) becomes

∑_(l=0)^(p−1) w(l) rx(k−l) = rdx(k) ;  k = 0, 1, 2, ..., p − 1    (2.13)

This set of equations is known as the Wiener-Hopf equations, which can be written
in matrix form as

Rx w = rdx    (2.14)

where:

• Rx is the p × p Hermitian Toeplitz matrix of autocorrelations

• w is the vector of filter coefficients

• rdx is the vector of cross-correlations between d(n) and x(n)

Returning to Equation (2.5), we find the minimum mean-square error:

ξ = E{|e(n)|²} = E{e(n) [d(n) − ∑_(l=0)^(p−1) w(l) x(n−l)]*}
  = E{e(n) d*(n)} − ∑_(l=0)^(p−1) w*(l) E{e(n) x*(n−l)}    (2.15)

By Equation (2.9), the second term of Equation (2.15) is zero, so we obtain

ξmin = E{e(n) d*(n)}    (2.16)

Substituting e(n) = d(n) − ∑_(l=0)^(p−1) w(l) x(n−l), Equation (2.16) becomes

ξmin = E{[d(n) − ∑_(l=0)^(p−1) w(l) x(n−l)] d*(n)}    (2.17)

Finally, taking the expected values, we have

ξmin = rd(0) − ∑_(l=0)^(p−1) w(l) rdx*(l)    (2.18)

In vector form, from Equations (2.14) and (2.18),

ξmin = rd(0) − rdxᴴ w    (2.19)

or

ξmin = rd(0) − rdxᴴ Rx⁻¹ rdx    (2.20)
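The Wiener-Hopf equation (2.14) and the minimum error (2.19) can be checked numerically. In the following Python sketch the correlation values are invented for a hypothetical p = 2 example; for real-valued signals the conjugations drop out, so plain transposes suffice:

```python
import numpy as np

Rx = np.array([[1.0, 0.5],
               [0.5, 1.0]])       # Hermitian Toeplitz autocorrelation matrix
rdx = np.array([0.7, 0.3])        # cross-correlation vector
rd0 = 1.0                         # rd(0), the power of d(n)

w = np.linalg.solve(Rx, rdx)      # optimal coefficients from Rx w = rdx (2.14)
xi_min = rd0 - rdx @ w            # minimum mean-square error (2.19)
```

Solving the linear system directly corresponds to Equation (2.20): the minimum error is what remains of the power of d(n) after the part predictable from x(n) has been removed.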

2.2.2. The Steepest Descent Method


The method of steepest descent is an iterative procedure that has been used to find the
extrema of nonlinear functions since before the time of Newton. [5]
In the steepest descent, or gradient, algorithm, the mean square error surface (with
respect to the FIR filter coefficients) is a quadratic bowl-shaped curve, as shown in
Figure (II-4) below.

E  e2 (n) 

woptimal w(i ) w ( i − 1) w (i − 2 ) w

Figure II-4: Illustration of gradient search of the mean square error surface for the
minimum error point

This figure shows the mean square error curve for a single-coefficient filter and the
steepest-descent search for the coefficient with minimum mean square error. The
search takes successive downward steps in the direction of the negative gradient of
the error surface: starting from some initial value, the filter coefficients are updated in
the direction of the negative gradient until a point is reached where the gradient is
zero. The steepest-descent adaptation can be written as

w(n+1) = w(n) − µ∇ξ(n)    (2.22)

where µ is the step-size parameter and ξ (n) is the mean square error at time n.

Now assume that

d̂(n) = wᵀx(n)    (2.23)

Rx = E{x(n) xᵀ(n)}    (2.24)

rdx = E{d(n) x(n)}    (2.25)

The gradient of the mean square error function is

∇ξ(n) = ∇E{|e(n)|²} = E{∇|e(n)|²} = E{e(n) ∇e*(n)}    (2.26)

and we know that

∇e*(n) = −x*(n)

Thus,

∇ξ(n) = −E{e(n) x*(n)}    (2.27)

If x(n) and d(n) are jointly WSS (wide-sense stationary), then

E{e(n) x*(n)} = E{d(n) x*(n)} − E{wᵀ(n) x(n) x*(n)} = rdx − Rx w(n)    (2.28)

Therefore,

∇ξ(n) = −rdx + Rx w(n)    (2.29)

Combining Equations (2.22) and (2.29), we have

w(n+1) = w(n) + µ[rdx − Rx w(n)]    (2.30)

Now we can define a filter coefficient error vector as

w̃(n) = w(n) − w₀    (2.31)

where w₀ is the optimal least-squares filter coefficient vector, given by the Wiener
filter as

w₀ = Rx⁻¹ rdx    (2.32)

After a few mathematical rearrangements of the last three equations, (2.30) becomes

w̃(n+1) = [I − µRx] w̃(n)    (2.33)

The step-size parameter µ controls the stability and the rate of convergence of the
adaptive filter: the filter becomes unstable if µ is too large and converges slowly if µ
is too small. The stability of the filter depends on the choice of the step size µ and on
the autocorrelation matrix. The correlation matrix can be expressed in terms of its
eigenvectors and eigenvalues as

Rx = QΛQᵀ    (2.34)

where:

• Q is the orthonormal matrix of eigenvectors of Rx

• Λ is a diagonal matrix whose diagonal elements are the eigenvalues of Rx

Substituting Rx from Equation (2.34) into Equation (2.33), we obtain

w̃(n+1) = [I − µQΛQᵀ] w̃(n)    (2.35)

Multiplying both sides of Equation (2.35) by Qᵀ and using the relation
QᵀQ = QQᵀ = I yields

Qᵀw̃(n+1) = [I − µΛ] Qᵀw̃(n)    (2.36)

Letting v(n) = Qᵀw̃(n), Equation (2.36) becomes

v(n+1) = [I − µΛ] v(n)    (2.37)

Here, I and Λ are diagonal matrices, so the above equation can be written in terms of
the individual elements of the error vector v(n) as

vk(n+1) = [1 − µλk] vk(n)    (2.38)

where λk is the k-th eigenvalue of the autocorrelation matrix of the filter input x(n).


Figure II-5: Feedback model of the variation of coefficient error with time

From Equation (2.38), the condition for stability of the adaptation process and for
decay of the coefficient error vector is

−1 < 1 − µλk < 1    (2.39)

Denoting by λmax the maximum eigenvalue of the autocorrelation matrix, the limits of
µ for stable adaptation are

0 < µ < 2/λmax    (2.40)
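The bound (2.40) can be verified numerically. In the Python sketch below (the correlation matrix and cross-correlation vector are invented for illustration), a step size chosen inside the bound makes the steepest-descent iteration (2.30) converge to the Wiener solution (2.32):

```python
import numpy as np

Rx = np.array([[2.0, 0.8],
               [0.8, 2.0]])            # example autocorrelation matrix
rdx = np.array([1.0, 0.5])             # example cross-correlation vector
w_opt = np.linalg.solve(Rx, rdx)       # Wiener solution w0 (Eq. 2.32)

lam_max = np.linalg.eigvalsh(Rx).max() # largest eigenvalue of Rx
mu = 1.0 / lam_max                     # inside 0 < mu < 2/lam_max (Eq. 2.40)

w = np.zeros(2)
for _ in range(200):
    w = w + mu * (rdx - Rx @ w)        # steepest-descent update (Eq. 2.30)
# the coefficient error w - w_opt decays towards zero
```

A step size just above 2/λmax makes the same loop diverge, which is exactly the instability predicted by Equation (2.39).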

2.2.3. Least Mean Square (LMS) Algorithm


In 1959, Widrow and Hoff [3] derived the Least Mean Square (LMS) algorithm,
which is still one of the most widely used adaptive filtering algorithms, with
applications such as channel equalization and echo cancellation. The algorithm
adjusts the coefficients w(n) of a filter in order to reduce the mean square error
between the desired signal and the output of the filter. It belongs to the class of
adaptive filters known as stochastic gradient algorithms: to converge towards the
optimal Wiener solution, it uses a stochastic estimate of the gradient with respect to
the filter tap weights. The algorithm is also favoured for its computational simplicity.

The LMS update of the tap weights of the adaptive filter at each iteration is

w(n+1) = w(n) + µ e(n) x*(n)    (2.41)

where:

• x(n) : input vector of time-delayed input values

• w(n) : weight vector at time n

µ is a step-size parameter that controls how large each update is, and its value has a
great impact on the performance of the LMS algorithm. If µ is too small, the adaptive
filter takes a long time to converge to the optimal solution; if it is too large, the
adaptive filter diverges and becomes unstable.

Derivation of the LMS algorithm:


The LMS algorithm is derived from the method of steepest descent, together with the theory of the Wiener solution (the optimal filter tap weights). It updates the filter coefficient vector w using the gradient of the cost function with respect to the tap weights, ∇ξ(n). From Equation (2.22) of the steepest descent algorithm,

w(n + 1) = w(n) − µ∇ξ(n)

w (n + 1) = w (n) + µ E {e(n)x* (n)} (2.42)

In practice, the value of the expectation E{e(n)x*(n)} is normally unknown, so it is approximated by the sample mean,

Ê{e(n)x*(n)} = (1/L) Σl=0..L−1 e(n − l) x*(n − l)   (2.43)

With this estimate, the weight vector update becomes,

w(n + 1) = w(n) + (µ/L) Σl=0..L−1 e(n − l) x*(n − l)   (2.44)

If we use the one-point sample mean (L = 1), then,

Eˆ {e(n)x* (n)} = e(n)x* (n) (2.45)

And finally, the weight vector update equation takes the simple form,

w (n + 1) = w (n) + µ e(n)x* (n) (2.46)
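As an illustration of update (2.46), the following Python/NumPy sketch identifies an assumed 8-tap echo path from a white-noise input. The filter length, step size and signals are all assumptions, and the thesis's own MATLAB version of LMS appears in Chapter III:

```python
import numpy as np

rng = np.random.default_rng(0)
L = 8                                 # filter length (assumption)
h = rng.standard_normal(L) * 0.5      # "true" unknown impulse response
x = rng.standard_normal(4000)         # far-end input signal (real, so x* = x)
d = np.convolve(x, h)[:len(x)]        # desired signal = echo of x

mu = 0.01                             # step size, within 0 < mu < 2/lambda_max
w = np.zeros(L)                       # adaptive weight vector
xin = np.zeros(L)                     # tap-delay line

for n in range(len(x)):
    xin = np.concatenate(([x[n]], xin[:-1]))   # shift the new sample in
    y = w @ xin                                # filter output
    e = d[n] - y                               # estimation error
    w = w + mu * e * xin                       # LMS weight update (2.46)
```

With a noiseless desired signal, the weight vector converges close to the true impulse response.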

2.2.3. Normalized Least Mean Square (NLMS) Algorithm


The Normalized Least Mean Square (NLMS) algorithm [5] is the LMS algorithm with a normalized step-size parameter. The step-size used to compute the weight vector update is,

µ(n) = β / (c + ||x(n)||²)   (2.47)

Where,

• µ (n) is step-size parameter at sample n

• β is normalized step-size ( 0 < β < 2 )

• c is safety factor (small positive constant)

Derivation of the NLMS algorithm:


The Normalized Least Mean Square (NLMS) algorithm is derived from the LMS algorithm. The motivation is that the input signal power changes over time; this changes the effective size of the coefficient updates and thereby affects the convergence rate. For quiet signals the convergence slows down, while for loud signals the updates become large and cause errors. To overcome this problem, the step-size parameter is adjusted with respect to the input signal power, and is therefore said to be normalized.
When designing the LMS adaptive filter, one difficulty is the selection of the step-size parameter µ. For stationary processes, the algorithm converges within the limits:

0 < µ < 2/λmax   and   0 < µ < 2/trace(Rx)
However, the autocorrelation matrix Rx is generally unknown, so λmax and trace(Rx) must be estimated in order to use these bounds. One therefore introduces an estimate of trace(Rx),

trace(Rx) = (p + 1) E{|x(n)|²}   (2.48)

Where,

• p = 0, 1, 2, ...

• E{|x(n)|²} is the power of the input signal. It can be estimated by,

Ê{|x(n)|²} = (1/(p + 1)) Σk=0..p |x(n − k)|²   (2.49)

Therefore, the limits of the step-size parameter become,

0 < µ < 2 / ((p + 1) E{|x(n)|²})   (2.50)

Substituting Equation (2.49) into Equation (2.50), one gets the bound,

0 < µ < 2 / (xH(n) x(n))   (2.51)

For time-varying processes, one computes the step-size parameter at each time instant (sample n),

µ(n) = β / (xH(n) x(n)) = β / ||x(n)||²   (2.52)

Where,

• β is normalized step-size ( 0 < β < 2 )

By replacing µ with µ(n) in the weight vector update Equation (2.46) of the LMS algorithm, we obtain a new algorithm known as the Normalized Least Mean Square (NLMS) algorithm. The weight vector update is now,

w(n + 1) = w(n) + µ(n) e(n) x*(n)

or   w(n + 1) = w(n) + (β / ||x(n)||²) e(n) x*(n)   (2.53)

In the LMS algorithm, the weight vector update depends directly on the input signal x(n). This leads to a problem called gradient noise amplification [5] when x(n) is too large; the NLMS algorithm avoids it. However, looking at Equation (2.53), when ||x(n)|| is very small the computation of the weight vector update itself becomes a problem. For this reason, one implements a safety factor,

w(n + 1) = w(n) + (β / (c + ||x(n)||²)) e(n) x*(n)   (2.54)

Where,

• c is a safety factor (small positive constant)

Finally, Equation (2.54) is the weight vector update equation of the NLMS algorithm.
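A corresponding Python/NumPy sketch of the NLMS update (2.54) is given below; β, c and the signals are assumed values, and the thesis's MATLAB version appears in Chapter III:

```python
import numpy as np

rng = np.random.default_rng(1)
L = 8
h = rng.standard_normal(L) * 0.5        # "true" unknown echo path
x = rng.standard_normal(4000) * 3.0     # louder input: plain LMS would need retuning
d = np.convolve(x, h)[:len(x)]

beta, c = 0.5, 1e-2                     # normalized step size (0 < beta < 2), safety factor
w = np.zeros(L)
xin = np.zeros(L)

for n in range(len(x)):
    xin = np.concatenate(([x[n]], xin[:-1]))
    e = d[n] - w @ xin                  # estimation error
    mu_n = beta / (c + xin @ xin)       # normalized step size (2.47)
    w = w + mu_n * e * xin              # NLMS update (2.54)
```

Because the step size is divided by the instantaneous input power, the same β works regardless of how loud the input is.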

2.2.4. Recursive Least Square (RLS)


The RLS filter is a simple adaptive, time-updated version of the Wiener filter [12]. For non-stationary signals it tracks the time variations, while for stationary signals its convergence behaviour is the same as that of the Wiener filter: it converges to the same optimal coefficients. The RLS filter has a fast convergence rate and is widely used in applications such as echo cancellation, channel equalization, speech enhancement and radar, where the filter must follow fast changes in the signal. The choice of this adaptive algorithm is based on the following factors:

• Computational complexity

• Speed of convergence

• Minimum error at convergence

• Numerical stability

• Robustness
For RLS algorithm, we consider the following:

• x(n) is the discrete-time M × 1 input vector.

• y (n) = w H x(n) is the output signal.

• d (n) is the desired signal.

• w is the M × 1 complex weight vector.

The cost function fn(w) at time instant n is given by,

fn(w) = Σk=1..n λ^(n−k) |wH x(k) − d(k)|² + (w − w0)H λ^n R0 (w − w0)   (2.55)

n = 1, 2, 3, ...

where,

• w 0 and R 0 are the initial chosen parameters

• λ is the real-positive constant (0 < λ < 1)

And we define the new function,

f0(w) = (w − w0)H R0 (w − w0)   (2.56)

p0 = R0 w0

Now consider the two column matrices,

A = [ λ^((n−1)/2) xH(1) ; … ; xH(n) ],   b = [ λ^((n−1)/2) d*(1) ; … ; d*(n) ]

Now the cost function fn(w) can be rewritten as,

fn(w) = (Aw − b)H (Aw − b) + (w − w0)H λ^n R0 (w − w0)   (2.57)

When n is large (n > M), λ^n → 0, hence the second term of Equation (2.57) vanishes and the least-squares error becomes,

fn(w) = (Aw − b)H (Aw − b)   (2.58)

wn is then a solution of the over-determined linear system of equations Aw = b.

Otherwise, by differentiating Equation (2.57), we obtain wn as,

wn = Rn⁻¹ pn   (2.59)

Where,

• Rn = AH A + λ^n R0 = Σk=1..n λ^(n−k) x(k) xH(k) + λ^n R0   (2.60)

• pn = AH b + λ^n p0 = Σk=1..n λ^(n−k) x(k) d*(k) + λ^n p0   (2.61)

Then we obtain the recursive relations for Rn and pn as,

Rn = Σk=1..n−1 λ^(n−k) x(k) xH(k) + x(n) xH(n) + λ^n R0
   = λ [ Σk=1..n−1 λ^(n−1−k) x(k) xH(k) + λ^(n−1) R0 ] + x(n) xH(n)
   = λ Rn−1 + x(n) xH(n)   (for n ≥ 1)   (2.62)

pn = Σk=1..n−1 λ^(n−k) x(k) d*(k) + x(n) d*(n) + λ^n p0
   = λ [ Σk=1..n−1 λ^(n−1−k) x(k) d*(k) + λ^(n−1) p0 ] + x(n) d*(n)
   = λ pn−1 + x(n) d*(n)   (for n ≥ 1)   (2.63)

Rewriting the recursive relation (2.62) for Rn, we have,

λ⁻¹ Rn = Rn−1 + x(n) λ⁻¹ xH(n)   (2.64)

Using the matrix inversion lemma, suppose that A and B are two positive-definite matrices related by,

B⁻¹ = A⁻¹ + C D⁻¹ CH   (2.65)

The correspondences with our matrices are:

B⁻¹ = λ⁻¹ Rn,   A⁻¹ = Rn−1,   C = x(n)   and   D⁻¹ = λ⁻¹

Now, taking the inverse of λ⁻¹ Rn, we have,

λ Rn⁻¹ = Rn−1⁻¹ − Rn−1⁻¹ x(n) [ xH(n) Rn−1⁻¹ x(n) + λ ]⁻¹ xH(n) Rn−1⁻¹   (2.66)

Therefore,

Rn⁻¹ = λ⁻¹ Rn−1⁻¹ − ( λ⁻¹ Rn−1⁻¹ x(n) xH(n) Rn−1⁻¹ ) / ( xH(n) Rn−1⁻¹ x(n) + λ )   (2.67)

Multiplying both sides by x(n), the useful relation Rn⁻¹ x(n) is,

Rn⁻¹ x(n) = λ⁻¹ Rn−1⁻¹ x(n) − ( λ⁻¹ Rn−1⁻¹ x(n) xH(n) Rn−1⁻¹ x(n) ) / ( xH(n) Rn−1⁻¹ x(n) + λ )

          = Rn−1⁻¹ x(n) / ( xH(n) Rn−1⁻¹ x(n) + λ )   (2.68)

Now we use the above equations to obtain the recursive relation for the least-squares solution wn,

wn = Rn⁻¹ pn = Rn⁻¹ ( λ pn−1 + x(n) d*(n) )

   = λ Rn⁻¹ pn−1 + Rn⁻¹ x(n) d*(n)

   = [ λ⁻¹ Rn−1⁻¹ − ( λ⁻¹ Rn−1⁻¹ x(n) xH(n) Rn−1⁻¹ ) / ( xH(n) Rn−1⁻¹ x(n) + λ ) ] λ pn−1 + Rn⁻¹ x(n) d*(n)

   = Rn−1⁻¹ pn−1 − ( Rn−1⁻¹ x(n) xH(n) Rn−1⁻¹ pn−1 ) / ( xH(n) Rn−1⁻¹ x(n) + λ ) + Rn⁻¹ x(n) d*(n)

   = wn−1 − Rn⁻¹ x(n) xH(n) wn−1 + Rn⁻¹ x(n) d*(n)

   = wn−1 + Rn⁻¹ x(n) ( d*(n) − xH(n) wn−1 )   (2.69)

Finally, we know that the a priori error signal is ε(n) = d(n) − wn−1H x(n). Substituting this term into Equation (2.69), we obtain the weight vector update equation of the RLS algorithm,

wn = wn−1 + Rn⁻¹ x(n) ε*(n)   (2.70)
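The recursions (2.62), (2.68) and (2.70) can be sketched as follows in Python/NumPy (the forgetting factor, the initialization of Rn⁻¹ and the test signals are assumptions, not values from the thesis):

```python
import numpy as np

rng = np.random.default_rng(2)
M = 8
h = rng.standard_normal(M) * 0.5       # "true" unknown system
x = rng.standard_normal(2000)
d = np.convolve(x, h)[:len(x)]

lam = 0.99                             # forgetting factor (0 < lambda < 1)
P = np.eye(M) / 1e-2                   # P = R^-1, initialized as (delta*I)^-1
w = np.zeros(M)
xin = np.zeros(M)

for n in range(len(x)):
    xin = np.concatenate(([x[n]], xin[:-1]))
    Px = P @ xin
    k = Px / (lam + xin @ Px)          # gain vector R_n^-1 x(n), Equation (2.68)
    eps = d[n] - w @ xin               # a priori error
    w = w + k * eps                    # weight update (2.70)
    P = (P - np.outer(k, Px)) / lam    # recursive update of R_n^-1, cf. (2.67)
```

The key design choice is that the matrix inverse is never computed explicitly: P = Rn⁻¹ is propagated directly, which is what gives RLS its O(M²) per-sample cost instead of O(M³).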

2.3. DOUBLE-TALK DETECTOR (DTD)


In Acoustic Echo Cancellation, the most difficult problem is handling the situation where double-talk is present. Double-talk occurs when the far end and near end talk at the same time, so that the far-end speech signal is corrupted by the near-end signal. To solve this problem, one introduces the Double-talk Detector (DTD). The task of the DTD is to freeze the adaptation of the filtering algorithm while near-end speech is present, in order to avoid divergence of the adaptive algorithm. Without a DTD, near-end talking would make the system estimation process fail and produce extremely erroneous results.

Consider Figure (II-6): when near-end speech is not present (v(n) = 0), the adaptive algorithm quickly converges to an estimate of the echo path. This is the best case for cancelling the echo. But when near-end speech is present (v(n) ≠ 0), this signal disturbs the adaptation of the filter and causes divergence; the adaptation becomes incorrect and the echo cannot be removed.

(In the figure, the far-end signal x(n) drives both the adaptive filter ĥ and the true echo path h; the microphone picks up d(n) = v(n) + y(n), and the residual is error(n) = v(n) + y(n) − ŷ(n). The Double-talk Detector controls the "Updating filter" block.)

Figure II-6: Double-talk detector with AEC

With the DTD and Updating filter blocks in the figure above, the DTD estimates a decision statistic that may depend on the far-end speech, the near-end speech and the error signal. This statistic is then compared to a threshold to make the DTD decision, which controls the Updating filter (freeze the adaptation or not). The "Updating filter" block thus acts as a switch (on or off) that permits the weight vector to be updated or not.

There are several DTD methods: a basic algorithm such as Geigel, methods based on cross-correlation calculations (the Benesty and Normalized Cross-Correlation algorithms), and the Variance Impulse Response method.

2.3.1. The Geigel algorithm
A simple algorithm was introduced by A. A. Geigel [9]. The approach is to first measure the power of the received (microphone) signal and then compare it to the power of the far-end signal. Because the room acoustics damp the signal, the power of a received signal containing only echo will be lower than that of a signal consisting of echo plus a near-end speaker. This is known as the Geigel double-talk detector. The decision variable for this algorithm is,

ξG(t) = max{ |x(t)|, …, |x(t − L + 1)| } / |d(t)|   (2.71)

Where,

• L is length of adaptive filter


This value is compared to the threshold TG. If ξG(t) is smaller than the preset threshold, double-talk is assumed to be present, and otherwise it is not. That is:

Decision = { doubletalk,      if ξG(t) < TG
             no doubletalk,   if ξG(t) > TG }

The threshold TG must be chosen carefully, because it strongly affects the performance of the detector. The Geigel detector has the benefit of being computationally simple and requiring very little memory. The detection is based on a waveform level comparison between the microphone signal d(n) and the far-end signal x(n), under the assumption that the near-end speech signal v(n) in the microphone signal is stronger than the echo y(n) = hT x(n). For AEC, it is difficult to set a threshold that works in every situation, because the loss through the acoustic echo path depends on many factors. In general, this detector has quite poor performance.
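A minimal sketch of the Geigel decision variable (2.71) might look as follows (Python/NumPy; the threshold value and the toy signals are assumptions, not values from the thesis):

```python
import numpy as np

def geigel_doubletalk(x, d, t, L, threshold=2.0):
    """Return True if double-talk is declared at time index t, per Equation (2.71)."""
    window = np.abs(x[max(0, t - L + 1): t + 1])       # last L far-end samples
    xi = window.max() / (abs(d[t]) + 1e-12)            # avoid division by zero
    return xi < threshold                              # small xi -> loud d -> double-talk

# Toy check: echo-only microphone signal vs. a strong near-end component.
x = np.ones(100)                      # constant-level far-end signal
d_echo_only = 0.3 * x                 # echo alone: |d| small -> xi large -> no double-talk
d_doubletalk = 0.3 * x + 5.0          # added near-end speech -> xi small -> double-talk
```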

2.3.2. The Cross-correlation (Benesty) algorithm


Ye and Wu [8] first introduced the idea of using the cross-correlation vector between the far-end signal x(n) and the error signal e(n) for double-talk detection, which is given as,

rex = E {e(n)x(n)T } (2.72)

Where,

• rex is the cross-correlation vector between the far-end and error signals.

Benesty [8] approached this differently, claiming that the above statistic does not work well for double-talk detection. He assumed that the near-end speech v(n) and the far-end speech signal x(n) are independent and that all signals are zero mean. According to him, the cross-correlation rxd between the far-end signal and the microphone signal should be used to calculate the decision statistic.

rxd = E{ x(n) d(n)T }
    = E{ x(n) ( y(n) + v(n) )T }
    = E{ x(n) ( hT x(n) )T }
    = Rx h   (2.73)

Where,

• Rx = E{ x(n) x(n)T } is the autocorrelation matrix of the far-end signal.

Benesty’s decision statistic for double-talk detection is,

ξCC = rxdT (σ d2 R x )−1 rxd (2.74)

In this equation, the variance σd² of the microphone signal is,

σd² = E{ d(n) d(n)T }
    = E{ ( y(n) + v(n) ) ( y(n) + v(n) )T }
    = E{ y(n) y(n)T } + E{ v(n) v(n)T }
    = E{ hT x(n) ( hT x(n) )T } + σv²
    = hT Rx h + σv²   (2.75)

Where,

• σv² is the variance of the near-end speech.

Finally, the decision statistic of Equation (2.74) becomes,

ξBenesty = ξCC² = rxdT (σd² Rx)⁻¹ rxd

         = hT Rx Rx⁻¹ Rx h / ( hT Rx h + σv² )

         = hT Rx h / ( hT Rx h + σv² )   (2.76)

From the above equation it is easy to see that,

• If near-end speech is not present ( v(n) = 0 ), then ξBenesty ≈ 1

• If near-end speech is present ( v(n) ≠ 0 ), then ξBenesty < 1

Thus, we finally get the double-talk decision as,

Decision = { doubletalk,      if ξBenesty(t) < T
             no doubletalk,   if ξBenesty(t) > T }

Where,

• T is a threshold with a chosen value close to 1.

2.3.3. Normalized cross-correlation (NCC) algorithm


Another method for double-talk detection is the Normalized Cross-Correlation (NCC) algorithm [8]. The NCC algorithm computes a decision statistic from the relation between the microphone signal and the error signal, using the variance of the microphone signal and the cross-correlation between the error signal and the microphone signal.
The cross-correlation red between the error signal e(n) and the microphone signal d(n) is given as,

red = E{ e(n) d(n) }
    = E{ ( y(n) + v(n) − ĥT x(n) ) ( y(n) + v(n) )T }
    = E{ ( hT x(n) − ĥT x(n) + v(n) ) ( hT x(n) + v(n) )T }
    = E{ ( hT x(n) − ĥT x(n) ) x(n)T h + v(n) v(n)T }
    = ( hT − ĥT ) Rx h + σv²   (2.77)

One then introduces the normalized decision statistic,

ξNCC = 1 − red / σd²   (2.78)

By substituting the values of red and σd² from the above relations into Equation (2.78), we have,

ξNCC = 1 − [ ( hT − ĥT ) Rx h + σv² ] / ( hT Rx h + σv² )

     = ĥT Rx h / ( hT Rx h + σv² )   (2.79)

Looking at Equation (2.79), when the adaptive filter has converged well, the estimated echo path ĥ is approximately equal to the true echo path h. Therefore, it is easy to reach the conclusions below,

• If near-end speech is not present ( v(n) = 0 ), then ξNCC ≈ 1

• If near-end speech is present ( v(n) ≠ 0 ), then ξNCC < 1

Thus, we finally get the double-talk decision as,

Decision = { doubletalk,      if ξNCC(t) < T
             no doubletalk,   if ξNCC(t) > T }

Where, T is a threshold with a chosen value close to 1.

The values of red and σd² are not available in practice, so we define the estimated decision statistic,

ξ̂NCC = 1 − r̂ed / σ̂d²   (2.80)

Where,

• r̂ed is the estimate of red

• σ̂d² is the estimate of σd²
These estimates can be found using the exponential recursive weighting algorithm.

r̂ed(n) = λ r̂ed(n − 1) + (1 − λ) e(n) d(n)   (2.81)

σ̂d²(n) = λ σ̂d²(n − 1) + (1 − λ) d(n) d(n)   (2.82)

Where,

• e(n) is the captured cancellation error sample at time n

• d(n) is the captured microphone signal sample at time n

• λ is the exponential weighting factor ( λ < 1 and λ ≈ 1 )
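The estimated statistic (2.80) with the recursions (2.81)-(2.82) can be sketched as follows (Python/NumPy; λ, the toy signal model and the 0.9/0.5 check levels are assumptions, not values from the thesis):

```python
import numpy as np

def ncc_statistic(e, d, lam=0.95):
    """Estimated xi_NCC for each sample of error signal e and microphone signal d."""
    r_ed, var_d = 0.0, 1e-8
    xi = np.empty(len(e))
    for n in range(len(e)):
        r_ed = lam * r_ed + (1 - lam) * e[n] * d[n]     # (2.81)
        var_d = lam * var_d + (1 - lam) * d[n] * d[n]   # (2.82)
        xi[n] = 1.0 - r_ed / var_d                      # (2.80)
    return xi

rng = np.random.default_rng(3)
echo = rng.standard_normal(2000)                 # echo component in the microphone
residual = 0.05 * rng.standard_normal(2000)      # small error of a converged canceller
v = 3.0 * rng.standard_normal(2000)              # near-end speech burst

xi_single = ncc_statistic(residual, echo)        # no near end: xi stays near 1
xi_double = ncc_statistic(residual + v, echo + v)  # near end present: xi drops
```

In the double-talk case the near-end component appears in both e(n) and d(n), so r̂ed grows with σv² and the statistic falls well below 1, as Equation (2.79) predicts.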

2.4. FREQUENCY-DOMAIN ACOUSTIC ECHO CANCELLATION


All of the algorithms described above are time-domain algorithms. They deal well with low-frequency signals, and good acoustic echo cancellation results can be obtained in that case; for high frequencies, however, better results are obtained by implementing a frequency-domain adaptive algorithm. The main advantages of frequency-domain adaptive algorithms are a fast convergence rate, especially for speech signals, and low computational complexity, thanks to the efficiency of block processing in connection with the discrete Fourier transform (DFT). There are two basic types of frequency-domain adaptive filter [10]:
1. The gradient-constrained frequency-domain adaptive filter.
2. The unconstrained frequency-domain adaptive filter.

2.4.1. The generic frequency-domain echo canceller

In this thesis we focus on the unconstrained frequency-domain adaptive filter [10]. This filter has low computational complexity, and it can converge to the Wiener solution when the length of the unknown system is less than half the DFT block size. The algorithm is based on overlap-save sectioning with the DFT, and its robustness rests on a nonlinear function that scales the reference and error signal levels. With this algorithm, good results are obtained without time-variant threshold estimators, which makes it useful for echo cancellation applications.
In the figure below, x(n) is the far-end speech signal with discrete time index n; after passing through the room echo path, this signal is picked up by the microphone. The room impulse response h is given by,

h = [ h1, h2, …, hL ]T   (2.83)

where L is the length of the adaptive filter. The microphone signal or the output signal
y (n) is given as,

y (n) = hT Rx(n) + v(n) + w(n) (2.84)

Where,

• v(n) is the near-end speech signal.

• w(n) is the ambient noise.

• x(n) = [ x(n − L + 1), …, x(n) ]T

• R is the matrix that reverses the order of the elements of x(n).

(The figure shows the far-end block x(kL) transformed by the DFT into Xk(l), multiplied by the coefficients Ĥk(l) to give Ŷk(l), and transformed back by the IDFT to form the echo replica ŷ(kL). The error block, with zeros prepended as the first block, is transformed into Ek(l), which together with X*k(l) drives the coefficient update.)

Figure II-7: Frequency domain echo canceller

The impulse response h is assumed to be fixed, or to vary slowly compared with the convergence rate of the adaptive filter. The transformed far-end signal in the l-th frequency bin at the k-th step is Xk(l), an element of the DFT of [ xT(kL − L), xT(kL) ]T. The index k is incremented once for every L time steps n, and l = 0, …, 2L − 1.

The coefficient of the filter for k and l is Hˆ k (l ) , so the output signal is given as

Yˆk (l ) = Hˆ k (l ) X k (l ) (2.85)

The echo replica ŷ(kL) corresponds to the last L elements of the inverse DFT (IDFT) of [ Ŷk(0), Ŷk(1), …, Ŷk(2L − 1) ]T. And the error signal becomes,

e(kL) = y(kL) − ŷ(kL)   (2.86)

Where,

• y(kL) = [ y(kL − L + 1), …, y(kL) ]T

In time-domain the error signal and the filter output are scalars whereas in frequency
domain these are vectors.
T
The transformed error signal Ek (l ) is an element of DFT of  zT , eT (kL)  , where z is an
L × 1 zero vector.

Here we focus on the unconstrained case, so the gradient constraint in the figure is neglected, and the update equation for Ĥk(l) is,

Ĥk+1(l) = Ĥk(l) + µ g( |Ek(l)|, |Xk(l)| ) e^( j(θEkl − θXkl) )   (2.87)

Where,

• θEkl and θXkl are the phases of Ek(l) and Xk(l).

• g( |Ek(l)|, |Xk(l)| ) is an arbitrary function of |Ek(l)| and |Xk(l)|.

• µ is a step-size parameter that depends on g( |Ek(l)|, |Xk(l)| ).
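A minimal sketch of this unconstrained overlap-save canceller is given below (Python/NumPy). The particular choice g(|E|,|X|) = |E|/|X|, which turns update (2.87) into a per-bin normalized update H ← H + µ E X*/|X|², as well as the block size, step size and random echo path, are all assumptions:

```python
import numpy as np

rng = np.random.default_rng(4)
L = 32                                    # block length; DFT size is 2L
h = rng.standard_normal(L) * 0.3          # "true" echo path (length <= L)
x = rng.standard_normal(256 * L)          # far-end signal
y = np.convolve(x, h)[:len(x)]            # microphone signal (echo only)

H = np.zeros(2 * L, dtype=complex)        # frequency-domain coefficients Hk(l)
mu, eps = 0.5, 1e-6
block_mse = []
for k in range(1, len(x) // L):
    X = np.fft.fft(x[(k - 1) * L:(k + 1) * L])         # DFT of two input blocks
    y_hat = np.fft.ifft(H * X).real[L:]                # echo replica: last L samples
    e = y[k * L:(k + 1) * L] - y_hat                   # error block e(kL)
    E = np.fft.fft(np.concatenate((np.zeros(L), e)))   # zeros as the first block
    H += mu * E * np.conj(X) / (np.abs(X) ** 2 + eps)  # unconstrained update (2.87)
    block_mse.append(np.mean(e ** 2))
```

Because the gradient constraint is omitted, the update costs only three FFTs per block, at the price of the convergence caveats discussed above.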

2.4.2. Sub-band adaptive filter
The basic idea of a sub-band decomposition approach [15] is its increase in convergence speed compared with a full-band solution, especially when extremely long FIR filters are being adapted. This is due to a reduced spectral magnitude range: sub-band filtering has a de-correlating effect, because coloured input signals are decomposed into sub-bands with "whiter" sub-spectra.
Figure (II-8) depicts sub-band adaptive filtering. Using analysis filter banks P(z), the far-end and microphone signals are decomposed by subdividing their spectra into smaller intervals (x0(n), x1(n), …). Adaptive filtering is then performed in these sub-bands by a set of independent filters (h0(n), h1(n), …). The outputs of these filters are subsequently combined using a synthesis filter bank Q(z) to reconstruct the full-band output.

Figure II-8: Sub-band adaptive filtering (SAF) for M sub-bands

Because the width of each sub-band is reduced, the sampling frequency for each sub-band filter can be lowered. Consequently, the sub-band adaptive filters need fewer taps than a full-band solution to cover the same time interval, and they are updated at a lower rate. This leads to a significant reduction in computational complexity. Because linear phase (constant group delay) is required for sub-band adaptive filtering, only non-recursive filters are allowed for the filter banks.

Filter Bank Structure:
Because the quality of the sub-band separation largely determines the achievable decimation rate and the convergence behaviour of the sub-band adaptive filters, the design of the analysis and synthesis filter banks [15] is the decisive factor for the quality and efficiency of the overall system.
The following figures depict the analysis filter bank and the synthesis filter bank. To ease the processing, down-sampling (↓L) and up-sampling (↑L) are inserted between the analysis and synthesis filter banks.

Figure II-9: Analysis filter bank

For sub-band separation and recombination we use DFT filter banks. To subdivide the sequence x(n), each part of the spectrum centred around a frequency ωm (for m = 0, 1, …, M − 1) is shifted into the base-band by multiplying x(n) with the complex sinusoid e^(−jωm n), where ωm = 2πm/M.
The ideal filters have unit magnitude and zero phase in the pass-band, and zero magnitude in the stop-band. The practical choice is FIR filters, which have linear phase but only approximate the ideal magnitude response.
The synthesis filter bank design likewise reduces to the design of a synthesis prototype filter Q(z). Since we always use FIR sub-band filters and sub-band models, residual errors are unavoidable. This implies that, in the design of a sub-band identification system, there is a trade-off between asymptotic residual error and computational cost.

Figure II-10: Synthesis filter bank

CHAPTER III : SIMULATION

The previous chapters provided the detailed theory of Acoustic Echo Cancellation, including adaptive filter algorithms, double-talk detection and other issues. This chapter puts those ideas to work by simulating the problems of the thesis in the MATLAB software environment.

3.1. GENERAL SETUP OF SIMULATION


3.1.1. Setup of the Simulation
1. MATLAB
MATLAB is a numerical computing environment that is especially effective for calculating and simulating technical problems. It is a powerful programming language that allows matrix manipulation, plotting of functions and data, implementation of algorithms, creation of user interfaces, and interfacing with other programming languages (C, C++, Fortran and Java).
One of its most beneficial features is graphical visualization, which helps us gain confidence in results by monitoring and analyzing the resulting plots.
In addition, MATLAB includes Simulink, a software package that models, simulates and analyzes dynamic systems. It enables us to pose a question about a system, model the system, and see what happens.
For our simulation purposes, MATLAB is a necessary and effective tool for obtaining convincing results, for the following reasons:

- It is easy to record audio signals of the far-end and near-end speech. These data are indispensable for the simulation.
- Matrix calculation is very important, since the data are processed in matrix format.
- It is easy to monitor the results by plotting the desired graphs and, especially, to listen to the resulting sounds.
- The structure of the commands is well suited to signal processing computations.

2. Requirements during the simulation

- This simulation performs the tasks of the acoustic echo canceller and double-talk detector at the near-end conference room. We assume that the far-end and near-end rooms have the same characteristics (size, acoustic features). Assuming the far-end echo canceller performs perfectly, we only carry out the task for the near-end room.
- The speech signals (far-end and near-end) were recorded with MATLAB at a sampling rate of 8 kHz. Speech is an audio signal containing frequencies between 300 Hz and 3400 Hz. By the sampling theorem (the Nyquist–Shannon sampling theorem), an analog signal can be reconstructed perfectly from its samples if the sampling rate exceeds 2B, where B is the highest frequency of the analog signal. A sampling rate fs of 8000 Hz therefore satisfies the sampling theorem (fs = 8000 Hz > 2B = 2 × 3400 Hz = 6800 Hz).
- For our simulation, the duration of the signals is 20 seconds (160,000 samples), expressing 4 cases of communication between the far end and the near end in the teleconference system (5 seconds, or 40,000 samples, for each case). These signals are plotted in the figures below. The 4 cases are:
1. Far-end talks only
2. Double-talk
3. Near-end talks only
4. Both sides are silent

Figure III-1: Far-end and near-end speeches (far-end talks only, double-talk, near-end talks only, both sides silent)

- The background and ambient noise were generated in MATLAB as zero-mean white noise, scaled relative to the echo so that the Signal to Noise Ratio (SNR) is approximately 45 dB.
- The simulation is performed in off-line mode, i.e. the acoustic echo canceller and double-talk detector run in MATLAB on the recorded speech and the measured room impulse response.

3.1.2. Flowchart of the AEC algorithm

The flowchart of the AEC algorithm organizes all the steps performed in the simulation. It is shown in the figure below.

Start
→ Read the far-end signal x(n) and the room impulse response h(n)
→ Create the echo signal
→ Read the near-end signal v(n)
→ Create the desired signal d(n)
→ Run the NLMS algorithm to calculate the error signal
→ Double-talk detection (Normalized Cross-Correlation method):
   if double-talk is detected, freeze the adaptive filter;
   otherwise, update the adaptive filter coefficients
→ Get the residual echo e(n) by subtracting the estimated echo from the desired signal
→ NLP
→ Stop

Figure III-2: Flowchart of acoustic echo cancellation algorithm

3.2. MEASURE ROOM IMPULSE RESPONSE (RIR)
3.2.1. Why we must measure the RIR
In real-time operation, the adaptive filter works to estimate the true impulse response h(n) of the specific room. If we have the exact room impulse response, we can therefore compare our result ĥ(n) against it to verify whether it is correct. In addition, by comparing with the true impulse response, we can adjust the parameter values to obtain correct convergence of the adaptive filter.
The acoustic characteristics of different rooms differ: frequency response, cumulative spectral decay, energy decay and reverberation characteristics. They depend on three main factors:
1. The size of the room.
2. The construction materials of the room (hard wood, concrete, ceramics, …).
3. The objects inside the room (tables, chairs, people, …).

3.2.2. Method of measuring the RIR

There are several methods of measuring the RIR [13], using various excitation signals such as white noise, pink noise, a Dirac pulse, a swept sine, and so on.
To obtain the room impulse response, we record the response (microphone signal) to the excitation signal (loudspeaker signal). The impulse response may then be obtained by direct de-convolution, or by spectral division of the spectrum of the response by the spectrum of the excitation.
The method chosen in this simulation measures the room impulse response using a white noise excitation. The reason is that this signal contains equal amounts of energy at all frequencies, so it is well suited to exciting the speech frequencies used during the simulation. Moreover, this method only requires simple equipment: one computer connected to a microphone and a loudspeaker. All calculations of the impulse response are then performed easily in MATLAB.

3.2.3. Result
Our experiment was done with the following setup:
- The study room at the Växjö University library is acoustically isolated, with dimensions of 4 m x 5 m x 3 m.
- The microphone was located on a 1 m high table, at a distance of 1 m from the loudspeaker.
- The white noise was generated at a sampling frequency of 8000 Hz with 16-bit resolution.
- The recorded sounds were processed off-line using MATLAB.
Because the results differ from one measurement to the next, we took several measurements and averaged them to obtain a good approximation of the room impulse response. In our experiment, we took 15 measurements and calculated their mean using MATLAB.
To capture the full room impulse response, approximately 1 second (8000 taps) must be taken, since the required length depends on the reverberation. The graph below illustrates this.

Figure III-3: Room impulse response measured with 8000 taps length.

To make the simulation in MATLAB simpler, we measured the room impulse response with a length of 128 taps (16 ms).

Figure III-4: The room impulse response (128 taps length) used in this simulation.

As the figure illustrates, the length of the filter is approximately 16 ms (corresponding to 128 taps) and the delay is 3 ms (24 taps from the beginning to the first maximum magnitude). This delay of about 3 ms agrees with the approximate value computed from the distance (d = 1 m) between the loudspeaker and the microphone:

delay = distance / speed of sound = 1/343 ≈ 3 [ms]
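The arithmetic can be double-checked in a few lines (Python; the constants are those quoted above):

```python
# Cross-check of the loudspeaker-to-microphone delay, in milliseconds and in taps.
fs = 8000            # sampling rate [Hz]
d = 1.0              # loudspeaker-to-microphone distance [m]
c = 343.0            # speed of sound [m/s]
delay_ms = 1000.0 * d / c          # about 2.9 ms, rounded to 3 ms in the text
delay_taps = round(fs * d / c)     # about 23 taps, close to the 24 taps observed
```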

3.3. EXPLANATION OF MATLAB CODE

3.3.1. Adaptive filter algorithms
These code segments process and perform the LMS and NLMS algorithms. The comparison below between the theory and the simulation (MATLAB code) should give a better understanding of both the algorithms and the programming work. The summaries of the LMS and NLMS algorithms and the corresponding code segments are presented in the tables below.

1. Least Mean Square (LMS) algorithm

LMS algorithm

Initial Conditions:
  Step size: 0 < µ < 1
  Length of adaptive filter: L
  Input vector: x(L×1) = [0, 0, …, 0]T
  Weight vector: w(L×1) = [0, 0, …, 0]T

For each instant of time, n = 1, 2, …, compute:
  Output signal: y(n) = wT(n) x(n)
  Estimation error: e(n) = d(n) − y(n)
  Tap-weight adaptation: w(n + 1) = w(n) + 2µ x(n) e(n)

Table III-1: Summary of LMS algorithm

MATLAB code

mu=0.014;
Initial Conditions: L=length(h); %the same length of RIR
w=zeros(L,1); %Initial weight vector
xin=zeros(L,1); %Initial input signal

For each instant of time, k = 1, 2,…, compute:

Output signal: y(i)=w'*xin;

Estimation Error: error= d(i)-y(i);

Tap-Weight Adaptation: wtemp = w + 2*mu*error*xin;

Table III-2: MATLAB code of LMS algorithm
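For readers more familiar with Python, the same recursion can be sketched with NumPy (an illustrative re-implementation, not the thesis code; the short echo path h below is a hypothetical stand-in for the measured room impulse response):

```python
import numpy as np

def lms(x, d, L, mu):
    """LMS adaptive filter; returns final weights and the error signal."""
    w = np.zeros(L)                  # initial weight vector
    xin = np.zeros(L)                # initial input (regressor) vector
    e = np.zeros(len(x))
    for n in range(len(x)):
        xin = np.concatenate(([x[n]], xin[:-1]))  # shift in the new sample
        y = w @ xin                  # output signal y(n) = w^T(n) x(n)
        e[n] = d[n] - y              # estimation error e(n) = d(n) - y(n)
        w = w + 2 * mu * e[n] * xin  # tap-weight adaptation
    return w, e

# Identify a short hypothetical echo path from white noise
rng = np.random.default_rng(1)
h = np.array([0.5, -0.3, 0.1])
x = rng.standard_normal(4000)
d = np.convolve(x, h)[:len(x)]       # noiseless echo (desired) signal
w, e = lms(x, d, L=3, mu=0.01)
print(np.round(w, 2))                # the weights approach h
```

With noiseless data and a small step size the weights converge to the true path and the error decays towards zero, exactly as the MSE discussion later in this chapter describes.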

2. Normalized Least Mean Square (NLMS) algorithm

NLMS algorithm

Initial Conditions: 0 < α < 1 and c: a small constant

Length of adaptive filter: L

Input vector: x(0) = [0, 0, ..., 0]^T (L x 1)

Weight vector: w(0) = [0, 0, ..., 0]^T (L x 1)

For each instant of time, n = 1, 2, ..., compute:

Output signal: y(n) = w^T(n) x(n)

Estimation Error: e(n) = d(n) - y(n)

Tap-Weight Adaptation: w(n+1) = w(n) + [2α / (c + x^T(n) x(n))] x(n) e(n)

Table III-3: Summary of the NLMS algorithm

MATLAB code

alfa=0.42; %Alfa
c=0.01; %A small constant
Initial Conditions: L=length(h); %the same length of RIR
w=zeros(L,1); %Initial weight vector
xin=zeros(L,1); %Initial input signal

For each instant of time, k = 1, 2,…, compute:

Output signal: y(i)=w'*xin;

Estimation Error: error= d(i)-y(i);

mu=alfa/(c+xin'*xin);
Tap-Weight Adaptation:
wtemp = w + 2*mu*error*xin;

Table III-4: MATLAB code of NLMS algorithm
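The only change from the LMS sketch is the per-sample normalized step size; in Python (same caveats: an illustrative sketch with a hypothetical echo path, not the thesis code):

```python
import numpy as np

def nlms(x, d, L, alpha=0.42, c=0.01):
    """NLMS adaptive filter; the step size is normalized by the input power."""
    w = np.zeros(L)
    xin = np.zeros(L)
    e = np.zeros(len(x))
    for n in range(len(x)):
        xin = np.concatenate(([x[n]], xin[:-1]))
        y = w @ xin                    # output signal
        e[n] = d[n] - y                # estimation error
        mu = alpha / (c + xin @ xin)   # normalized step size mu(n)
        w = w + 2 * mu * e[n] * xin    # tap-weight adaptation
    return w, e

rng = np.random.default_rng(2)
h = np.array([0.5, -0.3, 0.1])
x = rng.standard_normal(2000)
d = np.convolve(x, h)[:len(x)]
w, e = nlms(x, d, L=3)
print(np.round(w, 2))                  # converges to h faster than plain LMS
```

Because mu(n) shrinks when the input is loud and grows when it is quiet, the effective step stays well scaled regardless of the far-end signal level, which is why NLMS converges more reliably on speech.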

3.3.2. Double-talk Detector algorithm
In this simulation we used the Normalized Cross-Correlation (NCC) method to detect double-talk. Using an exponential recursive weighting, we estimate the cross-correlation ( r_em ) between the error signal and the microphone signal and the variance ( σ_m² ) of the microphone signal, and from these we form the decision statistic ( ξ_NCC ). Finally, this value is compared to a threshold (T) to make the double-talk decision.
Because the adaptive filter needs time to converge, we must also set the time (DTDbegin) at which the double-talk detector starts working.
The summary and MATLAB code below illustrate the theory and the simulation of the double-talk detection problem:

NCC algorithm summary

Initial Conditions: forgetting factor 0 < λ < 1, λ ≈ 1

Threshold T ≈ 1

For each instant of time, n = 1, 2, ..., compute:

Cross-correlation: r_em(n) = λ r_em(n-1) + (1-λ) e(n) d(n)

Variance of microphone signal: σ̂_d²(n) = λ σ̂_d²(n-1) + (1-λ) d²(n)

Decision statistic: ξ_NCC(n) = 1 - r̂_em(n) / σ̂_d²(n)

Making the DTD decision: If ξ_NCC(n) < T, freeze the adaptive filter;
if ξ_NCC(n) > T, update the adaptive filter coefficients

Table III-5: Summary of NCC double-talk detection algorithm

In the MATLAB code, we introduce a temporary variable wtemp for updating the adaptive filter coefficients.

1. If ξ_NCC < T, freeze the adaptive filter, i.e., stop updating the adaptive filter coefficients. This is double-talk mode, or both the near-end and far-end are silent.
wtemp = w; %Freeze the adaptive filter coefficients
2. If ξ_NCC > T, continue the loop of the adaptive filter algorithm, i.e., keep updating the adaptive filter coefficients. This is the case where no near-end speech is detected: the far-end talks only.
w=wtemp; %Update filter coefficients

MATLAB code of DTD using NCC algorithm

T=0.92; %Threshold

Initial Conditions: Lambda_DTD=0.95; %Constant

DTDbegin=50*L; %The time to activate DTD

For each instant of time, n = 1, 2, ..., compute:

Cross-correlation: r_em(i)=lambda_DTD*(r_em(i-1))+(1-lambda_DTD)*e(i)*d(i)';

Variance of microphone signal: varMIC(i)=sqrt(lambda_DTD*(varMIC(i-1)^2)+(1-lambda_DTD)*d(i)*d(i)');

Decision statistic: decision_statistic(i)=1-(r_em(i)/varMIC(i)^2);

if (decision_statistic(i)>threshold(i))

Making the DTD decision: w=wtemp; %Update filter coefficient

end

Table III-6: MATLAB code of NCC double-talk detection algorithm
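The same detector logic can be sketched as a stand-alone Python function (illustrative only; the error and microphone signals below are synthetic stand-ins for the far-end-only case):

```python
import numpy as np

def ncc_dtd(e, d, lam=0.95, T=0.92):
    """Per-sample NCC decisions: True = update the filter, False = freeze."""
    r_em = 0.0                   # recursive cross-correlation estimate
    var_d = 0.0                  # recursive variance estimate of d
    update = np.zeros(len(e), dtype=bool)
    for n in range(len(e)):
        r_em = lam * r_em + (1 - lam) * e[n] * d[n]
        var_d = lam * var_d + (1 - lam) * d[n] * d[n]
        xi = 1.0 - r_em / var_d if var_d > 0 else 1.0   # decision statistic
        update[n] = xi > T
    return update

# Far-end talk only: the residual error is small and nearly uncorrelated
# with the microphone signal, so xi stays near 1 and updating continues.
rng = np.random.default_rng(3)
d = rng.standard_normal(1000)
e = 0.01 * rng.standard_normal(1000)
upd = ncc_dtd(e, d)
print(upd[:5], float(upd.mean()))
```

During double-talk the near-end speech dominates d while remaining in e, the ratio r_em/σ̂_d² grows, ξ_NCC drops below T, and the filter is frozen.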

3.3.3. Calculate other issues


1. Mean Square Error (MSE)
The purpose of the adaptive filter is to minimize the Mean Square Error (MSE):

ξ = E{e²(n)} = E{(d(n) - d̂(n))²}

Therefore, the values and graph of this quantity are essential for evaluating the performance of the adaptive filter. If the adaptive algorithm works well, then after the convergence time the MSE should decrease gradually towards zero (in the case of no near-end signal). The segment of MATLAB code below calculates this parameter.

mse_iteration(i)=error^2; %Square Error


for i=1:N-L

mse(i)=mean(mse_iteration(i:i+L)); %MSE - Mean Square Error

end

2. Echo Return Loss Enhancement (ERLE)


Echo Return Loss Enhancement (ERLE) is one of the most important parameters commonly used to evaluate the performance of an echo cancellation algorithm. It measures how much echo attenuation the echo canceller achieves on the microphone signal.
ERLE, measured in dB, is defined as the ratio of the microphone signal's power (d[n]) to the residual error signal's power (e[n]):

ERLE = 10 log10( P_d(n) / P_e(n) ) = 10 log10( E[d²(n)] / E[e²(n)] )

ERLE depends on the algorithm used for the adaptive filter; two quantities considered together with ERLE, the convergence time and the near-end attenuation, differ between algorithms. In our simulation, the following MATLAB code computes ERLE:

powerD(i) = abs(d(i))^2; %Power of Microphone signal


powerE(i)=abs(e(i))^2; %power of Error signal

for i=1:N-L

ERLE(i)=10*log10(mean(powerD(i:i+L))/mean(powerE(i:i+L)));

end
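The same computation can be sketched in Python/NumPy (illustrative; the toy signals below model a canceller that removes 99% of the echo amplitude, which corresponds to 40 dB of ERLE):

```python
import numpy as np

def erle_db(d, e, L):
    """ERLE in dB, with powers averaged over sliding L-sample windows."""
    pd = np.abs(d) ** 2                    # power of the microphone signal
    pe = np.abs(e) ** 2                    # power of the residual error signal
    return np.array([10 * np.log10(pd[i:i + L].mean() / pe[i:i + L].mean())
                     for i in range(len(d) - L)])

rng = np.random.default_rng(4)
d = rng.standard_normal(2000)              # microphone (echo) signal
e = 0.01 * d                               # residual after 99% amplitude reduction
print(round(float(erle_db(d, e, 128).mean()), 1))   # → 40.0 dB
```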

3.4. RESULTS
The plots below demonstrate the performance of the acoustic echo canceller with the different algorithms. First, we show the graph of the near-end speech, which should be compared with the error signal (they should be approximately equal). Second, the plots are, in turn: the microphone signal d(n), the output signal y(n) of the adaptive filter and the error signal e(n). In a real teleconference system the error signal is transmitted from the near-end user to the far-end user. If there is no near-end speech it should be a silent signal, and if the near-end talks, the error signal should contain only the near-end speech.
The remaining graphs show the estimated impulse response, the double-talk detector and the performance of the adaptive filter (MSE, ERLE).

Figure III-5: Plot of the near-end speech, to be compared with the error signal produced by the echo canceller.

3.4.1. LMS algorithm

Figure III-6: Plots of the relevant signals (LMS algorithm), in turn: the desired signal, the output signal and the error signal

Figure III-7: Double-talk detection of the LMS algorithm, where the decision statistic ξ_NCC(n) (green line) is compared to the threshold T (red line). If ξ_NCC(n) > T, the far-end talks only; if ξ_NCC(n) < T, there is double-talk, only the near-end talks, or both sides are silent

Figure III-8: Evaluation of the LMS algorithm: Mean Square Error (measures how well the algorithm minimizes the echo) and Echo Return Loss Enhancement (measures the echo attenuation achieved by the echo canceller)

3.4.2. NLMS algorithm

Figure III-9: Plots of the relevant signals (NLMS algorithm), in turn: the desired signal, the output signal and the error signal

Figure III-10: Double-talk detection of the NLMS algorithm, where the decision statistic ξ_NCC(n) (green line) is compared to the threshold T (red line). If ξ_NCC(n) > T, the far-end talks only; if ξ_NCC(n) < T, there is double-talk, only the near-end talks, or both sides are silent

Figure III-11: Evaluation of the NLMS algorithm: Mean Square Error (measures how well the algorithm minimizes the echo) and Echo Return Loss Enhancement (measures the echo attenuation achieved by the echo canceller)

3.5. EVALUATION
From the experimental work and the resulting graphs we can evaluate the echo cancellation algorithms, in order to gain a deeper understanding and to draw the conclusions of the thesis. From the results above for the two echo cancellation algorithms, LMS and NLMS, we make the following evaluations.

1. Comparison between LMS and NLMS

Both algorithms converge to an estimated impulse response ĥ that approximates the true room impulse response h, so the estimated echo signal ŷ(n) closely resembles the true echo signal y(n). As a result, the error signals are close to the desired results.
Comparing Figures III-6 and III-9, it is easy to see that the error signal of NLMS is more convincing than that of LMS. From Figures III-7 and III-10, the double-talk decisions of NLMS are also better (more accurate) than those of LMS.
From these observations we conclude that the NLMS algorithm performs better and more accurately than the LMS algorithm.

2. Convergence test
Convergence is the most important factor to observe when running an echo cancellation algorithm. If the filter coefficients of the adaptive algorithm do not converge, the code will not work properly. In this simulation we used standard signals, white noise as the input signal and a low-pass filter as a model of the impulse response, to check the operation of the algorithm. If a problem remains, we then examine the convergence (step-size) factor µ; by varying this factor we can control and adjust the convergence of the adaptive filter algorithm.

3. Echo Return Loss Enhancement (ERLE)


This parameter is used to evaluate the quality of the echo cancellation algorithm. If the echo cancellation algorithm performs well, the values of ERLE should lie in the range of about 40-45 dB (for a signal-to-noise ratio, SNR, of 45 dB).

Both LMS and NLMS produced ERLE plots within this range; the ERLE for these algorithms thus reached the required value, in other words, the algorithms work well.

4. Check the resultant audio signals


With MATLAB we can check the resulting signals by playing them. Playing the microphone signal and the error signal after the echo canceller through a loudspeaker gave the good result we wanted. We noticed that a little echo remained at the beginning of the error signal; this is because of the convergence time of the adaptive filtering algorithm. In general, the sounds we heard confirmed the convincing results.

CHAPTER IV : CONCLUSION AND

FURTHER WORK

4.1. CONCLUSION
In this thesis we studied how to cancel acoustic echo with an AEC. One of the major problems in telecommunication applications over a telephone system is echo. The echo cancellation algorithm presented in this thesis successfully provides a software solution to the problem of echoes in the telecommunications environment.
AEC is the conventional method for solving the acoustic echo problem. Under ideal
conditions AEC can achieve perfect echo cancellation, because it estimates both the
phase and amplitude of the echo signal. The proposed algorithm was completely a
software approach without utilizing any Digital Signal Processing (DSP) hardware
components.
Speech has most of its energy concentrated to lower frequencies. Therefore it is most
important to achieve an optimal echo cancellation at these frequencies. At higher
frequencies the ear is not sensitive to phase information.
The algorithm was capable of running in any PC with MATLAB software installed. In
addition, the results obtained were convincing. The audio of the output speech signals
were highly satisfactory and validated the goals of this research.

4.2. FURTHER WORKS
The algorithm proposed in this thesis presents a solution for single channel acoustic
echoes. However, in real-life situations multi-channel sound is most often the norm in telecommunication, for example when a group of people in a teleconference environment are all talking, laughing or otherwise communicating at once. Since there is just a single microphone, the other end hears only a highly incoherent monophonic sound. To handle such situations better, the echo cancellation algorithm developed during this research should be extended to the multi-channel case.
Another needed improvement to the acoustic echo canceller is to make it work in real time in a practical teleconference system. Our work only simulates the thesis topic in offline mode; it therefore remains to implement it in real communication between the far-end and near-end rooms of a teleconference network.

Appendix A: MATLAB code of LMS algorithm
clear all
%---------------------------------------------------------------------
%Load Data
[x, Fs, nbits] = wavread('c:/audiofiles/fe1'); %Far-end signal
[v, Fs, nbits] = wavread('c:/audiofiles/ne1'); %Near-end signal
[h, Fs, nbits] = wavread('c:/audiofiles/room_impulse_response_128taps');
%Room impulse response

%Declare the needed variables


L=length(h); %Length of adaptive filter (same length of RIR)
N=length(x); %Number of iterations
T=0.92; %Threshold for Double talk detection
lambda_DTD=0.95; %Constant for calculating decision statistic of DTD
DTDbegin=21000; %The time to activate DTD

%Initial values (zeros)
w=zeros(L,1); %Initial weight vector of AF Lx1
xin=zeros(L,1); %Initial input signal of AF Lx1
varMIC=zeros(N,1); %Initial variance of microphone signal of AF Nx1
r_em=zeros(N,1); %Initial cross-correlation between error and microphone signals

%Ambient noise
WhiteNoise = wgn(N,1,-65); %White Gaussian noise, giving an SNR of about 45dB
%Microphone signal
EchoSignal=filter(h,1,x); %Echo signal after filter H
d=EchoSignal+WhiteNoise+v; %Desired signal (Microphone Signal)

%Make column vectors


x=x(:); %Far end signal Nx1
d=d(:); %Desired signal Nx1

%The values for calculate Step-Size of Adaptive Filter


mu=0.014;

%Calculate the average SNR (desired signal/noise)


powerMic = sum(abs(d).^2)/N; %Power of Microphone signal
powerN = sum(abs(WhiteNoise).^2)/N; %Power of White Noise
SNR=10*log10(powerMic/powerN); %Calculate the SNR

%----------------------------------------------------------------------
%-------------LMS algorithm for Adaptive Filter-----------------------
for i=1:N
for j=L:-1:2
xin(j)=xin(j-1);
end
xin(1)=x(i); %Insert new sample at beginning of input

y(i)=w'*xin; %Output signal after adaptive filter


error=d(i)-y(i); %ERROR
e(i)=error; %Store estimation error
wtemp = w + 2*mu*error*xin;%Update filter

% -----------NORMALIZED CROSS-CORELATION ALGORITHM DTD--------------


threshold(i)=T; %Threshold for plotting DTD
if (i<=DTDbegin) %The beginning time of the DTD
w=wtemp; %Update filter coefficients

end
if (i>DTDbegin)
%Cross correlation between error and microphone signal
r_em(i)=lambda_DTD*(r_em(i-1))+(1-lambda_DTD)*e(i)*d(i)';
%Variance of microphone signal
varMIC(i)=sqrt(lambda_DTD*(varMIC(i-1)^2)+(1-lambda_DTD)*d(i)*d(i)');
decision_statistic(i)=1-(r_em(i)/varMIC(i)^2); %Decision statistic
%Making the double-talk decision
if (decision_statistic(i)>threshold(i))
w=wtemp; %Update filter coefficients
end

end
%-------------ERLE-------------------------------------
powerD(i) = abs(d(i))^2; %Power of Microphone signal
powerE(i)=abs(e(i))^2; %power of Error signal
%--------------MSE-------------------------------------
mse_iteration(i)=error^2; %Square Error
end

for i=1:N-L
%MSE - Mean Square Error
mse(i)=mean(mse_iteration(i:i+L));
%Echo Return Loss Enhancement
ERLE(i)=10*log10(mean(powerD(i:i+L))/mean(powerE(i:i+L)));

%Plotting Double-talk Detection


if (i>DTDbegin)
ds(i)=mean(decision_statistic(i:i+L));
else
ds(i)=1;
end
end

%----------------------------------------------------------------------
%PLOTTING THE NECESSARY SIGNALS
%----------------------------------------------------------------------
figure(1)
%-------echo signal-------------------------
subplot(4,1,1)
plot(EchoSignal)
xlabel('time (samples)');
ylabel('echo(n)');
title('ECHO SIGNAL: echo(n)')
grid on
axis([0 N -1 1]);

%-------Desired signal-----------------------
subplot(4,1,2)
plot(d)
xlabel('time (samples)');
ylabel('d(n)');
title('DESIRED SIGNAL: d(n)')
grid on
axis([0 N -1 1]);

%-------Output signal y(n)-------------------


subplot(4,1,3)
plot(y)
xlabel('time (samples)');

ylabel('y(n)');
title('OUTPUT SIGNAL (AFTER W): y(n)')
grid on
axis([0 N -1 1]);

%-------Error signal e(n)--------------------


subplot(4,1,4)
plot(e,'red')
xlabel('time (samples)');
ylabel('E(n)');
title('ERROR SIGNAL: e(n)')
axis([0 N -1 1]);
grid on

%-------Estimation system w-----------------


figure(2)
subplot(2,1,1)
plot(w,'red')
xlabel('Tap');
ylabel('Magnitude (W)');
title('ESTIMATE SYSTEM: W(N)')
grid on

%-------True system h----------------------


subplot(2,1,2)
plot(h)
xlabel('Sample number (n)');
ylabel('Magnitude (H)');
title('TRUE IMPULSE RESPONSE: h(n)')
grid on

%-------Estimator for DTD------------------


figure(3)

%-------Decision Statistic-----------------
subplot(311)
plot(ds,'green')
hold all
plot(threshold,'red')
hold off
xlabel('Sample number (n)');
ylabel('Decision Statistic');
title('DOUBLE TALK DETECTION')
grid on

%-------Mean square error-------------------


subplot(312)
plot(mse)
xlabel('Sample number (n)');
ylabel('Mean(Error^2)');
title('MEAN SQUARE ERROR')
grid on

%-------Echo return loss enhancement---------


subplot(313)
plot(ERLE)
xlabel('Sample number (n)');
ylabel('Desired signal/Error signal (dB)');
title('ECHO RETURN LOSS ENHANCEMENT')
grid on

Appendix B: MATLAB code of NLMS algorithm
clear all
%---------------------------------------------------------------------
%Load Data
[x, Fs, nbits] = wavread('c:/audiofiles/fe1'); %Far-end signal
[v, Fs, nbits] = wavread('c:/audiofiles/ne1'); %Near-end signal
[h, Fs, nbits] = wavread('c:/audiofiles/room_impulse_response_128taps');
%Room impulse response

%Declare the needed variables


L=length(h); %Length of adaptive filter (same length of RIR)
N=length(x); %Number of iterations
T=0.92; %Threshold for Double talk detection
lambda_DTD=0.95; %Constant for calculating decision statistic of DTD
DTDbegin=21000; %The time to activate DTD

%Initial values (zeros)
w=zeros(L,1); %Initial weight vector of AF Lx1
xin=zeros(L,1); %Initial input signal of AF Lx1
varMIC=zeros(N,1); %Initial variance of microphone signal of AF Nx1
r_em=zeros(N,1); %Initial cross-correlation between error and microphone signals

%Ambient noise
WhiteNoise = wgn(N,1,-65); %White Gaussian noise, giving an SNR of about 45dB
%Microphone signal
EchoSignal=filter(h,1,x); %Echo signal after filter H
d=EchoSignal+WhiteNoise+v; %Desired signal (Microphone Signal)

%Make column vectors


x=x(:); %Far end signal Nx1
d=d(:); %Desired signal Nx1

%The values for calculate Step-Size of Adaptive Filter


alfa=0.42; %Alfa
c=0.01; %A small constant

%Calculate the average SNR (desired signal/noise)


powerMic = sum(abs(d).^2)/N; %Power of Microphone signal
powerN = sum(abs(WhiteNoise).^2)/N; %Power of White Noise
SNR=10*log10(powerMic/powerN); %Calculate the SNR

%----------------------------------------------------------------------
%-------------NLMS algorithm for Adaptive Filter-----------------------
for i=1:N
for j=L:-1:2
xin(j)=xin(j-1);
end
xin(1)=x(i); %Insert new sample at beginning of input

y(i)=w'*xin; %Output signal after adaptive filter


error=d(i)-y(i); %ERROR
e(i)=error; %Store estimation error
mu=alfa/(c+xin'*xin); %Calculate Step-size
wtemp = w + 2*mu*error*xin;%Update filter

% -----------NORMALIZED CROSS-CORELATION ALGORITHM DTD--------------


threshold(i)=T; %Threshold for plotting DTD

if (i<=DTDbegin) %The beginning time of the DTD
w=wtemp; %Update filter coefficients
end
if (i>DTDbegin)
%Cross correlation between error and microphone signal
r_em(i)=lambda_DTD*(r_em(i-1))+(1-lambda_DTD)*e(i)*d(i)';
%Variance of microphone signal
varMIC(i)=sqrt(lambda_DTD*(varMIC(i-1)^2)+(1-lambda_DTD)*d(i)*d(i)');
decision_statistic(i)=1-(r_em(i)/varMIC(i)^2); %Decision statistic
%Making the double-talk decision
if (decision_statistic(i)>threshold(i))
w=wtemp; %Update filter coefficients
end

end
%-------------ERLE--------------------
powerD(i) = abs(d(i))^2; %Power of Microphone signal
powerE(i)=abs(e(i))^2; %power of Error signal
%--------------MSE--------------------
mse_iteration(i)=error^2; %Square Error
end
for i=1:N-L
%MSE - Mean Square Error
mse(i)=mean(mse_iteration(i:i+L));
%Echo Return Loss Enhancement
ERLE(i)=10*log10(mean(powerD(i:i+L))/mean(powerE(i:i+L)));
%Plotting Double-talk Detection
if (i>DTDbegin)
ds(i)=mean(decision_statistic(i:i+L));
else
ds(i)=1;
end
end

%----------------------------------------------------------------------
%PLOTTING THE NECESSARY SIGNALS
%----------------------------------------------------------------------
figure(1)
%-------echo signal------------------------
subplot(4,1,1)
plot(EchoSignal)
xlabel('time (samples)');
ylabel('echo(n)');
title('ECHO SIGNAL: echo(n)')
grid on
axis([0 N -1 1]);

%-------Desired signal--------------------
subplot(4,1,2)
plot(d)
xlabel('time (samples)');
ylabel('d(n)');
title('DESIRED SIGNAL: d(n)')
grid on
axis([0 N -1 1]);

%-------Output signal y(n)----------------


subplot(4,1,3)
plot(y)
xlabel('time (samples)');
ylabel('y(n)');

title('OUTPUT SIGNAL (AFTER W): y(n)')
grid on
axis([0 N -1 1]);

%-------Error signal e(n)-----------------


subplot(4,1,4)
plot(e,'red')
xlabel('time (samples)');
ylabel('E(n)');
title('ERROR SIGNAL: e(n)')
axis([0 N -1 1]);
grid on

%-------Estimation system w---------------


figure(2)
subplot(2,1,1)
plot(w,'red')
xlabel('Tap');
ylabel('Magnitude (W)');
title('ESTIMATE SYSTEM: W(N)')
grid on

%-------True system h---------------------


subplot(2,1,2)
plot(h)
xlabel('Sample number (n)');
ylabel('Magnitude (H)');
title('TRUE IMPULSE RESPONSE: h(n)')
grid on

%-------Estimator for DTD-----------------


figure(3)

%-------Decision Statistic----------------
subplot(311)
plot(ds,'green')
hold all
plot(threshold,'red')
hold off
xlabel('Sample number (n)');
ylabel('Decision Statistic');
title('DOUBLE TALK DETECTION')
grid on

%-------Mean square error-----------------


subplot(312)
plot(mse)
xlabel('Sample number (n)');
ylabel('Mean(Error^2)');
title('MEAN SQUARE ERROR')
grid on

%-------Echo return loss enhancement------


subplot(313)
plot(ERLE)
xlabel('Sample number (n)');
ylabel('Desired signal/Error signal (dB)');
title('ECHO RETURN LOSS ENHANCEMENT')
grid on

Matematiska och systemtekniska institutionen
SE-351 95 Växjö

Tel. +46 (0)470 70 80 00, fax +46 (0)470 840 04


http://www.vxu.se/msi/
