

PROCEEDINGS OF THE IEEE, VOL. 60, NO. 8, AUGUST 1972

An Algorithm for Linearly Constrained Adaptive Array Processing


OTIS LAMONT FROST, III, MEMBER, IEEE

Abstract—A constrained least mean-squares algorithm has been derived which is capable of adjusting an array of sensors in real time to respond to a signal coming from a desired direction while discriminating against noises coming from other directions. Analysis and computer simulations confirm that the algorithm is able to iteratively adapt variable weights on the taps of the sensor array to minimize noise power in the array output. A set of linear equality constraints on the weights maintains a chosen frequency characteristic for the array in the direction of interest. The array problem would be a classical constrained least-mean-squares problem except that the signal and noise statistics are assumed unknown a priori. A geometrical presentation shows that the algorithm is able to maintain the constraints and prevent the accumulation of quantization errors in a digital implementation.

I. INTRODUCTION

THIS PAPER describes a simple algorithm for adjusting an array of sensors in real time to respond to a desired signal while discriminating against noises. A signal is here defined as a waveform of interest which arrives in plane waves from a chosen direction (called the "look direction"). The algorithm iteratively adapts the weights of a broad-band sensor array (Fig. 1) to minimize noise power at the array output while maintaining a chosen frequency response in the look direction.

The algorithm, called the Constrained Least Mean-Squares or Constrained LMS algorithm, is a simple stochastic gradient-descent algorithm which requires only that the direction of arrival and a frequency band of interest be specified a priori. In the adaptive process, the algorithm progressively learns the statistics of noise arriving from directions other than the look direction. Noise arriving from the look direction may be filtered out by a suitable choice of the frequency-response characteristic in that direction, or by external means. Subsequent processing of the array output may be done for detection or classification.

A major advantage of the constrained LMS algorithm is that it has a self-correcting feature permitting it to operate for arbitrarily long periods of time in a digital computer implementation without deviating from its constraints because of cumulative roundoff or truncation errors.

The algorithm is applicable to array processing problems in geoscience, sonar, and electromagnetic antenna arrays in which a simple method is required for adjusting an array in real time to discriminate against noises impinging on the array sidelobes.

Fig. 1. Broad-band antenna array and equivalent processor for signals coming from the look direction.

Previous Work

Previous work on iterative least squares array processing was done by Griffiths [1]; his method uses an unconstrained minimum-mean-square-error optimization criterion which requires a priori knowledge of second-order signal statistics. Widrow, Mantey, Griffiths, and Goode [2] proposed a variable-criterion [3] optimization procedure involving the use of a known training signal; this was an application and extension of the original work on adaptive filters done by Widrow and Hoff [4]. Griffiths also proposed a constrained least mean-squares processor not requiring a priori knowledge of the signal statistics [5]; a new derivation of this processor, given in [6], shows that it may be considered as putting "soft" constraints on the processor via the quadratic penalty-function method. Hard (i.e., exactly) constrained iterative optimization was studied by Rosen [7] for the deterministic case. Lacoss [8], Booker et al. [9], and Kobayashi [10] studied hard-constrained optimization in the array processing context for filtering short lengths of data. All four authors used gradient-projection techniques [11]; Rosen and Booker correctly indicated that gradient-projection methods are susceptible to cumulative roundoff errors and are not suitable for long runs without an additional error-correction procedure. The constrained LMS algorithm developed in the present work is designed to avoid error accumulation while maintaining a hard constraint; as a result, it is able to provide continual filtering for arbitrarily large numbers of iterations.

Manuscript received December 23, 1971; revised May 4, 1972. This research is based on a Ph.D. dissertation in the Department of Electrical Engineering, Stanford University, Stanford, Calif. The author is with ARGOSystems, Inc., Palo Alto, Calif. 94303.

Basic Principle of the Constraints

The algorithm is able to maintain a chosen frequency response in the look direction while minimizing output noise power because of a simple relation between the look-direction frequency response and the weights in the array of Fig. 1.

Assume that the look direction is chosen as the direction perpendicular to the line of sensors. Then identical signal components arriving on a plane wavefront parallel to the line of sensors appear at the first taps simultaneously and parade in parallel down the tapped delay lines following each sensor; however, noise waveforms arriving from other than the look direction will not, in general, produce equal voltage components on any given vertical column of taps. The voltages at each tap (signal plus noise) are multiplied by the tap weights and added to form the array output. Thus, as far as the signal is concerned, the array processor is equivalent to a single tapped delay line in which each weight is equal to the sum of the weights in the corresponding vertical column of the processor, as indicated in Fig. 1. These summation weights in the equivalent tapped delay line must be selected so as to give the desired frequency-response characteristic in the look direction; a numerical check of this equivalence is sketched below. If the look direction is chosen to be other than that perpendicular to the line of sensors, then the array can be steered either mechanically or electrically by the addition of steering time delays (not shown) placed immediately after each sensor.

A processor having K sensors and J taps per sensor has KJ weights and requires J constraints to determine its look-direction frequency response. The remaining KJ − J degrees of freedom in choosing the weights may be used to minimize the total power in the array output. Since the look-direction frequency response is fixed by the J constraints, minimization of the total output power is equivalent to minimizing the non-look-direction noise power, so long as the set of signal voltages at the taps is uncorrelated with the set of noise voltages at these taps. The latter assumption has commonly been made in previous work on iterative array processing [1], [5], [8]-[10]. The effect of signal-correlated noise in the array may be to cancel out all or part of the desired signal component in the array output. Sources of signal-correlated noise may be multiple signal-propagation paths, and coherent radar or sonar clutter.

It is permissible, and in fact desirable for proper noise cancellation, that the voltages produced by the noises on the taps of the array be correlated among themselves, although uncorrelated with the signal voltages. Examples of such noises include waveforms from point sources in other than the look direction (e.g., lightning, jammers, noise from nearby vehicles), spatially localized incoherent clutter, and self-noise from the structure carrying the array.

Noise voltages which are uncorrelated between taps (e.g., amplifier thermal noise) may be partially rejected by the adaptive array in two ways. First, as in a conventional nonadaptive array, such noises are eliminated to the extent that signal voltages on the taps are added coherently at the array output, while uncorrelated noise voltages are added incoherently. Second, an adaptive array can reduce the weighting on any tap that may have a disproportionately large uncorrelated noise power.
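To make the column-sum equivalence concrete, the following minimal NumPy check (array sizes and waveform values are illustrative placeholders, not taken from the paper's simulation) verifies that for a broadside look-direction signal the full KJ-weight processor and the J-weight equivalent delay line of Fig. 1 give the same output.

```python
import numpy as np

rng = np.random.default_rng(1)
K, J = 4, 3                       # K sensors, J taps per sensor (illustrative)
W = rng.standard_normal((J, K))   # W[j, k]: weight on tap-column j of sensor k

# A broadside (look-direction) waveform: every sensor sees the same signal,
# so all K voltages on tap-column j equal the delayed value ell[j].
ell = rng.standard_normal(J)

# Full processor output: sum over all KJ weighted tap voltages.
y_full = sum(W[j, k] * ell[j] for j in range(J) for k in range(K))

# Equivalent tapped delay line: weight f_j is the sum of column j's weights.
f = W.sum(axis=1)
y_equiv = f @ ell

assert np.isclose(y_full, y_equiv)   # the two processors agree for this signal
```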

Notation

Notation will be as follows (see Fig. 2): every Δ seconds, where Δ may be a multiple of the delay τ between taps, the voltages at the array taps are sampled. The vector of tap voltages at the kth sample is written X(k), where

X^T(k) ≜ [x_1(kΔ), x_2(kΔ), ..., x_{KJ}(kΔ)].

The superscript T denotes transpose. The tap voltages are the sums of voltages due to look-direction waveforms ℓ and non-look-direction noises n, so that

X(k) = L(k) + N(k)    (1)

where the KJ-dimensional vector of look-direction waveforms at the kth sample is

L^T(k) ≜ [ℓ(kΔ), ..., ℓ(kΔ), ℓ(kΔ − τ), ..., ℓ(kΔ − τ), ..., ℓ(kΔ − (J−1)τ), ..., ℓ(kΔ − (J−1)τ)]

(each delayed value repeated across the K taps of one vertical column), and the vector of non-look-direction noises is

N^T(k) ≜ [n_1(kΔ), n_2(kΔ), ..., n_{KJ}(kΔ)].

The vector of weights at the taps is W, where

W^T ≜ [w_1, w_2, ..., w_{KJ}].

It is assumed for this derivation that the signals and noises are adequately modeled as zero-mean random processes with (unknown) second-order statistics:

E[X(k) X^T(k)] ≜ R_XX    (2a)
E[N(k) N^T(k)] ≜ R_NN    (2b)
E[L(k) L^T(k)] ≜ R_LL.   (2c)

As previously stated, it is assumed that the vector of look-direction waveforms is uncorrelated with the vector of non-look-direction noises, i.e.,

E[N(k) L^T(k)] = 0.    (3)

It is assumed that the noise environment is distributed so that R_XX and R_NN are positive definite [12]. The output of the array (signal estimate) at the time of the kth sample is

y(k) = W^T X(k) = X^T(k) W.    (4)

Using (4), the expected output power of the array is

E[y²(k)] = E[W^T X(k) X^T(k) W] = W^T R_XX W.    (5)
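The notation maps directly onto array code. A minimal NumPy sketch (K, J, and the tap-voltage process are hypothetical placeholders, not the paper's simulation) stacks the KJ tap voltages into X(k), forms y(k) of (4), and checks that the sample output power matches the quadratic form (5):

```python
import numpy as np

rng = np.random.default_rng(0)
K, J = 4, 4                      # K sensors, J taps per sensor (illustrative)
n_snap = 2000                    # number of sampled snapshots

# Rows are tap-voltage vectors X(k); here a synthetic zero-mean process
# stands in for real sensor data.
X = rng.standard_normal((n_snap, K * J))
W = rng.standard_normal(K * J)   # an arbitrary weight vector

y = X @ W                        # y(k) = W^T X(k), eq. (4)
p_est = np.mean(y**2)            # sample estimate of E[y^2(k)]

R_xx = X.T @ X / n_snap          # sample correlation matrix, eq. (2a)
p_quad = W @ R_xx @ W            # W^T R_XX W, eq. (5)
print(p_est, p_quad)             # the two power estimates agree
```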

Fig. 2. Signals and noises on the array. Because the array is steered toward the look direction, all beam signal components on any given column of filter taps are identical.

II. OPTIMUM CONSTRAINED LMS WEIGHT VECTOR

The first step in developing the constrained LMS algorithm is to find the optimum weight vector. The constraint that the weights on the jth vertical column of taps sum to a chosen number f_j (see Fig. 1) is expressed by the requirement

c_j^T W = f_j,    j = 1, 2, ..., J    (6)

where the KJ-dimensional vector c_j is zero except for the jth group of K elements, each of which is one:

c_j ≜ [0, ..., 0, 1, ..., 1, 0, ..., 0]^T    (7)

with the ones occupying the jth group of K elements. Constraining the weight vector to satisfy the J equations of (6) restricts W to a (KJ−J)-dimensional plane. Define the constraint matrix C as

C ≜ [c_1 c_2 ... c_J]    (8)

and define ℱ as the J-dimensional vector of weights of the look-direction-equivalent tapped delay line shown in Fig. 1:

ℱ^T ≜ [f_1, f_2, ..., f_J].    (9)

By inspection the constraint vectors c_j are linearly independent; hence C has full rank equal to J. The constraints (6) are now written

C^T W = ℱ.    (10)

Optimum Weight Vector

Since the look-direction frequency response is fixed by the J constraints, minimization of the non-look-direction noise power is the same as minimization of the total output power. The cost criterion used in this paper will be minimization of total array output power W^T R_XX W. The problem of finding the optimum set of filter weights W_opt is summarized by (5) and (10) as

minimize_W   W^T R_XX W    (11a)
subject to   C^T W = ℱ.    (11b)

This is the constrained LMS problem. W_opt is found by the method of Lagrange multipliers, which is discussed in [13]. Including a factor of 1/2 to simplify later arithmetic, the constraint function is adjoined to the cost function by a J-dimensional vector of undetermined Lagrange multipliers λ:

H(W) = (1/2) W^T R_XX W + λ^T (C^T W − ℱ).    (12)

Taking the gradient of (12) with respect to W,

∇_W H(W) = R_XX W + Cλ.    (13)

The first term in (13) is a vector proportional to the gradient of the cost function (11a), and the second term is a vector normal to the (KJ−J)-dimensional constraint plane defined by C^T W − ℱ = 0 [14]. For optimality these vectors must be antiparallel [15], which is achieved by setting the sum of the vectors (13) equal to zero:

∇_W H(W) = R_XX W + Cλ = 0.

In terms of the Lagrange multipliers, the optimal weight vector is then

W_opt = −R_XX^{-1} Cλ    (14)

where R_XX^{-1} exists because R_XX was assumed positive definite. Since W_opt must satisfy the constraint (11b),

C^T W_opt = ℱ = −C^T R_XX^{-1} Cλ

and the Lagrange multipliers are found to be

λ = −[C^T R_XX^{-1} C]^{-1} ℱ    (15)

where the existence of [C^T R_XX^{-1} C]^{-1} follows from the facts that R_XX is positive definite and C has full rank [6]. From (14) and (15) the optimum-constrained LMS weight vector solving (11) is

W_opt = R_XX^{-1} C [C^T R_XX^{-1} C]^{-1} ℱ.    (16)

Using the set of weights W_opt in the array processor of Fig. 2 forms the optimum constrained LMS processor, which is a filter in space and frequency. Substituting W_opt in (4), the constrained least squares estimate of the look-direction waveform is

y_opt(k) = W_opt^T X(k).    (17)
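Equation (16) transcribes directly into NumPy. In this minimal sketch the dimensions, the synthetic positive-definite R_XX, and the choice of ℱ (borrowed from the paper's later simulation) are illustrative only; the paper's point is precisely that R_XX is not known in practice:

```python
import numpy as np

def constraint_matrix(K: int, J: int) -> np.ndarray:
    """C of eq. (8): column j selects the jth group of K taps, per eq. (7)."""
    return np.kron(np.eye(J), np.ones((K, 1)))

def w_opt(R_xx: np.ndarray, C: np.ndarray, f: np.ndarray) -> np.ndarray:
    """Optimum constrained weight vector, eq. (16)."""
    Rinv_C = np.linalg.solve(R_xx, C)                 # R_XX^{-1} C
    return Rinv_C @ np.linalg.solve(C.T @ Rinv_C, f)  # ... [C^T R^{-1} C]^{-1} F

K, J = 4, 4
C = constraint_matrix(K, J)
f = np.array([1.0, -2.0, 1.5, 2.0])       # a look-direction response vector
A = np.random.default_rng(2).standard_normal((K * J, 2 * K * J))
R_xx = A @ A.T / (2 * K * J)              # synthetic positive-definite R_XX

W = w_opt(R_xx, C, f)
assert np.allclose(C.T @ W, f)            # constraint (10) is satisfied
```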

Discussion

The constrained LMS filter is sometimes known by other names. If the frequency characteristic in the look direction is chosen to be all-pass and linear phase (distortionless), the output of the constrained LMS filter is the maximum likelihood estimate of a stationary process in Gaussian noise if the angle of arrival is known [15]. The distortionless form of the constrained LMS filter is called by some authors the Minimum Variance Distortionless Look estimator, the Maximum Likelihood Distortionless Estimator, and the Least Squares Unbiased Estimator. By suitable choice of ℱ a variety of other optimal processors can be obtained [16].

III. THE ADAPTIVE ALGORITHM

In this paper it is assumed that the input correlation matrix R_XX is unknown a priori and must be learned by an adaptive technique. In stationary environments the weights are learned during adaptation; in time-varying environments an estimate of the optimum weights must be recomputed periodically. Direct substitution of a correlation-matrix estimate into the optimal-weight equation (16) requires a number of multiplications at each iteration proportional to the cube of the number of weights; the computation is primarily caused by the required inversion of the input correlation matrix. Recently Saridis et al. [17] and Mantey and Griffiths [18] have shown how to update matrix inversions iteratively, requiring only a number of multiplications and storage locations proportional to the square of the number of weights. The gradient-descent constrained LMS algorithm presented here requires only a number of multiplications and storage locations directly proportional to the number of weights. It is therefore simple to implement and, for a given computational cost, is applicable to arrays in which the number of weights is on the order of the square of the number that could be handled by the iterative matrix inversion method and the cube of the number that could be handled by the direct substitution method.

Derivation

For motivation of the algorithm derivation, temporarily suppose that the correlation matrix R_XX is known. In constrained gradient-descent optimization, the weight vector is initialized at a vector satisfying the constraint (11b), say W(0) = C(C^T C)^{-1} ℱ, and at each iteration the weight vector is moved in the negative direction of the constrained gradient (13). The length of the step is proportional to the magnitude of the constrained gradient and is scaled by a constant μ. After the kth iteration the next weight vector is

W(k+1) = W(k) − μ ∇_W H[W(k)]
        = W(k) − μ[R_XX W(k) + Cλ(k)]    (18)

where the second step is from (13). The Lagrange multipliers are chosen by requiring W(k+1) to satisfy the constraint (11b):

ℱ = C^T W(k+1) = C^T W(k) − μC^T R_XX W(k) − μC^T Cλ(k).

Solving for the Lagrange multipliers λ(k) and substituting into the weight-iteration equation (18), we have

W(k+1) = W(k) − μ[I − C(C^T C)^{-1} C^T] R_XX W(k) + C(C^T C)^{-1}[ℱ − C^T W(k)].    (19)

The deterministic algorithm (19) is shown in this form to emphasize that the last factor ℱ − C^T W(k) is not assumed to be zero, as it would be if the weight vector precisely satisfied the constraint at the kth iteration. It will be shown in Section VI that this term permits the algorithm to correct any small deviations from the constraint due to arithmetic inaccuracy and prevents their eventual accumulation and growth.

Defining the KJ-dimensional vector

F ≜ C(C^T C)^{-1} ℱ    (20a)

and the KJ × KJ matrix

P ≜ I − C(C^T C)^{-1} C^T    (20b)

the algorithm (19) may be rewritten as

W(k+1) = P[W(k) − μR_XX W(k)] + F.    (21)

Equation (21) is a deterministic constrained gradient-descent algorithm requiring knowledge of the input correlation matrix R_XX, which, however, is unavailable a priori in the array problem. An available and simple approximation for R_XX at the kth iteration is the outer product of the tap-voltage vector with itself: X(k)X^T(k). Substitution of this estimate into (21) gives the constrained LMS algorithm

W(0) = F
W(k+1) = P[W(k) − μy(k)X(k)] + F    (22)

where y(k) is the array output (signal estimate) defined by (4).
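The adaptive loop (22) needs only X(k) and y(k) at each step; P and F are precomputed from (20). A minimal NumPy sketch, with placeholder data in place of real tap voltages:

```python
import numpy as np

rng = np.random.default_rng(3)
K, J = 4, 4
C = np.kron(np.eye(J), np.ones((K, 1)))     # constraint matrix, eq. (8)
f = np.array([1.0, -2.0, 1.5, 2.0])         # desired look-direction response

CtC_inv = np.linalg.inv(C.T @ C)            # for this C, (C^T C) = K*I
F = C @ CtC_inv @ f                         # eq. (20a), precomputed once
P = np.eye(K * J) - C @ CtC_inv @ C.T       # eq. (20b), precomputed once

mu = 0.01
W = F.copy()                                # W(0) = F
for _ in range(200):
    X = rng.standard_normal(K * J)          # tap voltages X(k) (placeholder)
    y = W @ X                               # array output y(k), eq. (4)
    W = P @ (W - mu * y * X) + F            # constrained LMS update, eq. (22)
    assert np.allclose(C.T @ W, f)          # constraint holds at every step
```

Because (C^T C) = K·I for this C, applying P amounts to subtracting from each weight the average of its column group; this is the "little more than a few additions" noted in the discussion that follows, so the dense matrix multiply above is for clarity only.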


Discussion

The constrained LMS algorithm (22) satisfies the constraint C^T W(k+1) = ℱ at each iteration, as can be verified by premultiplying (22) by C^T and using (20). At each iteration the algorithm requires only the tap voltages X(k) and the array output y(k); no a priori knowledge of the input correlation matrix is needed. F is a constant vector that can be precomputed. One of the two most significant operations required by (22) is the multiplication of each of the KJ components of the vector X(k) by the scalar μy(k); the other significant operation is the multiplication by the matrix P = I − C(C^T C)^{-1} C^T. Because of the simple form of C [refer to (7)], multiplication of a vector by P as indicated in (22) amounts to little more than a few additions, and the iterative equations for the weight-vector components, expressed in summation notation, can readily be implemented on a digital computer.

IV. PERFORMANCE

Convergence to the Optimum

The weight vector W(k) obtained by the use of the stochastic algorithm (22) is a random vector. Convergence of the mean weight vector to the optimum is demonstrated by showing that the length of the difference vector between the mean weight vector and the optimum (16) asymptotically approaches zero.

Proof of convergence of the mean is greatly simplified by the assumption (used in [2]) that successive samples of the input vector taken Δ seconds apart are statistically independent. This condition can usually be approximated in practice by sampling the input vector at intervals large compared to the correlation time of the input process plus the length of time it takes an input waveform to propagate down the array. The assumption is more restrictive than necessary, since Daniell [19] has shown that the much weaker assumption of asymptotic independence of the input vectors is sufficient to demonstrate convergence in the related unconstrained least squares problem.

Taking the expected value of both sides of the algorithm (22), using (4), (2a), and the independence assumption, yields an iterative equation in the mean value of the constrained LMS weight vector

E[W(k+1)] = P{E[W(k)] − μR_XX E[W(k)]} + F.    (23)

Define the vector V(k+1) to be the difference between the mean adaptive weight vector at iteration k+1 and the optimal weight vector (16)

V(k+1) ≜ E[W(k+1)] − W_opt.

Using (23) and the relations F = (I − P)W_opt and P R_XX W_opt = 0, which may be verified by direct substitution of (16) and (20b), an equation for the difference process may be constructed

V(k+1) = PV(k) − μP R_XX V(k).    (24)

The idempotence of P (i.e., P² = P), which can be verified by carrying out the multiplication using (20b), and premultiplication of (24) by P show that PV(k) = V(k) for all k, so (24) can be written

V(k+1) = [I − μP R_XX P] V(k) = [I − μP R_XX P]^{k+1} V(0).

The matrix P R_XX P determines both the rate of convergence of the mean weight vector to the optimum and the steady-state variance of the weight vector about the optimum. It is shown in [6] that P R_XX P has precisely J zero eigenvalues, corresponding to the column vectors of the constraint matrix C; this is a result of the fact that during adaption no movement is permitted away from the (KJ−J)-dimensional constraint plane. It is also shown in [6, appendix C] that P R_XX P has KJ−J nonzero eigenvalues σ_i, i = 1, 2, ..., KJ−J, with values bounded between the smallest and largest eigenvalues of R_XX:

λ_min ≤ σ_min ≤ σ_i ≤ σ_max ≤ λ_max,    i = 1, 2, ..., KJ−J

where λ_min and λ_max are the smallest and largest eigenvalues of R_XX, and σ_min and σ_max are the smallest and largest nonzero eigenvalues of P R_XX P.

Examination of V(0) = F − W_opt shows that it can be expressed as a linear combination of the eigenvectors of P R_XX P corresponding to nonzero eigenvalues. If V(0) is equal to an eigenvector of P R_XX P, say e_i with eigenvalue σ_i ≠ 0, then

V(k+1) = [I − μP R_XX P]^{k+1} e_i = [1 − μσ_i]^{k+1} e_i.

The convergence of the mean weight vector to the optimum weight vector along any eigenvector of P R_XX P is therefore geometric with geometric ratio (1 − μσ_i). The time required for the euclidean length of the difference vector to decrease to e^{-1} of its initial value (time constant) is

τ_i = −Δ/ln(1 − μσ_i) ≅ Δ/(μσ_i)    (25)

where the approximation is valid for μσ_i << 1. If μ is chosen so that

0 < μ < 1/σ_max    (26)

then the length (norm) of any difference vector is bounded between two ever-decreasing geometric progressions

(1 − μσ_max)^{k+1} ||V(0)|| ≤ ||V(k+1)|| ≤ (1 − μσ_min)^{k+1} ||V(0)||

and so, if the initial difference is finite, the mean weight vector converges to the optimum, i.e.,

lim_{k→∞} ||E[W(k)] − W_opt|| = 0

with time constants given by (25).

Steady-State Performance—Stationary Environment

The algorithm is designed to adapt continually for coping with nonstationary noise environments. In stationary environments this adaptation causes the weight vector to have a variance about the optimum and produces an additional component of noise (above the optimum) to appear at the output of the adaptive processor. The output power of the optimum processor with a fixed weight vector (17) is

E[y_opt²(k)] = W_opt^T R_XX W_opt = ℱ^T (C^T R_XX^{-1} C)^{-1} ℱ.

A measure of the fraction of additional noise caused by the adaptive algorithm operating in steady state in a stationary environment is termed the "misadjustment" M(μ) by Widrow [2]. By assuming that successive observation vectors [vectors X(k) of tap voltages] are independent and have components x_1(k), ..., x_{KJ}(k) that are jointly Gaussian distributed, Moschner [20] calculated very tight bounds on the misadjustment, using a method due to Senne [21], [22]. For a convergence constant μ satisfying

0 < μ < 1/[λ_max + (1/2) tr(P R_XX P)]    (27)

the steady-state misadjustment may be bounded by expressions given in [20], where tr denotes trace. M(μ) can be made arbitrarily close to zero by suitably small choice of μ; this means that the steady-state performance of the constrained LMS algorithm can be made arbitrarily close to the optimum. From (25) it is seen that such performance is obtained at the expense of increased convergence time. If μ is chosen to satisfy the more conservative bound

0 < μ < 1/[3 tr(R_XX)]    (28)

then it is guaranteed to satisfy (26). Griffiths [1] shows that the upper bound in (28) can be calculated directly and easily from observations, since tr(R_XX) = E[X^T(k)X(k)], the sum of the powers of the tap voltages.

Steady-State Performance—Nonstationary Environment

A model of the effect of a nonstationary noise environment, proposed by Brown [23], is that the steady-state rms change of the optimal weight vector W_opt(k) between iterations has magnitude δ, i.e.,

lim sup_{k→∞} E||W_opt(k+1) − W_opt(k)||² = δ².

Brown's general results may be applied to the constrained LMS algorithm by restricting the optimal weight vector to have magnitude less than some number ||W_max|| and again assuming the successive input vectors are independent with Gaussian-distributed components. For μ small it can be shown [23, p. 47] that the steady-state rms distance of the weight vector from the optimum is bounded by (29), where any starred quantities q* or q_* are taken to bound the corresponding time-varying quantity q(k), i.e., q_* < q(k) < q* for all k. In general, the optimum convergence constant μ that minimizes the upper bound (29) for a nonstationary environment is nonzero. This contrasts with the stationary case, in which the best steady-state performance is obtained by making μ as small as possible.
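Because tr(R_XX) equals the average total tap power E[X^T(k)X(k)], a safe step size can be estimated directly from observations, as Griffiths' remark above suggests. A minimal sketch (the data are placeholders, and the factor of 3 follows the conservative trace-based rule (28) as reconstructed above):

```python
import numpy as np

rng = np.random.default_rng(4)
snapshots = rng.standard_normal((1000, 16))    # placeholder tap-voltage samples

# tr(R_XX) = E[X^T X]: average squared norm of the tap-voltage vector.
tr_Rxx_est = np.mean(np.sum(snapshots**2, axis=1))

mu = 1.0 / (3.0 * tr_Rxx_est)                  # conservative convergence constant
print(mu)
```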


V. GEOMETRICAL INTERPRETATION

The constrained LMS algorithm (22) has a simple geometrical interpretation that is useful for visualizing the error-correcting property which keeps the weight vector from deviating from its constraints.

In an error-free implementation of the algorithm, the KJ-dimensional weight vectors satisfy the constraint equation (11b) and therefore terminate on a constraint plane Λ defined by

Λ = {W : C^T W = ℱ}.

This (KJ−J)-dimensional constraint plane is indicated diagrammatically in Fig. 3. It is well known [14] that vectors pointing in a direction normal to the constraint plane (but not necessarily normal to the vectors that terminate on that plane) are linear combinations of the constraint-matrix column vectors; these vectors have the form Cλ, where λ is a J-dimensional vector determining the linear combination. Thus the vector F = C(C^T C)^{-1} ℱ, appearing in the algorithm (22) and used as the initial weight vector, points in a direction normal to the constraint plane. F also terminates on the constraint plane, since C^T F = ℱ. Thus F is the shortest vector terminating on the constraint plane (see Fig. 3).

Fig. 3. The (KJ−J)-plane Λ and subspace Σ defined by the constraint.

The homogeneous form of the constraint equation

C^T W = 0    (30)

defines a second (KJ−J)-dimensional plane, which includes the zero vector and thus passes through the origin. Such a plane is called a subspace [11] (see Fig. 3). The matrix P in the algorithm (22) is a projection operator [24]. Premultiplication of any vector by P will annihilate any components perpendicular to Σ, projecting the vector into the constraint subspace (see Fig. 4).

Fig. 4. P projects vectors onto the constraint subspace.

The vector y(k)X(k) in the algorithm is an estimate of the unconstrained gradient. Referring to (12), the unconstrained cost function is (1/2)W^T R_XX W; the unconstrained gradient [refer to (13)] is R_XX W. The estimate of R_XX W at the kth iteration, used in deriving (22), is y(k)X(k). Contours of constant output power (cost) and the optimum constrained weight vector W_opt that minimizes the output power are shown in Fig. 5.

Fig. 5. Example showing contours of constant output power and the constrained weight vector W_opt = R_XX^{-1} C (C^T R_XX^{-1} C)^{-1} ℱ that minimizes output power.

The operation of the constrained LMS algorithm is shown in Fig. 6. In this example, the unconstrained negative gradient estimate −y(k)X(k) is scaled by μ and added to the current weight vector W(k). This is an attempt to change the weight vector in a direction that minimizes output power. In general, this change moves the resulting vector off the constraint plane. The resulting vector is projected onto the constraint subspace and then returned to the constraint plane by adding F. The new weight vector W(k+1) satisfies the constraint to within the accuracy of the arithmetic used in implementing the algorithm.

Fig. 6. Operation of the constrained LMS algorithm: W(k+1) = P[W(k) − μy(k)X(k)] + F.

VI. ERROR-CORRECTING FEATURE

In a digital-computer implementation of any algorithm, it is likely that small computational errors will occur at each iteration because of truncation, roundoff, or quantization errors. A difficulty in applying the well-known gradient-projection algorithm to the real-time array-processing problem is that computational errors causing deviations of the weight vector from the constraint are not corrected [7], [9]. Without additional error-correcting procedures, application of the gradient-projection algorithm is limited to problems requiring few enough iterations that significant deviations from the constraint do not occur. The constrained LMS algorithm, on the other hand, was specifically designed to correct continuously for such errors and prevent them from accumulating. The reason for this characteristic is shown by a geometrical comparison of the two algorithms.

The gradient-projection algorithm may be derived by following the derivation of the constrained LMS algorithm to (19) and dropping the last factor, ℱ − C^T W(k). This factor would be equal to zero in a perfect implementation, in which the weight vector satisfied the constraint C^T W(k) = ℱ at each iteration. The algorithm that results when the term is dropped is

W(0) = C(C^T C)^{-1} ℱ
W(k+1) = W(k) − μP y(k) X(k).    (31)

This is a gradient-projection algorithm [11].
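The contrast between (22) and (31) under computational errors is easy to reproduce numerically. In this hedged sketch (dimensions, data, and the injected perturbation level are illustrative), a small random vector stands in for roundoff at each iteration, and the constraint deviation ||C^T W − ℱ|| is compared after many steps:

```python
import numpy as np

rng = np.random.default_rng(5)
K, J, mu, eps = 4, 4, 0.01, 1e-6
C = np.kron(np.eye(J), np.ones((K, 1)))
f = np.array([1.0, -2.0, 1.5, 2.0])
CtC_inv = np.linalg.inv(C.T @ C)
F = C @ CtC_inv @ f
P = np.eye(K * J) - C @ CtC_inv @ C.T

W_clms = F.copy()      # constrained LMS, eq. (22)
W_gp = F.copy()        # gradient projection, eq. (31)
for _ in range(5000):
    X = rng.standard_normal(K * J)
    err = eps * rng.standard_normal(K * J)             # stand-in for roundoff
    W_clms = P @ (W_clms - mu * (W_clms @ X) * X) + F + err
    W_gp = W_gp - mu * P @ ((W_gp @ X) * X) + err

print(np.linalg.norm(C.T @ W_clms - f))   # stays at the eps level: corrected
print(np.linalg.norm(C.T @ W_gp - f))     # random-walks away from the constraint
```

Since C^T P = 0, the constrained LMS deviation each step is just C^T times the latest perturbation, while the gradient-projection deviation is the accumulated sum of all past perturbations; this is the random walk described below and visible in Figs. 11 and 12.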


Fig. 7. Operation of the gradient-projection algorithm (31).

Fig. 9. Frequency response of the processor in the look direction.

Fig. 10. Power spectral density of incoming signals. See Fig. 2 and Table I for spatial positions of noises.

TABLE I
SIGNALS AND NOISES IN THE SIMULATION (SEE FIG. 2)

Source                 | Direction (0° is normal to array) | Power | Center Frequency (1.0 is 1/τ) | Bandwidth
Look-direction signal  | 0°                                | 0.1   | 0.3                           | 0.1
Noise A                | 45°                               | 1.0   | 0.2                           | 0.05
Noise B                | 60°                               | 1.0   | 0.4                           | 0.1
White noise (per tap)  | —                                 | 0.07  | —                             | —

Fig. 8. Error propagation. The constrained LMS algorithm (a) corrects deviations from the constraints, while the gradient-projection algorithm (b) allows them to accumulate.

It is so named because the unconstrained gradient estimate y(k)X(k) is projected onto the constraint subspace and then added to the current weight vector. Its operation is shown in Fig. 7 (compare with Fig. 6).

A comparison between the effect of computational errors on the gradient-projection algorithm and on the constrained LMS algorithm is shown in Fig. 8. The weight vector is assumed to be off the constraint at the kth iteration because of a quantization error occurring in the previous iteration. It is shown in Fig. 8(a) that the constrained LMS algorithm makes the unconstrained step, projects onto the subspace, and then adds F, producing a new weight vector W(k+1) that satisfies the constraint. The gradient-projection algorithm [Fig. 8(b)], however, projects the gradient estimate onto the subspace and adds the projected vector to the past weight vector, moving parallel to the constraint plane but continuing the error. Note the implicit (incorrect) assumption that W(k) satisfied the constraint, corresponding to the same assumption made in the derivation of the gradient-projection algorithm.

Accumulating errors in the gradient-projection algorithm can be expected to cause the weight vector to do a random walk away from the constraint plane, with variance (expected squared distance from the plane) increasing linearly with the number of iterations. By contrast, the expected deviation of the constrained LMS algorithm from the constraint does not grow, remaining at its original value.

VII. SIMULATION

A computer simulation of the processor was made using six-digit floating-point arithmetic on a small computer (the HP-2116). The processor had four sensors on a line spaced at τ-second intervals and had four taps per sensor (thus KJ = 16). The environment had three point-noise sources and white noise added to each sensor. Power of the look-direction signal was quite small in comparison with the power of the interfering noises (see Table I). The tap spacing defined a frequency of 1.0 (i.e., f = 1.0 is a frequency of 1/τ Hz). In the look direction, the foldover frequency for the processor response was 1/(2τ), or 0.5 on this scale.



Fig. 11. The output power of the constrained LMS filter (upper graph) decreases as it adapts to discriminate against unwanted noise. Lower curve shows small deviations from the constraint due to quantization.

Fig. 12. Output power of the gradient-projection algorithm (upper graph) operated on the same data as the constrained LMS algorithm (cf. Fig. 11). Lower curve shows that deviations from the constraint tend to increase with time. Note scale.

All signals were generated by a pseudo-Gaussian generator and filtered to give them the proper spatial and temporal correlations. All temporal correlations were arranged to be identically zero for time differences greater than 25τ. The time between adaptations Δ was assumed greater than 58τ, so successive samples of X(k) were uncorrelated. The look-direction filter was specified by the vector

ℱ^T = [1, −2, 1.5, 2]

which resulted in the frequency characteristic shown in Fig. 9. The signal and noise spectra are shown in Fig. 10 and their spatial positions in Fig. 2. In this problem, the eigenvalues of R_XX ranged from 0.111 to 8.355. The upper permissible bound on the convergence constant μ calculated by (26) was 0.074; a value of μ = 0.01 was selected, which, by (27), would lead to a misadjustment of between 15.2 and 17.0 percent.
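The look-direction characteristic of Fig. 9 is just the frequency response of the four-tap equivalent filter ℱ with inter-tap delay τ. A minimal sketch (the normalization, with f = 1.0 at 1/τ and foldover at 0.5, follows the text; the 256-point grid is arbitrary):

```python
import numpy as np

f_weights = np.array([1.0, -2.0, 1.5, 2.0])   # the vector F^T from the text
taps = np.arange(len(f_weights))              # tap delays 0, tau, 2*tau, 3*tau
freqs = np.linspace(0.0, 0.5, 256)            # f = 0.5 is the foldover frequency

# Equivalent-filter response H(f) = sum_j f_j * exp(-i 2 pi f j), cf. Fig. 9
H = np.exp(-1j * 2 * np.pi * np.outer(freqs, taps)) @ f_weights
magnitude = np.abs(H)                          # plot vs. freqs to reproduce Fig. 9
```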

The processor was initialized with W(0) = F = C(C^T C)^{-1} ℱ, and Fig. 11 shows performance as a function of time. The upper graph has three horizontal lines. The lower line is the output power of the optimum weight vector. The closely spaced upper two lines are upper and lower bounds for the adaptive processor output power, which is the optimum output power plus misadjustment. The mean steady-state value of the processor's output power falls somewhere between the upper and lower bounds (but may, at any instant, fall above or below these bounds). The difference between the initial and steady-state power levels is the amount of undesirable noise power the processor has been able to remove from the output.

A simulation of the gradient-projection algorithm (31) on the array problem was made using exactly the same data as used by the constrained LMS algorithm. The results are shown in Fig. 12. The lower part of Fig. 12 shows how the gradient-projection algorithm walks away from the constraint; note the change in scale. If the errors of the constrained LMS algorithm (Fig. 11) were plotted on the same scale, they would not be discernible. The errors of the gradient-projection method are expected to continue to grow.


The fact that the output power of the gradient-projection processor (upper curve, Fig. 12) is virtually identical to the output power of the constrained LMS processor is a result of the fact that the errors have not yet accumulated to the point of moving the constraint a significant radial distance from the origin.

VIII. LIMITATIONS AND EXTENSION

Application of the constrained LMS algorithm in some array processing problems is limited by the requirement that the non-look-direction noise voltages on the taps be uncorrelated with the look-direction signal voltages. This restriction is a result of the fact that if the noise voltages are correlated with the signal, then the processor may cancel out portions of the signal with them in spite of the constraints. If the source of correlated noise is known, its effect may be reduced by placing additional constraints to minimize the array response in its direction.

Implementation errors, i.e., deviations from the assumed electrical and spatial properties of the array (such as incorrect amplifier gains, incorrect sensor placements, or unpredicted mutual coupling between sensors), may also limit the effectiveness of the processor by permitting it to discriminate against look-direction signals while still satisfying the letter of the constraints. Injection of known test signals into the array may provide information about the signal paths that can be used to compensate, in part, for the errors.

The algorithm may be extended to a more general stochastic constrained least squares problem

minimize_W  E{[d(k) − W^T X(k)]²}  subject to  C^T W = ℱ    (32)

where d(k) is a scalar variable related to the observation vector X(k) and C is a general constraint matrix. The scalar d(k) may be a random variable correlated with X(k), or it may be a known test signal used to compensate for array errors. This would be a classical least squares problem except that the statistics of X(k) and d(k) are assumed unknown a priori. The general constrained LMS algorithm solving (32) may be derived similarly to (22), initialized with W(0) = C(C^T C)^{-1} ℱ. The general algorithm is applicable to constrained modeling, prediction, estimation, and control; it is discussed in [6].
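A sketch of this extension follows. The update form here is inferred from the parallel with (22) — the output-power gradient estimate y(k)X(k) is replaced by the LMS error-gradient estimate −[d(k) − y(k)]X(k) — and is not quoted from the paper; the constraint matrix, d(k), and all data are placeholders:

```python
import numpy as np

rng = np.random.default_rng(6)
n, J = 16, 4
C = rng.standard_normal((n, J))           # a general (full-rank) constraint matrix
f = rng.standard_normal(J)
CtC_inv = np.linalg.inv(C.T @ C)
F = C @ CtC_inv @ f
P = np.eye(n) - C @ CtC_inv @ C.T

mu = 0.005
W = F.copy()                              # W(0) = C (C^T C)^{-1} F, as in the text
for _ in range(1000):
    X = rng.standard_normal(n)            # observation vector X(k) (placeholder)
    d = rng.standard_normal()             # desired scalar d(k) (placeholder)
    e = d - W @ X                         # error d(k) - y(k)
    W = P @ (W + mu * e * X) + F          # assumed general update, cf. eq. (22)
    assert np.allclose(C.T @ W, f)        # hard constraint maintained throughout
```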
IX. CONCLUSION

Analysis and computer simulations have confirmed the ability of the constrained LMS algorithm to adjust an array of sensors in real time to respond to a desired signal while discriminating against noise. Because of a system of constraints on the weights in the array, the algorithm is shown to require no prior knowledge of the signal or noise statistics. Time constants, steady-state performance, and a proof of convergence are derived for operation of the algorithm in a stationary environment; convergence and steady-state performance in a nonstationary environment are also shown.

A geometrical presentation has shown why the constrained LMS algorithm is able to maintain the constraints and prevent the accumulation of quantization errors in a digital implementation. The simulation tests have confirmed the effectiveness of this error-correcting feature, in contrast with the usual uncorrected gradient-projection algorithm. The error-correcting feature and the simplicity of the algorithm make it appropriate for continuous real-time signal estimation and discrimination against noises in a possibly time-varying environment where little a priori information is available about the signals or noises.

A simple extension of the algorithm may be used to solve a general constrained LMS problem, which is to minimize the expected squared difference between a multidimensional filter output and a known desired signal under a set of linear equality constraints.

REFERENCES

[1] L. J. Griffiths, "A simple adaptive algorithm for real-time processing in antenna arrays," Proc. IEEE, vol. 57, pp. 1696-1704, Oct. 1969.
[2] B. Widrow, P. E. Mantey, L. J. Griffiths, and B. B. Goode, "Adaptive antenna systems," Proc. IEEE, vol. 55, pp. 2143-2158, Dec. 1967.
[3] L. J. Griffiths, "Comments on 'A simple adaptive algorithm for real-time processing in antenna arrays' (Author's reply)," Proc. IEEE (Lett.), vol. 58, p. 798, May 1970.
[4] B. Widrow and M. E. Hoff, Jr., "Adaptive switching circuits," IRE WESCON Conv. Rec., pt. 4, pp. 96-104, 1960.
[5] L. J. Griffiths, "Signal extraction using real-time adaptation of a linear multichannel filter," Stanford Electron. Lab., Stanford, Calif., Doc. SEL-68-017, Tech. Rep. TR 6788-1, Feb. 1968.
[6] O. L. Frost, III, "Adaptive least squares optimization subject to linear equality constraints," Stanford Electron. Lab., Stanford, Calif., Doc. SEL-70-055, Tech. Rep. TR 6796-2, Aug. 1970.
[7] J. B. Rosen, "The gradient projection method for nonlinear programming, pt. I: Linear constraints," J. Soc. Indust. Appl. Math., vol. 8, p. 181, Mar. 1960.
[8] R. T. Lacoss, "Adaptive combining of wideband array data for optimal reception," IEEE Trans. Geosci. Electron., vol. GE-6, pp. 78-86, May 1968.
[9] A. H. Booker, C. Y. Ong, J. P. Burg, and G. D. Hair, "Multiple-constraint adaptive filtering," Texas Instruments, Sci. Services Div., Dallas, Tex., Apr. 1969.
[10] H. Kobayashi, "Iterative synthesis methods for a seismic array processor," IEEE Trans. Geosci. Electron., vol. GE-8, pp. 169-178, July 1970.
[11] D. G. Luenberger, Optimization by Vector Space Methods. New York: Wiley, 1969.
[12] I. J. Good and K. C. Doog, "A paradox concerning rate of information," Informat. Contr., vol. 1, pp. 113-116, May 1958.
[13] A. E. Bryson, Jr., and Y. C. Ho, Applied Optimal Control. Waltham, Mass.: Blaisdell, 1969.
[14] W. H. Fleming, Functions of Several Variables. Reading, Mass.: Addison-Wesley, 1965.
[15] E. J. Kelly, Jr., and M. J. Levin, "Signal parameter estimation for seismometer arrays," Mass. Inst. Technol. Lincoln Lab., Tech. Rep. 339, Jan. 1964.
[16] A. H. Nuttall and D. W. Hyde, "A unified approach to optimum and suboptimum processing for arrays," U.S. Navy Underwater Sound Lab., New London, Conn., USL Rep. 992, Apr. 1969.
[17] G. N. Saridis, Z. J. Nikolic, and K. S. Fu, "Stochastic approximation algorithms for system identification, estimation, and decomposition of mixtures," IEEE Trans. Syst. Sci. Cybern., vol. SSC-5, pp. 8-15, Jan. 1969.
[18] P. E. Mantey and L. J. Griffiths, "Iterative least-squares algorithms for signal extraction," in Proc. 2nd Hawaii Int. Conf. on Syst. Sci., pp. 767-770.
[19] T. P. Daniell, "Adaptive estimation with mutually correlated training samples," Stanford Electron. Labs., Stanford, Calif., Doc. SEL-68-083, Tech. Rep. TR 6778-4, Aug. 1968.
[20] J. L. Moschner, "Adaptive filtering with clipped input data," Stanford Electron. Labs., Stanford, Calif., Doc. SEL-70-053, Tech. Rep. TR 6796-1, June 1970.
[21] K. D. Senne, "Adaptive linear discrete-time estimation," Stanford Electron. Labs., Stanford, Calif., Doc. SEL-68-090, Tech. Rep. TR 6778-5, June 1968.
[22] K. D. Senne, "New results in adaptive estimation theory," Frank J. Seiler Res. Lab., USAF Academy, Colo., Tech. Rep. SRL-TR-70-0013, Apr. 1970.
[23] J. E. Brown, III, "Adaptive estimation in nonstationary environments," Stanford Electron. Labs., Stanford, Calif., Doc. SEL-70-056, Tech. Rep. TR 6795-1, Aug. 1970.
[24] D. T. Finkbeiner, II, Introduction to Matrices and Linear Transformations. San Francisco, Calif.: Freeman, 1966.
