Mean-Square Performance of a Family of Affine
Projection Algorithms
Hyun-Chool Shin and Ali H. Sayed, Fellow, IEEE
Abstract—Affine projection algorithms are useful adaptive filters whose main purpose is to speed the convergence of LMS-type filters. Most analytical results on affine projection algorithms assume special regression models or Gaussian regression data. The available analyses also treat different affine projection filters separately. This paper provides a unified treatment of the mean-square error, tracking, and transient performances of a family of affine projection algorithms. The treatment relies on energy-conservation arguments and does not restrict the regressors to specific models or to a Gaussian distribution. Simulation results illustrate the analysis and the derived performance expressions.

Index Terms—Affine projection algorithm, energy conservation, learning curve, steady-state analysis, tracking analysis, transient analysis.
I. INTRODUCTION
The normalized least mean-squares (NLMS) algorithm is
a widely used adaptive algorithm due to its computational
simplicity and ease of implementation. However, colored input
signals can deteriorate its convergence speed appreciably [1],
[2]. To address this problem, Ozeki and Umeda [3] developed
the basic form of an affine projection algorithm (APA) using
affine subspace projections. APA is a useful family of adaptive filters whose main purpose is to speed the convergence of
LMS-type filters, especially for correlated data, at a computational cost that is still comparable to that of LMS. This class
of filters is particularly useful in echo cancellation applications,
e.g., [4]. While NLMS updates the weights based only on the current input vector, APA updates the weights based on the K most recent input vectors. Since [3], many variants of APA have been
devised independently from different perspectives such as the
regularized APA (R-APA) [4], the partial rank algorithm (PRA)
[5], the decorrelating algorithm (DA) [6], and NLMS with orthogonal correction factors (NLMS-OCF) [7]. We will refer to
all these algorithms as belonging to the APA family (see also
[8] and [9]).
Manuscript received October 23, 2002; revised April 11, 2003. This work was supported in part by the National Science Foundation under Grants ECS-9820765 and CCR-0208573. This work was performed while H. Shin was a visiting graduate student at the UCLA Adaptive Systems Laboratory. His work was supported in part by the Brain Korea (BK) 21 Program funded by the Ministry of Education and in part by the HY-SDR Research Center at Hanyang University under the ITRC Program of MIC, Korea. The associate editor coordinating the review of this paper and approving it for publication was Dr. Behrouz Farhang-Boroujeny.
H.-C. Shin is with the Division of Electronics and Computer Engineering, Pohang University of Science and Technology (POSTECH), Pohang, Korea.
A. H. Sayed is with the Department of Electrical Engineering, University of California, Los Angeles, CA 90095 USA (e-mail: sayed@ee.ucla.edu).
Digital Object Identifier 10.1109/TSP.2003.820077
The transient behavior of affine projection algorithms is not
as widely studied as that of NLMS. The available results have
progressed more for some variations than others, and most
analyses assume particular models for the regression data.
For example, in [10], convergence analyses in the mean and
in the mean-square senses are presented for the binormalized
data-reusing LMS (BNDR-LMS) algorithm. Although the
results show good agreement with simulations, the arguments
are based on a particular model for the input signal and are applicable only to second-order APA. Likewise, the convergence
results in [9] focus on NLMS-OCF and rely on a special model
for the input signal vector. A convergence analysis of DA is
given in [11], where the theoretical results of [6] are extended
to the evaluation of learning curves assuming a Gaussian
autoregressive input model. All these results provide useful
design guidelines. However, each APA form is usually studied
separately with specific techniques. Such distinct treatments
tend to obscure commonalities that exist among algorithms.
In this paper, we provide a unified treatment of the transient
performance of the APA family. In particular, we derive expressions for the mean-square error and tracking performances, as
well as conditions on the step-size for mean-square stability. Our
derivation relies on energy conservation arguments [12]–[18],
and it does not restrict the regression data to being Gaussian or
white. Extensive simulations at the end of the paper illustrate
the derived results.
Throughout the paper, the following notation is adopted:
‖·‖        Euclidean norm of a vector.
Tr(·)      Trace of a matrix.
diag{·}    Diagonal matrix of its entries.
(·)*       Hermitian conjugation (complex conjugation for scalars).
(·)^T      Transpose of a vector or a matrix.
det(·)     Determinant of a matrix.
λ_max(·)   Largest eigenvalue of a matrix.
ℝ⁺         Set of positive real numbers.
In addition, small boldface letters are used to denote vectors, and capital letters are used to denote matrices, e.g., w and U. The symbol I denotes the identity matrix of appropriate dimensions. All vectors are column vectors except for the input data vector, denoted by u_i, which is taken to be a row vector for convenience of notation.
The paper is organized as follows. In the next section, the data model and a review of the APA family are provided. In Section III, by examining the mean-square performance of the APA family, expressions for the steady-state mean-square error (MSE) are derived. Section IV studies the tracking ability of the APA family. In Section V, the transient performance is analyzed, and the learning behavior is then characterized. Section VI illustrates the theoretical results by giving several simulation results.
[TABLE I: The APA family for different choices of the parameters {ε, K, D}, where K and D are integers.]
II. DATA MODELS AND APA FAMILY

Consider reference data d(i) that arise from the linear model

  d(i) = u_i w^o + v(i)    (1)

where w^o is an unknown column vector that we wish to estimate, v(i) accounts for measurement noise, and u_i denotes 1 × M row input (regressor) vectors with a positive-definite covariance matrix R_u = E[u_i^* u_i]. In this paper, we focus on a general class of affine projection algorithms for estimating w^o of the form

  w_i = w_{i-1} + μ U_i^* (εI + U_i U_i^*)^{-1} (d_i − U_i w_{i-1})    (2)

where w_i is an estimate for w^o at iteration i, μ is the step size, ε is a regularization parameter, and

  U_i = col{u_i, u_{i-D}, …, u_{i-(K-1)D}}  (a K × M matrix)
  d_i = col{d(i), d(i-D), …, d(i-(K-1)D)}.

Different choices of the parameters {ε, K, D} result in different affine projection algorithms. Table I defines the parameters for some special cases. For example, the choices ε = 0 and D = 1, with K ≥ 1 arbitrary, result in the standard APA. For NLMS-OCF, it is further assumed that u_i is orthogonal to {u_{i-D}, …, u_{i-(K-1)D}}. For PRA, it is understood that w_i = w_{i-1} whenever i is not a multiple of K, i.e., the weight vector is updated once every K iterations. Most algorithms assume D = 1. Moreover, although we focus on (2), our approach can be extended to other APA algorithms such as DA, which is not covered by (2).
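To make the family concrete, here is a minimal sketch of the update (2) in Python (NumPy) for real-valued data; the function name and the interface are ours, not part of the paper:

```python
import numpy as np

def apa_update(w, U, d, mu=1.0, eps=0.0):
    """One iteration of the APA family (2):
    w <- w + mu * U^T (eps*I + U U^T)^{-1} (d - U w).
    U is K x M with rows u_i, u_{i-D}, ..., u_{i-(K-1)D};
    d is the corresponding K x 1 vector of reference samples."""
    K = U.shape[0]
    e = d - U @ w                      # error vector e_i
    G = eps * np.eye(K) + U @ U.T      # K x K regularized Gram matrix
    return w + mu * U.T @ np.linalg.solve(G, e)
```

With K = 1 and eps = 0 the update collapses to NLMS; eps > 0 gives R-APA; applying the update only once every K iterations gives PRA.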
III. MEAN-SQUARE PERFORMANCE OF APA

Our first objective is to evaluate the steady-state mean-square error performance of the APA family (2), i.e., to compute

  MSE = lim_{i→∞} E|e(i)|²

where e(i) = d(i) − u_i w_{i-1} is the output estimation error at time i. To do so, we will rely on energy-conservation arguments.

A. Energy-Conservation Relation

Let e_i = d_i − U_i w_{i-1}, and let w̃_i = w^o − w_i denote the weight-error vector. Note that e_i = U_i w̃_{i-1} + v_i for all algorithms listed in Table I, except PRA. Then, (2) becomes

  w_i = w_{i-1} + μ U_i^* (εI + U_i U_i^*)^{-1} e_i    (3)

which can be rewritten in terms of the weight-error vector as

  w̃_i = w̃_{i-1} − μ U_i^* (εI + U_i U_i^*)^{-1} e_i    (4)

If we multiply both sides of (4) by U_i from the left, we find that

  U_i w̃_i = U_i w̃_{i-1} − μ U_i U_i^* (εI + U_i U_i^*)^{-1} e_i    (5)

Introduce the a posteriori and a priori error vectors

  e_{p,i} = U_i w̃_i  and  e_{a,i} = U_i w̃_{i-1}.

Then, from (5), it holds that

  e_{p,i} = e_{a,i} − μ U_i U_i^* (εI + U_i U_i^*)^{-1} e_i    (6)

We can use (6) to solve for e_i, assuming U_i U_i^* is invertible,

  e_i = (1/μ)(εI + U_i U_i^*)(U_i U_i^*)^{-1}(e_{a,i} − e_{p,i})    (7)

and substitute into (4) to get

  w̃_i = w̃_{i-1} − U_i^* (U_i U_i^*)^{-1}(e_{a,i} − e_{p,i})

which can be rearranged as

  w̃_i + U_i^* (U_i U_i^*)^{-1} e_{a,i} = w̃_{i-1} + U_i^* (U_i U_i^*)^{-1} e_{p,i}    (8)

By evaluating the energies of both sides of this equation, we find that the following energy equality should hold:

  ‖w̃_i‖² + e_{a,i}^* (U_i U_i^*)^{-1} e_{a,i} = ‖w̃_{i-1}‖² + e_{p,i}^* (U_i U_i^*)^{-1} e_{p,i}    (9)

The important fact to emphasize is that no approximations are used to establish the energy relation (9); it is an exact relation that shows how the energies of the weight-error vectors at two successive iterations are related to the weighted energies of the a priori and a posteriori estimation error vectors. Relation (9) is the extension to the APA case of the energy-conservation relation originally derived in [12] and [13] in the context of robustness analysis and subsequently used in [15]–[18] in the context of steady-state and transient performance analysis. See also [15].
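The exactness of (9) is easy to confirm numerically. The following sketch (our own setup: white Gaussian regressors, ε = 0, and an arbitrary step size) checks the relation at every iteration up to floating-point error:

```python
import numpy as np

rng = np.random.default_rng(0)
M, K, mu = 16, 4, 0.5
w_o = rng.standard_normal(M)              # unknown system w^o
w = np.zeros(M)

for i in range(200):
    U = rng.standard_normal((K, M))       # K stacked regressors
    d = U @ w_o + 0.01 * rng.standard_normal(K)
    e = d - U @ w
    Ginv = np.linalg.inv(U @ U.T)         # (U_i U_i^*)^{-1}, eps = 0
    w_new = w + mu * U.T @ (Ginv @ e)
    wt_old, wt_new = w_o - w, w_o - w_new # weight-error vectors
    e_a, e_p = U @ wt_old, U @ wt_new     # a priori / a posteriori errors
    lhs = wt_new @ wt_new + e_a @ Ginv @ e_a
    rhs = wt_old @ wt_old + e_p @ Ginv @ e_p
    assert np.isclose(lhs, rhs)           # energy relation (9), exact
    w = w_new
```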
B. Variance Relation for Steady-State Performance

The relevance of (9) to the mean-square analysis of affine projection algorithms can be seen as follows. Taking expectations of both sides of (9), we get

  E‖w̃_i‖² + E[e_{a,i}^* (U_i U_i^*)^{-1} e_{a,i}] = E‖w̃_{i-1}‖² + E[e_{p,i}^* (U_i U_i^*)^{-1} e_{p,i}]    (10)
Taking the limit as i → ∞ and using the steady-state condition E‖w̃_i‖² = E‖w̃_{i-1}‖², we obtain

  E[e_{a,i}^* (U_i U_i^*)^{-1} e_{a,i}] = E[e_{p,i}^* (U_i U_i^*)^{-1} e_{p,i}]  as i → ∞    (11)

Substituting (6) into the right-hand side (RHS) of (11), we get

  RHS of (11) = E[e_{a,i}^* (U_i U_i^*)^{-1} e_{a,i}] − μE[e_{a,i}^* A_i e_i] − μE[e_i^* A_i e_{a,i}] + μ²E[e_i^* B_i e_i]    (12)

where we are defining

  A_i = (εI + U_i U_i^*)^{-1}  and  B_i = A_i U_i U_i^* A_i.

Using (12), equality (11) simplifies to

  E[e_{a,i}^* A_i e_i] + E[e_i^* A_i e_{a,i}] = μE[e_i^* B_i e_i]  as i → ∞    (13)

This equation can now be used to evaluate the mean-square performance of affine projection algorithms.
C. Mean-Square Performance

Introduce the noise vector

  v_i = col{v(i), v(i−D), …, v(i−(K−1)D)}.

Then, (1) gives

  e_i = e_{a,i} + v_i

and, under the often realistic assumption that

A.1) the noise v(i) is i.i.d. and statistically independent of the regression matrix U_j for all i and j,

and neglecting the dependency of w̃_{i-1} on past noises, the variance relation (13) reduces to

  2E[e_{a,i}^* A_i e_{a,i}] = μE[e_{a,i}^* B_i e_{a,i}] + μE[v_i^* B_i v_i]  as i → ∞    (14)

This expression can be used to deduce an expression for the filter MSE or, equivalently, for the filter excess mean-square error (EMSE), which is defined by

  EMSE = lim_{i→∞} E|e_a(i)|²

where e_a(i) = u_i w̃_{i-1} is the top entry of e_{a,i}. Note that, for all algorithms listed in Table I except PRA, e(i) is the top entry of e_i. For PRA, the weight vector is held constant between updates, and therefore e(i) is also equal to the top entry of e_i. Now, from (1), we get e(i) = e_a(i) + v(i), and therefore, the MSE and EMSE define each other via MSE = EMSE + σ_v².

In order to evaluate the EMSE, we need to deal with the expectations in (14). For this purpose, we shall rely on the following assumption.

A.2) At steady-state, e_{a,i} is statistically independent of U_i, and moreover,

  E[e_{a,i} e_{a,i}^*] ≈ E|e_a(i)|² · Λ

where Λ ≈ I for small μ and Λ ≈ diag{1, (1−μ)², …, (1−μ)^{2(K−1)}} for large μ.

The condition on E[e_{a,i} e_{a,i}^*] is motivated in Appendix A. Using (14) and A.2), the first term on the left-hand side (LHS) of (14) becomes

  E[e_{a,i}^* A_i e_{a,i}] = Tr(E[A_i e_{a,i} e_{a,i}^*]) ≈ E|e_a(i)|² Tr(ĀΛ)    (15)

as i → ∞. Similar manipulations can be applied to the remaining terms in (14). Thus, we get

  E[e_{a,i}^* B_i e_{a,i}] ≈ E|e_a(i)|² Tr(B̄Λ)    (16)

and

  E[v_i^* B_i v_i] = σ_v² Tr(B̄)    (17)

If we introduce the quantities (which are solely dependent on the statistics of the regression data)

  Ā = E[(εI + U_i U_i^*)^{-1}]  and  B̄ = E[A_i U_i U_i^* A_i]    (18)

then (14) becomes

  2E|e_a(i)|² Tr(ĀΛ) = μE|e_a(i)|² Tr(B̄Λ) + μσ_v² Tr(B̄)    (19)

as i → ∞, and the EMSE of the filter is therefore given by

  EMSE = μσ_v² Tr(B̄) / (2Tr(ĀΛ) − μTr(B̄Λ))    (20)

and the steady-state MSE is

  MSE = μσ_v² Tr(B̄) / (2Tr(ĀΛ) − μTr(B̄Λ)) + σ_v²    (21)
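As a sketch of how (20) and (21) can be evaluated in practice, the traces Tr(Ā) and Tr(B̄) can be estimated by ensemble averaging over the regressor distribution. The setup below is our own (white Gaussian regressors, small step size, and Λ ≈ I, i.e., the small-μ form of A.2):

```python
import numpy as np

rng = np.random.default_rng(1)
M, K, mu, eps, sig_v2 = 16, 4, 0.2, 1e-3, 1e-3

trA = trB = 0.0
n_mc = 2000
for _ in range(n_mc):
    U = rng.standard_normal((K, M))
    A = np.linalg.inv(eps * np.eye(K) + U @ U.T)   # A_i
    B = A @ U @ U.T @ A                            # B_i
    trA += np.trace(A) / n_mc
    trB += np.trace(B) / n_mc

emse = mu * sig_v2 * trB / (2 * trA - mu * trB)    # (20) with Lambda = I
print("theoretical EMSE (20):", emse)
print("theoretical MSE (21):", emse + sig_v2)
```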
Two simplifications can be made when the regularization parameter ε is small.

• If ε is small enough so that its effect can be ignored, then A_i = B_i = (U_i U_i^*)^{-1}, and the definitions of Ā and B̄ will coincide. In this case, (20) reduces to

  EMSE = μσ_v² Tr(B̄) / ((2 − μ) Tr(B̄Λ))    (22)

If we use Λ ≈ I, then (22) further collapses to EMSE ≈ μσ_v²/(2 − μ), and if we use the large-μ form of Λ, the dependence of the EMSE on K is retained.

• Another approximation assumes that ε is small and that the filter order M is large (so that the regressors are close to orthogonal) and uses

  Tr(B̄) ≈ K · E[1/‖u_i‖²]  and  Tr(B̄Λ) ≈ E[1/‖u_i‖²] Σ_{j=0}^{K−1} (1−μ)^{2j}

to get

  EMSE ≈ μσ_v² K / ((2 − μ) Σ_{j=0}^{K−1} (1−μ)^{2j})    (23)

Note that this expression for the EMSE is proportional to K. In contrast, the expression for the EMSE given in [9] does not take into account the effect of K. Simulation results in Section VI (see Figs. 7–12) show that (22) and (23) provide good approximations for filter performance for relatively small step-size μ and order K.
IV. TRACKING PERFORMANCE OF APA

A similar analysis can be used to evaluate the performance of APA in nonstationary environments. Thus, assume that d(i) = u_i w_i^o + v(i), where the unknown system w_i^o is now time-variant. It is assumed that the variation in w_i^o is according to the random-walk model (see, e.g., [1], [2], [15], and [19])

  w_i^o = w_{i-1}^o + q_i    (24)

where q_i is an i.i.d. sequence with autocorrelation matrix Q = E[q_i q_i^*] and independent of the initial conditions {w_{-1}, w_{-1}^o} and of the {u_j, v(j)} for all j. Let w̃_i = w_i^o − w_i. Then

  w̃_i = w̃_{i-1} + q_i − μ U_i^* (εI + U_i U_i^*)^{-1} e_i    (25)

If we multiply (25) by U_i from the left, we obtain that (6) still holds for the nonstationary case. Substituting (6) into (25), we get

  w̃_i − q_i + U_i^* (U_i U_i^*)^{-1} e_{a,i} = w̃_{i-1} + U_i^* (U_i U_i^*)^{-1} e_{p,i}    (26)

Evaluating the energies of both sides of (26) and taking expectations, we find that

  E‖w̃_i − q_i‖² + E[e_{a,i}^* (U_i U_i^*)^{-1} e_{a,i}] = E‖w̃_{i-1}‖² + E[e_{p,i}^* (U_i U_i^*)^{-1} e_{p,i}]    (27)

Using the random-walk model (24), we know that E[q_i^*(w̃_i − q_i)] ≈ 0, and therefore

  E‖w̃_i − q_i‖² ≈ E‖w̃_i‖² − Tr(Q)    (28)

Substituting into (27), we obtain

  E‖w̃_i‖² + E[e_{a,i}^* (U_i U_i^*)^{-1} e_{a,i}] = E‖w̃_{i-1}‖² + Tr(Q) + E[e_{p,i}^* (U_i U_i^*)^{-1} e_{p,i}]    (29)

Comparing with (10), we see that the only difference in the nonstationary case is the appearance of the additional term Tr(Q). Note that the other terms are identical. Therefore, similar manipulations to those in Section III lead to

  2E[e_{a,i}^* A_i e_{a,i}] = μE[e_{a,i}^* B_i e_{a,i}] + μE[v_i^* B_i v_i] + μ^{-1}Tr(Q)  as i → ∞    (30)

and the EMSE is then given by

  EMSE = (μ²σ_v² Tr(B̄) + Tr(Q)) / (μ(2Tr(ĀΛ) − μTr(B̄Λ)))    (31)

The two simplifications of Section III can be used to get

  EMSE = (μ²σ_v² Tr(B̄) + Tr(Q)) / (μ(2 − μ) Tr(B̄Λ))    (32)

or

  EMSE ≈ (μ²σ_v² K E[1/‖u_i‖²] + Tr(Q)) / (μ(2 − μ) E[1/‖u_i‖²] Σ_{j=0}^{K−1}(1−μ)^{2j})    (33)

From (32) and (33), we see that for a given K, there is an optimal μ that minimizes the EMSE, and for a given μ, there is an optimal K that minimizes the EMSE. Comparisons of the tracking performance among the APA family are given in Table II.
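The μ-tradeoff implied by (31)–(33) is easy to visualize with a small numerical scan. The sketch below evaluates the simplified expression (32) with Λ ≈ I, for hypothetical values of σ_v², Tr(Q), and Tr(B̄); the noise term grows with μ while the lag term decays with μ, so an interior minimizer exists:

```python
import numpy as np

sig_v2, tr_Q, tr_B = 1e-3, 1e-6, 0.5     # assumed values, for illustration

mu = np.linspace(0.05, 1.0, 500)
# (32) with Lambda ~ I: noise term + tracking (lag) term
emse = mu * sig_v2 / (2 - mu) + tr_Q / (mu * (2 - mu) * tr_B)
print("optimal step size on the grid:", mu[np.argmin(emse)])
```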
V. TRANSIENT ANALYSIS OF APA
We now study the transient (i.e., convergence and stability) performance of the APA family. This task is more challenging than studying the mean-square performance. Nevertheless, the same energy-conservation arguments of the previous section can still be used if we incorporate weighting into the energy relation and into the definition of the error quantities [14], [17], as we now explain. We will assume, without loss of generality, that ε = 0. Then, (2) becomes

  w_i = w_{i-1} + μ U_i^* (U_i U_i^*)^{-1} (d_i − U_i w_{i-1})

In the following analysis, if we substitute U_i U_i^* by εI + U_i U_i^*, then the results for ε ≠ 0 would be obtained.
A. Weighted Energy Relation

Let ē_{a,i} = U_i Σ w̃_{i-1} and ē_{p,i} = U_i Σ w̃_i. If we multiply both sides of the above recursion by U_i Σ from the left, for any Hermitian positive-definite matrix Σ, we find that the a priori and a posteriori estimation errors {ē_{a,i}, ē_{p,i}} are related via

  ē_{p,i} = ē_{a,i} − μ U_i Σ U_i^* (U_i U_i^*)^{-1} e_i    (34)
[TABLE II: EMSE of the APA family in nonstationary environments.]
Similarly to the arguments in Section III, we can get

  w̃_i + U_i^* (U_i Σ U_i^*)^{-1} ē_{a,i} = w̃_{i-1} + U_i^* (U_i Σ U_i^*)^{-1} ē_{p,i}    (35)

On each side of this identity, we have a combination of a priori and a posteriori errors. If we equate the weighted Euclidean norms (with weight Σ) of both sides of (35), we find that

  ‖w̃_i‖²_Σ + ē_{a,i}^* (U_i Σ U_i^*)^{-1} ē_{a,i} = ‖w̃_{i-1}‖²_Σ + ē_{p,i}^* (U_i Σ U_i^*)^{-1} ē_{p,i}    (36)

The special choice Σ = I reduces (36) to the energy relation (9). Moreover, since e_i = e_{a,i} + v_i, we also get

  w̃_i = (I − μ U_i^* (U_i U_i^*)^{-1} U_i) w̃_{i-1} − μ U_i^* (U_i U_i^*)^{-1} v_i    (37)
B. Weighted Variance Relation

In transient analysis, we are interested in the time evolution of E‖w̃_i‖²_Σ for some desirable choices of Σ. For this reason, rather than eliminate the effect of the weight-error vector, the contributions of the other error quantities {ē_{a,i}, ē_{p,i}} are instead expressed in terms of the weight-error vector itself. In so doing, the energy relation (36) will lead to a recursion that describes the evolution of E‖w̃_i‖²_Σ. Replacing ē_{p,i} in (36) by its equivalent expression in (34), we get

  ‖w̃_i‖²_Σ = ‖w̃_{i-1}‖²_Σ − 2μ Re{e_i^* (U_i U_i^*)^{-1} ē_{a,i}} + μ² e_i^* (U_i U_i^*)^{-1} U_i Σ U_i^* (U_i U_i^*)^{-1} e_i    (38)

Using the relation e_i = e_{a,i} + v_i, we can eliminate e_i. Since most of the cross terms disappear under A.1) and expectation, we get

  E‖w̃_i‖²_Σ = E‖w̃_{i-1}‖²_{Σ'} + μ² E[v_i^* (U_i U_i^*)^{-1} U_i Σ U_i^* (U_i U_i^*)^{-1} v_i]    (39)

In addition, E[v_i^* (U_i U_i^*)^{-1} U_i Σ U_i^* (U_i U_i^*)^{-1} v_i] = σ_v² E[Tr(Σ U_i^* (U_i U_i^*)^{-2} U_i)]. Thus, we have

  E‖w̃_i‖²_Σ = E‖w̃_{i-1}‖²_{Σ'} + μ² σ_v² E[Tr(Σ U_i^* (U_i U_i^*)^{-2} U_i)]    (40)

where

  Σ' = Σ − μ P_i Σ − μ Σ P_i + μ² P_i Σ P_i,  with  P_i = U_i^* (U_i U_i^*)^{-1} U_i    (41)

Recursion (40) provides a compact characterization of the time evolution of the weight-error variance. However, recursion
(40) is still hard to propagate due to the presence of the expectation E‖w̃_{i-1}‖²_{Σ'}, with Σ' as in (41). This expectation is difficult to evaluate due to the dependence of Σ' on U_i and of w̃_{i-1} on prior regressors. One way to overcome this difficulty is to introduce an independence assumption on the regressor sequence {U_i}, namely, to assume the following.

A.3) The matrix sequence {U_i} is independent and identically distributed.

This assumption guarantees that Σ' is independent of both w̃_{i-1} and w̃_i. Clearly, A.3) is a strong assumption (it is actually stronger than the usual independence assumption, which only requires the individual regressors u_i to be i.i.d. [1], [2]). Observe, however, from (41) that it is sufficient for our purposes to require the following.

A.3') P_i is independent of w̃_{i-1}.

This is generally a weaker assumption. In this way, recursion (40) reduces to

  E‖w̃_i‖²_Σ = E‖w̃_{i-1}‖²_{Σ̄'} + μ² σ_v² Tr(Σ E[U_i^* (U_i U_i^*)^{-2} U_i])    (42)

where now

  Σ̄' = E[Σ'] = Σ − μ P̄ Σ − μ Σ P̄ + μ² E[P_i Σ P_i],  P̄ = E[P_i]    (43)

with expectations appearing in (43). In addition, taking expectations of both sides of (37) and using assumption A.1), we obtain the following result for the evolution of the mean of the weight-error vector:

  E[w̃_i] = (I − μ P̄) E[w̃_{i-1}]    (44)

Relations (42) and (44) can be used to derive conditions for mean-square stability, as well as expressions for the steady-state MSE and mean-square deviation (MSD) of the APA family. To see this, we introduce some notation. The vec(·) notation, e.g., σ = vec(Σ), allows us to replace an arbitrary M × M matrix Σ by an M² × 1 column vector whose entries are formed by stacking the successive columns of the matrix on top of each other. On the other hand, writing vec(σ) for an M² × 1 column vector σ results in the M × M matrix whose entries are obtained from σ. Therefore, we also write Σ = vec(σ). The vec(·) notation is convenient when working with Kronecker products. The Kronecker product of two matrices A and B, say of dimensions m × n and p × q, respectively, is denoted by A ⊗ B [20]. For any matrices {A, Σ, B} of compatible dimensions, it holds that

  vec(AΣB) = (B^T ⊗ A) vec(Σ)    (45)

Applying (45) to (43), we find that it leads to the vector relation

  σ' = Fσ    (46)

where the coefficient matrix F is M² × M² and defined by

  F = I − μ(I ⊗ P̄) − μ(P̄^T ⊗ I) + μ² E[P_i^T ⊗ P_i]    (47)

We can rewrite the recursion for E‖w̃_i‖²_Σ in (42) by using the vectors {σ, σ'} instead of the matrices {Σ, Σ'}, say, as

  E‖w̃_i‖²_{vec(σ)} = E‖w̃_{i-1}‖²_{vec(Fσ)} + μ² σ_v² γ^T σ    (48)

where, for the last term, we used the fact that Tr(Σ E[U_i^* (U_i U_i^*)^{-2} U_i]) = γ^T σ, with γ = vec(E[U_i^* (U_i U_i^*)^{-2} U_i]). For compactness of notation, we drop the vec notation from the subscripts and keep the vectors so that the above is simply rewritten as

  E‖w̃_i‖²_σ = E‖w̃_{i-1}‖²_{Fσ} + μ² σ_v² γ^T σ    (49)

In addition, we restate from (44) the result for the evolution of the mean of the weight-error vector:

  E[w̃_i] = (I − μ P̄) E[w̃_{i-1}]    (50)
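The identity (45) is the workhorse of what follows; here is a quick numerical sanity check of it (column-major vec, as assumed throughout; the matrices are arbitrary random choices of ours):

```python
import numpy as np

rng = np.random.default_rng(2)
A, S, B = (rng.standard_normal((3, 3)) for _ in range(3))
vec = lambda X: X.flatten(order="F")      # stack columns
assert np.allclose(vec(A @ S @ B), np.kron(B.T, A) @ vec(S))   # identity (45)
```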
Recursion (49) shows that, in order to evaluate E‖w̃_i‖²_σ, we need to know E‖w̃_{i-1}‖²_{Fσ}, i.e., the same quantity with a weighting matrix whose entries are determined by Fσ. Now, the quantity E‖w̃_{i-1}‖²_{Fσ} can be inferred from (49) by writing the recursion for the weighting vector Fσ, i.e.,

  E‖w̃_i‖²_{Fσ} = E‖w̃_{i-1}‖²_{F²σ} + μ² σ_v² γ^T Fσ

We again find that, in order to evaluate E‖w̃_i‖²_{Fσ}, we need to know E‖w̃_{i-1}‖²_{F²σ}. The natural question is whether this procedure terminates. Fortunately, as in [14] and [17], this procedure does terminate. This is because once we write (48) by substituting σ by F^{M²−1}σ, we get

  E‖w̃_i‖²_{F^{M²−1}σ} = E‖w̃_{i-1}‖²_{F^{M²}σ} + μ² σ_v² γ^T F^{M²−1}σ

where the weighting matrix on the RHS is determined by F^{M²}σ. This term can be deduced from the prior weighting factors. Indeed, let p(x) denote the characteristic polynomial of F,

  p(x) = det(xI − F) = x^{M²} + p_{M²−1} x^{M²−1} + ⋯ + p_1 x + p_0

It is a polynomial of order M² in x with coefficients {p_k}. Now, the Cayley–Hamilton theorem guarantees that p(F) = 0, so that

  F^{M²} = −p_{M²−1} F^{M²−1} − ⋯ − p_1 F − p_0 I    (51)

Theorem 1 [Transient Performance]: Under assumptions A.1) and A.3'), the transient performance of the APA family (2) for ε = 0 is described by the state recursion

  𝒲_i = 𝒜 𝒲_{i-1} + μ² σ_v² 𝒴    (52)
where

  𝒲_i = col{E‖w̃_i‖²_σ, E‖w̃_i‖²_{Fσ}, …, E‖w̃_i‖²_{F^{M²−1}σ}}
  𝒴 = col{γ^T σ, γ^T Fσ, …, γ^T F^{M²−1}σ}

and 𝒜 is the M² × M² companion matrix

  𝒜 = | 0    1    0   ⋯  0          |
      | 0    0    1   ⋯  0          |
      | ⋮                ⋮          |
      | 0    0    0   ⋯  1          |
      | −p_0 −p_1 −p_2 ⋯ −p_{M²−1}  |

Here, σ = vec(Σ), γ = vec(E[U_i^* (U_i U_i^*)^{-2} U_i]), and the {p_k} are the coefficients of the characteristic polynomial of F. Observe that the eigenvalues of 𝒜 coincide with those of F.

[TABLE III: Stability bounds computed by Theorem 2 (Gaussian input).]
[TABLE IV: Stability bounds computed by Theorem 2 (uniform input).]
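The quantities in Theorem 1 can be formed explicitly for small filter lengths. The sketch below is our own construction (white Gaussian regressors assumed; M kept tiny since F is M² × M², and the characteristic-polynomial route is numerically delicate for larger M): it estimates P̄ and E[P_i^T ⊗ P_i] by ensemble averaging, builds F as in (47), and checks that the companion matrix 𝒜 has the same eigenvalues as F:

```python
import numpy as np

rng = np.random.default_rng(3)
M, K, mu, n_mc = 3, 2, 0.5, 500

Pbar = np.zeros((M, M))
Qk = np.zeros((M * M, M * M))
for _ in range(n_mc):
    U = rng.standard_normal((K, M))
    P = U.T @ np.linalg.solve(U @ U.T, U)      # projection P_i
    Pbar += P / n_mc
    Qk += np.kron(P, P) / n_mc                 # P_i symmetric: P_i^T kron P_i

I_M = np.eye(M)
F = np.eye(M * M) - mu * (np.kron(I_M, Pbar) + np.kron(Pbar, I_M)) + mu**2 * Qk

p = np.poly(F)                                 # char. polynomial, leading 1 first
n = M * M
Acomp = np.zeros((n, n))                       # companion matrix of Theorem 1
Acomp[:-1, 1:] = np.eye(n - 1)
Acomp[-1, :] = -p[::-1][:-1]                   # last row: -p_0, ..., -p_{n-1}
print(np.allclose(np.sort(np.linalg.eigvals(Acomp).real),
                  np.sort(np.linalg.eigvals(F).real), atol=1e-6))
```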
C. Learning Curves

The learning curve of an adaptive filter describes the time evolution of the variance E|e_a(i)|². Now, if the u_i are assumed to be i.i.d., then

  E|e_a(i)|² = E‖w̃_{i-1}‖²_{R_u}

and the learning curve can be evaluated by computing E‖w̃_{i-1}‖²_{R_u} for each i. This task can be accomplished recursively from (48) by iterating it and setting σ = vec(R_u). This yields

  E‖w̃_i‖²_σ = ‖w̃_{-1}‖²_{F^{i+1}σ} + μ² σ_v² γ^T (I + F + ⋯ + F^i) σ    (53)

That is,

  E|e_a(i)|² = ‖w̃_{-1}‖²_{a_i} + b(i)    (54)

where the vector a_i and the scalar b(i) satisfy the recursions

  a_i = F a_{i-1},  a_0 = vec(R_u)
  b(i) = b(i−1) + μ² σ_v² γ^T a_{i-1},  b(0) = 0

[Fig. 1. Simulated MSE of APA as a function of the step size μ, for Gaussian (top) and uniform (bottom) inputs and K = 1, 2, 4, 8; in both cases the curves confirm the stability bound μ ≈ 2.]
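For completeness, here is how (54) can be iterated in code. The setup is our own: white Gaussian regressors (so that R_u = I), a hypothetical initial condition w_{−1} = 0 (so that w̃_{−1} = w^o, taken here as the all-ones vector), and F and γ estimated by ensemble averaging as in Section VI:

```python
import numpy as np

rng = np.random.default_rng(5)
M, K, mu, sig_v2, n_mc = 4, 2, 0.5, 1e-3, 1000

Pbar = np.zeros((M, M)); Qk = np.zeros((M*M, M*M)); gamma = np.zeros(M*M)
for _ in range(n_mc):
    U = rng.standard_normal((K, M))
    G = np.linalg.inv(U @ U.T)
    P = U.T @ G @ U
    Pbar += P / n_mc
    Qk += np.kron(P, P) / n_mc
    gamma += (U.T @ G @ G @ U).flatten(order="F") / n_mc  # vec(U^*(UU^*)^-2 U)

I_M = np.eye(M)
F = np.eye(M*M) - mu*(np.kron(I_M, Pbar) + np.kron(Pbar, I_M)) + mu**2 * Qk

w_tilde = np.ones(M)                  # assumed w^o, with w_{-1} = 0
a = I_M.flatten(order="F")            # sigma = vec(R_u) = vec(I)
b = 0.0
mse_db = []
for i in range(300):                  # MSE(i) = E|e_a(i)|^2 + sig_v2, via (54)
    Sigma = a.reshape(M, M, order="F")
    mse_db.append(10*np.log10(w_tilde @ Sigma @ w_tilde + b + sig_v2))
    b += mu**2 * sig_v2 * (gamma @ a)
    a = F @ a
print(mse_db[0], mse_db[-1])          # initial vs. steady-state MSE in dB
```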
D. Mean-Square Stability
From (50), the convergence in the mean of the APA family is guaranteed for any μ satisfying

  0 < μ < 2/λ_max(P̄)    (55)

Moreover, recursion (49) is stable if, and only if, the matrix F is stable. Thus, let

  P = (I ⊗ P̄) + (P̄^T ⊗ I)  and  Q = E[P_i^T ⊗ P_i]

so that F = I − μP + μ²Q. The following holds.

Theorem 2 [Stability]: The convergence in the mean-square sense of the APA family is guaranteed for any μ in the range

  0 < μ < 1/max{λ(P^{-1}Q) ∈ ℝ⁺}

where P = (I ⊗ P̄) + (P̄^T ⊗ I) and Q = E[P_i^T ⊗ P_i].

The above condition on μ is in terms of the largest positive eigenvalue of P^{-1}Q when it exists. The theorem is proved in Appendix B. By combining (55) and Theorem 2, a bound on the step size for both mean and mean-square stability is obtained. Theorem 2 provides an explicit and unified stability bound for a general class of input signals and various affine projection algorithms.
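A sketch of how the bound of Theorem 2 can be computed in practice: estimate P̄ and Q by ensemble averaging (white Gaussian regressors are our assumption here), form P, and take the largest positive eigenvalue of P^{-1}Q:

```python
import numpy as np

rng = np.random.default_rng(4)
M, K, n_mc = 8, 2, 1000

Pbar = np.zeros((M, M))
Q = np.zeros((M * M, M * M))
for _ in range(n_mc):
    U = rng.standard_normal((K, M))
    Pi = U.T @ np.linalg.solve(U @ U.T, U)
    Pbar += Pi / n_mc
    Q += np.kron(Pi, Pi) / n_mc

P = np.kron(np.eye(M), Pbar) + np.kron(Pbar, np.eye(M))
lam = np.linalg.eigvals(np.linalg.solve(P, Q)).real
print("mean-square stability bound on mu:", 1.0 / lam[lam > 1e-12].max())
```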
E. Steady-State Behavior
In the above, we used the variance relation (49) to characterize the transient behavior of the APA family in terms of a state recursion. We can use the same variance relation to shed further light on the mean-square performance of the APA family. In particular, we shall re-examine the EMSE, as well as study the mean-square deviation (MSD), which is defined as

  MSD = lim_{i→∞} E‖w̃_i‖²

Assuming the step-size is chosen to guarantee filter stability, recursion (49) becomes, in steady-state,

  E‖w̃_∞‖²_σ = E‖w̃_∞‖²_{Fσ} + μ² σ_v² γ^T σ    (56)
[Fig. 2. Learning curves of the APA family for colored Gaussian input using μ = 1.0 and D = 8. (a) K = 1. (b) K = 2. (c) K = 4. (d) K = 8. Input: Gaussian AR(1), pole at 0.9; system: FIR(16).]
[Fig. 3. Learning curves of the APA family for colored uniform input using μ = 1.0 and D = 8. (a) K = 1. (b) K = 2. (c) K = 4. (d) K = 8. Input: uniform AR(1), pole at 0.5; system: FIR(16).]
[Fig. 4. Comparison of learning curves for colored Gaussian input using K = 2, μ = 1.0, and D = 8. (a) Using (54). (b) Using the results of [10]. (c) Simulation. Input: Gaussian AR(1), pole at 0.9; system: FIR(16).]
[Fig. 5. Comparison of learning curves for colored Gaussian input using K = 4, μ = 1.0, and D = 8. (a) Using (54). (b) Using the results of [9]. (c) Simulation. Input: Gaussian AR(1), pole at 0.9; system: FIR(16).]
which is equivalent to

  E‖w̃_∞‖²_{(I−F)σ} = μ² σ_v² γ^T σ    (57)

We choose σ to reduce the weight in (57) to the identity matrix. Thus, it needs to be selected as the solution to the linear system of equations (I − F)σ = vec(I), i.e., σ = (I − F)^{-1} vec(I). In this case, the weighting quantity that appears in (57) reduces to the identity matrix. Then, the left-hand side of (57) becomes the filter MSD, and (57) leads to

  MSD = μ² σ_v² γ^T (I − F)^{-1} vec(I)    (58)

In a similar way, let us evaluate the EMSE of the APA family. Note that since

  EMSE = lim_{i→∞} E‖w̃_{i-1}‖²_{R_u}

we need to evaluate E‖w̃_∞‖²_{R_u}, where the weighting factor is vec(R_u). Assume we select σ as the solution to the linear system of equations (I − F)σ = vec(R_u). In this case, the weighting quantity that appears in (57) reduces to R_u. Then, the LHS of (57) becomes the filter EMSE, and (57) leads to the desired result

  EMSE = μ² σ_v² γ^T (I − F)^{-1} vec(R_u)    (59)
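A compact sketch of (58) and (59), assuming F, γ, and R_u have been formed as in the earlier sketches (the helper name is ours):

```python
import numpy as np

def steady_state_msd_emse(F, gamma, R_u, mu, sig_v2):
    """MSD (58) and EMSE (59):
    MSD  = mu^2 * sig_v2 * gamma^T (I - F)^{-1} vec(I)
    EMSE = mu^2 * sig_v2 * gamma^T (I - F)^{-1} vec(R_u)."""
    M = R_u.shape[0]
    vec = lambda X: X.flatten(order="F")
    solve = lambda rhs: np.linalg.solve(np.eye(M * M) - F, rhs)
    msd = mu**2 * sig_v2 * gamma @ solve(vec(np.eye(M)))
    emse = mu**2 * sig_v2 * gamma @ solve(vec(R_u))
    return msd, emse
```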
VI. SIMULATION RESULTS
We illustrate the theoretical results presented in this paper by carrying out computer simulations in a channel estimation scenario. The unknown channel has 16 taps and is randomly generated. Two different types of signals, viz., Gaussian and uniformly distributed signals, are used for the input signal u(i); viz.,

  u(i) = a·u(i−1) + x(i)

which is a first-order autoregressive (AR) process with a pole at a.
[Fig. 6. Comparison of learning curves for colored Gaussian input using K = 8, μ = 1.0, and D = 8. (a) Using (54). (b) Using the results of [9]. (c) Simulation. Input: Gaussian AR(1), pole at 0.9; system: FIR(16).]
[Fig. 7. Steady-state MSE curves of the APA family for colored Gaussian input using D = 1 in stationary environments. (a) K = 1. (b) K = 2. (c) K = 4. (d) K = 8. Input: Gaussian AR(1), pole at 0.9; system: FIR(16).]
[Fig. 8. Steady-state MSE curves of the APA family for colored Gaussian input using K = 4 in stationary environments. (a) D = 1. (b) D = 4. (c) D = 8. Input: Gaussian AR(1), pole at 0.9; system: FIR(16).]
For the Gaussian case, x(i) is a white, zero-mean Gaussian random sequence having unit variance, and a is set to 0.9. As a result, a highly colored Gaussian signal is generated. For the uniform case, x(i) is a uniform random sequence between −1.0 and 1.0, and a is set to 0.5. In Tables III and IV, we evaluate the bounds in (55) and Theorem 2. These tables indicate that the stability bound on μ is approximately μ ≈ 2 for both Gaussian input (which is consistent with [9]) and uniform input signals. This fact is further verified by simulation in Fig. 1, where MSE curves are plotted as a function of the step size. The expectations involved in evaluating the bounds are estimated via ensemble averaging.

[Fig. 9. Comparison of MSE expressions when K = 1 or K = 4 and D = 1 (curves: simulation, Eq. (20), Eq. (22), Eq. (23), Eq. (59)). Input: Gaussian AR(1), pole at 0.9; system: FIR(16).]
The signal-to-noise ratio (SNR) is calculated by

  SNR = 10 log_10 (E|ū(i)|² / E|v(i)|²)

where ū(i) = u_i w^o. The measurement noise v(i) is added to ū(i) such that SNR = 30 dB. The adaptive filter and the unknown channel are assumed to have the same number of taps. All adaptive filter coefficients are initialized to zero. In addition, the regularization parameter ε is set to 0.001. The simulation results shown are obtained by ensemble averaging over 200 independent trials.
[Fig. 10. Comparison of MSE when K = 2 and D = 1 (curves: (a) simulation, (b) Eq. (20), (c) Eq. (22), (d) Eq. (23), (e) Eq. (59), (f) [8]). Input: Gaussian AR(1), pole at 0.9; system: FIR(16).]
[Fig. 11. Steady-state MSE curves of the APA family for colored uniform input using D = 1 in stationary environments. (a) K = 1. (b) K = 2. (c) K = 4. (d) K = 8. Input: uniform AR(1), pole at 0.5; system: FIR(16).]
[Fig. 12. Steady-state MSE curves of the APA family for colored uniform input using K = 4 in stationary environments. (a) D = 1. (b) D = 4. (c) D = 8. Input: uniform AR(1), pole at 0.5; system: FIR(16).]
[Fig. 13. Steady-state MSE curves of the APA family for colored Gaussian input using D = 1 in nonstationary environments. (a) K = 1. (b) K = 2. (c) K = 4. (d) K = 8. Input: Gaussian AR(1), pole at 0.9; system: FIR(16).]
A. Transient Performance
Figs. 2–6 show the learning curves of the APA family. The step size is set to μ = 1.0, and the delay parameter D is set to 8. Fig. 2 shows how close the simulation results are to the theoretical results (54), where F and γ were evaluated via ensemble averaging. The theoretical results are very close to the simulated results, although there is some discrepancy when K = 8. In Fig. 3, the colored uniform input signal is used for the simulation. For generating the input signal, a is set to 0.5, unlike the Gaussian case. In Figs. 4–6, the learning curves in Fig. 2 are compared with the theoretical results in [9] and [10].
B. Steady-State Performance
Fig. 7 shows the steady-state MSE curves of the APA family for colored Gaussian input as a function of the step size. The step size varies from 0.04 to 1.0. This range guarantees stability, as mentioned before. The theoretical results are calculated using (22), and the simulation results are obtained by averaging more than 1000 instantaneous squared errors in steady-state and then averaging over 200 independent trials. The simulation results show good agreement with the theoretical results for small step sizes but deviate from the theoretical curves for larger step sizes and larger K. The theoretical MSE in [9] is almost the same as the curve corresponding to K = 1 in Fig. 7; the MSE expression in [9] is independent of K and is therefore not able to predict the variations in MSE as a function of K. Fig. 8 shows the steady-state MSE for different delay parameters D. As D increases, the MSE decreases. To compare the EMSE expressions in Sections III and V, theoretical MSE curves using (20), (22), (23), and (59) are plotted in Fig. 9. The EMSE curves using (20) and (22) show good agreement with the simulation results.
[Fig. 14. Steady-state MSE curves of the APA family for colored Gaussian input using K = 2 in nonstationary environments. (a) D = 1. (b) D = 2. (c) D = 4. Input: Gaussian AR(1), pole at 0.9; system: FIR(16).]
[Fig. 15. Steady-state MSE curves of the APA family for colored uniform input using D = 1 in nonstationary environments. (a) K = 1. (b) K = 2. (c) K = 4. (d) K = 8. Input: uniform AR(1), pole at 0.9; system: FIR(16).]
[Fig. 16. Steady-state MSE curves of the APA family for colored uniform input using K = 2 in nonstationary environments. (a) D = 1. (b) D = 2. (c) D = 4. Input: uniform AR(1), pole at 0.9; system: FIR(16).]
Fig. 10 shows a comparison of MSE with [10]. Figs. 11 and 12 present the results for a colored uniform input signal.

C. Tracking Performance

Figs. 13–16 show the steady-state MSE tracking performance of the APA family in a nonstationary environment. The steady-state tracking MSE in (31) is not a monotonically increasing function of μ. Therefore, there exists an optimal value of the step size that minimizes the MSE in the nonstationary case. To see this, the range of the step size is again set from 0.04 to 1.0. We are using an i.i.d. sequence q_i with autocorrelation matrix Q = σ_q² I. Fig. 13 shows the theoretical and simulated results for colored Gaussian input for different values of K. For a given K, there exists an optimal μ that minimizes the MSE, and for a given μ, there exists an optimal K that minimizes the MSE. Fig. 14 shows the tracking performance for different values of D. The simulation results show the dependence of the tracking performance on D. Figs. 15 and 16 show the theoretical and simulated results for a colored uniform input signal.

VII. CONCLUSIONS
In this paper, we carried out a rather detailed mean-square performance evaluation of the family of affine projection algorithms under the assumptions A.1), A.2), and A.3'). Using energy-conservation arguments, we were able to derive expressions for the steady-state mean-square error and mean-square deviation without restricting the distribution of the input data to being Gaussian or white and without assuming any particular model for the input signals. Both stationary and nonstationary environments were considered. We also characterized the transient behavior of the filters by means of a first-order state-space model, whose stability was shown to determine the mean-square stability of the adaptive filter. Several simulation results were included to illustrate the application of the theory. In particular, it was seen that there is a relatively good match between theory and practice.
APPENDIX A
EVALUATION OF E[e_{a,i} e_{a,i}^*]

Recall that the a priori and a posteriori error vectors are defined by

  e_{a,i} = U_i w̃_{i-1} = col{u_i w̃_{i-1}, u_{i-1} w̃_{i-1}, …, u_{i-K+1} w̃_{i-1}}
  e_{p,i} = U_i w̃_i = col{u_i w̃_i, u_{i-1} w̃_i, …, u_{i-K+1} w̃_i}

where we are assuming D = 1 and ε = 0 without loss of generality. From (6), we know that

  e_{p,i} = e_{a,i} − μe_i = (1 − μ)e_{a,i} − μv_i

when ε is small. Then, the following relations hold for the entries of e_{a,i}: for j = 1, …, K − 1, the jth entry of e_{a,i} coincides with the (j − 1)th entry of e_{p,i-1}, so that

  [e_{a,i}]_j = (1 − μ)[e_{a,i-1}]_{j-1} − μv(i − j)

From these relations, we also get

  [e_{a,i}]_j = (1 − μ)^j e_a(i − j) − (1 − (1 − μ)^j) v(i − j)

but since, in steady-state, E|e_a(i − j)|² ≈ E|e_a(i)|², and neglecting the off-diagonal terms of E[e_{a,i} e_{a,i}^*], we find that

  E[e_{a,i} e_{a,i}^*] ≈ Λ_1 E|e_a(i)|² + Λ_2 σ_v²    (60)

where the diagonal matrices (Λ_1, Λ_2) are given by

  Λ_1 = diag{1, (1 − μ)², …, (1 − μ)^{2(K−1)}}
  Λ_2 = diag{0, (1 − (1 − μ))², …, (1 − (1 − μ)^{K−1})²}

Note that when μ is small, Λ_1 ≈ I and Λ_2 ≈ 0. In addition, when μ is close to 1 and when the SNR is high, the term Λ_2 σ_v² is negligible relative to Λ_1 E|e_a(i)|², so that (60) agrees with our assumption A.2). Expression (60) suggests that other choices for Λ are possible for assumption A.2). However, simulations show that the simpler conditions in A.2) lead to good results.

APPENDIX B
PROOF OF THEOREM 2

From properties of Kronecker products, we know that the eigenvalues of P_i^T ⊗ P_i are all the products λ_j(P_i)λ_k(P_i), where the λ_j(P_i) denote the eigenvalues of P_i. Since each P_i is a projection matrix, its eigenvalues are non-negative, so Q = E[P_i^T ⊗ P_i] is non-negative definite. Moreover, since the covariance matrix of the regressors is positive definite, P = (I ⊗ P̄) + (P̄^T ⊗ I) is positive definite. Now, we want to determine conditions on μ in order to guarantee the stability of F = I − μP + μ²Q. Following the same argument used in [17, App. A], we can establish the condition

  0 < μ < 1/max{λ(P^{-1}Q) ∈ ℝ⁺}

ACKNOWLEDGMENT

The authors would like to thank Prof. W.-J. Song for his support of the first author's visit to the UCLA Adaptive Systems Laboratory.

REFERENCES
[1] B. Widrow and S. D. Stearns, Adaptive Signal Processing. Englewood
Cliffs, NJ: Prentice-Hall, 1985.
[2] S. Haykin, Adaptive Filter Theory, 3rd ed. Upper Saddle River, NJ: Prentice-Hall, 1996.
[3] K. Ozeki and T. Umeda, “An adaptive filtering algorithm using an orthogonal projection to an affine subspace and its properties,” Electron.
Commun. Jpn., vol. 67-A, no. 5, pp. 19–27, 1984.
[4] S. L. Gay and J. Benesty, Acoustic Signal Processing for Telecommunication. Boston, MA: Kluwer, 2000.
[5] S. G. Kratzer and D. R. Morgan, “The partial-rank algorithm for adaptive
beamforming,” in Proc. SPIE Int. Soc. Opt. Eng., vol. 564, 1985, pp.
9–14.
[6] M. Rupp, “A family of adaptive filter algorithms with decorrelating
properties,” IEEE Trans. Signal Processing, vol. 46, pp. 771–775, Mar.
1998.
[7] S. G. Sankaran and A. A. (Louis) Beex, “Normalized LMS algorithm
with orthogonal correction factors,” in Proc. 31st Annu. Asilomar Conf.
Signals, Syst., Comput., Pacific Grove, CA, Nov. 1997, pp. 1670–1673.
[8] D. R. Morgan and S. G. Kratzer, “On a class of computationally efficient, rapidly converging, generalized NLMS algorithms,” IEEE Signal
Processing Lett., vol. 3, pp. 245–247, Aug. 1996.
[9] S. G. Sankaran and A. A. (Louis) Beex, “Convergence behavior of
affine projection algorithms,” IEEE Trans. Signal Processing, vol. 48,
pp. 1086–1096, Apr. 2000.
[10] J. Apolinário, Jr., M. L. R. Campos, and P. S. R. Diniz, “Convergence analysis of the binormalized data-reusing LMS algorithm,” IEEE Trans. Signal Processing, vol. 48, pp. 3235–3242, Nov. 2000.
[11] N. J. Bershad, D. Linebarger, and S. McLaughlin, “A stochastic analysis of the affine projection algorithm for Gaussian autoregressive inputs,” in Proc. ICASSP, Salt Lake City, UT, 2001, pp. 3837–3840.
[12] A. H. Sayed and M. Rupp, “A time-domain feedback analysis of adaptive algorithms via the small gain theorem,” Proc. SPIE, vol. 2563, pp.
458–469, July 1995.
[13] M. Rupp and A. H. Sayed, “A time-domain feedback analysis of filtered-error adaptive gradient algorithms,” IEEE Trans. Signal Processing, vol. 44, pp. 1428–1439, June 1996.
[14] A. H. Sayed, Fundamentals of Adaptive Filtering. New York: Wiley,
2003.
[15] N. R. Yousef and A. H. Sayed, “A unified approach to the steady-state and tracking analyses of adaptive filters,” IEEE Trans. Signal Processing, vol. 49, pp. 314–324, Feb. 2001.
[16] ——, “Ability of adaptive filters to track carrier offsets and random channel nonstationarities,” IEEE Trans. Signal Processing, vol. 50, pp. 1533–1544, July 2002.
[17] T. Y. Al-Naffouri and A. H. Sayed, “Transient analysis of data-normalized adaptive filters,” IEEE Trans. Signal Processing, vol. 51, pp.
639–652, Mar. 2003.
[18] ——, “Transient analysis of adaptive filters with error nonlinearities,” IEEE Trans. Signal Processing, vol. 51, pp. 653–663, Mar. 2003.
[19] E. Eweda, “Comparison of RLS, LMS, and sign algorithms for tracking
randomly time-varying channels,” IEEE Trans. Signal Processing, vol.
42, pp. 2937–2944, Nov. 1994.
[20] A. Graham, Kronecker Products and Matrix Calculus With Applications. New York: Halsted, 1981.
Hyun-Chool Shin was born in Seoul, Korea, in
1974. He received the B.Sc. and M.Sc. degrees in
electronic and electrical engineering from Pohang
University of Science and Technology (POSTECH),
Pohang, Korea, in 1997 and 1999, respectively.
Since 1997, he has been a Research Assistant
with the Department of Electronic and Electrical
Engineering, POSTECH, where he is currently
pursuing the Ph.D. degree.
His research interests include adaptive filter
theory and methods applied to channel equalization
and identification.
Ali H. Sayed (F’01) received the Ph.D. degree
in electrical engineering in 1992 from Stanford
University, Stanford, CA.
He is currently Professor and Vice Chair of
electrical engineering at the University of California,
Los Angeles. He is also the Principal Investigator of the UCLA Adaptive Systems Laboratory
(www.ee.ucla.edu/asl). He has over 190 journal and
conference publications, is the author of the textbook
Fundamentals of Adaptive Filtering (New York:
Wiley, 2003), is coauthor of the research monograph
Indefinite Quadratic Estimation and Control (Philadelphia, PA: SIAM, 1999)
and of the graduate-level textbook Linear Estimation (Englewood Cliffs,
NJ: Prentice-Hall, 2000). He is also co-editor of the volume Fast Reliable
Algorithms for Matrices with Structure (Philadelphia, PA: SIAM, 1999). He
is a member of the editorial boards of the SIAM Journal on Matrix Analysis
and Its Applications and the International Journal of Adaptive Control and
Signal Processing and has served as coeditor of special issues of the journal
Linear Algebra and Its Applications. He has contributed several articles to
engineering and mathematical encyclopedias and handbooks and has served
on the program committees of several international meetings. He has also
consulted with industry in the areas of adaptive filtering, adaptive equalization,
and echo cancellation. His research interests span several areas, including
adaptive and statistical signal processing, filtering and estimation theories,
signal processing for communications, interplays between signal processing
and control methodologies, system theory, and fast algorithms for large-scale
problems.
Dr. Sayed is the recipient of the 1996 IEEE Donald G. Fink Award and of a 2002 Best Paper Award from the IEEE Signal Processing Society in the area of Signal Processing Theory and Methods, and he is coauthor of two papers that received Best Student Paper awards at international meetings. He is also a member of the technical committees on
at international meetings. He is also a member of the technical committees on
Signal Processing Theory and Methods (SPTM) and on Signal Processing for
Communications (SPCOM), both of the IEEE Signal Processing Society. He is
a member of the editorial board of the IEEE SIGNAL PROCESSING MAGAZINE.
He has also served twice as Associate Editor of the IEEE TRANSACTIONS ON
SIGNAL PROCESSING, of which he is now serving as Editor-in-Chief.