INSTITUTE OF PHYSICS PUBLISHING — JOURNAL OF PHYSICS A: MATHEMATICAL AND GENERAL
J. Phys. A: Math. Gen. 38 (2005) 10859–10872, doi:10.1088/0305-4470/38/49/024

Random matrix theory of multi-antenna communications: the Ricean channel

Aris L Moustakas (1,3) and Steven H Simon (2)
(1) Department of Physics, University of Athens, Panepistimiopolis, Athens 15784, Greece
(2) Bell Labs, Lucent Technologies, 600 Mountain Avenue, Murray Hill, NJ 07974, USA

Received 25 July 2005, in final form 26 July 2005
Published 22 November 2005
Online at stacks.iop.org/JPhysA/38/10859

Abstract

The use of multi-antenna arrays in wireless communications through disordered media promises huge increases in the information transmission rate. It is therefore important to analyse the information capacity of such systems in realistic situations of microwave transmission, where the statistics of the transmission amplitudes (channel) may be coloured. Here, we present an approach that provides analytic expressions for the statistics, i.e. the moments of the distribution, of the mutual information for general Gaussian channel statistics. The mathematical method applies tools developed originally in the context of coherent wave propagation in disordered media, such as random matrix theory and replicas. Although it is valid formally for large antenna numbers, this approach produces extremely accurate results even for arrays with as few as two antennas. We also develop a method to analytically optimize over the input signal distribution, which enables us to calculate analytic capacities when the transmitter has knowledge of the statistics of the channel. The emphasis of this paper is on elucidating the novel mathematical methods used. We do this by analysing a specific case when the channel matrix is a complex Gaussian with arbitrary mean and unit covariance, which is usually called the Ricean channel.

PACS numbers: 84.40.Ba, 02.10.Yn, 07.50.Qx
1. Introduction

Recently, there has been increased interest in using multi-antenna arrays simultaneously in transmission and reception of microwave signals. Recent theoretical work in [1] and [2] has shown that for sufficiently rich scattering environments the Shannon capacity of an n_t-element transmitting and an n_r-element receiving array is roughly proportional to min(n_r, n_t) for large numbers of antennas. This is significantly more than the usually logarithmic increase in capacity with increasing antenna numbers when only a single path (plane wave) connects the transmission and reception arrays (line of sight). Intuitively, one can understand this result by observing that if the scattering is rich enough, then there is an independent channel from each transmission antenna to each reception antenna. Therefore, one can send independent signals from each transmission antenna. Both indoor [3] and outdoor [4] measurements have shown promising throughput gains for such MIMO (multiple-input multiple-output) technologies.

(3) Author to whom correspondence should be addressed.
0305-4470/05/4910859+14$30.00 © 2005 IOP Publishing Ltd Printed in the UK

It is therefore important to be able to assess the capacity gains of MIMO technologies in realistic situations. Such situations include the case where spatial correlations exist between transmission and/or reception antennas, which tend to reduce the effective number of independent channels (paths) between the transmitter and the receiver [5]. Another interesting situation corresponds to the case where the transmitter has partial knowledge of the channel itself. This may arise when the channel feedback to the transmitter from the receiver (who is assumed to always know the channel) is impaired or noisy. This channel is usually called the Ricean channel in the literature.

For large antenna numbers and Gaussian channel distributions, the analysis of MIMO ergodic capacities (the expectation value of the mutual information over channel realizations) is greatly facilitated by asymptotic techniques of random matrix theory (RMT). These methods were introduced in this context by various authors, starting with Foschini [1] and Telatar [2]. Verdú and Shamai [6] derived the expression of the capacity for infinite antennas with uncorrelated channels in the context of CDMA codes, and more recently Rapajic and Popescu [7] applied it in the context of multi-antenna systems.
Lozano and Tulino [8], using methods developed by Tse and Hanly [9], calculated the infinite-antenna-number capacity with spatially uncorrelated channels and uncorrelated interferers. Also, Chuah et al [10] extended the results of [6] to calculate the mutual information of spatially correlated channels in the infinite antenna limit. In all previous studies, the capacity has been assumed to be asymptotically proportional to min(n_t, n_r), and only the corresponding proportionality factor was studied as the number of antennas grows indefinitely. However, for finite antenna numbers, or for slowly decaying spatial correlations, the capacity cannot be simply described as linear in the antenna number. For example, as shown in figure 2 of [5], for square arrays with λ/2 spacing the capacity per antenna does not reach the limiting value even at 10 000 antennas per array! In [5], using techniques developed by Sengupta and Mitra [11] in a different context, a method was developed to analytically calculate the capacity of spatially correlated channels for large but finite antenna numbers.

In this paper we extend work done in [5, 12, 13] to provide analytic expressions for the statistics of the mutual information in the case of the Ricean channel. We apply a method from physics, known as the replica approach, to analyse the resulting random matrix problems. The replica approach, first introduced in [14], has been used heavily in physics for understanding random systems [15, 16]. One of the first applications of this method in communication theory was by Sourlas [17] in the field of error-correcting codes. More recently, this method has seen increased application in the field of information theory [18–20]. We will use the replica method to average over the channel realizations and obtain moments of the distribution of the mutual information.

We find that for large antenna numbers n, the second and third moments of the mutual information are of O(1) and O(1/n), respectively, which shows that the mutual information distribution approaches a Gaussian. It should be noted that recently the generating function of the mutual information for the Ricean channel [21, 22], as well as some other channels [23], was calculated in closed form using other methods. Nevertheless, the replica method produces far simpler equations, while the equations resulting from other methods become increasingly difficult to evaluate for large antenna numbers. Surprisingly, the replica method also gives accurate results when applied to arrays with even
few (two or three) antennas. Thus, this analytic approach provides a powerful tool for analysing antenna systems with even a few antennas.

In the remainder of this section we provide some notational definitions used in this paper (section 1.1) and define several quantities of interest (section 1.2). In section 2 we describe the mathematical framework of the methods used to calculate the statistics of the mutual information. In the next section (section 3) we maximize over the input signal distribution to calculate the maximum mutual information (channel capacity).
In section 4 we discuss the Gaussian character of the distribution by presenting a few numerical examples. Finally, in the appendix we provide without proof several useful identities regarding complex integrals.

1.1. Notation

1.1.1. Vectors/matrices. Throughout this paper we use bold-faced upper-case letters to denote matrices, e.g. X, with elements X_{aα}; bold-faced lower-case letters for column vectors, e.g. x, with elements x_a; and non-bold lower-case letters for scalar quantities. The superscripts T and † indicate the transpose and Hermitian conjugate operations, and I_n represents the n-dimensional identity matrix. In addition, K = M ⊗ N denotes the outer product between the m × m-dimensional matrix M and the n × n-dimensional matrix N.

1.1.2. Order of number of antennas O(n^k). We will be examining quantities in the limit where all of n_t, n_r, n_i are large but their ratios are fixed and finite. We denote collectively the order in an expansion over the antenna numbers as O(n), O(1), O(1/n), etc, irrespective of whether the particular term involves n_t or n_r.

1.1.3. Integral measures. In this paper we deal with two general types of integrals over matrix elements, and adopt the following notation for their integration measures. In the first type we integrate over the real and imaginary parts of the elements of a complex m_rows × m_cols matrix X. The integral measure is denoted by

DX = \prod_{a=1}^{m_{rows}} \prod_{\alpha=1}^{m_{cols}} \frac{d\,\mathrm{Re}\,X_{a\alpha}\, d\,\mathrm{Im}\,X_{a\alpha}}{2\pi}.    (1)

The second type of integration is over pairs of complex square matrices T and R. Each element of T and R is integrated over a contour in the complex plane (to be specified). The corresponding measure is

d\mu(T, R) = \prod_{a=1}^{m_{rows}} \prod_{\alpha=1}^{m_{cols}} \frac{dT_{a\alpha}\, dR_{\alpha a}}{2\pi i}.    (2)

1.2. Definitions

We consider the case of single-user transmission from n_t transmit antennas to n_r receive antennas over a narrowband fading channel.
The received n_r-dimensional complex signal vector y can be written as

y = \sqrt{\frac{\rho}{n_t}}\, G x + z,    (3)

where G is an n_r × n_t complex matrix with the channel coefficients from the transmitting to the receiving arrays, while x and z are n_t- and n_r-dimensional vectors of the transmitted signal and the additive noise, respectively, both assumed to be zero-mean Gaussian. The signal covariance Q = E[x x†] is normalized so that Tr Q = n_t. For simplicity, the noise vector z is assumed to be white with unit covariance, E[z z†] = I_{n_r}. Finally, ρ is the signal-to-noise ratio (SNR). It is assumed that the receiver knows the channel matrix G and ρ. The transmitter, on the other hand, knows only the statistics of the noise, p(y|{x, G}), as well as the statistics of the channel p(G). The associated mutual information, i.e. the reduction of the entropy of the random variable x given knowledge of the received signal y, can be expressed as [1]

I(y; x|G) = h(x|G) - h(x|y, G) = \log\det\left(I_{n_r} + \frac{\rho}{n_t}\, G Q G^\dagger\right).    (4)

The log above (and throughout the whole paper) represents the natural logarithm, and thus I is expressed in nats. Due to the underlying randomness of G, I is also a random quantity. The average mutual information can then be expressed as

\langle I(y; x|G)\rangle = \left\langle \log\det\left(I_{n_r} + \frac{\rho}{n_t}\, G Q G^\dagger\right)\right\rangle.    (5)

When this is maximized over the signal covariance Q, one obtains the capacity of the channel. This is the maximum error-free information transmission rate when the channel matrix G varies over its whole distribution p(G), which is why it is also called the ergodic capacity. Another important metric of the information capacity of the channel, the so-called outage capacity [24], is obtained by inverting, with respect to I_out, the expression

P_{out} = \mathrm{Prob}(I < I_{out}),    (6)

where Prob(I < I_out) is the probability that the mutual information is less than a given value I_out.
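As a concrete illustration of (4)-(6), the mean mutual information and an outage point can be estimated by direct Monte Carlo. The sketch below (Python/NumPy) uses purely illustrative assumptions not taken from the paper: n_t = n_r = 4, ρ = 10, Q = I, unit-variance channel entries, and a rank-one unit-modulus mean anticipating the Ricean statistics introduced next.

```python
import numpy as np

# Monte Carlo sketch of (4)-(6).  Illustrative assumptions (not from the
# paper): n_t = n_r = 4, rho = 10, Q = I, unit-variance channel entries,
# Ricean factor K = 1 with a rank-one unit-modulus mean.
rng = np.random.default_rng(0)
n_t = n_r = 4
rho, K, n_samples = 10.0, 1.0, 20_000
p_c, p_d = K / (1 + K), 1 / (1 + K)
mu = np.outer(np.exp(2j * np.pi * rng.random(n_r)),
              np.exp(2j * np.pi * rng.random(n_t)))          # line-of-sight mean

W = (rng.standard_normal((n_samples, n_r, n_t))
     + 1j * rng.standard_normal((n_samples, n_r, n_t))) / np.sqrt(2)
G = np.sqrt(p_c) * mu + np.sqrt(p_d) * W                     # channel realizations

# I(y; x | G) = log det(I + (rho/n_t) G Q G^dag) in nats, with Q = I
M = np.eye(n_r) + (rho / n_t) * (G @ G.conj().swapaxes(-1, -2))
I_vals = np.linalg.slogdet(M)[1]
I_mean = I_vals.mean()                    # estimate of <I> in (5)
I_out_10 = np.quantile(I_vals, 0.10)      # I_out with P_out = 0.1 in (6)
```

The empirical spread of `I_vals` also gives a first look at the fluctuations whose cumulants are computed analytically in section 2.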
In this paper we will analyse the case where G has Gaussian statistics with a non-zero mean and unit covariance, i.e. with

\langle G_{a\alpha}\rangle = \mu_{a\alpha}\sqrt{p_c}    (7)

\langle G_{a\alpha} G_{b\beta}^{*}\rangle - \langle G_{a\alpha}\rangle\langle G_{b\beta}\rangle^{*} = p_d\, \delta_{ab}\,\delta_{\alpha\beta}.    (8)

In the above, p_d signifies the fraction of the power due to the diffuse (ergodic) components of the channel, while p_c = 1 - p_d is the stationary part of the channel, which is not averaged over. These are related to the commonly used Ricean factor K by K = p_c/p_d. We will also use the notation ρ_d = ρ p_d and ρ_c = ρ p_c for the diffuse and stationary fractions of the signal-to-noise ratio. The simple form of the covariance in (8) has been shown [5, 25] to be the leading term in a controlled approximation for diffusive environments. More complicated correlations between matrix elements have been analysed elsewhere [13]. The non-zero mean of G can be thought of as arising from a non-diffusive, non-ergodic term in the propagation process, such as a line-of-sight component. We define the notation ⟨·⟩ to denote the ensemble average over channel realizations. Thus for O(G), an arbitrary function of G, we have

\langle O\rangle = \int DG\, \exp\left(-\tfrac{1}{2}\,\mathrm{Tr}\{(G-\mu)(G-\mu)^\dagger\}\right) O.    (9)

2. Mathematical framework

The purpose of this paper is to analyse the statistics of the mutual information I in (4) for Gaussian channels with statistics given by (7) and (8). In this section we introduce the mathematical framework necessary for deriving analytic expressions for the cumulant moments of I. This method was introduced in this context in [12, 13]. We first introduce the generating function g(ν) of I,

g(\nu) = \left\langle \det\left(I_{n_r} + \frac{\rho}{n_t}\, G Q G^\dagger\right)^{-\nu}\right\rangle = \langle e^{-\nu I}\rangle = 1 - \nu\langle I\rangle + \frac{\nu^2}{2}\langle I^2\rangle + \cdots.    (10)

Assuming that g(ν) is analytic at least in the vicinity of ν = 0, we can express log g(ν) as follows:

\log g(\nu) = -\nu\langle I\rangle + \sum_{p=2}^{\infty} \frac{(-\nu)^p}{p!}\, C_p,    (11)

where C_p is the pth cumulant moment of I.
For example, C_2 = Var(I) = ⟨(I - ⟨I⟩)²⟩ is the variance and C_3 = Sk(I) = ⟨(I - ⟨I⟩)³⟩ is the skewness of the distribution. It should be noted that ⟨I⟩ and the C_p for p ≥ 2 are implicit functions of Q and ρ; for notational simplicity we will be suppressing this dependence.

Thus, to obtain the moments of the mutual information distribution we need to calculate g(ν) for ν in the vicinity of ν = 0. This is not necessarily any easier than evaluating the moments C_p directly, which is a notoriously difficult task, since one has to average products of logarithms of random quantities. In contrast, averaging g(ν) for integer values of ν involves averages over integer powers of determinants of random quantities, in which case some analytic progress can be made. We will therefore make the following assumption:

Assumption 1 (replica method). g(ν) evaluated for positive integer values of ν can be analytically continued to real ν, specifically in the vicinity of ν = 0+.

This assumption, used also in [15, 16, 20], alleviates the problem of dealing with averages of logarithms of random quantities, since the logarithm is obtained after calculating g(ν). It has seen widespread use in the field of physics for more than 25 years [14] and, in many cases [11], can be shown to produce exactly the same results as systematic series expansions. Thus, the replica method can be seen as essentially a bookkeeping tool. Here, we will use it without any direct proof, although we will compare some of our final results to Monte Carlo simulations to demonstrate their validity. Therefore, in the following analysis we will assume ν to be an arbitrary positive integer. Using identity 1 in the appendix we can write g(ν) as

g(\nu) = \int DX\, e^{-\frac{1}{2}\mathrm{Tr}\{X^\dagger X\}} \left\langle \exp\left(-\frac{\rho}{2 n_t}\,\mathrm{Tr}\{X^\dagger G Q G^\dagger X\}\right)\right\rangle,    (12)

where X is a complex n_r × ν matrix.
The bracketed quantity in (12) can be rewritten, using identity 2 (with A = B = \sqrt{\rho/n_t}\, Q^{1/2} G^\dagger X, where Q^{1/2} is the matrix square root of Q, which is well defined since Q is non-negative definite), by introducing an integral over a complex n_t × ν matrix Y:

\exp\left(-\frac{\rho}{2 n_t}\,\mathrm{Tr}\{X^\dagger G Q G^\dagger X\}\right) = \int DY\, e^{-\frac{1}{2}\mathrm{Tr}\{Y^\dagger Y\}} \exp\left(-\sqrt{\frac{\rho}{4 n_t}}\,\mathrm{Tr}\{X^\dagger G Q^{1/2} Y - Y^\dagger Q^{1/2} G^\dagger X\}\right).    (13)

At this point we can use identity 2 to integrate over G. Combining (9), (12) and (13), g(ν) can be expressed as

g(\nu) = \int DX\, DY\, \exp\left(-\frac{1}{2}\,\mathrm{Tr}\left\{X^\dagger X + Y^\dagger Y + \frac{\rho_d}{n_t}\, Y^\dagger Q Y X^\dagger X\right\}\right) \exp\left(-\sqrt{\frac{\rho_c}{4 n_t}}\,\mathrm{Tr}\{X^\dagger \mu Q^{1/2} Y - Y^\dagger Q^{1/2} \mu^\dagger X\}\right).    (14)

To make progress, we now need to express the term in the exponent proportional to ρ_d as a quadratic form in X, Y. This can be done by using identity 3 and introducing ν × ν matrices T, R. Thus the last term in the exponent of (14) becomes

\exp\left(-\frac{\rho_d}{2 n_t}\,\mathrm{Tr}\{Y^\dagger Q Y X^\dagger X\}\right) = \int d\mu(T, R)\, \exp\left(\mathrm{Tr}\{T R\} - \sqrt{\frac{\rho_d}{4 n_t}}\,\mathrm{Tr}\{T\, Y^\dagger Q Y + X^\dagger X\, R\}\right).    (15)

The application of identity 3 and the introduction of the matrices R and T is a particular form of a Hubbard–Stratonovich transformation. The usefulness of this method is that it allows the integration of certain quantities (X, Y) of limited relevance, and introduces auxiliary quantities (such as R and T) which will prove to have particular importance in the final answer. This is in a sense a mean-field theory approach, which will end up being exact in the large-n limit. Combining (14) and (15) and using identity 1, we can now integrate out X and Y, resulting in

g(\nu) = \int d\mu(T, R)\, e^{-S}    (16)

where

S = -\mathrm{Tr}\{T R\} + \log\det\left(I_{n_r} \otimes I_\nu + \sqrt{\frac{\rho_d}{n_t}}\, (I_{n_r} \otimes R)\right) + \log\det\left(I_{n_t} \otimes I_\nu + \sqrt{\frac{\rho_d}{n_t}}\, Q \otimes T + \frac{\rho_c}{n_t}\, Q^{1/2}\mu^\dagger\mu Q^{1/2} \otimes \left(I_\nu + \sqrt{\frac{\rho_d}{n_t}}\, R\right)^{-1}\right).    (17)

In the above, ν is still a positive integer, which should be taken to zero following assumption 1. However, before doing this, we will first take the limit of large antenna numbers n_t, n_r ≫ 1. In this limit the saddle-point method of evaluating the integral (described below) becomes accurate.
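Identity 1, which produced (12) and is used again to integrate out X and Y in (16), can be spot-checked numerically: averaging the Gaussian-weighted exponential over X reproduces the inverse determinant raised to the power ν. This is a minimal sketch under stated assumptions (arbitrary sizes n_r = n_t = 3, ν = 2, ρ = 1, Q = I); the real and imaginary parts of X are drawn as N(0,1), matching the weight exp(−½ Tr{X†X}) and the measure (1).

```python
import numpy as np

# Spot-check of identity 1 as used in (12): for a fixed G (with Q = I),
#   E_X[ exp(-(rho/2 n_t) Tr{X^dag G G^dag X}) ] = det(I + (rho/n_t) G G^dag)^(-nu),
# where X is n_r x nu with Re/Im entries ~ N(0,1), i.e. weight exp(-Tr{X^dag X}/2).
rng = np.random.default_rng(4)
n_r, n_t, nu, rho = 3, 3, 2, 1.0
G = (rng.standard_normal((n_r, n_t)) + 1j * rng.standard_normal((n_r, n_t))) / np.sqrt(2)
A = (rho / n_t) * G @ G.conj().T          # fixed Hermitian PSD matrix

n_samples = 400_000
X = (rng.standard_normal((n_samples, n_r, nu))
     + 1j * rng.standard_normal((n_samples, n_r, nu)))
quad = np.einsum('sia,ij,sja->s', X.conj(), A, X).real   # Tr{X^dag A X} per sample
mc_estimate = np.mean(np.exp(-0.5 * quad))
exact = np.linalg.det(np.eye(n_r) + A).real ** (-nu)
rel_err = abs(mc_estimate - exact) / exact
```

The agreement improves as 1/sqrt(n_samples); the same Gaussian-integral mechanism, applied in reverse, is what converts the quartic term in (14) into the T, R representation of (16).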
Subsequently, we will take the ν → 0+ limit.

Assumption 2 (interchanging two limits). The limits n → ∞ and ν → 0+ in evaluating g(ν) in (16) can be interchanged, by first taking the former and then the latter, without changing the final answer.

As we shall see below, the two limits of large antenna numbers and small ν are related: higher terms in the expansion in ν involve successively higher terms in a 1/n expansion.

We will now use the saddle-point method to calculate the above integral. In the case of (16) and (17), when n_t, n_r ≫ 1, the exponent S is nominally of order O(νn). Following assumption 2, and thus keeping ν fixed, we may apply the saddle-point approximation to calculate the integral of (16) and then take the ν → 0+ limit. It should be stressed that for a fixed positive integer ν, the saddle-point analysis of S and g(ν) is a straightforward exercise in asymptotic analysis. The only additional complexity relative to the standard textbook treatment of this topic [26] is that S involves integrals over multiple variables (the elements of T, R). Rather than search for saddle-point solutions over all possible complex T and R matrices, we are going to invoke the following hypothesis without proof.

Assumption 3 (replica invariance). The relevant saddle-point solution for (16) involves matrices T, R which are invariant in replica space and thus are proportional to the identity matrix I_ν.

The above hypothesis, used heavily in physics [15, 16], basically states that there is no preferred direction in the space of replicas, and thus if any saddle-point solution is valid, so is any unitary transformation thereof in replica space. Although we are not going to provide a proof here, it should be noted that in [11] it is shown that the results obtained by this method are identical to those using a systematic expansion.
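The analytic continuation of assumption 1 can at least be probed numerically: g(ν) = ⟨e^{−νI}⟩ is directly computable by Monte Carlo at a small real ν, and should match the truncated cumulant expansion (11). A minimal sketch, with illustrative choices n_t = n_r = 3, ρ = 5, ν = 0.1 and an i.i.d. zero-mean channel (none of these numbers come from the paper):

```python
import numpy as np

# Numerical probe of (10)-(11): estimate g(nu) = <exp(-nu I)> by Monte
# Carlo at a small real nu and compare log g(nu) with the truncated
# cumulant expansion -nu <I> + (nu^2/2) C_2.
rng = np.random.default_rng(3)
n, rho, nu, n_samples = 3, 5.0, 0.1, 100_000
G = (rng.standard_normal((n_samples, n, n))
     + 1j * rng.standard_normal((n_samples, n, n))) / np.sqrt(2)
M = np.eye(n) + (rho / n) * (G @ G.conj().swapaxes(-1, -2))
I_vals = np.linalg.slogdet(M)[1]          # mutual information per realization

log_g = np.log(np.mean(np.exp(-nu * I_vals)))
expansion = -nu * I_vals.mean() + 0.5 * nu ** 2 * I_vals.var()
gap = abs(log_g - expansion)              # truncation error, O(nu^3)
```

The residual gap is dominated by the ν³ skewness term of (11), consistent with the claim that C_3 is already small for a handful of antennas.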
2.1. Saddle-point analysis

The assumed form of T and R at the saddle point is t√n_t I_ν and r√n_t I_ν, respectively. The extra factor of √n_t has been included for convenience, as will become evident below. To consider the vicinity of the saddle point, we thus rewrite T, R as

T = t\sqrt{n_t}\, I_\nu + \delta T, \qquad R = r\sqrt{n_t}\, I_\nu + \delta R,    (18)

where δT, δR are ν × ν matrices representing deviations around the saddle point. One can then expand S of (17) in a Taylor series in increasing powers of δT, δR:

S = S_0 + S_1 + S_2 + S_3 + S_4 + \cdots    (19)

with S_p containing the pth-order terms in δT, δR. These terms can be obtained explicitly by differentiating (17):

S_0 = \nu\left[ n_r \log(1 + r\sqrt{\rho_d}) - n_t\, r t + \mathrm{Tr}\log\left(I_{n_t} + \sqrt{\rho_d}\, t\, Q + \frac{\rho_c}{n_t}\, \frac{Q\mu^\dagger\mu}{1 + r\sqrt{\rho_d}}\right)\right] \equiv \nu\Gamma    (20)

S_1 = \sqrt{\frac{\rho_d}{n_t}}\,\mathrm{Tr}\{\delta R\}\left[\frac{n_r}{1 + r\sqrt{\rho_d}} - \frac{n_t\, t}{\sqrt{\rho_d}} - \mathrm{Tr}\left\{\frac{(\rho_c/n_t)\, Q\mu^\dagger\mu}{(1 + r\sqrt{\rho_d})\left[(1 + r\sqrt{\rho_d})(I_{n_t} + \sqrt{\rho_d}\, t Q) + (\rho_c/n_t)\, Q\mu^\dagger\mu\right]}\right\}\right] + \sqrt{\frac{\rho_d}{n_t}}\,\mathrm{Tr}\{\delta T\}\left[\mathrm{Tr}\left\{\frac{(1 + r\sqrt{\rho_d})\, Q}{(1 + r\sqrt{\rho_d})(I_{n_t} + \sqrt{\rho_d}\, t Q) + (\rho_c/n_t)\, Q\mu^\dagger\mu}\right\} - \frac{n_t\, r}{\sqrt{\rho_d}}\right]    (21)

S_2 = \frac{1}{2}\,\mathrm{Tr}\left\{[\delta R \;\; \delta T]\, \Sigma\, [\delta R \;\; \delta T]^T\right\},    (22)

where the 2 × 2 Hessian Σ is given by

\Sigma_{11} = -\frac{\rho_d}{n_t}\left[\frac{n_r - n_t}{(1 + r\sqrt{\rho_d})^2} + \mathrm{Tr}\left\{\left(\frac{I_{n_t} + \sqrt{\rho_d}\, t Q}{(1 + r\sqrt{\rho_d})(I_{n_t} + \sqrt{\rho_d}\, t Q) + (\rho_c/n_t)\, Q\mu^\dagger\mu}\right)^2\right\}\right]

\Sigma_{22} = -\frac{\rho_d}{n_t}\,\mathrm{Tr}\left\{\left(\frac{(1 + r\sqrt{\rho_d})\, Q}{(1 + r\sqrt{\rho_d})(I_{n_t} + \sqrt{\rho_d}\, t Q) + (\rho_c/n_t)\, Q\mu^\dagger\mu}\right)^2\right\}

\Sigma_{12} = \Sigma_{21} = -\left[1 - \frac{\rho_d\,\rho_c}{n_t^2}\,\mathrm{Tr}\left\{\frac{\mu^\dagger\mu\, Q^2}{\left[(1 + r\sqrt{\rho_d})(I_{n_t} + \sqrt{\rho_d}\, t Q) + (\rho_c/n_t)\, Q\mu^\dagger\mu\right]^2}\right\}\right].    (23)

For p > 2 the expanded terms can be written as

S_p = -\frac{(-1)^p}{p}\left(\frac{\rho_d}{n_t}\right)^{p/2}\left[(n_r - n_t)\,\frac{\mathrm{Tr}\{(\delta R)^p\}}{(1 + r\sqrt{\rho_d})^p} + \mathrm{Tr}\left\{\left(\frac{(I_{n_t} + \sqrt{\rho_d}\, t Q)\otimes\delta R + (1 + r\sqrt{\rho_d})\, Q\otimes\delta T}{\left[(1 + r\sqrt{\rho_d})(I_{n_t} + \sqrt{\rho_d}\, t Q) + (\rho_c/n_t)\, Q\mu^\dagger\mu\right]\otimes I_\nu}\right)^p\right\}\right].    (24)

The saddle-point solution of (16), and hence the corresponding values of t, r, are found by demanding that S be stationary with respect to variations in T, R [26], resulting in S_1 = 0.
This produces the following saddle-point equations:

\frac{r}{\sqrt{\rho_d}\,(1 + r\sqrt{\rho_d})} = \frac{1}{n_t}\,\mathrm{Tr}\left\{\frac{Q}{(1 + r\sqrt{\rho_d})(I_{n_t} + \sqrt{\rho_d}\, t Q) + (\rho_c/n_t)\, Q\mu^\dagger\mu}\right\}    (25)

\frac{t\,(1 + r\sqrt{\rho_d})}{\sqrt{\rho_d}} = \frac{n_r}{n_t} - \frac{1}{n_t}\,\mathrm{Tr}\left\{\frac{(\rho_c/n_t)\, Q\mu^\dagger\mu}{(1 + r\sqrt{\rho_d})(I_{n_t} + \sqrt{\rho_d}\, t Q) + (\rho_c/n_t)\, Q\mu^\dagger\mu}\right\}.    (26)

It is interesting to note that the solutions to the above two equations extremize Γ over real and positive t, r. It is important to note that for generic full-rank matrices Q, µ†µ, both r and t are generally of order unity, r, t = O(1). Thus the expansion coefficients multiplying the terms δT^p, δR^p, etc, are generally of order O(n^{1−p/2}), successively decreasing in size for increasing p. The small parameter controlling this approximation is therefore n^{−1/2}, making this saddle-point solution increasingly accurate for large n. The aim of this analysis is thus to calculate successively higher order terms in n^{−1/2} and, by matching powers of ν, assign each resulting term to the appropriate cumulant moment C_p in (11). This matching of powers of ν implicitly assumes that the expansion in (11) is valid for integer ν, as described in assumption 1. Thus, for example, terms of orders O(νn) and O(ν/n) will both contribute to ⟨I⟩, as we shall see.

2.2. Ergodic mutual information

We start with the leading term of g(ν) in the saddle-point approximation: g(ν) ≈ exp(−S_0) = exp(−νΓ), where Γ is defined in (20) and evaluated using (25) and (26). We thus see from (10) and (11) that the leading term in the expansion of ⟨I⟩ is Γ, and note that ⟨I⟩ = O(n).

2.3. Variance of the mutual information

To obtain the O(ν²) term in the expansion of log g(ν) in (11) we need to include the next non-vanishing term, S_2; for the moment we neglect the higher order terms S_p for p > 2. Noting that the transformation T, R → δT, δR of (18) preserves the measure, we have

g(\nu) = e^{-S_0}\int d\mu(\delta T, \delta R)\, e^{-S_2} = e^{-S_0}\int d\mu(\delta T, \delta R)\, \exp\left(-\frac{1}{2}\sum_{a,b=1}^{\nu} [\delta R_{ab} \;\; \delta T_{ab}]\,\Sigma\,[\delta R_{ba} \;\; \delta T_{ba}]^T\right).    (27)

To diagonalize the exponent of the above equation, we rotate each pair δT, δR to a new basis of ν × ν-dimensional matrices W_1 and W_2. In particular, [W_{1,ab} W_{2,ab}]^T = U [δR_{ab} δT_{ab}]^T, where U is an orthogonal matrix such that U Σ U^T is diagonal with diagonal v = [v_1 v_2]^T, the vector of eigenvalues of Σ. We may now rewrite (27) as

g(\nu) = e^{-S_0}\int d\mu(W_1, W_2)\, \exp\left(-\frac{1}{2}\sum_{a,b=1}^{\nu}\left[v_1 W_{1,ab} W_{1,ba} + v_2 W_{2,ab} W_{2,ba}\right]\right).    (28)

We now take appropriate paths to integrate the W_{1,2,ab}, resulting in

g(\nu) = e^{-S_0}\, |v_1 v_2|^{-\nu^2/2} = e^{-S_0}\, |\det\Sigma|^{-\nu^2/2}.    (29)

Comparing (11) to (29) and matching, order by order, the terms of the ν-Taylor expansion of the exponent of g(ν), we can identify the leading term in the variance of the mutual information to be

\langle I^2\rangle - \langle I\rangle^2 = C_2 = -\log|\det\Sigma| + \cdots.    (30)

We note that since Σ_{ij} = O(1), the variance is also O(1) in the expansion in n^{−1/2} when n_t and n_r are of the same order. (However, if n_r is fixed while n_t increases, we find that C_2 = O(1/n), in agreement with [27].) Also, we see that no term proportional to ν is produced by S_2; thus no term of O(1) in the antenna number appears in ⟨I⟩, and ⟨I⟩ = Γ + o(1).

2.4. Higher order terms

To obtain higher order corrections in the small parameter n^{−1/2}, we need to take into account the terms S_p for p > 2 in (24). Details of the method can be found elsewhere [13]. For simplicity, here we briefly describe how to set up the perturbation expansion and discuss its implications for the distribution of the mutual information at large n. We define an expectation bracket of an arbitrary operator O(δT, δR) as

\langle\langle O \rangle\rangle = |\det\Sigma|^{\nu^2/2}\int d\mu(\delta T, \delta R)\, e^{-S_2}\, O(\delta T, \delta R).    (31)

The integration over δT, δR is performed as described in the previous section. The expectations of quadratic terms in δT, δR can be written compactly as

\langle\langle [\delta R_{ab} \;\; \delta T_{ab}]^T [\delta R_{cd} \;\; \delta T_{cd}] \rangle\rangle = \delta_{ad}\,\delta_{bc}\,\Sigma^{-1}.    (32)
In addition, the expectation of any odd power of $\delta T, \delta R$ must vanish by symmetry. As a result, only integer powers of $1/n$ survive in the perturbative expansion.

10868 A L Moustakas and S H Simon

With this bracket notation we can rewrite

$$ g(\nu) = e^{-S_0} \int d\mu(\delta T, \delta R)\, e^{-S_2}\, e^{-(S_3 + S_4 + \cdots)} \tag{33} $$

$$ \phantom{g(\nu)} = e^{-\nu\Gamma} \left| \det \Sigma \right|^{-\nu^2/2} \left\langle\left\langle 1 - (S_3 + S_4 + \cdots) + \tfrac{1}{2}(S_3 + S_4 + \cdots)^2 + \cdots \right\rangle\right\rangle. \tag{34} $$

At this point it is interesting to count powers of $n$ in the various terms of the expansion. Using simple power-counting arguments, we see that the term $\langle\langle S_p \rangle\rangle$ is of order $n^{-p/2+1}$ but vanishes for $p$ odd. Also, $\langle\langle S_p S_q \rangle\rangle$ is of order $n^{-p/2-q/2+2}$ but is zero for $p + q$ odd, and so forth. By regrouping the terms in the above expansion by their order in $1/n$ we obtain the following expansion:

$$ g(\nu) = e^{-S_0} \left| \det \Sigma \right|^{-\nu^2/2} \left[ 1 + D_1 + D_2 + D_3 + \cdots \right] \tag{35} $$

where

$$ D_1 = \left\langle\left\langle \tfrac{1}{2} S_3^2 - S_4 \right\rangle\right\rangle, \qquad D_2 = \left\langle\left\langle \tfrac{1}{2} S_4^2 + S_3 S_5 - S_6 + \tfrac{1}{24} S_3^4 + \cdots \right\rangle\right\rangle, \qquad \ldots \tag{36} $$

Here we have regrouped into the term $D_p$ all terms which are of order $1/n^p$; thus, for example, $D_1$ contains all terms of order $1/n$. As in [13], we can evaluate the averages above using (32) and Wick's theorem. We find that $D_1$ produces $1/n$ corrections to the mean mutual information and to its skewness. The additional terms $D_p$ provide higher-order corrections in $1/n$. We thus see that in the large-$n$ limit only the first two moments of the distribution remain non-vanishing; asymptotically, the distribution of the mutual information therefore approaches a Gaussian.

2.5. Summary of results

To summarize, we have derived the following results, with $\Gamma$ given by (20), $\Sigma$ given by (23), and the parameters $r$ and $t$ given by (25) and (26):

$$ \langle I \rangle = \Gamma + O(1/n) \tag{37} $$

$$ C_2 = \mathrm{Var}(I) = \langle I^2 \rangle - \langle I \rangle^2 = -\log\left| \det \Sigma \right| + O(1/n^2) \tag{38} $$

$$ C_3 = \mathrm{Sk}(I) = O(1/n). \tag{39} $$

The expansion can be continued straightforwardly to higher orders in $1/n$. We thus see that for large $n$ the distribution of the mutual information approaches a Gaussian. Similar results have been obtained for other types of statistics of the Gaussian matrix $G$ [13].

3. Capacity-achieving signal covariance Q

As discussed in section 1.2, the transmitter does not have instantaneous channel information but only statistical information about the channel $G$: only $\mu$ and $\rho$ are known. Based on this information, the signal covariance can be optimized to maximize a particular metric of the mutual information distribution. Here we describe how to optimize $Q$ so as to maximize the average mutual information, keeping only the $O(n)$ term in (21), i.e. how to find $\max_Q \langle I \rangle$ in the large-antenna limit.

We start by observing that $\Gamma$ depends on $Q$ through the last term in (21). Expressing the determinant of this term in the eigen-basis of $\mu^\dagger \mu$, it can be written as

$$ \det\left( I_{n_t} + \tilde Q \left( \sqrt{\rho_d}\, t\, I_{n_t} + \frac{\rho_c \tilde M}{1 + r\sqrt{\rho_d}} \right) \right) \tag{40} $$

where $\tilde M$ is a diagonal matrix whose diagonal carries the eigenvalues of $\mu^\dagger \mu / n_t$, given by $\mu_k^2$ for $k = 1, \ldots, n_t$, with $\mu_k$ the singular values of $\mu/\sqrt{n_t}$, and $\tilde Q$ is the original matrix $Q$ expressed in the eigen-basis of $\mu^\dagger \mu$. We now use the fact that for any non-negative definite matrix $A$, $\det(A) \le \prod_k A_{kk}$, where $A_{kk}$ are the diagonal elements [2]. Applying this inequality to (40) we get

$$ \det\left( I_{n_t} + \tilde Q \left( \sqrt{\rho_d}\, t\, I_{n_t} + \frac{\rho_c \tilde M}{1 + r\sqrt{\rho_d}} \right) \right) \le \prod_{k=1}^{n_t} \left[ 1 + \tilde Q_{kk} \left( \sqrt{\rho_d}\, t + \frac{\rho_c \mu_k^2}{1 + r\sqrt{\rho_d}} \right) \right] \tag{41} $$

with equality when $\tilde Q$ is diagonal. Thus the $Q$ maximizing $\Gamma$ is simultaneously diagonalizable with $\mu^\dagger \mu$.

Once the optimal eigen-basis of $Q$ has been determined to be that of $\mu^\dagger \mu$, one needs to find its optimal eigenvalues $q_k$ for $k = 1, \ldots, n_t$. $\Gamma$ has to be optimized subject to the power constraint $\mathrm{Tr}\, Q = n_t$. This constraint is enforced by adding a Lagrange multiplier $\Lambda$ to $\Gamma$, i.e.

$$ \Gamma \to \Gamma - \Lambda \left( \sum_{k=1}^{n_t} q_k - n_t \right). \tag{42} $$

Incorporating the Lagrange multiplier into (21) and maximizing, it is easy to see that the optimal eigenvalues of $Q$ are given by

$$ q_k = \left[ \frac{1}{\Lambda} - \frac{1 + r\sqrt{\rho_d}}{\sqrt{\rho_d}\, t \left( 1 + r\sqrt{\rho_d} \right) + \rho_c \mu_k^2} \right]_+ \tag{43} $$

where $[x]_+ = (x + |x|)/2$. Here, $\Lambda > 0$ is determined by imposing the power constraint

$$ \mathrm{Tr}\, Q = \sum_{k=1}^{n_t} q_k = n_t. \tag{44} $$

Solving (43), (44) together with (25), (26) allows us to calculate $\langle I \rangle$ and $\mathrm{Var}(I)$ in (37), (38), and thus to obtain the ergodic capacity and the variance of the distribution around it. Note that the optimization of $Q$ at the transmitter is based on statistical rather than instantaneous information about the channel: it depends only on the statistical quantities $\mu^\dagger \mu$ and $\rho$, rather than on $G$ itself. As a result, the transmitter needs to be updated about the channel at only a relatively slow rate.

4. Validity of the Gaussian approximation N(⟨I⟩, Var(I))

In section 2.5 we saw that in the limit of large $n$ the distribution of the mutual information approaches a Gaussian with mean $\Gamma$ and the calculated variance of $I$, in the sense that all higher moments and corrections tend to zero. Surprisingly, this approximation is valid even for a small number of antennas. We demonstrate this property by numerically comparing the Gaussian distribution $N(\langle I \rangle, \mathrm{Var}(I))$, calculated using (21) and (38), with the simulated distribution resulting from the generation of a large number of random matrix realizations. This comparison can be seen in figure 1, where the agreement, not only with a Gaussian distribution but with the predicted mean and variance, is striking for both small ($n = 2$) and large ($n = 10$) antenna numbers. This allows us to accurately calculate not only the ergodic capacity but also the outage capacity, defined in (6).

[Figure 1: cumulative probability versus mutual information per antenna (nats), for nt = nr = 2 and 10.]

Figure 1. Cumulative distribution (CDF) of the mutual information per antenna ($I/n_t$) for $n_r = n_t = 2, 10$. The signal-to-noise ratio is set to $\rho = 1$, while the diffuse component carries 50% of the power, $\rho_d = \rho_c = 0.5$. The dotted lines correspond to the numerically generated curves, while the solid lines are the theoretical ones, generated as Gaussian distributions with the mean and variance obtained by applying the methods of this paper. We see that the agreement is very good. We also see that for $n = 10$ the distribution is more narrowly peaked. In both cases the non-zero mean component of the channel ($\mu$) was generated randomly from a Gaussian distribution. Among the pairs corresponding to the same number of antennas, the curves on the left have the transmission covariance matrix $Q$ optimized with respect to the known $\mu^\dagger \mu$, while those on the right simply have $Q = I_{n_t}$.

5. Conclusion

In conclusion, we have presented an analytic approach for calculating the statistics of the mutual information of MIMO systems in the case of Ricean channel statistics. To this end we applied tools developed for the analysis of mesoscopic systems, such as replicas and random matrix theory. In addition, we have used this method to find the optimal signal covariance $Q$ and thus to calculate analytically the capacity when the statistics of the (Gaussian) channel are known at the transmitter. These methods, although formally valid for large antenna numbers, apply with very high accuracy to arrays with only a few antennas, which allows us to evaluate the outage capacity accurately for any number of antennas; we demonstrated this by comparison with numerical simulations. This analytic approach provides a framework and a simple tool for accurately analysing the throughput statistics of even small arrays. It is a simple example where the methods of mesoscopic physics can have technological applications.

Acknowledgments

We wish to acknowledge enlightening discussions with Harold Baranger and Anirvan Sengupta.
After this work was completed, we became aware of [28], which, using different methods, provides analytic results only for the average of the mutual information in the infinite antenna limit.

Appendix. Complex integrals

Identity 1. Let $M$ be a Hermitian positive-definite $m \times m$ matrix, and let $X$ be a complex $m \times n$ matrix. Then

$$ (\det M)^{-n} = \int DX\, e^{-\frac{1}{2} \mathrm{Tr}\{ X^\dagger M X \}} \tag{A.1} $$

where the integration measure $DX$ is given by (1).

Identity 2. Let $X, A, B$ be $m \times n$ complex matrices. Then the following equality holds:

$$ \int DX\, \exp\left( -\frac{1}{2} \mathrm{Tr}\{ X^\dagger X + A^\dagger X - X^\dagger B \} \right) = \exp\left( -\frac{1}{2} \mathrm{Tr}\{ A^\dagger B \} \right). \tag{A.2} $$

Identity 3 (Hubbard–Stratonovich transformation). Let $U, V$ be arbitrary complex $\nu \times \nu$ matrices, where $\nu$ is assumed to be an arbitrary positive integer. Then the following identity holds:

$$ \exp\left[ -\mathrm{Tr}\{ U V \} \right] = \int d\mu(T, R)\, \exp\left[ \mathrm{Tr}\{ R T - U T - R V \} \right]. \tag{A.3} $$

In the above equation the auxiliary matrices $T, R$ are general complex $\nu \times \nu$ matrices and their integration measure is given by (2). The integration of the elements of $R$ and $T$ is along contours in complex space parallel to the real and imaginary axes.

References

[1] Foschini G J and Gans M J 1998 On limits of wireless communications in a fading environment when using multiple antennas Wirel. Pers. Commun. 6 311–35
[2] Telatar I E 1999 Capacity of multi-antenna Gaussian channels Eur. Trans. Telecommun. Relat. Technol. 10 585–96
[3] Wolniansky P W, Foschini G J, Golden G D and Valenzuela R A 1998 V-BLAST: an architecture for realizing very high data rates over the rich-scattering wireless channel URSI Int. Symp. on Signals, Systems and Electronics pp 295–300
[4] Ling J, Chizhik D, Wolniansky P and Valenzuela R 2001 Multiple transmit multiple receive (MTMR) capacity survey in Manhattan IEEE Electron. Lett. 37 1041–2
[5] Moustakas A L, Baranger H U, Balents L, Sengupta A M and Simon S H 2000 Communication through a diffusive medium: coherence and capacity Science 287 287–90 (Preprint cond-mat/0009097)
[6] Verdú S and Shamai S 1999 Spectral efficiency of CDMA with random spreading IEEE Trans. Inform. Theory 45 622–40
[7] Rapajic P B and Popescu D 2000 Information capacity of a random signature multiple-input multiple-output channel IEEE Trans. Commun. 48 1245
[8] Lozano A and Tulino A M 2002 Capacity of multiple-transmit multiple-receive antenna architectures IEEE Trans. Inform. Theory 48 3117–28
[9] Tse D N and Hanly S V 1999 Linear multiuser receivers: effective interference, effective bandwidth and user capacity IEEE Trans. Inform. Theory 45 641
[10] Chuah C N, Tse D, Kahn J and Valenzuela R A 2002 Capacity scaling in MIMO wireless systems under correlated fading IEEE Trans. Inform. Theory 48 637
[11] Sengupta A M and Mitra P P 1999 Distributions of singular values for some random matrices Phys. Rev. E 60 3389–92
[12] Sengupta A M and Mitra P P 2000 Capacity of multivariate channels with multiplicative noise: I. Random matrix techniques and large-n expansions for full transfer matrices Preprint physics/0010081
[13] Moustakas A L, Simon S H and Sengupta A M 2003 MIMO capacity through correlated channels in the presence of correlated interferers and noise: a (not so) large N analysis IEEE Trans. Inform. Theory 49 2545–61
[14] Edwards S F and Anderson P W 1975 Theory of spin glasses J. Phys. F: Met. Phys. 5 965–74
[15] Mézard M, Parisi G and Virasoro M A 1987 Spin Glass Theory and Beyond (Singapore: World Scientific)
[16] Itzykson C and Drouffe J-M 1989 Statistical Field Theory (Cambridge: Cambridge University Press)
[17] Sourlas N 1989 Spin glass models as error correcting codes Nature 339 693–5
[18] Montanari A and Sourlas N 2000 Statistical mechanics of turbo codes Eur. Phys. J. B 18 107–19
[19] Montanari A 2000 Turbo codes: the phase transition Eur. Phys. J. B 18 121–36
[20] Tanaka T 2002 A statistical-mechanics approach to large-system analysis of CDMA multiuser detectors IEEE Trans. Inform. Theory 48 2888–910
[21] Kang M and Alouini M-S 2002 Capacity of MIMO Rician channels Proc. 40th Annual Conference on Communication, Control, and Computing (Monticello, IL)
[22] Simon S H, Moustakas A L and Marinelli L 2004 Capacity and character expansions: moment generating function and other exact results for MIMO correlated channels (submitted) (Preprint cs.IT/0509080)
[23] Simon S H and Moustakas A L 2004 Eigenvalue density of correlated random Wishart matrices Phys. Rev. E 69 (Preprint math-ph/0401038)
[24] Ozarow L H, Shamai S and Wyner A D 1994 Information theoretic considerations for cellular mobile radio IEEE Trans. Veh. Technol. 43 359–78
[25] Simon S H, Moustakas A L, Stoytchev M and Safar H 2001 Communication in a disordered world Phys. Today September 38–43
[26] Bender C M and Orszag S A 1978 Advanced Mathematical Methods for Scientists and Engineers (New York: McGraw-Hill)
[27] Hochwald B M, Marzetta T L and Tarokh V 2004 Multi-antenna channel hardening and its implications for rate feedback and scheduling IEEE Trans. Inform. Theory 50 1893–909
[28] Dumont J, Loubaton P, Lasaulce S and Debbah M 2005 On the asymptotic performance of MIMO correlated Ricean channels Proc. IEEE Int. Conf. on Acoustics, Speech and Signal Processing
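As a supplementary numerical note, the capacity-achieving power allocation of (43), (44) is a water-filling solution, and the Lagrange multiplier $\Lambda$ can be found by a one-dimensional search, since the total power $\sum_k q_k$ decreases monotonically with $\Lambda$. The sketch below takes the saddle-point parameters $r$, $t$ and the eigenvalues $\mu_k^2$ as given inputs; all numerical values are illustrative assumptions, not quantities from the paper.

```python
import numpy as np

def waterfill_q(mu2, r, t, rho_d, rho_c):
    """Eigenvalues q_k of the optimal Q, following (43)-(44).

    mu2   : eigenvalues mu_k^2 of mu^dagger mu / n_t
    r, t  : saddle-point parameters, here taken as given inputs
    """
    nt = len(mu2)
    # Second term of (43): the per-mode floor subtracted from 1/Lambda.
    b = (1 + r * np.sqrt(rho_d)) / (
        np.sqrt(rho_d) * t * (1 + r * np.sqrt(rho_d)) + rho_c * mu2)

    def total_power(lam):
        return np.maximum(1.0 / lam - b, 0.0).sum()

    # Bisection on Lambda: total_power is monotonically decreasing in lam.
    lo, hi = 1e-9, 1e9
    for _ in range(200):
        mid = np.sqrt(lo * hi)          # geometric midpoint spans many decades
        if total_power(mid) > nt:
            lo = mid
        else:
            hi = mid
    return np.maximum(1.0 / np.sqrt(lo * hi) - b, 0.0)

# Illustrative inputs (assumptions, not values from the paper):
mu2 = np.array([1.5, 0.8, 0.3, 0.05])
q = waterfill_q(mu2, r=0.9, t=0.9, rho_d=0.5, rho_c=0.5)
print(q, q.sum())                        # q.sum() ~ n_t = 4
```

Stronger coherent modes (larger $\mu_k^2$) receive more power, and weak modes are switched off entirely once $1/\Lambda$ drops below their floor $b_k$.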
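The Monte Carlo comparison of section 4 can be reproduced in outline as follows. This sketch assumes a Ricean channel of the form $G = \sqrt{\rho_c}\,\mu + \sqrt{\rho_d}\,W$, with $W$ having i.i.d. unit-variance complex Gaussian entries, $Q = I_{n_t}$, and mutual information $I = \log\det(I_{n_r} + G G^\dagger/n_t)$ in nats; this normalization and all parameter values are assumptions chosen for illustration. In line with the results of section 2, the empirical skewness should come out small even for moderate $n$.

```python
import numpy as np

rng = np.random.default_rng(1)
nt = nr = 4
rho_d = rho_c = 0.5            # 50/50 coherent/diffuse split, as in figure 1
n_samp = 5_000

def cgauss(shape):
    """i.i.d. complex Gaussian entries with unit variance."""
    return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)

mu = cgauss((nr, nt))          # fixed coherent component, drawn once

samples = np.empty(n_samp)
for i in range(n_samp):
    G = np.sqrt(rho_c) * mu + np.sqrt(rho_d) * cgauss((nr, nt))
    # I = log det(I + G Q G^dagger / n_t) with Q = I_{n_t}, in nats
    _, logdet = np.linalg.slogdet(np.eye(nr) + G @ G.conj().T / nt)
    samples[i] = logdet

m, v = samples.mean(), samples.var()
skew = ((samples - m) ** 3).mean() / v ** 1.5
print(f"mean={m:.3f}  var={v:.4f}  skewness={skew:.3f}")
```

Comparing the empirical CDF of `samples` against $N(m, v)$ reproduces the kind of agreement shown in figure 1.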