burda@th.if.uj.edu.pl, atg@th.if.uj.edu.pl; corresponding author: bwaclaw@th.if.uj.edu.pl
I. INTRODUCTION
Random Matrix Theory provides a useful tool for the description of systems with many degrees of freedom. A large spectrum of problems in physics [1], telecommunication and information theory [2, 3, 4, 5] and quantitative finance [6, 7, 8, 9, 10, 11, 12, 13] can be naturally formulated in terms of random matrices. In this paper we apply random matrix theory to calculate the eigenvalue density of the empirical covariance matrix. Statistical properties of this matrix play an important role in many empirical applications. More precisely, the problem which we shall discuss here can be formulated in the following general way. Consider a statistical system with N correlated random variables. Imagine that we do not know a priori the correlations between the variables and that we try to learn about them by sampling the system T times. The results of the sampling can be stored in a rectangular matrix X containing the empirical data X_{it}, where the indices i = 1, \ldots, N and t = 1, \ldots, T run over the set of random variables and measurements, respectively. If the measurements are uncorrelated in time, the two-point correlation function reads:

\langle X_{i_1 t_1} X_{i_2 t_2} \rangle = C_{i_1 i_2}\, \delta_{t_1 t_2}, \qquad (1)
where C is called the correlation (or covariance) matrix. For simplicity assume that \langle X_{it} \rangle = 0. If one does not know C, one can try to reconstruct it from the data X using the empirical covariance matrix:

c_{ij} = \frac{1}{T} \sum_{t=1}^{T} X_{it} X_{jt}, \qquad (2)
which is the standard estimator of the correlation matrix. One can think of X as an N × T random matrix chosen from a matrix ensemble with some prescribed probability measure P(X) DX. The empirical covariance matrix

c = \frac{1}{T}\, X X^{\tau} \qquad (3)

thus depends on X. Here X^{\tau} stands for the transpose of X.
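As an aside (our illustration, not part of the original text), the estimator (2)-(3) is a single matrix product in practice. The following NumPy sketch, with an arbitrarily chosen positive-definite C, checks that c approaches C when the number of samples T grows:

import numpy as np

# Illustration of Eqs. (2)-(3): for a known covariance C and many
# uncorrelated Gaussian samples, the empirical matrix c = X X^t / T
# should reproduce C.
rng = np.random.default_rng(0)
N, T = 4, 100_000                      # r = N/T is small, so c ~ C
C = np.array([[1.0, 0.3, 0.0, 0.0],
              [0.3, 1.0, 0.0, 0.0],
              [0.0, 0.0, 2.0, 0.5],
              [0.0, 0.0, 0.5, 1.0]])   # an arbitrary positive-definite C
L = np.linalg.cholesky(C)
X = L @ rng.standard_normal((N, T))    # data with <X_it X_jt'> = C_ij delta_tt'
c = X @ X.T / T                        # empirical covariance, Eqs. (2)-(3)
print(np.abs(c - C).max())             # -> 0 as T -> infinity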
For a given random matrix X, the eigenvalue density of the empirical matrix c is:

\rho(X, \lambda) = \frac{1}{N} \sum_{i=1}^{N} \delta\bigl(\lambda - \lambda_i(c)\bigr), \qquad (4)
where the \lambda_i(c)'s denote the eigenvalues of c. Averaging over all random matrices X:

\rho(\lambda) \equiv \langle \rho(X, \lambda) \rangle = \int \rho(X, \lambda)\, P(X)\, DX, \qquad (5)
we can find the eigenvalue density of c which is representative for the whole ensemble of X. We are interested in how the eigenvalue spectrum of c is related to that of C [14, 15, 16]. Clearly, as follows from (1), the quality of the
information encoded in the empirical covariance matrix c depends on the number of samples, or more precisely on the ratio r = N/T. Only in the limit T → ∞, that is for r → 0, does the empirical matrix c perfectly reproduce the real covariance matrix C. Recently a lot of effort has been made to understand the statistical relation between c and C for finite r. This relation plays an important role in the theory of portfolio selection, where the X_{it} are identified with normalized stock returns and C is the covariance matrix of inter-stock correlations. It is common practice to reconstruct the covariance matrix from historical data using the estimator (2). Since the estimator is calculated from a finite historical sample, it contains statistical noise. The question is how to optimally clean the spectrum of the empirical matrix c of this noise, in order to obtain the best possible estimate of the spectrum of the underlying exact covariance matrix C. The effect of this finite-sample noise is illustrated in the short numerical example below.
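To make the role of the ratio r = N/T concrete, here is a small numerical illustration (our addition, anticipating the Wishart result recalled in section VI): even for completely uncorrelated data, C = 𝟙_N, the eigenvalues of c do not concentrate at 1 but spread over an interval of width controlled by r.

import numpy as np

# Estimation noise at finite r = N/T: for C = 1_N the eigenvalues of c
# spread over roughly [(1 - sqrt(r))^2, (1 + sqrt(r))^2] instead of
# sitting at 1; the spread shrinks only as r -> 0.
rng = np.random.default_rng(1)
N = 200
for T in (400, 2_000, 20_000):         # r = 0.5, 0.1, 0.01
    X = rng.standard_normal((N, T))
    eig = np.linalg.eigvalsh(X @ X.T / T)
    print(f"r = {N/T:5.3f}: eigenvalues in [{eig.min():.3f}, {eig.max():.3f}]")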
One can consider a more general problem, where in addition to the correlations between the degrees of freedom (stocks) there are also temporal correlations between measurements [17]:

\langle X_{i_1 t_1} X_{i_2 t_2} \rangle = C_{i_1 i_2}\, A_{t_1 t_2}, \qquad (6)

given by an autocorrelation matrix A. If X is a Gaussian random matrix, or more precisely if the probability measure P(X) DX is Gaussian, then the problem is analytically solvable in the limit of large matrices [17, 18, 19, 20]. One can then derive an exact relation between the eigenvalue spectrum of the empirical covariance matrix c and the spectra of the correlation matrices A and C. In this paper we present an analytic solution for a class of probability measures P(X) DX for which the marginal distributions of the individual degrees of freedom have power-law tails, p(X_{it}) \sim X_{it}^{-1-\nu}, which means that the cumulative distribution function falls like X_{it}^{-\nu}. Such distributions have been discussed previously [21, 22] but, to our knowledge, the spectral density of c has remained analytically unattainable. The motivation to study such systems comes from the empirical observation that stock returns on financial markets undergo non-Gaussian fluctuations with power-law tails. The observed value of the power-law exponent ν ≈ 3 seems to be universal for a wide class of financial assets [23, 24, 25]. Random matrix ensembles with heavy tails have recently been considered for 0 < ν < 2 using the concept of Lévy stable distributions [26, 27, 28]. Here we will present a method which extrapolates also to the case ν > 2, which is of particular interest for financial markets. We will study a model which on the one hand preserves the structure of correlations (6) and on the other hand has power-law tails in the marginal probability distributions of the individual matrix elements. More generally, we will calculate the eigenvalue density of the empirical covariance matrix c (3) for random matrices X which have a probability distribution of the form:

P_f(X)\, DX = \mathcal{N}^{-1} f\bigl(\mathrm{Tr}\, X^{\tau} C^{-1} X A^{-1}\bigr)\, DX, \qquad (7)

where DX = \prod_{i,t=1}^{N,T} dX_{it} is the volume element. The normalization constant \mathcal{N}:

\mathcal{N} = \pi^{d/2}\, (\mathrm{Det}\, C)^{T/2}\, (\mathrm{Det}\, A)^{N/2} \qquad (8)

and the parameter d = NT have been introduced for convenience. The function f is an arbitrary non-negative function such that P(X) is normalized: \int P(X)\, DX = 1. In particular we will consider an ensemble of random matrices with the probability measure given by a multivariate Student distribution:

P(X)\, DX = \mathcal{N}^{-1} \frac{\Gamma\left(\frac{\nu+d}{2}\right)}{\Gamma\left(\frac{\nu}{2}\right)\, \sigma^{d}} \left(1 + \frac{1}{\sigma^{2}}\, \mathrm{Tr}\, X^{\tau} C^{-1} X A^{-1}\right)^{-\frac{\nu+d}{2}} DX. \qquad (9)
The two-point correlation function can be easily calculated for this measure:

\langle X_{i_1 t_1} X_{i_2 t_2} \rangle = \frac{\sigma^2}{\nu - 2}\, C_{i_1 i_2}\, A_{t_1 t_2}. \qquad (10)
We see that for σ² = ν − 2 and ν > 2 the last equation takes the form (6). With this choice of σ² the two-point function becomes independent of ν; however, the formula for the probability measure (9) breaks down at ν = 2 and cannot be extrapolated to the range 0 < ν ≤ 2. An alternative, and actually more conventional, choice is σ² = ν, which extrapolates easily to this range. In this case one has to remember that for ν > 2 the exact covariance matrix is given by \frac{\nu}{\nu-2} C, where C is the matrix in Eq. (9) with σ² = ν. We will stick to this choice in the remaining part of the paper. The marginal probability distribution of a matrix element X_{it} can be obtained by integrating out all other degrees of freedom from the probability measure P(X) DX. One can see that for the Student probability measure (9) the marginal distributions of the individual elements have power-law tails by construction. For example, if C is diagonal,
C = \mathrm{Diag}(C_1^2, \ldots, C_N^2), and A = \mathbb{1}_T, then the marginal probability distribution can be found exactly for each element of the matrix X:

p_i(X_{it}) = \frac{\Gamma\left(\frac{\nu+1}{2}\right)}{\Gamma\left(\frac{\nu}{2}\right) \sqrt{\pi\nu}\, C_i} \left(1 + \frac{X_{it}^2}{\nu\, C_i^2}\right)^{-\frac{\nu+1}{2}}. \qquad (11)
The distributions p_i fall like X_{it}^{-1-\nu} for large X_{it}, with amplitudes which depend on the index i and are independent of t. If one thinks of a stock market, this means that stock returns have the same tail exponent but different tail amplitudes. The independence of t means that the distributions p_i(X_{it}) are stationary. More generally, for any C and for A which is translationally invariant, A_{t_1 t_2} = A(|t_1 - t_2|), the marginal distributions of the entries X_{it} can be shown to have power-law tails with the same exponent ν for all X_{it} and tail coefficients which depend on i and are independent of t, exactly as expected for stock returns on a financial market. The main purpose of this paper is to calculate the spectral density of the empirical covariance matrix c for the Student distribution (9). The method is similar to the one presented in [29, 30, 31, 32, 33] for a square Hermitian matrix. It rests on the observation that every quantity averaged over a probability distribution of the form (7) can be averaged first over (d − 1) angular variables and then over a radial variable. This is presented briefly in sections II and III. In section IV the main equation for the eigenvalue density of c for the radial ensemble (7) with an arbitrary radial profile f is derived. Section V contains results for the Student distribution (9), including some special cases.
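Both properties used above, the two-point function (10) and the power-law tails of the marginals (11), can be checked by direct sampling. The sketch below is our code, not the authors'; it relies on the standard representation of the multivariate Student distribution as a Gaussian matrix divided by a single chi-distributed scale, which reproduces the measure (9) with σ² = ν:

import numpy as np

# Sampling the Student measure (9) with sigma^2 = nu via the mixture
# representation X = sqrt(nu/u) C^{1/2} g A^{1/2}, u ~ chi^2_nu, g Gaussian.
rng = np.random.default_rng(2)

def student_matrix(nu, C_sqrt, A_sqrt, rng):
    N, T = C_sqrt.shape[0], A_sqrt.shape[0]
    g = rng.standard_normal((N, T))
    u = rng.chisquare(nu)                 # ONE scale per matrix: entries are
    return np.sqrt(nu / u) * (C_sqrt @ g @ A_sqrt)  # uncorrelated, not independent

nu = 5.0
C_sqrt = np.diag([1.0, 2.0, 0.5])         # C = Diag(1, 4, 1/4)
A_sqrt = np.eye(4)                        # A = 1_T

# Check of Eq. (10): <X_it^2> ~ nu/(nu - 2) * C_ii (here A_tt = 1)
second = np.mean([student_matrix(nu, C_sqrt, A_sqrt, rng)**2
                  for _ in range(100_000)], axis=0)
print(second[:, 0])                       # approximately (5/3) * [1, 4, 0.25]

# Check of Eq. (11): rough Hill estimate of the tail index of one entry
x = np.abs(rng.standard_t(nu, size=1_000_000))   # marginal with C_i = 1
tail = np.sort(x)[-1_000:]                # largest order statistics
print(1.0 / np.mean(np.log(tail / tail[0])))     # roughly nu (biased for
                                          # moderate thresholds)

Note that a single scale u is drawn per matrix; this is what makes the entries dependent even when C and A are trivial, as remarked in section V.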
II. RADIAL MEASURES
The radial measure (7) depends on a single scalar function f = f(x²) of a real positive argument. In this section we develop a formalism to calculate the eigenvalue spectrum ρ_f(λ) of the empirical covariance matrix (3) for such radial ensembles. The calculation can be simplified by noticing that the dependence of ρ_f(λ) on the matrices C and A actually reduces to a dependence on their spectra. This follows from the observation that for a radial measure (7) the integral (5) defining the eigenvalue density is invariant under the simultaneous transformations:

C \rightarrow C' = O^{\tau} C O, \qquad A \rightarrow A' = Q^{\tau} A Q, \qquad X \rightarrow X' = O^{\tau} X Q, \qquad (12)

where O, Q are orthogonal matrices of size N × N and T × T, respectively. Choosing the orthogonal transformations O and Q in such a way that C and A become diagonal, C = \mathrm{Diag}(C_1^2, \ldots, C_N^2), A = \mathrm{Diag}(A_1^2, \ldots, A_T^2), with all C_i's and A_t's positive, we see that ρ_f(λ) indeed depends on the matrices C and A only through their eigenvalues. Therefore, for convenience we shall assume that C and A are diagonal from the very beginning. The radial form of the measure allows one to determine the dependence of the eigenvalue density ρ_f(λ) on the radial profile f(x²). Intuitively, the reason is that for the radial ensembles (7) the integration can be done in two steps: the first step is a sort of angular integration, which is done at fixed x and is thus independent of the radial profile f(x²), and the second is an integration over x. A short inspection of the formula (7) tells us that fixed x corresponds to a fixed trace \mathrm{Tr}\, X^{\tau} C^{-1} X A^{-1}, and thus that we should first perform the integration over the fixed-trace ensemble. We shall follow this intuition below. Let us define the matrix x = C^{-1/2} X A^{-1/2}. Since we assumed that A and C are diagonal, A^{-1/2} and C^{-1/2} are also diagonal, with elements being the inverse square roots of those of A and C. The elements of x are:

x_{it} \equiv \frac{X_{it}}{C_i A_t}. \qquad (13)
They can be viewed as the components x_j, j = 1, \ldots, d, of a d-dimensional Euclidean vector, where the index j is constructed from i and t. The length of this vector is:
x^2 \equiv \sum_{j=1}^{d} x_j^2 = \sum_{i=1}^{N} \sum_{t=1}^{T} x_{it}^2 = \mathrm{Tr}\, x^{\tau} x = \mathrm{Tr}\, X^{\tau} C^{-1} X A^{-1}, \qquad (14)
and thus the fixed-trace matrices X are mapped onto a d-dimensional sphere of the given radius x. It is convenient to parameterize the d-dimensional vector x using spherical coordinates, x = x\Omega, where \Omega is a matrix such that \mathrm{Tr}\, \Omega^{\tau} \Omega = 1. We can then write:

X = x\, \tilde{\Omega}(\Omega), \qquad (15)

where the definition of the matrix \tilde{\Omega}(\Omega) is equivalent to \tilde{\Omega}_{it} \equiv C_i A_t\, \Omega_{it}. While \Omega gives a point on the unit sphere in d-dimensional space, \tilde{\Omega}(\Omega) gives the radial projection of this point onto a d-dimensional ellipsoid of fixed trace:

\mathrm{Tr}\, \tilde{\Omega}^{\tau} C^{-1} \tilde{\Omega} A^{-1} = 1. \qquad (16)

III. ANGULAR INTEGRATION
We are now prepared to do the integration over the angular variables D\Omega. In the spherical coordinates (15) the radial measure (7) assumes a very simple form:

P_f(X)\, DX = \pi^{-d/2}\, f(x^2)\, x^{d-1}\, dx\, D\Omega. \qquad (17)
The determinant factors in the normalization constant \mathcal{N}^{-1} from Eq. (7) cancel against the Jacobian of the change of variables (13). The spherical coordinates X = x\tilde{\Omega}(\Omega) allow us to write the formula for ρ_f(λ) in the form:

\rho_f(\lambda) = \pi^{-d/2} \int D\Omega \int_0^{\infty} \rho\bigl(x\tilde{\Omega}(\Omega), \lambda\bigr)\, f(x^2)\, x^{d-1}\, dx. \qquad (18)
Although the integration over the angular and the radial part cannot be entirely separated, we can partially decouple x from \Omega in the first argument of \rho(x\tilde{\Omega}(\Omega), \lambda). It follows from (4) that rescaling X \rightarrow \alpha X by a constant \alpha gives the relation \rho(\alpha X, \lambda) = \alpha^{-2}\, \rho(X, \alpha^{-2}\lambda). This observation can be used to rewrite equation (18) in a more convenient form:
\rho_f(\lambda) = \pi^{-d/2} \int D\Omega \int_0^{\infty} \rho\!\left(\tilde{\Omega}(\Omega), \frac{\lambda}{x^2}\right) f(x^2)\, x^{d-3}\, dx \qquad (19)

= \frac{2}{\Gamma(d/2)} \int_0^{\infty} \rho_*\!\left(\frac{\lambda}{x^2}\right) f(x^2)\, x^{d-3}\, dx, \qquad (20)

where

\rho_*(\lambda) \equiv \frac{1}{S_d} \int D\Omega\, \rho\bigl(\tilde{\Omega}(\Omega), \lambda\bigr). \qquad (21)

Here S_d denotes the hyper-surface area of the d-dimensional sphere of radius one: S_d = 2\pi^{d/2}/\Gamma(d/2). As we shall see below, \rho_*(\lambda) is the eigenvalue distribution of the empirical covariance matrix for the fixed-trace ensemble, defined as the ensemble of matrices X such that \mathrm{Tr}\, X^{\tau} C^{-1} X A^{-1} = 1. From the structure of equation (20) it is clear that if \rho_*(\lambda) is known, then \rho_f(\lambda) can be calculated for any radial profile by doing a single one-dimensional integral. So the question we now face is how to determine \rho_*(\lambda) for arbitrary C and A. We will do this by a trick: instead of calculating \rho_*(\lambda) directly from Eq. (21), we will express \rho_*(\lambda) through the corresponding eigenvalue density \rho_G(\lambda) of a Gaussian ensemble, whose form is known analytically [17, 28]. Let us follow this strategy in the next section.
IV. FIXED-TRACE AND GAUSSIAN ENSEMBLES
The probability measure for the fixed-trace ensemble is defined as:

P_*(X)\, DX = \mathcal{N}^{-1}\, \Gamma\!\left(\frac{d}{2}\right) \delta\!\left(\mathrm{Tr}\,\bigl(X^{\tau} C^{-1} X A^{-1}\bigr) - 1\right) DX, \qquad (22)

which in the spherical coordinates (15) reads:

P_*(X)\, DX = \frac{2}{S_d}\, \delta(x^2 - 1)\, x^{d-1}\, dx\, D\Omega. \qquad (23)

One can easily check that the integration \rho_*(\lambda) = \int \rho(X, \lambda)\, P_*(X)\, DX indeed gives (21). It is also worth noticing that the normalization condition for P_*(X) is fulfilled.
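Indeed, the normalization check is a one-liner (spelled out here for completeness; our addition): using the spherical form (23),

\int P_*(X)\, DX = \frac{2}{S_d} \int D\Omega \int_0^{\infty} \delta(x^2 - 1)\, x^{d-1}\, dx = \frac{2}{S_d} \cdot S_d \cdot \frac{1}{2} = 1,

since \delta(x^2 - 1) = \frac{1}{2}\,\delta(x - 1) on the positive half-axis, so that \int_0^{\infty} \delta(x^2 - 1)\, x^{d-1}\, dx = \frac{1}{2}.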
Consider now a Gaussian ensemble:

P_G(X)\, DX = \mathcal{N}^{-1}\, f_G\bigl(\mathrm{Tr}\, X^{\tau} C^{-1} X A^{-1}\bigr)\, DX, \quad \text{where} \quad f_G(x^2) = \frac{1}{2^{d/2}}\, e^{-\frac{x^2}{2}}, \qquad (24)
for which the spectrum \rho_G(\lambda) is known, or more precisely can be easily computed numerically in the thermodynamical limit N, T \rightarrow \infty [17, 34, 35]. On the other hand, as we learned in the previous section, the density of eigenvalues of the empirical covariance matrix c can be found by applying Eq. (20) to the Gaussian radial profile (24):

\rho_G(\lambda) = \frac{2^{1-d/2}}{\Gamma(d/2)} \int_0^{\infty} \rho_*\!\left(\frac{\lambda}{x^2}\right) x^{d-3}\, e^{-\frac{x^2}{2}}\, dx. \qquad (25)

Substituting \lambda \rightarrow d\lambda and changing the integration variable to y = x/\sqrt{d}, this can be rewritten as:

d\, \rho_G(d\lambda) = \int_0^{\infty} \rho_*\!\left(\frac{\lambda}{y^2}\right) \frac{1}{y^2} \left[\frac{2^{1-d/2}\, d^{d/2}}{\Gamma(d/2)}\, y^{d-1}\, e^{-\frac{d y^2}{2}}\right] dy. \qquad (26)
One can easily check that the expression in the square brackets tends to a Dirac delta for large matrices, since d then goes to infinity:

\lim_{d \to \infty} \frac{2^{1-d/2}\, d^{d/2}}{\Gamma(d/2)}\, y^{d-1}\, e^{-\frac{d y^2}{2}} = \delta(y - 1), \qquad (27)

and thus the integrand in Eq. (26) becomes localized around the value y = 1. Therefore for large d we can make the following substitution: \rho_*(\lambda) = d\, \rho_G(d\lambda). Inserting it into Eq. (20) and changing the integration variable to y = d\lambda/x^2, we obtain the main formula of this paper:

\rho_f(\lambda) = \frac{d^{d/2}\, \lambda^{d/2-1}}{\Gamma(d/2)} \int_0^{\infty} \rho_G(y)\, f\!\left(\frac{d\lambda}{y}\right) y^{-d/2}\, dy. \qquad (28)
The meaning of this formula is the following: for any random matrix ensemble with a radial measure (7), the eigenvalue density ρ_f(λ) is given by a one-dimensional integral of a combination of the corresponding Gaussian spectrum ρ_G(λ) and the radial profile f(x²). The equation holds in the thermodynamic limit d = NT \rightarrow \infty with r = N/T = const. Since in this limit we are able to calculate the spectrum ρ_G(λ) for arbitrarily chosen A, C, the formula (28) gives us a powerful tool for computing spectra of various distributions. In the next section we shall apply it to the multivariate Student ensemble (9).
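In practice Eq. (28) is a single quadrature. The following sketch (our code, with an interface chosen for convenience, not taken from the paper) evaluates it for a Gaussian spectrum ρ_G supported on [y_min, y_max]; since d = NT is large, the prefactor and the radial profile are combined in log-space to avoid overflow:

import numpy as np
from scipy.integrate import quad
from scipy.special import gammaln

# Eq. (28) as a quadrature.  rho_G : Gaussian spectrum of c;
# log_f : logarithm of the radial profile f, evaluated at x^2 = d*lam/y.
def rho_f(lam, d, rho_G, log_f, y_min, y_max):
    log_pref = (0.5 * d * np.log(d)               # d^{d/2}
                + (0.5 * d - 1.0) * np.log(lam)   # lam^{d/2 - 1}
                - gammaln(0.5 * d))               # 1 / Gamma(d/2)
    def integrand(y):
        return rho_G(y) * np.exp(log_pref + log_f(d * lam / y)
                                 - 0.5 * d * np.log(y))
    val, _ = quad(integrand, y_min, y_max, limit=200)
    return val

For the Student profile (29) derived below one would pass log_f(s) = gammaln((ν+d)/2) − gammaln(ν/2) − (d/2) log ν − ((ν+d)/2) log(1 + s/ν); in the large-d limit this quadrature reproduces the closed formula (30).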
V. MULTIVARIATE STUDENT ENSEMBLE
The radial profile for the Student ensemble (9) is:

f(x^2) \equiv f_{\nu}(x^2) = \frac{\Gamma\left(\frac{\nu+d}{2}\right)}{\Gamma\left(\frac{\nu}{2}\right)\, \nu^{d/2}} \left(1 + \frac{x^2}{\nu}\right)^{-\frac{\nu+d}{2}}. \qquad (29)
We have chosen here the standard convention σ² = ν, since we would like to calculate the spectrum ρ_ν(λ) also for ν ≤ 2 (see the discussion at the end of the first section). Inserting (29) into equation (28) gives:

\rho_{\nu}(\lambda) = \frac{d^{d/2}\, \lambda^{d/2-1}\, \Gamma\left(\frac{\nu+d}{2}\right)}{\Gamma\left(\frac{d}{2}\right) \Gamma\left(\frac{\nu}{2}\right) \nu^{d/2}} \int_0^{\infty} \rho_G(y) \left(1 + \frac{d\lambda}{\nu y}\right)^{-\frac{\nu+d}{2}} y^{-d/2}\, dy.

Using the limit:

\lim_{d \to \infty} \left(\frac{d}{\nu}\right)^{d/2} \frac{\Gamma\left(\frac{\nu+d}{2}\right)}{\Gamma\left(\frac{d}{2}\right)}\, \lambda^{d/2}\, y^{-d/2} \left(1 + \frac{d\lambda}{\nu y}\right)^{-\frac{\nu+d}{2}} = \left(\frac{\nu}{2}\right)^{\nu/2} \lambda^{-\nu/2}\, y^{\nu/2}\, e^{-\frac{\nu y}{2\lambda}},

we obtain in the thermodynamical limit:

\rho_{\nu}(\lambda) = \frac{\nu^{\nu/2}}{2^{\nu/2}\, \Gamma\left(\frac{\nu}{2}\right)}\, \lambda^{-\frac{\nu+2}{2}} \int_0^{\infty} \rho_G(y)\, y^{\nu/2}\, e^{-\frac{\nu y}{2\lambda}}\, dy. \qquad (30)
The formula (30) works for all ν > 0. From the last equation we can infer the behavior of ρ_ν(λ) for large λ. The function ρ_G(y) has compact support [17, 20, 34]; therefore, for large λ the exponential under the integral can be well approximated by 1. The function ρ_ν(λ) thus has a long tail:

\rho_{\nu}(\lambda) \approx \frac{1}{\Gamma\left(\frac{\nu}{2}\right)} \left(\frac{\nu}{2}\right)^{\nu/2} \lambda^{-\frac{\nu}{2}-1} \int_0^{\infty} \rho_G(y)\, y^{\nu/2}\, dy, \qquad (31)
where the integral does not depend on λ. The exponent −ν/2 − 1 of this power law depends on the index ν of the original Student distribution. The change from the power ν to the power ν/2 comes about because c is a quadratic combination of X. The power-law tail in the eigenvalue distribution (31) does not disappear in the limit of large matrices, contrary to the power-law tails in the eigenvalue distribution of an ensemble of matrices whose elements are independently distributed random numbers: for such matrices, for ν > 2, the density ρ(λ) falls into the Gaussian universality class and yields the Wishart spectrum [36]. One should remember that the multivariate Student distribution (9) discussed here does not describe independent degrees of freedom even for A = \mathbb{1}_T and C = \mathbb{1}_N, in which case the degrees of freedom are uncorrelated but not independent. We have learned that the spectrum is unbounded from above. Let us now examine the lower limit of the spectrum. Rewriting Eq. (30) in the form:

\rho_{\nu}(\lambda) = \frac{2\, \nu^{\nu/2}}{\Gamma\left(\frac{\nu}{2}\right)} \int_0^{\infty} \rho_G(2\lambda x)\, x^{\nu/2}\, e^{-\nu x}\, dx, \qquad (32)

we see that as long as λ > 0 the function ρ_ν(λ) is positive, since ρ_G(x) is positive on a finite support. Thus ρ_ν(λ) vanishes only at λ = 0 and is positive for any λ > 0. Contrary to the classical Wishart distribution for the Gaussian measure, the spectrum (30) spreads over the whole positive real semi-axis. On the other hand, taking the ν → ∞ limit of Eq. (32) and using the formula:

\lim_{\nu \to \infty} \frac{2\, \nu^{\nu/2}}{\Gamma\left(\frac{\nu}{2}\right)}\, x^{\nu/2}\, e^{-\nu x} = \delta\!\left(x - \frac{1}{2}\right), \qquad (33)

we obtain \rho_{\infty}(\lambda) = \rho_G(\lambda), as expected, because in this limit the radial profile f_{\nu}(x^2) given by Eq. (29) for the Student distribution reduces to the Gaussian one (24).
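That the kernel in (33) carries unit weight for every ν, and hence can tend to a delta function, can be verified in one line (our addition), using \Gamma\left(\frac{\nu}{2}+1\right) = \frac{\nu}{2}\,\Gamma\left(\frac{\nu}{2}\right):

\int_0^{\infty} \frac{2\, \nu^{\nu/2}}{\Gamma\left(\frac{\nu}{2}\right)}\, x^{\nu/2}\, e^{-\nu x}\, dx = \frac{2\, \nu^{\nu/2}}{\Gamma\left(\frac{\nu}{2}\right)} \cdot \frac{\Gamma\left(\frac{\nu}{2}+1\right)}{\nu^{\nu/2+1}} = \frac{2}{\nu} \cdot \frac{\nu}{2} = 1,

while the integrand is maximal at x = 1/2, where the exponent \frac{\nu}{2}\log x - \nu x is stationary.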
VI. EXAMPLES
Let us first consider the case without correlations: C = \mathbb{1}_N and A = \mathbb{1}_T. The spectrum of the empirical covariance matrix for the Gaussian ensemble is then given by the Wishart distribution:

\rho_G(\lambda) = \frac{1}{2\pi r \lambda} \sqrt{(\lambda_+ - \lambda)(\lambda - \lambda_-)},

where \lambda_{\pm} = (1 \pm \sqrt{r})^2 [14, 15, 16]. The corresponding spectrum (30) for the Student ensemble is then:

\rho_{\nu}(\lambda) = \frac{1}{2\pi r\, \Gamma\left(\frac{\nu}{2}\right)} \left(\frac{\nu}{2\lambda}\right)^{\nu/2} \frac{1}{\lambda} \int_{\lambda_-}^{\lambda_+} y^{\nu/2-1}\, \sqrt{(\lambda_+ - y)(y - \lambda_-)}\; e^{-\frac{\nu y}{2\lambda}}\, dy. \qquad (34)
The integral over dy can easily be computed numerically. Results of this computation for different values of ν are shown in Fig. 1. For increasing ν the spectrum ρ_ν(λ) tends to the Wishart distribution, but even for very large ν it has a tail which touches λ = 0, as follows from Eq. (32).

FIG. 1: Spectra of the covariance matrix c for the Student distribution (9) with C = \mathbb{1}_N and A = \mathbb{1}_T, r = N/T = 0.1, for ν = 1/2, 2, 5, 20 and 100 (thin lines from solid to dotted), calculated using the formula (34) and compared to the uncorrelated Wishart distribution (thick line). One sees that for ν → ∞ the spectra tend to the Wishart distribution.
FIG. 2: Spectra of the empirical covariance matrix c for ν = 0.5, 1 and 2, calculated from Eq. (34) with r = 1/3, compared to experimental data (stair lines) obtained by Monte Carlo generation of finite matrices with N = 50, T = 150. Inset: the left part of the same distributions; points represent the experimental data.
In Fig. 2 we have plotted ρ_ν(λ) for ν = 0.5, 1 and 2 and compared them to experimental results obtained by Monte Carlo generation of random matrices drawn from the corresponding ensemble with the probability measure (9), for which the eigenvalue densities were computed by numerical diagonalization. The agreement is perfect; actually, it is even better than in the Gaussian case for the same size N. A sketch of such a Monte Carlo cross-check is given below.
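A minimal version of this cross-check might look as follows (our sketch, using the same Gaussian/chi-square representation of the measure (9) as in section I; the sample sizes and binning are arbitrary):

import numpy as np

# Monte Carlo check in the spirit of Fig. 2: sample X from the Student
# measure (9) with C = 1_N, A = 1_T, diagonalize c and histogram the
# eigenvalues; the histogram approximates rho_nu(lambda) of Eq. (34).
rng = np.random.default_rng(3)
N, T, nu = 50, 150, 1.0                # r = N/T = 1/3, as in Fig. 2
eigs = []
for _ in range(2_000):
    u = rng.chisquare(nu)              # one global scale per matrix
    X = np.sqrt(nu / u) * rng.standard_normal((N, T))
    eigs.extend(np.linalg.eigvalsh(X @ X.T / T))
counts, edges = np.histogram(eigs, bins=200, range=(0.0, 5.0))
rho = counts / (len(eigs) * np.diff(edges))   # normalized over ALL eigenvalues,
                                              # including the heavy tail beyond 5
# compare 'rho' against the quadrature of Eq. (34)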
FIG. 3: Spectra ρ_ν(λ) for C having two distinct eigenvalues, 1 and Λ, in proportion (1 − p) : p, calculated from Eq. (30) with ρ_G given by formula (35), for r = 1/10, p = 1/2 and Λ = 5. The thick solid line corresponds to the Gaussian case, while the thin lines correspond to ν = 5, 20, 100. These lines are compared with Monte Carlo results obtained by the generation and diagonalization of finite matrices with N = 40, T = 400 (gray lines), which lie almost exactly on top of them and can hardly be distinguished by the naked eye.
As a second example we consider the case when C has two distinct eigenvalues, λ_1 and λ_2, with degeneracies (1 − p)N for λ_1 and pN for λ_2, where 0 ≤ p ≤ 1. Such a covariance matrix can be used to model the simplest effect of sectorization on a stock exchange. For example, if all diagonal elements of the matrix C are equal to 1 and all off-diagonal elements are equal to ρ_0 (0 < ρ_0 < 1), the model can be used to mimic a collective behavior of the market [9, 10]. In this case λ_1 = 1 − ρ_0 has degeneracy N − 1 and λ_2 = 1 + (N − 1)ρ_0 is non-degenerate, hence p = 1/N. The eigenvector corresponding to the larger eigenvalue λ_2 can be thought of as describing correlations common to all stocks. For our purposes it is, however, more convenient to set λ_1 = 1 and to treat Λ ≡ λ_2 and p as arbitrary numbers, with p between zero and one. The corresponding Wishart spectrum ρ_G(λ) can be obtained by solving the equations given by a conformal map [20]. The resulting spectrum has the form:

\rho_G(\lambda) = \frac{1}{\pi}\, \mathrm{Im}\, \frac{M(Z(\lambda))}{\lambda}, \qquad (35)

where

M(Z) = \frac{1-p}{Z-1} + \frac{p\Lambda}{Z-\Lambda}, \qquad (36)

Z(\lambda) = -\frac{a}{3} + \frac{(1 - i\sqrt{3})(3b - a^2)}{3 \cdot 2^{2/3}\, E} - \frac{(1 + i\sqrt{3})\, E}{6 \cdot 2^{1/3}}, \qquad (37)

E = \left(3\sqrt{3}\, \sqrt{27c^2 - 18abc + 4a^3 c + 4b^3 - a^2 b^2} - 27c + 9ab - 2a^3\right)^{1/3}, \qquad (38)

with a = r - 1 - pr - (1 - pr)\Lambda - \lambda, b = (\Lambda + 1)\lambda + (1 - r)\Lambda and c = -\lambda\Lambda. Inserting the above formula into Eq. (30) we obtain an integral which can be computed numerically for arbitrary r, ν, p. In Fig. 3 we show examples of this computation for different values of the index ν. In the same figure we compare the analytic results with those obtained by Monte Carlo generation and numerical diagonalization of random matrices with N = 40, T = 400. As before, the agreement between the analytic and Monte Carlo results is perfect. We see that the effect of introducing heavy tails on the spectrum increases with decreasing ν. When ν decreases from infinity to zero, the two disjoint islands of the distribution develop a bridge and eventually end up as a distribution with only one connected component.
VII. SUMMARY
In this paper we have developed a method for computing the spectral densities of empirical covariance matrices for a wide class of quasi-Wishart ensembles with radial probability measures. In particular, we have applied this method to determine the spectral density of the empirical covariance matrix for heavy-tailed data described by a multivariate Student distribution. We have shown that the spectrum ρ_ν(λ) decays like λ^{−ν/2−1}, where ν is the index of the Student distribution. The case ν = 3 is of particular importance, since it can be used in modeling stock markets. The eigenvalue density spreads over the whole positive semi-axis, in contrast to the Wishart spectrum, which has a finite support.
We have also derived a general formula for the eigenvalue spectrum of the empirical covariance matrix for radial ensembles. The spectrum is given by a one-dimensional integral, which can be easily computed numerically. The method works also in the case of correlated assets.
Acknowledgements
We would like to thank Jerzy Jurkiewicz, Maciej A. Nowak, Gabor Papp and Ismail Zahed for many inspiring discussions. This work was supported by Polish Ministry of Science and Information Society Technologies grants: 2P03B-08225 (2003-2006) and 1P03B-04029 (2005-2008) and EU grants: MTKD-CT-2004-517186 (COCOS) and MRTN-CT-2004-005616 (ENRAGE).
[1] T. Guhr, A. Müller-Groeling, H. A. Weidenmüller, Phys. Rept. 299 (1998) 189.
[2] A. L. Moustakas et al., Science 287 (2000) 287.
[3] A. M. Sengupta and P. P. Mitra, physics/0010081.
[4] R. Müller, IEEE Transactions on Information Theory 48 (2002) 2495.
[5] S. E. Skipetrov, Phys. Rev. E 67 (2003) 036621.
[6] L. Laloux, P. Cizeau, J.-P. Bouchaud and M. Potters, Phys. Rev. Lett. 83 (1999) 1467.
[7] V. Plerou, P. Gopikrishnan, B. Rosenow, L. A. N. Amaral, T. Guhr and H. E. Stanley, Phys. Rev. E 65 (2002) 066126.
[8] A. Utsugi, K. Ino and M. Oshikawa, Phys. Rev. E 70 (2004) 026110.
[9] S. Pafka and I. Kondor, Physica A 319 (2003) 487; Physica A 343 (2004) 623.
[10] G. Papp, S. Pafka, M. A. Nowak and I. Kondor, Acta Phys. Polon. B 36 (2005) 2757.
[11] T. Guhr and B. Kälber, J. Phys. A 36 (2003) 3009.
[12] Y. Malevergne and D. Sornette, Physica A 331 (2004) 660.
[13] Z. Burda and J. Jurkiewicz, Physica A 344 (2004) 67.
[14] V. A. Marčenko and L. A. Pastur, Math. USSR-Sb. 1 (1967) 457.
[15] Z. D. Bai and J. W. Silverstein, J. Multivariate Anal. 54 (1995) 175.
[16] S. I. Choi and J. W. Silverstein, J. Multivariate Anal. 54 (1995) 295.
[17] Z. Burda, J. Jurkiewicz, B. Waclaw, Phys. Rev. E 71 (2005) 026111.
[18] J. Feinberg, A. Zee, J. Stat. Phys. 87 (1997) 473.
[19] A. M. Sengupta and P. P. Mitra, Phys. Rev. E 60 (1999) 3389.
[20] Z. Burda, A. Görlich, A. Jarosz, J. Jurkiewicz, Physica A 343 (2004) 295.
[21] P. Repetowicz, P. Richmond, math-ph/0411020.
[22] P. Cizeau, J. P. Bouchaud, Phys. Rev. E 50 (1994) 1810.
[23] P. Gopikrishnan, M. Meyer, L. A. N. Amaral, H. E. Stanley, Eur. Phys. J. B 3 (1998) 139.
[24] P. Gopikrishnan, V. Plerou, L. A. N. Amaral, M. Meyer, H. E. Stanley, Phys. Rev. E 60 (1999) 5305.
[25] R. Rak, S. Drozdz, J. Kwapien, physics/0603071.
[26] Z. Burda, J. Jurkiewicz, M. A. Nowak, G. Papp, I. Zahed, Physica A 343 (2004) 694.
[27] Z. Burda, J. Jurkiewicz, M. A. Nowak, Acta Phys. Polon. B 34 (2003) 87.
[28] Z. Burda, A. Jarosz, J. Jurkiewicz, M. A. Nowak, G. Papp, I. Zahed, physics/0603024.
[29] G. Le Caër, R. Delannay, Phys. Rev. E 59 (1999) 6281.
[30] R. Delannay, G. Le Caër, J. Phys. A 33 (2000) 2611.
[31] F. Toscano, R. O. Vallejos, C. Tsallis, Phys. Rev. E 69 (2004) 066131.
[32] A. C. Bertuola, O. Bohigas, M. P. Pato, Phys. Rev. E 70 (2004) 065102.
[33] A. Y. Abul-Magd, Phys. Rev. E 71 (2005) 066207.
[34] Z. Burda, A. Görlich, J. Jurkiewicz and B. Waclaw, Eur. Phys. J. B 49 (2006) 319.
[35] Z. Burda, J. Jurkiewicz and B. Waclaw, Acta Phys. Pol. B 36 (2005) 2641.
[36] E. Brezin, A. Zee, Nucl. Phys. B 402 (1993) 613.