Abstract
This chapter introduces a unified approach to the high- and low-dimensional cases of matricial shrinkage estimation of a normal mean matrix with unknown covariance matrix. The historical background is briefly reviewed, and matricial shrinkage estimators are motivated from an empirical Bayes method. An unbiased risk estimate is developed in a unified manner for a class of estimators covering all possible orderings of the sample size and the dimensions. Specific examples of matricial shrinkage estimators are provided, and some related topics are discussed.
Appendix
This appendix gives brief proofs of useful results on matrix differential operators that were applied in the proof of Theorem 6.1.
Let \({\varvec{X}}=(x_{ab})\in \mathbb {R}^{m\times p}\) and \({\varvec{Y}}=(y_{ab})\in \mathbb {R}^{n\times p}\). Denote the matrix differential operators with respect to \({\varvec{X}}\) and \({\varvec{Y}}\), respectively, by \(\nabla _X=\big (\mathrm{d}_{ab}^X\big )\) with \(\mathrm{d}_{ab}^X=\partial /\partial x_{ab}\) and by \(\nabla _Y=\big (\mathrm{d}_{ab}^Y\big )\) with \(\mathrm{d}_{ab}^Y=\partial /\partial y_{ab}\). Let
Here, the eigenvalue decomposition of \({\varvec{X}}{\varvec{S}}^+{\varvec{X}}^\top \) is \({\varvec{X}}{\varvec{S}}^+{\varvec{X}}^\top ={\varvec{R}}{\varvec{F}}{\varvec{R}}^\top \), where \({\varvec{F}}=\mathrm{\,diag\,}(f_1,\ldots ,f_\tau )\in \mathbb {D}_\tau ^{(\ge 0)}\) and \({\varvec{R}}=(r_{ij})\in \mathbb {V}_{m,\tau }\).
Lemma 6.5
For \(i=1,\ldots ,\tau \), \(k=1,\ldots ,m\), \(a=1,\ldots ,m\) and \(b=1,\ldots ,p\), we have
(i) \(\mathrm{d}_{ab}^X f_i=A_{ab}^{ii}\),

(ii) \(\displaystyle \mathrm{d}_{ab}^X r_{ki}=\sum _{j\ne i}^\tau \frac{r_{kj}A_{ab}^{ij}}{f_i-f_j}+f_i^{-1}\{{\varvec{I}}_m-{\varvec{R}}{\varvec{R}}^\top \}_{ka}\{{\varvec{R}}^\top {\varvec{X}}{\varvec{S}}^+\}_{ib}\),

where \(A_{ab}^{ij}=r_{aj}\{{\varvec{R}}^\top {\varvec{X}}{\varvec{S}}^+\}_{ib}+r_{ai}\{{\varvec{R}}^\top {\varvec{X}}{\varvec{S}}^+\}_{jb}\).
For \(i=1,\ldots ,\tau \), \(k=1,\ldots ,m\), \(a=1,\ldots ,n\) and \(b=1,\ldots ,p\), we have
(iii) \(\mathrm{d}_{ab}^Y f_i=B_{ab}^{ii}\),

(iv) \(\displaystyle \mathrm{d}_{ab}^Y r_{ki}=\sum _{j\ne i}^\tau \frac{r_{kj}B_{ab}^{ij}}{f_i-f_j}+f_i^{-1}\{{\varvec{R}}^\top {\varvec{X}}{\varvec{S}}^+{\varvec{S}}^+{\varvec{Y}}^\top \}_{ia}\{({\varvec{I}}_m-{\varvec{R}}{\varvec{R}}^\top ){\varvec{X}}({\varvec{I}}_p-{\varvec{S}}{\varvec{S}}^+)\}_{kb}\),
where
Proof
Take \({\varvec{R}}_*\in \mathbb {V}_{m,m-\tau }\) such that \({\varvec{R}}_*^\top {\varvec{R}}=\mathbf{0}_{(m-\tau )\times \tau }\). Define \({\varvec{R}}_0=({\varvec{R}},{\varvec{R}}_*)\in \mathbb {O}_m\). Denote \({\varvec{F}}_0=\mathrm{\,diag\,}(f_1,\ldots ,f_\tau ,0,\ldots ,0)\ (\in \mathbb {D}_m^{(\ge 0)})\). Then \({\varvec{X}}{\varvec{S}}^+{\varvec{X}}^\top ={\varvec{R}}_0{\varvec{F}}_0{\varvec{R}}_0^\top \).
Differentiating both sides of \({\varvec{R}}_0^\top {\varvec{R}}_0={\varvec{I}}_m\) gives \([\mathrm{d}{\varvec{R}}_0^\top ]{\varvec{R}}_0+{\varvec{R}}_0^\top \mathrm{d}{\varvec{R}}_0=\mathbf{0}_{m\times m}\), implying that \({\varvec{R}}_0^\top \mathrm{d}{\varvec{R}}_0\) is skew-symmetric in \(\mathbb {R}^{m\times m}\). Thus, for \(j,i\in \{1,\ldots ,m\}\), \(\{{\varvec{R}}_0^\top \mathrm{d}{\varvec{R}}_0\}_{ji}=0\) if \(j=i\) and \(\{{\varvec{R}}_0^\top \mathrm{d}{\varvec{R}}_0\}_{ji}=-\{{\varvec{R}}_0^\top \mathrm{d}{\varvec{R}}_0\}_{ij}=-\{[\mathrm{d}{\varvec{R}}_0^\top ]{\varvec{R}}_0\}_{ji}\) otherwise. Differentiating both sides of \({\varvec{X}}{\varvec{S}}^+{\varvec{X}}^\top ={\varvec{R}}_0{\varvec{F}}_0{\varvec{R}}_0^\top \) gives
and then
Comparing each element in both sides of the above identity, we have
Note that \(\mathrm{d}_{ab}^X {\varvec{X}}={\varvec{E}}_{ab}\), where \({\varvec{E}}_{ab}\in \mathbb {R}^{m\times p}\) is the matrix whose (a, b)-th element is one and whose other elements are zero. Since \(\mathrm{d}_{ab}^X ({\varvec{X}}{\varvec{S}}^+{\varvec{X}}^\top )=[\mathrm{d}_{ab}^X {\varvec{X}}]{\varvec{S}}^+{\varvec{X}}^\top +{\varvec{X}}{\varvec{S}}^+[\mathrm{d}_{ab}^X{\varvec{X}}^\top ]={\varvec{E}}_{ab}{\varvec{S}}^+{\varvec{X}}^\top +{\varvec{X}}{\varvec{S}}^+{\varvec{E}}_{ab}^\top \), we observe that, for \(j,i\in \{1,\ldots ,\tau \}\),
Thus, for \(i=1,\ldots ,\tau \), \(\mathrm{d}_{ab}^X f_i=\{{\varvec{R}}^\top [\mathrm{d}_{ab}^X({\varvec{X}}{\varvec{S}}^+{\varvec{X}}^\top )]{\varvec{R}}\}_{ii}=A_{ab}^{ii}\), which shows (i).
On the other hand, it is observed that for \(k=1,\ldots ,m\) and \(i=1,\ldots ,\tau \)
In a similar way to (6.40), for \(k=1,\ldots ,m\) and \(i=1,\ldots ,\tau \),
Here, \({\varvec{R}}_*^\top {\varvec{X}}{\varvec{S}}^+=\mathbf{0}_{(m-\tau )\times p}\), so that
Substituting (6.42) into (6.41), we obtain (ii).
Since \(\{{\varvec{R}}^\top [\mathrm{d}_{ab}^Y({\varvec{X}}{\varvec{S}}^+{\varvec{X}}^\top )]{\varvec{R}}\}_{ji}=\{{\varvec{R}}^\top {\varvec{X}}[\mathrm{d}_{ab}^Y {\varvec{S}}^+]{\varvec{X}}^\top {\varvec{R}}\}_{ji}\) for \(j,i\in \{1,\ldots ,\tau \}\), it is observed from (iii) of Lemma 5.2 that
Similarly,
for \(k=1,\ldots ,m\) and \(i=1,\ldots ,\tau \). Hence using the same arguments as in the proofs of (i) and (ii) yields (iii) and (iv). \(\square \)
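Parts (i) and (ii) of Lemma 6.5 can be checked numerically by comparing the stated derivatives against finite differences. The sketch below, a hedged illustration rather than part of the chapter, picks arbitrary sizes with \(n<p\) (so \({\varvec{S}}={\varvec{Y}}^\top {\varvec{Y}}\) is singular and the Moore-Penrose inverse is genuinely needed) and a generic case with \(\tau =m\), in which \({\varvec{R}}{\varvec{R}}^\top ={\varvec{I}}_m\) and the second term of (ii) vanishes:

```python
import numpy as np

rng = np.random.default_rng(0)
m, p, n = 3, 6, 4                 # illustrative sizes; n < p makes S singular
X = rng.standard_normal((m, p))
Y = rng.standard_normal((n, p))
Sp = np.linalg.pinv(Y.T @ Y)      # Moore-Penrose inverse S^+

def eig_desc(X):
    """Eigen-decomposition of X S^+ X^T, eigenvalues in descending order."""
    f, R = np.linalg.eigh(X @ Sp @ X.T)
    idx = np.argsort(f)[::-1]
    return f[idx], R[:, idx]

f, R = eig_desc(X)
G = R.T @ X @ Sp                  # the matrix R^T X S^+ appearing in A_{ab}^{ij}
A = lambda i, j, a, b: R[a, j] * G[i, b] + R[a, i] * G[j, b]

a, b, i = 1, 2, 0
# (i): d f_i / d x_{ab} = A_{ab}^{ii}
df_analytic = A(i, i, a, b)
# (ii): d r_{ki} / d x_{ab}; here tau = m, so R R^T = I_m and the
# second term of (ii) is identically zero
dr_analytic = sum(R[:, j] * A(i, j, a, b) / (f[i] - f[j])
                  for j in range(m) if j != i)

# forward finite differences in the (a, b)-th entry of X
eps = 1e-6
Xp = X.copy(); Xp[a, b] += eps
fp, Rp = eig_desc(Xp)
Rp *= np.sign(np.sum(Rp * R, axis=0))   # fix the eigenvector sign ambiguity
df_fd = (fp[i] - f[i]) / eps
dr_fd = (Rp[:, i] - R[:, i]) / eps

print(abs(df_fd - df_analytic), np.max(np.abs(dr_fd - dr_analytic)))
```

Both printed discrepancies should be of the order of the step size \(\varepsilon \); the sign fix is needed because `eigh` returns eigenvectors only up to sign, while \(A_{ab}^{ij}\) itself is invariant under \(r_i\mapsto -r_i\).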
Lemma 6.6
Let \(c_0=|n\wedge p-m|+1\). Define \({\varvec{\Phi }}=\mathrm{\,diag\,}(\phi _1,\ldots ,\phi _\tau )\in \mathbb {D}_\tau \) such that the diagonals of \({\varvec{\Phi }}\) are absolutely continuous functions of \({\varvec{F}}\). Then
where \({\varvec{\Phi }}^*=\mathrm{\,diag\,}(\phi _1^*,\ldots ,\phi _\tau ^*)\) and for \(i=1,\ldots ,\tau \)
In particular,
Proof
For \(a,c\in \{1,\ldots ,\tau \}\), the (a, c)-th element of \(\nabla _X{\varvec{S}}{\varvec{S}}^+{\varvec{X}}^\top {\varvec{R}}{\varvec{\Phi }}{\varvec{R}}^\top \) is
and then
where
Since \(\mathrm{d}_{ab}^X x_{kd}={\delta }_{ak}{\delta }_{bd}\) and \({\varvec{S}}{\varvec{S}}^+\) is idempotent with rank \(n\wedge p\), it follows that
To evaluate \(D_{ac}^{(2)}\), we first use the chain rule and (i) of Lemma 6.5 to obtain
Note that \({\varvec{S}}^+{\varvec{S}}{\varvec{S}}^+={\varvec{S}}^+\) and
so that
Finally, we consider \(D_{ac}^{(3)}\). Using (ii) of Lemma 6.5 yields
Since
it is seen that
Similarly,
Combining (6.48) and (6.49) gives
Substituting (6.46), (6.47) and (6.50) into (6.45) yields (6.43).
Note that \(n\wedge p-\tau +1+\mathrm{\,tr\,}({\varvec{I}}_m-{\varvec{R}}{\varvec{R}}^\top )=n\wedge p+m-2\tau +1=|n\wedge p-m|+1=c_0\) and also that
Hence taking the trace of (6.43) yields (6.44), which completes the proof. \(\square \)
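The counting identity at the start of this step can be verified case by case. Since \({\varvec{R}}\in \mathbb {V}_{m,\tau }\) gives \(\mathrm{\,tr\,}({\varvec{I}}_m-{\varvec{R}}{\varvec{R}}^\top )=m-\tau \), and assuming \(\tau =m\wedge (n\wedge p)\) (which the identity itself forces), a short check reads:

\[
n\wedge p+m-2\tau +1=
\begin{cases}
(n\wedge p)-m+1, & \text {if } n\wedge p\ge m\ (\tau =m),\\
m-(n\wedge p)+1, & \text {if } n\wedge p<m\ (\tau =n\wedge p),
\end{cases}
\]

so that in either case the sum equals \(|n\wedge p-m|+1=c_0\).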
Lemma 6.7
Let \(c_1=n-(n\wedge p)+\tau -2\), \(c_2=p-(n\wedge p)+\tau -1\) and \(c_0=c_1+c_2\). Let \({\varvec{\Phi }}={\varvec{\Phi }}({\varvec{F}})=\mathrm{\,diag\,}(\phi _1,\ldots ,\phi _\tau )\), where the \(\phi _i\)’s are absolutely continuous functions of \({\varvec{F}}\). Then
where \({\varvec{\Phi }}^{*k}=\mathrm{\,diag\,}(\phi _1^{*k},\ldots ,\phi _\tau ^{*k})\) for \(k=1,2\) and, for \(i=1,\ldots ,\tau \),
In particular,
Proof
The proofs of (6.51) and (6.52) follow from the same arguments as in the proof of Lemma 6.6. Since
the identity (6.53) can be verified by combining (6.51) and (6.52). \(\square \)
Copyright information
© 2020 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
Cite this chapter
Tsukuma, H., Kubokawa, T. (2020). Estimation of the Mean Matrix. In: Shrinkage Estimation for Mean and Covariance Matrices. SpringerBriefs in Statistics. Springer, Singapore. https://doi.org/10.1007/978-981-15-1596-5_6
Print ISBN: 978-981-15-1595-8
Online ISBN: 978-981-15-1596-5