Abstract
This chapter introduces a unified approach to the high- and low-dimensional cases of matricial shrinkage estimation of a normal mean matrix with unknown covariance matrix. The historical background is briefly reviewed, and matricial shrinkage estimators are motivated from an empirical Bayes method. An unbiased risk estimate is developed in a unified manner for a class of estimators covering all possible orderings of the sample size and the dimensions. Specific examples of matricial shrinkage estimators are provided, and some related topics are discussed.
Appendix
This appendix gives brief proofs of useful results on matrix differential operators that were applied in the proof of Theorem 6.1.
Let \({\varvec{X}}=(x_{ab})\in \mathbb {R}^{m\times p}\) and \({\varvec{Y}}=(y_{ab})\in \mathbb {R}^{n\times p}\). Denote the matrix differential operators with respect to \({\varvec{X}}\) and \({\varvec{Y}}\), respectively, by \(\nabla _X=\big (\mathrm{d}_{ab}^X\big )\) with \(\mathrm{d}_{ab}^X=\partial /\partial x_{ab}\) and by \(\nabla _Y=\big (\mathrm{d}_{ab}^Y\big )\) with \(\mathrm{d}_{ab}^Y=\partial /\partial y_{ab}\). Let
Here, the eigenvalue decomposition of \({\varvec{X}}{\varvec{S}}^+{\varvec{X}}^\top \) is \({\varvec{X}}{\varvec{S}}^+{\varvec{X}}^\top ={\varvec{R}}{\varvec{F}}{\varvec{R}}^\top \), where \({\varvec{F}}=\mathrm{\,diag\,}(f_1,\ldots ,f_\tau )\in \mathbb {D}_\tau ^{(\ge 0)}\) and \({\varvec{R}}=(r_{ij})\in \mathbb {V}_{m,\tau }\).
Lemma 6.5
For \(i=1,\ldots ,\tau \), \(k=1,\ldots ,m\), \(a=1,\ldots ,m\) and \(b=1,\ldots ,p\), we have
(i) \(\mathrm{d}_{ab}^X f_i=A_{ab}^{ii}\),

(ii) \(\displaystyle \mathrm{d}_{ab}^X r_{ki}=\sum _{j\ne i}^\tau \frac{r_{kj}A_{ab}^{ij}}{f_i-f_j}+f_i^{-1}\{{\varvec{I}}_m-{\varvec{R}}{\varvec{R}}^\top \}_{ka}\{{\varvec{R}}^\top {\varvec{X}}{\varvec{S}}^+\}_{ib}\),

where \(A_{ab}^{ij}=r_{aj}\{{\varvec{R}}^\top {\varvec{X}}{\varvec{S}}^+\}_{ib}+r_{ai}\{{\varvec{R}}^\top {\varvec{X}}{\varvec{S}}^+\}_{jb}\).
For \(i=1,\ldots ,\tau \), \(k=1,\ldots ,m\), \(a=1,\ldots ,n\) and \(b=1,\ldots ,p\), we have
(iii) \(\mathrm{d}_{ab}^Y f_i=B_{ab}^{ii}\),

(iv) \(\displaystyle \mathrm{d}_{ab}^Y r_{ki}=\sum _{j\ne i}^\tau \frac{r_{kj}B_{ab}^{ij}}{f_i-f_j}+f_i^{-1}\{{\varvec{R}}^\top {\varvec{X}}{\varvec{S}}^+{\varvec{S}}^+{\varvec{Y}}^\top \}_{ia}\{({\varvec{I}}_m-{\varvec{R}}{\varvec{R}}^\top ){\varvec{X}}({\varvec{I}}_p-{\varvec{S}}{\varvec{S}}^+)\}_{kb}\),
where
Proof
Take \({\varvec{R}}_*\in \mathbb {V}_{m,m-\tau }\) such that \({\varvec{R}}_*^\top {\varvec{R}}=\mathbf{0}_{(m-\tau )\times \tau }\). Define \({\varvec{R}}_0=({\varvec{R}},{\varvec{R}}_*)\in \mathbb {O}_m\). Denote \({\varvec{F}}_0=\mathrm{\,diag\,}(f_1,\ldots ,f_\tau ,0,\ldots ,0)\ (\in \mathbb {D}_m^{(\ge 0)})\). Then \({\varvec{X}}{\varvec{S}}^+{\varvec{X}}^\top ={\varvec{R}}_0{\varvec{F}}_0{\varvec{R}}_0^\top \).
Differentiating both sides of \({\varvec{R}}_0^\top {\varvec{R}}_0={\varvec{I}}_m\) gives \([\mathrm{d}{\varvec{R}}_0^\top ]{\varvec{R}}_0+{\varvec{R}}_0^\top \mathrm{d}{\varvec{R}}_0=\mathbf{0}_{m\times m}\), implying that \({\varvec{R}}_0^\top \mathrm{d}{\varvec{R}}_0\) is skew-symmetric in \(\mathbb {R}^{m\times m}\). Thus, for \(j,i\in \{1,\ldots ,m\}\), \(\{{\varvec{R}}_0^\top \mathrm{d}{\varvec{R}}_0\}_{ji}=0\) if \(j=i\) and \(\{{\varvec{R}}_0^\top \mathrm{d}{\varvec{R}}_0\}_{ji}=-\{{\varvec{R}}_0^\top \mathrm{d}{\varvec{R}}_0\}_{ij}=-\{[\mathrm{d}{\varvec{R}}_0^\top ]{\varvec{R}}_0\}_{ji}\) otherwise. Differentiating both sides of \({\varvec{X}}{\varvec{S}}^+{\varvec{X}}^\top ={\varvec{R}}_0{\varvec{F}}_0{\varvec{R}}_0^\top \) gives
and then
Comparing each element in both sides of the above identity, we have
Note that \(\mathrm{d}_{ab}^X {\varvec{X}}={\varvec{E}}_{ab}\), where \({\varvec{E}}_{ab}\in \mathbb {R}^{m\times p}\) is the matrix whose (a, b)-th element is one and whose other elements are zero. Since \(\mathrm{d}_{ab}^X ({\varvec{X}}{\varvec{S}}^+{\varvec{X}}^\top )=[\mathrm{d}_{ab}^X {\varvec{X}}]{\varvec{S}}^+{\varvec{X}}^\top +{\varvec{X}}{\varvec{S}}^+[\mathrm{d}_{ab}^X{\varvec{X}}^\top ]={\varvec{E}}_{ab}{\varvec{S}}^+{\varvec{X}}^\top +{\varvec{X}}{\varvec{S}}^+{\varvec{E}}_{ab}^\top \), we observe that, for \(j,i\in \{1,\ldots ,\tau \}\),
Thus, for \(i=1,\ldots ,\tau \), \(\mathrm{d}_{ab}^X f_i=\{{\varvec{R}}^\top [\mathrm{d}_{ab}^X({\varvec{X}}{\varvec{S}}^+{\varvec{X}}^\top )]{\varvec{R}}\}_{ii}=A_{ab}^{ii}\), which shows (i).
On the other hand, it is observed that for \(k=1,\ldots ,m\) and \(i=1,\ldots ,\tau \)
In a similar way to (6.40), for \(k=1,\ldots ,m\) and \(i=1,\ldots ,\tau \),
Here, \({\varvec{R}}_*^\top {\varvec{X}}{\varvec{S}}^+=\mathbf{0}_{(m-\tau )\times p}\), so that
Substituting (6.42) into (6.41), we obtain (ii).
Since \(\{{\varvec{R}}^\top [\mathrm{d}_{ab}^Y({\varvec{X}}{\varvec{S}}^+{\varvec{X}}^\top )]{\varvec{R}}\}_{ji}=\{{\varvec{R}}^\top {\varvec{X}}[\mathrm{d}_{ab}^Y {\varvec{S}}^+]{\varvec{X}}^\top {\varvec{R}}\}_{ji}\) for \(j,i\in \{1,\ldots ,\tau \}\), it is observed from (iii) of Lemma 5.2 that
Similarly,
for \(k=1,\ldots ,m\) and \(i=1,\ldots ,\tau \). Hence using the same arguments as in the proofs of (i) and (ii) yields (iii) and (iv). \(\square \)
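Parts (i) and (ii) of Lemma 6.5 can be checked numerically by comparing the stated derivatives against finite differences. The sketch below, a hedged illustration rather than part of the chapter, picks arbitrary sizes with \(n<p\) (so \({\varvec{S}}={\varvec{Y}}^\top {\varvec{Y}}\) is singular and the Moore-Penrose inverse is genuinely needed) and a generic case with \(\tau =m\), in which \({\varvec{R}}{\varvec{R}}^\top ={\varvec{I}}_m\) and the second term of (ii) vanishes:

```python
import numpy as np

rng = np.random.default_rng(0)
m, p, n = 3, 6, 4                 # illustrative sizes; n < p makes S singular
X = rng.standard_normal((m, p))
Y = rng.standard_normal((n, p))
Sp = np.linalg.pinv(Y.T @ Y)      # Moore-Penrose inverse S^+

def eig_desc(X):
    """Eigen-decomposition of X S^+ X^T, eigenvalues in descending order."""
    f, R = np.linalg.eigh(X @ Sp @ X.T)
    idx = np.argsort(f)[::-1]
    return f[idx], R[:, idx]

f, R = eig_desc(X)
G = R.T @ X @ Sp                  # the matrix R^T X S^+ appearing in A_{ab}^{ij}
A = lambda i, j, a, b: R[a, j] * G[i, b] + R[a, i] * G[j, b]

a, b, i = 1, 2, 0
# (i): d f_i / d x_{ab} = A_{ab}^{ii}
df_analytic = A(i, i, a, b)
# (ii): d r_{ki} / d x_{ab}; here tau = m, so R R^T = I_m and the
# second term of (ii) is identically zero
dr_analytic = sum(R[:, j] * A(i, j, a, b) / (f[i] - f[j])
                  for j in range(m) if j != i)

# forward finite differences in the (a, b)-th entry of X
eps = 1e-6
Xp = X.copy(); Xp[a, b] += eps
fp, Rp = eig_desc(Xp)
Rp *= np.sign(np.sum(Rp * R, axis=0))   # fix the eigenvector sign ambiguity
df_fd = (fp[i] - f[i]) / eps
dr_fd = (Rp[:, i] - R[:, i]) / eps

print(abs(df_fd - df_analytic), np.max(np.abs(dr_fd - dr_analytic)))
```

Both printed discrepancies should be of the order of the step size \(\varepsilon \); the sign fix is needed because `eigh` returns eigenvectors only up to sign, while \(A_{ab}^{ij}\) itself is invariant under \(r_i\mapsto -r_i\).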
Lemma 6.6
Let \(c_0=|n\wedge p-m|+1\). Define \({\varvec{\Phi }}=\mathrm{\,diag\,}(\phi _1,\ldots ,\phi _\tau )\in \mathbb {D}_\tau \) such that the diagonals of \({\varvec{\Phi }}\) are absolutely continuous functions of \({\varvec{F}}\). Then
where \({\varvec{\Phi }}^*=\mathrm{\,diag\,}(\phi _1^*,\ldots ,\phi _\tau ^*)\) and for \(i=1,\ldots ,\tau \)
In particular,
Proof
For \(a,c\in \{1,\ldots ,\tau \}\), the (a, c)-th element of \(\nabla _X{\varvec{S}}{\varvec{S}}^+{\varvec{X}}^\top {\varvec{R}}{\varvec{\Phi }}{\varvec{R}}^\top \) is
and then
where
Since \(\mathrm{d}_{ab}^X x_{kd}={\delta }_{ak}{\delta }_{bd}\) and \({\varvec{S}}{\varvec{S}}^+\) is idempotent with rank \(n\wedge p\), it follows that
To evaluate \(D_{ac}^{(2)}\), we first use the chain rule and (i) of Lemma 6.5 to obtain
Note that \({\varvec{S}}^+{\varvec{S}}{\varvec{S}}^+={\varvec{S}}^+\) and
so that
Finally, we consider \(D_{ac}^{(3)}\). Using (ii) of Lemma 6.5 yields
Since
it is seen that
Similarly,
Combining (6.48) and (6.49) gives
Substituting (6.46), (6.47) and (6.50) into (6.45) yields (6.43).
Note that \(n\wedge p-\tau +1+\mathrm{\,tr\,}({\varvec{I}}_m-{\varvec{R}}{\varvec{R}}^\top )=n\wedge p+m-2\tau +1=|n\wedge p-m|+1=c_0\) and also that
Hence taking the trace of (6.43) yields (6.44), which completes the proof. \(\square \)
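The counting identity at the start of this step can be verified case by case. Since \({\varvec{R}}\in \mathbb {V}_{m,\tau }\) gives \(\mathrm{\,tr\,}({\varvec{I}}_m-{\varvec{R}}{\varvec{R}}^\top )=m-\tau \), and assuming \(\tau =m\wedge (n\wedge p)\) (which the identity itself forces), a short check reads:

\[
n\wedge p+m-2\tau +1=
\begin{cases}
(n\wedge p)-m+1, & \text {if } n\wedge p\ge m\ (\tau =m),\\
m-(n\wedge p)+1, & \text {if } n\wedge p<m\ (\tau =n\wedge p),
\end{cases}
\]

so that in either case the sum equals \(|n\wedge p-m|+1=c_0\).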
Lemma 6.7
Let \(c_1=n-(n\wedge p)+\tau -2\), \(c_2=p-(n\wedge p)+\tau -1\) and \(c_0=c_1+c_2\). Let \({\varvec{\Phi }}={\varvec{\Phi }}({\varvec{F}})=\mathrm{\,diag\,}(\phi _1,\ldots ,\phi _\tau )\), where the \(\phi _i\)’s are absolutely continuous functions of \({\varvec{F}}\). Then
where \({\varvec{\Phi }}^{*k}=\mathrm{\,diag\,}(\phi _1^{*k},\ldots ,\phi _\tau ^{*k})\) for \(k=1,2\) and, for \(i=1,\ldots ,\tau \),
In particular,
Proof
The proofs of (6.51) and (6.52) follow from the same arguments as in the proof of Lemma 6.6. Since
the identity (6.53) can be verified by combining (6.51) and (6.52). \(\square \)
Copyright information
© 2020 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
Cite this chapter
Tsukuma, H., Kubokawa, T. (2020). Estimation of the Mean Matrix. In: Shrinkage Estimation for Mean and Covariance Matrices. SpringerBriefs in Statistics. Springer, Singapore. https://doi.org/10.1007/978-981-15-1596-5_6
Print ISBN: 978-981-15-1595-8
Online ISBN: 978-981-15-1596-5