
Estimation of the Mean Matrix

Shrinkage Estimation for Mean and Covariance Matrices

Part of the book series: SpringerBriefs in Statistics (JSSRES)


Abstract

This chapter introduces a unified approach to the high- and low-dimensional cases for matricial shrinkage estimation of a normal mean matrix with an unknown covariance matrix. The historical background is briefly explained, and matricial shrinkage estimators are motivated by an empirical Bayes method. An unbiased estimate of risk is developed in a unified manner for a class of estimators covering all possible orderings of the sample size and the dimensions. Specific examples of matricial shrinkage estimators are provided, and some related topics are discussed.



Correspondence to Hisayuki Tsukuma.

Appendix

This appendix provides brief proofs of useful results on matrix differential operators that were applied in the proof of Theorem 6.1.

Let \({\varvec{X}}=(x_{ab})\in \mathbb {R}^{m\times p}\) and \({\varvec{Y}}=(y_{ab})\in \mathbb {R}^{n\times p}\). Denote the matrix differential operators with respect to \({\varvec{X}}\) and \({\varvec{Y}}\), respectively, by \(\nabla _X=\big (\mathrm{d}_{ab}^X\big )\) with \(\mathrm{d}_{ab}^X=\partial /\partial x_{ab}\) and by \(\nabla _Y=\big (\mathrm{d}_{ab}^Y\big )\) with \(\mathrm{d}_{ab}^Y=\partial /\partial y_{ab}\). Let

$$ \tau =m\wedge n\wedge p. $$

Here, the eigenvalue decomposition of \({\varvec{X}}{\varvec{S}}^+{\varvec{X}}^\top \) is

$$ {\varvec{X}}{\varvec{S}}^+{\varvec{X}}^\top ={\varvec{R}}{\varvec{F}}{\varvec{R}}^\top , $$

where \({\varvec{F}}=\mathrm{\,diag\,}(f_1,\ldots ,f_\tau )\in \mathbb {D}_\tau ^{(\ge 0)}\) and \({\varvec{R}}=(r_{ij})\in \mathbb {V}_{m,\tau }\).
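The setup above can be reproduced numerically. The sketch below is illustrative only: the dimensions \(m, n, p\), the seed, and the data are hypothetical, and it assumes, as in the chapter's setting, that \({\varvec{S}}={\varvec{Y}}^\top {\varvec{Y}}\) with \({\varvec{S}}^+\) its Moore–Penrose inverse.

```python
import numpy as np

# Hypothetical small dimensions; tau = m ∧ n ∧ p
rng = np.random.default_rng(0)
m, n, p = 4, 3, 5
tau = min(m, n, p)                        # tau = 3

X = rng.standard_normal((m, p))
Y = rng.standard_normal((n, p))
S = Y.T @ Y                               # p x p, rank n ∧ p (assumed form of S)
S_pinv = np.linalg.pinv(S, rcond=1e-10)   # Moore-Penrose inverse S^+

# Eigenvalue decomposition X S^+ X^T = R F R^T
w, V = np.linalg.eigh(X @ S_pinv @ X.T)
order = np.argsort(w)[::-1]
f = w[order][:tau]                        # diagonal of F: the tau nonzero eigenvalues
R = V[:, order][:, :tau]                  # m x tau semi-orthogonal frame: R^T R = I_tau

assert int(np.sum(w > 1e-10)) == tau      # generically exactly tau nonzero eigenvalues
assert np.allclose(R.T @ R, np.eye(tau))
assert np.allclose(R @ np.diag(f) @ R.T, X @ S_pinv @ X.T)
```

Note that with \(m > \tau\) (as here) the frame \({\varvec{R}}\) is strictly semi-orthogonal, so \({\varvec{I}}_m-{\varvec{R}}{\varvec{R}}^\top \ne \mathbf{0}\), which is the case where the extra terms in the lemmas below are active.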

Lemma 6.5

For \(i=1,\ldots ,\tau \), \(k=1,\ldots ,m\), \(a=1,\ldots ,m\) and \(b=1,\ldots ,p\), we have

  1. (i)

    \(\mathrm{d}_{ab}^X f_i=A_{ab}^{ii}\),

  2. (ii)

    \(\displaystyle \mathrm{d}_{ab}^X r_{ki}=\sum _{j\ne i}^\tau \frac{r_{kj}A_{ab}^{ij}}{f_i-f_j}+f_i^{-1}\{{\varvec{I}}_m-{\varvec{R}}{\varvec{R}}^\top \}_{ka}\{{\varvec{R}}^\top {\varvec{X}}{\varvec{S}}^+\}_{ib}\),

where \(A_{ab}^{ij}=r_{aj}\{{\varvec{R}}^\top {\varvec{X}}{\varvec{S}}^+\}_{ib}+r_{ai}\{{\varvec{R}}^\top {\varvec{X}}{\varvec{S}}^+\}_{jb}\).

For \(i=1,\ldots ,\tau \), \(k=1,\ldots ,m\), \(a=1,\ldots ,n\) and \(b=1,\ldots ,p\), we have

  1. (iii)

    \(\mathrm{d}_{ab}^Y f_i=B_{ab}^{ii}\),

  2. (iv)

    \(\displaystyle \mathrm{d}_{ab}^Y r_{ki}\!=\!\sum _{j\ne i}^\tau \frac{r_{kj}B_{ab}^{ij}}{f_i-f_j}+f_i^{-1}\{{\varvec{R}}^\top {\varvec{X}}{\varvec{S}}^+{\varvec{S}}^+{\varvec{Y}}^\top \}_{ia}\{({\varvec{I}}_m-{\varvec{R}}{\varvec{R}}^\top ){\varvec{X}}({\varvec{I}}_p-{\varvec{S}}{\varvec{S}}^+)\}_{kb}\),

where

$$\begin{aligned} B_{ab}^{ij}&=-\{{\varvec{R}}^\top {\varvec{X}}{\varvec{S}}^+{\varvec{Y}}^\top \}_{ia}\{{\varvec{R}}^\top {\varvec{X}}{\varvec{S}}^+\}_{jb} -\{{\varvec{R}}^\top {\varvec{X}}{\varvec{S}}^+{\varvec{Y}}^\top \}_{ja}\{{\varvec{R}}^\top {\varvec{X}}{\varvec{S}}^+\}_{ib}\\&\qquad +\{{\varvec{R}}^\top {\varvec{X}}{\varvec{S}}^+{\varvec{S}}^+{\varvec{Y}}^\top \}_{ia}\{{\varvec{R}}^\top {\varvec{X}}({\varvec{I}}_p-{\varvec{S}}{\varvec{S}}^+)\}_{jb}\\&\qquad +\{{\varvec{R}}^\top {\varvec{X}}{\varvec{S}}^+{\varvec{S}}^+{\varvec{Y}}^\top \}_{ja}\{{\varvec{R}}^\top {\varvec{X}}({\varvec{I}}_p-{\varvec{S}}{\varvec{S}}^+)\}_{ib}. \end{aligned}$$

Proof

Take \({\varvec{R}}_*\in \mathbb {V}_{m,m-\tau }\) such that \({\varvec{R}}_*^\top {\varvec{R}}=\mathbf{0}_{(m-\tau )\times \tau }\). Define \({\varvec{R}}_0=({\varvec{R}},{\varvec{R}}_*)\in \mathbb {O}_m\). Denote \({\varvec{F}}_0=\mathrm{\,diag\,}(f_1,\ldots ,f_\tau ,0,\ldots ,0)\ (\in \mathbb {D}_m^{(\ge 0)})\). Then \({\varvec{X}}{\varvec{S}}^+{\varvec{X}}^\top ={\varvec{R}}_0{\varvec{F}}_0{\varvec{R}}_0^\top \).

Differentiating both sides of \({\varvec{R}}_0^\top {\varvec{R}}_0={\varvec{I}}_m\) gives \([\mathrm{d}{\varvec{R}}_0^\top ]{\varvec{R}}_0+{\varvec{R}}_0^\top \mathrm{d}{\varvec{R}}_0=\mathbf{0}_{m\times m}\), implying that \({\varvec{R}}_0^\top \mathrm{d}{\varvec{R}}_0\) is skew-symmetric in \(\mathbb {R}^{m\times m}\). Thus, for \(j,i\in \{1,\ldots ,m\}\), \(\{{\varvec{R}}_0^\top \mathrm{d}{\varvec{R}}_0\}_{ji}=0\) if \(j=i\) and \(\{{\varvec{R}}_0^\top \mathrm{d}{\varvec{R}}_0\}_{ji}=-\{{\varvec{R}}_0^\top \mathrm{d}{\varvec{R}}_0\}_{ij}=-\{[\mathrm{d}{\varvec{R}}_0^\top ]{\varvec{R}}_0\}_{ji}\) otherwise. Differentiating both sides of \({\varvec{X}}{\varvec{S}}^+{\varvec{X}}^\top ={\varvec{R}}_0{\varvec{F}}_0{\varvec{R}}_0^\top \) gives

$$ \mathrm{d}({\varvec{X}}{\varvec{S}}^+{\varvec{X}}^\top )=[\mathrm{d}{\varvec{R}}_0]{\varvec{F}}_0{\varvec{R}}_0^\top +{\varvec{R}}_0[\mathrm{d}{\varvec{F}}_0]{\varvec{R}}_0^\top +{\varvec{R}}_0{\varvec{F}}_0\mathrm{d}{\varvec{R}}_0^\top , $$

and then

$$\begin{aligned} {\varvec{R}}_0^\top [\mathrm{d}({\varvec{X}}{\varvec{S}}^+{\varvec{X}}^\top )]{\varvec{R}}_0&=[{\varvec{R}}_0^\top \mathrm{d}{\varvec{R}}_0]{\varvec{F}}_0+\mathrm{d}{\varvec{F}}_0+{\varvec{F}}_0[\mathrm{d}{\varvec{R}}_0^\top ]{\varvec{R}}_0\\&=[{\varvec{R}}_0^\top \mathrm{d}{\varvec{R}}_0]{\varvec{F}}_0+\mathrm{d}{\varvec{F}}_0-{\varvec{F}}_0{\varvec{R}}_0^\top \mathrm{d}{\varvec{R}}_0. \end{aligned}$$

Comparing each element in both sides of the above identity, we have

$$\begin{aligned} \mathrm{d}f_i&=\{{\varvec{R}}^\top [\mathrm{d}({\varvec{X}}{\varvec{S}}^+{\varvec{X}}^\top )]{\varvec{R}}\}_{ii}\quad \text {for}\,\, i\in \{1,\ldots ,\tau \} , \\ \{{\varvec{R}}^\top \mathrm{d}{\varvec{R}}\}_{ji}&=\frac{\{{\varvec{R}}^\top [\mathrm{d}({\varvec{X}}{\varvec{S}}^+{\varvec{X}}^\top )]{\varvec{R}}\}_{ji}}{f_i-f_j}\quad \text {for}\,\, j,i\in \{1,\ldots ,\tau \} \,\,\text {with}\,\, j\ne i , \\ \{{\varvec{R}}_*^\top \mathrm{d}{\varvec{R}}\}_{ji}&=\frac{\{{\varvec{R}}_*^\top [\mathrm{d}({\varvec{X}}{\varvec{S}}^+{\varvec{X}}^\top )]{\varvec{R}}\}_{ji}}{f_i} \quad \text {for}\,\, j\in \{1,\ldots ,m-\tau \} \,\,\text {and}\,\, i\in \{1,\ldots ,\tau \} . \end{aligned}$$

Note that \(\mathrm{d}_{ab}^X {\varvec{X}}={\varvec{E}}_{ab}\), where \({\varvec{E}}_{ab}\in \mathbb {R}^{m\times p}\) is the matrix whose (a, b)-th element is one and whose other elements are zero. Since \(\mathrm{d}_{ab}^X ({\varvec{X}}{\varvec{S}}^+{\varvec{X}}^\top )=[\mathrm{d}_{ab}^X {\varvec{X}}]{\varvec{S}}^+{\varvec{X}}^\top +{\varvec{X}}{\varvec{S}}^+[\mathrm{d}_{ab}^X{\varvec{X}}^\top ]={\varvec{E}}_{ab}{\varvec{S}}^+{\varvec{X}}^\top +{\varvec{X}}{\varvec{S}}^+{\varvec{E}}_{ab}^\top \), we observe that, for \(j,i\in \{1,\ldots ,\tau \}\),

$$\begin{aligned} \{{\varvec{R}}^\top [\mathrm{d}_{ab}^X({\varvec{X}}{\varvec{S}}^+{\varvec{X}}^\top )]{\varvec{R}}\}_{ji}&=\{{\varvec{R}}^\top {\varvec{E}}_{ab}{\varvec{S}}^+{\varvec{X}}^\top {\varvec{R}}\}_{ji}+ \{{\varvec{R}}^\top {\varvec{X}}{\varvec{S}}^+{\varvec{E}}_{ab}^\top {\varvec{R}}\}_{ji} {\nonumber }\\&=r_{aj}\{{\varvec{S}}^+{\varvec{X}}^\top {\varvec{R}}\}_{bi}+\{{\varvec{R}}^\top {\varvec{X}}{\varvec{S}}^+\}_{jb}r_{ai}=A_{ab}^{ij}. \end{aligned}$$
(6.40)

Thus, for \(i=1,\ldots ,\tau \), \(\mathrm{d}_{ab}^X f_i=\{{\varvec{R}}^\top [\mathrm{d}_{ab}^X({\varvec{X}}{\varvec{S}}^+{\varvec{X}}^\top )]{\varvec{R}}\}_{ii}=A_{ab}^{ii}\), which shows (i).

On the other hand, it is observed that for \(k=1,\ldots ,m\) and \(i=1,\ldots ,\tau \)

$$\begin{aligned} \mathrm{d}_{ab}^X r_{ki}&=\{\mathrm{d}_{ab}^X{\varvec{R}}\}_{ki}=\{({\varvec{R}}{\varvec{R}}^\top +{\varvec{R}}_*{\varvec{R}}_*^\top ) \mathrm{d}_{ab}^X{\varvec{R}}\}_{ki}{\nonumber }\\&=\sum _{\begin{array}{c}j \ne i \end{array}}^\tau r_{kj}\{{\varvec{R}}^\top \mathrm{d}_{ab}^X{\varvec{R}}\}_{ji} +\{{\varvec{R}}_*{\varvec{R}}_*^\top \mathrm{d}_{ab}^X{\varvec{R}}\}_{ki}{\nonumber }\\&=\sum _{\begin{array}{c}j \ne i \end{array}}^\tau \frac{r_{kj}A_{ab}^{ij}}{f_i-f_j} +\frac{\{{\varvec{R}}_*{\varvec{R}}_*^\top [\mathrm{d}_{ab}^X({\varvec{X}}{\varvec{S}}^+{\varvec{X}}^\top )]{\varvec{R}}\}_{ki}}{f_i}. \end{aligned}$$
(6.41)

In a similar way to (6.40), for \(k=1,\ldots ,m\) and \(i=1,\ldots ,\tau \),

$$ \{{\varvec{R}}_*{\varvec{R}}_*^\top [\mathrm{d}_{ab}^X({\varvec{X}}{\varvec{S}}^+{\varvec{X}}^\top )]{\varvec{R}}\}_{ki}=\{{\varvec{R}}_*{\varvec{R}}_*^\top \}_{ka}\{{\varvec{S}}^+{\varvec{X}}^\top {\varvec{R}}\}_{bi}+\{{\varvec{R}}_*{\varvec{R}}_*^\top {\varvec{X}}{\varvec{S}}^+\}_{kb}r_{ai}. $$

Here, \({\varvec{R}}_*^\top {\varvec{X}}{\varvec{S}}^+=\mathbf{0}_{(m-\tau )\times p}\), so that

$$\begin{aligned} \{{\varvec{R}}_*{\varvec{R}}_*^\top [\mathrm{d}_{ab}^X({\varvec{X}}{\varvec{S}}^+{\varvec{X}}^\top )]{\varvec{R}}\}_{ki} =\{{\varvec{I}}_m-{\varvec{R}}{\varvec{R}}^\top \}_{ka}\{{\varvec{R}}^\top {\varvec{X}}{\varvec{S}}^+\}_{ib}. \end{aligned}$$
(6.42)

Substituting (6.42) into (6.41), we obtain (ii).

Since \(\{{\varvec{R}}^\top [\mathrm{d}_{ab}^Y({\varvec{X}}{\varvec{S}}^+{\varvec{X}}^\top )]{\varvec{R}}\}_{ji}=\{{\varvec{R}}^\top {\varvec{X}}[\mathrm{d}_{ab}^Y {\varvec{S}}^+]{\varvec{X}}^\top {\varvec{R}}\}_{ji}\) for \(j,i\in \{1,\ldots ,\tau \}\), it is observed from (iii) of Lemma 5.2 that

$$\begin{aligned}&\{{\varvec{R}}^\top [\mathrm{d}_{ab}^Y({\varvec{X}}{\varvec{S}}^+{\varvec{X}}^\top )]{\varvec{R}}\}_{ji}\\&=-\{{\varvec{R}}^\top {\varvec{X}}{\varvec{S}}^+\}_{jb}\{{\varvec{R}}^\top {\varvec{X}}{\varvec{S}}^+{\varvec{Y}}^\top \}_{ia}-\{{\varvec{R}}^\top {\varvec{X}}{\varvec{S}}^+{\varvec{Y}}^\top \}_{ja}\{{\varvec{R}}^\top {\varvec{X}}{\varvec{S}}^+\}_{ib}\\&\quad +\{{\varvec{R}}^\top {\varvec{X}}({\varvec{I}}_p-{\varvec{S}}{\varvec{S}}^+)\}_{jb}\{{\varvec{R}}^\top {\varvec{X}}{\varvec{S}}^+{\varvec{S}}^+{\varvec{Y}}^\top \}_{ia} \\&\quad +\{{\varvec{R}}^\top {\varvec{X}}{\varvec{S}}^+{\varvec{S}}^+{\varvec{Y}}^\top \}_{ja}\{{\varvec{R}}^\top {\varvec{X}}({\varvec{I}}_p-{\varvec{S}}{\varvec{S}}^+)\}_{ib}=B_{ab}^{ij}. \end{aligned}$$

Similarly,

$$ \{{\varvec{R}}_*{\varvec{R}}_*^\top [\mathrm{d}_{ab}^Y({\varvec{X}}{\varvec{S}}^+{\varvec{X}}^\top )]{\varvec{R}}\}_{ki} =\{{\varvec{R}}^\top {\varvec{X}}{\varvec{S}}^+{\varvec{S}}^+{\varvec{Y}}^\top \}_{ia}\{({\varvec{I}}_m-{\varvec{R}}{\varvec{R}}^\top ){\varvec{X}}({\varvec{I}}_p-{\varvec{S}}{\varvec{S}}^+)\}_{kb} $$

for \(k=1,\ldots ,m\) and \(i=1,\ldots ,\tau \). Hence using the same arguments as in the proofs of (i) and (ii) yields (iii) and (iv). \(\square \)
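Part (i) of Lemma 6.5 lends itself to a quick finite-difference check. The sketch below is illustrative only: the dimensions, seed, and the tested indices \((a,b,i)\) are hypothetical, and \({\varvec{S}}={\varvec{Y}}^\top {\varvec{Y}}\) is assumed as in the chapter's setting.

```python
import numpy as np

rng = np.random.default_rng(1)
m, n, p = 4, 3, 5                               # hypothetical dimensions; tau = 3
tau = min(m, n, p)
X = rng.standard_normal((m, p))
Y = rng.standard_normal((n, p))
S_pinv = np.linalg.pinv(Y.T @ Y, rcond=1e-10)   # S^+ (S is held fixed under d^X)

def eig_pairs(Xmat):
    """Top-tau eigenpairs of X S^+ X^T, eigenvalues in decreasing order."""
    w, V = np.linalg.eigh(Xmat @ S_pinv @ Xmat.T)
    idx = np.argsort(w)[::-1][:tau]
    return w[idx], V[:, idx]

f, R = eig_pairs(X)
a, b, i = 2, 1, 0                               # arbitrary test indices
# Lemma 6.5 (i): d f_i / d x_ab = A_ab^{ii} = 2 r_{ai} {R^T X S^+}_{ib}
# (flipping the sign of an eigenvector column leaves this product unchanged)
analytic = 2.0 * R[a, i] * (R.T @ X @ S_pinv)[i, b]

eps = 1e-6
E = np.zeros((m, p)); E[a, b] = 1.0             # the unit matrix E_ab
f_plus, _ = eig_pairs(X + eps * E)
f_minus, _ = eig_pairs(X - eps * E)
numeric = (f_plus[i] - f_minus[i]) / (2.0 * eps)
assert abs(numeric - analytic) < 1e-4
```

The same pattern, perturbing \({\varvec{Y}}\) instead of \({\varvec{X}}\), checks part (iii).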

Lemma 6.6

Let \(c_0=|n\wedge p-m|+1\). Define \({\varvec{\Phi }}=\mathrm{\,diag\,}(\phi _1,\ldots ,\phi _\tau )\in \mathbb {D}_\tau \) such that the diagonals of \({\varvec{\Phi }}\) are absolutely continuous functions of \({\varvec{F}}\). Then

$$\begin{aligned} \nabla _X{\varvec{S}}{\varvec{S}}^+{\varvec{X}}^\top {\varvec{R}}{\varvec{\Phi }}{\varvec{R}}^\top ={\varvec{R}}{\varvec{\Phi }}^*{\varvec{R}}^\top +(\mathrm{\,tr\,}{\varvec{\Phi }})({\varvec{I}}_m-{\varvec{R}}{\varvec{R}}^\top ), \end{aligned}$$
(6.43)

where \({\varvec{\Phi }}^*=\mathrm{\,diag\,}(\phi _1^*,\ldots ,\phi _\tau ^*)\) and for \(i=1,\ldots ,\tau \)

$$ \phi _i^*=(n\wedge p-\tau +1)\phi _i+2f_i\frac{\partial \phi _i}{\partial f_i}+\sum _{j\ne i}^\tau \frac{f_i\phi _i-f_j\phi _j}{f_i-f_j}. $$

In particular,

$$\begin{aligned} \mathrm{\,tr\,}\nabla _X{\varvec{S}}{\varvec{S}}^+{\varvec{X}}^\top {\varvec{R}}{\varvec{\Phi }}{\varvec{R}}^\top = \sum _{i=1}^\tau \bigg \{c_0\phi _i+2f_i\frac{\partial \phi _i}{\partial f_i}+2\sum _{j>i}^\tau \frac{f_i\phi _i-f_j\phi _j}{f_i-f_j}\bigg \}. \end{aligned}$$
(6.44)

Proof

For \(a,c\in \{1,\ldots ,\tau \}\), the (a, c)-th element of \(\nabla _X{\varvec{S}}{\varvec{S}}^+{\varvec{X}}^\top {\varvec{R}}{\varvec{\Phi }}{\varvec{R}}^\top \) is

$$ \{\nabla _X{\varvec{S}}{\varvec{S}}^+{\varvec{X}}^\top {\varvec{R}}{\varvec{\Phi }}{\varvec{R}}^\top \}_{ac} =\sum _{b=1}^p\sum _{d=1}^p\sum _{k=1}^m\sum _{i=1}^\tau \mathrm{d}_{ab}^X [ \{{\varvec{S}}{\varvec{S}}^+\}_{bd}x_{kd} r_{ki}\phi _i r_{ci} ], $$

and then

$$\begin{aligned} \{\nabla _X {\varvec{S}}{\varvec{S}}^+{\varvec{X}}^\top {\varvec{R}}{\varvec{\Phi }}{\varvec{R}}^\top \}_{ac} =D_{ac}^{(1)} + D_{ac}^{(2)}+D_{ac}^{(3)}, \end{aligned}$$
(6.45)

where

$$\begin{aligned} D_{ac}^{(1)}&=\sum _{b=1}^p\sum _{d=1}^p\sum _{k=1}^m \{{\varvec{S}}{\varvec{S}}^+\}_{bd}\{{\varvec{R}}{\varvec{\Phi }}{\varvec{R}}^\top \}_{kc} \mathrm{d}_{ab}^X x_{kd} , \\ D_{ac}^{(2)}&=\sum _{b=1}^p\sum _{i=1}^\tau \{{\varvec{S}}{\varvec{S}}^+{\varvec{X}}^\top {\varvec{R}}\}_{bi}r_{ci} \mathrm{d}_{ab}^X \phi _i , \\ D_{ac}^{(3)}&=\sum _{b=1}^p\sum _{k=1}^m\sum _{i=1}^\tau \{{\varvec{S}}{\varvec{S}}^+{\varvec{X}}^\top \}_{bk} \phi _i ( r_{ci} \mathrm{d}_{ab}^X r_{ki}+r_{ki} \mathrm{d}_{ab}^Xr_{ci}). \end{aligned}$$

Since \(\mathrm{d}_{ab}^X x_{kd}={\delta }_{ak}{\delta }_{bd}\) and \({\varvec{S}}{\varvec{S}}^+\) is idempotent with rank \(n\wedge p\), it follows that

$$\begin{aligned} D_{ac}^{(1)} =\sum _{b=1}^p \{{\varvec{S}}{\varvec{S}}^+\}_{bb}\{{\varvec{R}}{\varvec{\Phi }}{\varvec{R}}^\top \}_{ac} =(n\wedge p)\{{\varvec{R}}{\varvec{\Phi }}{\varvec{R}}^\top \}_{ac}. \end{aligned}$$
(6.46)

To evaluate \(D_{ac}^{(2)}\), we first use the chain rule and (i) of Lemma 6.5 to obtain

$$ \mathrm{d}_{ab}^X \phi _i =\sum _{j=1}^\tau [\mathrm{d}_{ab}^X f_j]\frac{\partial \phi _i}{\partial f_j} =\sum _{j=1}^\tau A_{ab}^{jj}\frac{\partial \phi _i}{\partial f_j}. $$

Note that \({\varvec{S}}^+{\varvec{S}}{\varvec{S}}^+={\varvec{S}}^+\) and

$$ \sum _{b=1}^p \{{\varvec{S}}{\varvec{S}}^+{\varvec{X}}^\top {\varvec{R}}\}_{bi} A_{ab}^{jj} =2r_{aj}\{{\varvec{R}}^\top {\varvec{X}}{\varvec{S}}^+{\varvec{X}}^\top {\varvec{R}}\}_{ji} =2r_{aj}\{{\varvec{F}}\}_{ji}, $$

so that

$$\begin{aligned} D_{ac}^{(2)} =2\sum _{i=1}^\tau \sum _{j=1}^\tau r_{aj}r_{ci}\{{\varvec{F}}\}_{ji} \frac{\partial \phi _i}{\partial f_j} =2\sum _{i=1}^\tau r_{ai}r_{ci} f_i \frac{\partial \phi _i}{\partial f_i}. \end{aligned}$$
(6.47)

Finally, we consider \(D_{ac}^{(3)}\). Using (ii) of Lemma 6.5 yields

$$\begin{aligned} \sum _{k=1}^m \{{\varvec{S}}{\varvec{S}}^+{\varvec{X}}^\top \}_{bk} \mathrm{d}_{ab}^X r_{ki}&=\sum _{j\ne i}^\tau \frac{\{{\varvec{S}}{\varvec{S}}^+{\varvec{X}}^\top {\varvec{R}}\}_{bj}A_{ab}^{ij}}{f_i-f_j} \\&\qquad +f_i^{-1}\{({\varvec{I}}_m-{\varvec{R}}{\varvec{R}}^\top ){\varvec{X}}{\varvec{S}}{\varvec{S}}^+\}_{ab}\{{\varvec{R}}^\top {\varvec{X}}{\varvec{S}}^+\}_{ib}. \end{aligned}$$

Since

$$\begin{aligned}&\sum _{b=1}^p \{{\varvec{S}}{\varvec{S}}^+{\varvec{X}}^\top {\varvec{R}}\}_{bj}A_{ab}^{ij}=r_{aj}\{{\varvec{F}}\}_{ij}+r_{ai}f_j, \\&\sum _{b=1}^p \{({\varvec{I}}_m-{\varvec{R}}{\varvec{R}}^\top ){\varvec{X}}{\varvec{S}}{\varvec{S}}^+\}_{ab}\{{\varvec{R}}^\top {\varvec{X}}{\varvec{S}}^+\}_{ib}=\{({\varvec{I}}_m-{\varvec{R}}{\varvec{R}}^\top ){\varvec{X}}{\varvec{S}}^+{\varvec{X}}^\top {\varvec{R}}\}_{ai} = 0, \end{aligned}$$

it is seen that

$$\begin{aligned}&\sum _{b=1}^p\sum _{k=1}^m\sum _{i=1}^\tau \{{\varvec{S}}{\varvec{S}}^+{\varvec{X}}^\top \}_{bk} \phi _i r_{ci} \mathrm{d}_{ab}^X r_{ki} {\nonumber }\\&=\sum _{i=1}^\tau \sum _{j\ne i}^\tau \frac{r_{ai}r_{ci}f_j\phi _i}{f_i-f_j} =\sum _{i=1}^\tau \sum _{j\ne i}^\tau \frac{r_{ai}r_{ci}(f_j-f_i+f_i)\phi _i}{f_i-f_j} {\nonumber }\\&=-(\tau -1)\sum _{i=1}^\tau r_{ai}r_{ci}\phi _i+\sum _{i=1}^\tau \sum _{j\ne i}^\tau \frac{r_{ai}r_{ci} f_i\phi _i}{f_i-f_j}. \end{aligned}$$
(6.48)

Similarly,

$$\begin{aligned} \sum _{b=1}^p\sum _{k=1}^m\sum _{i=1}^\tau \{{\varvec{S}}{\varvec{S}}^+{\varvec{X}}^\top \}_{bk} \phi _i r_{ki} \mathrm{d}_{ab}^Xr_{ci}&= \sum _{i=1}^\tau \sum _{j\ne i}^\tau \frac{r_{aj}r_{cj} f_i\phi _i}{f_i-f_j} +(\mathrm{\,tr\,}{\varvec{\Phi }})\{{\varvec{I}}_m-{\varvec{R}}{\varvec{R}}^\top \}_{ac} {\nonumber }\\&= -\sum _{i=1}^\tau \sum _{j\ne i}^\tau \frac{r_{ai}r_{ci} f_j\phi _j}{f_i-f_j} +(\mathrm{\,tr\,}{\varvec{\Phi }})\{{\varvec{I}}_m-{\varvec{R}}{\varvec{R}}^\top \}_{ac}. \end{aligned}$$
(6.49)

Combining (6.48) and (6.49) gives

$$\begin{aligned} D_{ac}^{(3)} =\sum _{i=1}^\tau r_{ai}r_{ci}\bigg \{-(\tau -1)\phi _i+\sum _{j\ne i}^\tau \frac{f_i\phi _i-f_j\phi _j}{f_i-f_j}\bigg \} + (\mathrm{\,tr\,}{\varvec{\Phi }})\{{\varvec{I}}_m-{\varvec{R}}{\varvec{R}}^\top \}_{ac}. \end{aligned}$$
(6.50)

Substituting (6.46), (6.47) and (6.50) into (6.45) yields (6.43).

Note that \(n\wedge p-\tau +1+\mathrm{\,tr\,}({\varvec{I}}_m-{\varvec{R}}{\varvec{R}}^\top )=n\wedge p+m-2\tau +1=|n\wedge p-m|+1=c_0\) and also that

$$ \sum _{i=1}^\tau \sum _{j\ne i}^\tau \frac{f_i\phi _i-f_j\phi _j}{f_i-f_j} = 2\sum _{i=1}^\tau \sum _{j>i}^\tau \frac{f_i\phi _i-f_j\phi _j}{f_i-f_j}. $$

Hence taking the trace of (6.43) yields (6.44), which completes the proof. \(\square \)
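Identity (6.44) can be sanity-checked numerically. A convenient illustrative choice is \(\phi _i=f_i\), for which \({\varvec{R}}{\varvec{\Phi }}{\varvec{R}}^\top ={\varvec{X}}{\varvec{S}}^+{\varvec{X}}^\top \), so no eigendecomposition is needed on the left side. The dimensions, seed, and this choice of \({\varvec{\Phi }}\) are hypothetical, and \({\varvec{S}}={\varvec{Y}}^\top {\varvec{Y}}\) is assumed.

```python
import numpy as np

rng = np.random.default_rng(2)
m, n, p = 4, 3, 5                             # hypothetical; tau = 3
tau = min(m, n, p)
c0 = abs(min(n, p) - m) + 1                   # c0 = |n ∧ p - m| + 1 = 2
X = rng.standard_normal((m, p))
Y = rng.standard_normal((n, p))
S = Y.T @ Y
S_pinv = np.linalg.pinv(S, rcond=1e-10)

# With phi_i = f_i, the matrix under nabla_X is G(X) = S S^+ X^T X S^+ X^T (p x m)
def G(Xmat):
    return S @ S_pinv @ Xmat.T @ Xmat @ S_pinv @ Xmat.T

# Left side of (6.44): tr nabla_X G = sum_{a,b} d G_{ba} / d x_ab, central differences
eps = 1e-6
lhs = 0.0
for a in range(m):
    for b in range(p):
        E = np.zeros((m, p)); E[a, b] = 1.0
        lhs += (G(X + eps * E)[b, a] - G(X - eps * E)[b, a]) / (2.0 * eps)

# Right side of (6.44) with phi_i = f_i: d phi_i/d f_i = 1 and
# (f_i phi_i - f_j phi_j)/(f_i - f_j) = f_i + f_j
f = np.sort(np.linalg.eigvalsh(X @ S_pinv @ X.T))[::-1][:tau]
rhs = sum(c0 * f[i] + 2.0 * f[i]
          + 2.0 * sum(f[i] + f[j] for j in range(i + 1, tau))
          for i in range(tau))
assert abs(lhs - rhs) < 1e-4 * abs(rhs)
```

In the scalar case \(m=n=p=1\) the check reduces to \(\mathrm{d}(x^3/s)/\mathrm{d}x=3x^2/s=3f_1\), consistent with \(c_0=1\).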

Lemma 6.7

Let \(c_1=n-(n\wedge p)+\tau -2\), \(c_2=p-(n\wedge p)+\tau -1\) and \(c_0=c_1+c_2\). Let \({\varvec{\Phi }}={\varvec{\Phi }}({\varvec{F}})=\mathrm{\,diag\,}(\phi _1,\ldots ,\phi _\tau )\), where the \(\phi _i\)’s are absolutely continuous functions of \({\varvec{F}}\). Then

$$\begin{aligned} {\varvec{R}}{\varvec{\Phi }}{\varvec{R}}^\top {\varvec{X}}{\varvec{S}}{\varvec{S}}^+\nabla _Y^\top {\varvec{Y}}{\varvec{S}}^+{\varvec{X}}^\top {\varvec{R}}{\varvec{\Phi }}{\varvec{R}}^\top&= {\varvec{R}}{\varvec{\Phi }}^{*1}{\varvec{R}}^\top , \end{aligned}$$
(6.51)
$$\begin{aligned} {\varvec{R}}{\varvec{\Phi }}{\varvec{R}}^\top {\varvec{X}}{\varvec{S}}^+{\varvec{Y}}^\top \nabla _Y{\varvec{S}}{\varvec{S}}^+{\varvec{X}}^\top {\varvec{R}}{\varvec{\Phi }}{\varvec{R}}^\top&= {\varvec{R}}{\varvec{\Phi }}^{*2}{\varvec{R}}^\top , \end{aligned}$$
(6.52)

where \({\varvec{\Phi }}^{*k}=\mathrm{\,diag\,}(\phi _1^{*k},\ldots ,\phi _\tau ^{*k})\) for \(k=1,2\) and, for \(i=1,\ldots ,\tau \),

$$ \phi _i^{*k} =c_k f_i\phi _i^2 -2f_i^2\phi _i\frac{\partial \phi _i}{\partial f_i} -\sum _{j\ne i}^\tau \frac{f_i^2\phi _i^2}{f_i-f_j} +\sum _{j\ne i}^\tau \frac{f_i\phi _if_j\phi _j}{f_i-f_j}. $$

In particular,

$$\begin{aligned} \mathrm{\,tr\,}\nabla _Y^\top {\varvec{Y}}{\varvec{S}}^+{\varvec{X}}^\top {\varvec{R}}{\varvec{\Phi }}^2{\varvec{R}}^\top {\varvec{X}}{\varvec{S}}{\varvec{S}}^+ =\sum _{i=1}^\tau \bigg \{ c_0 f_i\phi _i^2 -2f_i^2\frac{\partial (\phi _i^2)}{\partial f_i} -2\sum _{j>i}^\tau \frac{f_i^2\phi _i^2-f_j^2\phi _j^2}{f_i-f_j}\bigg \}. \end{aligned}$$
(6.53)

Proof

The proofs of (6.51) and (6.52) can be done by using the same arguments as in the proof of Lemma 6.6. Since

$$\begin{aligned} \mathrm{\,tr\,}\nabla _Y^\top {\varvec{Y}}{\varvec{S}}^+{\varvec{X}}^\top {\varvec{R}}{\varvec{\Phi }}^2{\varvec{R}}^\top {\varvec{X}}{\varvec{S}}{\varvec{S}}^+&=\mathrm{\,tr\,}\nabla _Y^\top \{{\varvec{Y}}{\varvec{S}}^+{\varvec{X}}^\top {\varvec{R}}{\varvec{\Phi }}{\varvec{R}}^\top \cdot {\varvec{R}}{\varvec{\Phi }}{\varvec{R}}^\top {\varvec{X}}{\varvec{S}}{\varvec{S}}^+\} \\&=\mathrm{\,tr\,}{\varvec{R}}{\varvec{\Phi }}{\varvec{R}}^\top {\varvec{X}}{\varvec{S}}{\varvec{S}}^+\nabla _Y^\top {\varvec{Y}}{\varvec{S}}^+{\varvec{X}}^\top {\varvec{R}}{\varvec{\Phi }}{\varvec{R}}^\top \\&\qquad +\mathrm{\,tr\,}{\varvec{R}}{\varvec{\Phi }}{\varvec{R}}^\top {\varvec{X}}{\varvec{S}}^+{\varvec{Y}}^\top \nabla _Y{\varvec{S}}{\varvec{S}}^+{\varvec{X}}^\top {\varvec{R}}{\varvec{\Phi }}{\varvec{R}}^\top , \end{aligned}$$

the identity (6.53) can be verified by combining (6.51) and (6.52). \(\square \)
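Identity (6.53) admits the same kind of finite-difference check, now differentiating with respect to \({\varvec{Y}}\), so that \({\varvec{S}}={\varvec{Y}}^\top {\varvec{Y}}\) and \({\varvec{S}}^+\) vary with \({\varvec{Y}}\). Again the choice \(\phi _i=f_i\) (under which \({\varvec{R}}{\varvec{\Phi }}^2{\varvec{R}}^\top =({\varvec{X}}{\varvec{S}}^+{\varvec{X}}^\top )^2\)), the dimensions, and the seed are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(3)
m, n, p = 4, 3, 5                        # hypothetical; tau = 3
tau = min(m, n, p)
c1 = n - min(n, p) + tau - 2             # = 1
c2 = p - min(n, p) + tau - 1             # = 4
c0 = c1 + c2                             # = 5
X = rng.standard_normal((m, p))
Y = rng.standard_normal((n, p))

# With phi_i = f_i the matrix under nabla_Y^T in (6.53) is
#   M(Y) = Y S^+ X^T (X S^+ X^T)^2 X S S^+   (n x p),  S = Y^T Y depending on Y
def M(Ymat):
    S = Ymat.T @ Ymat
    Sp = np.linalg.pinv(S, rcond=1e-10)
    K = X @ Sp @ X.T
    return Ymat @ Sp @ X.T @ K @ K @ X @ S @ Sp

# Left side: tr nabla_Y^T M = sum_{a,c} d M_{ac} / d y_ac, central differences
eps = 1e-6
lhs = 0.0
for a in range(n):
    for c in range(p):
        E = np.zeros((n, p)); E[a, c] = 1.0
        lhs += (M(Y + eps * E)[a, c] - M(Y - eps * E)[a, c]) / (2.0 * eps)

# Right side of (6.53) with phi_i = f_i:
# (f_i^2 phi_i^2 - f_j^2 phi_j^2)/(f_i - f_j) = (f_i + f_j)(f_i^2 + f_j^2)
S_pinv = np.linalg.pinv(Y.T @ Y, rcond=1e-10)
f = np.sort(np.linalg.eigvalsh(X @ S_pinv @ X.T))[::-1][:tau]
rhs = sum(c0 * f[i] ** 3 - 4.0 * f[i] ** 3
          - 2.0 * sum((f[i] + f[j]) * (f[i] ** 2 + f[j] ** 2)
                      for j in range(i + 1, tau))
          for i in range(tau))
assert abs(lhs - rhs) < 1e-3 * abs(rhs)
```

As a consistency check, in the scalar case \(m=n=p=1\) one has \(c_0=-1\), \(M=x^6y^{-5}\), and \(\mathrm{d}M/\mathrm{d}y=-5x^6y^{-6}=(c_0-4)f_1^3\), matching (6.53).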


Copyright information

© 2020 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.


Cite this chapter

Tsukuma, H., Kubokawa, T. (2020). Estimation of the Mean Matrix. In: Shrinkage Estimation for Mean and Covariance Matrices. SpringerBriefs in Statistics. Springer, Singapore. https://doi.org/10.1007/978-981-15-1596-5_6
