A Generalized Stein Identity and Matrix Differential Operators

Tsukuma, Hisayuki; Kubokawa, Tatsuya

doi:10.1007/978-981-15-1596-5_5

Part of the book series: SpringerBriefs in Statistics (JSSRES)

Abstract

In shrinkage estimation, the Stein (1973, 1981) identity is known as an integration-by-parts formula for deriving unbiased risk estimates. It is a simple but very powerful mathematical tool and has contributed significantly to the development of shrinkage estimation. This chapter provides a generalized Stein identity in the matrix-variate normal distribution model, together with some useful results on matrix differential operators that allow a unified application of the identity to high- and low-dimensional normal models.

References

M. Bilodeau, T. Kariya, Minimax estimators in the normal MANOVA model. J. Multivar. Anal. 28, 260–270 (1989)
B. Efron, C. Morris, Families of minimax estimators of the mean of a multivariate normal distribution. Ann. Stat. 4, 11–21 (1976)
W. Fleming, Functions of Several Variables, 2nd edn. (Springer, New York, 1977)
L.R. Haff, Minimax estimators for a multinormal precision matrix. J. Multivar. Anal. 7, 374–385 (1977)
L.R. Haff, An identity for the Wishart distribution with applications. J. Multivar. Anal. 9, 531–544 (1979)
Y. Konno, Improved estimation of matrix of normal mean and eigenvalues in the multivariate $F$-distribution. Doctoral dissertation, Institute of Mathematics, University of Tsukuba, 1992. http://mcm-www.jwu.ac.jp/~konno/
Y. Konno, Shrinkage estimators for large covariance matrices in multivariate real and complex normal distributions under an invariant quadratic loss. J. Multivar. Anal. 100, 2237–2253 (2009)
T. Kubokawa, M.S. Srivastava, Estimation of the precision matrix of a singular Wishart distribution and its application in high-dimensional data. J. Multivar. Anal. 99, 1906–1928 (2008)
Y. Sheena, Unbiased estimator of risk for an orthogonally invariant estimator of a covariance matrix. J. Jpn. Stat. Soc. 25, 35–48 (1995)
C. Stein, Estimation of the mean of a multivariate normal distribution. Technical Report No. 48 (Department of Statistics, Stanford University, Stanford, 1973)
C. Stein, Lectures on the theory of estimation of many parameters, in Proceedings of Scientific Seminars of the Steklov Institute Studies in the Statistical Theory of Estimation, Part I, vol. 74, ed. by I.A. Ibragimov, M.S. Nikulin (Leningrad Division, 1977), pp. 4–65
C. Stein, Estimation of the mean of a multivariate normal distribution. Ann. Stat. 9, 1135–1151 (1981)

Author information

Authors and Affiliations

Faculty of Medicine, Toho University, Tokyo, Japan
Hisayuki Tsukuma
Faculty of Economics, University of Tokyo, Tokyo, Japan
Tatsuya Kubokawa

Corresponding author

Correspondence to Hisayuki Tsukuma.

Appendix

In this appendix, we first give another derivation of the Stein identity (5.3). Denote by $f({\varvec{X}})$ the p.d.f. of $\mathcal{N}_{m\times p}({\varvec{{\Theta }}},{\varvec{I}}_m\otimes {\varvec{{\Sigma }}})$. Let

$$ I_{ST}({\varvec{G}})=\int _{\mathbb {R}^{m\times p}} \mathrm{\,tr\,}\nabla _X \{{\varvec{G}}^\top f({\varvec{X}})\} (\mathrm{d}{\varvec{X}}). $$

It follows that $\nabla _X f({\varvec{X}})=-({\varvec{X}}-{\varvec{{\Theta }}}){\varvec{{\Sigma }}}^{-1} f({\varvec{X}})$, so that

$$ I_{ST}({\varvec{G}})=E[\mathrm{\,tr\,}\nabla _X{\varvec{G}}^\top ]-E[\mathrm{\,tr\,}({\varvec{X}}-{\varvec{{\Theta }}}){\varvec{{\Sigma }}}^{-1}{\varvec{G}}^\top ] $$

provided the expectations exist. Hence the Stein identity (5.3) can be verified if $I_{ST}({\varvec{G}})=0$.
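As a sanity check that is not part of the original derivation, the product-rule step and the resulting identity can be worked out for the simple choice ${\varvec{G}}={\varvec{X}}$. The sketch assumes the coordinate convention $\mathrm{\,tr\,}\nabla _X{\varvec{G}}^\top =\sum _{i,j}\partial g_{ij}/\partial x_{ij}$, which agrees with the vectorized form used later in this appendix, and it reads the identity as the statement $I_{ST}({\varvec{G}})=0$, that is, $E[\mathrm{\,tr\,}\nabla _X{\varvec{G}}^\top ]=E[\mathrm{\,tr\,}({\varvec{X}}-{\varvec{{\Theta }}}){\varvec{{\Sigma }}}^{-1}{\varvec{G}}^\top ]$.

$$\begin{aligned} \mathrm{\,tr\,}\nabla _X\{{\varvec{G}}^\top f({\varvec{X}})\}&=f({\varvec{X}})\mathrm{\,tr\,}\nabla _X{\varvec{G}}^\top +\mathrm{\,tr\,}\{\nabla _X f({\varvec{X}})\}{\varvec{G}}^\top =\{\mathrm{\,tr\,}\nabla _X{\varvec{G}}^\top -\mathrm{\,tr\,}({\varvec{X}}-{\varvec{{\Theta }}}){\varvec{{\Sigma }}}^{-1}{\varvec{G}}^\top \}f({\varvec{X}}), \\ \mathrm{\,tr\,}\nabla _X{\varvec{X}}^\top&=\sum _{i=1}^m\sum _{j=1}^p\frac{\partial x_{ij}}{\partial x_{ij}}=mp, \\ E[\mathrm{\,tr\,}({\varvec{X}}-{\varvec{{\Theta }}}){\varvec{{\Sigma }}}^{-1}{\varvec{X}}^\top ]&=E[\mathrm{\,tr\,}{\varvec{{\Sigma }}}^{-1}({\varvec{X}}-{\varvec{{\Theta }}})^\top ({\varvec{X}}-{\varvec{{\Theta }}})]+E[\mathrm{\,tr\,}({\varvec{X}}-{\varvec{{\Theta }}}){\varvec{{\Sigma }}}^{-1}{\varvec{{\Theta }}}^\top ] =\mathrm{\,tr\,}{\varvec{{\Sigma }}}^{-1}(m{\varvec{{\Sigma }}})+0=mp. \end{aligned}$$

Both sides equal $mp$, so $I_{ST}({\varvec{X}})=0$ in this case. The last line uses $E[({\varvec{X}}-{\varvec{{\Theta }}})^\top ({\varvec{X}}-{\varvec{{\Theta }}})]=m{\varvec{{\Sigma }}}$, which holds because the $m$ rows of ${\varvec{X}}$ are independent with common covariance matrix ${\varvec{{\Sigma }}}$.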

For $r>0$, let $\mathbb {B}_r=\{{\varvec{X}}\in \mathbb {R}^{m\times p}:\Vert \mathrm {vec}({\varvec{X}})\Vert \le r\}$, where $\Vert \cdot \Vert $ is the usual Euclidean norm and $\mathrm {vec}(\cdot )$ is defined in Definition 2.3. Then $\mathbb {B}_r\rightarrow \mathbb {R}^{m\times p}$ as $r\rightarrow {\infty }$ and

$$ I_{ST}({\varvec{G}})=\lim _{r\rightarrow {\infty }} \int _{\mathbb {B}_r} \mathrm {vec}(\nabla _X)^\top \mathrm {vec}({\varvec{G}}f({\varvec{X}})) (\mathrm{d}{\varvec{X}}). $$

The boundary of $\mathbb {B}_r$ is expressed by $\partial \mathbb {B}_r=\{\mathrm {vec}({\varvec{X}})\in \mathbb {R}^{mp}:\Vert \mathrm {vec}({\varvec{X}})\Vert =r\}$. Denote by ${\varvec{u}}$ an outward unit normal vector at a point $\mathrm {vec}({\varvec{X}})\in \partial \mathbb {B}_r$. Let ${\lambda }_{\partial \mathbb {B}_r}$ be Lebesgue measure on $\partial \mathbb {B}_r$. By the Gauss divergence theorem,

$$ I_{ST}({\varvec{G}}) =\lim _{r\rightarrow {\infty }} \int _{\partial \mathbb {B}_r} {\varvec{u}}^\top \mathrm {vec}({\varvec{G}}) f({\varvec{X}})(\mathrm{d}{\lambda }_{\partial \mathbb {B}_r}). $$

For details of the Gauss divergence theorem, see Fleming (1977).

Let $o(\cdot )$ be the Landau symbol: for real-valued functions $a(x)$ and $b(x)$ with $b(x)\ne 0$, we write $a(x)=o(b(x))$ as $x\rightarrow c$ when $\lim _{x\rightarrow c}|a(x)/b(x)|=0$, where $c$ is an extended real number. If

$$ \sup _{\mathrm {vec}({\varvec{X}})\in \partial \mathbb {B}_r} \Vert \mathrm {vec}({\varvec{G}})\Vert f({\varvec{X}})=o(r^{1-mp}) \quad {\text {as}}\,\,\,r\rightarrow {\infty }, $$

then $I_{ST}({\varvec{G}})=0$. In fact,

$$\begin{aligned} \int _{\partial \mathbb {B}_r} |{\varvec{u}}^\top \mathrm {vec}({\varvec{G}})| f({\varvec{X}})(\mathrm{d}{\lambda }_{\partial \mathbb {B}_r})&\le \int _{\partial \mathbb {B}_r} \Vert \mathrm {vec}({\varvec{G}})\Vert f({\varvec{X}})(\mathrm{d}{\lambda }_{\partial \mathbb {B}_r}) \\&\le o(r^{1-mp})\int _{\partial \mathbb {B}_r} (\mathrm{d}{\lambda }_{\partial \mathbb {B}_r}) =o(1), \end{aligned}$$

because $\int _{\partial \mathbb {B}_r} (\mathrm{d}{\lambda }_{\partial \mathbb {B}_r})$ is the surface area of the $(mp-1)$-sphere of radius $r$ in $\mathbb {R}^{mp}$, namely, $\int _{\partial \mathbb {B}_r} (\mathrm{d}{\lambda }_{\partial \mathbb {B}_r})=2\pi ^{mp/2}r^{mp-1}/\Gamma (mp/2)$, which is proportional to $r^{mp-1}$.
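To illustrate when the growth condition holds, consider the hypothetical case in which ${\varvec{G}}$ grows at most polynomially, $\Vert \mathrm {vec}({\varvec{G}})\Vert \le C(1+\Vert \mathrm {vec}({\varvec{X}})\Vert ^k)$ for some constants $C>0$ and $k\ge 0$. The constants $C$ and $k$, the largest eigenvalue ${\lambda }_{\max }$ of ${\varvec{{\Sigma }}}$, and $\theta =\Vert \mathrm {vec}({\varvec{{\Theta }}})\Vert $ are notation introduced only for this sketch. With $c_0=(2\pi )^{-mp/2}|{\varvec{{\Sigma }}}|^{-m/2}$ the normalizing constant of $f$, for $r>\theta $ and $\mathrm {vec}({\varvec{X}})\in \partial \mathbb {B}_r$,

$$\begin{aligned} \mathrm{\,tr\,}{\varvec{{\Sigma }}}^{-1}({\varvec{X}}-{\varvec{{\Theta }}})^\top ({\varvec{X}}-{\varvec{{\Theta }}})&\ge \frac{\Vert \mathrm {vec}({\varvec{X}}-{\varvec{{\Theta }}})\Vert ^2}{{\lambda }_{\max }}\ge \frac{(r-\theta )^2}{{\lambda }_{\max }}, \\ \sup _{\mathrm {vec}({\varvec{X}})\in \partial \mathbb {B}_r}\Vert \mathrm {vec}({\varvec{G}})\Vert f({\varvec{X}})&\le c_0\,C(1+r^k)\exp \Big \{-\frac{(r-\theta )^2}{2{\lambda }_{\max }}\Big \}, \\ r^{mp-1}(1+r^k)\exp \Big \{-\frac{(r-\theta )^2}{2{\lambda }_{\max }}\Big \}&\rightarrow 0 \quad {\text {as}}\,\,\,r\rightarrow {\infty }. \end{aligned}$$

Under this assumption the supremum is $o(r^{1-mp})$, since the Gaussian tail decays faster than any power of $r$.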

Next, we give a simple derivation of the Haff identity (5.5) using the Gauss divergence theorem. The derivation is essentially the same as that of Haff (1977, 1979). Let $f({\varvec{S}})$ be the p.d.f. of $\mathcal{W}_p(n,{\varvec{{\Sigma }}})$. For a differentiable matrix-valued function ${\varvec{G}}\in \mathbb {S}_p$, let

$$ I_{HF}({\varvec{G}})=\int _{\mathbb {S}_p^{(+)}} \mathrm{\,tr\,}\mathrm {D}_S\{{\varvec{G}}f({\varvec{S}})\}(\mathrm{d}{\varvec{S}}). $$

Since $\mathrm {D}_S|{\varvec{S}}|={\varvec{S}}^{-1}|{\varvec{S}}|$ and $\mathrm {D}_S\mathrm{\,tr\,}{\varvec{{\Sigma }}}^{-1}{\varvec{S}}={\varvec{{\Sigma }}}^{-1}$, we get $\mathrm {D}_S f({\varvec{S}})=\{(n-p-1){\varvec{S}}^{-1}-{\varvec{{\Sigma }}}^{-1}\}f({\varvec{S}})/2$, implying that

$$ I_{HF}({\varvec{G}}) =E[\mathrm{\,tr\,}\mathrm {D}_S{\varvec{G}}]+\frac{n-p-1}{2}E[\mathrm{\,tr\,}{\varvec{S}}^{-1}{\varvec{G}}]-\frac{1}{2}E[\mathrm{\,tr\,}{\varvec{{\Sigma }}}^{-1}{\varvec{G}}] $$

provided the expectations exist. Hence the Haff identity (5.5) follows if $I_{HF}({\varvec{G}})=0$.
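As a sanity check that is not part of the original text, the three expectations can be evaluated for the simple choice ${\varvec{G}}={\varvec{S}}$. The sketch assumes that $\mathrm {D}_S$ has $(i,j)$ element $\{(1+{\delta }_{ij})/2\}\partial /\partial s_{ij}$, which is the convention consistent with the coordinate expansion used in this appendix, and it uses $E[{\varvec{S}}]=n{\varvec{{\Sigma }}}$.

$$\begin{aligned} \mathrm{\,tr\,}\mathrm {D}_S{\varvec{S}}&=\sum _{i=1}^p\sum _{j=1}^p\frac{1+{\delta }_{ij}}{2}\frac{\partial s_{ji}}{\partial s_{ij}}=p+\frac{p(p-1)}{2}=\frac{p(p+1)}{2}, \\ E[\mathrm{\,tr\,}{\varvec{S}}^{-1}{\varvec{S}}]&=p, \qquad E[\mathrm{\,tr\,}{\varvec{{\Sigma }}}^{-1}{\varvec{S}}]=\mathrm{\,tr\,}{\varvec{{\Sigma }}}^{-1}(n{\varvec{{\Sigma }}})=np, \\ I_{HF}({\varvec{S}})&=\frac{p(p+1)}{2}+\frac{(n-p-1)p}{2}-\frac{np}{2}=0. \end{aligned}$$

The three terms cancel exactly, in agreement with $I_{HF}({\varvec{G}})=0$ for this choice of ${\varvec{G}}$.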

Denote by $\partial /\partial {\varvec{S}}=(\partial /\partial s_{ij})$ the $p\times p$ matrix differential operator with respect to ${\varvec{S}}\in \mathbb {S}_p$. For ${\varvec{A}}=(a_{ij})\in \mathbb {S}_p$, define

$$ \mathrm {Vec}({\varvec{A}})=(a_{11},a_{21},\ldots ,a_{p1},a_{22},a_{32},\ldots ,a_{p2},\ldots ,a_{p-1,p-1},a_{p,p-1},a_{pp})^\top \in \mathbb {R}^q, $$

where $q=p(p+1)/2$. From symmetry of ${\varvec{G}}$, it holds that

$$\begin{aligned} \mathrm{\,tr\,}\mathrm {D}_S\{{\varvec{G}}f({\varvec{S}})\}&=\sum _{i=1}^p\sum _{j=1}^p\frac{1+{\delta }_{ij}}{2}\frac{\partial }{\partial s_{ij}}\{g_{ji}f({\varvec{S}})\} =\sum _{i=1}^p\sum _{j=1}^i\frac{\partial }{\partial s_{ij}}\{g_{ij}f({\varvec{S}})\} \\&=\mathrm {Vec}(\partial /\partial {\varvec{S}})^\top \mathrm {Vec}({\varvec{G}}f({\varvec{S}})), \end{aligned}$$

so that

$$ I_{HF}({\varvec{G}}) =\int _{\mathbb {S}_p^{(+)}} \mathrm {Vec}(\partial /\partial {\varvec{S}})^\top \mathrm {Vec}({\varvec{G}}f({\varvec{S}}))(\mathrm{d}{\varvec{S}}). $$
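For concreteness, the case $p=2$ (so that $q=3$) shows how the symmetric double sum collapses to a divergence in the free coordinates. This is only an illustrative special case of the identity above, using $s_{12}=s_{21}$ and $g_{12}=g_{21}$.

$$\begin{aligned} \mathrm {Vec}({\varvec{A}})&=(a_{11},a_{21},a_{22})^\top , \\ \sum _{i=1}^2\sum _{j=1}^2\frac{1+{\delta }_{ij}}{2}\frac{\partial }{\partial s_{ij}}\{g_{ji}f({\varvec{S}})\}&=\frac{\partial }{\partial s_{11}}\{g_{11}f({\varvec{S}})\}+\frac{\partial }{\partial s_{22}}\{g_{22}f({\varvec{S}})\}+\frac{1}{2}\Big [\frac{\partial }{\partial s_{12}}\{g_{21}f({\varvec{S}})\}+\frac{\partial }{\partial s_{21}}\{g_{12}f({\varvec{S}})\}\Big ] \\&=\frac{\partial }{\partial s_{11}}\{g_{11}f({\varvec{S}})\}+\frac{\partial }{\partial s_{21}}\{g_{21}f({\varvec{S}})\}+\frac{\partial }{\partial s_{22}}\{g_{22}f({\varvec{S}})\}. \end{aligned}$$

The last line is exactly $\mathrm {Vec}(\partial /\partial {\varvec{S}})^\top \mathrm {Vec}({\varvec{G}}f({\varvec{S}}))$ in $\mathbb {R}^3$, which is the form required by the divergence theorem.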

For $r>0$, let $\partial \mathbb {B}_r^q=\{\mathrm {Vec}({\varvec{S}})\in \mathbb {R}^q:\Vert \mathrm {Vec}({\varvec{S}})\Vert =r\}$ and, for $0< r_1\le r_2<{\infty }$, let $\mathbb {C}_{r_1,r_2}=\{\mathrm {Vec}({\varvec{S}})\in \mathbb {R}^q: r_1\le \Vert \mathrm {Vec}({\varvec{S}})\Vert \le r_2\}$. Then $\mathbb {C}_{r_1,r_2}\cap \mathbb {S}_p^{(+)}\rightarrow \mathbb {S}_p^{(+)}$ as $r_1\rightarrow 0$ and $r_2\rightarrow {\infty }$. The boundary of $\mathbb {C}_{r_1,r_2}\cap \mathbb {S}_p^{(+)}$ can be expressed as $\bigcup _{i=1}^3 \partial \mathbb {B}_i$, where $\partial \mathbb {B}_1$, $\partial \mathbb {B}_2$ and $\partial \mathbb {B}_3$ are certain sets satisfying $\partial \mathbb {B}_1\subset \partial \mathbb {B}_{r_1}^q$, $\partial \mathbb {B}_2\subset \partial \mathbb {B}_{r_2}^q$ and $\partial \mathbb {B}_3\subset \partial \mathbb {S}_p^{(+)}$.

Note that $|{\varvec{S}}|=0$ for any point ${\varvec{S}}\in \partial \mathbb {S}_p^{(+)}$, so that $f({\varvec{S}})=0$ there when $n-p-1>0$; hence the part of the boundary integral over $\partial \mathbb {B}_3$ vanishes. The outward unit normal vectors on the remaining two parts are ${\varvec{u}}_1=-\mathrm {Vec}({\varvec{S}})/\Vert \mathrm {Vec}({\varvec{S}})\Vert $ for $\mathrm {Vec}({\varvec{S}})\in \partial \mathbb {B}_{r_1}^q$ and ${\varvec{u}}_2=\mathrm {Vec}({\varvec{S}})/\Vert \mathrm {Vec}({\varvec{S}})\Vert $ for $\mathrm {Vec}({\varvec{S}})\in \partial \mathbb {B}_{r_2}^q$. Denote by ${\lambda }_{\partial \mathbb {B}_r^q}$ Lebesgue measure on $\partial \mathbb {B}_r^q$. The Gauss divergence theorem then gives

$$ I_{HF}({\varvec{G}}) = \lim _{r_1\rightarrow 0} \int _{\partial \mathbb {B}_1} {\varvec{u}}_1^\top \mathrm {Vec}({\varvec{G}}) f({\varvec{S}}) (\mathrm{d}{\lambda }_{\partial \mathbb {B}_{r_1}^q}) + \lim _{r_2\rightarrow {\infty }} \int _{\partial \mathbb {B}_2} {\varvec{u}}_2^\top \mathrm {Vec}({\varvec{G}}) f({\varvec{S}}) (\mathrm{d}{\lambda }_{\partial \mathbb {B}_{r_2}^q}). $$

Using the Landau symbol $o(\cdot )$, we assume that

$$ \sup _{\mathrm {Vec}({\varvec{S}})\in \partial \mathbb {B}_1}\Vert \mathrm {Vec}({\varvec{G}})\Vert f({\varvec{S}})=o(r_1^{1-q}) \quad {\text {as}}\, r_1\rightarrow 0 $$

and

$$ \sup _{\mathrm {Vec}({\varvec{S}})\in \partial \mathbb {B}_2}\Vert \mathrm {Vec}({\varvec{G}})\Vert f({\varvec{S}})=o(r_2^{1-q}) \quad {\text {as}}\, r_2\rightarrow {\infty }. $$

Under these assumptions, we can see that

$$\begin{aligned} \int _{\partial \mathbb {B}_1} |{\varvec{u}}_1^\top \mathrm {Vec}({\varvec{G}})| f({\varvec{S}}) (\mathrm{d}{\lambda }_{\partial \mathbb {B}_{r_1}^q})&\le \int _{\partial \mathbb {B}_1} \Vert \mathrm {Vec}({\varvec{G}})\Vert f({\varvec{S}}) (\mathrm{d}{\lambda }_{\partial \mathbb {B}_{r_1}^q}) \\&\le o(r_1^{1-q}) \int _{\partial \mathbb {B}_{r_1}^q}(\mathrm{d}{\lambda }_{\partial \mathbb {B}_{r_1}^q}) =o(1) \qquad {\text {as}}\, r_1\rightarrow 0 \end{aligned}$$

and also

$$ \int _{\partial \mathbb {B}_2} |{\varvec{u}}_2^\top \mathrm {Vec}({\varvec{G}})| f({\varvec{S}}) (\mathrm{d}{\lambda }_{\partial \mathbb {B}_{r_2}^q}) \le o(r_2^{1-q}) \int _{\partial \mathbb {B}_{r_2}^q}(\mathrm{d}{\lambda }_{\partial \mathbb {B}_{r_2}^q}) =o(1) \qquad {\text {as}}\, r_2\rightarrow {\infty }, $$

so that $I_{HF}({\varvec{G}})=0$.
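To see that the assumption as $r_1\rightarrow 0$ is not restrictive, consider the hypothetical case in which $\Vert \mathrm {Vec}({\varvec{G}})\Vert \le C$ near the origin for some constant $C>0$, with $n-p-1>0$. Here $C$ and the normalizing constant $c_0$ of the Wishart density, written as $f({\varvec{S}})=c_0|{\varvec{S}}|^{(n-p-1)/2}\exp (-\mathrm{\,tr\,}{\varvec{{\Sigma }}}^{-1}{\varvec{S}}/2)$, are notation introduced only for this sketch. For ${\varvec{S}}\in \mathbb {S}_p^{(+)}$ with $\Vert \mathrm {Vec}({\varvec{S}})\Vert =r_1$, the arithmetic-geometric mean inequality on the eigenvalues and the Cauchy-Schwarz inequality on the diagonal elements give

$$\begin{aligned} |{\varvec{S}}|&\le \Big (\frac{\mathrm{\,tr\,}{\varvec{S}}}{p}\Big )^p\le \Big (\frac{\sqrt{p}\,r_1}{p}\Big )^p=\Big (\frac{r_1}{\sqrt{p}}\Big )^p, \\ \sup _{\mathrm {Vec}({\varvec{S}})\in \partial \mathbb {B}_1}\Vert \mathrm {Vec}({\varvec{G}})\Vert f({\varvec{S}})&\le C\,c_0\Big (\frac{r_1}{\sqrt{p}}\Big )^{p(n-p-1)/2}, \\ r_1^{q-1}\cdot r_1^{p(n-p-1)/2}&\rightarrow 0 \quad {\text {as}}\, r_1\rightarrow 0, \end{aligned}$$

so the supremum is $o(r_1^{1-q})$ under these assumptions, because $q-1\ge 0$ and $p(n-p-1)/2>0$.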

About this chapter

Cite this chapter

Tsukuma, H., Kubokawa, T. (2020). A Generalized Stein Identity and Matrix Differential Operators. In: Shrinkage Estimation for Mean and Covariance Matrices. SpringerBriefs in Statistics. Springer, Singapore. https://doi.org/10.1007/978-981-15-1596-5_5

DOI: https://doi.org/10.1007/978-981-15-1596-5_5
Published: 17 April 2020
Publisher Name: Springer, Singapore
Print ISBN: 978-981-15-1595-8
Online ISBN: 978-981-15-1596-5
eBook Packages: Mathematics and Statistics, Mathematics and Statistics (R0)
