Abstract
In shrinkage estimation, the Stein (1973, 1981) identity is known as an integration by parts formula for deriving unbiased risk estimates. It is a simple but very powerful mathematical tool and has contributed significantly to the development of shrinkage estimation. This chapter provides a generalized Stein identity for the matrix-variate normal distribution model, together with some useful results on matrix differential operators that allow a unified application of the identity to high- and low-dimensional normal models.
References
M. Bilodeau, T. Kariya, Minimax estimators in the normal MANOVA model. J. Multivar. Anal. 28, 260–270 (1989)
B. Efron, C. Morris, Families of minimax estimators of the mean of a multivariate normal distribution. Ann. Stat. 4, 11–21 (1976)
W. Fleming, Functions of Several Variables, 2nd edn. (Springer, New York, 1977)
L.R. Haff, Minimax estimators for a multinormal precision matrix. J. Multivar. Anal. 7, 374–385 (1977)
L.R. Haff, An identity for the Wishart distribution with applications. J. Multivar. Anal. 9, 531–544 (1979)
Y. Konno, Improved estimation of matrix of normal mean and eigenvalues in the multivariate \(F\)-distribution. Doctoral dissertation, Institute of Mathematics, University of Tsukuba, 1992. http://mcm-www.jwu.ac.jp/~konno/
Y. Konno, Shrinkage estimators for large covariance matrices in multivariate real and complex normal distributions under an invariant quadratic loss. J. Multivar. Anal. 100, 2237–2253 (2009)
T. Kubokawa, M.S. Srivastava, Estimation of the precision matrix of a singular Wishart distribution and its application in high-dimensional data. J. Multivar. Anal. 99, 1906–1928 (2008)
Y. Sheena, Unbiased estimator of risk for an orthogonally invariant estimator of a covariance matrix. J. Jpn. Stat. Soc. 25, 35–48 (1995)
C. Stein, Estimation of the mean of a multivariate normal distribution. Technical Report No. 48 (Department of Statistics, Stanford University, Stanford, 1973)
C. Stein, Lectures on the theory of estimation of many parameters, in Proceedings of Scientific Seminars of the Steklov Institute Studies in the Statistical Theory of Estimation, Part I, vol. 74, ed. by I.A. Ibragimov, M.S. Nikulin (Leningrad Division, 1977), pp. 4–65
C. Stein, Estimation of the mean of a multivariate normal distribution. Ann. Stat. 9, 1135–1151 (1981)
Appendix
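In this appendix, we first give another derivation of the Stein identity (5.3). Denote by \(f({\varvec{X}})\) the p.d.f. of \(\mathcal{N}_{m\times p}({\varvec{{\Theta }}},{\varvec{I}}_m\otimes {\varvec{{\Sigma }}})\). Let
\[I_{ST}({\varvec{G}})=\int _{\mathbb {R}^{m\times p}}\mathrm{\,tr\,}[\nabla _X^\top \{{\varvec{G}}f({\varvec{X}})\}](\mathrm{d}{\varvec{X}}).\]
It follows that \(\nabla _X f({\varvec{X}})=-({\varvec{X}}-{\varvec{{\Theta }}}){\varvec{{\Sigma }}}^{-1} f({\varvec{X}})\), so that
\[I_{ST}({\varvec{G}})=E[\mathrm{\,tr\,}\nabla _X^\top {\varvec{G}}]-E[\mathrm{\,tr\,}{\varvec{{\Sigma }}}^{-1}({\varvec{X}}-{\varvec{{\Theta }}})^\top {\varvec{G}}],\]
provided the expectations exist. Hence the Stein identity (5.3) can be verified if \(I_{ST}({\varvec{G}})=0\).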
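As an informal numerical check of the identity targeted by this argument, the following Python sketch estimates both expectations by Monte Carlo. It assumes that (5.3) takes the form \(E[\mathrm{\,tr\,}{\varvec{{\Sigma }}}^{-1}({\varvec{X}}-{\varvec{{\Theta }}})^\top {\varvec{G}}]=E[\mathrm{\,tr\,}\nabla _X^\top {\varvec{G}}]\) for an \(m\times p\) function \({\varvec{G}}\). The test function \({\varvec{G}}({\varvec{X}})={\varvec{X}}/(1+\mathrm{\,tr\,}{\varvec{X}}^\top {\varvec{X}})\), the diagonal \({\varvec{{\Sigma }}}\), the values of \({\varvec{{\Theta }}}\) and the name `stein_identity_check` are illustrative assumptions, not taken from the chapter.

```python
import random


def stein_identity_check(n_samples=100_000, seed=0):
    """Monte Carlo estimates of both sides of the (assumed) identity
        E[tr Sigma^{-1} (X - Theta)^T G] = E[tr grad_X^T G]
    for G(X) = X / (1 + ||X||_F^2) and a diagonal Sigma."""
    rng = random.Random(seed)
    theta = [[0.5, -1.0], [0.0, 1.5]]  # hypothetical m x p mean matrix
    sigma2 = [1.0, 2.0]                # hypothetical diagonal of Sigma
    m, p = len(theta), len(sigma2)
    lhs = rhs = 0.0
    for _ in range(n_samples):
        # Rows of X are independent N_p(theta_i, Sigma); Sigma is diagonal,
        # so the entries x_ij are independent N(theta_ij, sigma_j^2).
        x = [[theta[i][j] + sigma2[j] ** 0.5 * rng.gauss(0.0, 1.0)
              for j in range(p)] for i in range(m)]
        s = sum(v * v for row in x for v in row)  # ||X||_F^2
        # tr Sigma^{-1} (X - Theta)^T G
        #   = sum_ij (x_ij - theta_ij) * g_ij / sigma_j^2
        lhs += sum((x[i][j] - theta[i][j]) * x[i][j] / (sigma2[j] * (1.0 + s))
                   for i in range(m) for j in range(p))
        # tr grad_X^T G = sum_ij d g_ij / d x_ij
        #   = m p / (1 + s) - 2 s / (1 + s)^2
        rhs += m * p / (1.0 + s) - 2.0 * s / (1.0 + s) ** 2
    return lhs / n_samples, rhs / n_samples


if __name__ == "__main__":
    left, right = stein_identity_check()
    print(f"E[tr Sigma^-1 (X - Theta)^T G] ~ {left:.4f}")
    print(f"E[tr grad_X^T G]               ~ {right:.4f}")
```

The two estimates should agree up to Monte Carlo error. Since this \({\varvec{G}}\) is bounded and \(f({\varvec{X}})\) decays exponentially, the boundary term \(I_{ST}({\varvec{G}})\) is expected to vanish here.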
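For \(r>0\), let \(\mathbb {B}_r=\{{\varvec{X}}\in \mathbb {R}^{m\times p}:\Vert \mathrm {vec}({\varvec{X}})\Vert \le r\}\), where \(\Vert \cdot \Vert \) is the usual Euclidean norm and \(\mathrm {vec}(\cdot )\) is defined in Definition 2.3. Then \(\mathbb {B}_r\rightarrow \mathbb {R}^{m\times p}\) as \(r\rightarrow {\infty }\) and
\[I_{ST}({\varvec{G}})=\lim _{r\rightarrow {\infty }}\int _{\mathbb {B}_r}\mathrm{\,tr\,}[\nabla _X^\top \{{\varvec{G}}f({\varvec{X}})\}](\mathrm{d}{\varvec{X}}).\]
The boundary of \(\mathbb {B}_r\) is expressed by \(\partial \mathbb {B}_r=\{\mathrm {vec}({\varvec{X}})\in \mathbb {R}^{mp}:\Vert \mathrm {vec}({\varvec{X}})\Vert =r\}\). Denote by \({\varvec{u}}\) an outward unit normal vector at a point \(\mathrm {vec}({\varvec{X}})\in \partial \mathbb {B}_r\). Let \({\lambda }_{\partial \mathbb {B}_r}\) be Lebesgue measure on \(\partial \mathbb {B}_r\). By the Gauss divergence theorem,
\[\int _{\mathbb {B}_r}\mathrm{\,tr\,}[\nabla _X^\top \{{\varvec{G}}f({\varvec{X}})\}](\mathrm{d}{\varvec{X}})=\int _{\partial \mathbb {B}_r}{\varvec{u}}^\top \mathrm {vec}({\varvec{G}})f({\varvec{X}})(\mathrm{d}{\lambda }_{\partial \mathbb {B}_r}).\]
For details of the Gauss divergence theorem, see Fleming (1977).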
Let \(o(\cdot )\) be the Landau symbol, namely, for real-valued functions f(x) and g(x) with \(g(x)\ne 0\), we write \(f(x)=o(g(x))\) when \(\lim _{x\rightarrow c}|f(x)/g(x)|=0\) for an extended real number c. If
then \(I_{ST}({\varvec{G}})=0\). In fact,
because \(\int _{\partial \mathbb {B}_r} (\mathrm{d}{\lambda }_{\partial \mathbb {B}_r})\) is the surface area of the \((mp-1)\)-sphere of radius r in \(\mathbb {R}^{mp}\), which is proportional to \(r^{mp-1}\).
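To illustrate why the growth rate \(r^{mp-1}\) of the surface area is harmless, the sketch below multiplies the surface area by the normal density on the sphere. It assumes the special case \({\varvec{{\Theta }}}={\varvec{0}}\) and \({\varvec{{\Sigma }}}={\varvec{I}}_p\), in which \(f({\varvec{X}})\) is constant on \(\partial \mathbb {B}_r\), and a bounded \({\varvec{G}}\), so that the boundary integral is at most a constant multiple of this product. The function names and the dimension `d` (playing the role of \(mp\)) are hypothetical.

```python
import math


def sphere_surface_area(r, d):
    """Surface area of the (d-1)-sphere of radius r in R^d:
    2 * pi^(d/2) * r^(d-1) / Gamma(d/2)."""
    return 2.0 * math.pi ** (d / 2.0) * r ** (d - 1) / math.gamma(d / 2.0)


def boundary_bound(r, d):
    """Surface area times the N(0, I_d) density evaluated on the sphere
    of radius r (assumes Theta = 0 and Sigma = I, so the density is
    constant on the sphere)."""
    density = (2.0 * math.pi) ** (-d / 2.0) * math.exp(-0.5 * r * r)
    return sphere_surface_area(r, d) * density


if __name__ == "__main__":
    d = 6  # e.g. m = 2, p = 3
    for r in (1.0, 2.0, 5.0, 10.0):
        print(r, sphere_surface_area(r, d), boundary_bound(r, d))
```

The surface area grows polynomially in `r` while the product decays to zero, which is the behaviour that a condition of order \(o(r^{1-mp})\) on the boundary integrand is meant to capture.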
Next, a simple derivation of the Haff identity (5.5) is provided by using the Gauss divergence theorem. The derivation is essentially the same as that of Haff (1977, 1979). Let \(f({\varvec{S}})\) be the p.d.f. of \(\mathcal{W}_p(n,{\varvec{{\Sigma }}})\). For a differentiable matrix-valued function \({\varvec{G}}\in \mathbb {S}_p\), let
\[I_{HF}({\varvec{G}})=\int _{\mathbb {S}_p^{(+)}}\mathrm{\,tr\,}[\mathrm {D}_S\{{\varvec{G}}f({\varvec{S}})\}](\mathrm{d}{\varvec{S}}).\]
Since \(\mathrm {D}_S|{\varvec{S}}|={\varvec{S}}^{-1}|{\varvec{S}}|\) and \(\mathrm {D}_S\mathrm{\,tr\,}{\varvec{{\Sigma }}}^{-1}{\varvec{S}}={\varvec{{\Sigma }}}^{-1}\), we get \(\mathrm {D}_S f({\varvec{S}})=\{(n-p-1){\varvec{S}}^{-1}-{\varvec{{\Sigma }}}^{-1}\}f({\varvec{S}})/2\), implying that
\[I_{HF}({\varvec{G}})=E[\mathrm{\,tr\,}\mathrm {D}_S{\varvec{G}}]+\frac{1}{2}E[\mathrm{\,tr\,}\{(n-p-1){\varvec{S}}^{-1}-{\varvec{{\Sigma }}}^{-1}\}{\varvec{G}}],\]
provided the expectations exist. Hence the Haff identity (5.5) follows if \(I_{HF}({\varvec{G}})=0\).
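A similar Monte Carlo sketch can be used as a plausibility check for the Wishart case. It assumes that (5.5) reads \(E[\mathrm{\,tr\,}{\varvec{{\Sigma }}}^{-1}{\varvec{G}}]=E[(n-p-1)\mathrm{\,tr\,}{\varvec{S}}^{-1}{\varvec{G}}+2\mathrm{\,tr\,}\mathrm {D}_S{\varvec{G}}]\) with \(\mathrm {D}_S=(\{(1+\delta _{ij})/2\}\partial /\partial s_{ij})\), which is consistent with \(\mathrm {D}_S|{\varvec{S}}|={\varvec{S}}^{-1}|{\varvec{S}}|\). The choices \(p=2\), \(n=5\), the matrix \({\varvec{{\Sigma }}}\), the test function \({\varvec{G}}={\varvec{S}}/\mathrm{\,tr\,}{\varvec{S}}\) and the name `haff_identity_check` are illustrative assumptions. For this \({\varvec{G}}\), \(\mathrm{\,tr\,}{\varvec{S}}^{-1}{\varvec{G}}=p/\mathrm{\,tr\,}{\varvec{S}}\) and \(\mathrm{\,tr\,}\mathrm {D}_S{\varvec{G}}=(q-1)/\mathrm{\,tr\,}{\varvec{S}}\) with \(q=p(p+1)/2\).

```python
import random


def haff_identity_check(n_samples=50_000, n=5, seed=0):
    """Monte Carlo estimates of both sides of the (assumed) Haff identity
        E[tr Sigma^{-1} G] = E[(n - p - 1) tr S^{-1} G + 2 tr D_S G]
    for S ~ W_2(n, Sigma) and G(S) = S / tr S."""
    rng = random.Random(seed)
    p = 2
    q = p * (p + 1) // 2
    # Hypothetical Sigma = [[2.0, 0.5], [0.5, 1.0]] and its Cholesky factor.
    s11, s12, s22 = 2.0, 0.5, 1.0
    l11 = s11 ** 0.5
    l21 = s12 / l11
    l22 = (s22 - l21 ** 2) ** 0.5
    det = s11 * s22 - s12 ** 2
    inv11, inv12, inv22 = s22 / det, -s12 / det, s11 / det  # Sigma^{-1}
    lhs = rhs = 0.0
    for _sample in range(n_samples):
        # S = sum_k x_k x_k^T with x_k ~ N_2(0, Sigma); S = [[a, b], [b, c]].
        a = b = c = 0.0
        for _k in range(n):
            z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
            x1 = l11 * z1
            x2 = l21 * z1 + l22 * z2
            a += x1 * x1
            b += x1 * x2
            c += x2 * x2
        t = a + c  # tr S
        # tr Sigma^{-1} G with G = S / tr S
        lhs += (inv11 * a + 2.0 * inv12 * b + inv22 * c) / t
        # (n - p - 1) tr S^{-1} G + 2 tr D_S G
        #   = (n - p - 1) p / t + 2 (q - 1) / t
        rhs += ((n - p - 1) * p + 2.0 * (q - 1)) / t
    return lhs / n_samples, rhs / n_samples


if __name__ == "__main__":
    left, right = haff_identity_check()
    print(f"E[tr Sigma^-1 G]                      ~ {left:.4f}")
    print(f"E[(n-p-1) tr S^-1 G + 2 tr D_S G]     ~ {right:.4f}")
```

With \(n-p-1=2>0\) the density vanishes on \(\partial \mathbb {S}_p^{(+)}\), so the two estimates are expected to agree up to Monte Carlo error.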
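Denote by \(\partial /\partial {\varvec{S}}=(\partial /\partial s_{ij})\) the \(p\times p\) matrix differential operator with respect to \({\varvec{S}}\in \mathbb {S}_p\). For \({\varvec{A}}=(a_{ij})\in \mathbb {S}_p\), define
\[\mathrm {Vec}({\varvec{A}})=(a_{11},a_{12},\ldots ,a_{1p},a_{22},\ldots ,a_{2p},\ldots ,a_{pp})^\top \in \mathbb {R}^q,\]
where \(q=p(p+1)/2\). From symmetry of \({\varvec{G}}=(g_{ij})\), it holds that
\[\mathrm{\,tr\,}[\mathrm {D}_S\{{\varvec{G}}f({\varvec{S}})\}]=\sum _{i\le j}\frac{\partial }{\partial s_{ij}}\{g_{ij}f({\varvec{S}})\}=\mathrm {Vec}(\partial /\partial {\varvec{S}})^\top \mathrm {Vec}({\varvec{G}}f({\varvec{S}})),\]
so that
\[I_{HF}({\varvec{G}})=\int _{\mathbb {S}_p^{(+)}}\mathrm {Vec}(\partial /\partial {\varvec{S}})^\top \mathrm {Vec}({\varvec{G}}f({\varvec{S}}))(\mathrm{d}{\varvec{S}}).\]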
For \(r>0\), let \(\partial \mathbb {B}_r^q=\{\mathrm {Vec}({\varvec{S}})\in \mathbb {R}^q:\Vert \mathrm {Vec}({\varvec{S}})\Vert =r\}\) and, for \(0< r_1\le r_2<{\infty }\), let \(\mathbb {C}_{r_1,r_2}=\{\mathrm {Vec}({\varvec{S}})\in \mathbb {R}^q: r_1\le \Vert \mathrm {Vec}({\varvec{S}})\Vert \le r_2\}\). Then \(\mathbb {C}_{r_1,r_2}\cap \mathbb {S}_p^{(+)}\rightarrow \mathbb {S}_p^{(+)}\) as \(r_1\rightarrow 0\) and \(r_2\rightarrow {\infty }\). The boundary of \(\mathbb {C}_{r_1,r_2}\cap \mathbb {S}_p^{(+)}\) can be expressed as \(\bigcup _{i=1}^3 \partial \mathbb {B}_i\), where \(\partial \mathbb {B}_1\), \(\partial \mathbb {B}_2\) and \(\partial \mathbb {B}_3\) are certain sets satisfying \(\partial \mathbb {B}_1\subset \partial \mathbb {B}_{r_1}^q\), \(\partial \mathbb {B}_2\subset \partial \mathbb {B}_{r_2}^q\) and \(\partial \mathbb {B}_3\subset \partial \mathbb {S}_p^{(+)}\). Note that, for any point \({\varvec{S}}\in \partial \mathbb {S}_p^{(+)}\), \(|{\varvec{S}}|=0\), namely, \(f({\varvec{S}})=0\) when \(n-p-1>0\). Let \({\varvec{u}}_1=-\mathrm {Vec}({\varvec{S}})/\Vert \mathrm {Vec}({\varvec{S}})\Vert \) for \(\mathrm {Vec}({\varvec{S}})\in \partial \mathbb {B}_{r_1}^q\) and \({\varvec{u}}_2=\mathrm {Vec}({\varvec{S}})/\Vert \mathrm {Vec}({\varvec{S}})\Vert \) for \(\mathrm {Vec}({\varvec{S}})\in \partial \mathbb {B}_{r_2}^q\). Denote by \({\lambda }_{\partial \mathbb {B}_r^q}\) Lebesgue measure on \(\partial \mathbb {B}_r^q\). Using the Gauss divergence theorem gives
Using the Landau symbol \(o(\cdot )\), we assume that
and
Under these assumptions, we can see that
and also
so that \(I_{HF}({\varvec{G}})=0\).
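The passage from the matrix operator \(\mathrm {D}_S\) to an ordinary divergence in \(\mathbb {R}^q\), on which the argument above relies, can be checked numerically. The sketch below assumes \(\mathrm {D}_S=(\{(1+\delta _{ij})/2\}\partial /\partial s_{ij})\) and takes \(\mathrm {Vec}(\cdot )\) to stack the upper-triangular elements row by row; both the ordering and all function names are assumptions made for illustration. Derivatives are approximated by central differences, moving \(s_{ij}\) and \(s_{ji}\) together.

```python
def vec_index_pairs(p):
    """Index pairs (i, j) with i <= j, row by row (assumed ordering)."""
    return [(i, j) for i in range(p) for j in range(i, p)]


def vec(a):
    """Stack the q = p(p+1)/2 distinct elements of a symmetric matrix."""
    return [a[i][j] for i, j in vec_index_pairs(len(a))]


def g_square(s):
    """Example symmetric matrix-valued function G(S) = S^2."""
    p = len(s)
    return [[sum(s[i][k] * s[k][j] for k in range(p)) for j in range(p)]
            for i in range(p)]


def partial(g, s, i, j, h=1e-5):
    """Central-difference derivative of G(S) with respect to s_ij,
    treating s_ij and s_ji as a single variable."""
    def shifted(step):
        t = [row[:] for row in s]
        t[i][j] += step
        if i != j:
            t[j][i] += step
        return g(t)
    plus, minus = shifted(h), shifted(-h)
    p = len(s)
    return [[(plus[a][b] - minus[a][b]) / (2.0 * h) for b in range(p)]
            for a in range(p)]


def trace_ds(g, s):
    """tr(D_S G) = sum_{i,k} ((1 + delta_ik) / 2) * d g_ki / d s_ik."""
    p = len(s)
    total = 0.0
    for i in range(p):
        for k in range(p):
            weight = 1.0 if i == k else 0.5
            total += weight * partial(g, s, i, k)[k][i]
    return total


def vec_divergence(g, s):
    """Vec(d/dS)^T Vec(G) = sum_{i <= j} d g_ij / d s_ij."""
    return sum(partial(g, s, i, j)[i][j] for i, j in vec_index_pairs(len(s)))


if __name__ == "__main__":
    s = [[2.0, 0.5], [0.5, 1.0]]
    print(vec(s), trace_ds(g_square, s), vec_divergence(g_square, s))
```

For \({\varvec{G}}={\varvec{S}}^2\) both quantities equal \((p+1)\mathrm{\,tr\,}{\varvec{S}}\), so the two functions should return the same value up to rounding error.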
© 2020 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
Tsukuma, H., Kubokawa, T. (2020). A Generalized Stein Identity and Matrix Differential Operators. In: Shrinkage Estimation for Mean and Covariance Matrices. SpringerBriefs in Statistics. Springer, Singapore. https://doi.org/10.1007/978-981-15-1596-5_5
DOI: https://doi.org/10.1007/978-981-15-1596-5_5
Publisher Name: Springer, Singapore
Print ISBN: 978-981-15-1595-8
Online ISBN: 978-981-15-1596-5
eBook Packages: Mathematics and Statistics, Mathematics and Statistics (R0)