Modelling covariance matrices by the trigonometric separation strategy with application to hidden Markov models

Original Paper

Abstract

Bayesian inference on a covariance matrix is usually performed after placing an inverse-Wishart or a multivariate Jeffreys prior on it, but both priors, for different reasons, have drawbacks. As an alternative, the covariance matrix can be modelled by separating out the standard deviations and the correlations. This separation strategy takes advantage of the fact that it is usually more straightforward and flexible to set priors on the standard deviations and the correlations than on the covariance matrix directly. On the other hand, the priors must preserve the positive definiteness of the correlation matrix. This can be achieved by considering the Cholesky decomposition of the correlation matrix, whose entries are reparameterized using trigonometric functions. The efficiency of the trigonometric separation strategy (TSS) is shown through an application to hidden Markov models (HMMs) whose conditional distributions are multivariate normal. In the case of an unknown number of hidden states, estimation is conducted using a reversible jump Markov chain Monte Carlo algorithm based on split-and-combine and birth-and-death moves, whose design is straightforward because of the use of the TSS. Finally, an example in remote sensing is described, in which an HMM containing the TSS is used for the segmentation of a multi-colour satellite image.


References

  • Barnard J, McCulloch R, Meng X-L (2000) Modeling covariance matrices in terms of standard deviations and correlations, with application to shrinkage. Stat Sin 10:1281–1311

  • Cappé O, Moulines E, Rydén T (2005) Inference in hidden Markov models. Springer, New York

  • Cappé O, Robert CP, Rydén T (2003) Reversible jump, birth-and-death and more general continuous time Markov chain Monte Carlo samplers. J R Stat Soc Ser B 65:679–700

  • Celeux G, Hurn M, Robert CP (2000) Computational and inferential difficulties with mixture posterior distributions. J Am Stat Assoc 95:957–970

  • Daniels MJ, Kass RE (1999) Nonconjugate Bayesian estimation of covariance matrices and its use in hierarchical models. J Am Stat Assoc 94:1254–1263

  • Daniels MJ, Pourahmadi M (2002) Bayesian analysis of covariance matrices and dynamic models for longitudinal data. Biometrika 89:553–566

  • Daniels MJ, Pourahmadi M (2009) Modeling covariance matrices via partial autocorrelations. J Multivar Anal 100:2352–2363

  • Dellaportas P, Papageorgiou I (2006) Multivariate mixtures of normals with unknown number of components. Stat Comput 16:57–68

  • Dellaportas P, Plataniotis A, Titsias MK (2015) Scalable inference for a full multivariate stochastic volatility model. arXiv:1510.05257v1. Accessed 25 Aug 2017

  • Friel N, Pettitt AN, Reeves R, Wit E (2009) Bayesian inference in hidden Markov random fields for binary data defined on large lattices. J Comput Graph Stat 18:243–261

  • Frühwirth-Schnatter S (2001) Markov chain Monte Carlo estimation of classical and dynamic switching and mixture models. J Am Stat Assoc 96:194–209

  • Gelman A, Meng X-L (1998) Simulating normalizing constants: from importance sampling to bridge sampling to path sampling. Stat Sci 13:163–185

  • Giordana N, Pieczynski W (1997) Estimation of generalised multisensor hidden Markov chains and unsupervised image segmentation. IEEE Trans Pattern Anal Mach Intell 19:465–475

  • Green PJ, Richardson S (2002) Hidden Markov models and disease mapping. J Am Stat Assoc 97:1055–1070

  • Hamilton JD (1994) Time series analysis. Princeton University Press, Princeton

  • Hoff PD (2009) A hierarchical eigenmodel for pooled covariance estimation. J R Stat Soc Ser B 71:971–992

  • Kamary K, Robert CP (2014) Reflecting about selecting noninformative priors. arXiv:1402.6257v3. Accessed 25 Aug 2017

  • Kim C-J (1993) Dynamic linear models with Markov-switching. J Econom 60:1–22

  • Krolzig H-M (1997) Markov-switching vector autoregressions: modelling, statistical inference and applications to business cycle analysis. Springer, Berlin

  • Leonard T, Hsu JST (1992) Bayesian inference for a covariance matrix. Ann Stat 20:1669–1696

  • Liechty JC, Liechty MW, Müller P (2004) Bayesian correlation estimation. Biometrika 91:1–14

  • Marin JM, Mengersen KL, Robert CP (2005) Bayesian modelling and inference on mixtures of distributions. In: Dey D, Rao CR (eds) Handbook of statistics 25. Elsevier, Amsterdam, pp 459–507

  • Møller J, Pettitt AN, Berthelsen KK, Reeves RW (2006) An efficient Markov chain Monte Carlo method for distributions with intractable normalising constants. Biometrika 93:451–458

  • Murray I, Ghahramani Z, MacKay DJC (2006) MCMC for doubly-intractable distributions. In: Dechter R, Richardson T (eds) Proceedings of the twenty-second conference on uncertainty in artificial intelligence. AUAI Press, Arlington, pp 359–366

  • Paroli R, Spezia L (2010) Reversible jump MCMC methods and segmentation algorithms in hidden Markov models. Aust N Z J Stat 52:151–166

  • Pinheiro JC, Bates DM (1996) Unconstrained parametrizations for variance-covariance matrices. Stat Comput 6:289–296

  • Qian W, Titterington DM (1991) Estimation of parameters in hidden Markov models. Philos Trans R Soc Lond Ser A 337:407–428

  • Richardson S, Green PJ (1997) On Bayesian analysis of mixtures with an unknown number of components (with discussion). J R Stat Soc Ser B 59:731–792

  • Scott SL, James GM, Sugar CA (2005) Hidden Markov models for longitudinal comparisons. J Am Stat Assoc 100:359–369

  • Seaman JW III, Seaman JW Jr, Stamey JD (2012) Hidden dangers of specifying noninformative priors. Am Stat 66:77–84

  • Smith M, Kohn R (2002) Parsimonious covariance matrix estimation for longitudinal data. J Am Stat Assoc 97:1141–1153

  • Spezia L (2010) Bayesian analysis of multivariate Gaussian hidden Markov models with an unknown number of regimes. J Time Ser Anal 31:1–11

  • Spezia L, Friel N, Gimona A (2017) Spatial hidden Markov models and species distribution. J Appl Stat, published online

  • Wang H, Pillai NS (2013) On a class of shrinkage priors for covariance matrix estimation. J Comput Graph Stat 22:689–707

  • Yang R, Berger JO (1994) Estimation of a covariance matrix using the reference prior. Ann Stat 22:1195–1211

  • Zucchini W, MacDonald IL, Langrock R (2016) Hidden Markov models for time series: an introduction using R, 2nd edn. Chapman & Hall/CRC Press, Boca Raton


Acknowledgements

This research was funded by the Scottish Government’s Rural and Environment Science and Analytical Services Division. The images in Fig. 6 were kindly produced by Laura Origgi. The satellite image was provided by Carlos Padovani. A discussion with Laura Poggio helped to clarify a few problems related to multispectral sensors. Comments from Mark Brewer, Glenn Marion, and two anonymous referees improved the quality of the final paper.

Author information


Corresponding author

Correspondence to Luigi Spezia.

Appendices

Appendix A

The vector \(\left( \mu ,S,\alpha ,\Omega ,m\right) ^{\prime }\) is estimated in a single RJMCMC run, by using the values of \(\mu \), S, \(\alpha \), and \(\Omega \) collected in those sweeps in which the number of hidden states equals the most frequent value of m. Nevertheless, to make the algorithm easier to follow, we first present the details of the MCMC algorithm for a given number m of states and then describe the RJMCMC when m is unknown.

Known number of states

In each sweep of the MCMC algorithm, the parameters \(\mu \), S, \(\alpha \), \(\Omega \) are accepted or rejected, after generating their elements by random walk moves. Random walks can only update real-valued parameters; thus, the positive quantities \(\omega _{j,i}\) and the standard deviations \(\sigma _{i,h}\) are mapped onto the real line through the logarithmic transformation, while the angles, which belong to the interval \(( A_{i,k,l};B_{i,k,l}) \), are mapped via the natural logarithm of \([ \alpha _{i,k,l}-A_{i,k,l}] /[ B_{i,k,l}-\alpha _{i,k,l}] \). The values \(\mu _{i,h}\), \(\ln \sigma _{i,h}\), \(\ln [ \alpha _{i,k,l}-A_{i,k,l}] /[ B_{i,k,l}-\alpha _{i,k,l}] \), and \(\ln \omega _{j,i}\), for any \(i,j=1,\ldots ,m\), \(h=1,\ldots ,p\), \(k=1,\ldots ,p-1\), and \(l=k+1,\ldots ,p\), belong to the interval \(\left( -\,\infty ;+\,\infty \right) \) and can be generated by the following random walk proposals:

$$\begin{aligned}&\mu _{i,h}=\mu _{i,h}^{(old)}+U_{M} \\&\ln \sigma _{i,h}=\ln \sigma _{i,h}^{(old)}+U_{\Sigma } \\&\ln \left[ \alpha _{i,k,l}-A_{i,k,l}\right] /\left[ B_{i,k,l}-\alpha _{i,k,l} \right] =\ln \left[ \alpha _{i,k,l}^{(old)}-A_{i,k,l}\right] /\left[ B_{i,k,l}-\alpha _{i,k,l}^{(old)}\right] +U_{A} \\&\ln \omega _{j,i}=\ln \omega _{j,i}^{(old)}+U_{\Omega }, \end{aligned}$$

where \(U_{\Psi }\sim \mathcal {N}\left( 0;\sigma _{\Psi }^{2}\right) \), with \( \Psi \in \left\{ M;\Sigma ;A;\Omega \right\} \).
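As an illustration, the following Python sketch performs one such random-walk sweep on the transformed scales; the step-size dictionary `step`, the arrays of angle bounds `A` and `B`, and the array shapes are assumptions made for the example, not quantities fixed by the paper.

```python
import numpy as np

rng = np.random.default_rng()

def propose(mu, sigma, alpha, omega, A, B, step):
    """One random-walk sweep on the transformed scales (a sketch).

    mu    : (m, p) means, updated on the natural scale
    sigma : (m, p) standard deviations, updated on the log scale
    alpha : angles in (A, B), updated via ln[(alpha - A)/(B - alpha)]
    omega : (m, m) positive transition weights, updated on the log scale
    """
    mu_new = mu + rng.normal(0.0, step["M"], size=mu.shape)
    sigma_new = np.exp(np.log(sigma) + rng.normal(0.0, step["S"], size=sigma.shape))
    z = (np.log((alpha - A) / (B - alpha))
         + rng.normal(0.0, step["A"], size=alpha.shape))
    alpha_new = (A + B * np.exp(z)) / (1.0 + np.exp(z))   # back-transform into (A, B)
    omega_new = np.exp(np.log(omega) + rng.normal(0.0, step["O"], size=omega.shape))
    return mu_new, sigma_new, alpha_new, omega_new
```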

The proposals \(\mu \), S, \(\alpha \), \(\Omega \) are accepted if \(u_{\Psi }\le \min \left\{ 1;A_{\Psi }\right\} \), where \(u_{\Psi }\) is a random number generated from the uniform distribution \(\mathcal {U}\left( 0;1\right) \) and \(A_{\Psi }\) are the acceptance ratios (\( \Psi \in \left\{ M;\Sigma ;A;\Omega \right\} \)), i.e.

$$\begin{aligned} A_{\Psi }=\text {likelihood ratio }\times \text { prior ratio }\times \text { ratio of the products of the Jacobians,} \end{aligned}$$

where the likelihood ratio is

$$\begin{aligned} p(y^{T}\mid \mu ,S,\alpha ,\Omega ,m) /p(y^{T}\mid \mu ^{(old)},S ^{(old)},\alpha ^{(old)},\Omega ^{(old)},m) , \end{aligned}$$

the prior ratios, respectively, are

$$\begin{aligned} \begin{array}{cccc} p\left( \mu \right) /p\left( \mu ^{(old)}\right) ;&p\left( S\right) /p\left( S^{(old)}\right) ;&p\left( \alpha \right) /p\left( \alpha ^{(old)}\right) ;&p\left( \Omega \right) /p\left( \Omega ^{(old)}\right) , \end{array} \end{aligned}$$

and the ratios of the products of the Jacobians of the logarithmic transformations of the \(\sigma _{i,h}\), the \(\alpha _{i,k,l}\), and the \(\omega _{j,i}\), respectively, are

$$\begin{aligned} \overset{m}{\underset{i=1}{\prod }}\overset{p}{\underset{h=1}{\prod }} \sigma _{i,h}\bigg / \overset{m}{\underset{i=1}{\prod }}\overset{p}{ \underset{h=1}{\prod }}\sigma _{i,h}^{(old)} , \quad \frac{{{\prod }_{i=1}^m}{{\prod }_{k=1}^{p-1}} {{\prod }_{l=k+1}^p}\left( B_{i,k,l}-\alpha _{i,k,l}\right) \left( \alpha _{i,k,l}-A_{i,k,l}\right) }{{{\prod }_{i=1}^m} {{\prod }_{k=1}^{p-1}}{{\prod }_{l=k+1}^p}\left( B_{i,k,l}-\alpha _{i,k,l}^{(old)}\right) \left( \alpha _{i,k,l}^{(old)}-A_{i,k,l}\right) }, \end{aligned}$$

and

$$\begin{aligned} \overset{m}{\underset{i=1}{\prod }}\overset{m}{\underset{j=1}{\prod }}\omega _{j,i}\bigg / \overset{m}{\underset{i=1}{\prod }}\overset{m}{\underset{j=1}{ \prod }}\omega _{j,i}^{(old)}. \end{aligned}$$
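In log form, each acceptance ratio is just the sum of three differences. Below is a minimal sketch for the standard-deviation update, assuming the log-likelihood and log-prior values are supplied elsewhere (e.g. by the forward recursion for the HMM likelihood and by the prior on S).

```python
import numpy as np

def log_accept_sigma(loglik_new, loglik_old, logprior_new, logprior_old,
                     sigma_new, sigma_old):
    """Log of the acceptance ratio A_Sigma for the standard-deviation update.

    The Jacobian of the transformation sigma -> ln(sigma) contributes the
    factor prod(sigma_new) / prod(sigma_old) to the acceptance ratio.
    """
    log_jac = np.sum(np.log(sigma_new)) - np.sum(np.log(sigma_old))
    return (loglik_new - loglik_old) + (logprior_new - logprior_old) + log_jac

# the proposal is accepted when log(u) <= min(0, log_A), with u ~ U(0, 1)
```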

At the end of each iteration, the MCMC sample is post-processed as in Marin et al. (2005). Assume that at iteration k of the MCMC algorithm we store the values \(\{\mu ^{(k)},S^{(k)}, \alpha ^{(k)},\Omega ^{(k)}\} \). Let H be the class of the m! permutations \(\eta _{j}\) of the labels (\(\eta _{j}\in H\), for any \( j=1,\ldots ,m!\)), so that \(\eta _{j}\ (\mu ^{(k)},S ^{(k)},\alpha ^{(k)},\Omega ^{(k)}) \) is the corresponding permutation of the parameters obtained at the k-th iteration, by which the means, the standard deviations, the angles, and the rows and columns of the transition matrix assume a new order.

After the burn-in, if a sample of size N (\(k=1,\ldots ,N\)) is simulated, the post-processing algorithm works as follows:

(i) compute the posterior mode \(\left\{ \mu ^{*}, S^{*},\alpha ^{*},\Omega ^{*}\right\} \), such that

$$\begin{aligned} \left\{ \mu ^{*},S^{*},\alpha ^{*}, \Omega ^{*}\right\} =\arg \underset{k=1,\ldots ,N}{\max } p(\mu ^{(k)},S^{(k)},\alpha ^{(k)},\Omega ^{(k)}\mid y^{T},m) \end{aligned}$$
(ii) for any \(k=1,\ldots ,N\), compute \(\eta ^{*}\) such that

$$\begin{aligned} \eta ^{*}=\arg \underset{\eta _{j}\in H}{\min }\left\| \eta _{j}(\mu ^{(k)},S^{(k)},\alpha ^{(k)}, \Omega ^{(k)}) -(\mu ^{*},S^{*},\alpha ^{*},\Omega ^{*}) \right\| \end{aligned}$$

and place

$$\begin{aligned} (\mu ^{(k)},S^{(k)},\alpha ^{(k)},\Omega ^{(k)}) =\eta ^{*}(\mu ^{(k)},S ^{(k)},\alpha ^{(k)},\Omega ^{(k)}) . \end{aligned}$$

In step (ii), for each entry of the MCMC sample, we first compute the Euclidean norm between every permuted vector of parameters and the posterior mode; then, we select the reordered vector nearest to the posterior mode. Therefore, the label switching problem is circumvented without imposing any artificial identifiability constraint.
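A minimal sketch of this post-processing, assuming the per-sweep parameters have been stacked state by state into a single array and that the unnormalised log-posterior of each draw is available; for brevity it permutes the state index only, whereas the full algorithm permutes the rows and columns of the transition matrix together.

```python
import itertools

import numpy as np

def relabel(draws, log_post):
    """Permute each MCMC draw so it is closest (in Euclidean norm) to the
    posterior mode, as in the post-processing of Marin et al. (2005).

    draws    : (N, m, d) array; draws[k, i] stacks the d parameters of
               state i (means, standard deviations, angles, ...) at sweep k
    log_post : (N,) unnormalised log-posterior of each draw
    """
    N, m, _ = draws.shape
    mode = draws[np.argmax(log_post)]                   # step (i): posterior mode
    perms = [np.array(p) for p in itertools.permutations(range(m))]
    out = np.empty_like(draws)
    for k in range(N):                                  # step (ii): best permutation
        dists = [np.linalg.norm(draws[k][p] - mode) for p in perms]
        out[k] = draws[k][perms[int(np.argmin(dists))]]
    return out
```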

Finally, the hidden sequence of the states is reconstructed. Each state is the maximizer of the corresponding smoothed probabilities: after obtaining the parameter estimates, we can compute backwards the smoothed probabilities of the states (Kim 1993), that is, the probabilities of each state, at each time, given all the observations and the estimates of the parameters, i.e. \(\widehat{ \mu }\), \(\widehat{\Sigma }\), \(\widehat{\Gamma }\):

$$\begin{aligned} x_{t}=\arg \max _{j}P( X_{t}=j\mid y^{T},\widehat{\mu } ,\widehat{\Sigma },\widehat{\Gamma },m) , \end{aligned}$$

with \(t=1,\ldots ,T\), where

$$\begin{aligned}&P( X_{t}=j\mid y^{T},\widehat{\mu },\widehat{ \Sigma },\widehat{\Gamma },m) \nonumber \\&\quad =P( X_{t}=j\mid y ^{t},\widehat{\mu },\widehat{\Sigma },\widehat{ \Gamma },m) \overset{m}{\underset{i=1}{\sum }}\frac{\gamma _{j,i} \text { }P( X_{t+1}=i\mid y^{T},\widehat{\mu }, \widehat{\Sigma },\widehat{\Gamma },m) }{P( X_{t+1}=i\mid y^{t},\widehat{\mu },\widehat{\Sigma },\widehat{\Gamma },m) }, \end{aligned}$$
(3)

for any \(t=T-1,\ldots ,1\) and any \(j=1,\ldots ,m\), starting from \(P( X_{T}\mid y^{T},\widehat{\mu },\widehat{\Sigma }, \widehat{\Gamma },m) \), which can be obtained by means of the filtered probabilities (2).
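A minimal sketch of recursion (3), assuming the \((T, m)\) matrix of filtered probabilities from (2) is already available from a forward pass.

```python
import numpy as np

def kim_smoother(filtered, gamma):
    """Backward recursion (3) for the smoothed state probabilities (Kim 1993).

    filtered : (T, m) filtered probabilities P(X_t = j | y^t, ...)
    gamma    : (m, m) estimated transition matrix, gamma[j, i] = P(j -> i)

    Returns the (T, m) smoothed probabilities P(X_t = j | y^T, ...).
    """
    T, m = filtered.shape
    smoothed = np.empty_like(filtered)
    smoothed[-1] = filtered[-1]                      # start from the filtered P(X_T | y^T)
    for t in range(T - 2, -1, -1):
        predicted = filtered[t] @ gamma              # P(X_{t+1} = i | y^t)
        smoothed[t] = filtered[t] * (gamma @ (smoothed[t + 1] / predicted))
    return smoothed

# the restored sequence is then x_t = argmax_j smoothed[t, j]
```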

Unknown number of states

Our RJMCMC algorithm is based on three main moves, which allow changes in the number of hidden states:

[i] update the parameters as described in the previous subsection;

[ii] split one state of the MVN-HMM into two, or merge two states into one;

[iii] give birth to a new state or kill an existing one.

In move [ii], the split is randomly chosen with probability \(b_{m}= \mathbb {I}(m=1)+0.5\cdot \mathbb {I}(2\le m<m_{\max })\), whereas the combine is randomly chosen with probability \(d_{m}=1-b_{m}\).

In the combine move, two adjacent states, e.g. \(i_{1}\) and \(i_{2}=i_{1}+1\), are randomly selected and merged into state \(i^{*}\), reducing the number of hidden states by one; the corresponding parameters are combined as follows:

$$\begin{aligned} \begin{array}{ll} \mu _{i^{*},h}=( \mu _{i_{1},h}+\mu _{i_{2},h}) /2 &{}\quad \text { for any }h=1,\ldots ,p \\ \sigma ^{2}_{i^{*},h}=( \sigma ^{2}_{i_{1},h}\cdot \sigma ^{2} _{i_{2},h}) ^{1/2} &{}\quad \text {for any }h=1,\ldots ,p \\ \alpha _{i^{*},k,l}=\alpha _{i_{1},k,l}+\alpha _{i_{2},k,l} &{}\quad \text {for any }k=1,\ldots ,p-1\text { and any }l=k+1,\ldots ,p \\ \omega _{i,i^{*}}=\omega _{i,i_{1}}+\omega _{i,i_{2}} &{}\quad \text {for any } i\ne i^{*} \\ \omega _{i^{*},j}=( \omega _{i_{1},j}\cdot \omega _{i_{2},j}) ^{1/2} &{}\quad \text {for any }j\ne i^{*} \\ \omega _{i^{*},i^{*}}=( \omega _{i_{1},i_{1}}\cdot \omega _{i_{2},i_{1}}) ^{1/2}+( \omega _{i_{1},i_{2}}\cdot \omega _{i_{2},i_{2}}) ^{1/2} &{} \end{array} \end{aligned}$$
(4)
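The merging rules in (4) transcribe directly into code. A sketch follows, in which `mu` and `var` are the \((m, p)\) arrays of state means and variances, `alpha` the \((m, q)\) array of angles with \(q = p(p-1)/2\), and `omega` the \((m, m)\) matrix of positive transition weights; deleting rows and columns \(i_1\), \(i_2\) and inserting \(i^{*}\) is left to the caller.

```python
import numpy as np

def combine_states(mu, var, alpha, omega, i1):
    """Merge adjacent states i1 and i2 = i1 + 1 into i* following (4)."""
    i2 = i1 + 1
    mu_star = 0.5 * (mu[i1] + mu[i2])            # arithmetic mean of the means
    var_star = np.sqrt(var[i1] * var[i2])        # geometric mean of the variances
    alpha_star = alpha[i1] + alpha[i2]           # angles add up
    omega_in = omega[:, i1] + omega[:, i2]       # omega_{i,i*}, for i != i*
    omega_out = np.sqrt(omega[i1] * omega[i2])   # omega_{i*,j}, for j != i*
    omega_diag = (np.sqrt(omega[i1, i1] * omega[i2, i1])
                  + np.sqrt(omega[i1, i2] * omega[i2, i2]))
    return mu_star, var_star, alpha_star, omega_in, omega_out, omega_diag
```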

In the split move, a state \(i^{*}\) is picked at random and split into the two adjacent states \(i_{1}\) and \(i_{2}\); the corresponding parameters are split as follows, respecting the six equalities in (4). First, we generate the following \(p(p+3)/2+2m+1\) random values:

  • \(u_{1,h}\) from \(\mathcal {N}\left( 0;0.5\right) \), for any \( h=1,\ldots ,p\);

  • \(u_{2,h}\) from \(\mathcal {G}\left( 1;5\right) \), for any \( h=1,\ldots ,p\);

  • \(u_{3,k,l}\) from \(\mathcal {U}\left( 0;1\right) \), for any \( k=1,\ldots ,p-1\) and any \(l=k+1,\ldots ,p\);

  • \(v_{i}\) from \(\mathcal {U}\left( 0;1\right) \), for any \(i\ne i^{*}\);

  • \(w_{j}\) from \(\mathcal {G}\left( 1;5\right) \), for any \(j\ne i^{*}\);

  • \(\rho \) from \(\mathcal {U}\left( 0;1\right) \);

  • \(\tau _{1}\) and \(\tau _{2}\) from \(\mathcal {G}\left( 1;5\right) \).

Then, we set:

$$\begin{aligned} \begin{array}{ll} \mu _{i_{1},h}=\mu _{i^{*},h}-\sigma _{i^{*},h,h}\cdot u_{1,h} &{}\quad \mu _{i_{2},h}=\mu _{i^{*},h}+\sigma _{i^{*},h,h}\cdot u_{1,h} \\ \sigma ^{2}_{i_{1},h}=\sigma ^{2}_{i^{*},h}\cdot u_{2,h} &{}\quad \sigma ^{2}_{i_{2},h}=\sigma ^{2}_{i^{*},h}/u_{2,h} \\ \alpha _{i_{1},k,l}=\alpha _{i^{*},k,l}\cdot u_{3,k,l} &{}\quad \alpha _{i_{2},k,l}=\alpha _{i^{*},k,l}\cdot \left( 1-u_{3,k,l}\right) \\ \omega _{i,i_{1}}=\omega _{i,i^{*}}\cdot v_{i} &{}\quad \omega _{i,i_{2}}=\omega _{i,i^{*}}\cdot \left( 1-v_{i}\right) \\ \omega _{i_{1},j}=\omega _{i^{*},j}\cdot w_{j} &{}\quad \omega _{i_{2},j}=\omega _{i^{*},j}/w_{j} \\ \omega _{i_{1},i_{1}}=\omega _{i^{*},i^{*}}\cdot \rho \cdot \tau _{1} &{}\quad \omega _{i_{2},i_{1}}=\omega _{i^{*},i^{*}}\cdot \rho /\tau _{1} \\ \omega _{i_{1},i_{2}}=\omega _{i^{*},i^{*}}\cdot \left( 1-\rho \right) \cdot \tau _{2} &{}\quad \omega _{i_{2},i_{2}}=\omega _{i^{*},i^{*}}\cdot \left( 1-\rho \right) /\tau _{2} \end{array} \end{aligned}$$
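A sketch of the emission part of the split move follows (the \(\omega \) entries follow the same pattern). Two conventions are assumptions about the notation: \(\mathcal {N}(0;0.5)\) is read with variance 0.5, and \(\mathcal {G}(1;5)\) with shape 1 and rate 5.

```python
import numpy as np

rng = np.random.default_rng()

def split_state(mu_s, var_s, alpha_s):
    """Draw the auxiliary variables and split state i* into i1 and i2.

    mu_s, var_s : (p,) mean vector and variances of the split state i*
    alpha_s     : (q,) angles of state i*, q = p(p-1)/2
    """
    p, q = mu_s.size, alpha_s.size
    u1 = rng.normal(0.0, np.sqrt(0.5), size=p)          # u_{1,h} ~ N(0; 0.5)
    u2 = rng.gamma(shape=1.0, scale=1.0 / 5.0, size=p)  # u_{2,h} ~ G(1; 5), rate 5
    u3 = rng.uniform(size=q)                            # u_{3,k,l} ~ U(0; 1)
    sd = np.sqrt(var_s)
    state1 = (mu_s - sd * u1, var_s * u2, alpha_s * u3)
    state2 = (mu_s + sd * u1, var_s / u2, alpha_s * (1.0 - u3))
    return state1, state2
```

Note that the six equalities in (4) hold by construction: for instance, the arithmetic mean of the two new means returns \(\mu _{i^{*},h}\), and the geometric mean of the two new variances returns \(\sigma ^{2}_{i^{*},h}\).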

The split move is accepted with probability \(\min \left\{ 1;A\right\} \), while the combine move is accepted with probability \(\min \left\{ 1;A^{-1}\right\} \). Let the tilde mark the parameters of the model with \(m+1\) states, as opposed to those entering the model with m states; the analytic expression of A is

$$\begin{aligned}&\frac{p\left( y^{T}\mid \widetilde{\mu },\widetilde{S}, \widetilde{\alpha },\widetilde{\Omega },m+1\right) }{ p\left( y^{T}\mid \mu ,S,\alpha ,\Omega ,m\right) }\cdot \frac{p(m+1)}{p(m)}\cdot \frac{p\left( \mu \right) \cdot p\left( \widetilde{S}\right) \cdot p\left( \widetilde{ \alpha }\right) \cdot p\left( \widetilde{\Omega }\right) }{p\left( \widetilde{\mu }\right) \cdot p\left( S\right) \cdot p\left( \alpha \right) \cdot p\left( \Omega \right) }\cdot \frac{d_{m+1}/m}{b_{m}/m}\cdot \nonumber \\&\quad \cdot \frac{m+1}{{{\prod }_{h=1}^p}p( u_{1,h}) \cdot {{\prod }_{h=1}^p}p( u_{2,h}) \cdot {{\prod }_{k=1}^{p-1}}{{\prod }_{l=k+1}^p} p( u_{3,k,l}) \cdot {{\prod }_{i\ne i^*}}p( v_{i}) \cdot {{\prod }_{j\ne i^*}}p( w_{j}) \cdot p( \rho ) \cdot p( \tau _{1}) \cdot p( \tau _{2}) }\cdot |J|, \nonumber \\ \end{aligned}$$
(5)

where \(p(m+1)/p(m)\) cancels out; \(b_{m}/m\) is the probability of splitting the special state \(i^{*}\), while \(d_{m+1}/m\) is the probability of merging one of the m pairs \(\left( i_{1};i_{2}\right) \) of adjacent states; the factor \(\left( m+1\right) \) is the ratio \(\left( m+1\right) !/m!\), in which the factorials arise from the exchangeability assumption on the states; J is the Jacobian of the transformation from \((\omega _{i,i^{*}},v_{i},\omega _{i^{*},j},w_{j},\omega _{i^{*},i^{*}},\rho ,\tau _{1},\tau _{2},\mu _{i^{*},h},u_{1,h},\sigma ^{2}_{i^{*},h},u_{2,h},\alpha _{i^{*},k,l},u_{3,k,l})\) to \((\widetilde{\omega } _{i,i_{1}},\widetilde{\omega }_{i,i_{2}},\widetilde{\omega }_{i_{1},j}, \widetilde{\omega }_{i_{2},j},\widetilde{\omega }_{i_{1},i_{1}},\widetilde{ \omega }_{i_{1},i_{2}},\widetilde{\omega }_{i_{2},i_{1}},\widetilde{\omega } _{i_{2},i_{2}},\widetilde{\mu }_{i_{1},h},\widetilde{\mu }_{i_{2},h},\widetilde{ \sigma }^{2}_{i_{1},h},\widetilde{\sigma }^{2}_{i_{2},h},\widetilde{\alpha }_{i_{1},k,l},\widetilde{\alpha }_{i_{2},k,l})\).

Note that the Jacobian can be decomposed into the product of five subdeterminants, i.e. \(J_{1}\) for the transformation from \(\left( \omega _{i,i^{*}},v_{i}\right) \) to \(( \widetilde{\omega }_{i,i_{1}}, \widetilde{\omega }_{i,i_{2}}) \), \(J_{2}\) for the transformation from \( ( \omega _{i^{*},j},w_{j}) \) to \(( \widetilde{\omega } _{i_{1},j},\widetilde{\omega }_{i_{2},j}) \), \(J_{3}\) for the transformation from \(( \omega _{i^{*},i^{*}},\rho ,\tau _{1},\tau _{2}) \) to \(( \widetilde{\omega }_{i_{1},i_{1}}, \widetilde{\omega }_{i_{1},i_{2}},\widetilde{\omega }_{i_{2},i_{1}}, \widetilde{\omega }_{i_{2},i_{2}}) \), \(J_{4}\) for the transformation from \(( \mu _{i^{*},h},u_{1,h},\sigma ^{2}_{i^{*},h},u_{2,h}) \) to \(( \widetilde{\mu }_{i_{1},h},\widetilde{\mu }_{i_{2},h}, \widetilde{\sigma }^{2}_{i_{1},h},\widetilde{\sigma }^{2}_{i_{2},h}) \), \( J_{5}\) for the transformation from \(( \alpha _{i^{*},k,l},u_{3,k,l}) \) to \(( \widetilde{\alpha }_{i_{1},k,l}, \widetilde{\alpha }_{i_{2},k,l}) \). Hence, we have \(|J|=|J_{1}\cdot J_{2}\cdot J_{3}\cdot J_{4}\cdot J_{5}|\), where

$$\begin{aligned} J_{1}= & {} \underset{i\ne i^{*}}{\prod }\left( -\,\omega _{i,i^{*}}\right) \qquad J_{2}=\left( -\,2\right) ^{m-1}\cdot \underset{j\ne i^{*} }{\prod }\frac{\omega _{i^{*},j}}{w_{j}}\qquad J_{3}=-\,2\cdot \omega _{i^{*},i^{*}}^{3}\cdot \frac{\rho \cdot \left( 1-\rho \right) }{ \tau _{1}\cdot \tau _{2}} \\ J_{4}= & {} \left( -\,4\right) ^{p}\cdot \overset{p}{\underset{h=1}{\prod }}\frac{ \sigma _{i^{*},h}^{3}}{u_{2,h}}\qquad J_{5}=\overset{p-1}{\underset{ k=1}{\prod }}\underset{l=k+1}{\overset{p}{\prod }}\left( -\,\alpha _{i^{*},k,l}\right) . \end{aligned}$$
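For numerical stability, \(|J|\) is best accumulated on the log scale; the following sketch does so under the decomposition above, dropping the signs since only \(|J|\) enters (5). The argument layout (in particular the ordering of `w` against the indices \(j\ne i^{*}\)) is an assumption for the example.

```python
import numpy as np

def log_abs_jacobian(omega, w, rho, tau1, tau2, var_s, u2, alpha_s, i_star):
    """log|J| = log|J1 * J2 * J3 * J4 * J5| for the split move.

    omega   : (m, m) transition weights of the m-state model
    w       : (m-1,) auxiliary values w_j, ordered as j runs over j != i_star
    var_s   : (p,) variances sigma^2 of state i_star (so sigma^3 = var^{3/2})
    u2      : (p,) auxiliary values u_{2,h}
    alpha_s : (q,) angles of state i_star, q = p(p-1)/2
    """
    m, p = omega.shape[0], var_s.size
    mask = np.arange(m) != i_star
    logJ1 = np.sum(np.log(omega[mask, i_star]))
    logJ2 = (m - 1) * np.log(2.0) + np.sum(np.log(omega[i_star, mask] / w))
    logJ3 = (np.log(2.0) + 3.0 * np.log(omega[i_star, i_star])
             + np.log(rho * (1.0 - rho)) - np.log(tau1 * tau2))
    logJ4 = p * np.log(4.0) + np.sum(1.5 * np.log(var_s) - np.log(u2))
    logJ5 = np.sum(np.log(np.abs(alpha_s)))
    return logJ1 + logJ2 + logJ3 + logJ4 + logJ5
```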

In move [iii], birth and death are chosen with probability \(b_{m}\) and \(d_{m}\), respectively. In a death move, a state is selected at random and then suppressed along with the corresponding parameters. In a birth move, a new state \(i^{*}\) is added to the previous m states and the new parameters are drawn from their respective priors; the position of the new state is generated at random. The birth move is accepted with probability \(\min \left\{ 1;A\right\} \), while the death move is accepted with probability \( \min \left\{ 1;A^{-1}\right\} \); the analytic expression of A is

$$\begin{aligned}&\frac{p\left( y^{T}\mid \widetilde{\mu },\widetilde{S}, \widetilde{\alpha },\widetilde{\Omega },m+1\right) }{ p\left( y^{T}\mid \mu ,S,\alpha ,\Omega ,m\right) }\cdot \frac{p(m+1)}{p(m)}\cdot \frac{p\left( \mu \right) \cdot p\left( \widetilde{S}\right) \cdot p\left( \widetilde{ \alpha }\right) \cdot p\left( \widetilde{\Omega }\right) }{p\left( \widetilde{\mu }\right) \cdot p\left( S\right) \cdot p\left( \alpha \right) \cdot p\left( \Omega \right) }\\&\quad \cdot \frac{d_{m+1}/\left( m+1\right) }{b_{m}/\left( m+1\right) }\cdot \\&\quad \cdot \frac{m+1}{p( \mu _{i^{*}}) \cdot p( S_{i^{*}}) \cdot p( \alpha _{i^{*}}) \cdot {{\prod }_{i\ne i^*}}p( \omega _{i,i^{*}}) \cdot {{\prod }_{j\ne i^{*}}}p( \omega _{i^{*},j}) \cdot p( \omega _{i^{*},i^{*}}) }\cdot |J|, \end{aligned}$$

where \(p(m+1)/p(m)\) cancels out; the ratio of the products of the prior densities, multiplied by the reciprocal of the product of the densities of the new-born parameters, is equal to 1; \(b_{m}/\left( m+1\right) \) is the probability of giving birth to a new state in the special position \(i^{*} \), while \(d_{m+1}/\left( m+1\right) \) is the probability of killing a special state; the factor \(\left( m+1\right) \) has the same meaning as in (5); the Jacobian J is 1.
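These cancellations make the birth acceptance ratio especially simple, as the following sketch shows; the encoding of the move probabilities as sequences `b` and `d` indexed by the current number of states is an assumption for the example.

```python
import numpy as np

def log_accept_birth(loglik_new, loglik_old, m, b, d):
    """Log acceptance ratio of the birth move.

    Because the new-born parameters are drawn from their priors, the prior
    ratio times the reciprocal of the proposal density equals 1 and the
    Jacobian is 1; only the likelihood ratio, the move probabilities, and
    the exchangeability factor (m + 1) survive.
    """
    return (loglik_new - loglik_old
            + np.log(d[m + 1]) - np.log(b[m])
            + np.log(m + 1.0))
```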

Appendix B

Estimates (posterior means) of the transition matrix:

$$\begin{aligned} \mathbf {\Gamma }=\left[ \begin{array}{ccc} 0.960 &{}\quad 0.029 &{}\quad 0.011 \\ 0.045 &{}\quad 0.921 &{}\quad 0.034 \\ 0.017 &{}\quad 0.032 &{}\quad 0.951 \end{array} \right] \end{aligned}$$

Estimates (posterior means) of the mean vectors:

$$\begin{aligned} \mathbf {\mu }_{1}= & {} \left( -\,1.292,-\,0.026,1.222\right) \quad \mathbf {\mu } _{2}=\left( -\,0.282,-\,0.064,0.467\right) \\ \mathbf {\mu }_{3}= & {} \left( -\,1.416,0.068,1.318\right) \end{aligned}$$

Estimates (posterior means) of the covariance matrices:

$$\begin{aligned} \mathbf {\Sigma }_{1}= & {} \left[ \begin{array}{ccc} 0.144 &{}\quad 0.273 &{}\quad 0.117 \\ 0.273 &{}\quad 1.309 &{}\quad 0.260 \\ 0.117 &{}\quad 0.260 &{}\quad 0.165 \end{array} \right] \quad \mathbf {\Sigma }_{2}=\left[ \begin{array}{ccc} 0.402 &{}\quad 0.249 &{}\quad 0.170 \\ 0.249 &{}\quad 0.836 &{}\quad 0.210 \\ 0.170 &{}\quad 0.210 &{}\quad 0.254 \end{array} \right] \\ \mathbf {\Sigma }_{3}= & {} \left[ \begin{array}{ccc} 0.195 &{}\quad 0.210 &{}\quad 0.120 \\ 0.210 &{}\quad 0.789 &{}\quad 0.190 \\ 0.120 &{}\quad 0.190 &{}\quad 0.105 \end{array} \right] \end{aligned}$$

About this article

Cite this article

Spezia, L. Modelling covariance matrices by the trigonometric separation strategy with application to hidden Markov models. TEST 28, 399–422 (2019). https://doi.org/10.1007/s11749-018-0580-8

