Abstract
The Double Chain Markov Model (DCMM) is used to model an observable process \(Y = \{Y_{t}\}_{t=1}^{T}\) as a Markov chain whose transition matrix \(P_{x_{t}}\) depends on the value of an unobservable (hidden) Markov chain \(\{X_{t}\}_{t=1}^{T}\). We present and justify an efficient algorithm for sampling from the posterior distribution associated with the DCMM when the observable process \(Y\) consists of independent vectors of (possibly) different lengths. Convergence of the Gibbs sampler used to simulate the posterior density is improved by adding a random permutation step. Simulation studies are included to illustrate the method. The problem that motivated our model, presented at the end, is an application to real data consisting of the credit-rating dynamics of a portfolio of financial companies, where the (unobserved) hidden process is the state of the broader economy.
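To make the model structure concrete, the following minimal Python sketch simulates one path from a DCMM: the hidden chain \(X\) evolves according to a transition matrix \(Q\), and the observed chain \(Y\) uses the transition matrix \(P_{x_{t}}\) selected by the current hidden state. The function name and arguments are illustrative, not part of the paper, and the initial observation is drawn independently of the initial hidden state for simplicity.

```python
import numpy as np

def simulate_dcmm(Q, P, pi_x, pi_y, T, rng=None):
    """Simulate one path from a Double Chain Markov Model (illustrative sketch).

    Q     : (a, a) transition matrix of the hidden chain X
    P     : (a, b, b) array; P[i] is the transition matrix of the observed
            chain Y while the hidden state equals i
    pi_x  : (a,) initial distribution of X
    pi_y  : (b,) initial distribution of Y (assumed independent of X here)
    T     : length of the simulated path
    """
    rng = np.random.default_rng() if rng is None else rng
    a, b = Q.shape[0], P.shape[1]
    x = np.empty(T, dtype=int)
    y = np.empty(T, dtype=int)
    x[0] = rng.choice(a, p=pi_x)
    y[0] = rng.choice(b, p=pi_y)
    for t in range(1, T):
        x[t] = rng.choice(a, p=Q[x[t - 1]])        # hidden chain evolves first
        y[t] = rng.choice(b, p=P[x[t], y[t - 1]])  # observed transition chosen by x_t
    return x, y
```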
Acknowledgements
The authors would like to thank an associate editor and two referees for their valuable comments.
Appendix: Proof of Theorem 1
Proof
The joint mass function of the hidden states, given the parameters and the observed data (a vector at each time point), is
The “typical term” can be written as
by Lemma 1. Now, \(\frac{\mbox{P}(\boldsymbol {x}^{t+2}, \boldsymbol {y}^{t+1} \vert \boldsymbol {y}_{,t},x_{t}, x_{t+1}, \theta)}{\mbox{P}(\boldsymbol {x}^{t+1},\boldsymbol {y}^{t+1}\vert \boldsymbol {y}_{,t}, \theta)}\) depends only on \(x_{t+1}\), and is therefore independent of \(x_{t}\); it can thus be absorbed into the normalizing constant. That is,
We continue, in more detail, to show
By the law of total probability and Lemma 1, we have
and consequently from (10) and (11),
This is initialized at \(t = u_{0}\) by setting \(\mbox{P}(x_{0} \vert \boldsymbol {y}_{,M}, \theta) = \mbox{P}(x_{u_{0}} \vert r)\) to be the same as the Dirichlet prior \(D(\alpha_{01},\ldots,\alpha_{0a})\). □
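To connect this recursion to the Gibbs step it justifies, the sketch below shows one standard way to draw the hidden path given the parameters: forward filtering followed by backward sampling for a single observed sequence. It is illustrative rather than the paper's exact algorithm; the names Q, P, pi_x are ours, and the first observation is assumed (for simplicity) to carry no information about the first hidden state.

```python
import numpy as np

def sample_hidden_path(y, Q, P, pi_x, rng=None):
    """Draw X_{1:T} | Y_{1:T}, theta by forward filtering / backward sampling.

    y    : (T,) observed sequence, values in {0, ..., b-1}
    Q    : (a, a) hidden-chain transition matrix
    P    : (a, b, b) observed-chain transition matrices, one per hidden state
    pi_x : (a,) initial distribution of the hidden chain
    """
    rng = np.random.default_rng() if rng is None else rng
    T, a = len(y), Q.shape[0]
    alpha = np.empty((T, a))
    alpha[0] = pi_x / pi_x.sum()               # filter at t = 1 (y[0] uninformative)
    for t in range(1, T):                      # forward filtering
        pred = alpha[t - 1] @ Q                # P(x_t = i | y_{1:t-1})
        lik = P[:, y[t - 1], y[t]]             # P(y_t | y_{t-1}, x_t = i)
        alpha[t] = pred * lik
        alpha[t] /= alpha[t].sum()
    x = np.empty(T, dtype=int)
    x[T - 1] = rng.choice(a, p=alpha[T - 1])   # backward sampling
    for t in range(T - 2, -1, -1):
        w = alpha[t] * Q[:, x[t + 1]]          # P(x_t = j | x_{t+1}, y_{1:T})
        x[t] = rng.choice(a, p=w / w.sum())
    return x
```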