Parallell interacting MCMC for learning of topologies of graphical models

Corander, Jukka; Ekdahl, Magnus; Koski, Timo

doi:10.1007/s10618-008-0099-9

Published: 16 May 2008

Data Mining and Knowledge Discovery
Volume 17, pages 431–456 (2008)

Jukka Corander¹,
Magnus Ekdahl² &
Timo Koski³

Abstract

Automated statistical learning of graphical models from data has attracted considerable interest in the machine learning and related literature. Many authors have discussed and/or demonstrated the need for consistent stochastic search methods that are less prone to yielding locally optimal model structures than simple greedy methods. However, most stochastic search methods are based on standard Metropolis–Hastings theory, which necessitates relatively simple random proposals and prevents the use of intelligent and efficient search operators. Here we derive an algorithm for learning topologies of graphical models from samples of a finite set of discrete variables by utilizing and further enhancing a recently introduced theory for non-reversible parallel interacting Markov chain Monte Carlo-style computation. In particular, we illustrate how the non-reversible approach allows for a novel type of creativity in the design of search operators. The parallel aspect of our method also illustrates how the adaptive nature of the search operators helps to avoid trapping states in the vicinity of locally optimal network topologies.
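
For orientation, the kind of standard Metropolis–Hastings structure search that the abstract contrasts against can be sketched as follows. This is not the authors' non-reversible parallel algorithm; it is a minimal single-chain illustration over undirected graphs with a symmetric single-edge toggle proposal. The function names `log_score` and `mh_edge_toggle`, and the toy score that penalises deviation from a fixed target graph, are assumptions made purely for illustration; a real application would use a marginal likelihood or similar score computed from data.

```python
import math
import random
from itertools import combinations


def log_score(edges, target):
    """Hypothetical toy log-score: -2 for every edge that differs from `target`.

    A real structure learner would compute a data-dependent score here.
    """
    return -2.0 * len(edges ^ target)


def mh_edge_toggle(n_nodes, target, n_steps, seed=0):
    """Single-chain Metropolis-Hastings over undirected graph topologies.

    Each step proposes toggling one uniformly chosen node pair. The proposal
    is symmetric, so the log acceptance ratio is just the score difference.
    Returns the best-scoring edge set visited and its log-score.
    """
    rng = random.Random(seed)
    pairs = list(combinations(range(n_nodes), 2))
    current = frozenset()  # start from the empty graph
    current_score = log_score(current, target)
    best, best_score = current, current_score
    for _ in range(n_steps):
        edge = rng.choice(pairs)
        proposal = current ^ {edge}  # add the edge if absent, remove it if present
        proposal_score = log_score(proposal, target)
        # 1 - random() lies in (0, 1], which keeps the logarithm finite.
        if math.log(1.0 - rng.random()) < proposal_score - current_score:
            current, current_score = proposal, proposal_score
            if current_score > best_score:
                best, best_score = current, current_score
    return best, best_score


if __name__ == "__main__":
    toy_target = frozenset({(0, 1), (1, 2)})
    found, score = mh_edge_toggle(4, toy_target, 2000, seed=1)
    print(sorted(found), score)
```

The simple random toggle is exactly the sort of proposal that reversibility makes convenient. The abstract's point is that relaxing reversibility permits more elaborate search operators than this.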

Author information

Authors and Affiliations

Department of Mathematics, Åbo Akademi University, 20500, Åbo, Finland
Jukka Corander
Department of Mathematics, Linköping University, 581 83, Linköping, Sweden
Magnus Ekdahl
Department of Mathematics, Royal Institute of Technology, 100 44, Stockholm, Sweden
Timo Koski

Corresponding author

Correspondence to Jukka Corander.

Additional information

Responsible editor: Charu Aggarwal.

About this article

Cite this article

Corander, J., Ekdahl, M. & Koski, T. Parallell interacting MCMC for learning of topologies of graphical models. Data Min Knowl Disc 17, 431–456 (2008). https://doi.org/10.1007/s10618-008-0099-9

Received: 04 May 2007
Accepted: 01 May 2008
Published: 16 May 2008
Issue Date: December 2008
DOI: https://doi.org/10.1007/s10618-008-0099-9
