Abstract
A new method for model selection in Gaussian Bayesian networks and Markov networks, with extensions to ancestral graphs, is constructed to have good mean squared error properties. The method is based on the focused information criterion and makes it possible to fit individually tailored models. The focus of the research, that is, the purpose of the model, directs the selection. It is shown that using the focused information criterion leads to a graph with small mean squared error. A low mean squared error ensures accurate estimation with the graphical model; here estimation, rather than explanation, is the main objective. Two situations that commonly occur in practice are treated: data-driven estimation of a graphical model and improvement of a pre-specified feasible model. The search algorithms are illustrated with data examples and compared with existing methods in a simulation study.
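As a reading aid, the mean squared error criterion referred to above can be written out as the standard bias-variance decomposition. This is a sketch in notation that the abstract itself does not give: the focus parameter \(\mu\), the candidate graph index \(S\), and the estimator \(\hat{\mu}_S\) are assumed symbols. A focused criterion of this kind estimates the mean squared error of the focus estimate under each candidate graph and selects the graph with the smallest estimate.

```latex
% Assumed notation (not from the abstract):
%   \mu          focus parameter, i.e. the quantity the model is meant to estimate
%   S            index of a candidate graph
%   \hat{\mu}_S  estimator of \mu under candidate graph S
\operatorname{MSE}(\hat{\mu}_S)
  = \mathbb{E}\bigl[(\hat{\mu}_S - \mu_{\text{true}})^2\bigr]
  = \bigl(\mathbb{E}[\hat{\mu}_S] - \mu_{\text{true}}\bigr)^2
    + \operatorname{Var}(\hat{\mu}_S),
\qquad
\hat{S} = \arg\min_{S} \widehat{\operatorname{MSE}}(\hat{\mu}_S).
```

A sparser graph typically lowers the variance term at the cost of some bias, so the selected graph may differ from one focus to another.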
Acknowledgments
The authors wish to thank the reviewers for their constructive comments. E. Pircalabelu and G. Claeskens acknowledge the support of the Fund for Scientific Research Flanders, KU Leuven grant GOA/12/14 and of the IAP Research Network P7/06 of the Belgian Science Policy. The computational resources and services used in this work were provided by the VSC (Flemish Supercomputer Center), funded by the Hercules Foundation and the Flemish Government - department EWI.
Cite this article
Pircalabelu, E., Claeskens, G. & Waldorp, L. A focused information criterion for graphical models. Stat Comput 25, 1071–1092 (2015). https://doi.org/10.1007/s11222-014-9504-y
DOI: https://doi.org/10.1007/s11222-014-9504-y