Abstract
In this article, we introduce the concept of confidence graphs (CG) for graphical model selection. CG first identifies two nested graphical models, called the small and large confidence graphs (SCG and LCG), that trap the true graphical model between them at a given confidence level, just as the endpoints of a traditional confidence interval capture the population parameter. SCG and LCG therefore provide insight into the simplest and most complex dependence structures that the true model can plausibly have, and their difference offers a measure of model selection uncertainty. In addition, rather than relying on a single selected model, CG comprises the group of graphical models between SCG and LCG as candidates. The proposed method can be coupled with many popular model selection methods, making it an ideal tool for comparing model selection uncertainty as well as measuring reproducibility. We also propose a new residual bootstrap procedure for graphical model settings to approximate the sampling distribution of the selected models and to obtain CG. To visualize the distribution of selected models and its associated uncertainty, we further develop new graphical tools, such as the grouped model selection distribution plot. Numerical studies further illustrate the advantages of the proposed method.
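To make the idea of nested bounds concrete, the following is a minimal sketch and not the procedure of the article. It assumes that a set of bootstrap-selected graphs is already available, each given as a set of edges, and it uses a simple heuristic that is our own assumption: edges are ranked by how often they are selected, and the search is restricted to pairs of graphs built from the top-ranked edges. The function name `confidence_graphs` and the toy data are hypothetical.

```python
from collections import Counter


def confidence_graphs(boot_graphs, level=0.9):
    """Return (small, large, coverage) for nested edge sets small <= large.

    Heuristic sketch (an assumption, not the article's algorithm): rank edges
    by bootstrap selection frequency, then find the narrowest pair of
    top-ranked edge sets that traps at least `level` of the bootstrap graphs.
    """
    graphs = [frozenset(g) for g in boot_graphs]
    if not graphs:
        raise ValueError("need at least one bootstrap graph")
    n_boot = len(graphs)

    # Count how often each edge appears across bootstrap graphs.
    freq = Counter(edge for g in graphs for edge in g)
    # Most frequently selected edges first; ties broken by edge label.
    ranked = sorted(freq, key=lambda e: (-freq[e], e))
    n_edges = len(ranked)

    # Increase the width (number of edges in large but not in small)
    # until some nested pair reaches the requested coverage.
    for width in range(n_edges + 1):
        best = None
        for k in range(n_edges - width + 1):
            small = frozenset(ranked[:k])
            large = frozenset(ranked[:k + width])
            coverage = sum(small <= g <= large for g in graphs) / n_boot
            if best is None or coverage > best[2]:
                best = (small, large, coverage)
        if best[2] >= level:
            return best
    # Unreachable for level <= 1: the empty graph and the union of all
    # edges always trap every bootstrap graph.
    raise ValueError("level must not exceed 1")


# Toy bootstrap output on three nodes; edges are (i, j) pairs with i < j.
toy_graphs = [
    {(0, 1)},
    {(0, 1), (1, 2)},
    {(0, 1), (1, 2)},
    {(0, 1)},
    {(0, 1), (1, 2), (0, 2)},
]
small, large, coverage = confidence_graphs(toy_graphs, level=0.8)
width = len(large) - len(small)  # a simple measure of selection uncertainty
```

In this toy example the small graph contains only the edge selected in every replicate, the large graph adds the next most frequent edge, and the number of edges separating them plays the role of the interval width.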
Acknowledgements
LW’s research was partially supported by Army Research Office grant W911NF-17-1-0006. YL’s research was partially supported by the Research Funds for the Major Innovation Platform of Public Health & Disease Control and Prevention, Renmin University of China. We thank Yizao Wang for productive discussion and comments.
Cite this article
Wang, L., Qin, Y. & Li, Y. Confidence graphs for graphical model selection. Stat Comput 31, 52 (2021). https://doi.org/10.1007/s11222-021-10027-5
DOI: https://doi.org/10.1007/s11222-021-10027-5