Abstract
We consider a sparse linear regression model with unknown symmetric error under the high-dimensional setting. The true error distribution is assumed to belong to a locally \(\beta \)-Hölder class with exponentially decreasing tails; in particular, it need not be sub-Gaussian. We obtain posterior convergence rates for the regression coefficients and the error density that are nearly optimal and adaptive to the unknown sparsity level. Furthermore, we derive a semi-parametric Bernstein-von Mises (BvM) theorem characterizing the asymptotic shape of the marginal posterior of the regression coefficients. Under a sub-Gaussianity assumption on the true score function, strong model selection consistency for the regression coefficients is also obtained, which in turn establishes the frequentist validity of credible sets.
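For concreteness, the following is a schematic form of the model the abstract describes; the notation (\(\theta_0\), \(\eta_0\), \(s_0\)) is chosen here for illustration and may differ from that used in the paper:

\[
Y_i = x_i^\top \theta_0 + \varepsilon_i, \qquad \varepsilon_i \overset{\mathrm{i.i.d.}}{\sim} \eta_0, \qquad i = 1, \ldots, n,
\]

where \(\theta_0 \in \mathbb{R}^p\) is sparse with \(s_0 = \|\theta_0\|_0\) nonzero coordinates, \(p\) may greatly exceed \(n\), and the density \(\eta_0\) is symmetric about zero, lies in a locally \(\beta \)-Hölder class, and has exponentially decreasing tails (so it need not be sub-Gaussian).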
Funding
KL and LL were supported by NSF grants IIS-1663870 and DMS-1654579 (CAREER). KL was also supported by an INHA UNIVERSITY research grant. MC was supported by a National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (No. 2020R1F1A1A01054718).
Cite this article
Lee, K., Chae, M. & Lin, L. Bayesian high-dimensional semi-parametric inference beyond sub-Gaussian errors. J. Korean Stat. Soc. 50, 511–527 (2021). https://doi.org/10.1007/s42952-020-00091-4