SLANG: Fast Structured Covariance Approximations for Bayesian Deep Learning with Natural Gradient

Mishkin, Aaron; Kunstner, Frederik; Nielsen, Didrik; Schmidt, Mark; Khan, Mohammad Emtiyaz

arXiv:1811.04504 (cs)

[Submitted on 11 Nov 2018 (v1), last revised 12 Jan 2019 (this version, v2)]

Abstract: Uncertainty estimation in large deep-learning models is a computationally challenging task, where it is difficult to form even a Gaussian approximation to the posterior distribution. In such situations, existing methods usually resort to a diagonal approximation of the covariance matrix, despite the fact that these matrices are known to result in poor uncertainty estimates. To address this issue, we propose a new stochastic, low-rank, approximate natural-gradient (SLANG) method for variational inference in large, deep models. Our method estimates a "diagonal plus low-rank" structure based solely on back-propagated gradients of the network log-likelihood. This requires strictly fewer gradient computations than methods that compute the gradient of the whole variational objective. Empirical evaluations on standard benchmarks confirm that SLANG enables faster and more accurate estimation of uncertainty than mean-field methods, and performs comparably to state-of-the-art methods.
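
To illustrate why a "diagonal plus low-rank" structure is cheap to work with, here is a minimal NumPy sketch. It is not the SLANG algorithm and is not taken from the paper; it only shows the generic linear algebra for a matrix of the form U U^T + diag(d). The names `lowrank_matvec`, `lowrank_solve`, `U`, and `d` are assumptions chosen for this example. With D parameters and rank L, both operations cost O(D L^2) or less and never form the D x D matrix.

```python
import numpy as np

def lowrank_matvec(U, d, v):
    """Return (U U^T + diag(d)) v without building the D x D matrix."""
    return U @ (U.T @ v) + d * v

def lowrank_solve(U, d, b):
    """Solve (U U^T + diag(d)) x = b using the Woodbury identity.

    Only an L x L system is factorized, where L = U.shape[1].
    Assumes every entry of d is strictly positive.
    """
    dinv_b = b / d                      # diag(d)^{-1} b
    dinv_U = U / d[:, None]             # diag(d)^{-1} U
    K = np.eye(U.shape[1]) + U.T @ dinv_U
    return dinv_b - dinv_U @ np.linalg.solve(K, U.T @ dinv_b)

# Small illustrative problem (sizes are arbitrary).
rng = np.random.default_rng(0)
D, L = 50, 3
U = rng.normal(size=(D, L))
d = rng.uniform(0.5, 2.0, size=D)
v = rng.normal(size=D)

# Dense reference, built here only to check the structured routines.
dense = U @ U.T + np.diag(d)
```

The dense matrix is constructed above purely for verification; in a large model one would store only `U` and `d`, which is O(D L) memory instead of O(D^2).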

Comments:	NeurIPS 2018 final version
Subjects:	Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Machine Learning (stat.ML)
Cite as:	arXiv:1811.04504 [cs.LG]
	(or arXiv:1811.04504v2 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.1811.04504

Submission history

From: Aaron Mishkin
[v1] Sun, 11 Nov 2018 23:18:27 UTC (2,663 KB)
[v2] Sat, 12 Jan 2019 01:01:06 UTC (2,674 KB)
