Density Estimation-Based Stein Variational Gradient Descent

  • Research
  • Published in: Cognitive Computation

Abstract

Approximating a target distribution, such as a Bayesian posterior, is important in many areas, including cognitive computation. We introduce a variant of Stein variational gradient descent (SVGD) (Liu and Wang, Adv Neural Inf Process Syst 29, 2016), called density estimation-based Stein variational gradient descent (DESVGD). SVGD has proven to be a promising sampling method for approximating target distributions. It suffers, however, from the discontinuity inherent in the empirical measure, which makes it difficult to closely monitor how well the sampling-based approximation converges to the target. DESVGD uses kernel density estimation to replace the empirical measure in SVGD with a continuous counterpart. This allows direct computation of the KL divergence between the current approximation and the target distribution, which helps monitor the numerical convergence of the iterative optimization process. DESVGD also provides derivatives of the KL divergence, which can be used to design better learning rates and thus achieve faster convergence. By simply replacing the kernel used in SVGD with its weighted average, one can implement DESVGD on top of existing SVGD algorithms. Our numerical experiments demonstrate that DESVGD approximates the target distribution well and outperforms the original SVGD in terms of approximation quality.
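
To make the idea concrete, here is a minimal, self-contained sketch (not the authors' exact algorithm) that pairs a standard SVGD particle update with a Gaussian kernel density estimate over the particles, so that the KL divergence to the target can be monitored numerically during the iteration, the capability the abstract highlights. The RBF kernel, the median bandwidth heuristic, and all function names are illustrative assumptions.

```python
import numpy as np

def svgd_step(X, grad_log_p, stepsize=0.1):
    """One SVGD update with an RBF kernel and the median bandwidth heuristic.

    X is an (n, d) array of particles; grad_log_p maps (n, d) particles
    to their (n, d) scores grad log p(x)."""
    n, _ = X.shape
    sq = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)  # pairwise squared distances
    h = np.median(sq) / np.log(n + 1)                           # median heuristic bandwidth
    K = np.exp(-sq / h)                                         # kernel matrix K[i, j] = k(x_i, x_j)
    # phi(x_i) = (1/n) sum_j [ k(x_j, x_i) grad log p(x_j) + grad_{x_j} k(x_j, x_i) ]
    attract = K @ grad_log_p(X)                                 # kernel-smoothed score term
    repulse = (2.0 / h) * (K.sum(axis=1, keepdims=True) * X - K @ X)  # repulsive term
    return X + stepsize * (attract + repulse) / n

def kde_log_q(X, h):
    """Log-density of a Gaussian KDE built from the particles,
    evaluated at the particles themselves."""
    n, d = X.shape
    sq = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    log_kernels = -sq / (2.0 * h) - 0.5 * d * np.log(2.0 * np.pi * h)
    return np.logaddexp.reduce(log_kernels, axis=1) - np.log(n)

def kl_estimate(X, log_p, h):
    """Monte Carlo estimate of KL(q || p), treating the particles as
    (approximate) samples from the KDE q; a numerical convergence monitor."""
    return np.mean(kde_log_q(X, h) - log_p(X))

# Toy usage: drive particles toward a standard 2-D Gaussian and monitor KL.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2)) + 5.0                 # particles started off-target
log_p = lambda Z: -0.5 * np.sum(Z ** 2, axis=1) - np.log(2.0 * np.pi)
for _ in range(500):
    X = svgd_step(X, lambda Z: -Z, stepsize=0.05)   # score of N(0, I) is -x
print("estimated KL(q || p):", kl_estimate(X, log_p, h=0.5))
```

A decreasing KL estimate of this kind is exactly the quantity that is awkward to compute under the raw empirical measure, since log q is undefined for a sum of point masses.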

Data Availability

No datasets were generated or analyzed during the current study.

Notes

  1. Strictly speaking, SVGD per se produces neither continuous densities nor KL divergence measures, owing to its dependence on discrete empirical measures. In reporting results for SVGD, we use the same kernel as in our DESVGD to produce a continuous approximate density from the final set of particles obtained from SVGD.
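
For concreteness, a continuous density of this kind can be built from a final particle set as in the sketch below; the Gaussian kernel and the fixed bandwidth h are illustrative choices, not necessarily the paper's exact construction.

```python
import numpy as np

def particles_to_density(particles, h):
    """Turn a final particle set {x_1, ..., x_n} into a continuous density
    q(x) = (1/n) * sum_j K_h(x - x_j), using a Gaussian kernel K_h."""
    n, d = particles.shape
    norm = (2.0 * np.pi * h) ** (d / 2)             # Gaussian normalizing constant
    def q(x):
        sq = np.sum((np.asarray(x)[None, :] - particles) ** 2, axis=1)
        return float(np.exp(-sq / (2.0 * h)).sum() / (n * norm))
    return q
```

Applying the same construction to the outputs of both methods makes their final densities directly comparable.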

References

  1. Liu Q, Wang D. Stein variational gradient descent: a general purpose Bayesian inference algorithm. Adv Neural Inf Process Syst. 2016;29.

  2. Brooks S, Gelman A, Jones G, Meng XL. Handbook of Markov chain Monte Carlo. CRC Press; 2011.

  3. Chater N, Oaksford M, Hahn U, Heit E. Bayesian models of cognition. Wiley Interdiscip Rev Cogn Sci. 2010;1(6):811–23.

  4. Tenenbaum JB, Griffiths TL. Generalization, similarity, and Bayesian inference. Behav Brain Sci. 2001;24(4):629–40.

  5. Knill DC, Richards W. Perception as Bayesian inference. Cambridge University Press; 1996.

  6. Neal RM. Bayesian learning for neural networks. vol. 118. Springer Science & Business Media; 2012.

  7. Bishop CM, Nasrabadi NM. Pattern recognition and machine learning. vol. 4. Springer; 2006.

  8. Haarnoja T, Tang H, Abbeel P, Levine S. Reinforcement learning with deep energy-based policies. In: International conference on machine learning. PMLR; 2017. p. 1352–61.

  9. Jaini P, Holdijk L, Welling M. Learning equivariant energy-based models with equivariant Stein variational gradient descent. Adv Neural Inf Process Syst. 2021;34:16727–37.

  10. Wang D, Zeng Z, Liu Q. Stein variational message passing for continuous graphical models. In: International Conference on Machine Learning. PMLR; 2018. p. 5219–27.

  11. Korba A, Salim A, Arbel M, Luise G, Gretton A. A non-asymptotic analysis for Stein variational gradient descent. Adv Neural Inf Process Syst. 2020;33:4672–82.

  12. Salim A, Sun L, Richtarik P. A convergence theory for SVGD in the population limit under Talagrand’s inequality T1. In: International Conference on Machine Learning. PMLR; 2022. p. 19139–52.

  13. Sun L, Karagulyan A, Richtarik P. Convergence of Stein variational gradient descent under a weaker smoothness condition. In: International Conference on Artificial Intelligence and Statistics. PMLR; 2023. p. 3693–717.

  14. Nüsken N. On the geometry of Stein variational gradient descent. J Mach Learn Res. 2023;24:1–39.

  15. Li L, Li Y, Liu JG, Liu Z, Lu J. A stochastic version of Stein variational gradient descent for efficient sampling. Commun Appl Math Comput Sci. 2020;15(1):37–63.

  16. Shi J, Mackey L. A finite-particle convergence rate for Stein variational gradient descent. 2022. arXiv preprint arXiv:2211.09721.

  17. Liu Q, Lee J, Jordan M. A kernelized Stein discrepancy for goodness-of-fit tests. In: International conference on machine learning. PMLR; 2016. p. 276–84.

  18. Liu Q. Stein variational gradient descent as gradient flow. Adv Neural Inf Process Syst. 2017;30.

  19. Lu J, Lu Y, Nolen J. Scaling limit of the Stein variational gradient descent: the mean field regime. SIAM J Math Anal. 2019;51(2):648–71.

  20. Arias-Castro E, Mason D, Pelletier B. On the estimation of the gradient lines of a density and the consistency of the mean-shift algorithm. J Mach Learn Res. 2016;17(1):1487–514.

  21. Fleißner F. A kernel-density-estimator minimizing movement scheme for diffusion equations. 2023. arXiv preprint arXiv:2310.11961.

  22. Jiang H. Uniform convergence rates for kernel density estimation. In: International Conference on Machine Learning. PMLR; 2017. p. 1694–703.

  23. Kim J, Scott CD. Robust kernel density estimation. J Mach Learn Res. 2012;13(1):2529–65.

  24. Olver F, Lozier D, Boisvert R, Clark C. Quadrature: Gauss–Hermite formula. In: NIST Handbook of Mathematical Functions. Cambridge University Press; 2010.

  25. Shizgal B. A Gaussian quadrature procedure for use in the solution of the Boltzmann equation and related problems. J Comput Phys. 1981;41(2):309–28.

  26. Pu Y, Gan Z, Henao R, Li C, Han S, Carin L. VAE learning via Stein variational gradient descent. Adv Neural Inf Process Syst. 2017;30.

  27. D’Angelo F, Fortuin V, Wenzel F. On Stein variational neural network ensembles. 2021. arXiv preprint arXiv:2106.10760.

  28. Neal RM. Annealed importance sampling. Stat Comput. 2001;11:125–39.

Acknowledgements

The authors are grateful to the anonymous reviewers for their careful reading and valuable comments.

Funding

Jaewoo Park was partially supported by the National Research Foundation of Korea (2020R1C1C1A0100386814, RS-2023-00217705). The research of Byungjoon Lee was supported by the Catholic University of Korea, Research Fund, 2024.

Author information

Contributions

J. Kim and B. Lee contributed to conceptualization, methodology, software, and writing (original draft), and C. Min, J. Park, and K. Ryu contributed to formal analysis, validation, and writing (review and editing). All authors reviewed the manuscript.

Corresponding author

Correspondence to Byungjoon Lee.

Ethics declarations

Conflict of Interest

The authors declare no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Kim, J., Lee, B., Min, C. et al. Density Estimation-Based Stein Variational Gradient Descent. Cogn Comput 17, 5 (2025). https://doi.org/10.1007/s12559-024-10370-5

Keywords