
Query lower bounds for log-concave sampling

Online AM: 21 June 2024

    Abstract

    Log-concave sampling has witnessed remarkable algorithmic advances in recent years, but the corresponding problem of proving lower bounds for this task has remained elusive, with lower bounds previously known only in dimension one. In this work, we establish the following query lower bounds: (1) sampling from strongly log-concave and log-smooth distributions in dimension d ≥ 2 requires Ω(log κ) queries, which is sharp in any constant dimension, and (2) sampling from Gaussians in dimension d (hence also from general log-concave and log-smooth distributions in dimension d) requires \(\widetilde{\Omega}(\min(\sqrt{\kappa}\log d, d))\) queries, which is nearly sharp for the class of Gaussians. Here κ denotes the condition number of the target distribution. Our proofs rely upon (1) a multiscale construction inspired by work on the Kakeya conjecture in geometric measure theory, and (2) a novel reduction that demonstrates that block Krylov algorithms are optimal for this problem, as well as connections to lower bound techniques based on Wishart matrices developed in the matrix-vector query literature.
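
    For intuition on why the Gaussian bound is stated in terms of \(\sqrt{\kappa}\), the sketch below illustrates the generic (block size one) Krylov approach to Gaussian sampling under matrix-vector query access: a Lanczos recurrence queries the matrix A once per iteration, and the resulting tridiagonal factorization is used to apply an approximation of A^{1/2} to a standard Gaussian vector, so that roughly on the order of \(\sqrt{\kappa}\) iterations (up to logarithmic factors) suffice for a well-conditioned A. This is a minimal sketch of the standard Lanczos technique under these assumptions, not the authors' construction; the function names and parameter choices are hypothetical.

    ```python
    import numpy as np

    def lanczos(matvec, g, m):
        # Run m steps of Lanczos with starting vector g, using only
        # matrix-vector queries v -> A v. Returns an orthonormal basis
        # Q (d x m) and a tridiagonal T (m x m) with T ~ Q^T A Q.
        d = g.shape[0]
        Q = np.zeros((d, m))
        alphas, betas = np.zeros(m), np.zeros(max(m - 1, 0))
        q, q_prev, beta_prev = g / np.linalg.norm(g), np.zeros(d), 0.0
        for j in range(m):
            Q[:, j] = q
            w = matvec(q)                      # one matrix-vector query
            alphas[j] = q @ w
            w = w - alphas[j] * q - beta_prev * q_prev
            if j < m - 1:
                beta_prev = np.linalg.norm(w)
                betas[j] = beta_prev
                q_prev, q = q, w / beta_prev
        T = np.diag(alphas) + np.diag(betas, 1) + np.diag(betas, -1)
        return Q, T

    def krylov_gaussian_sample(matvec, d, m, rng):
        # Approximate a draw from N(0, A): apply an approximation of A^{1/2}
        # to g ~ N(0, I) via the identity f(A) g ~ ||g|| * Q f(T) e_1, f = sqrt.
        g = rng.standard_normal(d)
        Q, T = lanczos(matvec, g, m)
        evals, V = np.linalg.eigh(T)
        fT_e1 = V @ (np.sqrt(np.maximum(evals, 0.0)) * V[0, :])
        return np.linalg.norm(g) * (Q @ fT_e1)

    # Toy usage: A has condition number kappa = 10; m grows like sqrt(kappa) * log(d).
    rng = np.random.default_rng(0)
    d = 100
    A = np.diag(np.linspace(1.0, 10.0, d))      # hypothetical well-conditioned target
    x = krylov_gaussian_sample(lambda v: A @ v, d, m=20, rng=rng)
    ```

    In this query model each call to matvec counts as one query, and the \(\widetilde{\Omega}(\min(\sqrt{\kappa}\log d, d))\) bound above says that, up to logarithmic factors, no algorithm making such queries can do substantially better for Gaussians, consistent with the abstract's statement that block Krylov algorithms are optimal for this problem.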


    Published In

    Journal of the ACM Just Accepted
    ISSN:0004-5411
    EISSN:1557-735X
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Online AM: 21 June 2024
    Accepted: 11 June 2024
    Revised: 25 May 2024
    Received: 23 December 2023

    Author Tags

    1. block Krylov
    2. log-concave sampling
    3. matrix-vector queries

    Qualifiers

    • Research-article
