DOI: 10.1145/3618260.3649744

Parallel Sampling via Counting

Published: 11 June 2024

Abstract

We show how to use parallelization to speed up sampling from an arbitrary distribution µ on a product space [q]^n, given oracle access to counting queries: ℙ_{X∼µ}[X_S = σ_S] for any S ⊆ [n] and σ_S ∈ [q]^S. Our algorithm takes O(n^{2/3} · polylog(n, q)) parallel time, to the best of our knowledge the first runtime sublinear in n for arbitrary distributions. Our results have implications for sampling in autoregressive models. Our algorithm directly works with an equivalent oracle that answers conditional marginal queries ℙ_{X∼µ}[X_i = σ_i | X_S = σ_S], whose role is played by a trained neural network in autoregressive models. This suggests a roughly n^{1/3}-factor speedup is possible for sampling in any-order autoregressive models. We complement our positive result by showing a lower bound of Ω(n^{1/3}) on the runtime of any parallel sampling algorithm making at most poly(n) queries to the counting oracle, even for q = 2.
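To make the oracle model concrete, the following Python sketch shows the classical sequential counting-to-sampling reduction (in the spirit of Jerrum, Valiant, and Vazirani) that serves as the baseline the paper's parallel algorithm improves on, together with the Bayes-rule equivalence between counting queries and conditional marginal queries. This is a minimal illustration under our own assumptions: the names (counting_oracle, marginal_from_counting) and the uniform toy distribution are illustrative, not the paper's code.

import random

# Hypothetical sketch: the abstract's two oracle types and the classical
# sequential sampling baseline. A counting oracle returns P[X_S = sigma_S]
# for a partial assignment sigma_S; by Bayes' rule the conditional marginal
# is a ratio of two counting queries:
#   P[X_i = s | X_S = sigma_S] = P[X_{S ∪ {i}} = sigma'] / P[X_S = sigma_S].

def marginal_from_counting(counting_oracle, i, s, assigned):
    """Conditional marginal P[X_i = s | X_S = assigned] via two counts."""
    extended = dict(assigned)
    extended[i] = s
    return counting_oracle(extended) / counting_oracle(assigned)

def sequential_sample(n, q, counting_oracle, rng=random.Random(0)):
    """Draw one exact sample from mu on [q]^n using n sequential rounds."""
    assigned = {}  # the partial assignment sigma_S built so far
    for i in range(n):
        # Query the conditional marginal of coordinate i given sigma_S.
        probs = [marginal_from_counting(counting_oracle, i, s, assigned)
                 for s in range(q)]
        # Sample X_i from that marginal and extend the assignment.
        r, acc = rng.random(), 0.0
        for s, p in enumerate(probs):
            acc += p
            if r <= acc:
                assigned[i] = s
                break
        else:
            assigned[i] = q - 1  # guard against floating-point round-off
    return [assigned[i] for i in range(n)]

# Toy counting oracle for the uniform distribution on [q]^n:
# P[X_S = sigma_S] = q^(-|S|) for any partial assignment sigma_S.
if __name__ == "__main__":
    q = 3
    uniform_count = lambda assigned: q ** -len(assigned)
    print(sequential_sample(n=5, q=q, counting_oracle=uniform_count))

This baseline spends n sequential oracle rounds, one per coordinate; the paper's result is that parallel queries can cut the sequential depth to O(n^{2/3} · polylog(n, q)), while the Ω(n^{1/3}) lower bound shows some polynomial depth is unavoidable for any algorithm making poly(n) counting queries.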


Information

Published In

STOC 2024: Proceedings of the 56th Annual ACM Symposium on Theory of Computing
June 2024
2049 pages
ISBN:9798400703836
DOI:10.1145/3618260
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 11 June 2024


Author Tags

  1. autoregressive models
  2. conditional marginals
  3. counting
  4. parallel sampling

Qualifiers

  • Research-article

Funding Sources

  • NSF

Conference

STOC '24: 56th Annual ACM Symposium on Theory of Computing
June 24-28, 2024
Vancouver, BC, Canada

Acceptance Rates

Overall Acceptance Rate 1,469 of 4,586 submissions, 32%

Article Metrics

  • Total citations: 0
  • Total downloads: 236
  • Downloads (last 12 months): 236
  • Downloads (last 6 weeks): 48
Reflects downloads up to 12 Sep 2024

