research-article
Open access

Probabilistic Programming with Stochastic Probabilities

Published: 06 June 2023

Abstract

We present a new approach to the design and implementation of probabilistic programming languages (PPLs), based on the idea of stochastically estimating the probability density ratios necessary for probabilistic inference. By relaxing the usual PPL design constraint that these densities be computed exactly, we are able to eliminate many common restrictions in current PPLs, to deliver a language that, for the first time, simultaneously supports first-class constructs for marginalization and nested inference, unrestricted stochastic control flow, continuous and discrete sampling, and programmable inference with custom proposals. At the heart of our approach is a new technique for compiling these expressive probabilistic programs into randomized algorithms for unbiasedly estimating their densities and density reciprocals. We employ these stochastic probability estimators within modified Monte Carlo inference algorithms that are guaranteed to be sound despite their reliance on inexact estimates of density ratios. We establish the correctness of our compiler using logical relations over the semantics of λSP, a new core calculus for modeling and inference with stochastic probabilities. We also implement our approach in an open-source extension to Gen, called GenSP, and evaluate it on six challenging inference problems adapted from the modeling and inference literature. We find that: (1) GenSP can automate fast density estimators for programs with very expensive exact densities; (2) convergence of inference is mostly unaffected by the noise from these estimators; and (3) our sound-by-construction estimators are competitive with hand-coded density estimators, incurring only a small constant-factor overhead.
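
To make the idea of an unbiased density estimate concrete, here is a minimal Python sketch. It is not GenSP's compiler or API. The toy model (z ~ Normal(0, 1), x | z ~ Normal(z, 1)), the function names, and the choice of the prior as the proposal are all assumptions made for illustration. The estimator averages likelihoods over latent values drawn from the prior, so each estimate is noisy but its expectation equals the exact marginal density, which this toy model happens to have in closed form.

```python
import math
import random


def normal_pdf(x, mean, std):
    """Density of Normal(mean, std) evaluated at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def estimate_marginal_density(x, num_particles, rng):
    """Unbiased Monte Carlo estimate of p(x) = integral of p(z) p(x | z) dz.

    Hypothetical toy model: z ~ Normal(0, 1), x | z ~ Normal(z, 1).
    Latents are sampled from the prior, so the mean of p(x | z_i)
    has expectation exactly p(x) for any number of particles.
    """
    total = 0.0
    for _ in range(num_particles):
        z = rng.gauss(0.0, 1.0)          # sample the latent from its prior
        total += normal_pdf(x, z, 1.0)   # likelihood of x given that latent
    return total / num_particles


def exact_marginal_density(x):
    """Closed form for the toy model: x ~ Normal(0, sqrt(2))."""
    return normal_pdf(x, 0.0, math.sqrt(2.0))


if __name__ == "__main__":
    rng = random.Random(0)
    x_obs = 1.3
    # Each individual estimate uses only 5 particles and is therefore noisy.
    estimates = [estimate_marginal_density(x_obs, 5, rng) for _ in range(20000)]
    print("mean of noisy estimates:", sum(estimates) / len(estimates))
    print("exact marginal density: ", exact_marginal_density(x_obs))
```

Averaging many low-particle estimates should land close to the exact value, which is the property that pseudo-marginal style algorithms (such as those of Andrieu and Roberts cited by the paper) rely on when they substitute estimates for exact densities.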

Cited By

  • (2024) Probabilistic Programming with Programmable Variational Inference. Proceedings of the ACM on Programming Languages 8, PLDI (2123–2147). DOI: 10.1145/3656463. Online publication date: 20-Jun-2024
  • (2024) Compiling Probabilistic Programs for Variable Elimination with Information Flow. Proceedings of the ACM on Programming Languages 8, PLDI (1755–1780). DOI: 10.1145/3656448. Online publication date: 20-Jun-2024
  • (2024) GenSQL: A Probabilistic Programming System for Querying Generative Models of Database Tables. Proceedings of the ACM on Programming Languages 8, PLDI (790–815). DOI: 10.1145/3656409. Online publication date: 20-Jun-2024

Published In

Proceedings of the ACM on Programming Languages, Volume 7, Issue PLDI
June 2023
2020 pages
EISSN:2475-1421
DOI:10.1145/3554310
This work is licensed under a Creative Commons Attribution 4.0 International License.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. approximate computing
  2. probabilistic programming
  3. semantics
