Formal verification of higher-order probabilistic programs: reasoning about approximation, convergence, Bayesian inference, and optimization

Published: 02 January 2019

Abstract

Probabilistic programming provides a convenient lingua franca for writing succinct and rigorous descriptions of probabilistic models and inference tasks. Several probabilistic programming languages, including Anglican, Church, and Hakaru, derive their expressiveness from a powerful combination of continuous distributions, conditioning, and higher-order functions. Although these features are very important for practical applications, they raise fundamental challenges for program semantics and verification. Several recent works offer promising answers to these challenges, but their primary focus is on foundational semantic issues.
In this paper, we take a step further by developing a suite of logics, collectively named PPV, for proving properties of programs written in an expressive probabilistic higher-order language with continuous sampling operations and primitives for conditioning distributions. Our logics mimic the comfortable reasoning style of informal proofs using carefully selected axiomatizations of key results from probability theory. The versatility of our logics is illustrated through the formal verification of several intricate examples from statistics, probabilistic inference, and machine learning. We further show expressiveness by giving sound embeddings of existing logics. In particular, we do this in a parametric way by showing how the semantic idea of (unary and relational) ⊤⊤-lifting can be internalized in our logics. The soundness of PPV follows by interpreting programs and assertions in quasi-Borel spaces (QBS), a recently proposed variant of Borel spaces with a good structure for interpreting higher-order probabilistic programs.

Published In

Proceedings of the ACM on Programming Languages, Volume 3, Issue POPL
January 2019
2275 pages
EISSN: 2475-1421
DOI: 10.1145/3302515
This work is licensed under a Creative Commons Attribution International 4.0 License.

Publisher

Association for Computing Machinery
New York, NY, United States

Author Tags

1. formal reasoning
2. probabilistic programming
3. relational type systems
