Efficient synthesis of probabilistic programs

Authors: Aditya V. Nori, Sriram K. Rajamani, Deepak Vijaykeerthy

ACM SIGPLAN Notices, Volume 50, Issue 6

Pages 208 - 217

https://doi.org/10.1145/2813885.2737982

Published: 03 June 2015

Abstract

We show how to automatically synthesize probabilistic programs from real-world datasets. Such synthesis is feasible due to a combination of two techniques: (1) we borrow the idea of "sketching" from the synthesis of deterministic programs and allow the programmer to write a skeleton program with "holes"; sketches enable the programmer to communicate domain-specific intuition about the structure of the desired program and to prune the search space; and (2) we design an efficient Markov Chain Monte Carlo (MCMC) based synthesis algorithm to instantiate the holes in the sketch with program fragments. Our algorithm efficiently synthesizes a probabilistic program that is most consistent with the data. A core difficulty in synthesizing probabilistic programs is computing the likelihood L(P | D) of a candidate program P generating data D. We propose an approximate method to compute likelihoods using mixtures of Gaussian distributions, thereby avoiding expensive computation of integrals. The use of such approximations enables us to speed up evaluation of the likelihood of candidate programs by a factor of 1000, and makes MCMC-based search feasible. We have implemented our algorithm in a tool called PSKETCH, and our results are encouraging: PSKETCH is able to automatically synthesize 16 non-trivial real-world probabilistic programs.
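To make the two ingredients of the abstract concrete, the following is a minimal sketch, not PSKETCH's actual algorithm or sketch language. It assumes a hypothetical toy sketch with two integer holes, a hand-written dataset `DATA`, and a simple Metropolis-Hastings search with a symmetric proposal. The names `log_gauss`, `log_likelihood`, and `synthesize` are illustrative only. In this toy case the completed program is exactly an equal-weight mixture of two Gaussians, so the likelihood L(P | D) has a closed form and no integral is needed; the paper's contribution is to approximate likelihoods of more general programs in this style.

```python
import math
import random

# Hypothetical toy sketch (an assumption, not taken from the paper):
#   if Bernoulli(0.5): x = Gaussian(??h0, 1.0)
#   else:              x = Gaussian(??h1, 1.0)
# Each hole ??h ranges over the integers LOW..HIGH.
LOW, HIGH, SIGMA = -5, 5, 1.0


def log_gauss(x, mu, sigma):
    """Log density of N(mu, sigma^2) at x."""
    z = (x - mu) / sigma
    return -0.5 * z * z - math.log(sigma * math.sqrt(2.0 * math.pi))


def log_likelihood(holes, data):
    """log L(P | D) for the completed sketch, an equal-weight Gaussian mixture."""
    total = 0.0
    for x in data:
        terms = [math.log(0.5) + log_gauss(x, mu, SIGMA) for mu in holes]
        m = max(terms)  # log-sum-exp for numerical stability
        total += m + math.log(sum(math.exp(t - m) for t in terms))
    return total


def synthesize(data, steps=5000, seed=0):
    """Metropolis-Hastings over hole completions; returns the best one visited."""
    rng = random.Random(seed)
    current = [LOW, LOW]
    cur_ll = log_likelihood(current, data)
    best, best_ll = list(current), cur_ll
    for _ in range(steps):
        proposal = list(current)
        i = rng.randrange(len(proposal))
        proposal[i] += rng.choice((-1, 1))  # symmetric +/-1 proposal on one hole
        if not LOW <= proposal[i] <= HIGH:
            continue  # out-of-range proposals are rejected, chain stays put
        new_ll = log_likelihood(proposal, data)
        # Accept with probability min(1, L(P') / L(P)).
        if rng.random() < math.exp(min(0.0, new_ll - cur_ll)):
            current, cur_ll = proposal, new_ll
            if cur_ll > best_ll:
                best, best_ll = list(current), cur_ll
    return best, best_ll


# Hypothetical dataset with two clusters, near 0 and near 4.
DATA = [-0.2, 0.1, 0.3, -0.1, 3.8, 4.1, 4.2, 3.9]
best, best_ll = synthesize(DATA)
print(sorted(best), round(best_ll, 3))
```

Because the chain only ever evaluates a closed-form mixture density, each candidate costs a handful of arithmetic operations per data point, which is the property that makes a search with thousands of candidates affordable.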

Published In

ACM SIGPLAN Notices, Volume 50, Issue 6 (PLDI '15), June 2015, 630 pages
ISSN: 0362-1340
EISSN: 1558-1160
DOI: 10.1145/2813885
Editor: Andy Gill, University of Kansas, Lawrence, KS

PLDI '15: Proceedings of the 36th ACM SIGPLAN Conference on Programming Language Design and Implementation, June 2015, 630 pages
ISBN: 9781450334686
DOI: 10.1145/2737924
General Chair: David Grove, IBM Research, USA
Program Chair: Steve Blackburn, Australian National University, Australia

Copyright © 2015 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States
