research-article

A model-learner pattern for Bayesian reasoning

Published: 23 January 2013

    Abstract

    A Bayesian model is based on a pair of probability distributions, known as the prior and sampling distributions. A wide range of fundamental machine learning tasks, including regression, classification, clustering, and many others, can all be seen as Bayesian models. We propose a new probabilistic programming abstraction, a typed Bayesian model, which is based on a pair of probabilistic expressions for the prior and sampling distributions. A sampler for a model is an algorithm to compute synthetic data from its sampling distribution, while a learner for a model is an algorithm for probabilistic inference on the model. Models, samplers, and learners form a generic programming pattern for model-based inference. They support the uniform expression of common tasks including model testing, and generic compositions such as mixture models, evidence-based model averaging, and mixtures of experts. A formal semantics supports reasoning about model equivalence and implementation correctness. By developing a series of examples and three learner implementations based on exact inference, factor graphs, and Markov chain Monte Carlo, we demonstrate the broad applicability of this new programming pattern.
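The pattern described in the abstract can be illustrated with a minimal sketch. It assumes finite, discrete distributions represented as dictionaries, and the names `Model`, `sample`, `learn`, and `coin` are hypothetical choices for this illustration, not the paper's API. The learner shown uses exact inference by enumeration, which is only one of the three learner styles the abstract mentions.

```python
import random
from dataclasses import dataclass
from fractions import Fraction
from typing import Callable, Dict, Generic, Hashable, List, Tuple, TypeVar

W = TypeVar("W", bound=Hashable)  # parameter type
Y = TypeVar("Y", bound=Hashable)  # output (data) type


@dataclass(frozen=True)
class Model(Generic[W, Y]):
    """A Bayesian model as a pair: prior p(w) and sampling distribution p(y | w).

    Assumption for this sketch: both distributions are finite and are given
    as dictionaries from values to exact probabilities.
    """
    prior: Dict[W, Fraction]
    sampling: Callable[[W], Dict[Y, Fraction]]


def draw(dist: Dict, rng: random.Random):
    """Draw one value from a finite distribution."""
    values = list(dist)
    weights = [float(dist[v]) for v in values]
    return rng.choices(values, weights=weights, k=1)[0]


def sample(model: Model, n: int, rng: random.Random) -> List:
    """Sampler: draw a parameter from the prior, then n synthetic outputs."""
    w = draw(model.prior, rng)
    likelihood = model.sampling(w)
    return [draw(likelihood, rng) for _ in range(n)]


def learn(model: Model, data: List) -> Tuple[Dict, Fraction]:
    """Learner (exact, by enumeration): return the posterior and the evidence."""
    joint = {}
    for w, p_w in model.prior.items():
        likelihood = model.sampling(w)
        weight = p_w
        for y in data:
            weight *= likelihood.get(y, Fraction(0))
        joint[w] = weight
    evidence = sum(joint.values(), Fraction(0))  # p(data) under the model
    if evidence == 0:
        raise ValueError("data has zero probability under the model")
    posterior = {w: p / evidence for w, p in joint.items()}
    return posterior, evidence


# Hypothetical example: a coin whose bias is one of three equally likely values.
coin = Model(
    prior={Fraction(1, 4): Fraction(1, 3),
           Fraction(1, 2): Fraction(1, 3),
           Fraction(3, 4): Fraction(1, 3)},
    sampling=lambda bias: {True: bias, False: 1 - bias},
)

synthetic = sample(coin, 5, random.Random(0))       # five synthetic coin flips
posterior, evidence = learn(coin, [True, True, False])
```

Here `learn` returns the evidence alongside the posterior because quantities such as evidence-based model averaging depend on it. For the observations `[True, True, False]`, the evidence is 5/48 and the posterior probability of bias 3/4 is 9/20.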

    Supplementary Material

    MP4 File (r1d3_talk3.mp4)



    Published In

    ACM SIGPLAN Notices, Volume 48, Issue 1
    POPL '13
    January 2013
    561 pages
    ISSN:0362-1340
    EISSN:1558-1160
    DOI:10.1145/2480359
    POPL '13: Proceedings of the 40th annual ACM SIGPLAN-SIGACT symposium on Principles of programming languages
      January 2013
      586 pages
      ISBN:9781450318327
      DOI:10.1145/2429069
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States


    Author Tags

    1. Bayesian reasoning
    2. machine learning
    3. model-learner pattern
    4. probabilistic programming

    Qualifiers

    • Research-article


    Cited By

    • (2023) On Lexicographic Proof Rules for Probabilistic Termination. Formal Aspects of Computing 35:2 (1-25). DOI: 10.1145/3585391. Online publication date: 27-Feb-2023
    • (2021) On Lexicographic Proof Rules for Probabilistic Termination. Formal Methods (619-639). DOI: 10.1007/978-3-030-90870-6_33. Online publication date: 20-Nov-2021
    • (2019) Cost analysis of nondeterministic probabilistic programs. Proceedings of the 40th ACM SIGPLAN Conference on Programming Language Design and Implementation (204-220). DOI: 10.1145/3314221.3314581. Online publication date: 8-Jun-2019
    • (2016) Robots at the Edge of the Cloud. Proceedings of the 22nd International Conference on Tools and Algorithms for the Construction and Analysis of Systems - Volume 9636 (3-13). DOI: 10.1007/978-3-662-49674-9_1. Online publication date: 2-Apr-2016
    • (2014) On Probabilistic Applicative Bisimulation and Call-by-Value λ-Calculi. Proceedings of the 23rd European Symposium on Programming Languages and Systems - Volume 8410 (209-228). DOI: 10.1007/978-3-642-54833-8_12. Online publication date: 5-Apr-2014
    • (2021) Correctness of Sequential Monte Carlo Inference for Probabilistic Programming Languages. Programming Languages and Systems (404-431). DOI: 10.1007/978-3-030-72019-3_15. Online publication date: 23-Mar-2021
    • (2019) Modular verification for almost-sure termination of probabilistic programs. Proceedings of the ACM on Programming Languages 3:OOPSLA (1-29). DOI: 10.1145/3360555. Online publication date: 10-Oct-2019
    • (2019) Probabilistic Smart Contracts: Secure Randomness on the Blockchain. 2019 IEEE International Conference on Blockchain and Cryptocurrency (ICBC) (403-412). DOI: 10.1109/BLOC.2019.8751326. Online publication date: May-2019
    • (2016) Differentially Private Bayesian Programming. Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security (68-79). DOI: 10.1145/2976749.2978371. Online publication date: 24-Oct-2016
