DOI: 10.1145/3453483.3454080

DreamCoder: bootstrapping inductive program synthesis with wake-sleep library learning

Published: 18 June 2021

Abstract

We present a system for inductive program synthesis called DreamCoder, which inputs a corpus of synthesis problems each specified by one or a few examples, and automatically derives a library of program components and a neural search policy that can be used to efficiently solve other similar synthesis problems. The library and search policy bootstrap each other iteratively through a variant of "wake-sleep" approximate Bayesian learning. A new refactoring algorithm based on E-graph matching identifies common sub-components across synthesized programs, building a progressively deepening library of abstractions capturing the structure of the input domain. We evaluate on eight domains including classic program synthesis areas and AI tasks such as planning, inverse graphics, and equation discovery. We show that jointly learning the library and neural search policy leads to solving more problems, and solving them more quickly.
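
To make the library-learning idea concrete, here is a toy Python sketch. It is not the paper's algorithm: the abstract describes refactoring based on E-graph matching, while this sketch only looks for syntactically identical subtrees that repeat across a corpus. The nested-tuple program encoding, the size-based score, the helper names (`size`, `subtrees`, `best_abstraction`, `rewrite`), and the primitive name `f0` are all assumptions made for illustration.

```python
from collections import Counter


def size(expr):
    """Count the leaf symbols in a nested-tuple expression."""
    if isinstance(expr, tuple):
        return sum(size(e) for e in expr)
    return 1


def subtrees(expr):
    """Yield every compound (tuple) subexpression, including expr itself."""
    if isinstance(expr, tuple):
        yield expr
        for child in expr:
            yield from subtrees(child)


def best_abstraction(programs):
    """Return the repeated subtree with the largest net size saving, or None."""
    counts = Counter(t for p in programs for t in subtrees(p))

    def saving(item):
        tree, n = item
        # Each of the n uses shrinks to one symbol; the definition costs size(tree).
        return n * (size(tree) - 1) - size(tree)

    candidates = [(t, n) for t, n in counts.items() if n >= 2]
    if not candidates:
        return None
    tree, _ = max(candidates, key=saving)
    return tree


def rewrite(expr, target, name):
    """Replace every occurrence of target inside expr with the symbol name."""
    if expr == target:
        return name
    if isinstance(expr, tuple):
        return tuple(rewrite(e, target, name) for e in expr)
    return expr


# Hypothetical corpus of already-solved tasks (toy list-processing programs).
programs = [
    ("map", ("lambda", ("+", "x", "1")), "xs"),
    ("filter", ("lambda", (">", ("+", "x", "1"), "0")), "xs"),
    ("fold", ("lambda", ("+", "x", "1")), "0", "xs"),
]

library_entry = best_abstraction(programs)
rewritten = [rewrite(p, library_entry, "f0") for p in programs]
size_before = sum(size(p) for p in programs)
size_after = sum(size(p) for p in rewritten)
```

In this toy corpus the shared subtree `("+", "x", "1")` is selected, and rewriting the three programs with the new symbol `f0` shrinks their total size. A real system would also need abstractions with variables and a probabilistic score, which this sketch leaves out.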

Supplementary Material

Auxiliary Archive (pldi21main-p355-p-archive.zip)
Appendix for "DreamCoder: Bootstrapping Inductive Program Synthesis with Wake-Sleep Library Learning"


      Published In

      PLDI 2021: Proceedings of the 42nd ACM SIGPLAN International Conference on Programming Language Design and Implementation
      June 2021
      1341 pages
      ISBN:9781450383912
      DOI:10.1145/3453483

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Author Tags

      1. learning
      2. neural
      3. refactoring
      4. synthesis

      Qualifiers

      • Research-article

      Conference

      PLDI '21
      Acceptance Rates

      Overall Acceptance Rate 406 of 2,067 submissions, 20%

      Article Metrics

      • Downloads (last 12 months): 3,004
      • Downloads (last 6 weeks): 310
      Reflects downloads up to 20 Jan 2025

      Cited By

      • (2025) Automated Program Refinement: Guide and Verify Code Large Language Model with Refinement Calculus. Proceedings of the ACM on Programming Languages 9:POPL (2057-2089). DOI: 10.1145/3704905. Online publication date: 9-Jan-2025
      • (2024) REGAL. Proceedings of the 41st International Conference on Machine Learning (46605-46624). DOI: 10.5555/3692070.3693967. Online publication date: 21-Jul-2024
      • (2024) Bayesian program learning by decompiling amortized knowledge. Proceedings of the 41st International Conference on Machine Learning (39042-39055). DOI: 10.5555/3692070.3693654. Online publication date: 21-Jul-2024
      • (2024) Learning to infer generative template programs for visual concepts. Proceedings of the 41st International Conference on Machine Learning (22465-22490). DOI: 10.5555/3692070.3692972. Online publication date: 21-Jul-2024
      • (2024) Equivalence by Canonicalization for Synthesis-Backed Refactoring. Proceedings of the ACM on Programming Languages 8:PLDI (1879-1904). DOI: 10.1145/3656453. Online publication date: 20-Jun-2024
      • (2024) Generating Function Names to Improve Comprehension of Synthesized Programs. 2024 IEEE Symposium on Visual Languages and Human-Centric Computing (VL/HCC) (248-259). DOI: 10.1109/VL/HCC60511.2024.00035. Online publication date: 2-Sep-2024
      • (2024) Lifelong Robot Library Learning: Bootstrapping Composable and Generalizable Skills for Embodied Control with Language Models. 2024 IEEE International Conference on Robotics and Automation (ICRA) (515-522). DOI: 10.1109/ICRA57147.2024.10611448. Online publication date: 13-May-2024
      • (2024) Machine Learning and Information Theory Concepts towards an AI Mathematician. Bulletin of the American Mathematical Society 61:3 (457-469). DOI: 10.1090/bull/1839. Online publication date: 15-May-2024
      • (2024) Building machines that learn and think with people. Nature Human Behaviour 8:10 (1851-1863). DOI: 10.1038/s41562-024-01991-9. Online publication date: 22-Oct-2024
      • (2024) The relational bottleneck as an inductive bias for efficient abstraction. Trends in Cognitive Sciences. DOI: 10.1016/j.tics.2024.04.001. Online publication date: May-2024
