research-article
Open access

Just-in-time learning for bottom-up enumerative synthesis

Published: 13 November 2020

Abstract

A key challenge in program synthesis is the astronomical size of the search space the synthesizer has to explore. In response to this challenge, recent work proposed to guide synthesis using learned probabilistic models. Obtaining such a model, however, might be infeasible for a problem domain where no high-quality training data is available. In this work, we introduce an alternative approach to guided program synthesis: instead of training a model ahead of time, we show how to bootstrap one just in time, during synthesis, by learning from partial solutions encountered along the way. To make the best use of the model, we also propose a new program enumeration algorithm we dub guided bottom-up search, which extends the efficient bottom-up search with guidance from probabilistic models.
We implement this approach in a tool called Probe, which targets problems in the popular syntax-guided synthesis (SyGuS) format. We evaluate Probe on benchmarks from the literature and show that it achieves significant performance gains both over unguided bottom-up search and over a state-of-the-art probability-guided synthesizer, which had been trained on a corpus of existing solutions. Moreover, we show that these performance gains do not come at the cost of solution quality: programs generated by Probe are only slightly more verbose than the shortest solutions and perform no unnecessary case-splitting.
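
The two ideas in the abstract, enumerating programs bottom-up in order of a model-derived cost and re-weighting the model from partial solutions found during the search, can be illustrated with a small sketch. This is not Probe's implementation. The toy grammar, the cost function (rounded negative log-probability), the update rule, and the names `guided_search`, `costs_from`, and `fit` are all assumptions made for illustration only.

```python
import math

# Hypothetical toy grammar over integers: E ::= x | 1 | E + E | E * E
ARITY = {"x": 0, "1": 0, "+": 2, "*": 2}

def evaluate(prog, x):
    """Evaluate a program (nested tuples such as ("+", ("x",), ("1",))) on input x."""
    op = prog[0]
    if op == "x":
        return x
    if op == "1":
        return 1
    left, right = evaluate(prog[1], x), evaluate(prog[2], x)
    return left + right if op == "+" else left * right

def productions_in(prog):
    """Set of productions that occur anywhere in the program."""
    used = {prog[0]}
    for child in prog[1:]:
        used |= productions_in(child)
    return used

def costs_from(probs):
    # Assumed cost model: rounded negative log-probability, at least 1.
    return {op: max(1, round(-math.log2(p))) for op, p in probs.items()}

def guided_search(examples, max_cost=20):
    """Return (program, restarts); program is None if nothing fits within max_cost."""
    inputs = [x for x, _ in examples]
    expected = tuple(y for _, y in examples)
    uniform = {op: 1 / len(ARITY) for op in ARITY}
    probs = dict(uniform)
    fit = {op: 0.0 for op in ARITY}  # best fraction of examples solved using op
    seen_subsets = set()             # example subsets already covered by a partial solution
    restarts = 0
    while True:
        costs = costs_from(probs)
        bank = {}                    # cost -> programs of exactly that cost
        seen_outputs = set()         # observational-equivalence pruning
        learned = False
        for c in range(1, max_cost + 1):
            level = []
            for op, arity in ARITY.items():
                k = costs[op]
                if arity == 0:
                    candidates = [(op,)] if k == c else []
                else:
                    # Split the remaining cost c - k between the two children.
                    candidates = [(op, l, r)
                                  for c1 in range(1, c - k)
                                  for l in bank.get(c1, [])
                                  for r in bank.get(c - k - c1, [])]
                for prog in candidates:
                    outs = tuple(evaluate(prog, x) for x in inputs)
                    if outs in seen_outputs:
                        continue
                    seen_outputs.add(outs)
                    level.append(prog)
                    if outs == expected:
                        return prog, restarts
                    solved = frozenset(
                        i for i, (o, e) in enumerate(zip(outs, expected)) if o == e)
                    if solved and solved not in seen_subsets:
                        # Partial solution covering a new subset of the examples.
                        seen_subsets.add(solved)
                        for used in productions_in(prog):
                            fit[used] = max(fit[used], len(solved) / len(expected))
                        learned = True
            bank[c] = level
            if learned:
                break                # restart enumeration with the updated model
        else:
            return None, restarts
        # Assumed update rule: raise the probability of productions seen in
        # partial solutions, then renormalize.
        raw = {op: uniform[op] ** (1 - fit[op]) for op in ARITY}
        total = sum(raw.values())
        probs = {op: raw[op] / total for op in ARITY}
        restarts += 1

if __name__ == "__main__":
    # Input-output examples consistent with x * x + 1.
    print(guided_search([(1, 2), (2, 5), (3, 10)]))
```

Each restart is triggered only by a partial solution that covers a previously unseen subset of the examples, so the number of restarts is bounded by the number of such subsets. Rounding costs to integers keeps the enumeration organized into discrete cost levels, at the price of hiding small probability changes.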

Supplementary Material

Auxiliary Presentation Video (oopsla20main-p583-p-video.mp4)
This is a presentation video for our OOPSLA 2020 paper titled "Just-in-time Learning for Bottom-Up Enumerative Synthesis".

Published In

Proceedings of the ACM on Programming Languages, Volume 4, Issue OOPSLA
November 2020
3108 pages
EISSN: 2475-1421
DOI: 10.1145/3436718
This work is licensed under a Creative Commons Attribution International 4.0 License.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Domain-specific languages
  2. Probabilistic models
  3. Program Synthesis

Qualifiers

  • Research-article

Funding Sources

  • National Science Foundation
