research-article
Open access

Guiding dynamic programing via structural probability for accelerating programming by example

Published: 13 November 2020

Abstract

Programming by example (PBE) is an important subproblem of program synthesis, and PBE techniques have been applied to many domains. Though many techniques for accelerating PBE systems have been explored, scalability remains one of the main challenges: there is still a gap between the performance of state-of-the-art synthesizers and industrial requirements. To further speed up solving PBE tasks, we propose a novel PBE framework, MaxFlash. MaxFlash uses a model based on structural probability, named topdown prediction models, to guide a search based on dynamic programming, so that the search focuses on subproblems that form probable programs and avoids improbable programs. Our evaluation shows that MaxFlash achieves speed-ups of 4.107× to 2080× against state-of-the-art solvers on 244 real-world tasks.
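
To make the idea in the abstract concrete, the following is a minimal illustrative sketch, not MaxFlash itself. It assumes a hypothetical toy string DSL, a hand-written probability table standing in for a learned structural model, and a memoized top-down search that discards any branch whose probability falls below a bound. All names (`ATOMS`, `ATOM_PROB`, `CONCAT_PROB`, `synthesize`, `min_prob`) and all numbers are assumptions made for illustration.

```python
import math
from functools import lru_cache

# Hypothetical toy DSL (an assumption, not the paper's language):
#   E := A | Concat(A, E)        A := x | Upper(x) | Const("-")
ATOMS = {
    "x": lambda s: s,
    "Upper(x)": lambda s: s.upper(),
    'Const("-")': lambda s: "-",
}

# Made-up stand-in for a structural model: the probability of each choice
# depends on the context in which the enclosing E was derived.
ATOM_PROB = {
    "root": {"x": 0.6, "Upper(x)": 0.3, 'Const("-")': 0.1},
    "concat": {"x": 0.4, "Upper(x)": 0.2, 'Const("-")': 0.4},
}
# P(E -> Concat(A, E) | context); P(E -> A | context) is the complement.
CONCAT_PROB = {"root": 0.5, "concat": 0.3}


def synthesize(examples, min_prob=1e-6):
    """Return (log-probability, program text) of the most probable program
    consistent with all examples, or None if none reaches min_prob."""
    inputs = tuple(i for i, _ in examples)

    @lru_cache(maxsize=None)  # memoization: each subproblem is solved once
    def solve(targets, ctx, budget):
        best = None
        for name, fn in ATOMS.items():
            vals = tuple(fn(i) for i in inputs)
            lp_atom = math.log(ATOM_PROB[ctx][name])

            # Rule E -> A: the atom must produce every target exactly.
            lp = math.log(1 - CONCAT_PROB[ctx]) + lp_atom
            if vals == targets and lp >= budget:
                if best is None or lp > best[0]:
                    best = (lp, name)

            # Rule E -> Concat(A, E): skip the branch if it is already
            # less probable than the remaining budget allows.
            lp = math.log(CONCAT_PROB[ctx]) + lp_atom
            if lp < budget:
                continue
            # The atom must yield a non-empty proper prefix of each target.
            if all(v and t.startswith(v) and len(t) > len(v)
                   for v, t in zip(vals, targets)):
                rest = tuple(t[len(v):] for v, t in zip(vals, targets))
                sub = solve(rest, "concat", budget - lp)
                if sub is not None and (best is None or lp + sub[0] > best[0]):
                    best = (lp + sub[0], f"Concat({name}, {sub[1]})")
        return best

    return solve(tuple(o for _, o in examples), "root", math.log(min_prob))


examples = [("ab", "AB-ab"), ("cd", "CD-cd")]
result = synthesize(examples)
```

With these made-up probabilities, `result[1]` is `Concat(Upper(x), Concat(Const("-"), x))`, and raising `min_prob` to 0.01 makes the search return `None` because that program's probability is below the bound. Putting the remaining budget in the memoization key keeps the cache correct at the cost of fewer cache hits; a real system would need a more careful scheme.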

Supplementary Material

Auxiliary Presentation Video (oopsla20main-p515-p-video.mp4)
This is a presentation video of my talk at OOPSLA 2020 on our paper accepted in the research track.

    Published In

    Proceedings of the ACM on Programming Languages  Volume 4, Issue OOPSLA
    November 2020
    3108 pages
    EISSN:2475-1421
    DOI:10.1145/3436718
    This work is licensed under a Creative Commons Attribution International 4.0 License.

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. Dynamic Programming
    2. Probabilistic Model
    3. Programming by Example

    Cited By

    • (2024) A Post-training Framework for Improving the Performance of Deep Learning Models via Model Transformation. ACM Transactions on Software Engineering and Methodology 33:3 (1-41). DOI: 10.1145/3630011. Online publication date: 15-Mar-2024
    • (2023) Scaling up Program Synthesis to Efficient Algorithms. Companion Proceedings of the 2023 ACM SIGPLAN International Conference on Systems, Programming, Languages, and Applications: Software for Humanity (4-6). DOI: 10.1145/3618305.3623586. Online publication date: 22-Oct-2023
    • (2023) Improving Oracle-Guided Inductive Synthesis by Efficient Question Selection. Proceedings of the ACM on Programming Languages 7:OOPSLA1 (819-847). DOI: 10.1145/3586055. Online publication date: 6-Apr-2023
    • (2023) Deep Reinforcement Learning Guided Decision Tree Learning For Program Synthesis. 2023 IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER) (925-932). DOI: 10.1109/SANER56733.2023.00112. Online publication date: Mar-2023
    • (2022) Toward Improving the Robustness of Deep Learning Models via Model Transformation. Proceedings of the 37th IEEE/ACM International Conference on Automated Software Engineering (1-13). DOI: 10.1145/3551349.3556920. Online publication date: 10-Oct-2022
    • (2022) L2S: A Framework for Synthesizing the Most Probable Program under a Specification. ACM Transactions on Software Engineering and Methodology 31:3 (1-45). DOI: 10.1145/3487570. Online publication date: 7-Mar-2022
    • (2021) Generalizable synthesis through unification. Proceedings of the ACM on Programming Languages 5:OOPSLA (1-28). DOI: 10.1145/3485544. Online publication date: 15-Oct-2021
    • (2021) Neural-Guided Inductive Synthesis of Functional Programs on List Manipulation by Offline Supervised Learning. IEEE Access 9 (71521-71534). DOI: 10.1109/ACCESS.2021.3079351. Online publication date: 2021
