Research article · DOI: 10.1145/3453483.3454063

Synthesizing data structure refinements from integrity constraints

Published: 18 June 2021

    Abstract

    Implementations of many data structures use several correlated fields to improve their performance; however, inconsistencies between these fields can be a source of serious program errors. To address this problem, we propose a new technique for automatically refining data structures from integrity constraints. In particular, consider a data structure D with fields F and methods M, as well as a new set of auxiliary fields F′ that should be added to D. Given this input and an integrity constraint Φ relating F and F′, our method automatically generates a refinement of D that satisfies the provided integrity constraint. Our method is based on a modular instantiation of the CEGIS paradigm and uses a novel inductive synthesizer that augments top-down search with three key ideas. First, it computes necessary preconditions of partial programs to dramatically prune its search space. Second, it augments the grammar with promising new productions by leveraging the computed preconditions. Third, it guides top-down search using a probabilistic context-free grammar obtained by statically analyzing the integrity checking function and the original code base. We evaluated our method on 25 data structures from popular Java projects and show that our method can successfully refine 23 of them. We also compare our method against two state-of-the-art synthesis tools and perform an ablation study to justify our design choices. Our evaluation shows that (1) our method is successful at refining many data structure implementations in the wild, (2) it advances the state-of-the-art in synthesis, and (3) our proposed ideas are crucial for making this technique practical.
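    To make the setting concrete, the following is a minimal hand-written sketch of what a refinement looks like; it is not output of the authors' tool. Every name here is hypothetical and not taken from the paper: `IntBag` plays the role of D, the list `items` is the original field set F, the cached `total` is the auxiliary field F′, and `checkIntegrity` encodes an integrity constraint Φ stating that `total` equals the sum of `items`. The refined methods are the original bodies plus the updates needed to keep Φ true.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical data structure D: a bag of integers with a cached total. */
class IntBag {
    // Original field F.
    private final List<Integer> items = new ArrayList<>();

    // Auxiliary field F' (an assumption for illustration): cached sum of items.
    private long total = 0;

    /** Integrity constraint Phi relating F and F': total == sum of items. */
    boolean checkIntegrity() {
        long sum = 0;
        for (int x : items) {
            sum += x;
        }
        return sum == total;
    }

    /** Refined method: the original insertion plus the update to F'. */
    void add(int x) {
        items.add(x);
        total += x; // added statement that keeps Phi true
    }

    /** Refined method: the original removal plus the update to F'. */
    boolean remove(int x) {
        // Integer.valueOf selects List.remove(Object), not remove(int index).
        boolean removed = items.remove(Integer.valueOf(x));
        if (removed) {
            total -= x; // added statement that keeps Phi true
        }
        return removed;
    }

    /** Constant-time query enabled by the auxiliary field. */
    long total() {
        return total;
    }

    public static void main(String[] args) {
        IntBag bag = new IntBag();
        bag.add(3);
        bag.add(4);
        bag.remove(3);
        System.out.println(bag.total() + " " + bag.checkIntegrity());
    }
}
```

    Omitting either `total` update would leave the two fields inconsistent after a call, which is the kind of error the integrity constraint is meant to rule out.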

      Published In

      PLDI 2021: Proceedings of the 42nd ACM SIGPLAN International Conference on Programming Language Design and Implementation
      June 2021
      1341 pages
      ISBN:9781450383912
      DOI:10.1145/3453483
      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Author Tags

      1. Data structure refinement
      2. Program Synthesis
      3. Programming Languages

      Qualifiers

      • Research-article

      Conference

      PLDI '21
      Acceptance Rates

      Overall Acceptance Rate 406 of 2,067 submissions, 20%

      Cited By

      • (2024) Semantic Code Refactoring for Abstract Data Types. Proceedings of the ACM on Programming Languages 8:POPL (816-847). DOI: 10.1145/3632870. Online publication date: 5-Jan-2024
      • (2024) Programming-by-Demonstration for Long-Horizon Robot Tasks. Proceedings of the ACM on Programming Languages 8:POPL (512-545). DOI: 10.1145/3632860. Online publication date: 5-Jan-2024
      • (2023) Inductive Program Synthesis via Iterative Forward-Backward Abstract Interpretation. Proceedings of the ACM on Programming Languages 7:PLDI (1657-1681). DOI: 10.1145/3591288. Online publication date: 6-Jun-2023
      • (2023) Automated Translation of Functional Big Data Queries to SQL. Proceedings of the ACM on Programming Languages 7:OOPSLA1 (580-608). DOI: 10.1145/3586047. Online publication date: 6-Apr-2023
      • (2022) Synthesis-powered optimization of smart contracts via data type refactoring. Proceedings of the ACM on Programming Languages 6:OOPSLA2 (560-588). DOI: 10.1145/3563308. Online publication date: 31-Oct-2022
      • (2022) Complexity-guided container replacement synthesis. Proceedings of the ACM on Programming Languages 6:OOPSLA1 (1-31). DOI: 10.1145/3527312. Online publication date: 29-Apr-2022
      • (2022) WebRobot: web robotic process automation using interactive programming-by-demonstration. Proceedings of the 43rd ACM SIGPLAN International Conference on Programming Language Design and Implementation (152-167). DOI: 10.1145/3519939.3523711. Online publication date: 9-Jun-2022
