Location via proxy:   [ UP ]  
[Report a bug]   [Manage cookies]                
skip to main content
10.1145/3035918.3035961acmconferencesArticle/Chapter ViewAbstractPublication PagesmodConference Proceedingsconference-collections
research-article

Handling Environments in a Nested Relational Algebra with Combinators and an Implementation in a Verified Query Compiler

Published: 09 May 2017 Publication History

Abstract

Algebras based on combinators, i.e., variable-free, have been proposed as a better representation for query compilation and optimization. A key benefit of combinators is that they avoid the need to handle variable shadowing or accidental capture during rewrites. This simplifies both the optimizer specification and its correctness analysis, but the environment from the source language has to be reified as records, which can lead to more complex query plans.
This paper proposes NRAe, an extension of a combinators-based nested relational algebra (NRA) with built-in support for environments. We show that it can naturally encode an equivalent NRA with lambda terms and that all optimizations on NRA carry over to NRAe. This extension provides an elegant way to represent views in query plans, and can radically simplify compilation and optimization for source languages with rich environment manipulations.
We have specified a query compiler using the Coq proof assistant with NRAe at its heart. Most of the compiler, including the query optimizer, is accompanied by a (machine-checked) correctness proof. The implementation is automatically extracted from the specification, resulting in a query compiler with a verified core.

References

[1]
M. Abadi, L. Cardelli, P.-L. Curien, and J.-J. Lévy. Explicit substitutions. Journal of Functional Programming (JFP), 1(04):375--416, 1991.
[2]
A. V. Aho, Y. Sagiv, and J. D. Ullman. Efficient optimization of a class of relational expressions. ACM Transactions on Database Systems (TODS), 4(4):435--454, 1979.
[3]
M. Armbrust et al. Spark SQL: Relational data processing in Spark. In International Conference on Management of Data (SIGMOD), pages 1383--1394, 2015.
[4]
M. Arnold et al. META: Middleware for events, transactions, and analytics. IBM Journal of Research and Development (IBMRD), 60(2--3), 2016.
[5]
B. E. Aydemir et al. Mechanized metatheory for the masses: The PoplMark challenge. In Conference on Theorem Proving in Higher Order Logics (TPHOLs), pages 50--65, 2005.
[6]
V. Benzaken, E. Contejean, and S. Dumbrava. A Coq formalization of the relational data model. In European Symposium on Programming (ESOP), pages 189--208, 2014.
[7]
K. Beyer et al. JAQL: A scripting language for large scale semistructured data analysis. In Conference on Very Large Data Bases (VLDB), pages 1272--1283, 2011.
[8]
J. Boyer and H. Mili. Agile Business Rule Development. Springer, 2011.
[9]
T. Braibant, J.-H. Jourdan, and D. Monniaux. Implementing and reasoning about hash-consed data structures in Coq. Journal of Automated Reasoning, 53(3):271--304, 2014.
[10]
M. J. Carey et al. The architecture of the EXODUS extensible DBMS. In On Object-Oriented Database Systems, pages 231--256. Springer Berlin Heidelberg, 1991.
[11]
J. Cheney and C. Urban. Mechanizing the metatheory of mini-XQuery. In Conference on Certified Programs and Proofs (CPP), pages 280--295, 2011.
[12]
M. Cherniack and S. B. Zdonik. Rule languages and internal algebras for rule-based optimizers. In International Conference on Management of Data (SIGMOD), 1996.
[13]
M. Cherniack and S. B. Zdonik. Changing the rules: Transformations for rule-based optimizers. In International Conference on Management of Data (SIGMOD), pages 61--72, 1998.
[14]
S. Cluet and G. Moerkotte. Nested queries in object bases. In Workshop on Database Programming Languages (DBPL), pages 226--242, 1993.
[15]
E. Cooper, S. Lindley, P. Wadler, and J. Yallop. Links: Web programming without tiers. In Formal Methods for Components and Objects, pages 266--296, 2007.
[16]
The Coq proof assistant reference manual, version 8.4pl41. coq.inria.fr.
[17]
L. Fegaras and D. Maier. Optimizing object queries using an effective calculus. ACM Transactions on Database Systems (TODS), 25(4):457--516, 2000.
[18]
L. Fegaras, D. Maier, and T. Sheard. Specifying rule-based query optimizers in a reflective framework. In Conference on Deductive and Object-Oriented Databases (DOOD), pages 146--168, 1993.
[19]
D. Florescu and G. Fourny. JSONiq: The history of a query language. IEEE Internet Computing, 17(5):86--90, 2013.
[20]
S. J. Garland and J. V. Guttag. An overview of LP, the Larch Prover. In Conference on Rewriting Techniques and Applications (RTA), pages 137--151, 1989. nms.lcs.mit.edu/Larch.
[21]
G. Giorgidze, T. Grust, A. Ulrich, and J. Weijers. Algebraic data types for language-integrated queries. In Workshop on Data Driven Functional Programming (DDFP), pages 5--10, 2013.
[22]
T. Grust, M. Mayr, J. Rittinger, and T. Schreiber. Ferry: Database-supported program execution. In International Conference on Management of Data (SIGMOD), pages 1063--1066, 2009.
[23]
P. Letouzey. A new extraction for Coq. In Conference on Types for Proofs and Programs (TYPES), pages 200--219, 2003.
[24]
T. W. Leung et al. The AQUA data model and algebra. In Workshop on Database Programming Languages (DBPL), pages 157--175. 1994.
[25]
G. Malecha and R. Wisnesky. Using dependent types and tactics to enable semantic optimization of language-integrated queries. In Symposium on Database Programming Languages (DBPL), pages 49--58, 2015.
[26]
J. G. Malecha, G. Morrisett, A. Shinnar, and R. Wisnesky. Toward a verified relational database management system. In Principles of Programming Languages (POPL), 2010.
[27]
N. May, S. Helmer, and G. Moerkotte. Strategies for query unnesting in XML databases. ACM Transactions on Database Systems (TODS), 31(3):968--1013, 2006.
[28]
E. Meijer. The world according to LINQ. Communications of the ACM (CACM), 54(10):45--51, 2011.
[29]
C. Olston, B. Reed, U. Srivastava, R. Kumar, and A. Tomkins. Pig Latin: a not-so-foreign language for data processing. In International Conference on Management of Data (SIGMOD), pages 1099--1110, 2008.
[30]
K. W. Ong, Y. Papakonstantinou, and R. Vernoux. The SQL semi-structured data model and query language: A capabilities survey of SQL-on-Hadoop, NoSQL and NewSQL databases. CoRR, abs/1405.3631, 2014.
[31]
H. Pirahesh, J. M. Hellerstein, and W. Hasan. Extensible/rule based query rewrite optimization in Starburst. In International Conference on Management of Data (SIGMOD), pages 39--48, 1992.
[32]
C. Ré, J. Siméon, and M. Fernandez. A complete and efficient algebraic compiler for XQuery. In International Conference on Data Engineering (ICDE), 2006.
[33]
A. Shinnar and J. Siméon. A branding strategy for business types. In Wadlerfest'2016, Edinburgh, Scotland, April 2016. LNCS.
[34]
A. Shinnar, J. Siméon, and M. Hirzel. A pattern calculus for rule languages: Expressiveness, compilation, and mechanization. In European Conference for Object-Oriented Programming (ECOOP), pages 542--567, 2015.
[35]
J. Van den Bussche and S. Vansummeren. Polymorphic type inference for the named nested relational calculus. Transactions on Computational Logic (TOCL), 9(1), 2007.

Cited By

View all
  • (2022)Translating canonical SQL to imperative code in CoqProceedings of the ACM on Programming Languages10.1145/35273276:OOPSLA1(1-27)Online publication date: 29-Apr-2022
  • (2021)A Coq formalization of data provenanceProceedings of the 10th ACM SIGPLAN International Conference on Certified Programs and Proofs10.1145/3437992.3439920(152-162)Online publication date: 17-Jan-2021
  • (2020)Verification supported refactoring of embedded sqlSoftware Quality Journal10.1007/s11219-020-09517-y29:3(629-665)Online publication date: 18-Jun-2020
  • Show More Cited By

Recommendations

Comments

Information & Contributors

Information

Published In

cover image ACM Conferences
SIGMOD '17: Proceedings of the 2017 ACM International Conference on Management of Data
May 2017
1810 pages
ISBN:9781450341974
DOI:10.1145/3035918
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

Publisher

Association for Computing Machinery

New York, NY, United States

Publication Notes

Badge change: Article originally badged under Version 1.0 guidelines https://www.acm.org/publications/policies/artifact-review-badging

Publication History

Published: 09 May 2017

Permissions

Request permissions for this article.

Check for updates

Badges

Author Tags

  1. coq
  2. environments
  3. formal verification
  4. nested queries
  5. optimization
  6. query compilers
  7. relational algebra
  8. variables handling

Qualifiers

  • Research-article

Conference

SIGMOD/PODS'17
Sponsor:

Acceptance Rates

Overall Acceptance Rate 785 of 4,003 submissions, 20%

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)15
  • Downloads (Last 6 weeks)1
Reflects downloads up to 12 Nov 2024

Other Metrics

Citations

Cited By

View all
  • (2022)Translating canonical SQL to imperative code in CoqProceedings of the ACM on Programming Languages10.1145/35273276:OOPSLA1(1-27)Online publication date: 29-Apr-2022
  • (2021)A Coq formalization of data provenanceProceedings of the 10th ACM SIGPLAN International Conference on Certified Programs and Proofs10.1145/3437992.3439920(152-162)Online publication date: 17-Jan-2021
  • (2020)Verification supported refactoring of embedded sqlSoftware Quality Journal10.1007/s11219-020-09517-y29:3(629-665)Online publication date: 18-Jun-2020
  • (2019)A Coq mechanised formal semantics for realistic SQL queries: formally reconciling SQL and bag relational algebraProceedings of the 8th ACM SIGPLAN International Conference on Certified Programs and Proofs10.1145/3293880.3294107(249-261)Online publication date: 14-Jan-2019
  • (2018)Axiomatic foundations and algorithms for deciding semantic equivalences of SQL queriesProceedings of the VLDB Endowment10.14778/3236187.323620011:11(1482-1495)Online publication date: 1-Jul-2018
  • (2018)Certified Graph View Maintenance with Regular DatalogTheory and Practice of Logic Programming10.1017/S147106841800022418:3-4(372-389)Online publication date: 10-Aug-2018
  • (2018)A Coq Formalisation of SQL’s Execution EnginesInteractive Theorem Proving10.1007/978-3-319-94821-8_6(88-107)Online publication date: 4-Jul-2018
  • (2017)Q*certProceedings of the 2017 ACM International Conference on Management of Data10.1145/3035918.3056447(1703-1706)Online publication date: 9-May-2017

View Options

Get Access

Login options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media