DOI: 10.1145/3448016.3459244
research-article
Public Access

TreeToaster: Towards an IVM-Optimized Compiler

Published: 18 June 2021

Abstract

A compiler's optimizer operates over abstract syntax trees (ASTs), continuously applying rewrite rules to replace subtrees of the AST with more efficient ones. Especially on large source repositories, even simply finding opportunities for a rewrite can be expensive, as the optimizer traverses the AST naively. In this paper, we leverage the need to repeatedly find rewrites, and explore options for making the search faster through indexing and incremental view maintenance (IVM). Concretely, we consider bolt-on approaches that make use of embedded IVM systems like DBToaster, as well as two new approaches: label indexing and TreeToaster, an AST-specialized form of IVM. We integrate these approaches into an existing just-in-time data structure compiler and show experimentally that TreeToaster can significantly improve performance with minimal memory overheads.
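To make the contrast in the abstract concrete, the following is a minimal sketch of naive pattern search versus a label index that is kept up to date as rewrites are applied. It is not the paper's implementation: the `Node` class, the `LabelIndex` class, the helper names, and the single rewrite rule `Add(e, Const 0) -> e` are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict


class Node:
    """Hypothetical AST node: a label, an optional value, and child nodes."""

    def __init__(self, label, children=(), value=None):
        self.label = label
        self.children = list(children)
        self.value = value
        self.parent = None
        for child in self.children:
            child.parent = self


def walk(node):
    """Yield every node of the subtree rooted at `node` in pre-order."""
    yield node
    for child in node.children:
        yield from walk(child)


def is_add_zero(node):
    """Match the illustrative pattern Add(e, Const 0)."""
    return (node.label == "Add" and len(node.children) == 2
            and node.children[1].label == "Const"
            and node.children[1].value == 0)


def naive_find(root):
    """Naive search: scan the whole AST on every call."""
    return next((n for n in walk(root) if is_add_zero(n)), None)


class LabelIndex:
    """Maps each label to the set of live nodes that carry it."""

    def __init__(self, root):
        self.by_label = defaultdict(set)
        self.add_subtree(root)

    def add_subtree(self, node):
        for n in walk(node):
            self.by_label[n.label].add(n)

    def remove_subtree(self, node):
        for n in walk(node):
            self.by_label[n.label].discard(n)

    def find(self):
        """Only nodes labelled Add are inspected, not the whole AST."""
        return next((n for n in self.by_label["Add"] if is_add_zero(n)), None)


def replace(root, index, old, new):
    """Swap `old` for `new`; index work is limited to the replaced subtree."""
    index.remove_subtree(old)
    index.add_subtree(new)
    parent = old.parent
    new.parent = parent
    if parent is None:
        return new  # the root itself was rewritten
    parent.children[parent.children.index(old)] = new
    return root


def simplify(root):
    """Apply Add(e, Const 0) -> e until the index reports no match."""
    index = LabelIndex(root)
    while (match := index.find()) is not None:
        root = replace(root, index, match, match.children[0])
    return root


def show(node):
    """Render a subtree as a compact string for inspection."""
    if not node.children:
        return f"{node.label}:{node.value}"
    return f"{node.label}({', '.join(show(c) for c in node.children)})"
```

In this sketch, `naive_find` pays for a full traversal each time a rewrite opportunity is sought, while `LabelIndex.find` inspects only the candidates with the right root label and `replace` updates the index by touching just the subtree that changed. A view-maintenance approach in the spirit described by the abstract would go further and maintain the set of full pattern matches itself, which this sketch does not attempt.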

Supplementary Material

Read me (3448016.3459244_readme.pdf)
Source Code (3448016.3459244_source_code.zip)
MP4 File (3448016.3459244.mp4)


Cited By

  • (2023) Asymptotically Better Query Optimization Using Indexed Algebra. Proceedings of the VLDB Endowment 16:11 (3018-3030). DOI: 10.14778/3611479.3611505. Online publication date: 1-Jul-2023
  • (2023) F-IVM: analytics over relational databases under updates. The VLDB Journal 33:4 (903-929). DOI: 10.1007/s00778-023-00817-w. Online publication date: 14-Nov-2023

Published In

SIGMOD '21: Proceedings of the 2021 International Conference on Management of Data
June 2021
2969 pages
ISBN:9781450383431
DOI:10.1145/3448016
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. abstract syntax trees
  2. compilers
  3. incremental view maintenance
  4. indexing

Qualifiers

  • Research-article

Conference

SIGMOD/PODS '21
Acceptance Rates

Overall Acceptance Rate 785 of 4,003 submissions, 20%

Article Metrics

  • Downloads (Last 12 months): 91
  • Downloads (Last 6 weeks): 14
Reflects downloads up to 03 Oct 2024
