DBSP: Automatic Incremental View Maintenance for Rich Query Languages

Published: 01 March 2023

    Abstract

    Incremental view maintenance (IVM) has long been a central problem in database theory. Many solutions have been proposed for restricted classes of database languages, such as the relational algebra, or Datalog. These techniques do not naturally generalize to richer languages. In this paper we give a general, heuristic-free solution to this problem in 3 steps: (1) we describe a simple but expressive language called DBSP for describing computations over data streams; (2) we give a new mathematical definition of IVM and a general algorithm for solving IVM for arbitrary DBSP programs, and (3) we show how to model many rich database query languages using DBSP (including the full relational algebra, queries over sets and multisets, arbitrarily nested relations, aggregation, flatmap (unnest), monotonic and non-monotonic recursion, streaming aggregation, and arbitrary compositions of all of these). SQL and Datalog can both be implemented in DBSP. As a consequence, we obtain efficient incremental view maintenance algorithms for queries written in all these languages.
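The abstract does not spell out the constructions, so the following is only a minimal sketch of the idea of incrementalizing a query over a stream of changes. It assumes that a change is represented as a weighted multiset (a dict mapping each tuple to an integer weight, negative for deletions) and that a stream is a Python list of such changes. All names (`zadd`, `integrate`, `differentiate`, `lift`, `incrementalize`) are hypothetical and are not taken from any DBSP implementation; the incremental version shown here is the unoptimized reference form "integrate, apply the query to each snapshot, differentiate".

```python
from collections import Counter


def zadd(a, b):
    """Pointwise sum of two weighted multisets, dropping zero weights."""
    out = Counter(a)
    for key, weight in b.items():
        out[key] += weight
    return {key: weight for key, weight in out.items() if weight != 0}


def zneg(a):
    """Negate every weight (turns insertions into deletions)."""
    return {key: -weight for key, weight in a.items()}


def integrate(deltas):
    """Stream of changes -> stream of full snapshots (running sum)."""
    total, out = {}, []
    for delta in deltas:
        total = zadd(total, delta)
        out.append(total)
    return out


def differentiate(snapshots):
    """Stream of snapshots -> stream of changes between consecutive ones."""
    prev, out = {}, []
    for snap in snapshots:
        out.append(zadd(snap, zneg(prev)))
        prev = snap
    return out


def lift(query):
    """Apply a per-snapshot query to every element of a stream."""
    return lambda stream: [query(z) for z in stream]


def incrementalize(query):
    """Reference incremental version: changes in, changes of the view out."""
    return lambda deltas: differentiate(lift(query)(integrate(deltas)))


def distinct(z):
    """Non-linear query: keep each tuple with positive weight exactly once."""
    return {key: 1 for key, weight in z.items() if weight > 0}


def select_even(z):
    """Linear query: a filter that keeps even keys with their weights."""
    return {key: weight for key, weight in z.items() if key % 2 == 0}


# Three input transactions: insert 1 and 2; insert 2 again; delete both 2s, insert 4.
deltas = [{1: 1, 2: 1}, {2: 1}, {2: -2, 4: 1}]
view_changes = incrementalize(distinct)(deltas)

# For a linear query, the incremental version coincides with applying the
# query directly to each change, so no integration is needed.
assert incrementalize(select_even)(deltas) == lift(select_even)(deltas)
```

In this sketch the second transaction produces an empty change for `distinct` (the tuple 2 was already present), and the third reports the deletion of 2 and the insertion of 4. The reference form recomputes the query on every snapshot, so it illustrates only the specification of the incremental view, not an efficient algorithm for it.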

    Cited By

    • (2024) DBSP: Incremental Computation on Streams and Its Applications to Databases. ACM SIGMOD Record 53:1 (87-95). DOI: 10.1145/3665252.3665271. Online publication date: 14-May-2024
    • (2024) Recent Increments in Incremental View Maintenance. Companion of the 43rd Symposium on Principles of Database Systems (8-17). DOI: 10.1145/3635138.3654763. Online publication date: 9-Jun-2024
    • (2023) On The Suitability of Differential Dataflow For Datalog Interpretation In Highly Dynamic Settings. Proceedings of the 2023 6th Artificial Intelligence and Cloud Computing Conference (218-225). DOI: 10.1145/3639592.3639622. Online publication date: 16-Dec-2023

    Published In

    Proceedings of the VLDB Endowment, Volume 16, Issue 7
    March 2023
    203 pages
    ISSN: 2150-8097

    Publisher

    VLDB Endowment
