research-article
Open access

Output-sensitive Conjunctive Query Evaluation

Published: 07 November 2024

Abstract

Join evaluation is one of the most fundamental operations performed by database systems and arguably the most well-studied problem in the database community. A staggering number of join algorithms have been developed, and commercial database engines use finely tuned join heuristics that take into account many factors, including the selectivity of predicates, memory, and IO. However, most of the results cater either to full join queries or to non-full join queries with degree constraints (such as PK-FK relationships) that make join evaluation easier. Further, most of the algorithms are not output-sensitive. In this paper, we present a novel, output-sensitive algorithm for the evaluation of acyclic Conjunctive Queries (CQs) that contain arbitrary free variables. Our result is based on a novel generalization of the Yannakakis algorithm and shows that it is possible to improve the running time guarantee of the Yannakakis algorithm by a polynomial factor. Importantly, our algorithmic improvement does not depend on the use of fast matrix multiplication, as a recently proposed algorithm does. Applying our algorithm recovers known prior results and improves on state-of-the-art results for common queries such as paths and stars. The upper bound is complemented with a matching lower bound for star queries, a restricted subclass of acyclic CQs, and a family of cyclic CQs conditioned on two variants of the k-clique conjecture.
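
For readers unfamiliar with the baseline, the following is a minimal sketch of the classical Yannakakis-style evaluation of a path query with projection, Q(x0, xk) :- R1(x0, x1), ..., Rk(x(k-1), xk). It is not the algorithm proposed in the paper. The function names `semijoin` and `path_query_endpoints` and the representation of relations as lists of pairs are assumptions made for illustration.

```python
from collections import defaultdict


def semijoin(left, right, left_col, right_col):
    """Keep the tuples of `left` whose value at `left_col` occurs in `right` at `right_col`."""
    keys = {t[right_col] for t in right}
    return [t for t in left if t[left_col] in keys]


def path_query_endpoints(relations):
    """Evaluate Q(x0, xk) :- R1(x0, x1), ..., Rk(x(k-1), xk) over binary relations.

    Illustrative sketch of the classical approach: two semijoin passes remove
    dangling tuples, then the relations are joined left to right, projecting
    onto (x0, current endpoint) after every join.
    """
    rels = [list(set(r)) for r in relations]

    # Pass 1 (right to left): drop tuples of R_i that do not join with R_(i+1).
    for i in range(len(rels) - 2, -1, -1):
        rels[i] = semijoin(rels[i], rels[i + 1], 1, 0)

    # Pass 2 (left to right): drop tuples of R_i that do not join with R_(i-1).
    for i in range(1, len(rels)):
        rels[i] = semijoin(rels[i], rels[i - 1], 0, 1)

    # Join phase: extend (x0, x_i) pairs to (x0, x_(i+1)) pairs, deduplicating.
    current = set(rels[0])
    for r in rels[1:]:
        index = defaultdict(list)
        for b, c in r:
            index[b].append(c)
        current = {(a, c) for a, b in current for c in index[b]}
    return current


R = [(1, 10), (2, 10), (3, 30)]
S = [(10, 100), (10, 200), (40, 400)]
result = path_query_endpoints([R, S])
```

In this sketch the intermediate sets of (x0, x_i) pairs can be larger than the final output, which is one reason output-sensitive running time guarantees for queries with projections are of interest.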

Published In

Proceedings of the ACM on Management of Data, Volume 2, Issue 5
PODS
November 2024
363 pages
EISSN:2836-6573
DOI:10.1145/3703846

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published in PACMMOD Volume 2, Issue 5

Author Tags

  1. conjunctive queries
  2. output-sensitive
  3. projections
  4. yannakakis

Qualifiers

  • Research-article
