research-article

Parallel-Correctness and Transferability for Conjunctive Queries

Published: 04 September 2017

Abstract

A dominant cost for query evaluation in modern massively distributed systems is the number of communication rounds. For this reason, there is a growing interest in single-round multiway join algorithms where data are first reshuffled over many servers and then evaluated in a parallel but communication-free way. The reshuffling itself is specified as a distribution policy. We introduce a correctness condition, called parallel-correctness, for the evaluation of queries w.r.t. a distribution policy. We study the complexity of parallel-correctness for conjunctive queries as well as transferability of parallel-correctness between queries. We also investigate the complexity of transferability for certain families of distribution policies, including the Hypercube distribution policies.
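The one-round setting in the abstract can be illustrated with a small brute-force check. This is a minimal sketch, not the paper's algorithm: it tests a single database instance rather than all instances, and it assumes the usual reading of the condition, namely that the union of the per-server answers equals the answer on the whole database. The fact encoding, the helper names (`evaluate`, `local_instance`, `parallel_correct_on`), and the two example policies are hypothetical and chosen only for illustration.

```python
# A fact is (relation_name, args). A conjunctive query is (head_vars, body_atoms),
# where each body atom is (relation_name, variables). All names here are illustrative.

def evaluate(query, facts):
    """Naive backtracking evaluation: return the set of head tuples."""
    head, body = query
    results = set()

    def extend(i, valuation):
        if i == len(body):
            results.add(tuple(valuation[v] for v in head))
            return
        rel, variables = body[i]
        for frel, args in facts:
            if frel != rel or len(args) != len(variables):
                continue
            new = dict(valuation)
            # Bind each variable, or reject the fact if it clashes with an earlier binding.
            if all(new.setdefault(v, a) == a for v, a in zip(variables, args)):
                extend(i + 1, new)

    extend(0, {})
    return results

def local_instance(facts, policy, node):
    """Facts that the distribution policy sends to the given server."""
    return {f for f in facts if node in policy(f)}

def parallel_correct_on(query, facts, policy, nodes):
    """Compare the union of communication-free local answers with the global answer."""
    distributed = set()
    for node in nodes:
        distributed |= evaluate(query, local_instance(facts, policy, node))
    return distributed == evaluate(query, facts)

NODES = range(2)
# H(x, z) :- R(x, y), S(y, z)
QUERY = (("x", "z"), [("R", ("x", "y")), ("S", ("y", "z"))])
INSTANCE = {("R", (1, 2)), ("S", (2, 3))}

def hash_on_join_key(fact):
    """Assumed policy: route R and S facts by the shared variable y."""
    rel, args = fact
    key = args[1] if rel == "R" else args[0]
    return {key % 2}

def hash_on_first_column(fact):
    """Assumed policy: route every fact by its first column."""
    _, args = fact
    return {args[0] % 2}

print(parallel_correct_on(QUERY, INSTANCE, hash_on_join_key, NODES))
print(parallel_correct_on(QUERY, INSTANCE, hash_on_first_column, NODES))
```

Under the first policy both facts meet on the same server, so the local join finds the answer. Under the second policy they land on different servers and the answer is lost, which is the kind of failure the correctness condition is meant to rule out.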

Published In

Journal of the ACM, Volume 64, Issue 5
October 2017
266 pages
ISSN:0004-5411
EISSN:1557-735X
DOI:10.1145/3136515
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 04 September 2017
Accepted: 01 June 2017
Revised: 01 March 2017
Received: 01 December 2015
Published in JACM Volume 64, Issue 5

Author Tags

  1. Distributed databases
  2. distribution policies
  3. one-round evaluation
  4. parallel query evaluation

Qualifiers

  • Research-article
  • Research
  • Refereed

Funding Sources

  • Research Foundation-Flanders (FWO)

Article Metrics

  • Downloads (Last 12 months): 17
  • Downloads (Last 6 weeks): 1
Reflects downloads up to 07 Nov 2024

Cited By

  • (2024) Optimizing Distributed Protocols with Query Rewrites. Proceedings of the ACM on Management of Data 2:1 (1-25). DOI: 10.1145/3639257. Online publication date: 26-Mar-2024
  • (2021) Database Principles and Challenges in Text Analysis. ACM SIGMOD Record 50:2 (6-17). DOI: 10.1145/3484622.3484624. Online publication date: 24-Aug-2021
  • (2019) Parallel-Correctness and Containment for Conjunctive Queries with Union and Negation. ACM Transactions on Computational Logic 20:3 (1-24). DOI: 10.1145/3329120. Online publication date: 7-Jun-2019
  • (2019) Split-Correctness in Information Extraction. Proceedings of the 38th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems (149-163). DOI: 10.1145/3294052.3319684. Online publication date: 25-Jun-2019
  • (2019) A Case for Stale Synchronous Distributed Model for Declarative Recursive Computation. Theory and Practice of Logic Programming 19:5-6 (1056-1072). DOI: 10.1017/S1471068419000358. Online publication date: 20-Sep-2019
  • (2019) Distribution Policies for Datalog. Theory of Computing Systems. DOI: 10.1007/s00224-019-09959-3. Online publication date: 4-Dec-2019
