
Dataflow query execution in a parallel main-memory environment

AN Wilschut, PMG Apers - Distributed and Parallel Databases, 1993 - Springer


In this paper, the performance and characteristics of the execution of various join-trees on a
parallel DBMS are studied. The results of this study are a step toward the design
of a query optimization strategy that is fit for parallel execution of complex queries.

Abstract

Among other things, synchronization issues are identified as limiting the performance gain from parallelism. A new hash-join algorithm is introduced that has fewer synchronization constraints than the known hash-join algorithms. Also, the behavior of individual join operations in a join-tree is studied in a simulation experiment. The results show that the introduced Pipelining hash-join algorithm yields better performance for multi-join queries. The shape of the optimal join-tree appears to depend on the size of the join operands: a multi-join between small operands performs best with a bushy schedule, whereas larger operands are better off with a linear schedule. The results from the simulation study are confirmed with an analytic model for dataflow query execution.
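
The abstract does not spell out how the Pipelining hash-join works, so the following is only a minimal single-threaded Python sketch of one common reading of a pipelined (symmetric) hash join, not the authors' implementation. It assumes that a hash table is kept for each operand, and that every arriving tuple first probes the other operand's table and is then inserted into its own, so results can be emitted before either input is complete. The function name `pipelining_hash_join`, the key-extractor parameters, and the alternating arrival order are all hypothetical choices made for illustration.

```python
from collections import defaultdict
from itertools import zip_longest

_DONE = object()  # sentinel marking an exhausted input (illustrative helper)


def pipelining_hash_join(left, right, left_key, right_key):
    """Yield (l, r) pairs whose join keys are equal.

    Assumption: tuples from both operands arrive interleaved; here this is
    mimicked by alternating between the two input iterables.
    """
    left_table = defaultdict(list)   # tuples seen so far from the left operand
    right_table = defaultdict(list)  # tuples seen so far from the right operand

    for l, r in zip_longest(left, right, fillvalue=_DONE):
        if l is not _DONE:
            k = left_key(l)
            # Probe the part of the right operand that has already arrived.
            for match in right_table.get(k, ()):
                yield (l, match)
            left_table[k].append(l)
        if r is not _DONE:
            k = right_key(r)
            # Probe the part of the left operand that has already arrived.
            for match in left_table.get(k, ()):
                yield (match, r)
            right_table[k].append(r)


if __name__ == "__main__":
    employees = [(1, "ann"), (2, "bob"), (3, "cy")]
    projects = [(2, "x"), (1, "y"), (1, "z")]
    for pair in pipelining_hash_join(employees, projects,
                                     lambda t: t[0], lambda t: t[0]):
        print(pair)
```

Each matching pair is produced exactly once, by whichever of its two tuples arrives later. Neither operand has to be fully consumed before output starts, which is one way to read the "fewer synchronization constraints" mentioned in the abstract.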
