Abstract
We consider the problem of scheduling a database query execution graph on a parallel machine. Specifically, we study how to data-partition pipelined operators so as to minimize response time, a basic problem in scheduling database execution trees. Partitioning promises increased parallelism and memory availability at the price of greater communication overhead. Current partitioning methods [BB90, TWPY92, LCRY93, NSHL93] do not consider these trade-offs. We present a mathematical framework within which these alternatives can be quantified for many practical scenarios. We then present an algorithm whose performance is within a factor of 2 of the optimum.
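The trade-off between parallelism and communication overhead can be illustrated with a toy cost model. The sketch below is not the framework or the algorithm of the paper: it assumes, purely for illustration, that an operator with total work `work` split across `k` processors takes `work / k` time to compute plus `comm_per_proc * k` time to communicate. The names `response_time`, `best_degree`, `work`, and `comm_per_proc` are hypothetical.

```python
def response_time(work: float, comm_per_proc: float, k: int) -> float:
    """Toy model (an assumption, not the paper's): compute time falls as
    work / k while communication overhead grows as comm_per_proc * k."""
    if k < 1:
        raise ValueError("k must be at least 1")
    return work / k + comm_per_proc * k


def best_degree(work: float, comm_per_proc: float, max_procs: int) -> int:
    """Return the degree of partitioning in 1..max_procs that minimizes
    the toy response time, found by exhaustive search."""
    return min(
        range(1, max_procs + 1),
        key=lambda k: response_time(work, comm_per_proc, k),
    )


if __name__ == "__main__":
    k = best_degree(work=100.0, comm_per_proc=1.0, max_procs=64)
    print(k, response_time(100.0, 1.0, k))
```

Under this assumed model, using all 64 processors is worse than using 10, because past that point the added communication cost outweighs the gain in parallelism.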
References
K.P. Belkhale and P. Banerjee. Approximate Algorithms for the Partitionable Independent Task Scheduling Problem. Proceedings of the 1990 International Conference on Parallel Processing, pp. I-72–I-75.
D. DeWitt, S. Ghandeharizadeh, D. Schneider, A. Bricker, H. Hsiao and R. Rasmussen. The Gamma Database Machine. IEEE Transactions on Knowledge and Data Engineering, March 1990.
D. DeWitt and J. Gray. The Future of High Performance Database Systems. Communications of the ACM, June 1992.
S. Ganguly. Parallel Evaluation of Deductive Database Queries. PhD thesis, University of Texas, Austin, 1992.
S. Ganguly, W. Hasan and R. Krishnamurthy. Query Optimization for Parallel Executions. Proceedings of the 1992 ACM SIGMOD International Conference on Management of Data.
R.L. Graham. Bounds on Multiprocessing Timing Anomalies. SIAM J. Appl. Math. 17 (1969), 416–429.
W. Hasan and R. Motwani. Optimization Algorithms for Exploiting the Parallelism-Communication Trade-off in Pipelined Parallelism. Proceedings of the 1994 International Conference on Very Large Databases.
W. Hong and M. Stonebraker. Optimization of Parallel Query Execution Plans in XPRS. Proceedings of the First International Conference on Parallel and Distributed Database Systems, December 1991.
W. Hong. Exploiting Inter-Operation Parallelism in XPRS. Proceedings of the 1992 ACM SIGMOD International Conference on Management of Data.
R.S.G. Lanzelotte, P. Valduriez and M. Zait. On the Effectiveness of Optimization Search Strategies for Parallel Execution. Proceedings of the 1993 International Conference on Very Large Databases.
M-L. Lo, M-S. Chen, C.V. Ravishankar and P.S. Yu. On Optimal Processor Allocation to Support Pipelined Hash Joins. Proceedings of the 1993 ACM SIGMOD International Conference on Management of Data.
H-I. Hsiao, M-S. Chen and P.S. Yu. On Parallel Execution of Multiple Pipelined Hash Joins. Proceedings of the 1994 ACM SIGMOD International Conference on Management of Data.
H. Lu, M.C. Shan and K.L. Tan. Optimization of Multi-Way Join Queries for Parallel Execution. Proceedings of the 1991 International Conference on Very Large Databases.
T.H. Niccum, J. Srivastava, B. Himatsingka and J-Z. Li. A Tree-Decomposition Approach to the Parallel Execution of Relational Query Plans. Technical Report, University of Minnesota at Minneapolis.
H. Pirahesh, C. Mohan, J. Cheung, T.S. Liu and P. Selinger. Parallelism in Relational Database Systems: Architectural Issues and Design Approaches. Proceedings of the 1991 International Conference on Parallel and Distributed Information Systems.
D. Schneider. Complex Query Processing in Multiprocessor Database Machines. PhD thesis, University of Wisconsin, Madison, 1990.
P. Selinger, M.M. Astrahan, D.D. Chamberlin, R.A. Lorie and T.G. Price. Access Path Selection in a Relational Database Management System. Proceedings of the 1979 ACM SIGMOD International Conference on Management of Data.
J. Srivastava and G. Elsesser. Query Optimization for Parallel Relational Databases. Preliminary version appeared in Proceedings of the 1993 International Conference on Parallel and Distributed Information Systems.
E.J. Shekita, H.C. Young and K-L. Tan. Multi-Join Optimization for Symmetric Multiprocessors. Proceedings of the 1993 Conference on Very Large Databases.
K-L. Tan and H. Lu. On Resource Scheduling of Multi-Join Queries in Parallel Database Systems. Information Processing Letters 48 (1993), 189–195.
J. Turek, J.L. Wolf, K.R. Pattipati and P.S. Yu. Scheduling Parallelizable Tasks: Putting it All on the Shelf. Proceedings of the 1992 ACM Sigmetrics Conference.
M. Ziane, M. Zait and P. Borla-Salamet. Parallel Query Processing in DBS3. Proceedings of the 1993 International Conference on Parallel and Distributed Information Systems.
Copyright information
© 1995 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Ganguly, S., Gerasoulis, A., Wang, W. (1995). Partitioning pipelines with communication costs. In: Bhalla, S. (eds) Information Systems and Data Management. CISMOD 1995. Lecture Notes in Computer Science, vol 1006. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-60584-3_40
DOI: https://doi.org/10.1007/3-540-60584-3_40
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-60584-3
Online ISBN: 978-3-540-47799-0
eBook Packages: Springer Book Archive