
A performance evaluation of four parallel join algorithms in a shared-nothing multiprocessor environment

Published: 01 June 1989

    Abstract

    In this paper we analyze and compare four parallel join algorithms. Grace and Hybrid hash represent the class of hash-based join methods, Simple hash represents a looping algorithm with hashing, and our last algorithm is the more traditional sort-merge. The performance of each of the algorithms with different tuple distribution policies, the addition of bit vector filters, varying amounts of main-memory for joining, and non-uniformly distributed join attribute values is studied. The Hybrid hash-join algorithm is found to be superior except when the join attribute values of the inner relation are non-uniformly distributed and memory is limited. In this case, a more conservative algorithm such as the sort-merge algorithm should be used. The Gamma database machine serves as the host for the performance comparison.
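    The abstract names hash-based joins and bit vector filters without showing their mechanics. The following is a minimal single-process sketch of a Grace-style partitioned hash join with an optional bit vector filter, offered only as an illustration. It is not the Gamma implementation: the function names, the in-memory lists standing in for disk-resident buckets, and the default bucket and bit counts are all assumptions made for this example. In a shared-nothing system each bucket pair would typically be shipped to a separate processor rather than joined in a loop.

```python
from collections import defaultdict

def partition(tuples, key, n_buckets):
    """Split tuples into n_buckets by hashing the join attribute at index `key`."""
    buckets = [[] for _ in range(n_buckets)]
    for t in tuples:
        buckets[hash(t[key]) % n_buckets].append(t)
    return buckets

def build_bit_vector(tuples, key, n_bits):
    """Set one bit per hashed join value seen in the inner relation."""
    bits = [False] * n_bits
    for t in tuples:
        bits[hash(t[key]) % n_bits] = True
    return bits

def grace_hash_join(inner, outer, inner_key, outer_key, n_buckets=4, n_bits=64):
    """Hypothetical Grace-style equi-join over in-memory lists of tuples."""
    # Filter step: drop outer tuples whose bit is unset, since they cannot match.
    # False positives are possible; false negatives are not.
    bits = build_bit_vector(inner, inner_key, n_bits)
    outer = [t for t in outer if bits[hash(t[outer_key]) % n_bits]]

    # Partition step: the same hash function on both relations guarantees that
    # matching tuples land in buckets with the same index.
    inner_parts = partition(inner, inner_key, n_buckets)
    outer_parts = partition(outer, outer_key, n_buckets)

    # Join step: build a hash table on each inner bucket, probe with the outer one.
    result = []
    for r_part, s_part in zip(inner_parts, outer_parts):
        table = defaultdict(list)
        for r in r_part:
            table[r[inner_key]].append(r)
        for s in s_part:
            for r in table.get(s[outer_key], []):
                result.append((r, s))
    return result

inner = [(1, "a"), (2, "b"), (3, "c")]
outer = [(2, "x"), (3, "y"), (3, "z"), (9, "w")]
joined = grace_hash_join(inner, outer, inner_key=0, outer_key=0)
```

    In this sketch the bit vector only reduces the number of outer tuples that are partitioned and probed; it never changes the join result, which is why it can be treated as an optional addition to any of the hash-based methods.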



    Published In

    ACM SIGMOD Record, Volume 18, Issue 2
    June 1989
    442 pages

    SIGMOD '89: Proceedings of the 1989 ACM SIGMOD International Conference on Management of Data
    June 1989
    451 pages
    ISBN: 0897913175
    DOI: 10.1145/67544
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.

    Publisher

    Association for Computing Machinery

    New York, NY, United States


    Qualifiers

    • Article
