Location via proxy:   [ UP ]  
[Report a bug]   [Manage cookies]                
skip to main content
10.1145/1989323.1989328acmconferencesArticle/Chapter ViewAbstractPublication PagesmodConference Proceedingsconference-collections
research-article

Design and evaluation of main memory hash join algorithms for multi-core CPUs

Published: 12 June 2011 Publication History
  • Get Citation Alerts
  • Abstract

    The focus of this paper is on investigating efficient hash join algorithms for modern multi-core processors in main memory environments. This paper dissects each internal phase of a typical hash join algorithm and considers different alternatives for implementing each phase, producing a family of hash join algorithms. Then, we implement these main memory algorithms on two radically different modern multi-processor systems, and carefully examine the factors that impact the performance of each method.
    Our analysis reveals some interesting results -- a very simple hash join algorithm is very competitive to the other more complex methods. This simple join algorithm builds a shared hash table and does not partition the input relations. Its simplicity implies that it requires fewer parameter settings, thereby making it far easier for query optimizers and execution engines to use it in practice. Furthermore, the performance of this simple algorithm improves dramatically as the skew in the input data increases, and it quickly starts to outperform all other algorithms.

    References

    [1]
    A. Ailamaki, D. J. DeWitt, M. D. Hill, and D. A. Wood. DBMSs on a modern processor: Where does time go? In VLDB, pages 266--277, 1999.
    [2]
    P. A. Boncz, S. Manegold, and M. L. Kersten. Database architecture optimized for the new bottleneck: Memory access. In VLDB, pages 54--65, 1999.
    [3]
    J. Cieslewicz, W. Mee, and K. A. Ross. Cache-conscious buffering for database operators with state. In DaMoN, pages 43--51, 2009.
    [4]
    J. Cieslewicz and K. A. Ross. Data partitioning on chip multiprocessors. In DaMoN, pages 25--34, 2008.
    [5]
    J. Cieslewicz, K. A. Ross, and I. Giannakakis. Parallel buffers for chip multiprocessors. In DaMoN, 2007.
    [6]
    D. J. DeWitt and J. Gray. Parallel database systems: The future of database processing or a passing fad? SIGMOD Record, 19(4):104--112, 1990.
    [7]
    D. J. DeWitt, R. H. Katz, F. Olken, L. D. Shapiro, M. Stonebraker, and D. A. Wood. Implementation techniques for main memory database systems. In SIGMOD Conference, pages 1--8, 1984.
    [8]
    D. J. DeWitt, J. F. Naughton, D. A. Schneider, and S. Seshadri. Practical skew handling in parallel joins. In VLDB, pages 27--40, 1992.
    [9]
    P. Garcia and H. F. Korth. Database hash-join algorithms on multithreaded computer architectures. In Conf. Computing Frontiers, pages 241--252, 2006.
    [10]
    G. Graefe, R. Bunker, and S. Cooper. Hash joins and hash teams in Microsoft SQL Server. In VLDB, pages 86--97, 1998.
    [11]
    C. Kim, E. Sedlar, J. Chhugani, T. Kaldewey, A. D. Nguyen, A. D. Blas, V. W. Lee, N. Satish, and P. Dubey. Sort vs. hash revisited: Fast join implementation on modern multi-core CPUs. PVLDB, 2(2):1378--1389, 2009.
    [12]
    S. Manegold, P. A. Boncz, and M. L. Kersten. What happens during a join? Dissecting CPU and memory optimization effects. In VLDB, pages 339--350, 2000.
    [13]
    S. Manegold, P. A. Boncz, and M. L. Kersten. Optimizing main-memory join on modern hardware. IEEE Trans. Knowl. Data Eng., 14(4):709--730, 2002.
    [14]
    R. Pagh and F. F. Rodler. Cuckoo hashing. J. Algorithms, 51(2):122--144, 2004.
    [15]
    K. A. Ross. Efficient hash probes on modern processors. In ICDE, pages 1297--1301, 2007.
    [16]
    A. Shatdal, C. Kant, and J. F. Naughton. Cache conscious algorithms for relational query processing. In VLDB, pages 510--521, 1994.
    [17]
    M. Stonebraker. The case for shared nothing. In HPTS, 1985.

    Cited By

    View all
    • (2024)Heterogeneous acceleration design and implementation for join operation in database queryProceedings of the 3rd International Conference on Computer, Artificial Intelligence and Control Engineering10.1145/3672758.3672918(965-970)Online publication date: 26-Jan-2024
    • (2024)SIMDified Data Processing - Foundations, Abstraction, and Advanced TechniquesCompanion of the 2024 International Conference on Management of Data10.1145/3626246.3654694(613-621)Online publication date: 9-Jun-2024
    • (2024)CPU and GPU Hash Joins on Skewed Data2024 IEEE 40th International Conference on Data Engineering Workshops (ICDEW)10.1109/ICDEW61823.2024.00064(402-408)Online publication date: 13-May-2024
    • Show More Cited By

    Recommendations

    Comments

    Information & Contributors

    Information

    Published In

    cover image ACM Conferences
    SIGMOD '11: Proceedings of the 2011 ACM SIGMOD International Conference on Management of data
    June 2011
    1364 pages
    ISBN:9781450306614
    DOI:10.1145/1989323
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Sponsors

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 12 June 2011

    Permissions

    Request permissions for this article.

    Check for updates

    Author Tags

    1. hash join
    2. main memory
    3. multi-core

    Qualifiers

    • Research-article

    Conference

    SIGMOD/PODS '11
    Sponsor:

    Acceptance Rates

    Overall Acceptance Rate 785 of 4,003 submissions, 20%

    Contributors

    Other Metrics

    Bibliometrics & Citations

    Bibliometrics

    Article Metrics

    • Downloads (Last 12 months)94
    • Downloads (Last 6 weeks)8
    Reflects downloads up to 10 Aug 2024

    Other Metrics

    Citations

    Cited By

    View all
    • (2024)Heterogeneous acceleration design and implementation for join operation in database queryProceedings of the 3rd International Conference on Computer, Artificial Intelligence and Control Engineering10.1145/3672758.3672918(965-970)Online publication date: 26-Jan-2024
    • (2024)SIMDified Data Processing - Foundations, Abstraction, and Advanced TechniquesCompanion of the 2024 International Conference on Management of Data10.1145/3626246.3654694(613-621)Online publication date: 9-Jun-2024
    • (2024)CPU and GPU Hash Joins on Skewed Data2024 IEEE 40th International Conference on Data Engineering Workshops (ICDEW)10.1109/ICDEW61823.2024.00064(402-408)Online publication date: 13-May-2024
    • (2024)Differential-Matching Prefetcher for Indirect Memory Access2024 IEEE International Symposium on High-Performance Computer Architecture (HPCA)10.1109/HPCA57654.2024.00040(439-453)Online publication date: 2-Mar-2024
    • (2023)Analyzing Vectorized Hash Tables across CPU ArchitecturesProceedings of the VLDB Endowment10.14778/3611479.361148516:11(2755-2768)Online publication date: 24-Aug-2023
    • (2023)SDPipe: A Semi-Decentralized Framework for Heterogeneity-Aware Pipeline-parallel TrainingProceedings of the VLDB Endowment10.14778/3598581.359860416:9(2354-2363)Online publication date: 10-Jul-2023
    • (2023)Opportunities for Quantum Acceleration of Databases: Optimization of Queries and Transaction SchedulesProceedings of the VLDB Endowment10.14778/3598581.359860316:9(2344-2353)Online publication date: 1-May-2023
    • (2023)Cracking-Like Join for Trusted Execution EnvironmentsProceedings of the VLDB Endowment10.14778/3598581.359860216:9(2330-2343)Online publication date: 10-Jul-2023
    • (2023)Neighborhood-Based Hypergraph Core DecompositionProceedings of the VLDB Endowment10.14778/3598581.359858216:9(2061-2074)Online publication date: 1-May-2023
    • (2023)The Case for Learned In-Memory JoinsProceedings of the VLDB Endowment10.14778/3587136.358714816:7(1749-1762)Online publication date: 8-May-2023
    • Show More Cited By

    View Options

    Get Access

    Login options

    View options

    PDF

    View or Download as a PDF file.

    PDF

    eReader

    View online with eReader.

    eReader

    Media

    Figures

    Other

    Tables

    Share

    Share

    Share this Publication link

    Share on social media