research-article

Open access

Morsel-driven parallelism: a NUMA-aware query evaluation framework for the many-core age

Authors:

Thomas Neumann

SIGMOD '14: Proceedings of the 2014 ACM SIGMOD International Conference on Management of Data

Pages 743 - 754

https://doi.org/10.1145/2588555.2610507

Published: 18 June 2014

Abstract

With modern computer architecture evolving, two problems conspire against the state-of-the-art approaches in parallel query execution: (i) to take advantage of many-cores, all query work must be distributed evenly among (soon) hundreds of threads in order to achieve good speedup, yet (ii) dividing the work evenly is difficult even with accurate data statistics due to the complexity of modern out-of-order cores. As a result, the existing approaches for plan-driven parallelism run into load balancing and context-switching bottlenecks, and therefore no longer scale. A third problem faced by many-core architectures is the decentralization of memory controllers, which leads to Non-Uniform Memory Access (NUMA). In response, we present the morsel-driven query execution framework, where scheduling becomes a fine-grained run-time task that is NUMA-aware. Morsel-driven query processing takes small fragments of input data (morsels) and schedules these to worker threads that run entire operator pipelines until the next pipeline breaker. The degree of parallelism is not baked into the plan but can change elastically during query execution, so the dispatcher can react to the execution speed of different morsels and also adjust resources dynamically in response to newly arriving queries in the workload. Further, the dispatcher is aware of the data locality of the NUMA-local morsels and operator state, such that the great majority of executions take place on NUMA-local memory. Our evaluation on the TPC-H and SSB benchmarks shows extremely high absolute performance and an average speedup of over 30 with 32 cores.
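
The scheduling idea in the abstract can be illustrated with a small sketch. It is not the paper's implementation: the names `make_morsels`, `Dispatcher`, `run_pipeline`, and `run_query` are hypothetical, the morsel size is an arbitrary toy value, and NUMA nodes are only simulated as integer labels (Python threads are not pinned to sockets and do not run truly in parallel). The sketch shows a dispatcher that hands out morsels at run time, prefers morsels local to the requesting worker's node, and falls back to taking a morsel from another node when the local queue is empty, while each worker runs a whole pipeline on its morsel.

```python
import threading
from collections import deque

MORSEL_SIZE = 4  # hypothetical toy value chosen for illustration only


def make_morsels(rows, morsel_size=MORSEL_SIZE):
    """Split the input into small fixed-size fragments (morsels)."""
    return [rows[i:i + morsel_size] for i in range(0, len(rows), morsel_size)]


class Dispatcher:
    """Hands out morsels at run time, preferring the caller's own node."""

    def __init__(self, morsels_by_node):
        # One queue per simulated NUMA node (node ids are plain labels here).
        self._queues = {node: deque(ms) for node, ms in morsels_by_node.items()}
        self._lock = threading.Lock()

    def next_morsel(self, node):
        """Return (morsel, is_local), or (None, False) when no work is left."""
        with self._lock:
            if self._queues[node]:
                return self._queues[node].popleft(), True
            # Local queue is empty: take work from any other node instead.
            for other, queue in self._queues.items():
                if other != node and queue:
                    return queue.popleft(), False
        return None, False


def run_pipeline(morsel):
    """A toy pipeline: filter even values, double them, pre-aggregate a sum."""
    return sum(x * 2 for x in morsel if x % 2 == 0)


def run_query(morsels_by_node, workers_per_node=2):
    """Run the toy pipeline over all morsels; return (total, morsels processed)."""
    dispatcher = Dispatcher(morsels_by_node)
    partials = []
    partials_lock = threading.Lock()

    def worker(node):
        acc, count = 0, 0
        while True:
            morsel, _is_local = dispatcher.next_morsel(node)
            if morsel is None:
                break  # no morsels left anywhere
            acc += run_pipeline(morsel)
            count += 1
        # Merging partial results stands in for the pipeline breaker.
        with partials_lock:
            partials.append((acc, count))

    threads = [
        threading.Thread(target=worker, args=(node,))
        for node in morsels_by_node
        for _ in range(workers_per_node)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sum(a for a, _ in partials), sum(c for _, c in partials)


if __name__ == "__main__":
    morsels = make_morsels(list(range(100)))
    # Assumption: morsels are spread over two simulated nodes in alternation.
    total, processed = run_query({0: morsels[::2], 1: morsels[1::2]})
    print(total, processed)
```

Because workers request the next morsel only after finishing the previous one, a slow morsel delays only the thread that holds it, and the number of workers can differ between runs without changing the plan. That is the load-balancing property the abstract attributes to run-time scheduling.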

Index Terms

Morsel-driven parallelism: a NUMA-aware query evaluation framework for the many-core age
1. Information systems
  1. Data management systems
    1. Database management system engines
      1. Database query processing
2. Theory of computation
  1. Theory and algorithms for application domains
    1. Database theory
      1. Database query processing and optimization (theory)

Published In

SIGMOD '14: Proceedings of the 2014 ACM SIGMOD International Conference on Management of Data

June 2014

1645 pages

ISBN:9781450323765

DOI:10.1145/2588555

General Chairs: Curtis Dyreson (Utah State University, USA), Feifei Li (University of Utah, USA)

Program Chair: M. Tamer Özsu (University of Waterloo, Canada)

Copyright © 2014 Owner/Author.

Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

Sponsors

SIGMOD: ACM Special Interest Group on Management of Data

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Conference

SIGMOD/PODS'14

Sponsor:

SIGMOD

SIGMOD/PODS'14: International Conference on Management of Data

June 22 - 27, 2014

Snowbird, Utah, USA

Acceptance Rates

SIGMOD '14 Paper Acceptance Rate 107 of 421 submissions, 25%;

Overall Acceptance Rate 785 of 4,003 submissions, 20%
