Orchestrating data placement and query execution in heterogeneous CPU-GPU DBMS

Published: 01 July 2022

Abstract

There has been growing interest in using GPUs to accelerate data analytics because of their massive parallelism and high memory bandwidth. The main constraint on using GPUs for data analytics is the limited capacity of GPU memory.
Heterogeneous CPU-GPU query execution is a compelling approach to mitigating the limited GPU memory capacity and PCIe bandwidth. However, the design space of heterogeneous CPU-GPU query execution has not been fully explored. We aim to improve state-of-the-art CPU-GPU data analytics engines by optimizing data placement and heterogeneous query execution. First, we introduce a semantic-aware, fine-grained caching policy that takes into account various aspects of the workload, such as query semantics, data correlation, and query frequency, when determining data placement between CPU and GPU. Second, we introduce a heterogeneous query executor that can fully exploit data in both CPU and GPU memory and coordinate query execution at a fine granularity. We integrate both solutions in Mordred, our novel hybrid CPU-GPU data analytics engine.
Evaluation on the Star Schema Benchmark shows that the semantic-aware caching policy outperforms the best traditional caching policy by up to 3x. Mordred also outperforms existing GPU DBMSs by an order of magnitude.
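
To make the two ideas in the abstract concrete, the following is a minimal sketch, assuming a segment-granular column cache and a simple greedy policy. The names, weights, and split logic are illustrative assumptions, not Mordred's actual algorithms.

```python
# Illustrative sketch only (not Mordred's implementation) of the two ideas in
# the abstract: (1) scoring column segments for GPU placement using query
# frequency, query semantics, and data correlation, and (2) routing each
# segment's scan to the device that holds it. Weights and the greedy strategy
# are assumptions.
from dataclasses import dataclass


@dataclass
class Segment:
    column: str
    segment_id: int
    size_bytes: int
    access_freq: int       # how often recent queries touched this segment
    filter_weight: float   # boost for columns used in selective predicates (query semantics)
    corr_bonus: float      # boost for segments co-accessed with other hot segments (data correlation)


def placement_score(seg: Segment) -> float:
    """Combine workload signals into a single placement priority."""
    return seg.access_freq * (1.0 + seg.filter_weight) + seg.corr_bonus


def choose_gpu_resident(segments: list[Segment], gpu_capacity: int) -> set[tuple[str, int]]:
    """Greedily cache the highest-priority segments that fit in GPU memory."""
    resident, used = set(), 0
    for seg in sorted(segments, key=placement_score, reverse=True):
        if used + seg.size_bytes <= gpu_capacity:
            resident.add((seg.column, seg.segment_id))
            used += seg.size_bytes
    return resident


def split_scan(segments: list[Segment], resident: set[tuple[str, int]]):
    """Fine-grained heterogeneous execution: scan GPU-resident segments on the
    GPU and the remaining segments on the CPU, merging partial results later."""
    gpu_part = [s for s in segments if (s.column, s.segment_id) in resident]
    cpu_part = [s for s in segments if (s.column, s.segment_id) not in resident]
    return gpu_part, cpu_part
```

In this sketch, a scan is split per segment so that GPU-resident data is processed on the GPU while the remainder runs on the CPU, mirroring the fine-grained coordination the abstract describes.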

Published In

Proceedings of the VLDB Endowment, Volume 15, Issue 11
July 2022, 980 pages
ISSN: 2150-8097

Publisher

VLDB Endowment

Publication History

Published: 01 July 2022
Published in PVLDB Volume 15, Issue 11

Qualifiers

  • Research-article

Article Metrics

  • Downloads (Last 12 months): 295
  • Downloads (Last 6 weeks): 6
Reflects downloads up to 08 Feb 2025

Citations

Cited By

  • (2024) Accelerating Merkle Patricia Trie with GPU. Proceedings of the VLDB Endowment 17(8), 1856-1869. https://doi.org/10.14778/3659437.3659443. Online publication date: 1-Apr-2024.
  • (2024) GOLAP: A GPU-in-Data-Path Architecture for High-Speed OLAP. Proceedings of the ACM on Management of Data 2(6), 1-26. https://doi.org/10.1145/3698812. Online publication date: 20-Dec-2024.
  • (2024) How Does Software Prefetching Work on GPU Query Processing? Proceedings of the 20th International Workshop on Data Management on New Hardware, 1-9. https://doi.org/10.1145/3662010.3663445. Online publication date: 10-Jun-2024.
  • (2024) Heterogeneous Intra-Pipeline Device-Parallel Aggregations. Proceedings of the 20th International Workshop on Data Management on New Hardware, 1-10. https://doi.org/10.1145/3662010.3663441. Online publication date: 10-Jun-2024.
  • (2023) An Empirical Evaluation of Columnar Storage Formats. Proceedings of the VLDB Endowment 17(2), 148-161. https://doi.org/10.14778/3626292.3626298. Online publication date: 1-Oct-2023.
  • (2023) Random Forests over normalized data in CPU-GPU DBMSes. Proceedings of the 19th International Workshop on Data Management on New Hardware, 98-101. https://doi.org/10.1145/3592980.3595318. Online publication date: 18-Jun-2023.
  • (2023) Accelerating User-Defined Aggregate Functions (UDAF) with Block-wide Execution and JIT Compilation on GPUs. Proceedings of the 19th International Workshop on Data Management on New Hardware, 19-26. https://doi.org/10.1145/3592980.3595307. Online publication date: 18-Jun-2023.
