Using Vectorized Execution to Improve SQL Query Performance on Spark

Authors:

Dejun Jiang

ICPP '21: Proceedings of the 50th International Conference on Parallel Processing

Article No.: 57, Pages 1 - 11

https://doi.org/10.1145/3472456.3472495

Published: 05 October 2021

Abstract

MapReduce-based SQL processing frameworks, such as Hive and Spark SQL, are widely used to support big data analytics. Currently, these systems mainly adopt the record-at-a-time execution model, which uses the CPU inefficiently. In contrast, vectorized execution makes better use of the CPU cache by processing a batch of records at a time. However, simply applying vectorized execution to MapReduce-based frameworks results in an inefficient vectorized shuffle. Moreover, existing vectorized execution approaches do not make full use of the CPU cache for complex operators (e.g., Sort and Aggregation). In this paper, we present VEE, a thoroughly vectorized execution engine designed for SQL query processing on Spark. First, VEE designs a compact in-memory data layout and serialization-aware assembling for vectorized shuffle to expedite shuffle execution, since both reduce the shuffle data footprint and the related computation. Second, VEE applies in-memory record batch rearrangement for Sort and Aggregation to greatly reduce random memory access and increase query performance. Third, VEE carefully chooses an operator-aware batch length for each operator, which improves CPU cache utilization and increases query performance. We conduct extensive performance evaluations. The experimental results show that VEE speeds up Spark by up to 72.7%, and by 25.0% on average, for OLAP workloads (TPC-H). The vectorized execution techniques in VEE are also applicable to other MapReduce-based data analytics frameworks to improve their query performance.
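The contrast between the two execution models named in the abstract can be illustrated with a minimal sketch. It is not VEE's implementation and does not reproduce its data layout or shuffle design; the operator names (`scan_rows`, `filter_batches`, and so on), the `price` and `qty` columns, and the batch length are all hypothetical. Plain Python also cannot show the cache effects the abstract refers to, so the sketch only shows the structural difference: the record-at-a-time pipeline passes a single row between operators on every step, while the batch pipeline passes a set of column arrays and runs a tight loop over each batch.

```python
from typing import Dict, Iterable, Iterator, List

Row = Dict[str, float]            # one record: column name -> value
Batch = Dict[str, List[float]]    # one record batch: column name -> column values


# Record-at-a-time model: every operator hands exactly one row to its parent.
def scan_rows(rows: List[Row]) -> Iterator[Row]:
    for row in rows:
        yield row


def filter_rows(child: Iterable[Row], min_price: float) -> Iterator[Row]:
    for row in child:
        if row["price"] >= min_price:
            yield row


def sum_rows(child: Iterable[Row]) -> float:
    total = 0.0
    for row in child:
        total += row["price"] * row["qty"]
    return total


# Batch-at-a-time model: operators exchange column-oriented record batches.
def scan_batches(rows: List[Row], batch_len: int) -> Iterator[Batch]:
    for start in range(0, len(rows), batch_len):
        chunk = rows[start:start + batch_len]
        yield {
            "price": [r["price"] for r in chunk],
            "qty": [r["qty"] for r in chunk],
        }


def filter_batches(child: Iterable[Batch], min_price: float) -> Iterator[Batch]:
    for batch in child:
        # Evaluate the predicate over the whole column, then gather survivors.
        keep = [i for i, p in enumerate(batch["price"]) if p >= min_price]
        if keep:
            yield {name: [col[i] for i in keep] for name, col in batch.items()}


def sum_batches(child: Iterable[Batch]) -> float:
    total = 0.0
    for batch in child:
        total += sum(p * q for p, q in zip(batch["price"], batch["qty"]))
    return total


# Both pipelines compute SUM(price * qty) WHERE price >= 5 over the same data.
rows = [{"price": float(i), "qty": 2.0} for i in range(10)]
by_row = sum_rows(filter_rows(scan_rows(rows), 5.0))
by_batch = sum_batches(filter_batches(scan_batches(rows, batch_len=4), 5.0))
assert by_row == by_batch
```

In this sketch `batch_len` is a single fixed parameter; the operator-aware batch length described in the abstract would correspond to choosing it separately for each operator.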


Cited By

Lu Y, Zhang Z, Zheng W (2025). BⓈX: Subgraph Matching with Batch Backtracking Search. Proceedings of the ACM on Management of Data 3:1 (1-27). doi:10.1145/3709665. Online publication date: 11-Feb-2025.
https://dl.acm.org/doi/10.1145/3709665
Al-Sayeh H, Jibril M, Sattler K (2024). Agile-Ant: Self-Managing Distributed Cache Management for Cost Optimization of Big Data Applications. Proceedings of the VLDB Endowment 17:11 (3151-3164). doi:10.14778/3681954.3681990. Online publication date: 1-Jul-2024.
https://dl.acm.org/doi/10.14778/3681954.3681990
Intasorn Y, Rattanaopas K, Chuchuen Y (2022). Using compression tables to improve HiveQL Performance with Spark A Case study on NVMe Storage Devices. 2022 26th International Computer Science and Engineering Conference (ICSEC) (90-93). doi:10.1109/ICSEC56337.2022.10049309. Online publication date: 21-Dec-2022.
https://doi.org/10.1109/ICSEC56337.2022.10049309

Published In

ICPP '21: Proceedings of the 50th International Conference on Parallel Processing

August 2021

927 pages

ISBN:9781450390682

DOI:10.1145/3472456

Copyright © 2021 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article
Research
Refereed limited

Conference

ICPP 2021

ICPP 2021: 50th International Conference on Parallel Processing

August 9 - 12, 2021

Lemont, IL, USA

Acceptance Rates

Overall Acceptance Rate 91 of 313 submissions, 29%
