Location via proxy:   [ UP ]  
[Report a bug]   [Manage cookies]                
skip to main content
research-article

Efficient processing of window functions in analytical SQL queries

Published: 01 June 2015 Publication History

Abstract

Window functions, also known as analytic OLAP functions, have been part of the SQL standard for more than a decade and are now a widely-used feature. Window functions allow to elegantly express many useful query types including time series analysis, ranking, percentiles, moving averages, and cumulative sums. Formulating such queries in plain SQL-92 is usually both cumbersome and inefficient.
Despite being supported by all major database systems, there have been few publications that describe how to implement an efficient relational window operator. This work aims at filling this gap by presenting an efficient and general algorithm for the window operator. Our algorithm is optimized for high-performance main-memory database systems and has excellent performance on modern multi-core CPUs. We show how to fully parallelize all phases of the operator in order to effectively scale for arbitrary input distributions.

References

[1]
http://docs.oracle.com/database/121/DWHSG/analysis.htm.
[2]
http://msdn.microsoft.com/en-us/library/ms189461(v=sql.120).aspx.
[3]
http://www-01.ibm.com/support/knowledgecenter/SSEPGG_10.5.0/com.ibm.db2.luw.sql.ref.doc/doc/r0023461.html.
[4]
http://help.sap.de/hana/SAP_HANA_SQL_and_System_Views_Reference_en.pdf.
[5]
http://www.postgresql.org/docs/9.4/static/tutorial-window.html.
[6]
https://www.monetdb.org/Documentation/Manuals/SQLreference/WindowFunctions.
[7]
S. Bellamkonda, R. Ahmed, A. Witkowski, A. Amor, M. Zaït, and C. C. Lin. Enhanced subquery optimizations in Oracle. PVLDB, 2(2):1366--1377, 2009.
[8]
S. Bellamkonda, T. Bozkaya, B. Ghosh, A. Gupta, J. Haydu, S. Subramanian, and A. Witkowski. Analytic functions in Oracle 8i. Technical report, Oracle, 2000.
[9]
S. Bellamkonda, H.-G. Li, U. Jagtap, Y. Zhu, V. Liang, and T. Cruanes. Adaptive and big data scale parallel execution in Oracle. PVLDB, 6(11):1102--1113, 2013.
[10]
P. Boncz, T. Grust, M. van Keulen, S. Manegold, J. Rittinger, and J. Teubner. Pathfinder: XQuery - the relational way. In VLDB, pages 1322--1325, 2005.
[11]
P. Boncz, M. Zukowski, and N. Nes. MonetDB/X100: Hyper-pipelining query execution. In CIDR, pages 225--237, 2005.
[12]
Y. Cao, C.-Y. Chan, J. Li, and K.-L. Tan. Optimization of analytic window functions. PVLDB, 5(11):1244--1255, 2012.
[13]
V. Harinarayan, A. Rajaraman, and J. D. Ullman. Implementing data cubes efficiently. In SIGMOD, pages 205--216, 1996.
[14]
D. Inkster, M. Zukowski, and P. Boncz. Integration of VectorWise with Ingres. SIGMOD Record, 40(3):45--53, 2011.
[15]
A. Kemper and T. Neumann. HyPer: A hybrid OLTP&OLAP main memory database system based on virtual memory snapshots. In ICDE, pages 195--206, 2011.
[16]
M. Kornacker, A. Behm, V. B. T. Bobrovytsky, C. Ching, A. Choi, J. Erickson, M. Grund, D. Hecht, M. Jacobs, I. Joshi, L. Kuff, D. Kumar, A. Leblang, N. Li, I. Pandis, H. Robinson, D. Rorke, S. Rus, J. Russell, D. Tsirogiannis, S. Wanderman-Milne, and M. Yoder. Impala: A modern, open-source SQL engine for Hadoop. In CIDR, 2015.
[17]
V. Leis, P. Boncz, A. Kemper, and T. Neumann. Morsel-driven parallelism: A NUMA-aware query evaluation framework for the many-core age. In SIGMOD, pages 743--754, 2014.
[18]
R. O. Nambiar and M. Poess. The making of TPC-DS. In VLDB, pages 1049--1058, 2006.
[19]
T. Neumann. Efficiently compiling efficient query plans for modern hardware. PVLDB, 4:539--550, 2011.
[20]
T. Neumann and A. Kemper. Unnesting arbitrary queries. In BTW, pages 383--402, 2015.
[21]
T. Neumann and V. Leis. Compiling database queries into machine code. IEEE Data Eng. Bull., 37(1):3--11, 2014.
[22]
F. Putze, P. Sanders, and J. Singler. MCSTL: the multi-core standard template library. In PPOPP, pages 144--145, 2007.
[23]
O. Shalev and N. Shavit. Split-ordered lists: Lock-free extensible hash tables. J. ACM, 53(3):379--405, 2006.
[24]
J. Yang and J. Widom. Incremental computation and maintenance of temporal aggregates. In ICDE, pages 51--60, 2001.
[25]
F. Zemke. What's new in SQL:2011. SIGMOD Record, 41(1):67--73, 2012.
[26]
C. Zuzarte, H. Pirahesh, W. Ma, Q. Cheng, L. Liu, and K. Wong. WinMagic: Subquery elimination using window aggregation. In SIGMOD, pages 652--656, 2003.

Cited By

View all
  • (2024)Window Function Expression: Let the Self-Join EnterProceedings of the VLDB Endowment10.14778/3665844.366584817:9(2162-2174)Online publication date: 6-Aug-2024
  • (2024)High-Performance Query Processing with NVMe Arrays: Spilling without Killing PerformanceProceedings of the ACM on Management of Data10.1145/36988132:6(1-27)Online publication date: 20-Dec-2024
  • (2024)Apache Arrow DataFusion: A Fast, Embeddable, Modular Analytic Query EngineCompanion of the 2024 International Conference on Management of Data10.1145/3626246.3653368(5-17)Online publication date: 9-Jun-2024
  • Show More Cited By

Recommendations

Comments

Information & Contributors

Information

Published In

cover image Proceedings of the VLDB Endowment
Proceedings of the VLDB Endowment  Volume 8, Issue 10
June 2015
168 pages
ISSN:2150-8097
Issue’s Table of Contents

Publisher

VLDB Endowment

Publication History

Published: 01 June 2015
Published in PVLDB Volume 8, Issue 10

Qualifiers

  • Research-article

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)35
  • Downloads (Last 6 weeks)1
Reflects downloads up to 25 Feb 2025

Other Metrics

Citations

Cited By

View all
  • (2024)Window Function Expression: Let the Self-Join EnterProceedings of the VLDB Endowment10.14778/3665844.366584817:9(2162-2174)Online publication date: 6-Aug-2024
  • (2024)High-Performance Query Processing with NVMe Arrays: Spilling without Killing PerformanceProceedings of the ACM on Management of Data10.1145/36988132:6(1-27)Online publication date: 20-Dec-2024
  • (2024)Apache Arrow DataFusion: A Fast, Embeddable, Modular Analytic Query EngineCompanion of the 2024 International Conference on Management of Data10.1145/3626246.3653368(5-17)Online publication date: 9-Jun-2024
  • (2023)T-Rex: Optimizing Pattern Search on Time SeriesProceedings of the ACM on Management of Data10.1145/35892751:2(1-26)Online publication date: 20-Jun-2023
  • (2023)Bringing Data Analysis to the Files and the Database to the Command Line2023 Congress in Computer Science, Computer Engineering, & Applied Computing (CSCE)10.1109/CSCE60160.2023.00246(1490-1497)Online publication date: 24-Jul-2023
  • (2023)A new window Clause for SQL++The VLDB Journal10.1007/s00778-023-00830-z33:3(595-623)Online publication date: 19-Dec-2023
  • (2022)Efficient Evaluation of Arbitrarily-Framed Holistic SQL Aggregates and Window FunctionsProceedings of the 2022 International Conference on Management of Data10.1145/3514221.3526184(1243-1256)Online publication date: 10-Jun-2022
  • (2021)Not black-box anymore!Proceedings of the VLDB Endowment10.14778/3476311.347637514:12(2959-2971)Online publication date: 28-Oct-2021
  • (2021)Optimizing in-memory database engine for AI-powered on-line decision augmentation using persistent memoryProceedings of the VLDB Endowment10.14778/3446095.344610214:5(799-812)Online publication date: 1-Jan-2021
  • (2021)Building Advanced SQL Analytics From Low-Level Plan OperatorsProceedings of the 2021 International Conference on Management of Data10.1145/3448016.3457288(1001-1013)Online publication date: 9-Jun-2021
  • Show More Cited By

View Options

Login options

Full Access

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Figures

Tables

Media

Share

Share

Share this Publication link

Share on social media