
Towards multi-way join aware optimizer in SAP HANA

Published: 01 August 2020

Abstract

Existing binary join based plans may be suboptimal for important, emerging applications. Typical query optimizers enumerate plans using binary joins only. In this paper, we introduce the multi-way join aware optimizer in SAP HANA. The naive way to extend the existing query optimizer to be aware of multi-way joins (m-way joins for short) is to enumerate m-way joins on top of a traditional binary join enumeration framework. However, many different binary joins correspond to the same m-way join. Thus, unnecessary join enumerations would be required for such naive integration. To solve this problem, we introduce the new concept of an m-way join unit and explain how the construction of join units is plugged into the SAP HANA query optimizer. We also provide a series of optimizer enhancements by exploiting m-way join unit characteristics. Using TPC-H and our customer workloads, we showcase the superiority of our m-way join aware optimizer.
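
The redundancy the abstract describes can be illustrated with a small counting sketch. This is not the paper's algorithm or the SAP HANA optimizer: it is a toy enumeration that assumes every relation can be joined with every other one (for example, all on a common key), so that any binary join order is valid. The function names `binary_join_trees` and `flatten` and the relation names `R`, `S`, `T` are hypothetical.

```python
from itertools import combinations


def binary_join_trees(relations):
    """Yield every binary join tree over `relations` as nested 2-tuples.

    Left and right inputs are treated as distinct, so (R, S) and (S, R)
    count as two different plans. Assumes any pair may be joined.
    """
    relations = tuple(relations)
    if len(relations) == 1:
        yield relations[0]
        return
    for size in range(1, len(relations)):
        for left in combinations(relations, size):
            right = tuple(r for r in relations if r not in left)
            for left_tree in binary_join_trees(left):
                for right_tree in binary_join_trees(right):
                    yield (left_tree, right_tree)


def flatten(tree):
    """Collapse a binary join tree into the set of relations it joins,
    i.e. the single m-way join that the tree corresponds to."""
    if isinstance(tree, str):
        return frozenset([tree])
    left, right = tree
    return flatten(left) | flatten(right)


trees = list(binary_join_trees(["R", "S", "T"]))
m_way_joins = {flatten(t) for t in trees}

print(len(trees))        # number of distinct binary join trees
print(len(m_way_joins))  # number of distinct m-way joins they flatten to
```

Under these assumptions, three relations already yield 12 binary join trees that all collapse to one 3-way join, and four relations yield 120. A binary enumeration framework that derives m-way joins afterwards would visit each of these trees, which is the kind of repeated work the m-way join unit is introduced to avoid.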

Published In

Proceedings of the VLDB Endowment  Volume 13, Issue 12
August 2020
1710 pages
ISSN:2150-8097

Publisher

VLDB Endowment
