DOI: 10.1145/2882903.2903729
research-article

Page As You Go: Piecewise Columnar Access In SAP HANA

Published: 14 June 2016

Abstract

In-memory columnar databases such as SAP HANA achieve extreme performance by means of vector processing over logical units of main-memory-resident columns. The core in-memory algorithms can be challenged when the working set of an application does not fit into main memory. To deal with memory pressure, most in-memory columnar databases evict candidate columns (or tables) using a set of heuristics gleaned from the recent workload. As an alternative approach, we propose to reduce the unit of load and eviction from a whole column to a contiguous portion of the in-memory columnar representation, which we call a page. In this paper, we adapt the core algorithms to operate on partially loaded columns while preserving the performance benefits of vector processing. Our approach has two key advantages. First, partial column loading reduces the mandatory memory footprint of each column, making more memory available for other purposes. Second, partial eviction extends the in-memory lifetime of partially loaded columns. We present a new in-memory columnar implementation for our approach, which we term the page loadable column. We design a new persistency layout and access algorithms for the encoded data vector of the column, the order-preserving dictionary, and the inverted index. We compare the performance attributes of page loadable columns with those of regular in-memory columns and present a use case for page loadable columns for cold data in data aging scenarios. Page loadable columns are completely integrated in SAP HANA, and we present extensive experimental results that quantify the performance overhead and the resource consumption when these columns are deployed.
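The idea of loading and evicting a page rather than a whole column can be illustrated with a small sketch. It is not SAP HANA's implementation: the class `PagedDataVector`, the helper `encode`, the least-recently-used eviction policy, and the tiny page size are all assumptions chosen for illustration, and a Python list stands in for the persisted data vector. The sketch shows a dictionary-encoded column whose value IDs are brought into memory one page at a time, under a fixed budget of resident pages.

```python
from bisect import bisect_left
from collections import OrderedDict


def encode(values):
    """Build an order-preserving dictionary and the encoded data vector."""
    dictionary = sorted(set(values))              # sorted, so IDs preserve value order
    ids = [bisect_left(dictionary, v) for v in values]
    return dictionary, ids


class PagedDataVector:
    """Encoded data vector whose value IDs are loaded one page at a time.

    Hypothetical sketch: `persisted_ids` stands in for the on-disk vector,
    and eviction is plain LRU (an assumption, not the paper's policy).
    """

    def __init__(self, persisted_ids, page_size, max_resident_pages):
        self._persisted = persisted_ids
        self._page_size = page_size
        self._max_resident = max_resident_pages
        self._resident = OrderedDict()            # page number -> value IDs, in LRU order
        self.page_loads = 0                       # counts simulated page reads

    def _page(self, page_no):
        if page_no in self._resident:
            self._resident.move_to_end(page_no)   # mark as most recently used
            return self._resident[page_no]
        start = page_no * self._page_size
        page = list(self._persisted[start:start + self._page_size])
        self.page_loads += 1
        self._resident[page_no] = page
        if len(self._resident) > self._max_resident:
            self._resident.popitem(last=False)    # evict the least recently used page
        return page

    def get(self, row):
        """Point access: touches only the page that holds `row`."""
        page_no, offset = divmod(row, self._page_size)
        return self._page(page_no)[offset]

    def scan_equal(self, value_id):
        """Yield row positions whose value ID equals `value_id`, page by page."""
        n_pages = -(-len(self._persisted) // self._page_size)  # ceiling division
        for page_no in range(n_pages):
            base = page_no * self._page_size
            for offset, vid in enumerate(self._page(page_no)):
                if vid == value_id:
                    yield base + offset

    def resident_pages(self):
        return list(self._resident)


# Example setup: eight rows, pages of three value IDs, at most two pages in memory.
dictionary, ids = encode(["DE", "US", "DE", "FR", "US", "DE", "FR", "US"])
column = PagedDataVector(ids, page_size=3, max_resident_pages=2)
```

A point lookup such as `column.get(4)` loads only the single page that contains row 4, while `column.scan_equal(...)` walks the vector page by page and never holds more than `max_resident_pages` pages at once. That is the trade-off the abstract describes: a smaller mandatory footprint per column in exchange for page loads on access.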

Published In

SIGMOD '16: Proceedings of the 2016 International Conference on Management of Data
June 2016
2300 pages
ISBN: 9781450335317
DOI: 10.1145/2882903
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. data aging
  2. in-memory columnar databases

Qualifiers

  • Research-article

Conference

SIGMOD/PODS'16: International Conference on Management of Data
June 26 - July 1, 2016
San Francisco, California, USA

Acceptance Rates

Overall Acceptance Rate 785 of 4,003 submissions, 20%

Cited By

  • (2022) Cost modelling for optimal data placement in heterogeneous main memory. Proceedings of the VLDB Endowment 15:11 (2867-2880). DOI: 10.14778/3551793.3551837. Online publication date: 29-Sep-2022
  • (2019) In-memory for the masses. Proceedings of the VLDB Endowment 12:12 (2273-2275). DOI: 10.14778/3352063.3352142. Online publication date: 1-Aug-2019
  • (2019) Native store extension for SAP HANA. Proceedings of the VLDB Endowment 12:12 (2047-2058). DOI: 10.14778/3352063.3352123. Online publication date: 1-Aug-2019
  • (2018) Hybrid Data Layouts for Tiered HTAP Databases with Pareto-Optimal Data Placements. 2018 IEEE 34th International Conference on Data Engineering (ICDE) (209-220). DOI: 10.1109/ICDE.2018.00028. Online publication date: Apr-2018
  • (2018) LeanStore: In-Memory Data Management beyond Main Memory. 2018 IEEE 34th International Conference on Data Engineering (ICDE) (185-196). DOI: 10.1109/ICDE.2018.00026. Online publication date: Apr-2018
  • (2017) Statisticum. Proceedings of the VLDB Endowment 10:12 (1658-1669). DOI: 10.14778/3137765.3137772. Online publication date: 1-Aug-2017
  • (2017) The design of an adaptive column-store system. Journal of Big Data 4:1. DOI: 10.1186/s40537-017-0069-4. Online publication date: 23-Mar-2017
