DOI: 10.1145/3318464.3384411

Workload-Aware Column Imprints

Published: 31 May 2020

Abstract

In-memory columnar databases use indexes to accelerate highly selective queries. However, the additional storage that indexes require becomes prohibitive when they are kept in memory. For example, an inverted index requires as much space as the column itself. Column Imprints (CI) have been proposed as a space-efficient structure that supports range queries. We examine the limitations of CI and suggest three enhancements for in-memory databases. We propose a workload-aware approach that considers recent data access patterns when constructing CI. We optimize the histogram to reduce false positives and cache misses for highly selective queries. We propose efficient algorithms to construct our data structures. Preliminary experiments confirm that our workload-aware imprints 1) reduce the number of cache lines scanned by 30% to 50% compared to the original CI, and 2) have significantly smaller storage requirements.
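To make the idea concrete, the following is a minimal Python sketch of a column imprint: one small bit vector per cache line, where bit b is set if any value in that line falls into histogram bin b. A range query scans only the cache lines whose imprint shares a bit with the query's bin mask. The names (`build_imprints`, `range_query`, `VALUES_PER_LINE`) and the bin boundaries are assumptions made for illustration. In particular, the "workload-aligned" boundaries below are hand-picked to match a single query, because the abstract does not say how the authors derive them, and the compression of repeated imprints used by the original CI is omitted.

```python
from bisect import bisect_right

# Assumption: 8 values per cache line (e.g. a 64-byte line of 8-byte values).
VALUES_PER_LINE = 8

def bin_of(value, boundaries):
    """Index of the histogram bin holding value.

    boundaries is a sorted list of cut points; bin k covers
    boundaries[k-1] <= value < boundaries[k].
    """
    return bisect_right(boundaries, value)

def build_imprints(column, boundaries):
    """One bit vector per cache line; bit b is set if the line has a value in bin b."""
    imprints = []
    for start in range(0, len(column), VALUES_PER_LINE):
        bits = 0
        for v in column[start:start + VALUES_PER_LINE]:
            bits |= 1 << bin_of(v, boundaries)
        imprints.append(bits)
    return imprints

def query_mask(lo, hi, boundaries):
    """Bits of every bin that overlaps the closed range [lo, hi]."""
    mask = 0
    for b in range(bin_of(lo, boundaries), bin_of(hi, boundaries) + 1):
        mask |= 1 << b
    return mask

def range_query(column, imprints, boundaries, lo, hi):
    """Return (matching row ids, number of cache lines actually scanned)."""
    mask = query_mask(lo, hi, boundaries)
    rows, scanned = [], 0
    for line, bits in enumerate(imprints):
        if (bits & mask) == 0:
            continue  # no value of this line can fall in [lo, hi]; skip it
        scanned += 1
        start = line * VALUES_PER_LINE
        for i, v in enumerate(column[start:start + VALUES_PER_LINE], start):
            if lo <= v <= hi:
                rows.append(i)
    return rows, scanned

column = list(range(64))           # 8 cache lines of 8 values each
equi_width = [16, 32, 48]          # 4 equal-width bins
aligned = [20, 22, 48]             # hypothetical bins aligned to the query [20, 21]

rows_a, scanned_a = range_query(column, build_imprints(column, equi_width), equi_width, 20, 21)
rows_b, scanned_b = range_query(column, build_imprints(column, aligned), aligned, 20, 21)
print(rows_a, scanned_a)  # [20, 21] 2  (one of the two scanned lines is a false positive)
print(rows_b, scanned_b)  # [20, 21] 1
```

Both histograms return the same rows, but the equal-width bins force a scan of a cache line that holds no qualifying value. Placing bin boundaries where recent queries start and end removes that false positive in this toy case, which is the kind of effect a workload-aware histogram aims for.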

Published In

SIGMOD '20: Proceedings of the 2020 ACM SIGMOD International Conference on Management of Data
June 2020
2925 pages
ISBN: 9781450367356
DOI: 10.1145/3318464
Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. column imprints
  2. histogram
  3. in-memory columnar database
  4. range query
  5. workload analysis

Qualifiers

  • Abstract

Conference

SIGMOD/PODS '20
Acceptance Rates

Overall Acceptance Rate 785 of 4,003 submissions, 20%
