DOI: 10.1145/3318464.3384411

Workload-Aware Column Imprints

Published: 31 May 2020

    Abstract

    In-memory columnar databases use indexes to accelerate highly selective queries. The additional storage that indexes require becomes prohibitive when the indexes are kept in memory. For example, an inverted index requires as much space as the column itself. Column Imprints (CI) have been proposed as a space-efficient structure that supports range queries. We examine the limitations of CI and suggest three enhancements for in-memory databases. We propose a workload-aware approach that considers recent data access patterns when constructing CI. We optimize the histogram to reduce false positives and cache misses for highly selective queries. We propose efficient algorithms to construct our data structures. Preliminary experiments confirm that our workload-aware imprints 1) reduce the number of cache lines scanned by 30% to 50% compared to the original CI, and 2) have significantly smaller storage requirements.
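
    To make the terms above concrete (histogram, cache lines scanned, false positives), the following is a minimal sketch of the basic column-imprint idea, not of the workload-aware variant this abstract proposes. It assumes 16 values per cache line and hand-picked histogram borders; the names `build_imprints`, `range_query`, and `bin_of` are hypothetical, and the compression of repeated imprints used by the original CI is omitted.

```python
from bisect import bisect_right

VALUES_PER_LINE = 16  # assumption: a 64-byte cache line holding 4-byte values


def bin_of(value, borders):
    """Index of the histogram bin containing value (borders sorted ascending)."""
    return bisect_right(borders, value)


def build_imprints(column, borders, per_line=VALUES_PER_LINE):
    """Build one bit vector per cache line; bit b is set if a value falls in bin b."""
    imprints = []
    for start in range(0, len(column), per_line):
        bits = 0
        for v in column[start:start + per_line]:
            bits |= 1 << bin_of(v, borders)
        imprints.append(bits)
    return imprints


def range_query(column, imprints, borders, low, high, per_line=VALUES_PER_LINE):
    """Return (matching positions, cache lines scanned) for low <= v <= high."""
    lo_bin, hi_bin = bin_of(low, borders), bin_of(high, borders)
    # Mask with bits lo_bin..hi_bin set: every bin that can hold a qualifying value.
    mask = ((1 << (hi_bin + 1)) - 1) & ~((1 << lo_bin) - 1)
    hits, scanned = [], 0
    for line, bits in enumerate(imprints):
        if (bits & mask) == 0:
            continue  # no value in this cache line can qualify, so skip it
        scanned += 1  # the line may still be a false positive
        start = line * per_line
        for offset, v in enumerate(column[start:start + per_line]):
            if low <= v <= high:
                hits.append(start + offset)
    return hits, scanned
```

    In this sketch a cache line is a false positive when it is scanned but contributes no hits, which happens when a bin is wider than the query range. Narrowing the bins around frequently queried ranges reduces such scans, which is the intuition behind choosing the histogram with the workload in mind.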

    Cited By

    • (2024) Workload Prediction for Edge Computing. Proceedings of the 25th International Conference on Distributed Computing and Networking, pp. 286-291. DOI: 10.1145/3631461.3632522. Online publication date: 4-Jan-2024
    • (2022) A Lightweight General Adaptive Optimization Tool for Relational DBMSs under HTAP Workloads. 2022 IEEE International Conference on Services Computing (SCC), pp. 45-53. DOI: 10.1109/SCC55611.2022.00020. Online publication date: Jul-2022

    Published In

    SIGMOD '20: Proceedings of the 2020 ACM SIGMOD International Conference on Management of Data
    June 2020
    2925 pages
    ISBN:9781450367356
    DOI:10.1145/3318464
    Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. column imprints
    2. histogram
    3. in-memory columnar database
    4. range query
    5. workload analysis

    Qualifiers

    • Abstract

    Conference

    SIGMOD/PODS '20
    Acceptance Rates

    Overall Acceptance Rate 785 of 4,003 submissions, 20%
