Research article (open access)
DOI: 10.1145/3655038.3665955

Advocating for Key-Value Stores with Workload Pattern Aware Dynamic Compaction

Published: 08 July 2024

Abstract

In practice, the ratio of write to read operations in key-value (KV) store workloads usually changes over time. In this paper, we present a Dynamic wOrkload Pattern Aware LSM-based KV store (DOPA-DB), which supports dynamic compaction strategies depending on the workload pattern. In particular, DOPA-DB is a tiered LSM-based KV store with multiple key ranges, which enables varying compaction sizes. For write-intensive workloads, DOPA-DB can minimize write stalls while also minimizing compaction overhead; for read-intensive workloads, it can perform compaction aggressively to reduce the number of file accesses. Our preliminary experimental results show the potential benefits of dynamic compaction and provide insight into research directions for dynamic compaction strategies.
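
The idea of switching compaction behavior with the workload pattern can be illustrated with a minimal sketch. This is not DOPA-DB's implementation: the class and function names, the sliding-window size, the 0.7/0.3 write-ratio thresholds, and the run-count triggers are all hypothetical values chosen for illustration. The sketch only shows the general shape of the decision, which is to measure the recent write ratio and then compact lazily when writes dominate and eagerly when reads dominate.

```python
from collections import deque
from enum import Enum


class CompactionMode(Enum):
    LAZY = "lazy"              # write-heavy: defer merging to limit stalls and overhead
    BALANCED = "balanced"      # mixed workload
    AGGRESSIVE = "aggressive"  # read-heavy: merge eagerly to reduce file accesses


class WorkloadMonitor:
    """Tracks the write ratio over the most recent `window` operations.

    The window size and thresholds are assumptions, not values from the paper.
    """

    def __init__(self, window=1000, write_heavy=0.7, read_heavy=0.3):
        self.ops = deque(maxlen=window)  # True = write, False = read
        self.write_heavy = write_heavy
        self.read_heavy = read_heavy

    def record(self, is_write):
        self.ops.append(bool(is_write))

    def write_ratio(self):
        # With no history yet, report a neutral ratio.
        return sum(self.ops) / len(self.ops) if self.ops else 0.5

    def mode(self):
        ratio = self.write_ratio()
        if ratio >= self.write_heavy:
            return CompactionMode.LAZY
        if ratio <= self.read_heavy:
            return CompactionMode.AGGRESSIVE
        return CompactionMode.BALANCED


# Hypothetical trigger: minimum number of sorted runs in a key range
# before a merge is scheduled under each mode.
MERGE_TRIGGER = {
    CompactionMode.LAZY: 8,
    CompactionMode.BALANCED: 4,
    CompactionMode.AGGRESSIVE: 2,
}


def runs_to_merge(mode, sorted_runs):
    """Return how many sorted runs of a key range to merge now (0 = skip)."""
    return sorted_runs if sorted_runs >= MERGE_TRIGGER[mode] else 0


if __name__ == "__main__":
    monitor = WorkloadMonitor(window=10)
    for _ in range(9):
        monitor.record(True)   # burst of writes
    monitor.record(False)
    print(monitor.mode(), runs_to_merge(monitor.mode(), 4))
    for _ in range(10):
        monitor.record(False)  # workload shifts to reads
    print(monitor.mode(), runs_to_merge(monitor.mode(), 4))
```

In this sketch the same key range with four sorted runs is left alone during the write burst and merged once the window fills with reads, which mirrors the trade-off the abstract describes between write stalls and the number of files a read must touch.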


Cited By

  • (2024) OnionDisk: A Log-Structured Write-Optimal Virtual Block Device. Proceedings of the 15th ACM SIGOPS Asia-Pacific Workshop on Systems, 136-143. DOI: 10.1145/3678015.3680489. Online publication date: 4-Sep-2024

    Published In

    HotStorage '24: Proceedings of the 16th ACM Workshop on Hot Topics in Storage and File Systems
    July 2024
    141 pages
    ISBN: 9798400706301
    DOI: 10.1145/3655038
    This work is licensed under a Creative Commons Attribution-NoDerivatives 4.0 International License.

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. dynamic workload
    2. key-value store
    3. log-structured merge-tree

    Qualifiers

    • Research-article
    • Research
    • Refereed limited

    Conference

    HOTSTORAGE '24
    Acceptance Rates

    Overall Acceptance Rate 34 of 87 submissions, 39%

    Bibliometrics

    Article Metrics

    • Downloads (last 12 months): 472
    • Downloads (last 6 weeks): 75
    Reflects downloads up to 27 Jan 2025
