research-article
Open access

Structural Designs Meet Optimality: Exploring Optimized LSM-tree Structures in a Colossal Configuration Space

Published: 30 May 2024

Abstract

Mainstream LSM-tree-based key-value stores struggle to optimize point lookup, range lookup, and update performance at the same time because their configurations are constrained. They typically follow fixed patterns to specify the level capacity and the number of sorted runs per level. This confines their designs to a restricted space, limiting opportunities for broader optimizations.
To address this challenge, we consider a more flexible configuration that enables independent adjustment of the number of runs per level, the size ratio, and the Bloom filter settings at each LSM-tree level. By carefully analyzing the cost of each operation in the new design space, we unveil two critical insights for optimizing the tradeoff among the three operations. First, achieving efficient point lookup requires a large last level. Second, there is a specific correlation between the number of runs per level and the size ratio that is advantageous for overall update and range lookup performance.
Based on these insights, we introduce Moose, a structure that delivers strong overall performance for point lookup, range lookup, and update concurrently. We also introduce a new framework, Smoose, that navigates the design space to adapt to specific workloads. We implemented Moose and Smoose on top of RocksDB, and experimental results demonstrate that our proposed approach outperforms state-of-the-art LSM-tree structures across diverse workloads.
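
To make the per-level design space in the abstract concrete, the following is a minimal Python sketch of a configuration in which each level carries its own size ratio, run limit, and Bloom filter budget. It is not the paper's cost model. The names `LevelConfig`, `level_capacities`, `zero_result_lookup_cost`, `range_lookup_seeks`, and `write_amplification` are hypothetical, and the formulas are coarse estimates in the style of classic LSM-tree analyses (standard Bloom filter false-positive rate, one seek per run for range lookups, roughly size ratio divided by runs rewrites per level).

```python
from dataclasses import dataclass
from math import exp, log


@dataclass
class LevelConfig:
    """Hypothetical per-level knobs; all three can differ between levels."""
    size_ratio: int      # capacity of this level relative to the previous one
    max_runs: int        # sorted runs tolerated in this level before merging
    bits_per_key: float  # Bloom filter memory budget for this level


def level_capacities(buffer_entries, levels):
    """Capacity (in entries) of each level, growing by its own size ratio."""
    capacities = []
    capacity = buffer_entries
    for level in levels:
        capacity *= level.size_ratio
        capacities.append(capacity)
    return capacities


def bloom_fpr(bits_per_key):
    """Standard Bloom filter false-positive rate with optimal hash count."""
    return exp(-bits_per_key * log(2) ** 2)


def zero_result_lookup_cost(levels):
    """Expected I/Os for a lookup of a missing key: one per false positive."""
    return sum(l.max_runs * bloom_fpr(l.bits_per_key) for l in levels)


def range_lookup_seeks(levels):
    """Seeks for a short range lookup: every run must be probed once."""
    return sum(l.max_runs for l in levels)


def write_amplification(levels):
    """Coarse estimate (assumption): ~size_ratio / max_runs rewrites per level."""
    return sum(l.size_ratio / l.max_runs for l in levels)


# Example: a small, tiering-like first level and a large, leveled last level.
levels = [LevelConfig(4, 4, 10.0), LevelConfig(10, 1, 10.0)]
print(level_capacities(1000, levels))
print(range_lookup_seeks(levels), write_amplification(levels))
print(zero_result_lookup_cost(levels))
```

Under these assumptions, raising `max_runs` at a level lowers the update estimate but raises both lookup estimates, which is the three-way tradeoff the abstract refers to.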

Cited By

  • (2024) Oasis: An Optimal Disjoint Segmented Learned Range Filter. Proceedings of the VLDB Endowment 17:8 (1911-1924). DOI: 10.14778/3659437.3659447. Online publication date: 31-May-2024
  • (2024) CAMAL: Optimizing LSM-trees via Active Learning. Proceedings of the ACM on Management of Data 2:4 (1-26). DOI: 10.1145/3677138. Online publication date: 30-Sep-2024

    Published In

    Proceedings of the ACM on Management of Data  Volume 2, Issue 3
    SIGMOD
    June 2024
    1953 pages
    EISSN:2836-6573
    DOI:10.1145/3670010
    Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 30 May 2024
    Published in PACMMOD Volume 2, Issue 3

    Author Tags

    1. LSM-tree
    2. data structure
    3. optimization

    Qualifiers

    • Research-article

    Article Metrics

    • Downloads (last 12 months): 258
    • Downloads (last 6 weeks): 56
    Reflects downloads up to 03 Oct 2024
