Abstract
Key-value (KV) stores are widely used in data-intensive applications for their high storage performance. However, their caching mechanisms often suffer from read and write pauses, and cache hit ratios and system throughput decline, particularly when old data is accessed periodically. To address this performance degradation, we propose a dual-hash caching mechanism called DHCache. First, DHCache introduces a dual-hash structure that alleviates read and write pauses by reducing the frequency of rehash operations on the hash table. Second, DHCache employs a Most Recently Used (MRU) cache replacement policy to retain old data, which improves cache hit ratios and throughput when old data is accessed periodically. We deploy DHCache within LevelDB, where it demonstrates significant performance advantages. Experimental results indicate that DHCache improves throughput by 11.89–21.92% across various read workloads compared with the traditional LRUCache. Notably, this read performance improvement does not come at the cost of write performance.
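The intuition behind preferring MRU over LRU for periodically revisited old data can be illustrated with a small simulation. The sketch below is not the DHCache implementation and does not model the dual-hash structure; the class `SimpleCache`, the function `hit_ratio`, and the workload (a cyclic scan over one more key than the cache can hold) are hypothetical choices made only for illustration.

```python
from collections import OrderedDict


class SimpleCache:
    """Fixed-capacity cache with an "lru" or "mru" eviction policy (illustrative only)."""

    def __init__(self, capacity, policy):
        self.capacity = capacity
        self.policy = policy
        self.entries = OrderedDict()  # ordered from least to most recently used
        self.hits = 0
        self.misses = 0

    def access(self, key):
        if key in self.entries:
            self.hits += 1
            self.entries.move_to_end(key)  # mark as most recently used
            return
        self.misses += 1
        if len(self.entries) >= self.capacity:
            # LRU evicts the oldest entry (front); MRU evicts the newest entry (back).
            self.entries.popitem(last=(self.policy == "mru"))
        self.entries[key] = True


def hit_ratio(policy, capacity=4, num_keys=5, rounds=100):
    """Replay a cyclic scan over num_keys keys and return the resulting hit ratio."""
    cache = SimpleCache(capacity, policy)
    for _ in range(rounds):
        for key in range(num_keys):
            cache.access(key)
    return cache.hits / (cache.hits + cache.misses)


if __name__ == "__main__":
    print("LRU hit ratio:", hit_ratio("lru"))
    print("MRU hit ratio:", hit_ratio("mru"))
```

Under this assumed workload, LRU always evicts exactly the key that will be requested next, so it never hits, whereas MRU keeps most of the older keys resident and hits on roughly three quarters of the accesses. Real workloads are less regular, so the gap reported in the paper's experiments is naturally smaller.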
Data availability
The datasets generated during and/or analyzed during the current study are available from the corresponding author on reasonable request.
Acknowledgements
This work is supported by the National Natural Science Foundation of China (NSFC) under Grant Nos. 62362057 and 61762075, and by the Key R&D and Transformation Project of Qinghai Province under Grant 2022-SF-165. Jinkang Lu and Meng Lv contributed equally to this work. Ping Xie is the corresponding author of this paper.
Funding
This work was funded by the National Natural Science Foundation of China (NSFC) under Grant Nos. 62362057 and 61762075, and by the Key R&D and Transformation Project of Qinghai Province under Grant 2022-SF-165.
Ethics declarations
Conflict of interest
This manuscript falls within the scope of engineering and does not involve human or animal research. All authors of this manuscript have given their informed consent.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Lu, J., Lv, M., Li, P. et al. Dhcache: a dual-hash cache for optimizing the read performance in key-value store. J Supercomput 81, 400 (2025). https://doi.org/10.1007/s11227-024-06828-w
DOI: https://doi.org/10.1007/s11227-024-06828-w