Dhcache: a dual-hash cache for optimizing the read performance in key-value store

The Journal of Supercomputing

Abstract

Key-value (KV) stores are widely used in data-intensive applications for their high storage performance. However, their caching mechanisms often suffer from read and write pauses. In particular, when old data is accessed periodically, both the cache hit ratio and system throughput decline. To address this performance degradation, we propose a dual-hash caching mechanism called DHCache. First, DHCache introduces a dual-hash structure that alleviates read and write pauses by reducing the frequency of rehash operations on the hash table. Second, DHCache employs a Most Recently Used (MRU) cache replacement policy to retain old data, which improves the cache hit ratio and throughput when old data is accessed periodically. DHCache is deployed within LevelDB, where it shows significant performance advantages. Experimental results indicate that DHCache improves throughput by 11.89–21.92% across various read workloads compared with the traditional LRUCache. Notably, this read performance improvement does not come at the cost of write performance.
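
To illustrate why an MRU policy can help when old data is accessed periodically, the following toy sketch compares LRU and MRU eviction on a cyclic read pattern. It is not DHCache's implementation and says nothing about its dual-hash structure: the class `SimpleCache`, the function `hit_ratio`, and the parameters (capacity 4, 6 keys, 10 rounds) are hypothetical and chosen only for illustration.

```python
from collections import OrderedDict


class SimpleCache:
    """Toy fixed-capacity cache (hypothetical, not DHCache's code).

    policy is "lru" (evict least recently used) or "mru" (evict most
    recently used).
    """

    def __init__(self, capacity, policy):
        self.capacity = capacity
        self.policy = policy
        self.items = OrderedDict()  # ordered from least to most recently used
        self.hits = 0
        self.misses = 0

    def get(self, key):
        if key in self.items:
            # Hit: mark the entry as the most recently used one.
            self.items.move_to_end(key)
            self.hits += 1
            return self.items[key]
        # Miss: make room if needed, then load the value.
        self.misses += 1
        if len(self.items) >= self.capacity:
            # popitem(last=True) removes the newest entry (MRU),
            # popitem(last=False) removes the oldest entry (LRU).
            self.items.popitem(last=(self.policy == "mru"))
        self.items[key] = f"value-{key}"  # stand-in for a read from storage
        return self.items[key]


def hit_ratio(policy, capacity=4, num_keys=6, rounds=10):
    """Read keys 0..num_keys-1 in order, `rounds` times, and return the hit ratio."""
    cache = SimpleCache(capacity, policy)
    for _ in range(rounds):
        for key in range(num_keys):
            cache.get(key)
    return cache.hits / (cache.hits + cache.misses)


if __name__ == "__main__":
    print("LRU hit ratio:", hit_ratio("lru"))
    print("MRU hit ratio:", hit_ratio("mru"))
```

In this assumed workload the working set (6 keys) is larger than the cache (4 entries), so LRU always evicts the key that will be needed soonest and never hits, while MRU keeps a stable set of older keys and serves them on every pass. Real workloads and the policy evaluated in the paper may behave differently.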

Data availability

The datasets generated during and/or analyzed during the current study are available from the corresponding author on reasonable request.

Acknowledgements

This work is supported by the National Natural Science Foundation of China (NSFC) under Grant Nos. 62362057 and 61762075, and by the Key R&D and Transformation Project of Qinghai Province under Grant 2022-SF-165. Jinkang Lu and Meng Lv contributed equally to this work. Ping Xie is the corresponding author of this paper.

Funding

This work was funded by the National Natural Science Foundation of China (NSFC) under Grant Nos. 62362057 and 61762075, and by the Key R&D and Transformation Project of Qinghai Province under Grant 2022-SF-165.

Author information

Corresponding author

Correspondence to Ping Xie.

Ethics declarations

Conflict of interest

This manuscript falls within the scope of engineering and does not involve human or animal research. All authors have given their informed consent.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Lu, J., Lv, M., Li, P. et al. Dhcache: a dual-hash cache for optimizing the read performance in key-value store. J Supercomput 81, 400 (2025). https://doi.org/10.1007/s11227-024-06828-w

  • DOI: https://doi.org/10.1007/s11227-024-06828-w
