LHD: improving cache hit rate by maximizing hit density

Authors: Nathan Beckmann, Asaf Cidon

NSDI'18: Proceedings of the 15th USENIX Conference on Networked Systems Design and Implementation

Pages 389 - 403

Published: 09 April 2018

Abstract

Cloud application performance is heavily reliant on the hit rate of datacenter key-value caches. Key-value caches typically use least recently used (LRU) as their eviction policy, but LRU's hit rate is far from optimal under real workloads. Prior research has proposed many eviction policies that improve on LRU, but these policies make restrictive assumptions that hurt their hit rate, and they can be difficult to implement efficiently.

We introduce least hit density (LHD), a novel eviction policy for key-value caches. LHD predicts each object's expected hits-per-space-consumed (hit density) and filters out objects that contribute little to the cache's hit rate. Unlike prior eviction policies, LHD does not rely on heuristics; it rigorously models objects' behavior using conditional probability and adapts in real time.
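As a rough illustration of the hits-per-space-consumed idea, the sketch below ranks objects by an estimated hit density and selects the lowest-ranked one as the eviction candidate. The `CachedObject` fields, the formula in `hit_density` (estimated hit probability divided by size times expected lifetime), and the example numbers are all assumptions made for illustration; the abstract does not say how LHD derives these estimates.

```python
from dataclasses import dataclass


@dataclass
class CachedObject:
    key: str
    size_bytes: int           # space the object occupies
    hit_probability: float    # assumed estimate: chance of a hit before eviction
    expected_lifetime: float  # assumed estimate: time the object stays cached


def hit_density(obj: CachedObject) -> float:
    """Expected hits per unit of space-time consumed (illustrative formula)."""
    return obj.hit_probability / (obj.size_bytes * obj.expected_lifetime)


def eviction_candidate(objects):
    """Return the object with the least hit density."""
    return min(objects, key=hit_density)


# Hypothetical objects: two equally popular ones of different sizes, one unpopular.
objects = [
    CachedObject("small_hot", size_bytes=100, hit_probability=0.9, expected_lifetime=10.0),
    CachedObject("large_hot", size_bytes=10_000, hit_probability=0.9, expected_lifetime=10.0),
    CachedObject("small_cold", size_bytes=100, hit_probability=0.1, expected_lifetime=10.0),
]
victim = eviction_candidate(objects)
```

In this toy input the large object is selected even though it is as likely to hit as the small popular one, because it consumes 100× more space for the same expected hits.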

To make LHD practical, we design and implement RankCache, an efficient key-value cache based on memcached. We evaluate RankCache and LHD on commercial memcached and enterprise storage traces, where LHD consistently achieves better hit rates than prior policies. LHD requires much less space than prior policies to match their hit rate, on average 8× less than LRU and 2-3× less than recently proposed policies. Moreover, RankCache requires no synchronization in the common case, improving request throughput at 16 threads by 8× over LRU and by 2× over CLOCK.
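One general way to evict by rank without maintaining a globally ordered structure such as an LRU list is to sample a few resident objects and evict the worst-ranked one. The sketch below shows that generic technique only; it is not taken from RankCache, and `SampledCache`, `rank_fn`, and `sample_size` are hypothetical names introduced here.

```python
import random


class SampledCache:
    """Toy cache that evicts the lowest-ranked object among a random sample."""

    def __init__(self, capacity, rank_fn, sample_size=8, seed=0):
        self.capacity = capacity        # maximum number of objects
        self.rank_fn = rank_fn          # rank_fn(key, value) -> number; lower is evicted first
        self.sample_size = sample_size  # how many candidates to compare per eviction
        self.rng = random.Random(seed)
        self.store = {}

    def get(self, key):
        # Reads touch no shared ordering structure.
        return self.store.get(key)

    def put(self, key, value):
        if key not in self.store and len(self.store) >= self.capacity:
            self._evict()
        self.store[key] = value

    def _evict(self):
        # Copying the keys is O(n); a real cache would sample slots directly.
        keys = list(self.store)
        sample = self.rng.sample(keys, min(self.sample_size, len(keys)))
        victim = min(sample, key=lambda k: self.rank_fn(k, self.store[k]))
        del self.store[victim]


# Usage: rank objects by their stored score; the sample covers the whole cache here.
cache = SampledCache(capacity=3, rank_fn=lambda key, value: value, sample_size=3)
for key, score in [("a", 5), ("b", 1), ("c", 9), ("d", 7)]:
    cache.put(key, score)
```

Because `get` never reorders anything, lookups in a design like this need no lock on a shared list, which is the kind of property the throughput claim above depends on. Any rank function, including a hit-density estimate, can be plugged in as `rank_fn`.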





Published In

NSDI'18: Proceedings of the 15th USENIX Conference on Networked Systems Design and Implementation

April 2018, 623 pages

ISBN: 9781931971430

Program Chairs: Srinivasan Seshan (Carnegie Mellon University), Sujata Banerjee (VMWare Research)

Sponsors: NetApp, Google Inc., NSF, Microsoft

Publisher: USENIX Association, United States


Citations

Cited By

Koo K, Kim S, Kim W, Choi Y, Han J, Kim B, Moon B (2024). PreVision: An Out-of-Core Matrix Computation System with Optimal Buffer Replacement. Proceedings of the ACM on Management of Data 2:1 (1-25). Online publication date: 26-Mar-2024.
https://dl.acm.org/doi/10.1145/3639297
Guo X, Wang H, Zhou K, Jiang H, Han Y, Xing G (2024). FLOWS: Balanced MRC Profiling for Heterogeneous Object-Size Cache. Proceedings of the Nineteenth European Conference on Computer Systems (421-440). Online publication date: 22-Apr-2024.
https://dl.acm.org/doi/10.1145/3627703.3650078
Song W, Eo J, Um T, Jeon M, Chun B (2024). Blaze: Holistic Caching for Iterative Data Processing. Proceedings of the Nineteenth European Conference on Computer Systems (370-386). Online publication date: 22-Apr-2024.
https://dl.acm.org/doi/10.1145/3627703.3629558
Yang J, Wang Y, Wang Z (2023). An Empirical Analysis on Memcached's Replacement Policies. Proceedings of the International Symposium on Memory Systems (1-10). Online publication date: 2-Oct-2023.
https://dl.acm.org/doi/10.1145/3631882.3631883
Lee H, Guo L, Tang M, Firoz J, Tallent N, Kougkas A, Sun X, Mohror K, Arnold D, Badia R (2023). Data Flow Lifecycles for Optimizing Workflow Coordination. Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis (1-15). Online publication date: 12-Nov-2023.
https://dl.acm.org/doi/10.1145/3581784.3607104
Yang J, Wang Y, Wang Z (2021). Efficient Modeling of Random Sampling-Based LRU. Proceedings of the 50th International Conference on Parallel Processing (1-11). Online publication date: 9-Aug-2021.
https://dl.acm.org/doi/10.1145/3472456.3472514
Yang J, Yue Y, Rashmi K (2021). A Large-scale Analysis of Hundreds of In-memory Key-value Cache Clusters at Twitter. ACM Transactions on Storage 17:3 (1-35). Online publication date: 16-Aug-2021.
https://dl.acm.org/doi/10.1145/3468521
Zhong C, Zhao X, Jiang S, Wassermann B, Malka M, Chidambaram V, Raz D (2021). LIRS2. Proceedings of the 14th ACM International Conference on Systems and Storage (1-12). Online publication date: 14-Jun-2021.
https://dl.acm.org/doi/10.1145/3456727.3463772
Song Z, Berger D, Li K, Lloyd W, Bhagwan R, Porter G (2020). Learning relaxed Belady for content distribution network caching. Proceedings of the 17th Usenix Conference on Networked Systems Design and Implementation (529-544). Online publication date: 25-Feb-2020.
https://dl.acm.org/doi/10.5555/3388242.3388281
Zhang L, Karimi R, Ahmad I, Vigfusson Y (2020). Optimal Data Placement for Heterogeneous Cache, Memory, and Storage Systems. Proceedings of the ACM on Measurement and Analysis of Computing Systems 4:1 (1-27). Online publication date: 5-Jun-2020.
https://dl.acm.org/doi/10.1145/3379472
