research-article
Open access

A Large-scale Analysis of Hundreds of In-memory Key-value Cache Clusters at Twitter

Published: 16 August 2021

Abstract

    Modern web services use in-memory caching extensively to increase throughput and reduce latency. Several workload analyses of production systems have fueled research on improving the effectiveness of in-memory caching systems. However, the coverage is still sparse considering the wide spectrum of industrial cache use cases. In this work, we significantly further the understanding of real-world cache workloads by collecting production traces from 153 in-memory cache clusters at Twitter, sifting through over 80 TB of data, and sometimes interpreting the workloads in the context of the business logic behind them. We perform a comprehensive analysis to characterize cache workloads based on traffic pattern, time-to-live (TTL), popularity distribution, and size distribution. A fine-grained view of different workloads uncovers the diversity of use cases: many are far more write-heavy or more skewed than previously shown, and some display unique temporal patterns. We also observe that TTL is an important and sometimes defining parameter of cache working sets. Our simulations show that the ideal replacement strategy in production caches can be surprising; for example, FIFO works best for a large number of workloads.
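
To make the comparison of replacement strategies concrete, the following is a minimal sketch of a trace-driven miss-ratio simulation for FIFO and LRU with per-object TTLs. It is not the authors' simulator: the function name `simulate`, the `(timestamp, key, ttl)` trace format, the object-count capacity, and the lazy handling of expired entries are all assumptions made for illustration.

```python
from collections import OrderedDict


def simulate(trace, capacity, policy="fifo"):
    """Return the miss ratio of a cache that holds `capacity` objects.

    trace:  iterable of (timestamp, key, ttl) tuples (hypothetical format);
            the TTL is applied when the object is inserted.
    policy: "fifo" or "lru".
    """
    cache = OrderedDict()  # key -> expiry time, next eviction victim first
    misses = 0
    total = 0
    for now, key, ttl in trace:
        total += 1
        expiry = cache.get(key)
        if expiry is not None and expiry > now:
            # Hit: LRU promotes the object, FIFO keeps insertion order.
            if policy == "lru":
                cache.move_to_end(key)
            continue
        # Miss: the object is absent or its TTL has passed.
        # Expired objects are only dropped lazily, here or on eviction.
        misses += 1
        cache.pop(key, None)
        if len(cache) >= capacity:
            cache.popitem(last=False)  # evict from the head of the queue
        cache[key] = now + ttl
    return misses / total if total else 0.0


# Tiny synthetic trace; real traces are orders of magnitude larger.
trace = [(0, "a", 100), (1, "b", 100), (2, "a", 100), (3, "c", 100), (4, "a", 100)]
print(simulate(trace, capacity=2, policy="fifo"))  # 0.8
print(simulate(trace, capacity=2, policy="lru"))   # 0.6
```

On this toy trace LRU happens to win; which policy wins on a real workload depends on its popularity skew, TTLs, and cache size, which is the kind of question a production trace is needed to answer.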



      Information & Contributors

      Information

      Published In

      ACM Transactions on Storage  Volume 17, Issue 3
      August 2021
      227 pages
      ISSN:1553-3077
      EISSN:1553-3093
      DOI:10.1145/3477268
      Editor: Sam H. Noh
      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      Published: 16 August 2021
      Accepted: 01 May 2021
      Received: 01 February 2021
      Published in TOS Volume 17, Issue 3


      Author Tags

      1. Cache
      2. in-memory key-value cache
      3. key-value store
      4. workload analysis
      5. datasets
      6. Twitter

      Qualifiers

      • Research-article
      • Refereed

      Funding Sources

      • NSF
