DOI: 10.1145/3710848.3710856

AC-Cache: A Memory-Efficient Caching System for Small Objects via Exploiting Access Correlations

Published: 28 February 2025

Abstract

In-memory key-value (KV) caching bridges the performance gap between high-performance networks and disk devices. However, prior in-memory KV caching systems either target large objects or introduce additional memory overhead. In this paper, we conduct a systematic analysis of 56 production traces and make three observations: (i) small objects dominate the traces and data accesses are highly skewed; (ii) the hotness of objects remains stable across days; and (iii) a multi-get operation that retrieves multiple objects from the same node incurs much shorter tail latency than issuing only single-get operations.
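One way to build intuition for observation (iii) is a back-of-the-envelope model, which is not taken from the paper: a request that fans out to several nodes finishes only when the slowest node replies, so touching more nodes raises the chance of hitting a slow one. The sketch below assumes each node is independently slow with probability `p_slow`; the function name and the 1% figure are illustrative assumptions.

```python
def prob_request_hits_tail(p_slow: float, fanout: int) -> float:
    """Probability that at least one of `fanout` independent node accesses
    is slow, when each access is slow with probability `p_slow`.

    Assumption (not from the paper): node latencies are independent.
    """
    if not 0.0 <= p_slow <= 1.0:
        raise ValueError("p_slow must be in [0, 1]")
    if fanout < 1:
        raise ValueError("fanout must be at least 1")
    # The request is fast only if every contacted node is fast.
    return 1.0 - (1.0 - p_slow) ** fanout


if __name__ == "__main__":
    # Hypothetical 1% per-node slow probability, varying the number of nodes.
    for nodes in (1, 2, 4, 8, 16):
        print(nodes, round(prob_request_hits_tail(0.01, nodes), 4))
```

Under this simplified model, serving a group of objects from one node keeps the fan-out at 1, which is consistent with the direction of the observation but says nothing about its measured magnitude.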
These observations motivate the design of AC-Cache, an access-correlation-aware in-memory caching system for small objects. AC-Cache comprises three design primitives: (i) we formulate the distribution of KV objects as an integer linear programming problem to balance data accesses and memory consumption; (ii) we capture the access correlation in a memory-efficient manner and generate fine-grained correlation groups; and (iii) we formulate the distribution of the correlation groups as a maximum flow problem to balance data accesses, and leverage a heuristic algorithm to dispatch the remaining KV objects to balance memory consumption. Extensive experiments with billions of objects on Alibaba Cloud show that AC-Cache can reduce the tail latency by 5.1-80.2% and increase the access throughput by 42.8-534.8%.
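The abstract does not say which data structure AC-Cache uses to capture access correlation. As a hedged illustration of what "memory-efficient" correlation tracking can look like, the sketch below counts how often two keys appear in the same request using a Count-Min Sketch, a standard fixed-size streaming summary. The class name, the parameters `width` and `depth`, and the choice of the sketch itself are assumptions for illustration, not the authors' method.

```python
import hashlib
from itertools import combinations


class PairCountMinSketch:
    """Approximate co-access counter for key pairs in fixed memory.

    Illustrative only: estimates never undercount, but hash collisions
    may cause overcounting.
    """

    def __init__(self, width: int = 2048, depth: int = 4) -> None:
        self.width = width
        self.depth = depth
        self.rows = [[0] * width for _ in range(depth)]

    def _index(self, row: int, item: bytes) -> int:
        # One salted hash per row stands in for independent hash functions.
        salt = row.to_bytes(8, "little")
        digest = hashlib.blake2b(item, digest_size=8, salt=salt).digest()
        return int.from_bytes(digest, "little") % self.width

    @staticmethod
    def _pair_key(a: str, b: str) -> bytes:
        # Order the pair so (a, b) and (b, a) map to the same counter.
        lo, hi = sorted((a, b))
        return f"{lo}\x00{hi}".encode()

    def record_request(self, keys) -> None:
        """Count every unordered pair of distinct keys in one request."""
        for a, b in combinations(sorted(set(keys)), 2):
            item = self._pair_key(a, b)
            for row in range(self.depth):
                self.rows[row][self._index(row, item)] += 1

    def estimate(self, a: str, b: str) -> int:
        """Upper-bound estimate of how often a and b were requested together."""
        item = self._pair_key(a, b)
        return min(self.rows[row][self._index(row, item)]
                   for row in range(self.depth))


if __name__ == "__main__":
    sketch = PairCountMinSketch()
    sketch.record_request(["user:1", "cart:1", "profile:1"])
    sketch.record_request(["user:1", "cart:1"])
    print(sketch.estimate("user:1", "cart:1"))
```

Pairs with high estimated counts would be candidates for the same correlation group. Memory use is `width * depth` counters regardless of how many distinct pairs are observed, which is the property that makes this kind of summary attractive for billions of small objects.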


    Published In

    PPoPP '25: Proceedings of the 30th ACM SIGPLAN Annual Symposium on Principles and Practice of Parallel Programming
    February 2025
    580 pages
    ISBN:9798400714436
    DOI:10.1145/3710848
    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. Caching
    2. Correlation analysis
    3. Load balance
    4. Storage

    Qualifiers

    • Research-article
    • Research
    • Refereed limited

    Acceptance Rates

    Overall Acceptance Rate 230 of 1,014 submissions, 23%
