Characterizing Load Imbalance in Real-World Networked Caches

Authors: Helga Gudmundsdottir, Ymir Vigfusson, Daniel A. Freedman, Robbert van Renesse

HotNets-XIII: Proceedings of the 13th ACM Workshop on Hot Topics in Networks

Pages 1 - 7

https://doi.org/10.1145/2670518.2673882

Published: 27 October 2014

Abstract

Modern Web services rely extensively upon a tier of in-memory caches to reduce request latencies and alleviate load on backend servers. Within a given cache, items are typically partitioned across cache servers via consistent hashing, with the goal of balancing the number of items maintained by each cache server. Effects of consistent hashing vary by associated hashing function and partitioning ratio. Most real-world workloads are also skewed, with some items significantly more popular than others. Inefficiency in addressing both issues can create an imbalance in cache-server loads.
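As a rough illustration of the partitioning described above, the following Python sketch builds a consistent-hash ring and reports how unevenly items land on servers. It is not the paper's implementation: the MD5-based ring, the `HashRing` class, the server names, and the reading of "partitioning ratio" as the number of virtual points per server are all assumptions made for the example.

```python
import bisect
import hashlib
from collections import Counter


def _point(label: str) -> int:
    """Map a string to a point on a 2**32 ring (MD5 is an arbitrary choice here)."""
    return int.from_bytes(hashlib.md5(label.encode()).digest()[:4], "big")


class HashRing:
    """Hypothetical consistent-hash ring with `vnodes` virtual points per server."""

    def __init__(self, servers, vnodes=100):
        self._ring = sorted(
            (_point(f"{s}#{v}"), s) for s in servers for v in range(vnodes)
        )
        self._points = [p for p, _ in self._ring]

    def lookup(self, key: str) -> str:
        # First virtual point clockwise from the key's hash, wrapping around.
        i = bisect.bisect_right(self._points, _point(key)) % len(self._ring)
        return self._ring[i][1]


def imbalance(counts, servers) -> float:
    """Max-to-mean ratio of per-server counts; 1.0 means perfectly balanced."""
    loads = [counts.get(s, 0) for s in servers]
    return max(loads) / (sum(loads) / len(loads))


def item_imbalance(servers, n_items, vnodes) -> float:
    """Place n_items synthetic keys and measure how unevenly they spread."""
    ring = HashRing(servers, vnodes)
    placed = Counter(ring.lookup(f"item:{i}") for i in range(n_items))
    return imbalance(placed, servers)


servers = [f"cache{i}" for i in range(8)]  # assumed cluster size
for vnodes in (1, 10, 100):
    print(vnodes, round(item_imbalance(servers, 20_000, vnodes), 3))
```

The max-to-mean ratio is only one possible imbalance metric; it is used here because a value of 1.0 has an obvious meaning (every server holds the same number of items).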

We analyze the degree of observed load imbalance, focusing on read-only traffic against Facebook's graph cache tier in TAO. We investigate the principal causes of load imbalance, including data co-location, non-ideal hashing scenarios, and hot-spot temporal effects. We also employ trace-driven analytics to study the benefits and limitations of current load-balancing methods, suggesting areas for future research.
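To make the distinction between balanced items and balanced load concrete, the sketch below replays a synthetic request trace and compares the two. Everything in it is an assumption for illustration: the trace is generated from a Zipf-like popularity law rather than taken from TAO, placement uses a plain hash-modulo rule as a stand-in for the production scheme, and the function names are hypothetical.

```python
import hashlib
import random
from collections import Counter


def place(key: str, n_servers: int) -> int:
    """Stand-in placement: hash the key and take it modulo the server count."""
    digest = hashlib.md5(key.encode()).digest()
    return int.from_bytes(digest[:8], "big") % n_servers


def synthetic_trace(n_items, n_requests, alpha=1.0, seed=0):
    """Draw request keys with Zipf-like popularity: rank r has weight 1 / r**alpha."""
    rng = random.Random(seed)
    keys = [f"item:{i}" for i in range(n_items)]
    weights = [1.0 / (rank ** alpha) for rank in range(1, n_items + 1)]
    return rng.choices(keys, weights=weights, k=n_requests)


def max_over_mean(loads) -> float:
    """Imbalance metric: 1.0 means every server carries the same load."""
    return max(loads) / (sum(loads) / len(loads))


def replay(trace, n_servers):
    """Return (item imbalance, request imbalance) for one pass over the trace."""
    requests = Counter(place(k, n_servers) for k in trace)
    items = Counter(place(k, n_servers) for k in set(trace))
    req_loads = [requests.get(s, 0) for s in range(n_servers)]
    item_loads = [items.get(s, 0) for s in range(n_servers)]
    return max_over_mean(item_loads), max_over_mean(req_loads)


trace = synthetic_trace(n_items=5_000, n_requests=50_000, alpha=1.0)
item_imb, req_imb = replay(trace, n_servers=8)
print(f"item imbalance    {item_imb:.3f}")
print(f"request imbalance {req_imb:.3f}")
```

Varying `alpha` changes how skewed the synthetic popularity is, which gives a simple way to explore how much of the request imbalance comes from hot items rather than from the placement function itself.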



Published In

HotNets-XIII: Proceedings of the 13th ACM Workshop on Hot Topics in Networks

October 2014

189 pages

ISBN:9781450332569

DOI:10.1145/2670518

General Chairs: Ethan Katz-Bassett (University of Southern California), John Heidemann (University of Southern California/Information Sciences Institute)

Program Chairs: Brighten Godfrey (University of Illinois at Urbana-Champaign), Anja Feldmann (Technische Universität Berlin)

Copyright © 2014 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Sponsors

SIGCOMM: ACM Special Interest Group on Data Communication

In-Cooperation

CISCO

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Tutorial
Research
Refereed limited

Conference

HotNets-XIII

Sponsor:

SIGCOMM

HotNets-XIII: The 13th ACM Workshop on Hot Topics in Networks

October 27 - 28, 2014

Los Angeles, CA, USA

Acceptance Rates

HotNets-XIII Paper Acceptance Rate 26 of 118 submissions, 22%;

Overall Acceptance Rate 110 of 460 submissions, 24%
