DOI: 10.1145/2785956.2787472
research-article

Inside the Social Network's (Datacenter) Network

Published: 17 August 2015

Abstract

Large cloud service providers have invested in increasingly larger datacenters to house the computing infrastructure required to support their services. Accordingly, researchers and industry practitioners alike have focused a great deal of effort on designing network fabrics to interconnect and manage the traffic within these datacenters in a performant yet efficient fashion. Unfortunately, datacenter operators are generally reluctant to share the actual requirements of their applications, making it challenging to evaluate the practicality of any particular design.
Moreover, the limited large-scale workload information available in the literature has, for better or worse, heretofore largely been provided by a single datacenter operator whose use cases may not be widespread. In this work, we report upon the network traffic observed in some of Facebook's datacenters. While Facebook operates a number of traditional datacenter services like Hadoop, its core Web service and supporting cache infrastructure exhibit a number of behaviors that contrast with those reported in the literature. We report on the contrasting locality, stability, and predictability of network traffic in Facebook's datacenters, and comment on their implications for network architecture, traffic engineering, and switch design.

Supplementary Material

WEBM File (p123-roy.webm)


Published In

SIGCOMM '15: Proceedings of the 2015 ACM Conference on Special Interest Group on Data Communication
August 2015
684 pages
ISBN: 9781450335423
DOI: 10.1145/2785956
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tag

  1. datacenter traffic patterns

Qualifiers

  • Research-article

Funding Sources

  • NSF

Conference

SIGCOMM '15: ACM SIGCOMM 2015 Conference
August 17-21, 2015
London, United Kingdom
