DOI: 10.1145/3387514.3405899
research-article
Open access

Annulus: A Dual Congestion Control Loop for Datacenter and WAN Traffic Aggregates

Published: 30 July 2020

Abstract

Cloud services are deployed in datacenters connected through high-bandwidth Wide Area Networks (WANs). We find that WAN traffic negatively impacts the performance of datacenter traffic, increasing tail latency by 2.5x, despite its small bandwidth demand. This behavior is caused by the long round-trip time (RTT) of WAN traffic, combined with limited buffering in datacenter switches. The long WAN RTT forces datacenter traffic to take the full burden of reacting to congestion. Furthermore, datacenter traffic changes on a faster time-scale than the WAN RTT, making it difficult for WAN congestion control to estimate available bandwidth accurately.
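To make the RTT and buffering argument concrete, the following back-of-the-envelope sketch compares the bandwidth-delay product (the bytes in flight over one RTT) of a WAN aggregate and a datacenter aggregate against a switch buffer. Every number here (RTTs, rates, buffer size) is a hypothetical value chosen for illustration and does not come from the paper.

```python
def bdp_bytes(rate_bps: int, rtt_us: int) -> int:
    """Bandwidth-delay product in bytes for a rate in bit/s and an RTT in microseconds."""
    return rate_bps * rtt_us // 8_000_000


# Hypothetical values, for illustration only (not taken from the paper).
DC_RTT_US = 50                      # datacenter round-trip time: 50 microseconds
WAN_RTT_US = 50_000                 # WAN round-trip time: 50 milliseconds
DC_RATE_BPS = 90_000_000_000        # datacenter aggregate: 90 Gbit/s
WAN_RATE_BPS = 10_000_000_000       # WAN aggregate: 10 Gbit/s (the small share)
SWITCH_BUFFER_BYTES = 16_000_000    # shared switch buffer: 16 MB

dc_bdp = bdp_bytes(DC_RATE_BPS, DC_RTT_US)
wan_bdp = bdp_bytes(WAN_RATE_BPS, WAN_RTT_US)

print(f"datacenter BDP: {dc_bdp} bytes")
print(f"WAN BDP:        {wan_bdp} bytes")
print(f"WAN BDP exceeds buffer: {wan_bdp > SWITCH_BUFFER_BYTES}")
print(f"datacenter RTTs per WAN RTT: {WAN_RTT_US // DC_RTT_US}")
```

Under these assumed numbers, the WAN aggregate has far more data in flight per RTT than the buffer can hold even though it uses a small share of the link, and datacenter senders get on the order of a thousand feedback rounds before WAN senders get one.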
We present Annulus, a congestion control scheme that relies on two control loops to address these challenges. One control loop leverages existing congestion control algorithms for bottlenecks where there is only one type of traffic (i.e., WAN or datacenter). The other loop handles bottlenecks shared between WAN and datacenter traffic near the traffic source, using direct feedback from the bottleneck. We implement Annulus on a testbed and in simulation. Compared to baselines using BBR for WAN congestion control and DCTCP or DCQCN for datacenter congestion control, Annulus increases bottleneck utilization by 10% and lowers datacenter flow completion time by 1.3-3.5x.
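The two-loop idea can be sketched as a sender that keeps two independent rate limits and sends at the smaller one. This is a toy model under stated assumptions, not the paper's algorithm: the abstract does not say how the loops are combined, so the `min` rule, the multiplicative decrease, the additive recovery, the class and method names, and all constants are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class DualLoopSender:
    """Toy sender with two independently updated rate limits (rates in Mbit/s)."""

    line_rate: float = 10_000.0    # hypothetical NIC line rate
    e2e_rate: float = 10_000.0     # limit from the end-to-end loop (e.g. BBR, DCTCP)
    near_rate: float = 10_000.0    # limit from the near-source loop
    recovery_step: float = 100.0   # hypothetical additive recovery per quiet interval

    def on_e2e_update(self, rate: float) -> None:
        """Accept whatever rate the existing end-to-end algorithm computes."""
        self.e2e_rate = min(max(rate, 0.0), self.line_rate)

    def on_direct_feedback(self, severity: float) -> None:
        """React to direct feedback from a nearby bottleneck.

        severity is an assumed congestion signal in [0, 1]; the near-source
        limit is cut by up to half (an illustrative rule, not the paper's).
        """
        severity = min(max(severity, 0.0), 1.0)
        self.near_rate *= 1.0 - severity / 2.0

    def on_quiet_interval(self) -> None:
        """Recover additively when no direct feedback arrived in an interval."""
        self.near_rate = min(self.near_rate + self.recovery_step, self.line_rate)

    def sending_rate(self) -> float:
        """Assumed combination rule: obey the more restrictive loop."""
        return min(self.e2e_rate, self.near_rate)


sender = DualLoopSender()
sender.on_e2e_update(4_000.0)      # slow loop: the WAN path allows 4 Gbit/s
sender.on_direct_feedback(1.0)     # fast loop: near-source limit halves
sender.on_direct_feedback(1.0)     # and halves again
print(sender.sending_rate())
```

The point of the sketch is the separation of time-scales: the near-source limit can move several times within a single WAN RTT, while the end-to-end limit still governs bottlenecks that only one traffic type sees.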

Supplementary Material

MP4 File (3387514.3405899.mp4)
This video is of the talk presenting Annulus, a dual congestion control loop scheme for datacenter and WAN traffic aggregates.

Cloud services are deployed in datacenters connected through high-bandwidth Wide Area Networks (WANs). The talk describes our findings on the interaction between WAN and datacenter traffic and explains why this interaction can impair the performance of both types of traffic. This behavior is caused by the long round-trip time (RTT) of WAN traffic, combined with limited buffering in datacenter switches.

Annulus addresses these problems through its two control loops. The first control loop leverages existing congestion control algorithms for bottlenecks where there is only one type of traffic (i.e., WAN or datacenter). The other loop handles bottlenecks shared between WAN and datacenter traffic near the traffic source, using direct feedback from the bottleneck. The talk presents some of the results evaluating Annulus on a testbed.

      Published In

      SIGCOMM '20: Proceedings of the Annual conference of the ACM Special Interest Group on Data Communication on the applications, technologies, architectures, and protocols for computer communication
      July 2020
      814 pages
      ISBN:9781450379557
      DOI:10.1145/3387514
      This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike International 4.0 License.

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Author Tags

      1. Congestion Control
      2. Data Center Networks
      3. Explicit Direct Congestion Notification
      4. Wide-Area Networks

      Qualifiers

      • Research-article
      • Research
      • Refereed limited

      Conference

      SIGCOMM '20

      Acceptance Rates

      Overall Acceptance Rate 462 of 3,389 submissions, 14%

      Article Metrics

      • Downloads (Last 12 months): 625
      • Downloads (Last 6 weeks): 71
      Reflects downloads up to 08 Feb 2025

      Cited By

      • (2024) Reverie. Proceedings of the 21st USENIX Symposium on Networked Systems Design and Implementation, pp. 651-668. DOI: 10.5555/3691825.3691861. Online publication date: 16-Apr-2024
      • (2024) Credence. Proceedings of the 21st USENIX Symposium on Networked Systems Design and Implementation, pp. 613-634. DOI: 10.5555/3691825.3691859. Online publication date: 16-Apr-2024
      • (2024) F3: Fast and Flexible Network Telemetry with an FPGA coprocessor. Proceedings of the ACM on Networking 2:CoNEXT4, pp. 1-22. DOI: 10.1145/3696397. Online publication date: 25-Nov-2024
      • (2024) Lightweight Automated Reasoning for Network Architectures. Proceedings of the 23rd ACM Workshop on Hot Topics in Networks, pp. 237-245. DOI: 10.1145/3696348.3696865. Online publication date: 18-Nov-2024
      • (2024) Accelerating Model Training in Multi-cluster Environments with Consumer-grade GPUs. Proceedings of the ACM SIGCOMM 2024 Conference, pp. 707-720. DOI: 10.1145/3651890.3672228. Online publication date: 4-Aug-2024
      • (2024) Dragonfly: In-Flight CCA Identification. IEEE Transactions on Network and Service Management 21:3, pp. 2675-2685. DOI: 10.1109/TNSM.2024.3380417. Online publication date: Jun-2024
      • (2024) Minimizing Buffer Utilization for Lossless Inter-DC Links. IEEE/ACM Transactions on Networking 32:6, pp. 4960-4975. DOI: 10.1109/TNET.2024.3443600. Online publication date: Dec-2024
      • (2024) Enhancing Load Balancing With In-Network Recirculation to Prevent Packet Reordering in Lossless Data Centers. IEEE/ACM Transactions on Networking 32:5, pp. 4114-4127. DOI: 10.1109/TNET.2024.3403671. Online publication date: Oct-2024
      • (2024) Taming the Aggressiveness of Heterogeneous TCP Traffic in Data Center Networks. IEEE/ACM Transactions on Networking 32:3, pp. 2253-2268. DOI: 10.1109/TNET.2023.3347048. Online publication date: Jun-2024
      • (2024) Switch-Assistant Loss Recovery for RDMA Transport Control. IEEE/ACM Transactions on Networking 32:3, pp. 2069-2084. DOI: 10.1109/TNET.2023.3336661. Online publication date: Jun-2024