DOI: 10.1145/2999572.2999593

ECN or Delay: Lessons Learnt from Analysis of DCQCN and TIMELY

Published: 06 December 2016

Abstract

Data center networks, and especially drop-free RoCEv2 networks, require efficient congestion control protocols. DCQCN (ECN-based) and TIMELY (delay-based) are two recent proposals for this purpose. In this paper, we analyze DCQCN and TIMELY using fluid models and simulations, for stability, convergence, fairness and flow completion time. We uncover several surprising behaviors of these protocols. For example, we show that DCQCN exhibits non-monotonic stability behavior, and that TIMELY can converge to a stable regime with arbitrary unfairness. We propose simple fixes and tuning to ensure that both protocols converge to, and are stable at, the fair share point. Finally, using lessons learnt from the analysis, we address the broader question: are there fundamental reasons to prefer either ECN or delay for end-to-end congestion control in data center networks? We argue that ECN is a better congestion signal, due to the way modern switches mark packets, and due to a fundamental limitation of end-to-end delay-based protocols that we derive.

    Published In

    CoNEXT '16: Proceedings of the 12th International Conference on emerging Networking EXperiments and Technologies
    December 2016
    524 pages
    ISBN:9781450342926
    DOI:10.1145/2999572
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. congestion control
    2. data center transport
    3. delay-based
    4. ECN
    5. RDMA

    Qualifiers

    • Research-article

    Acceptance Rates

    CoNEXT '16 Paper Acceptance Rate 30 of 160 submissions, 19%;
    Overall Acceptance Rate 198 of 789 submissions, 25%

    Article Metrics

    • Downloads (Last 12 months): 547
    • Downloads (Last 6 weeks): 96
    Reflects downloads up to 13 Nov 2024

    Cited By

    • (2024) Accurate and fast congestion feedback in MEC-enabled RDMA datacenters. Journal of Cloud Computing, 13:1. DOI: 10.1186/s13677-024-00642-8. Online publication date: 25-Mar-2024
    • (2024) Lightweight Automated Reasoning for Network Architectures. Proceedings of the 23rd ACM Workshop on Hot Topics in Networks, pages 237-245. DOI: 10.1145/3696348.3696865. Online publication date: 18-Nov-2024
    • (2024) To switch or not to switch to TCP Prague? Incentives for adoption in a partial L4S deployment. Proceedings of the 2024 Applied Networking Research Workshop, pages 45-52. DOI: 10.1145/3673422.3674896. Online publication date: 23-Jul-2024
    • (2024) Enhancing Load Balancing With In-Network Recirculation to Prevent Packet Reordering in Lossless Data Centers. IEEE/ACM Transactions on Networking, 32:5, pages 4114-4127. DOI: 10.1109/TNET.2024.3403671. Online publication date: Oct-2024
    • (2024) PACC: A Proactive CNP Generation Scheme for Datacenter Networks. IEEE/ACM Transactions on Networking, 32:3, pages 2586-2599. DOI: 10.1109/TNET.2024.3361771. Online publication date: Jun-2024
    • (2024) R-PFC: Enhancing RDMA Network With Restricted And Fine-grained PFC. 2024 IEEE/ACM 32nd International Symposium on Quality of Service (IWQoS), pages 1-10. DOI: 10.1109/IWQoS61813.2024.10682907. Online publication date: 19-Jun-2024
    • (2024) RateMP: Optimizing Bandwidth Utilization with High Burst Tolerance in Data Center Networks. IEEE INFOCOM 2024 - IEEE Conference on Computer Communications, pages 1361-1370. DOI: 10.1109/INFOCOM52122.2024.10621096. Online publication date: 20-May-2024
    • (2024) BCC: Re-architecting Congestion Control in DCNs. IEEE INFOCOM 2024 - IEEE Conference on Computer Communications, pages 1441-1450. DOI: 10.1109/INFOCOM52122.2024.10621082. Online publication date: 20-May-2024
    • (2024) DCCS. Computer Networks: The International Journal of Computer and Telecommunications Networking, 247:C. DOI: 10.1016/j.comnet.2024.110457. Online publication date: 18-Jul-2024
    • (2024) PCNP. Computer Networks: The International Journal of Computer and Telecommunications Networking, 247:C. DOI: 10.1016/j.comnet.2024.110453. Online publication date: 18-Jul-2024
