Aeolus: A Building Block for Proactive Transport in Datacenter Networks

Published: 03 November 2021 Publication History

Abstract

As datacenter network bandwidth keeps growing, proactive transport becomes attractive: bandwidth is proactively allocated as “credits” to senders, who can then send “scheduled packets” at the right rate to ensure high link utilization, low latency, and zero packet loss. Consequently, proactive solutions such as ExpressPass, NDP, and Homa have been proposed recently. While promising, proactive transport faces a fundamental challenge: it requires at least one RTT for credits to be computed and delivered. In this paper, we show that this one-RTT “pre-credit” phase could carry a substantial number of flows at high link speeds, but none of the existing proactive solutions treats it appropriately. We present Aeolus, a solution focusing on “pre-credit” packet transmission as a building block for proactive transports. Aeolus contains unconventional design principles such as scheduled-packet-first (SPF), which de-prioritizes the first-RTT packets instead of prioritizing them as prior work does. It further exploits the preserved, deterministic nature of proactive transport as a means to recover lost first-RTT packets efficiently. Aeolus is compatible with all existing proactive solutions and readily implementable with commodity switches. We have integrated Aeolus into ExpressPass, NDP, and Homa, and shown, via both implementation and simulations, that the Aeolus-enhanced solutions deliver significant performance or deployability advantages. For example, it improves the average FCT of ExpressPass by 56%, cuts the tail FCT of Homa by 20×, while achieving similar performance as NDP without switch modifications.
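
The claim that the one-RTT pre-credit phase matters more at high link speeds can be illustrated with a bandwidth-delay product calculation. The sketch below is not taken from the paper: the function names, the 10 µs RTT, and the 100 KB flow size are assumptions chosen only for illustration. It computes how many bytes a sender could transmit at line rate before the first credit can arrive, and checks whether a flow of a given size would fit entirely within that window.

```python
def bdp_bytes(link_gbps, rtt_us):
    """Bytes transmittable at line rate during one RTT.

    1 Gbps * 1 microsecond = 1000 bits = 125 bytes, so the
    bandwidth-delay product in bytes is link_gbps * rtt_us * 125.
    """
    return link_gbps * rtt_us * 125


def fits_in_first_rtt(flow_bytes, link_gbps, rtt_us):
    """True if the whole flow could be sent before any credit arrives."""
    return flow_bytes <= bdp_bytes(link_gbps, rtt_us)


if __name__ == "__main__":
    rtt_us = 10            # assumed datacenter RTT, illustration only
    flow_bytes = 100_000   # assumed 100 KB flow, illustration only
    for link_gbps in (10, 40, 100):
        print(
            f"{link_gbps} Gbps: one-RTT window = "
            f"{bdp_bytes(link_gbps, rtt_us)} bytes, "
            f"100 KB flow fits = "
            f"{fits_in_first_rtt(flow_bytes, link_gbps, rtt_us)}"
        )
```

Under these assumed numbers, the same 100 KB flow needs credits to finish on a 10 Gbps link but could complete entirely within the first RTT on a 100 Gbps link, which is the regime where handling of pre-credit packets dominates flow completion time.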

Cited By

  • (2024) DDT: Dynamical Selective Dropping Threshold for Reactive Congestion Control. Proceedings of the ACM Turing Award Celebration Conference - China 2024, pp. 12–17. DOI: 10.1145/3674399.3674412. Online publication date: 5-Jul-2024
  • (2023) On Augmenting TCP/IP Stack via eBPF. Proceedings of the 1st Workshop on eBPF and Kernel Extensions, pp. 15–20. DOI: 10.1145/3609021.3609300. Online publication date: 10-Sep-2023
  • (2022) Congestion Control for Cross-Datacenter Networks. IEEE/ACM Transactions on Networking, vol. 30, no. 5, pp. 2074–2089. DOI: 10.1109/TNET.2022.3161580. Online publication date: 5-Apr-2022

Published In

IEEE/ACM Transactions on Networking, Volume 30, Issue 2
April 2022
478 pages

Publisher

IEEE Press

Qualifiers

  • Research-article
