Research article
DOI: 10.1145/3106989.3106993

Memory Efficient Loss Recovery for Hardware-based Transport in Datacenter

Published: 03 August 2017

Abstract

Limited by small on-chip memory, hardware-based transport typically implements a go-back-N loss recovery mechanism, which requires very little memory but is well known to perform poorly even at small packet loss ratios. We present MELO, an efficient selective retransmission mechanism for hardware-based transport that consumes only a small, constant amount of memory regardless of the number of concurrent connections. Specifically, MELO employs an architectural separation between data and metadata storage and uses a shared bits pool allocation mechanism to reduce the on-chip memory footprint of metadata. By adding on average only 23 B of extra on-chip state per connection, MELO achieves up to 14.02x the throughput of go-back-N while reducing the 99th-percentile tail flow completion time (FCT) by 3.11x under certain loss ratios.
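To make the contrast between go-back-N and selective retransmission concrete, the following minimal Python sketch counts how many packets a sender transmits under each scheme. It is an illustration under simplifying assumptions, not MELO's mechanism: the function names are hypothetical, the send window is assumed to cover the whole flow, and each packet listed in `lost` is assumed to be dropped only on its first transmission.

```python
def go_back_n_transmissions(num_packets, lost):
    """Count sends when the receiver keeps only the in-order prefix."""
    attempted = set()      # packets that have been sent at least once
    next_expected = 0      # first packet the receiver has not accepted
    total = 0
    while next_expected < num_packets:
        first_gap = None
        # The sender (re)sends everything from the first unaccepted packet.
        for seq in range(next_expected, num_packets):
            total += 1
            dropped = seq in lost and seq not in attempted
            attempted.add(seq)
            if dropped and first_gap is None:
                first_gap = seq
        # Packets after the first gap were sent but discarded by the receiver.
        next_expected = num_packets if first_gap is None else first_gap
    return total


def selective_transmissions(num_packets, lost):
    """Count sends when the receiver tracks arrivals in a per-packet bitmap."""
    received = [False] * num_packets   # receiver-side bitmap (assumed model)
    attempted = set()
    total = 0
    while not all(received):
        for seq in range(num_packets):
            if received[seq]:
                continue               # already delivered, no resend needed
            total += 1
            dropped = seq in lost and seq not in attempted
            attempted.add(seq)
            if not dropped:
                received[seq] = True
    return total


if __name__ == "__main__":
    print(go_back_n_transmissions(10, {2, 7}))   # 18 sends
    print(selective_transmissions(10, {2, 7}))   # 12 sends
```

In this toy model, losing packets 2 and 7 out of 10 costs go-back-N eight extra transmissions but selective retransmission only two. The price is the per-connection bitmap, which is the kind of metadata the abstract says MELO keeps small.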

    Published In

    APNet '17: Proceedings of the First Asia-Pacific Workshop on Networking
    August 2017
    127 pages
    ISBN:9781450352444
    DOI:10.1145/3106989
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. Datacenter networks
    2. Hardware memory
    3. Loss recovery

    Qualifiers

    • Research-article
    • Research
    • Refereed limited

    Conference

    APNet'17
    APNet'17: First Asia-Pacific Workshop on Networking
    August 3 - 4, 2017
    Hong Kong, China

    Acceptance Rates

    Overall Acceptance Rate 50 of 118 submissions, 42%

    Article Metrics

    • Downloads (last 12 months): 46
    • Downloads (last 6 weeks): 9
    Reflects downloads up to 15 Oct 2024.

    Cited By

    • (2024) LEFT: LightwEight and FasT packet Reordering for RDMA. Proceedings of the 8th Asia-Pacific Workshop on Networking, pp. 67-73. DOI: 10.1145/3663408.3663418. Online publication date: 3-Aug-2024
    • (2024) PACC: A Proactive CNP Generation Scheme for Datacenter Networks. IEEE/ACM Transactions on Networking 32:3, pp. 2586-2599. DOI: 10.1109/TNET.2024.3361771. Online publication date: Jun-2024
    • (2024) Achieving Low Latency for Multipath Transmission in RDMA Based Data Center Network. IEEE Transactions on Cloud Computing 12:1, pp. 337-346. DOI: 10.1109/TCC.2024.3365075. Online publication date: Jan-2024
    • (2024) Implementation of NVMe Over TCP Using SPDK and Its Performance Measurement. 2024 IEEE International Workshop Technical Committee on Communications Quality and Reliability (CQR), pp. 13-18. DOI: 10.1109/CQR62340.2024.10705884. Online publication date: 9-Sep-2024
    • (2023) Understanding the Micro-Behaviors of Hardware Offloaded Network Stacks with Lumina. Proceedings of the ACM SIGCOMM 2023 Conference, pp. 1074-1087. DOI: 10.1145/3603269.3604837. Online publication date: 10-Sep-2023
    • (2023) RoUD: Scalable RDMA over UD in Lossy Data Center Networks. 2023 IEEE/ACM 23rd International Symposium on Cluster, Cloud and Internet Computing (CCGrid), pp. 36-46. DOI: 10.1109/CCGrid57682.2023.00014. Online publication date: May-2023
    • (2022) Clio: a hardware-software co-designed disaggregated memory system. Proceedings of the 27th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, pp. 417-433. DOI: 10.1145/3503222.3507762. Online publication date: 28-Feb-2022
    • (2022) PACC: Proactive and Accurate Congestion Feedback for RDMA Congestion Control. IEEE INFOCOM 2022 - IEEE Conference on Computer Communications, pp. 2228-2237. DOI: 10.1109/INFOCOM48880.2022.9796803. Online publication date: 2-May-2022
    • (2022) An Out-of-Order Packet Processing Algorithm of RoCE Based on Improved SACK. 2022 IEEE 6th Advanced Information Technology, Electronic and Automation Control Conference (IAEAC), pp. 1402-1408. DOI: 10.1109/IAEAC54830.2022.9929858. Online publication date: 3-Oct-2022
    • (2020) Enabling programmable transport protocols in high-speed NICs. Proceedings of the 17th Usenix Conference on Networked Systems Design and Implementation, pp. 93-110. DOI: 10.5555/3388242.3388250. Online publication date: 25-Feb-2020
