Location via proxy:   [ UP ]  
[Report a bug]   [Manage cookies]                
skip to main content
10.1145/3098822.3098852acmconferencesArticle/Chapter ViewAbstractPublication PagescommConference Proceedingsconference-collections
research-article
Open access

Carousel: Scalable Traffic Shaping at End Hosts

Published: 07 August 2017 Publication History

Abstract

Traffic shaping, including pacing and rate limiting, is fundamental to the correct and efficient operation of both datacenter and wide area networks. Sample use cases include policy-based bandwidth allocation to flow aggregates, rate-based congestion control algorithms, and packet pacing to avoid bursty transmissions that can overwhelm router buffers. Driven by the need to scale to millions of flows and to apply complex policies, traffic shaping is moving from network switches into the end hosts, typically implemented in software in the kernel networking stack.
In this paper, we show that the performance overhead of end-host traffic shaping is substantial limits overall system scalability as we move to thousands of individual traffic classes per server. Measurements from production servers show that shaping at hosts consumes considerable CPU and memory, unnecessarily drops packets, suffers from head of line blocking and inaccuracy, and does not provide backpressure up the stack. We present Carousel, a framework that scales to tens of thousands of policies and flows per server, built from the synthesis of three key ideas: i) a single queue shaper using time as the basis for releasing packets, ii) fine-grained, just-in-time freeing of resources in higher layers coupled to actual packet departures, and iii) one shaper per CPU core, with lock-free coordination. Our production experience in serving video traffic at a Cloud service provider shows that Carousel shapes traffic accurately while improving overall machine CPU utilization by 8% (an improvement of 20% in the CPU utilization attributed to networking) relative to state-of-art deployments. It also conforms 10 times more accurately to target rates, and consumes two orders of magnitude less memory than existing approaches.

Supplementary Material

WEBM File (carouselscalabletrafficshapingatendhosts.webm)

References

[1]
2002. pfifo-tc: PFIFO Qdisc. (2002). https://linux.die.net/man/8/tc-pfifo/.
[2]
2003. Linux Hierarchical Token Bucket. (2003). http://luxik.cdi.cz/devik/qos/htb/.
[3]
2008. net: Generic receive offload. (2008). https://lwn.net/Articles/358910/.
[4]
2012. tcp: TCP Small Queues. (2012). https://lwn.net/Articles/506237/.
[5]
2013. pkt_sched: fq: Fair Queue packet scheduler. (2013). https://lwn.net/Articles/564825/.
[6]
2013. tcp: TSO packets automatic sizing. (2013). https://lwn.net/Articles/564979/.
[7]
2016. neper: a Linux networking performance tool. (2016). https://github.com/google/neper.
[8]
2017. Leaky bucket as a queue. (2017). https://en.wikipedia.org/wiki/Leaky_bucket.
[9]
Mohammad Alizadeh, Abdul Kabbani, Tom Edsall, Balaji Prabhakar, Amin Vahdat, and Masato Yasuda. 2012. Less Is More: Trading a Little Bandwidth for Ultra-Low Latency in the Data Center. In NSDI '12. 253--266.
[10]
W. Almesberger, J. H. Salim, and A. Kuznetsov. 1999. Differentiated services on Linux. In GLOBECOM '99. 831--836 vol. 1b.
[11]
Michael Armbrust, Armando Fox, Rean Griffith, Anthony D. Joseph, Randy Katz, Andy Konwinski, Gunho Lee, David Patterson, Ariel Rabkin, Ion Stoica, and Matei Zaharia. 2010. A View of Cloud Computing. Commun. ACM 53, 4 (2010), 50--58.
[12]
Wei Bai, Kai Chen, Haitao Wu, Wuwei Lan, and Yangming Zhao. 2014. PAC: taming TCP Incast congestion using proactive ACK control. In ICNP '14. 385--396.
[13]
Hitesh Ballani, Paolo Costa, Thomas Karagiannis, and Ant Rowstron. 2011. Towards Predictable Datacenter Networks. SIGCOMM Comput. Commun. Rev. 41, 4 (Aug. 2011), 242--253.
[14]
Neda Beheshti, Yashar Ganjali, Monia Ghobadi, Nick McKeown, and Geoff Salmon. 2008. Experimental Study of Router Buffer Sizing. In IMC '08. 197--210.
[15]
Jesper Dangaard Brouer. 2015. Network stack challenges at increasing speeds: The 100Gbit/s challenge. LinuxCon North America. (2015). http://events.linuxfoundation.org/sites/events/files/slides/net_stack_challenges_100G_1.pdf
[16]
Randy Brown. 1988. Calendar queues: a fast O(1) priority queue implementation for the simulation event set problem. Commun. ACM 31, 10 (1988), 1220--1227.
[17]
Neal Cardwell, Yuchung Cheng, C. Stephen Gunn, Van Jacobson, and Soheil Yeganeh. 2016. BBR: Congestion-Based Congestion Control. ACM Queue (2016), 50:20--50:53.
[18]
Yuchung Cheng and Neal Cardwell. 2016. Making Linux TCP Fast. In Netdev Conference.
[19]
David D Clark. 1985. The structuring of systems using upcalls. In SOSP '85. 171--180.
[20]
Glenn William Connery, W Paul Sherer, Gary Jaszewski, and James S Binder. 1999. Offload of TCP segmentation to a smart adapter. US Patent 5,937,169. (August 1999).
[21]
Tobias Flach, Pavlos Papageorge, Andreas Terzis, Luis Pedrosa, Yuchung Cheng, Tayeb Karim, Ethan Katz-Bassett, and Ramesh Govindan. 2016. An Internet-Wide Analysis of Traffic Policing. In SIGCOMM '16. 468--482.
[22]
Yashar Ganjali and Nick McKeown. 2006. Update on Buffer Sizing in Internet Routers. SIGCOMM Comput. Commun. Rev. (2006), 67--70.
[23]
Monia Ghobadi, Yuchung Cheng, Ankur Jain, and Matt Mathis. 2012. Trickle: Rate Limiting YouTube Video Streaming. In USENIX ATC '12. 191--196.
[24]
Sangjin Han, Keon Jang, Aurojit Panda, Shoumik Palkar, Dongsu Han, and Sylvia Ratnasamy. 2015. SoftNIC: A Software NIC to Augment Hardware. Technical Report UCB/EECS-2015-155. EECS Department, University of California, Berkeley.
[25]
Chi-Yao Hong, Srikanth Kandula, Ratul Mahajan, Ming Zhang, Vijay Gill, Mohan Nanduri, and Roger Wattenhofer. 2013. Achieving High Utilization with Software-driven WAN. SIGCOMM Comput. Commun. Rev. 43, 4 (2013), 15--26.
[26]
Intel Networking Division. 2016. 82599 10 GbE Controller Datasheet. (2016).
[27]
Intel Networking Division. 2016. Ethernet Switch FM10000. (2016).
[28]
Sushant Jain, Alok Kumar, Subhasree Mandal, Joon Ong, Leon Poutievski, Arjun Singh, Subbaiah Venkata, Jim Wanderer, Junlan Zhou, Min Zhu, Jon Zolla, Urs Hölzle, Stephen Stuart, and Amin Vahdat. 2013. B4: Experience with a Globally-deployed Software Defined Wan. In SIGCOMM '13. 3--14.
[29]
Keon Jang, Justine Sherry, Hitesh Ballani, and Toby Moncaster. 2015. Silo: Predictable Message Latency in the Cloud. In SIGCOMM '15. 435--448.
[30]
Vimalkumar Jeyakumar, Mohammad Alizadeh, David Mazières, Balaji Prabhakar, Albert Greenberg, and Changhoon Kim. 2013. EyeQ: Practical Network Performance Isolation at the Edge. In NSDI '13. 297--311.
[31]
Antoine Kaufmann, SImon Peter, Naveen Kr. Sharma, Thomas Anderson, and Arvind Krishnamurthy. 2016. High Performance Packet Processing with FlexNIC. In ASPLOS '16. 67--81.
[32]
Alok Kumar, Sushant Jain, Uday Naik, Anand Raghuraman, Nikhil Kasinadhuni, Enrique Cauich Zermeno, C. Stephen Gunn, Jing Ai, Björn Carlin, Mihai Amarandei-Stavila, Mathieu Robin, Aspi Siganporia, Stephen Stuart, and Amin Vahdat. 2015. BwE: Flexible, Hierarchical Bandwidth Allocation for WAN Distributed Computing. In SIGCOMM '15. 1--14.
[33]
Gautam Kumar, Srikanth Kandula, Peter Bodik, and Ishai Menache. 2013. Virtualizing Traffic Shapers for Practical Resource Allocation. In 5th USENIX Workshop on Hot Topics in Cloud Computing.
[34]
Adam Langley, Alistair Riddoch, Alyssa Wilk, Antonio Vicente, Charles Krasic, Dan Zhang, Fan Yang, Fedor Kouranov, Ian Swett, Janardhan Iyengar, Jeff Bailey, Jeremy Dorfman, Jim Roskind, Joanna Kulik, Patrik Westin, Raman Tenneti, Robbie Shade, Ryan Hamilton, Victor Vasiliev, Wan-Teh Chang, and Zhongyi Shi. 2017. The QUIC Transport Protocol: Design and Internet-Scale Deployment. In SIGCOMM '17.
[35]
Radhika Mittal, Rachit Agarwal, Sylvia Ratnasamy, and Scott Shenker. 2016. Universal Packet Scheduling. In NSDI '16. 501--521.
[36]
Radhika Mittal, Vinh The Lam, Nandita Dukkipati, Emily Blem, Hassan Wassel, Monia Ghobadi, Amin Vahdat, Yaogong Wang, David Wetherall, and David Zats. 2015. TIMELY: RTT-based Congestion Control for the Datacenter. In SIGCOMM '15. 537--550.
[37]
Lucian Popa, Praveen Yalagandula, Sujata Banerjee, Jeffrey C. Mogul, Yoshio Turner, and Jose Renato Santos. 2013. ElasticSwitch: Practical Work-conserving Bandwidth Guarantees for Cloud Computing. SIGCOMM Comput. Commun. Rev. 43, 4 (Aug. 2013), 351--362.
[38]
Sivasankar Radhakrishnan, Yilong Geng, Vimalkumar Jeyakumar, Abdul Kabbani, George Porter, and Amin Vahdat. 2014. SENIC: Scalable NIC for End-Host Rate Limiting. In NSDI '14.
[39]
Barath Raghavan, Kashi Vishwanath, Sriram Ramabhadran, Kenneth Yocum, and Alex C. Snoeren. 2007. Cloud Control with Distributed Rate Limiting. In SIGCOMM '07. 337--348.
[40]
Rusty Russell. 2008. virtio: towards a de-facto standard for virtual I/O devices. ACM SIGOPS Operating Systems Review 42, 5 (2008), 95--103.
[41]
M. Shreedhar and George Varghese. 1995. Efficient Fair Queueing Using Deficit Round Robin. In SIGCOMM '95. 375--385.
[42]
Anirudh Sivaraman, Suvinay Subramanian, Mohammad Alizadeh, Sharad Chole, Shang-Tse Chuang, Anurag Agrawal, Hari Balakrishnan, Tom Edsall, Sachin Katti, and Nick McKeown. 2016. Programmable Packet Scheduling at Line Rate. In SIGCOMM '16. 44--57.
[43]
G. Varghese and T. Lauck. 1987. Hashed and Hierarchical Timing Wheels: Data Structures for the Efficient Implementation of a Timer Facility. In Proceedings of the Eleventh ACM Symposium on Operating Systems Principles (SOSP '87). 25--38.
[44]
L. Zhang. 1990. Virtual Clock: A New Traffic Control Algorithm for Packet Switching Networks. In SIGCOMM '90. 19--29.

Cited By

View all
  • (2024)Fast, Scalable, and Accurate Rate Limiter for RDMA NICsProceedings of the ACM SIGCOMM 2024 Conference10.1145/3651890.3672215(568-580)Online publication date: 4-Aug-2024
  • (2024)Adaptive Approximate Fair Queueing for Shared-Memory Programmable SwitchesIEEE Transactions on Network Science and Engineering10.1109/TNSE.2024.337781411:4(3563-3576)Online publication date: Jul-2024
  • (2024)Efficient Time-Series Data Delivery in IoT with XenderIEEE Transactions on Mobile Computing10.1109/TMC.2023.3296608(1-15)Online publication date: 2024
  • Show More Cited By

Index Terms

  1. Carousel: Scalable Traffic Shaping at End Hosts

    Recommendations

    Comments

    Information & Contributors

    Information

    Published In

    cover image ACM Conferences
    SIGCOMM '17: Proceedings of the Conference of the ACM Special Interest Group on Data Communication
    August 2017
    515 pages
    ISBN:9781450346535
    DOI:10.1145/3098822
    This work is licensed under a Creative Commons Attribution-ShareAlike International 4.0 License.

    Sponsors

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 07 August 2017

    Permissions

    Request permissions for this article.

    Check for updates

    Author Tags

    1. Backpressure
    2. Pacing
    3. Rate-limiters
    4. Timing Wheel
    5. Traffic shaping

    Qualifiers

    • Research-article
    • Research
    • Refereed limited

    Conference

    SIGCOMM '17
    Sponsor:
    SIGCOMM '17: ACM SIGCOMM 2017 Conference
    August 21 - 25, 2017
    CA, Los Angeles, USA

    Acceptance Rates

    Overall Acceptance Rate 462 of 3,389 submissions, 14%

    Contributors

    Other Metrics

    Bibliometrics & Citations

    Bibliometrics

    Article Metrics

    • Downloads (Last 12 months)719
    • Downloads (Last 6 weeks)88
    Reflects downloads up to 26 Sep 2024

    Other Metrics

    Citations

    Cited By

    View all
    • (2024)Fast, Scalable, and Accurate Rate Limiter for RDMA NICsProceedings of the ACM SIGCOMM 2024 Conference10.1145/3651890.3672215(568-580)Online publication date: 4-Aug-2024
    • (2024)Adaptive Approximate Fair Queueing for Shared-Memory Programmable SwitchesIEEE Transactions on Network Science and Engineering10.1109/TNSE.2024.337781411:4(3563-3576)Online publication date: Jul-2024
    • (2024)Efficient Time-Series Data Delivery in IoT with XenderIEEE Transactions on Mobile Computing10.1109/TMC.2023.3296608(1-15)Online publication date: 2024
    • (2024)gPerfIsol: GNN-Based Rate-Limits Allocation for Performance Isolation in Multi-Tenant Cloud2024 27th Conference on Innovation in Clouds, Internet and Networks (ICIN)10.1109/ICIN60470.2024.10494419(194-201)Online publication date: 11-Mar-2024
    • (2024)LibPreemptible: Enabling Fast, Adaptive, and Hardware-Assisted User-Space Scheduling2024 IEEE International Symposium on High-Performance Computer Architecture (HPCA)10.1109/HPCA57654.2024.00075(922-936)Online publication date: 2-Mar-2024
    • (2023)Formal Abstractions for Packet SchedulingProceedings of the ACM on Programming Languages10.1145/36228457:OOPSLA2(1338-1362)Online publication date: 16-Oct-2023
    • (2023)BMW Tree: Large-scale, High-throughput and Modular PIFO Implementation using Balanced Multi-Way Sorting TreeProceedings of the ACM SIGCOMM 2023 Conference10.1145/3603269.3604862(208-219)Online publication date: 10-Sep-2023
    • (2023)Augmented Queue: A Scalable In-Network Abstraction for Data Center Network SharingProceedings of the ACM SIGCOMM 2023 Conference10.1145/3603269.3604858(305-318)Online publication date: 10-Sep-2023
    • (2023)Sammy: smoothing video traffic to be a friendly internet neighborProceedings of the ACM SIGCOMM 2023 Conference10.1145/3603269.3604839(754-768)Online publication date: 10-Sep-2023
    • (2023)Accurate and Scalable Rate Limiter for RDMA NICsProceedings of the 7th Asia-Pacific Workshop on Networking10.1145/3600061.3600078(15-20)Online publication date: 29-Jun-2023
    • Show More Cited By

    View Options

    View options

    PDF

    View or Download as a PDF file.

    PDF

    eReader

    View online with eReader.

    eReader

    Get Access

    Login options

    Media

    Figures

    Other

    Tables

    Share

    Share

    Share this Publication link

    Share on social media