research-article

Scalable Bandwidth Shaping Scheme via Adaptively Managed Parallel Heaps in Manycore-Based Network Processors

Published: 20 May 2017

Abstract

The scalability of network-processor-based routers depends heavily on the limits imposed by memory accesses and the associated power consumption. Bandwidth shaping of a flow is a key function that requires a token bucket per output queue and consumes a large share of memory bandwidth. As the number of output queues increases, managing token buckets becomes prohibitively expensive and limits scalability. In this work, we propose a scalable software-based token bucket management scheme that significantly reduces memory accesses and power consumption. To satisfy real-time and low-cost constraints, we propose novel parallel heap data structures running on a manycore-based network processor. With cache locking, heap processing becomes significantly faster and more predictable. In addition, we quantitatively analyze the performance and memory footprint of the proposed software scheme using stochastic modeling and the Lyapunov central limit theorem. Finally, the proposed scheme provides an adaptive method to limit the size of the heaps when queues are oversubscribed, which successfully isolates queues showing non-ideal behavior. The proposed scheme reduces memory accesses by up to three orders of magnitude for one million queues sharing a 100 Gbps router interface while maintaining stability under stressful scenarios.
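The idea of combining token buckets with a heap can be illustrated with a small, generic sketch. It is not the parallel-heap scheme of the article, whose details are not given in the abstract. It only shows one plausible reason a heap helps: buckets are refilled lazily when touched, and only blocked queues are kept in a min-heap keyed by the tick at which their head packet becomes eligible. The names `TokenBucket`, `Shaper`, `submit`, and `release` are hypothetical. Time is measured in integer ticks, and every packet is assumed to be no larger than the burst size.

```python
import heapq
from collections import deque


class TokenBucket:
    def __init__(self, rate, burst):
        self.rate = rate        # bytes credited per tick (assumed > 0)
        self.burst = burst      # bucket depth in bytes
        self.tokens = burst     # bucket starts full
        self.last = 0           # tick of the most recent refill
        self.pending = deque()  # packet sizes waiting for credit (FIFO)

    def refill(self, now):
        # Lazy refill: credit everything accrued since the last touch.
        self.tokens = min(self.burst,
                          self.tokens + self.rate * (now - self.last))
        self.last = now

    def wake_tick(self):
        # Earliest tick at which the head packet conforms (integer ceiling).
        deficit = self.pending[0] - self.tokens
        return self.last + max(0, -(-deficit // self.rate))


class Shaper:
    """Holds at most one heap entry per blocked queue."""

    def __init__(self):
        self.buckets = {}
        self.heap = []  # entries are (wake_tick, queue_id)

    def add_queue(self, qid, rate, burst):
        self.buckets[qid] = TokenBucket(rate, burst)

    def submit(self, qid, size, now):
        """Return True if the packet may be sent immediately."""
        b = self.buckets[qid]
        if b.pending:  # queue already blocked: preserve FIFO order
            b.pending.append(size)
            return False
        b.refill(now)
        if b.tokens >= size:
            b.tokens -= size
            return True
        b.pending.append(size)
        heapq.heappush(self.heap, (b.wake_tick(), qid))
        return False

    def release(self, now):
        """Send every pending packet that has become eligible by `now`."""
        sent = []
        while self.heap and self.heap[0][0] <= now:
            _, qid = heapq.heappop(self.heap)
            b = self.buckets[qid]
            b.refill(now)
            while b.pending and b.tokens >= b.pending[0]:
                size = b.pending.popleft()
                b.tokens -= size
                sent.append((qid, size))
            if b.pending:  # still short of credit: re-arm at a later tick
                heapq.heappush(self.heap, (b.wake_tick(), qid))
        return sent
```

In this sketch, a call to `release` touches only the queues whose wake-up tick has passed, so idle and conforming queues cost no periodic memory accesses. The heap grows with the number of blocked queues, which is the quantity an adaptive size limit would need to bound.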

Supplementary Material

a59-kim-apndx.pdf (kim.zip)
Supplemental movie, appendix, image, and software files for Scalable Bandwidth Shaping Scheme via Adaptively Managed Parallel Heaps in Manycore-Based Network Processors


Published In

ACM Transactions on Design Automation of Electronic Systems, Volume 22, Issue 4
October 2017
430 pages
ISSN:1084-4309
EISSN:1557-7309
DOI:10.1145/3097980
  • Editor:
  • Naehyuck Chang
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 20 May 2017
Accepted: 01 January 2017
Revised: 01 November 2016
Received: 01 February 2016
Published in TODAES Volume 22, Issue 4

Author Tags

  1. Manycore
  2. adaptive control
  3. heap tree
  4. network processor
  5. stochastic modeling
  6. token bucket

Qualifiers

  • Research-article
  • Research
  • Refereed

Funding Sources

  • Korean government
  • National Research Foundation of Korea (NRF)
  • ICT R&D program of MSIP/IITP

Bibliometrics

Article Metrics

  • Total Citations: 0
  • Total Downloads: 146
  • Downloads (Last 12 months): 10
  • Downloads (Last 6 weeks): 2
Reflects downloads up to 10 Nov 2024
