Scalable Bandwidth Shaping Scheme via Adaptively Managed Parallel Heaps in Manycore-Based Network Processors

Authors:

Eui-Young Chung,

Hyuk-Jun Lee

ACM Transactions on Design Automation of Electronic Systems (TODAES), Volume 22, Issue 4

Article No.: 59, Pages 1 - 26

https://doi.org/10.1145/3065926

Published: 20 May 2017

Abstract

The scalability of network processor-based routers depends heavily on the limitations imposed by memory accesses and the associated power consumption. Bandwidth shaping of a flow is a key function that requires a token bucket per output queue and consumes substantial memory bandwidth. As the number of output queues increases, managing token buckets becomes prohibitively expensive and limits scalability. In this work, we propose a scalable software-based token bucket management scheme that can significantly reduce memory accesses and power consumption. To satisfy real-time and low-cost constraints, we propose novel parallel heap data structures running on a manycore-based network processor. By using cache locking, the performance of heap processing is significantly enhanced and becomes more predictable. In addition, we quantitatively analyze the performance and memory footprint of the proposed software scheme using stochastic modeling and the Lyapunov central limit theorem. Finally, the proposed scheme provides an adaptive method to limit the size of the heaps in the case of oversubscribed queues, which can successfully isolate the queues showing non-ideal behavior. The proposed scheme reduces memory accesses by up to three orders of magnitude for one million queues sharing a 100 Gbps interface of the router while maintaining stability under stressful scenarios.
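The abstract does not describe the heap layout or the parallelization, so the following is only a minimal single-threaded sketch of the general idea, not the authors' scheme. It assumes a classic token bucket per queue plus one min-heap keyed by the time at which a blocked queue will next have enough tokens, so that only queues near the top of the heap need to be touched instead of scanning every bucket. All names (`TokenBucket`, `HeapShaper`, `offer`, `wake`) are hypothetical.

```python
import heapq


class TokenBucket:
    """Classic token bucket: tokens accrue at `rate` bytes/s, capped at `burst` bytes."""

    def __init__(self, rate, burst):
        self.rate = rate
        self.burst = burst
        self.tokens = burst   # start with a full bucket
        self.last = 0.0       # time of the last refill

    def try_send(self, now, size):
        """Return None if `size` bytes may be sent now, else the earliest send time."""
        # Lazily refill based on elapsed time; no periodic per-bucket update is needed.
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= size:
            self.tokens -= size
            return None
        return now + (size - self.tokens) / self.rate


class HeapShaper:
    """Hypothetical shaper: blocked queues wait in a min-heap ordered by ready time."""

    def __init__(self):
        self.buckets = {}
        self.blocked = []     # min-heap of (ready_time, queue_id)

    def add_queue(self, qid, rate, burst):
        self.buckets[qid] = TokenBucket(rate, burst)

    def offer(self, qid, now, size):
        """Try to send `size` bytes from queue `qid`; park the queue if it lacks tokens."""
        ready = self.buckets[qid].try_send(now, size)
        if ready is None:
            return True
        heapq.heappush(self.blocked, (ready, qid))
        return False

    def wake(self, now):
        """Pop every queue whose ready time has passed; only the heap top is inspected."""
        woken = []
        while self.blocked and self.blocked[0][0] <= now:
            woken.append(heapq.heappop(self.blocked)[1])
        return woken


shaper = HeapShaper()
shaper.add_queue(1, rate=100.0, burst=100.0)
sent_first = shaper.offer(1, now=0.0, size=100)   # uses the whole initial burst
sent_second = shaper.offer(1, now=0.0, size=50)   # bucket is empty, queue is parked
early = shaper.wake(0.25)                         # too early, nothing is ready
on_time = shaper.wake(0.5)                        # 50 bytes have accrued by t = 0.5 s
```

In this sketch the cost of a `wake` call depends on how many queues become ready rather than on the total number of queues, which is the property that makes a heap attractive when the queue count is large. Cache locking, parallel heaps, and the adaptive limit on heap size mentioned in the abstract are outside its scope.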

Supplementary Material

a59-kim-apndx.pdf (kim.zip)

Supplemental movie, appendix, image, and software files for "Scalable Bandwidth Shaping Scheme via Adaptively Managed Parallel Heaps in Manycore-Based Network Processors"

Published In

ACM Transactions on Design Automation of Electronic Systems Volume 22, Issue 4

October 2017

430 pages

ISSN:1084-4309

EISSN:1557-7309

DOI:10.1145/3097980

Editor:
Naehyuck Chang
Korea Advanced Institute of Science and Technology, Korea

Copyright © 2017 ACM.

Publisher

Association for Computing Machinery

New York, NY, United States

Journal Family

ACM Journals for the Design of Smart and Connected Systems

Publication History

Published: 20 May 2017

Accepted: 01 January 2017

Revised: 01 November 2016

Received: 01 February 2016

Qualifiers

Research-article
Research
Refereed

Funding Sources

Korean government
National Research Foundation of Korea (NRF)
ICT R&D program of MSIP/IITP
