Guaranteed Services of the NoC of a Manycore Processor

Authors:

Benoît Dupont de Dinechin,

Duco van Amstel,

Alexandre Ghiti

NoCArc '14: Proceedings of the 2014 International Workshop on Network on Chip Architectures

Pages 11 - 16

https://doi.org/10.1145/2685342.2685344

Published: 13 December 2014

Abstract

The Kalray MPPA®-256 processor (Multi-Purpose Processing Array) integrates 256 processing engine (PE) cores and 32 resource management (RM) cores on a single 28nm CMOS chip. These cores are distributed across 16 compute clusters and 4 I/O subsystems. On-chip communications and synchronization are supported by an explicitly routed dual data & control network-on-chip (NoC), with one node per compute cluster and 4 nodes per I/O subsystem, for a total of 32 nodes. The data NoC is dedicated to streaming data transfers and may operate with guaranteed services, thanks to non-blocking routers and flow regulation at the source node. Its architecture has been designed so that (σ, ρ) network calculus applies with minimal approximations.
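As a minimal illustration of the (σ, ρ) model mentioned above, the following Python sketch checks whether an arrival trace respects a burst σ and a rate ρ, and computes the standard network calculus delay bound for such a flow crossing a rate-latency server. The function names, the per-cycle trace representation and the rate-latency service model are assumptions made for this example; they are not taken from the paper.

```python
from fractions import Fraction


def conforms(arrivals, sigma, rho):
    """Check that a per-cycle arrival trace is (sigma, rho)-constrained.

    arrivals[t] is the amount of data (for example flits) injected in cycle t.
    The trace conforms if every window of w consecutive cycles carries
    at most sigma + rho * w units.
    """
    n = len(arrivals)
    for start in range(n):
        total = 0
        for end in range(start, n):
            total += arrivals[end]
            if total > sigma + rho * (end - start + 1):
                return False
    return True


def delay_bound(sigma, rho, rate, latency):
    """Worst-case delay of a (sigma, rho) flow through a rate-latency server.

    The server is assumed to offer the service curve
    beta(t) = rate * max(0, t - latency). The bound latency + sigma / rate
    holds only when rho does not exceed rate.
    """
    if rho > rate:
        raise ValueError("flow rate exceeds service rate: delay is unbounded")
    return latency + Fraction(sigma) / Fraction(rate)
```

For instance, `delay_bound(4, Fraction(1, 2), 1, 3)` evaluates to 7: three cycles of latency plus four cycles to drain a burst of four units at one unit per cycle.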

Given a set of flows across this data NoC with predetermined routes, we formulate the problem of guaranteeing fair allocation of bandwidth across flows and we present bounds on the maximum transfer latency. By considering the architecture of the data NoC and by introducing conservative approximations, we show how this formulation can be transformed into a linear program. Solving this linear program is efficient, and on problem instances obtained from the cyclostatic dataflow compilation toolchain of the Kalray MPPA®-256 processor, the quality of its solutions appears comparable to that of the original formulation.
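To make the bandwidth allocation problem concrete, here is a small Python sketch that computes a max-min fair allocation for flows with fixed routes, using the textbook progressive filling procedure. This is not the linear program of the paper, and max-min fairness is only one possible reading of "fair allocation"; the function name, the dictionary-based inputs and the assumption that every link on a route has a known capacity are all choices made for this example.

```python
from fractions import Fraction


def max_min_fair(routes, capacity):
    """Progressive filling: raise all unfrozen flow rates together
    until a link saturates, then freeze the flows crossing that link.

    routes   : dict mapping a flow name to the list of links it traverses
    capacity : dict mapping a link name to its capacity
    Returns a dict mapping each flow name to its allocated rate.
    """
    rate = {f: Fraction(0) for f in routes}
    active = {f for f in routes if routes[f]}  # flows still being raised
    residual = {l: Fraction(c) for l, c in capacity.items()}
    while active:
        # Number of active flows crossing each link.
        load = {l: sum(1 for f in active if l in routes[f]) for l in residual}
        # Largest equal increment that every loaded link can still absorb.
        step = min(residual[l] / n for l, n in load.items() if n > 0)
        for f in active:
            rate[f] += step
        for l, n in load.items():
            residual[l] -= step * n
        # Freeze every flow that crosses a link that is now saturated.
        saturated = {l for l, n in load.items() if n > 0 and residual[l] == 0}
        active = {f for f in active if not saturated & set(routes[f])}
    return rate
```

With routes `{"a": ["l1", "l2"], "b": ["l1"], "c": ["l2"]}` and capacities `{"l1": 1, "l2": 2}`, flows a and b each receive 1/2 because they share the saturated link l1, and flow c receives the remaining 3/2 on l2.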


Published In

NoCArc '14: Proceedings of the 2014 International Workshop on Network on Chip Architectures

December 2014, 63 pages

ISBN: 9781450330640

DOI: 10.1145/2685342

General Chairs: Farhad Mehdipour (Kyushu University, Japan), Giorgos Dimitrakopoulos (Democritus University of Thrace, Greece)

Copyright © 2014 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article
Research
Refereed limited

Conference

NoCArc '14

NoCArc '14: International Workshop on Network on Chip Architectures

December 13 - 14, 2014

Cambridge, United Kingdom

Acceptance Rates

NoCArc '14 Paper Acceptance Rate 9 of 22 submissions, 41%;

Overall Acceptance Rate 46 of 122 submissions, 38%
