research-article

MARCO: A High-performance Task Mapping and Routing Co-optimization Framework for Point-to-Point NoC-based Heterogeneous Computing Systems

Authors:

Weichen LiuAuthors Info & Claims

ACM Transactions on Embedded Computing Systems (TECS), Volume 20, Issue 5s

Article No.: 54, Pages 1 - 21

https://doi.org/10.1145/3476985

Published: 17 September 2021 Publication History

Get Access

Abstract

Heterogeneous computing systems (HCSs), which consist of various processing elements (PEs) that vary in their processing ability, are usually facilitated by the network-on-chip (NoC) to interconnect its components. The emerging point-to-point NoCs which support single-cycle-multi-hop transmission, reduce or eliminate the latency dependence on distance, addressing the scalability concern raised by high latency for long-distance transmission and enlarging the design space of the routing algorithm to search the non-shortest paths. For such point-to-point NoC-based HCSs, resource management strategies which are managed by compilers, scheduler, or controllers, e.g., mapping and routing, are complicated for the following reasons: (i) Due to the heterogeneity, mapping and routing need to optimize computation and communication concurrently (for homogeneous computing systems, only communication). (ii) Conducting mapping and routing consecutively cannot minimize the schedule length in most cases since the PEs with high processing ability may locate in the crowded area and suffer from high resource contention overhead. (iii) Since changing the mapping selection of one task will reconstruct the whole routing design space, the exploration of mapping and routing design space is challenging. Therefore, in this work, we propose MARCO, the mapping and routing co-optimization framework, to decrease the schedule length of applications on point-to-point NoC-based HCSs. Specifically, we revise the tabu search to explore the design space and evaluate the quality of mapping and routing. The advanced reinforcement learning (RL)algorithm, i.e., advantage actor-critic, is adopted to efficiently compute paths. We perform extensive experiments on various real applications, which demonstrates that the MARCO achieves a remarkable performance improvement in terms of schedule length (+44.94% ∼ +50.18%) when compared with the state-of-the-art mapping and routing co-optimization algorithm for homogeneous computing systems. We also compare MARCO with different combinations of state-of-the-art mapping and routing approaches.

References

[1]

Shoukat Ali, Howard Jay Siegel, Muthucumaru Maheswaran, Debra Hensgen, and Sahra Ali. 2000. Task execution time modeling for heterogeneous computing systems. Proceedings of the Heterogeneous Computing Workshop, HCW (2000), 185–199. https://doi.org/10.1109/hcw.2000.843743

Abstract

References

Cited By

Index Terms

Recommendations

Bandwidth-Constrained Mapping of Cores onto NoC Architectures

Dynamic thread and data mapping for NoC based CMPs

Link-load balance aware mapping and routing for NoC

Comments

Information

Published In

Publisher

Journal Family

Publication History

Permissions

Check for updates

Author Tags

Qualifiers

Funding Sources

Contributors

Other Metrics

Bibliometrics

Article Metrics

Other Metrics

Citations

Cited By

Login options

Full Access

View options

PDF

eReader

HTML Format

Share

Share this Publication link

Share on social media

Affiliations