Research article, DOI: 10.1145/3061639.3062323

Task Mapping on SMART NoC: Contention Matters, Not the Distance

Published: 18 June 2017

Abstract

On-chip communication is the performance bottleneck of NoC-based MPSoCs. SMART, a recently proposed NoC architecture, enables single-cycle multi-hop communication. In a SMART NoC, messages that encounter no conflict traverse an express bypass, which significantly improves communication efficiency, whereas conflicting messages must be buffered to guarantee delivery and incur extra delay. The performance of a SMART NoC may therefore degrade seriously as communication contention increases. In this paper, we present task mapping techniques that address this problem by considering communication contention rather than inter-processor distance, minimizing conflicts and thus maximizing bypass utilization. We first model the entire problem with ILP formulations to find the theoretically optimal solution, and then propose polynomial-time algorithms for contention-aware task mapping and message priority assignment. Communicating tasks can be mapped to distant processors in SMART NoCs as long as conflict-free communication paths can be established and the bypass can be enabled. Evaluation results on real benchmarks show average improvements of 44.1% in communication efficiency and 32.8% in application performance over state-of-the-art techniques. The proposed heuristic algorithms differ in performance from the ILP model by only 1.9% and scale better to large NoCs.
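The central claim, that contention rather than distance determines whether a mapping suits a SMART NoC, can be illustrated with a small Python sketch. It is not the paper's ILP model or heuristic. It assumes a 2D mesh with XY routing, treats two messages as conflicting whenever their routes share a directed link, and ignores message timing, priorities, and any limit on bypass length. The names `xy_route` and `evaluate`, and the two example mappings, are hypothetical.

```python
from itertools import combinations

def xy_route(src, dst):
    """Directed links used by XY routing from src to dst on a 2D mesh (assumed routing)."""
    (x, y), (dx, dy) = src, dst
    links = []
    while x != dx:                      # travel along X first
        nx = x + (1 if dx > x else -1)
        links.append(((x, y), (nx, y)))
        x = nx
    while y != dy:                      # then along Y
        ny = y + (1 if dy > y else -1)
        links.append(((x, y), (x, ny)))
        y = ny
    return links

def evaluate(mapping, messages):
    """Return (total_hops, conflicting_pairs) for a task-to-tile mapping.

    A pair of messages counts as conflicting if their routes share at least
    one directed link; timing is ignored (a simplifying assumption).
    """
    routes = [set(xy_route(mapping[s], mapping[d])) for s, d in messages]
    total_hops = sum(len(r) for r in routes)
    conflicts = sum(1 for a, b in combinations(routes, 2) if a & b)
    return total_hops, conflicts

# Two messages between hypothetical tasks t0..t3.
messages = [("t0", "t1"), ("t2", "t3")]

# Distance-oriented mapping: short routes that overlap on link (1,0)->(2,0).
near = {"t0": (0, 0), "t1": (2, 0), "t2": (1, 0), "t3": (3, 0)}

# Contention-oriented mapping: longer routes on separate rows, no shared link.
far = {"t0": (0, 0), "t1": (3, 0), "t2": (0, 1), "t3": (3, 1)}

print(evaluate(near, messages))  # (4, 1)
print(evaluate(far, messages))   # (6, 0)
```

Under these assumptions the `far` mapping uses more hops in total but produces no shared link, so both messages could use the bypass, while the `near` mapping forces one of them to be buffered.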


Published In

DAC '17: Proceedings of the 54th Annual Design Automation Conference 2017
June 2017
533 pages
ISBN: 9781450349277
DOI: 10.1145/3061639
Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Acceptance Rates

Overall Acceptance Rate 1,770 of 5,499 submissions, 32%


Cited By

  • (2023) Layer-Puzzle: Allocating and Scheduling Multi-task on Multi-core NPUs by Using Layer Heterogeneity. 2023 Design, Automation & Test in Europe Conference & Exhibition (DATE), pp. 1-6. DOI: 10.23919/DATE56975.2023.10137320. Online publication date: Apr-2023
  • (2023) FIONA: Fine-grained Incoherent Optical DNN Accelerator Search for Superior Efficiency and Robustness. 2023 60th ACM/IEEE Design Automation Conference (DAC), pp. 1-6. DOI: 10.1109/DAC56929.2023.10247725. Online publication date: 9-Jul-2023
  • (2022) LAMP: Load-Balanced Multipath Parallel Transmission in Point-to-Point NoCs. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 41:12, pp. 5232-5245. DOI: 10.1109/TCAD.2022.3151021. Online publication date: Dec-2022
  • (2022) ArSMART: An Improved SMART NoC Design Supporting Arbitrary-Turn Transmission. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 41:5, pp. 1316-1329. DOI: 10.1109/TCAD.2021.3091961. Online publication date: May-2022
  • (2022) A communication-aware and predictive list scheduling algorithm for network-on-chip based heterogeneous muti-processor system-on-chip. Microelectronics Journal, 121:C. DOI: 10.1016/j.mejo.2022.105367. Online publication date: 1-Mar-2022
  • (2021) A Latency-Optimized Network-on-Chip with Rapid Bypass Channels. Micromachines, 12:6, article 621. DOI: 10.3390/mi12060621. Online publication date: 27-May-2021
  • (2021) SMT-Based Contention-Free Task Mapping and Scheduling on 2D/3D SMART NoC with Mixed Dimension-Order Routing. ACM Transactions on Architecture and Code Optimization, 19:1, pp. 1-21. DOI: 10.1145/3487018. Online publication date: 6-Dec-2021
  • (2021) MARCO: A High-performance Task Mapping and Routing Co-optimization Framework for Point-to-Point NoC-based Heterogeneous Computing Systems. ACM Transactions on Embedded Computing Systems, 20:5s, pp. 1-21. DOI: 10.1145/3476985. Online publication date: 17-Sep-2021
  • (2021) On the Design of Minimal-Cost Pipeline Systems Satisfying Hard/Soft Real-Time Constraints. IEEE Transactions on Emerging Topics in Computing, 9:1, pp. 24-34. DOI: 10.1109/TETC.2017.2788800. Online publication date: 1-Jan-2021
  • (2021) Reduced Worst-Case Communication Latency Using Single-Cycle Multihop Traversal Network-on-Chip. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 40:7, pp. 1381-1394. DOI: 10.1109/TCAD.2020.3015440. Online publication date: Jul-2021
