DOI: 10.1145/3337821.3337877
research-article

Express Link Placement for NoC-Based Many-Core Platforms

Published: 05 August 2019

Abstract

With the integration of up to hundreds of cores in recent general-purpose processors that can be used in parallel processing systems, it is critical to design scalable and low-latency networks-on-chip (NoCs) to support various on-chip communications. An effective way to reduce on-chip latency and improve network scalability is to add express links between pairs of non-adjacent routers. However, increasing the number of express links may result in smaller bandwidth per link due to the limited total bisection bandwidth on chip, thus leading to higher serialization latency of packets in the network. Unlike previous works on application-specific designs or on fixed placement of express links, this paper aims at finding effective placement of express links for general-purpose processors considering all the possible placement options. We formulate the problem mathematically and propose an efficient algorithm that utilizes an initial solution generation heuristic and enhanced candidate generator in simulated annealing. Evaluation on 4x4, 8x8 and 16x16 networks using multi-threaded PARSEC benchmarks and various synthetic traffic patterns shows significant reduction of average packet latency over previous works.
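
The trade-off described in the abstract can be illustrated with a minimal sketch. It is not the paper's algorithm: it uses plain simulated annealing that starts from an empty placement and proposes uniformly random moves, whereas the paper adds an initial solution generation heuristic and an enhanced candidate generator that are not detailed here. The cost model is also an assumption made for illustration: average hop count plus a fixed penalty per express link, standing in for the higher serialization latency caused by narrower links. All function names, the restriction to same-row or same-column links, and the parameter values are hypothetical.

```python
import itertools
import math
import random
from collections import deque


def mesh_edges(k):
    """Edges of a k x k mesh; routers are numbered in row-major order."""
    edges = set()
    for r in range(k):
        for c in range(k):
            n = r * k + c
            if c + 1 < k:
                edges.add((n, n + 1))
            if r + 1 < k:
                edges.add((n, n + k))
    return edges


def candidate_links(k):
    """Assumption: express links join non-adjacent routers in one row or column."""
    cands = []
    for a, b in itertools.combinations(range(k * k), 2):
        ra, ca, rb, cb = a // k, a % k, b // k, b % k
        if (ra == rb and abs(ca - cb) > 1) or (ca == cb and abs(ra - rb) > 1):
            cands.append((a, b))
    return cands


def avg_hops(k, express):
    """Average shortest-path hop count over all ordered router pairs (BFS)."""
    n = k * k
    adj = [[] for _ in range(n)]
    for a, b in mesh_edges(k) | set(express):
        adj[a].append(b)
        adj[b].append(a)
    total = 0
    for s in range(n):
        dist = [-1] * n
        dist[s] = 0
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if dist[v] < 0:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist)
    return total / (n * (n - 1))


def cost(k, express, ser_weight=0.05):
    """Toy latency proxy: hop count plus a per-link serialization penalty."""
    return avg_hops(k, express) + ser_weight * len(express)


def anneal(k, max_links, steps=2000, t0=1.0, alpha=0.995, seed=0):
    """Plain simulated annealing over sets of at most max_links express links."""
    rng = random.Random(seed)
    cands = candidate_links(k)
    current = frozenset()
    cur_cost = cost(k, current)
    best, best_cost = current, cur_cost
    if max_links <= 0 or not cands:
        return best, best_cost
    t = t0
    for _ in range(steps):
        link = rng.choice(cands)
        if link in current:
            nxt = current - {link}                    # remove a link
        elif len(current) < max_links:
            nxt = current | {link}                    # add a link
        else:
            drop = rng.choice(sorted(current))
            nxt = (current - {drop}) | {link}         # swap one link for another
        c = cost(k, nxt)
        # Always accept improvements; accept worse moves with Boltzmann probability.
        if c <= cur_cost or rng.random() < math.exp((cur_cost - c) / t):
            current, cur_cost = nxt, c
            if c < best_cost:
                best, best_cost = nxt, c
        t *= alpha                                    # geometric cooling
    return best, best_cost
```

Calling `anneal(4, max_links=4)` returns a set of express links and its cost under this toy model. Raising `ser_weight` makes each additional link more expensive, which mimics the bandwidth-per-link penalty the abstract mentions and pushes the search toward fewer express links.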


Information & Contributors

Information

Published In

ICPP '19: Proceedings of the 48th International Conference on Parallel Processing
August 2019
1107 pages
ISBN:9781450362955
DOI:10.1145/3337821
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

In-Cooperation

  • University of Tsukuba

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

ICPP 2019

Acceptance Rates

Overall Acceptance Rate 91 of 313 submissions, 29%

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Total Citations: 0
  • Total Downloads: 95
  • Downloads (Last 12 months): 7
  • Downloads (Last 6 weeks): 0
Reflects downloads up to 15 Jan 2025
