DOI: 10.1145/2155620.2155631
research-article

Packet chaining: efficient single-cycle allocation for on-chip networks

Published: 03 December 2011

Abstract

This paper introduces packet chaining, a simple and effective method to increase allocator matching efficiency and hence network performance, particularly suited to networks with short packets and short cycle times. Packet chaining operates by chaining together packets destined to the same output, so that a new packet reuses the switch connection of a departing packet. This allows an allocator to build up an efficient matching over a number of cycles, as incremental allocation does, but without being limited by packet length. For a 64-node 2D mesh at maximum injection rate and with single-flit packets, packet chaining increases network throughput by 15% compared to a highly-tuned router using a conventional single-iteration separable iSLIP allocator, and outperforms significantly more complex allocators. Specifically, it outperforms multiple-iteration iSLIP allocators and wavefront allocators by 10% and 6% respectively, and achieves throughput comparable to that of an augmenting paths allocator. Packet chaining achieves this performance with a cycle time comparable to that of a single-iteration separable allocator. Packet chaining also reduces average network latency by 22.5% compared to a single-iteration iSLIP allocator. Finally, packet chaining increases IPC by up to 46% (16% on average) for application benchmarks, because short packets are critical in a typical cache-coherent chip multiprocessor.
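The idea in the abstract can be illustrated with a small sketch. The code below is not the authors' implementation; it is a minimal model under stated assumptions: single-flit packets, one switch with `n` inputs and `n` outputs, a single-iteration input-first separable allocator with round-robin arbiters and iSLIP-style pointer updates, and chaining restricted to packets waiting at the same input as the departing packet. The names `rr_pick`, `allocate`, `requests`, `held`, `in_ptr`, and `out_ptr` are hypothetical and chosen for illustration only.

```python
def rr_pick(candidates, pointer, n):
    """Round-robin arbiter: return the first candidate at or after
    `pointer` (wrapping modulo n), or None if there are no candidates."""
    for k in range(n):
        idx = (pointer + k) % n
        if idx in candidates:
            return idx
    return None


def allocate(requests, held, in_ptr, out_ptr):
    """One allocation cycle (illustrative sketch, not the paper's design).

    requests[i] -- set of outputs for which input i has a waiting packet
    held        -- {input: output} connections granted in the previous cycle
                   (assumed to be a valid matching)
    in_ptr, out_ptr -- round-robin pointers, updated in place on a grant
    Returns {input: output} grants for this cycle.
    """
    n = len(requests)
    grants = {}

    # Chaining step (assumption: same-input chaining only). A connection
    # from the previous cycle is reused if its input has another packet
    # for the same output, so it bypasses arbitration entirely.
    for i, o in held.items():
        if o in requests[i]:
            grants[i] = o
    busy_outputs = set(grants.values())

    # Input stage: each unchained input picks one free requested output.
    picks = {}  # output -> set of inputs that picked it
    for i in range(n):
        if i in grants:
            continue
        o = rr_pick(requests[i] - busy_outputs, in_ptr[i], n)
        if o is not None:
            picks.setdefault(o, set()).add(i)

    # Output stage: each contested output grants one of its requesters.
    for o, inputs in picks.items():
        i = rr_pick(inputs, out_ptr[o], n)
        grants[i] = o
        in_ptr[i] = (o + 1) % n   # advance pointers only on a grant
        out_ptr[o] = (i + 1) % n
    return grants


reqs = [{0, 1}, {0}, {2}]

# Without a held connection, inputs 0 and 1 both pick output 0 and one loses.
no_chain = allocate(reqs, {}, [0, 0, 0], [0, 0, 0])
print(sorted(no_chain.items()))   # [(0, 0), (2, 2)]

# With connection 1 -> 0 chained, input 0 is steered to output 1 instead.
chained = allocate(reqs, {1: 0}, [0, 0, 0], [0, 0, 0])
print(sorted(chained.items()))    # [(0, 1), (1, 0), (2, 2)]
```

In this toy case the chained connection removes output 0 from contention before arbitration, so the single iteration finds a matching of size three instead of two. The sketch ignores fairness: a connection that is chained indefinitely could starve other inputs, so a practical design would need some way to bound or break chains.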


Published In

MICRO-44: Proceedings of the 44th Annual IEEE/ACM International Symposium on Microarchitecture
December 2011
519 pages
ISBN:9781450310536
DOI:10.1145/2155620
  • Conference Chair: Carlo Galuzzi
  • General Chair: Luigi Carro
  • Program Chairs: Andreas Moshovos, Milos Prvulovic
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. allocation
  2. iterations
  3. on-chip networks
  4. packet chaining

Qualifiers

  • Research-article

Conference

MICRO-44
Acceptance Rates

Overall Acceptance Rate 484 of 2,242 submissions, 22%

Cited By

  • (2024) AS-Router: A novel allocation service for efficient Network-on-Chip. Engineering Science and Technology, an International Journal, 50 (101607). DOI: 10.1016/j.jestch.2023.101607. Online publication date: Feb-2024.
  • (2022) MUA-Router: Maximizing the Utility-of-Allocation for On-chip Pipelining Routers. ACM Transactions on Architecture and Code Optimization, 19:3 (1-23). DOI: 10.1145/3519027. Online publication date: 4-May-2022.
  • (2022) Object Intersection Captures on Interactive Apps to Drive a Crowd-sourced Replay-based Compiler Optimization. ACM Transactions on Architecture and Code Optimization, 19:3 (1-25). DOI: 10.1145/3517338. Online publication date: 4-May-2022.
  • (2022) SIMD-Matcher: A SIMD-based Arbitrary Matching Framework. ACM Transactions on Architecture and Code Optimization, 19:3 (1-20). DOI: 10.1145/3514246. Online publication date: 4-May-2022.
  • (2022) Revisiting network congestion avoidance through adaptive packet-chaining reservation. Computer Networks: The International Journal of Computer and Telecommunications Networking, 212:C. DOI: 10.1016/j.comnet.2022.109008. Online publication date: 20-Jul-2022.
  • (2022) Design and Area Performance Energy Consumption Comparison of Secured Network-on-Chip with PTP and Bus Interconnections. Journal of The Institution of Engineers (India): Series B, 103:5 (1479-1491). DOI: 10.1007/s40031-022-00735-5. Online publication date: 30-May-2022.
  • (2021) ALPHA: A Learning-Enabled High-Performance Network-on-Chip Router Design for Heterogeneous Manycore Architectures. IEEE Transactions on Sustainable Computing, 6:2 (274-288). DOI: 10.1109/TSUSC.2020.2981340. Online publication date: 1-Apr-2021.
  • (2021) SB-Router: A Swapped Buffer Activated Low Latency Network-on-Chip Router. IEEE Access, 9 (126564-126578). DOI: 10.1109/ACCESS.2021.3111294. Online publication date: 2021.
  • (2020) Designing a XSS Defensive Framework for Web Servers Deployed in the Existing Smart City Infrastructure. Journal of Organizational and End User Computing, 32:4 (85-111). DOI: 10.4018/JOEUC.2020100105. Online publication date: 1-Oct-2020.
  • (2020) A Risk Analysis Framework for Social Engineering Attack Based on User Profiling. Journal of Organizational and End User Computing, 32:3 (37-49). DOI: 10.4018/JOEUC.2020070104. Online publication date: 1-Jul-2020.
