research-article

A hybrid NoC design for cache coherence optimization for chip multiprocessors

Authors:

Mahmut Kandemir, and

Mary Jane Irwin

DAC '12: Proceedings of the 49th Annual Design Automation Conference

June 2012

Pages 834 - 842

https://doi.org/10.1145/2228360.2228511

Published: 03 June 2012

Abstract

On-chip many-core systems, which evolved from earlier multiprocessor systems, are considered a promising solution to the problems of performance scalability and power consumption. In traditional multiprocessors, the long communication distances between processors make directory-based cache coherence protocols a better choice than bus-based snooping protocols, despite the overhead of indirection. In chip multiprocessors (CMPs), however, the much shorter distances between cores extend the reach of buses, which makes snooping protocols attractive again for cache-to-cache transfers. In this work, we propose a hybrid NoC design that provides optimized support for cache coherence. In our design, on-chip links can be dynamically configured either as point-to-point links between NoC nodes or as short buses that facilitate localized snooping. By combining the strengths of bus-based snooping coherence and NoC-based directory coherence, our approach delivers both power and performance benefits.
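The idea of links that act either as point-to-point hops or as short snooping buses can be illustrated with a toy model. The sketch below is not the authors' design: the one-dimensional row of nodes, the names `LinkMode`, `snoop_domain`, and `coherence_path`, and the rule "snoop if the owner shares a bus segment with the requester, otherwise use the directory" are all assumptions made purely for illustration.

```python
from enum import Enum


class LinkMode(Enum):
    """Hypothetical configuration of one on-chip link."""
    POINT_TO_POINT = "p2p"  # ordinary NoC hop between two routers
    BUS = "bus"             # link joined into a short shared bus


def snoop_domain(node, link_modes):
    """Return the set of nodes sharing a bus segment with `node`.

    Assumes a 1-D row of nodes in which link_modes[i] is the mode of the
    link between node i and node i + 1. Contiguous BUS-mode links are
    treated as one short bus.
    """
    last = len(link_modes)  # index of the last node in the row
    lo = node
    while lo > 0 and link_modes[lo - 1] is LinkMode.BUS:
        lo -= 1
    hi = node
    while hi < last and link_modes[hi] is LinkMode.BUS:
        hi += 1
    return set(range(lo, hi + 1))


def coherence_path(requester, owner, link_modes):
    """Pick 'snoop' when the owner is on the requester's bus segment,
    otherwise fall back to a 'directory' lookup over the NoC."""
    if owner in snoop_domain(requester, link_modes):
        return "snoop"
    return "directory"


# Five nodes: 0, 1, 2 share a short bus; 3 and 4 share another.
modes = [LinkMode.BUS, LinkMode.BUS, LinkMode.POINT_TO_POINT, LinkMode.BUS]
print(coherence_path(0, 2, modes))
print(coherence_path(2, 3, modes))
```

In this model, reconfiguring a link amounts to changing one entry of `modes`, which grows or shrinks the set of caches that a local snoop can reach. How the actual design chooses and switches configurations is not described in the abstract.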

Index Terms

A hybrid NoC design for cache coherence optimization for chip multiprocessors
1. Computer systems organization
  1. Architectures
    1. Parallel architectures
      1. Interconnection architectures
2. Hardware
  1. Communication hardware, interfaces and storage
    1. Buses and high-speed links

Published In

DAC '12: Proceedings of the 49th Annual Design Automation Conference

June 2012

1357 pages

ISBN: 9781450311991

DOI: 10.1145/2228360

General Chair:
Patrick Groeneveld, Magma Design Automation, Inc., San Jose, CA

Program Chairs:
Donatella Sciuto, Politecnico di Milano, Milano, Italy
Soha Hassoun, Tufts Univ., Medford, MA

Copyright © 2012 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

EDAC: Electronic Design Automation Consortium
SIGDA: ACM Special Interest Group on Design Automation
IEEE-CEDA

In-Cooperation

SIGBED: ACM Special Interest Group on Embedded Systems

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Funding Sources

National Science Foundation

Conference

DAC '12

Sponsor:

EDAC
SIGDA

DAC '12: The 49th Annual Design Automation Conference 2012

June 3 - 7, 2012

San Francisco, California

Acceptance Rates

Overall Acceptance Rate 1,770 of 5,499 submissions, 32%
