Domain-wall memory buffer for low-energy NoCs

Authors:

Donald Kline, Jr.,

Alex K. Jones

DAC '15: Proceedings of the 52nd Annual Design Automation Conference

Article No.: 11, Pages 1 - 6

https://doi.org/10.1145/2744769.2744826

Published: 07 June 2015

Abstract

Networks-on-chip (NoCs) have become a leading energy consumer in modern multi-core processors, with a considerable portion of this energy originating from the large number of virtual channel (FIFO) buffers. While emerging memories have been considered for many architectural components such as caches, the asymmetric access properties and relatively small size of network FIFOs compared to the required peripheral circuitry have led to few such replacements being proposed for NoCs. In this paper, we propose control schemes that leverage the "shift-register" nature of spintronic domain-wall memory (DWM) to replace conventional memory buffers for the NoC. Our results indicate that the best shift-based scheme uses a dual-nanowire approach so that reads and writes can be more effectively aligned with access ports for simultaneous access in the same cycle. Our approach provides a 2.93X speedup over a DWM buffer using a traditional FIFO memory control scheme, with a 1.16X savings in energy. Compared to an SRAM FIFO, it exhibits an 8% message latency degradation in exchange for a 56% energy reduction. The resulting approach achieves a 53% reduction in energy-delay product compared to SRAM and a 42% reduction in energy-delay product versus STT-MRAM.
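
As a rough sanity check on the SRAM comparison, the sketch below recomputes the energy-delay product (EDP) from the two rounded figures quoted in the abstract. It assumes that EDP is simply energy multiplied by delay and that message latency is the delay metric; the abstract does not state either, and the function name `edp_ratio` is hypothetical.

```python
def edp_ratio(energy_reduction: float, latency_increase: float) -> float:
    """Return EDP relative to a baseline whose EDP is normalized to 1.0.

    Assumption (not stated in the abstract): EDP = energy * delay, with
    message latency standing in for delay.
    """
    relative_energy = 1.0 - energy_reduction   # 56% less energy -> 0.44
    relative_delay = 1.0 + latency_increase    # 8% more latency -> 1.08
    return relative_energy * relative_delay


# Figures quoted in the abstract for the DWM buffer versus an SRAM FIFO.
ratio = edp_ratio(energy_reduction=0.56, latency_increase=0.08)
reduction = 1.0 - ratio
print(f"relative EDP: {ratio:.4f}, EDP reduction: {reduction:.1%}")
```

Under these assumptions the reduction comes out at about 52.5%, which is in line with the reported 53% given that the 8% and 56% inputs are themselves rounded.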

Index Terms

Domain-wall memory buffer for low-energy NoCs
1. Networks
  1. Network architectures

Published In

DAC '15: Proceedings of the 52nd Annual Design Automation Conference

June 2015

1204 pages

ISBN:9781450335201

DOI:10.1145/2744769

Copyright © 2015 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

SIGDA: ACM Special Interest Group on Design Automation

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Funding Sources

National Science Foundation

Conference

DAC '15

Sponsor:

SIGDA

DAC '15: The 52nd Annual Design Automation Conference 2015

June 7 - 11, 2015

San Francisco, California

Acceptance Rates

Overall Acceptance Rate 1,770 of 5,499 submissions, 32%

Bibliometrics

Total Citations: 14
Total Downloads: 281
Downloads (Last 12 months): 7
Downloads (Last 6 weeks): 4

Reflects downloads up to 10 Oct 2024

Cited By

Liu X, Gao Y, He Y, Yue X, Jiang H, Wang X (2023). Hybrid, Asymmetric and Reconfigurable Input Unit Designs for Energy-Efficient On-Chip Networks. IEICE Transactions on Electronics, E106.C:10 (570-579). Online publication date: 1-Oct-2023.
https://doi.org/10.1587/transele.2022CTP0005
Hakert C, Khan A, Chen K, Hameed F, Castrillon J, Chen J (2023). ROLLED: Racetrack Memory Optimized Linear Layout and Efficient Decomposition of Decision Trees. IEEE Transactions on Computers, 72:5 (1488-1502). Online publication date: 1-May-2023.
https://doi.org/10.1109/TC.2022.3197094
Khan A, Ollivier S, Longofono S, Hempel G, Castrillon J, Jones A (2022). Brain-inspired Cognition in Next-generation Racetrack Memories. ACM Transactions on Embedded Computing Systems, 21:6 (1-28). Online publication date: 12-Dec-2022.
https://dl.acm.org/doi/10.1145/3524071
Hameed F, Castrillon J (2022). BlendCache: An Energy and Area Efficient Racetrack Last-Level-Cache Architecture. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 41:12 (5288-5298). Online publication date: Dec-2022.
https://doi.org/10.1109/TCAD.2022.3161198
Gao Y, He Y, Yue X, Jiang H, Wang X (2022). Traffic-Aware Energy-Efficient Hybrid Input Buffer Design for On-Chip Routers. 2022 IEEE 15th International Symposium on Embedded Multicore/Many-core Systems-on-Chip (MCSoC) (395-401). Online publication date: Dec-2022.
https://doi.org/10.1109/MCSoC57363.2022.10023992
Cao W, Wang J, Wang D, Mei K (2022). A Hop-Parity-Involved Task Schedule for Lightweight Racetrack-Buffer in Energy-Efficient NoCs. Smart Computing and Communication (276-285). Online publication date: 15-Mar-2022.
https://doi.org/10.1007/978-3-030-97774-0_25
Yang L, Liu W, Guan N, Dutt N (2019). Optimal Application Mapping and Scheduling for Network-on-Chips with Computation in STT-RAM Based Router. IEEE Transactions on Computers, 68:8 (1174-1189). Online publication date: 1-Aug-2019.
https://dl.acm.org/doi/10.1109/TC.2018.2864749
Ding M, Liang D, He A, Xiong J, Wu J (2019). A New Memory-Based Routing Policy for Mesh Network. 2019 IEEE 4th International Conference on Integrated Circuits and Microsystems (ICICM) (237-241). Online publication date: Oct-2019.
https://doi.org/10.1109/ICICM48536.2019.8977195
Ollivier S, Kline D, Kawsher R, Melhem R, Banja S, Jones A (2019). Leveraging Transverse Reads to Correct Alignment Faults in Domain Wall Memories. 2019 49th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN) (375-387). Online publication date: Jun-2019.
https://doi.org/10.1109/DSN.2019.00047
(2018). Efficient cache replacement policy for minimising error rate in L2-STT-MRAM caches. International Journal of Grid and Utility Computing, 9:4 (307-321). Online publication date: 1-Jan-2018.
https://dl.acm.org/doi/10.5555/3292801.3292803
