DOI: 10.1145/2744769.2744826
research-article

Domain-wall memory buffer for low-energy NoCs

Published: 07 June 2015

Abstract

Networks-on-chip (NoCs) have become a leading energy consumer in modern multi-core processors, with a considerable portion of this energy originating from the large number of virtual channel (FIFO) buffers. While emerging memories have been considered for many architectural components such as caches, the asymmetric access properties of these memories and the small size of network FIFOs relative to the required peripheral circuitry have led to few such replacements being proposed for NoCs. In this paper, we propose control schemes that leverage the "shift-register" nature of spintronic domain-wall memory (DWM) to replace conventional memory buffers for the NoC. Our results indicate that the best shift-based scheme uses a dual-nanowire approach to ensure that reads and writes can be more effectively aligned with access ports for simultaneous access in the same cycle. Our approach provides a 2.93X speedup over a DWM buffer using a traditional FIFO memory control scheme, with a 1.16X savings in energy. Compared to an SRAM FIFO, it incurs an 8% message latency degradation in exchange for a 56% energy reduction. The resulting approach achieves a 53% reduction in energy-delay product compared to SRAM and a 42% reduction in energy-delay product versus STT-MRAM.
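As a rough consistency check on the SRAM comparison, the energy-delay product (EDP, energy multiplied by delay) can be recomputed from the two rounded figures quoted above. This is only a sketch: it assumes that message latency stands in for the delay term, which the abstract does not state, and the function name `edp_reduction` is hypothetical.

```python
def edp_reduction(latency_increase: float, energy_reduction: float) -> float:
    """Fractional EDP reduction relative to a baseline.

    Assumption (not stated in the abstract): delay scales with message latency.
    Both arguments are fractions, e.g. 0.08 for an 8% latency degradation.
    """
    relative_delay = 1.0 + latency_increase     # e.g. 1.08x the baseline delay
    relative_energy = 1.0 - energy_reduction    # e.g. 0.44x the baseline energy
    return 1.0 - relative_energy * relative_delay


if __name__ == "__main__":
    # Rounded figures from the abstract: 8% latency degradation, 56% energy reduction.
    reduction = edp_reduction(0.08, 0.56)
    print(f"EDP reduction vs. SRAM: {reduction:.1%}")
```

With these rounded inputs the result is about 52.5%, which is in line with the reported 53% once rounding of the latency and energy figures is taken into account.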

    Published In

    DAC '15: Proceedings of the 52nd Annual Design Automation Conference
    June 2015
    1204 pages
    ISBN:9781450335201
    DOI:10.1145/2744769

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. FIFOs
    2. domain-wall memory
    3. network-on-chip

    Qualifiers

    • Research-article

    Conference

    DAC '15
    DAC '15: The 52nd Annual Design Automation Conference 2015
    June 7 - 11, 2015
    San Francisco, California

    Acceptance Rates

    Overall Acceptance Rate 1,770 of 5,499 submissions, 32%
