DOI: 10.1145/3611315.3633244
Heterogeneous Instruction Set Architecture for RRAM-enabled In-memory Computing

Published: 25 January 2024

  • Abstract

    RRAM-enabled in-memory computing (IMC) is regarded as a promising solution for breaking the von Neumann bottleneck. Using RRAM-based IMC to construct heterogeneous computing systems can fully leverage the advantages of both digital and IMC platforms. The critical challenges are to effectively manage the dataflows between the digital system and the analog IMC and to provide a standard for communication. In this paper, from the perspective of hardware instruction execution, we design a general RRAM-enabled analog instruction set architecture compatible with digital computing. These instructions adopt the vector-based computing concepts of RISC-V, and examples compatible with the RISC-V vector extension are demonstrated in detail. A three-level tile, processing-unit, and array architecture is also developed to support instruction execution. Hardware estimations are performed on 65 nm technology. Results indicate that the total power of an activated processing unit is 8.64 mW, which is 4.9 times smaller than PUMA and 33.4 times smaller than ISAAC. The energy efficiency reaches 1190.7 GOPS/W, 1.42× and 3.12× higher than PUMA and ISAAC, respectively. Furthermore, as the analog and digital computing frequencies increase, the peak energy efficiency can reach 40 TOPS/W, which enables future general use of IMC-based heterogeneous systems.
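    The analog matrix-vector multiply at the heart of such an IMC tile can be illustrated with a minimal sketch. This is an idealized textbook crossbar model, not the paper's circuit or dimensions; the closing lines only check the throughput implied by the quoted power and efficiency figures.

    ```python
    import random

    # Idealized model of one analog RRAM crossbar tile: weights are stored
    # as cell conductances G, the input vector is applied as wordline
    # voltages V, and each bitline current is a dot product
    # I[j] = sum_i G[i][j] * V[i] (Ohm's law summed by Kirchhoff's current
    # law), so the whole tile performs one matrix-vector multiply in a
    # single analog step.
    random.seed(0)
    ROWS, COLS = 128, 128  # crossbar size chosen for illustration only
    G = [[random.random() for _ in range(COLS)] for _ in range(ROWS)]  # conductances
    V = [random.random() for _ in range(ROWS)]                        # input voltages

    I = [sum(G[i][j] * V[i] for i in range(ROWS)) for j in range(COLS)]
    assert len(I) == COLS

    # Sanity check on the quoted figures: 8.64 mW of activated power at
    # 1190.7 GOPS/W implies roughly 10.3 GOPS of delivered throughput
    # (throughput = efficiency * power).
    print(round(1190.7 * 8.64e-3, 1))  # 10.3 (GOPS)
    ```

    In this model, programming a new weight matrix means rewriting the conductances, which is why managing dataflow between the digital host and the analog tiles is the challenge the proposed instruction set addresses.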

    References

    [1]
    Joao Ambrosi et al. 2018. Hardware-Software Co-Design for an Analog-Digital Accelerator for Machine Learning. In 2018 IEEE International Conference on Rebooting Computing (ICRC). IEEE, McLean, VA, USA, 1–13. https://doi.org/10.1109/icrc.2018.8638612
    [2]
    Aayush Ankit et al. 2019. PUMA: A Programmable Ultra-efficient Memristor-based Accelerator for Machine Learning Inference. In Proceedings of the Twenty-Fourth International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS). ACM, New York, NY, USA, 715–731. https://doi.org/10.1145/3297858.3304049
    [3]
    H. Bao et al. 2022. Toward memristive in-memory computing: principles and applications. Front Optoelectron 15, 1 (2022), 23. https://doi.org/10.1007/s12200-022-00025-4
    [4]
    Pai-Yu Chen, Xiaochen Peng, and Shimeng Yu. 2017. NeuroSim+: An integrated device-to-algorithm framework for benchmarking synaptic devices and array architectures. In 2017 IEEE International Electron Devices Meeting (IEDM). IEEE, San Francisco, CA, USA, 6.1.1–6.1.4. https://doi.org/10.1109/iedm.2017.8268337
    [5]
    Xiaoming Chen, Tao Song, and Yinhe Han. 2021. RRAM-based Analog In-Memory Computing: Invited Paper. In 2021 IEEE/ACM International Symposium on Nanoscale Architectures (NANOARCH). IEEE, AB, Canada, 1–6. https://doi.org/10.1109/nanoarch53687.2021.9642235
    [6]
    Daniele Ielmini and H. S. Philip Wong. 2018. In-memory computing with resistive switching devices. Nature Electronics 1, 6 (2018), 333–343. https://doi.org/10.1038/s41928-018-0092-2
    [7]
    Mohsen Imani et al. 2019. FloatPIM: In-Memory Acceleration of Deep Neural Network Training with High Precision. In Proceedings of the 46th International Symposium on Computer Architecture. IEEE, Phoenix, AZ, USA, 802–815. https://doi.org/10.1145/3307650.3322237
    [8]
    C. E. Leiserson et al. 2020. There’s plenty of room at the Top: What will drive computer performance after Moore’s law? Science 368, 6495 (2020), eaam9744. https://doi.org/10.1126/science.aam9744
    [9]
    Jiale Liang et al. 2012. Scaling Challenges for the Cross-Point Resistive Memory Array to Sub-10nm Node - An Interconnect Perspective. In 2012 4th IEEE International Memory Workshop. IEEE, Milan, Italy, 1–4. https://doi.org/10.1109/imw.2012.6213650
    [10]
    Haikun Liu et al. 2022. A Simulation Framework for Memristor-Based Heterogeneous Computing Architectures. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems 41, 12 (2022), 5476–5488. https://doi.org/10.1109/tcad.2022.3152385
    [11]
    NVIDIA. 2023. NVIDIA H100 Tensor Core GPU. https://www.nvidia.cn/data-center/h100/
    [12]
    Ashish Ranjan et al. 2019. X-Mann: A Crossbar based Architecture for Memory Augmented Neural Networks. In Proceedings of the 56th Annual Design Automation Conference 2019. ACM, Las Vegas, NV, USA, 1–6. https://doi.org/10.1145/3316781.3317935
    [13]
    Ali Shafiee et al. 2016. ISAAC: A Convolutional Neural Network Accelerator with In-Situ Analog Arithmetic in Crossbars. In 2016 ACM/IEEE 43rd Annual International Symposium on Computer Architecture (ISCA). IEEE, Seoul, Korea (South), 14–26. https://doi.org/10.1109/isca.2016.12
    [14]
    Fengbin Tu et al. 2023. ReDCIM: Reconfigurable Digital Computing-In-Memory Processor With Unified FP/INT Pipeline for Cloud AI Acceleration. IEEE Journal of Solid-State Circuits 58, 1 (2023), 243–255. https://doi.org/10.1109/jssc.2022.3222059
    [15]
    Andrew Waterman et al. 2014. The RISC-V instruction set manual, volume I: User-level ISA, version 2.0.
    [16]
    P. Yao et al. 2020. Fully hardware-implemented memristor convolutional neural network. Nature 577, 7792 (2020), 641–646. https://doi.org/10.1038/s41586-020-1942-4

    Published In

    NANOARCH '23: Proceedings of the 18th ACM International Symposium on Nanoscale Architectures
    December 2023
    222 pages

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. RRAM
    2. heterogeneous computing
    3. in-memory computing
    4. instruction set architecture
    5. machine learning

    Qualifiers

    • Research-article
    • Research
    • Refereed limited

    Conference

    NANOARCH '23

    Acceptance Rates

    Overall Acceptance Rate 55 of 87 submissions, 63%
