DOI: 10.1145/3453688.3461529
research-article
Public Access

Processing-in-Memory Acceleration of MAC-based Applications Using Residue Number System: A Comparative Study

Published: 22 June 2021

    Abstract

    Processing-in-memory (PIM) has emerged as a viable solution to the memory wall crisis and has attracted great interest for accelerating computationally intensive AI applications ranging from filtering to complex neural networks. In this paper, we take advantage of both PIM and the residue number system (RNS), as an alternative to the conventional binary number representation, to accelerate multiply-and-accumulate (MAC) operations, the primary operations of the target applications. The PIM architecture uses the maximum internal bandwidth of memory chips to realize local, parallel computation that eliminates off-chip data transfer. Moreover, RNS limits inter-digit carry propagation by performing arithmetic operations on small residues independently and in parallel. We therefore develop a PIM-RNS architecture, called PRIMS, and analyze the potential of intertwining the PIM architecture with the inherent parallelism of RNS arithmetic to delineate the opportunities and challenges. To this end, we build a comprehensive device-to-architecture evaluation framework to study this problem quantitatively, considering the impact of PIM technology, with a well-known three-moduli set as a case study.
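
    The RNS idea described in the abstract can be illustrated with a small software sketch. It is not the PRIMS design: it only shows how a MAC decomposes into independent per-modulus operations. The abstract does not name its three-moduli set, so the set {2^n - 1, 2^n, 2^n + 1} used here is an assumption, and all function names are hypothetical. The sketch needs Python 3.8 or later (for `math.prod` and modular inverses via `pow`).

```python
from math import prod


def moduli_set(n):
    """Assumed three-moduli set {2^n - 1, 2^n, 2^n + 1} (pairwise coprime for n >= 2)."""
    return (2**n - 1, 2**n, 2**n + 1)


def to_rns(x, moduli):
    """Forward conversion: one small residue per modulus."""
    return tuple(x % m for m in moduli)


def rns_mac(acc, a, b, moduli):
    """acc + a * b, computed independently in each residue channel.

    No carry crosses from one channel to another, which is the
    property that makes the channels parallelizable.
    """
    return tuple((r + x * y) % m for r, x, y, m in zip(acc, a, b, moduli))


def from_rns(residues, moduli):
    """Reverse conversion using the Chinese Remainder Theorem."""
    M = prod(moduli)  # dynamic range: results are exact only in [0, M)
    total = 0
    for r, m in zip(residues, moduli):
        Mi = M // m
        total += r * Mi * pow(Mi, -1, m)  # pow(..., -1, m) is the modular inverse
    return total % M


def rns_dot(xs, ws, n=8):
    """Dot product of two non-negative integer vectors via per-channel MACs."""
    moduli = moduli_set(n)
    acc = to_rns(0, moduli)
    for x, w in zip(xs, ws):
        acc = rns_mac(acc, to_rns(x, moduli), to_rns(w, moduli), moduli)
    return from_rns(acc, moduli)


print(rns_dot([3, 5, 7], [2, 4, 6]))  # 68
```

    With n = 8 the moduli are 255, 256, and 257, so each channel works on residues of at most 9 bits while the combined dynamic range is 255 × 256 × 257 = 16,776,960. The result is exact only while the true sum stays below that range; signed operands would need an additional encoding that this sketch omits.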

    Supplemental Material

    MP4 File
    In this presentation, we analyze the potential of intertwining PIM architecture with the inherent parallelism of the RNS arithmetic to delineate the opportunities and challenges.


      Published In

      GLSVLSI '21: Proceedings of the 2021 Great Lakes Symposium on VLSI
      June 2021
      504 pages
      ISBN:9781450383936
      DOI:10.1145/3453688
      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Author Tags

      1. multiplication-and-accumulation
      2. processing-in-memory
      3. residue number system

      Qualifiers

      • Research-article

      Data Availability

      In this presentation, we analyze the potential of intertwining PIM architecture with the inherent parallelism of the RNS arithmetic to delineate the opportunities and challenges. https://dl.acm.org/doi/10.1145/3453688.3461529#GLSVLSI21-glsv111p.mp4

      Conference

      GLSVLSI '21: Great Lakes Symposium on VLSI 2021
      June 22 - 25, 2021
      Virtual Event, USA

      Acceptance Rates

      Overall Acceptance Rate 312 of 1,156 submissions, 27%

      Article Metrics

      • Downloads (Last 12 months): 114
      • Downloads (Last 6 weeks): 11
      Reflects downloads up to 27 Jul 2024

      Cited By

      • (2023) A Generalized Residue Number System Design Approach for Ultralow-Power Arithmetic Circuits Based on Deterministic Bit-Streams. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems 42:11 (3787-3800). DOI: 10.1109/TCAD.2023.3250603. Online publication date: 1-Mar-2023
      • (2022) ReFACE: Efficient Design Methodology for Acceleration of Digital Filter Implementations. 2022 23rd International Symposium on Quality Electronic Design (ISQED) (1-6). DOI: 10.1109/ISQED54688.2022.9806144. Online publication date: 6-Apr-2022
      • (2022) Accelerating Neural Network Training with Processing-in-Memory GPU. 2022 22nd IEEE International Symposium on Cluster, Cloud and Internet Computing (CCGrid) (414-421). DOI: 10.1109/CCGrid54584.2022.00051. Online publication date: May-2022
      • (2022) Energy-Efficient approximate compressor design for error-resilient digital signal processing. International Journal of Electronics 110:9 (1555-1577). DOI: 10.1080/00207217.2022.2117854. Online publication date: 7-Sep-2022
      • (2022) Enabling Edge Computing Using Emerging Memory Technologies: From Device to Architecture. Frontiers of Quality Electronic Design (QED) (415-464). DOI: 10.1007/978-3-031-16344-9_11. Online publication date: 6-Sep-2022
