DOI: 10.1145/3195970.3196083

On-chip deep neural network storage with multi-level eNVM

Published: 24 June 2018

Abstract

One of the biggest performance bottlenecks of today's neural network (NN) accelerators is off-chip memory accesses [11]. In this paper, we propose a method to use multi-level, embedded nonvolatile memory (eNVM) to eliminate all off-chip weight accesses. The use of multi-level memory cells increases the probability of faults. Therefore, we co-design the weights and memories such that their properties complement each other and the faults result in no noticeable NN accuracy loss. In the extreme case, the weights in fully connected layers can be stored using a single transistor. With weight pruning and clustering, we show our technique reduces the memory area by over an order of magnitude compared to an SRAM baseline. In the case of VGG16 (130M weights), we are able to store all the weights in 4.9 mm², well within the area allocated to SRAM in modern NN accelerators [6].
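The pruning-and-clustering step the abstract mentions can be illustrated with a minimal sketch: prune the smallest-magnitude weights, then quantize the survivors to a handful of shared values so each stored weight is just a small cluster index. This is an assumption-laden illustration (magnitude-threshold pruning plus simple 1-D k-means, in the spirit of Deep Compression [10]); the function name and parameters are illustrative, not taken from the paper.

```python
import numpy as np

def prune_and_cluster(weights, prune_frac=0.9, n_clusters=16, n_iters=10):
    """Sketch of weight pruning + clustering before multi-level storage.

    Weights below the prune_frac magnitude quantile are zeroed; the
    survivors are quantized to n_clusters shared values via 1-D k-means,
    so each nonzero weight needs only log2(n_clusters) bits of index.
    """
    w = weights.flatten()
    # Pruning: drop the prune_frac fraction of smallest-magnitude weights.
    threshold = np.quantile(np.abs(w), prune_frac)
    mask = np.abs(w) > threshold
    survivors = w[mask]
    # Clustering: plain 1-D k-means over the surviving weight values.
    centroids = np.linspace(survivors.min(), survivors.max(), n_clusters)
    for _ in range(n_iters):
        idx = np.argmin(np.abs(survivors[:, None] - centroids[None, :]), axis=1)
        for k in range(n_clusters):
            if np.any(idx == k):
                centroids[k] = survivors[idx == k].mean()
    # Rebuild the weight tensor from cluster centroids (zeros stay zero).
    quantized = np.zeros_like(w)
    quantized[mask] = centroids[idx]
    return quantized.reshape(weights.shape)
```

With `prune_frac=0.9` and `n_clusters=16`, each surviving weight is representable as a 4-bit cluster index, which is how clustering compounds with multi-level cells: a 16-level cell could hold one such index per device.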

    References

[1] I. Bayram, E. Eken, D. Kline, N. Parshook, Y. Chen, and A. K. Jones. Modeling STT-RAM fabrication cost and impacts in NVSim. In IGSC, 2016.
[2] X. Bi, M. Mao, D. Wang, and H. Li. Unleashing the potential of MLC STT-RAM caches. In ICCAD, 2013.
[3] Y. Cai et al. Threshold Voltage Distribution in MLC NAND Flash Memory: Characterization, Analysis and Modeling. In DATE, 2013.
[4] A. Chen. A review of emerging non-volatile memory (NVM) technologies and applications. Solid-State Electronics, 2016.
[5] F. Chollet et al. Keras, 2015.
[6] G. Desoli et al. A 2.9TOPS/W deep convolutional neural network SoC in FD-SOI 28nm for intelligent embedded systems. In ISSCC, 2017.
[7] X. Dong et al. NVSim: A Circuit-Level Performance, Energy, and Area Model for Emerging Nonvolatile Memory. IEEE Trans. Comput.-Aided Des. Integr. Circuits Syst., 2012.
[8] Y. Du et al. A Memristive Neural Network Computing Engine using CMOS-Compatible Charge-Trap-Transistor (CTT). CoRR, 2017.
[9] Y. Gong et al. Compressing Deep Convolutional Networks using Vector Quantization. CoRR, 2014.
[10] S. Han et al. Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding. In ICLR, 2016.
[11] N. P. Jouppi et al. In-Datacenter Performance Analysis of a Tensor Processing Unit. In ISCA, 2017.
[12] F. Khan et al. The Impact of Self-Heating on Charge Trapping in High-k-Metal-Gate nFETs. IEEE Electron Device Lett., 2016.
[13] F. Khan et al. Charge Trap Transistor (CTT): An Embedded Fully Logic-Compatible Multiple-Time Programmable Non-Volatile Memory Element for high-k-metal-gate CMOS technologies. IEEE Electron Device Lett., 2017.
[14] Y. LeCun and C. Cortes. The MNIST database of handwritten digits.
[15] K. Miyaji et al. Zero Additional Process, Local Charge Trap, Embedded Flash Memory with Drain-Side Assisted Erase Scheme Using Minimum Channel Length/Width Standard Complementary Metal-Oxide-Semiconductor Single Transistor Cell. Jpn. J. Appl. Phys., 2012.
[16] B. Reagen et al. Minerva: Enabling Low-Power, Highly-Accurate Deep Neural Network Accelerators. In ISCA, 2016.
[17] O. Russakovsky et al. ImageNet Large Scale Visual Recognition Challenge. IJCV, 2015.
[18] K. Simonyan and A. Zisserman. Very Deep Convolutional Networks for Large-Scale Image Recognition. CoRR, 2014.

      Published In

      DAC '18: Proceedings of the 55th Annual Design Automation Conference
      June 2018
      1089 pages
      ISBN:9781450357005
      DOI:10.1145/3195970

Publisher

Association for Computing Machinery, New York, NY, United States

Conference

DAC '18: The 55th Annual Design Automation Conference, June 24-29, 2018, San Francisco, California

      Acceptance Rates

      Overall Acceptance Rate 1,770 of 5,499 submissions, 32%


      Cited By

• (2022) NVMExplorer: A Framework for Cross-Stack Comparisons of Embedded Non-Volatile Memories. HPCA 2022, pp. 938-956. DOI: 10.1109/HPCA53966.2022.00073
• (2021) Application-driven design exploration for dense ferroelectric embedded non-volatile memories. ISLPED 2021, pp. 1-6. DOI: 10.1109/ISLPED52811.2021.9502489
• (2021) Illusion of large on-chip memory by networked computing chips for neural network inference. Nature Electronics 4(1), pp. 71-80. DOI: 10.1038/s41928-020-00515-3
• (2020) Radiation Tolerance of 3-D NAND Flash Based Neuromorphic Computing System. IRPS 2020, pp. 1-4. DOI: 10.1109/IRPS45951.2020.9128219
• (2019) MaxNVM. MICRO 2019, pp. 769-781. DOI: 10.1145/3352460.3358258
• (2019) MnnFast. ISCA 2019, pp. 250-263. DOI: 10.1145/3307650.3322214
