DOI: 10.1145/3195970.3196083

On-chip deep neural network storage with multi-level eNVM

Published: 24 June 2018

Abstract

One of the biggest performance bottlenecks of today's neural network (NN) accelerators is off-chip memory accesses [11]. In this paper, we propose a method to use multi-level, embedded nonvolatile memory (eNVM) to eliminate all off-chip weight accesses. The use of multi-level memory cells increases the probability of faults. Therefore, we co-design the weights and memories such that their properties complement each other and the faults result in no noticeable NN accuracy loss. In the extreme case, the weights in fully connected layers can be stored using a single transistor. With weight pruning and clustering, we show our technique reduces the memory area by over an order of magnitude compared to an SRAM baseline. In the case of VGG16 (130M weights), we are able to store all the weights in 4.9 mm², well within the area allocated to SRAM in modern NN accelerators [6].
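The pruning-and-clustering step the abstract mentions can be illustrated with a minimal sketch: prune the smallest-magnitude weights, then quantize the survivors to a handful of shared values so each stored weight is just a small cluster index. This is an assumption-laden illustration (magnitude-threshold pruning plus simple 1-D k-means, in the spirit of Deep Compression [10]); the function name and parameters are illustrative, not taken from the paper.

```python
import numpy as np

def prune_and_cluster(weights, prune_frac=0.9, n_clusters=16, n_iters=10):
    """Sketch of weight pruning + clustering before multi-level storage.

    Weights below the prune_frac magnitude quantile are zeroed; the
    survivors are quantized to n_clusters shared values via 1-D k-means,
    so each nonzero weight needs only log2(n_clusters) bits of index.
    """
    w = weights.flatten()
    # Pruning: drop the prune_frac fraction of smallest-magnitude weights.
    threshold = np.quantile(np.abs(w), prune_frac)
    mask = np.abs(w) > threshold
    survivors = w[mask]
    # Clustering: plain 1-D k-means over the surviving weight values.
    centroids = np.linspace(survivors.min(), survivors.max(), n_clusters)
    for _ in range(n_iters):
        idx = np.argmin(np.abs(survivors[:, None] - centroids[None, :]), axis=1)
        for k in range(n_clusters):
            if np.any(idx == k):
                centroids[k] = survivors[idx == k].mean()
    # Rebuild the weight tensor from cluster centroids (zeros stay zero).
    quantized = np.zeros_like(w)
    quantized[mask] = centroids[idx]
    return quantized.reshape(weights.shape)
```

With `prune_frac=0.9` and `n_clusters=16`, each surviving weight is representable as a 4-bit cluster index, which is how clustering compounds with multi-level cells: a 16-level cell could hold one such index per device.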

    References

[1] I. Bayram, E. Eken, D. Kline, N. Parshook, Y. Chen, and A. K. Jones. Modeling STT-RAM fabrication cost and impacts in NVSim. In IGSC, 2016.
[2] X. Bi, M. Mao, D. Wang, and H. Li. Unleashing the potential of MLC STT-RAM caches. In ICCAD, 2013.
[3] Y. Cai et al. Threshold Voltage Distribution in MLC NAND Flash Memory: Characterization, Analysis and Modeling. In DATE, 2013.
[4] A. Chen. A review of emerging non-volatile memory (NVM) technologies and applications. Solid-State Electronics, 2016.
[5] F. Chollet et al. Keras, 2015.
[6] G. Desoli et al. A 2.9TOPS/W deep convolutional neural network SoC in FD-SOI 28nm for intelligent embedded systems. In ISSCC, 2017.
[7] X. Dong et al. NVSim: A Circuit-Level Performance, Energy, and Area Model for Emerging Nonvolatile Memory. IEEE Trans. Comput.-Aided Des. Integr. Circuits Syst., 2012.
[8] Y. Du et al. A Memristive Neural Network Computing Engine using CMOS-Compatible Charge-Trap-Transistor (CTT). CoRR, 2017.
[9] Y. Gong et al. Compressing Deep Convolutional Networks using Vector Quantization. CoRR, 2014.
[10] S. Han et al. Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding. In ICLR, 2016.
[11] N. P. Jouppi et al. In-Datacenter Performance Analysis of a Tensor Processing Unit. In ISCA, 2017.
[12] F. Khan et al. The Impact of Self-Heating on Charge Trapping in High-k-Metal-Gate nFETs. IEEE Electron Device Lett., 2016.
[13] F. Khan et al. Charge Trap Transistor (CTT): An Embedded Fully Logic-Compatible Multiple-Time Programmable Non-Volatile Memory Element for high-k-metal-gate CMOS technologies. IEEE Electron Device Lett., 2017.
[14] Y. LeCun and C. Cortes. The MNIST database of handwritten digits.
[15] K. Miyaji et al. Zero Additional Process, Local Charge Trap, Embedded Flash Memory with Drain-Side Assisted Erase Scheme Using Minimum Channel Length/Width Standard Complementary Metal-Oxide-Semiconductor Single Transistor Cell. Jpn. J. Appl. Phys., 2012.
[16] B. Reagen et al. Minerva: Enabling Low-Power, Highly-Accurate Deep Neural Network Accelerators. In ISCA, 2016.
[17] O. Russakovsky et al. ImageNet Large Scale Visual Recognition Challenge. IJCV, 2015.
[18] K. Simonyan and A. Zisserman. Very Deep Convolutional Networks for Large-Scale Image Recognition. CoRR, 2014.

      Published In

      DAC '18: Proceedings of the 55th Annual Design Automation Conference
      June 2018
      1089 pages
      ISBN:9781450357005
      DOI:10.1145/3195970

Publisher

Association for Computing Machinery, New York, NY, United States

Conference

DAC '18: The 55th Annual Design Automation Conference, June 24-29, 2018, San Francisco, California

      Acceptance Rates

      Overall Acceptance Rate 1,770 of 5,499 submissions, 32%


      Cited By

• (2022) NVMExplorer: A Framework for Cross-Stack Comparisons of Embedded Non-Volatile Memories. HPCA 2022, pp. 938-956. DOI: 10.1109/HPCA53966.2022.00073
• (2021) Application-driven design exploration for dense ferroelectric embedded non-volatile memories. ISLPED 2021, pp. 1-6. DOI: 10.1109/ISLPED52811.2021.9502489
• (2021) Illusion of large on-chip memory by networked computing chips for neural network inference. Nature Electronics 4(1), pp. 71-80. DOI: 10.1038/s41928-020-00515-3
• (2020) Radiation Tolerance of 3-D NAND Flash Based Neuromorphic Computing System. IRPS 2020, pp. 1-4. DOI: 10.1109/IRPS45951.2020.9128219
• (2019) MaxNVM. MICRO 2019, pp. 769-781. DOI: 10.1145/3352460.3358258
• (2019) MnnFast. ISCA 2019, pp. 250-263. DOI: 10.1145/3307650.3322214
