Hardware-aware training for large-scale and diverse deep learning inference workloads using in-memory computing-based accelerators

Rasch, Malte J.; Mackin, Charles; Le Gallo, Manuel; Chen, An; Fasoli, Andrea; Odermatt, Frederic; Li, Ning; Nandakumar, S. R.; Narayanan, Pritish; Tsai, Hsinyu; Burr, Geoffrey W.; Sebastian, Abu; Narayanan, Vijay


arXiv:2302.08469 (cs)

[Submitted on 16 Feb 2023]



Abstract: Analog in-memory computing (AIMC) -- a promising approach for energy-efficient acceleration of deep learning workloads -- computes matrix-vector multiplications (MVMs), but only approximately, because of nonidealities that are often non-deterministic or nonlinear. This can adversely impact the achievable deep neural network (DNN) inference accuracy compared to a conventional floating point (FP) implementation. While retraining has previously been suggested to improve robustness, prior work has explored only a few DNN topologies, using disparate and overly simplified AIMC hardware models. Here, we use hardware-aware (HWA) training to systematically examine the accuracy of AIMC for multiple common artificial intelligence (AI) workloads across multiple DNN topologies, and investigate sensitivity and robustness to a broad set of nonidealities. By introducing a new and highly realistic AIMC crossbar model, we improve significantly on earlier retraining approaches. We show that many large-scale DNNs of various topologies, including convolutional neural networks (CNNs), recurrent neural networks (RNNs), and transformers, can in fact be successfully retrained to show iso-accuracy on AIMC. Our results further suggest that AIMC nonidealities that add noise to the inputs or outputs, not the weights, have the largest impact on DNN accuracy, and that RNNs are particularly robust to all nonidealities.
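As a rough illustration of what "approximate MVM" means here, the sketch below perturbs the inputs, the weights, and the outputs of a matrix-vector product with independent Gaussian noise. It is a toy model only, not the crossbar model introduced in the paper: the function name `noisy_mvm` and the parameters `in_std`, `w_std`, and `out_std` are hypothetical, and real AIMC nonidealities are more varied than additive Gaussian noise.

```python
import random


def noisy_mvm(W, x, in_std=0.0, w_std=0.0, out_std=0.0, rng=None):
    """Toy approximate MVM: y = (W + noise) @ (x + noise) + noise.

    All three standard deviations are illustrative assumptions; with all
    of them at 0.0 the function reduces to an exact matrix-vector product.
    """
    rng = rng or random.Random(0)
    # Perturb the input vector once, before it is applied to every row.
    x_noisy = [xj + rng.gauss(0.0, in_std) for xj in x]
    y = []
    for row in W:
        # Each weight gets its own noise sample on every call.
        acc = sum((w + rng.gauss(0.0, w_std)) * xj for w, xj in zip(row, x_noisy))
        # Output noise is added after the accumulation.
        y.append(acc + rng.gauss(0.0, out_std))
    return y


W = [[1.0, 2.0], [3.0, 4.0]]
x = [1.0, -1.0]
exact = noisy_mvm(W, x)
approx = noisy_mvm(W, x, in_std=0.05, w_std=0.05, out_std=0.05, rng=random.Random(1))
print("exact: ", exact)
print("approx:", approx)
```

Varying one standard deviation at a time in such a model is a simple way to see how input, weight, and output noise each distort the result, which is the kind of sensitivity question the abstract raises.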

Comments:	35 pages, 7 figures, 5 tables
Subjects:	Machine Learning (cs.LG); Emerging Technologies (cs.ET)
Cite as:	arXiv:2302.08469 [cs.LG]
	(or arXiv:2302.08469v1 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2302.08469

Submission history

From: Malte J. Rasch
[v1] Thu, 16 Feb 2023 18:25:06 UTC (5,522 KB)
