
A Precision-Scalable Deep Neural Network Accelerator With Activation Sparsity Exploitation

Published: 31 August 2023

Abstract

To meet the demands of a wide range of practical applications, precision-scalable deep neural network (DNN) accelerators have become an unavoidable trend. At the same time, it has been demonstrated that a DNN accelerator can achieve better computational efficiency by exploiting sparsity. DNN accelerators that combine precision scalability with sparsity exploitation are therefore expected to deliver better performance. In this article, we propose an efficient precision-scalable DNN accelerator that exploits the sparsity of activations. Precision scalability is obtained from a decomposable multiplier inspired by the well-known Bit Fusion design, and a zero-skipping scheme is adopted to leverage the inherent sparsity of activations. We first modify the architecture of the conventional fusion unit (FU) to make it amenable to zero skipping. We then devise a segmentation approach to resolve memory-access conflicts and propose a sparsity-aware mapping method to balance the workload across processing elements (PEs). In addition, we present a bit-splitting strategy that exploits sparsity at the bit level. Compared with state-of-the-art precision-scalable designs, the proposed accelerator provides speedups of $4.12\times$, $4.07\times$, and $6.62\times$ in the $8b\times 8b$, $4b\times 4b$, and $2b\times 2b$ precision modes, respectively, while also achieving $3.92\times$ peak area efficiency and competitive peak energy efficiency.
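The decomposable-multiplier idea and the zero-skipping scheme can be illustrated with a short sketch. The Python snippet below is a minimal, illustrative software model (not the authors' hardware): it composes an unsigned 8b x 8b product from 2b x 2b partial products in the Bit Fusion style and skips zero activations as well as all-zero 2-bit activation segments, the value-level and bit-level sparsity the abstract refers to. The function names (split_2b, fused_multiply) are hypothetical.

# Minimal illustrative sketch (assumed Python model, not the authors' RTL):
# an unsigned 8b x 8b product composed from 2b x 2b partial products in the
# Bit Fusion style, with zero activations and all-zero 2-bit activation
# segments skipped. Signed operands and the 4b x 4b / 2b x 2b precision
# modes of the actual accelerator are omitted for brevity.

def split_2b(x, n_segments=4):
    """Split an unsigned integer into 2-bit segments, least significant first."""
    return [(x >> (2 * i)) & 0b11 for i in range(n_segments)]

def fused_multiply(activation, weight):
    """Compose the full product from shifted 2b x 2b partial products."""
    if activation == 0:            # value-level zero skipping: bypass the whole MAC
        return 0
    product = 0
    for i, a_seg in enumerate(split_2b(activation)):
        if a_seg == 0:             # bit-level skipping of an all-zero activation segment
            continue
        for j, w_seg in enumerate(split_2b(weight)):
            # each 2b x 2b partial product is shifted by the combined segment weight
            product += (a_seg * w_seg) << (2 * (i + j))
    return product

# Sanity check against the native multiply.
for act, wgt in [(0, 77), (3, 3), (170, 85), (255, 255)]:
    assert fused_multiply(act, wgt) == act * wgt

In the full-precision 8b x 8b mode all sixteen 2b x 2b partial products are accumulated; lower-precision modes would simply use fewer segments per operand. The two zero checks stand in for the kind of activation sparsity that the zero-skipping scheme and the bit-splitting strategy exploit in hardware.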


Cited By

  • "Orchestrating Multiple Mixed Precision Models on a Shared Precision-Scalable NPU," in Proc. 25th ACM SIGPLAN/SIGBED Int. Conf. Languages, Compilers, and Tools for Embedded Systems (LCTES), Jun. 2024, pp. 72–82, doi: 10.1145/3652032.3657571.

Information

Publisher

IEEE Press

Qualifiers

  • Research-article
