Abstract
Deep neural networks are widely used in computer vision, pattern recognition, and speech recognition, achieving high accuracy at the cost of substantial computation. The high computational complexity and heavy memory traffic of such networks pose a major challenge to deploying them on resource-limited, low-power embedded systems. Several binary neural networks have been proposed that use only 1-bit values for both weights and activations, substituting complex multiply-accumulate operations with bitwise logic operations to reduce computation and memory usage. However, these quantized neural networks suffer from accuracy loss, especially on large datasets. In this paper, we introduce a quantized neural network with 2-bit weights and activations that is more accurate than state-of-the-art quantized neural networks and approaches the accuracy of full-precision networks. Moreover, we propose E2BNet, an efficient MAC-free hardware architecture that improves power efficiency and throughput/W by about 3.6× and 1.5×, respectively, over state-of-the-art quantized neural networks. E2BNet processes more than 500 images/s on the ImageNet dataset, which not only meets the real-time requirements of image/video processing but also supports high-frame-rate video applications.
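To illustrate the core idea the abstract describes, the sketch below shows how a binary neural network can replace a multiply-accumulate dot product with XNOR and popcount when weights and activations are restricted to {-1, +1}. This is a minimal illustrative example, not the paper's E2BNet architecture or its 2-bit scheme; the bit encoding (1 → +1, 0 → −1) and function name are assumptions for demonstration.

```python
# Illustrative sketch (not the paper's E2BNet implementation): a binary
# neural network replaces multiply-accumulate with XNOR + popcount.
# Vectors over {-1, +1} are packed into integers, one bit per element,
# with bit value 1 encoding +1 and bit value 0 encoding -1.

def binary_dot(a_bits: int, w_bits: int, n: int) -> int:
    """Dot product of two {-1, +1}^n vectors packed into n-bit integers."""
    mask = (1 << n) - 1
    xnor = ~(a_bits ^ w_bits) & mask       # bit is 1 where the signs match
    matches = bin(xnor).count("1")         # popcount: number of matching signs
    return 2 * matches - n                 # matches - mismatches

# Example: a = [+1, -1, +1, +1], w = [+1, +1, -1, +1]
# dot = (+1)(+1) + (-1)(+1) + (+1)(-1) + (+1)(+1) = 0
print(binary_dot(0b1011, 0b1101, 4))  # 0
```

A hardware implementation maps the same idea to an XNOR gate array followed by a popcount tree, which is why such accelerators are described as MAC-free.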
Cite this article
Mirsalari, S.A., Nazari, N., Ansarmohammadi, S.A. et al. E2BNet: MAC-free yet accurate 2-level binarized neural network accelerator for embedded systems. J Real-Time Image Proc 18, 1285–1299 (2021). https://doi.org/10.1007/s11554-021-01148-1