
E2BNet: MAC-free yet accurate 2-level binarized neural network accelerator for embedded systems

  • Special Issue Paper
  • Published in the Journal of Real-Time Image Processing

Abstract

Deep neural networks are widely used in computer vision, pattern recognition, and speech recognition, and they achieve high accuracy at the cost of substantial computation. The high computational complexity and heavy memory traffic of such networks pose a major challenge to deploying them on resource-limited, low-power embedded systems. Several binary neural networks have been proposed that use only 1-bit values for both weights and activations, replacing costly multiply-accumulate (MAC) operations with bitwise logic operations to reduce computation and memory usage. However, these quantized neural networks suffer from accuracy loss, especially on large datasets. In this paper, we introduce a quantized neural network with 2-bit weights and activations that is more accurate than state-of-the-art quantized neural networks and approaches the accuracy of full-precision networks. Moreover, we propose E2BNet, an efficient MAC-free hardware architecture that improves power efficiency and throughput/W by about 3.6× and 1.5×, respectively, compared to state-of-the-art quantized neural networks. E2BNet processes more than 500 images/s on the ImageNet dataset, which not only meets the real-time requirements of image/video processing but also allows deployment in high-frame-rate video applications.
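
To make the MAC-free idea concrete, below is a minimal sketch of the two techniques the abstract describes: replacing multiply-accumulate with XNOR and popcount, and representing values with two binary levels. All names here (binarize, xnor_popcount_dot, residual_binarize) are illustrative assumptions, and the residual decomposition is only one common reading of "2-level binarization"; the actual E2BNet kernels and quantizer may differ.

```python
import numpy as np

def binarize(x):
    """Map full-precision values to {-1, +1} (illustrative sign binarizer)."""
    return np.where(x >= 0, 1, -1).astype(np.int8)

def xnor_popcount_dot(w_bits, a_bits):
    """MAC-free dot product of two {-1, +1} vectors encoded as {0, 1} bits.

    XNOR marks the positions where weight and activation agree, so the
    dot product is (#agree - #disagree) = 2 * popcount(XNOR) - n,
    computed without a single multiplication.
    """
    n = w_bits.size
    agree = np.logical_not(np.logical_xor(w_bits, a_bits))  # bitwise XNOR
    return 2 * int(np.count_nonzero(agree)) - n             # popcount

def residual_binarize(x):
    """One hedged reading of '2-level binarization' (the exact E2BNet
    scheme may differ): approximate x as g1*b1 + g2*b2, where each b
    is a {-1, +1} tensor, so both levels keep XNOR/popcount arithmetic."""
    b1 = binarize(x)
    g1 = float(np.mean(np.abs(x)))         # scale of the first level
    residual = x - g1 * b1
    b2 = binarize(residual)
    g2 = float(np.mean(np.abs(residual)))  # scale of the second level
    return g1, b1, g2, b2

# Sanity check: the XNOR/popcount result matches the ordinary dot product.
rng = np.random.default_rng(0)
w = binarize(rng.standard_normal(64))
a = binarize(rng.standard_normal(64))
w_bits = (w > 0).astype(np.uint8)  # encode -1 -> 0, +1 -> 1
a_bits = (a > 0).astype(np.uint8)
assert xnor_popcount_dot(w_bits, a_bits) == int(w.astype(np.int32) @ a.astype(np.int32))
```

In hardware, the XNOR reduces to an array of 2-input gates and the popcount to an adder tree, which is what lets an accelerator built this way drop MAC units entirely.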

Author information

Correspondence to Najmeh Nazari.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Mirsalari, S.A., Nazari, N., Ansarmohammadi, S.A. et al. E2BNet: MAC-free yet accurate 2-level binarized neural network accelerator for embedded systems. J Real-Time Image Proc 18, 1285–1299 (2021). https://doi.org/10.1007/s11554-021-01148-1
