A Survey on Hardware Accelerator Design of Deep Learning for Edge Devices

Samanta, Anu; Hatai, Indranil; Mal, Ashis Kumar

doi:10.1007/s11277-024-11443-2

Published: 19 July 2024

Volume 137, pages 1715–1760, (2024)

Wireless Personal Communications

Anu Samanta¹,
Indranil Hatai² &
Ashis Kumar Mal³

Abstract

Machine learning (ML) plays a large role in artificial intelligence across a variety of applications. This article provides a comprehensive survey of recent trends and advances in hardware accelerator design for machine learning on hardware platforms such as ASIC, FPGA and GPU. We examine different architectures that enable neural network (NN) execution with respect to computational units, network topologies, dataflow optimization and accelerators based on new technologies. The important features of the various strategies for enhancing acceleration performance are highlighted. Current difficulties, such as fair comparison, as well as potential topics and obstacles in this field, are examined. This study intends to give readers a quick overview of neural network compression and acceleration, a clear evaluation of different methods, and the confidence to get started on the right path.

Data Availability

No data or materials were used for this review article.

Code Availability

No software application or custom code is required in this review article.

Funding

This review article has not been funded by anyone.

Author information

Authors and Affiliations

Department of Electronics and Communication Engineering, Brainware University, Kolkata, India
Anu Samanta
Mathworks India Private Limited, Bangalore, India
Indranil Hatai
Department of Electronics and Communication Engineering, National Institute of Technology, Durgapur, India
Ashis Kumar Mal

Contributions

The first author prepared this manuscript (idea, writing, grammar correction). The second and third authors guided the first author in preparing the manuscript.

Corresponding author

Correspondence to Anu Samanta.

Ethics declarations

Conflict of interest

The authors have no conflicts of interest.

Ethical Approval

This article does not contain any studies with human participants or animals performed by any of the authors.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Samanta, A., Hatai, I. & Mal, A.K. A Survey on Hardware Accelerator Design of Deep Learning for Edge Devices. Wireless Pers Commun 137, 1715–1760 (2024). https://doi.org/10.1007/s11277-024-11443-2

Accepted: 30 June 2024
Published: 19 July 2024
Issue Date: August 2024
DOI: https://doi.org/10.1007/s11277-024-11443-2
