DOI: 10.1145/3697467.3697591
research-article

A Review of FPGA Accelerated Computing Methods for YOLO Models

Published: 08 November 2024

Abstract

In recent years, driven by the rapid evolution of deep learning and neural networks and by the arrival of the big data and intelligence era, YOLO (You Only Look Once), one of the models in the field of object detection, has become a hot research topic. The need to meet the two critical benchmarks of object detection, timeliness and accuracy, has prompted a surge of research into FPGA (Field-Programmable Gate Array)-based acceleration schemes. In this article, we first give an overview of neural networks and hardware platforms, then explore in depth how the YOLO model is implemented on FPGA hardware platforms. We then consolidate and review the current state of FPGA acceleration for YOLO models and analyze the performance of the different acceleration techniques. Finally, we discuss potential directions for future development.

    Published In

    IoTML '24: Proceedings of the 2024 4th International Conference on Internet of Things and Machine Learning
    August 2024
    443 pages
    ISBN:9798400710353
    DOI:10.1145/3697467
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. FPGA (Field-Programmable Gate Array)
    2. YOLO (You Only Look Once)
    3. hardware acceleration
    4. model acceleration

    Qualifiers

    • Research-article

    Conference

    IoTML 2024
