DOI: 10.1145/3125502.3125534

A power-efficient and high performance FPGA accelerator for convolutional neural networks: work-in-progress

Published: 15 October 2017

Abstract

Recently, FPGAs have been widely used to implement hardware accelerators for Convolutional Neural Networks (CNNs), especially on mobile and embedded devices. However, most existing accelerators are designed with the same concept as their ASIC counterparts: all operations from different CNN layers are mapped to the same hardware units and executed in a time-multiplexed way. Although this approach improves the generality of these accelerators, it does not take full advantage of the reconfigurability and customizability of FPGAs, resulting in a certain degree of computational-efficiency degradation, which is even worse on embedded platforms. In this paper, we propose an FPGA-based CNN accelerator in which every layer is mapped to its own on-chip unit, and all units work concurrently as a pipeline. A strategy that finds an optimized parallelism scheme for each layer is proposed to eliminate pipeline stalls and achieve high resource utilization. In addition, a balanced pruning-based method is applied to the fully connected (FC) layers to reduce computational redundancy. As a case study, we implement a widely used CNN model, LeNet-5, on an embedded FPGA device, the Xilinx Zedboard. It achieves a peak performance of 39.78 GOP/s and a power efficiency of 19.6 GOP/s/W, outperforming previous approaches.
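The stall-free layer pipeline described above hinges on balancing per-layer parallelism so that every stage finishes in roughly the same time. The sketch below is a minimal illustration of that balancing idea, not the authors' actual optimization strategy; the function name, layer operation counts, and multiplier budget are hypothetical placeholders.

```python
# Illustrative sketch (assumptions, not from the paper): choose per-layer unroll
# factors proportional to each layer's operation count so that stage latencies
# (ops_i / unroll_i) come out roughly equal, the condition for a stall-free pipeline.

def balance_unroll_factors(layer_ops, total_multipliers):
    """Split a multiplier budget across pipeline stages in proportion to op counts."""
    total_ops = sum(layer_ops)
    unroll = [max(1, round(total_multipliers * ops / total_ops)) for ops in layer_ops]
    latencies = [ops / u for ops, u in zip(layer_ops, unroll)]
    # The slowest stage sets the pipeline's initiation interval.
    return unroll, max(latencies)

# Placeholder multiply-accumulate counts for five layers (not LeNet-5's real numbers):
ops = [240_000, 1_200_000, 480_000, 48_000, 10_000]
unroll, interval = balance_unroll_factors(ops, total_multipliers=220)
print(unroll, interval)
```

As a sanity check on the reported figures, 39.78 GOP/s at 19.6 GOP/s/W corresponds to roughly 39.78 / 19.6 ≈ 2.0 W of power.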



Published In

CODES '17: Proceedings of the Twelfth IEEE/ACM/IFIP International Conference on Hardware/Software Codesign and System Synthesis Companion
October 2017
84 pages
ISBN:9781450351850
DOI:10.1145/3125502

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. CNNs
  2. FPGA-based accelerator
  3. pipelines
  4. power efficient

Qualifiers

  • Research-article

Conference

ESWEEK'17: Thirteenth Embedded Systems Week
October 15 - 20, 2017
Seoul, Republic of Korea

Acceptance Rates

Overall Acceptance Rate 280 of 864 submissions, 32%

Cited By

  • (2024) "A Post-Quantum Encryption Mechanism Based on Convolutional Neural Network Accelerator", IEEE Transactions on Circuits and Systems II: Express Briefs, 71(8):3945-3949. DOI: 10.1109/TCSII.2024.3377460
  • (2024) "A Trusted Inference Mechanism for Edge Computing Based on Post-Quantum Encryption", 2024 IEEE International Symposium on Circuits and Systems (ISCAS), pp. 1-5. DOI: 10.1109/ISCAS58744.2024.10557963
  • (2023) "A Convolutional Computing Design Using Pulsating Arrays", 2023 19th International Conference on Natural Computation, Fuzzy Systems and Knowledge Discovery (ICNC-FSKD), pp. 1-5. DOI: 10.1109/ICNC-FSKD59587.2023.10281046
  • (2022) "WGeod: A General and Efficient FPGA Accelerator for Object Detection", 2022 IEEE Intl Conf on Parallel & Distributed Processing with Applications, Big Data & Cloud Computing, Sustainable Computing & Communications, Social Computing & Networking (ISPA/BDCloud/SocialCom/SustainCom), pp. 730-738. DOI: 10.1109/ISPA-BDCloud-SocialCom-SustainCom57177.2022.00099
  • (2022) "A Survey of FPGA-Based Deep Learning Acceleration Research", The International Conference on Image, Vision and Intelligent Systems (ICIVIS 2021), pp. 59-65. DOI: 10.1007/978-981-16-6963-7_5
  • (2021) "Enhancing Performance of Gabriel Graph-Based Classifiers by a Hardware Co-Processor for Embedded System Applications", IEEE Transactions on Industrial Informatics, 17(2):1186-1196. DOI: 10.1109/TII.2020.2987329
  • (2021) "A High Energy Efficiency and Low Resource Consumption FPGA Accelerator for Convolutional Neural Network", 2021 7th International Conference on Computer and Communications (ICCC), pp. 1278-1283. DOI: 10.1109/ICCC54389.2021.9674340
  • (2021) "Realization of convolution layer using system verilog for achieving parallelism and improvement in performance parameters", International Journal of Information Technology. DOI: 10.1007/s41870-021-00724-9
  • (2020) "An Inference Hardware Accelerator for EEG-Based Emotion Detection", 2020 IEEE International Symposium on Circuits and Systems (ISCAS), pp. 1-5. DOI: 10.1109/ISCAS45731.2020.9180728
  • (2020) "BioCNN: A Hardware Inference Engine for EEG-based Emotion Detection", IEEE Access. DOI: 10.1109/ACCESS.2020.3012900
