Research article
DOI: 10.1145/3477145.3477167

Dynamic Vision Sensor integration on FPGA-based CNN accelerators for high-speed visual classification

Published: 13 October 2021

Abstract

Deep learning is a cutting-edge technology being applied to many fields. For vision applications, Convolutional Neural Networks (CNNs) achieve remarkable accuracy on classification tasks. Numerous hardware accelerators have appeared in recent years to improve on CPU- and GPU-based solutions; these designs are commonly prototyped and tested on FPGAs before being considered for ASIC fabrication and mass production. However, the use of typical commercial cameras (30 fps) limits the capability of such systems for high-speed applications. Dynamic vision sensors (DVS), which emulate the behaviour of a biological retina, are gaining importance for these applications because of their nature: visual information is represented by a continuous stream of spikes (called events), and the frames to be processed by the CNN are constructed by collecting a fixed number of these events. The faster an object moves, the more events the DVS produces, and thus the higher the equivalent frame rate. A DVS therefore allows frames to be computed at the maximum speed a CNN accelerator can offer. In this paper we present a VHDL/HLS description of a pipelined FPGA design that collects events from an Address-Event-Representation (AER) DVS retina and builds a normalized histogram to be used by a particular CNN accelerator, NullHop. VHDL is used to describe the circuit, and HLS for the computation blocks that perform the frame normalization required by the CNN. The results outperform previous implementations of frame collection and normalization on ARM processors running at 800 MHz on a Zynq-7100, in both latency and power consumption. A measured 67% speed-up is reported for a real-time Roshambo CNN experiment running at a 160 fps peak rate.
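The event-to-frame conversion the abstract describes is simple to express in software. Below is a minimal Python sketch of the idea, assuming a 128×128 sensor resolution, a fixed event budget per frame, and a max-based normalization; these parameters and the normalization rule are illustrative assumptions, not the exact configuration of the paper's VHDL/HLS pipeline.

```python
import numpy as np

def events_to_normalized_frame(events, width=128, height=128, events_per_frame=2048):
    """Accumulate a fixed number of DVS events into a 2D histogram and
    scale it to 8-bit pixel values as a CNN input frame.

    `events` is an iterable of (x, y) address pairs decoded from the AER
    stream; resolution and event budget are illustrative, not the paper's.
    """
    frame = np.zeros((height, width), dtype=np.uint32)
    for i, (x, y) in enumerate(events):
        if i == events_per_frame:   # close the frame after a fixed event count:
            break                   # faster scenes fill frames sooner -> higher fps
        frame[y, x] += 1
    peak = frame.max()
    if peak == 0:                   # no events collected: return an empty frame
        return frame.astype(np.uint8)
    # One plausible normalization: scale so the busiest pixel maps to 255
    return (frame * (255.0 / peak)).astype(np.uint8)
```

In the paper's design, the accumulation and normalization run entirely in the FPGA fabric (pipelined VHDL for event collection, HLS-generated blocks for normalization), which is what lets it outperform the 800 MHz ARM-based frame collection in latency and power.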




Published In

ICONS 2021: International Conference on Neuromorphic Systems 2021
July 2021
198 pages
ISBN: 9781450386913
DOI: 10.1145/3477145

Publisher

Association for Computing Machinery

New York, NY, United States



Author Tags

  1. Address-Event-Representation
  2. FPGA
  3. Neuromorphic Engineering
  4. convolutional neural networks

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Funding Sources

  • Agencia Estatal de Investigación

Conference

ICONS 2021

Acceptance Rates

Overall Acceptance Rate 13 of 22 submissions, 59%


Article Metrics

  • Downloads (last 12 months): 95
  • Downloads (last 6 weeks): 9
Reflects downloads up to 10 Nov 2024


Cited By

  • (2024) Memristor–CMOS Hybrid Circuits Implementing Event-Driven Neural Networks for Dynamic Vision Sensor Camera. Micromachines 15(4), 426. Online publication date: 22-Mar-2024. https://doi.org/10.3390/mi15040426
  • (2024) A 593nJ/Inference DVS Hand Gesture Recognition Processor Embedded With Reconfigurable Multiple Constant Multiplication Technique. IEEE Transactions on Circuits and Systems I: Regular Papers 71(6), 2749–2759. Online publication date: Jun-2024. https://doi.org/10.1109/TCSI.2024.3387998
  • (2023) High-definition event frame generation using SoC FPGA devices. 2023 Signal Processing: Algorithms, Architectures, Arrangements, and Applications (SPA), 106–111. Online publication date: 20-Sep-2023. https://doi.org/10.23919/SPA59660.2023.10274447
  • (2023) Within-Camera Multilayer Perceptron DVS Denoising. 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), 3933–3942. Online publication date: Jun-2023. https://doi.org/10.1109/CVPRW59228.2023.00409
  • (2022) Using Deep Reinforcement Learning For Robot Arm Control. Journal of Artificial Intelligence and Capsule Networks 4(3), 160–166. Online publication date: 19-Aug-2022. https://doi.org/10.36548/jaicn.2022.3.002
