Event-Based Gesture Recognition through a Hierarchy of Time-Surfaces for FPGA
Abstract
1. Introduction
- An HDL description and FPGA implementation of HOTS that takes advantage of the FPGA memory organization and of hardware square-root algorithms.
- A real-time demonstration on embedded systems, showing low latency and reduced power consumption.
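
To make the square-root point above concrete, the following is a minimal sketch of a bit-serial integer square root of the kind commonly mapped to FPGA logic, applied to a Euclidean distance between two integer vectors. It is an illustration under assumptions, not the authors' HDL: the names `isqrt_bitwise` and `euclidean_distance`, the 32-bit width, and the use of Python in place of an HDL are hypothetical choices, and the algorithm shown is the textbook shift-and-subtract method rather than whichever variant the implementation actually uses.

```python
from typing import Sequence


def isqrt_bitwise(n: int, bits: int = 32) -> int:
    """Return floor(sqrt(n)) using only shifts, adds, subtracts and compares.

    `bits` is the assumed (even) width of the input register.
    """
    if bits % 2 != 0:
        raise ValueError("bits must be even")
    if n < 0 or n >= (1 << bits):
        raise ValueError("n does not fit in the assumed register width")

    root = 0
    bit = 1 << (bits - 2)      # highest power of four inside the register
    while bit > n:             # skip leading zero digit pairs
        bit >>= 2

    while bit:
        if n >= root + bit:    # trial subtraction succeeds: result bit is 1
            n -= root + bit
            root = (root >> 1) + bit
        else:                  # trial subtraction fails: result bit is 0
            root >>= 1
        bit >>= 2
    return root


def euclidean_distance(a: Sequence[int], b: Sequence[int]) -> int:
    """Integer Euclidean distance (floored) between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    squared = sum((x - y) * (x - y) for x, y in zip(a, b))
    return isqrt_bitwise(squared)


if __name__ == "__main__":
    print(euclidean_distance([0, 3], [4, 0]))  # prints 5
```

The loop uses only shifts, additions, subtractions and comparisons, which is why this family of algorithms suits logic that has no hardware divider.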
2. Materials and Methods
2.1. Event-Based Vision Sensors
2.2. Time-Surfaces
2.3. System Architecture
2.3.1. Time-Surface Generator
2.3.2. Euclidean Distance Estimator
2.3.3. Histograms Generator and Comparator Module
2.3.4. Hardware Implementation
3. Experimental Set-Up and Results
3.1. Loss Test
3.2. Performance Test
4. Discussion and Conclusions
Author Contributions
Funding
Acknowledgments
Conflicts of Interest
References
- Krizhevsky, A.; Sutskever, I.; Hinton, G.E. ImageNet Classification with Deep Convolutional Neural Networks. In Proceedings of the 2012 Advances in Neural Information Processing Systems, Lake Tahoe, NV, USA, 3–6 December 2012; pp. 1097–1105.
- Simonyan, K.; Zisserman, A. Very Deep Convolutional Networks for Large-Scale Image Recognition. arXiv 2015, arXiv:1409.1556v6.
- Szegedy, C.; Liu, W.; Jia, Y.; Sermanet, P.; Reed, S.; Anguelov, D.; Erhan, D.; Vanhoucke, V.; Rabinovich, A. Going Deeper with Convolutions. In Proceedings of the Computer Vision and Pattern Recognition (CVPR), Boston, MA, USA, 7–12 June 2015; pp. 1–9.
- Tallent, N.R.; Gawande, N.A.; Siegel, C.; Vishnu, A.; Hoisie, A. Evaluating On-Node GPU Interconnects for Deep Learning Workloads; Springer: Berlin, Germany, 2018; pp. 3–21.
- Saeed, A.; Al-Hamadi, A.; Niese, R.; Elzobi, M. Frame-Based Facial Expression Recognition Using Geometrical Features. Adv. Hum. Comput. Interact. 2014, 2014.
- Zanchettin, C.; Bezerra, B.L.D.; Azevedo, W.W. A KNN-SVM hybrid model for cursive handwriting recognition. In Proceedings of the 2012 International Joint Conference on Neural Networks (IJCNN), Brisbane, Australia, 10–15 June 2012; pp. 1–8.
- Farabet, C.; Paz, R.; Pérez-Carrasco, J.; Zamarreño, C.; Linares-Barranco, A.; LeCun, Y.; Culurciello, E.; Serrano-Gotarredona, T.; Linares-Barranco, B. Comparison between frame-constrained fix-pixel-value and frame-free spiking-dynamic-pixel ConvNets for visual processing. Front. Neurosci. 2012, 6, 32.
- Mead, C. Analog VLSI and Neural Systems; Addison-Wesley: Boston, MA, USA, 1989.
- Sterling, P.; Laughlin, S. Principles of Neural Design; MIT Press: Cambridge, MA, USA, 2015; pp. 1–542.
- Yang, M.; Chien, C.; Delbrück, T.; Liu, S. A 0.5 V 55 μW 64 × 2 Channel Binaural Silicon Cochlea for Event-Driven Stereo-Audio Sensing. IEEE J. Solid-State Circuits 2016, 51, 2554–2569.
- Jiménez-Fernández, A.; Cerezuela-Escudero, E.; Miró-Amarante, L.; Domínguez-Morales, M.J.; Gomez-Rodríguez, F.; Linares-Barranco, A.; Jiménez-Moreno, G. A Binaural Neuromorphic Auditory Sensor for FPGA: A Spike Signal Processing Approach. IEEE Trans. Neural Netw. Learn. Syst. 2017, 28, 804–818.
- Lichtsteiner, P.; Posch, C.; Delbrück, T. A 128 × 128 120 dB 15 μs Latency Asynchronous Temporal Contrast Vision Sensor. IEEE J. Solid-State Circuits 2008, 43, 566–576.
- Shoushun, C.; Bermak, A. Arbitrated Time-to-First Spike CMOS Image Sensor With On-Chip Histogram Equalization. IEEE Trans. Very Large Scale Integr. VLSI Syst. 2007, 15, 346–357.
- Posch, C.; Matolin, D.; Wohlgenannt, R. A QVGA 143 dB Dynamic Range Frame-Free PWM Image Sensor With Lossless Pixel-Level Video Compression and Time-Domain CDS. IEEE J. Solid-State Circuits 2011, 46, 259–275.
- Leñero-Bardallo, J.A.; Serrano-Gotarredona, T.; Linares-Barranco, B. A 3.6 μs Latency Asynchronous Frame-Free Event-Driven Dynamic-Vision-Sensor. IEEE J. Solid-State Circuits 2011, 46, 1443–1455.
- Brandli, C.; Berner, R.; Yang, M.; Liu, S.; Delbruck, T. A 240 × 180 130 dB 3 μs Latency Global Shutter Spatiotemporal Vision Sensor. IEEE J. Solid-State Circuits 2014, 49, 2333–2341.
- Pardo, F.; Boluda, J.A.; Vegara, F. Selective Change Driven Vision Sensor With Continuous-Time Logarithmic Photoreceptor and Winner-Take-All Circuit for Pixel Selection. IEEE J. Solid-State Circuits 2015, 50, 786–798.
- Son, B.; Suh, Y.; Kim, S.; Jung, H.; Kim, J.; Shin, C.; Park, K.; Lee, K.; Park, J.; Woo, J.; et al. 4.1 A 640 × 480 dynamic vision sensor with a 9 μm pixel and 300 Meps address-event representation. In Proceedings of the 2017 IEEE International Solid-State Circuits Conference (ISSCC), San Francisco, CA, USA, 5–9 February 2017; pp. 66–67.
- Linares-Barranco, A.; Gómez-Rodríguez, F.; Villanueva, V.; Longinotti, L.; Delbrück, T. A USB3.0 FPGA event-based filtering and tracking framework for dynamic vision sensors. In Proceedings of the 2015 IEEE International Symposium on Circuits and Systems (ISCAS), Lisbon, Portugal, 24–27 May 2015; pp. 2417–2420.
- Delbruck, T.; Lang, M. Robotic goalie with 3 ms reaction time at 4% CPU load using event-based dynamic vision sensor. Front. Neurosci. 2013, 7, 223.
- Linares-Barranco, A.; Perez-Peña, F.; Moeys, D.P.; Gomez-Rodriguez, F.; Jimenez-Moreno, G.; Liu, S.; Delbruck, T. Low Latency Event-Based Filtering and Feature Extraction for Dynamic Vision Sensors in Real-Time FPGA Applications. IEEE Access 2019, 7, 134926–134942.
- Linares-Barranco, A.; Liu, H.; Rios-Navarro, A.; Gomez-Rodriguez, F.; Moeys, D.P.; Delbruck, T. Approaching Retinal Ganglion Cell Modeling and FPGA Implementation for Robotics. Entropy 2018, 20, 475.
- Zhao, B.; Ding, R.; Chen, S.; Linares-Barranco, B.; Tang, H. Feedforward Categorization on AER Motion Events Using Cortex-Like Features in a Spiking Neural Network. IEEE Trans. Neural Netw. Learn. Syst. 2015, 26, 1963–1978.
- Tapiador-Morales, R.; Linares-Barranco, A.; Jimenez-Fernandez, A.; Jimenez-Moreno, G. Neuromorphic LIF Row-by-Row Multiconvolution Processor for FPGA. IEEE Trans. Biomed. Circuits Syst. 2019, 13, 159–169.
- Pérez-Carrasco, J.A.; Zhao, B.; Serrano, C.; Acha, B.; Serrano-Gotarredona, T.; Chen, S.; Linares-Barranco, B. Mapping from frame-driven to frame-free event-driven vision systems by low-rate rate coding and coincidence processing--Application to feedforward convnets. IEEE Trans. Pattern Anal. Mach. Intell. 2013, 35, 2706–2719.
- Serrano-Gotarredona, T.; Linares-Barranco, B. Poker-DVS and MNIST-DVS. Their History, How They Were Made, and Other Details. Front. Neurosci. 2015, 9, 481.
- Orchard, G.; Jayawant, A.; Cohen, G.K.; Thakor, N. Converting Static Image Datasets to Spiking Neuromorphic Datasets Using Saccades. Front. Neurosci. 2015, 9, 437.
- Lagorce, X.; Orchard, G.; Galluppi, F.; Shi, B.E.; Benosman, R.B. HOTS: A hierarchy of event-based time-surfaces for pattern recognition. IEEE Trans. Pattern Anal. Mach. Intell. 2017, 39, 1346–1359.
- Furber, S.B.; Lester, D.R.; Plana, L.A.; Garside, J.D.; Painkras, E.; Temple, S.; Brown, A.D. Overview of the SpiNNaker system architecture. IEEE Trans. Comput. 2013, 62, 2454–2467.
- Schmitt, S.; Klähn, J.; Bellec, G.; Grübl, A.; Güttler, M.; Hartel, A.; Hartmann, S.; de Oliveira, D.H.; Husmann, K.; Jeltsch, S.; et al. Neuromorphic hardware in the loop: Training a deep spiking network on the BrainScaleS wafer-scale system. In Proceedings of the 2017 International Joint Conference on Neural Networks (IJCNN), Anchorage, AK, USA, 14–19 May 2017; pp. 2227–2234.
- Akopyan, F.; Sawada, J.; Cassidy, A.; Alvarez-Icaza, R.; Arthur, J.; Merolla, P.; Imam, N.; Nakamura, Y.; Datta, P.; Nam, G.; et al. TrueNorth: Design and Tool Flow of a 65 mW 1 Million Neuron Programmable Neurosynaptic Chip. IEEE Trans. Comput. Aided Des. Integr. Circuits Syst. 2015, 34, 1537–1557.
- Moradi, S.; Qiao, N.; Stefanini, F.; Indiveri, G. A Scalable Multicore Architecture With Heterogeneous Memory Structures for Dynamic Neuromorphic Asynchronous Processors (DYNAPs). IEEE Trans. Biomed. Circuits Syst. 2018, 12, 106–122.
- Lin, C.; Wild, A.; Chinya, G.N.; Cao, Y.; Davies, M.; Lavery, D.M.; Wang, H. Programming Spiking Neural Networks on Intel’s Loihi. Computer 2018, 51, 52–61.
- Furber, S. Large-scale neuromorphic computing systems. J. Neural Eng. 2016, 13, 051001.
- Delbrück, T. jAER Open Source Project (2007). Available online: https://github.com/SensorsINI/jaer (accessed on 14 June 2020).
- Maro, J.; Benosman, R. Event-based Gesture Recognition with Dynamic Background Suppression using Smartphone Computational Capabilities. Front. Neurosci. 2020, 14, 275.
- Piromsopa, K.; Arporntewan, C.; Chongstitvatana, P. An FPGA Implementation of a Fixed-Point Square Root Operation. In Proceedings of the International Symposium on Communications and Information Technology (ISCIT 2001), Chiang Mai, Thailand, 14–16 November 2001; pp. 14–16.
- Li, Y.; Chu, W. A new non-restoring square root algorithm and its VLSI implementations. In Proceedings of the International Conference on Computer Design, Austin, TX, USA, 7–9 October 1996; pp. 538–544.
- Aimar, A.; Mostafa, H.; Calabrese, E.; Rios-Navarro, A.; Tapiador-Morales, R.; Lungu, I.A.; Milde, M.B.; Corradi, F.; Linares-Barranco, A.; Liu, S.C.; et al. NullHop: A Flexible Convolutional Neural Network Accelerator Based on Sparse Representations of Feature Maps. IEEE Trans. Neural Netw. Learn. Syst. 2018, 30, 644–656.
- Berner, R.; Delbrück, T.; Civit-Balcells, A.; Linares-Barranco, A. A 5 Meps $100 USB2.0 address-event monitor-sequencer interface. In Proceedings of the 2007 IEEE International Symposium on Circuits and Systems, New Orleans, LA, USA, 27–30 May 2007; pp. 2451–2454.
- Zamarreño-Ramos, C.; Linares-Barranco, A.; Serrano-Gotarredona, T.; Linares-Barranco, B. Multicasting Mesh AER: A Scalable Assembly Approach for Reconfigurable Neuromorphic Structured AER Systems. Application to ConvNets. IEEE Trans. Biomed. Circuits Syst. 2013, 7, 82–102.
- Baby, S.A.; Vinod, B.; Chinni, C.; Mitra, K. Dynamic Vision Sensors for Human Activity Recognition. In Proceedings of the 2017 4th IAPR Asian Conference on Pattern Recognition (ACPR), Nanjing, China, 26–29 November 2017; pp. 316–321.
- Amir, A.; Taba, B.; Berg, D.; Melano, T.; McKinstry, J.; Nolfo, C.D.; Nayak, T.; Andreopoulos, A.; Garreau, G.; Mendoza, M.; et al. A Low Power, Fully Event-Based Gesture Recognition System. In Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA, 21–26 July 2017; pp. 7388–7397.
- Camuñas-Mesa, L.A.; Domínguez-Cordero, Y.L.; Linares-Barranco, A.; Serrano-Gotarredona, T.; Linares-Barranco, B. A Configurable Event-Driven Convolutional Node with Rate Saturation Mechanism for Modular ConvNet Systems Implementation. Front. Neurosci. 2018, 12, 63.

Resource | Zedboard (xc7020clg482) | Zynq7000 (xc7z100ffg2) |
---|---|---|
LUT | 8313/53,200 (15.6%) | 8351/277,400 (3%) |
LUTRAM | 2879/17,400 (16.5%) | 2872/108,200 (2.6%) |
FF | 5627/106,400 (5.2%) | 6092/554,800 (1.1%) |
DSP | 46/220 (20%) | 46/2020 (2%) |
BRAM | 18/140 (12.8%) | 18/755 (2%) |

Dataset | Maro et al. [36] | Q8.8 | Q16.16 | Q32.32 |
---|---|---|---|---|
NavGestures-sit | 94.5% | 93.3% | 93.72% | 94.1% |

Device | Resource | Q8.8 | Q16.16 | Q32.32 |
---|---|---|---|---|
Zynq7000 (xc7z100ffg2) | LUT | 3% | 4.22% | 4.68% |
Zynq7000 (xc7z100ffg2) | LUTRAM | 0.31% | 0.36% | 1.67% |
Zynq7000 (xc7z100ffg2) | FF | 0.34% | 0.38% | 0.48% |
Zynq7000 (xc7z100ffg2) | DSP | 2% | 3.23% | 6.92% |
Zynq7000 (xc7z100ffg2) | BRAM | 2.12% | 2.12% | 2.12% |
Zedboard (xc7020clg482) | LUT | 15.6% | 16.7% | 22.01% |
Zedboard (xc7020clg482) | LUTRAM | 7.43% | 8.41% | 10.37% |
Zedboard (xc7020clg482) | FF | 1.62% | 1.88% | 2.41% |
Zedboard (xc7020clg482) | DSP | 20.2% | 34.09% | 64.45% |
Zedboard (xc7020clg482) | BRAM | 11.43% | 11.43% | 11.43% |
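
The Q8.8, Q16.16 and Q32.32 labels in the tables above are fixed-point formats. As a rough illustration of what the resolution trade-off means numerically, the sketch below quantizes a value to a format with `int_bits` integer bits and `frac_bits` fractional bits. The signed, saturating, round-to-nearest interpretation and the helper names `to_fixed` and `to_float` are assumptions made for this example; the source does not state how the hardware rounds or saturates.

```python
def to_fixed(x: float, int_bits: int, frac_bits: int) -> int:
    """Quantize x to a signed Q(int_bits).(frac_bits) raw integer.

    Assumption: two's-complement word of int_bits + frac_bits bits,
    round-to-nearest, saturating on overflow.
    """
    total_bits = int_bits + frac_bits
    raw = round(x * (1 << frac_bits))
    lowest = -(1 << (total_bits - 1))
    highest = (1 << (total_bits - 1)) - 1
    return max(lowest, min(highest, raw))


def to_float(raw: int, frac_bits: int) -> float:
    """Convert a raw fixed-point integer back to a float."""
    return raw / (1 << frac_bits)


if __name__ == "__main__":
    value = 0.3
    for int_bits, frac_bits in [(8, 8), (16, 16), (32, 32)]:
        raw = to_fixed(value, int_bits, frac_bits)
        approx = to_float(raw, frac_bits)
        # The rounding error is bounded by half of one least-significant bit.
        print(f"Q{int_bits}.{frac_bits}: raw={raw} error={abs(approx - value):.3e}")
```

Each additional fractional bit halves the worst-case rounding error, while wider words need wider multipliers, which is in line with the rising DSP usage reported for the wider formats.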
© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
Tapiador-Morales, R.; Maro, J.-M.; Jimenez-Fernandez, A.; Jimenez-Moreno, G.; Benosman, R.; Linares-Barranco, A. Event-Based Gesture Recognition through a Hierarchy of Time-Surfaces for FPGA. Sensors 2020, 20, 3404. https://doi.org/10.3390/s20123404