DOI: 10.1145/3212725.3212731
Research article

On-the-fly deterministic binary filters for memory efficient keyword spotting applications on embedded devices

Published: 15 June 2018

Abstract

Lightweight keyword spotting (KWS) applications are often used to trigger the execution of more complex speech recognition algorithms that are too computationally demanding to run constantly on the device. KWS applications typically run on small microcontrollers with very constrained memory (e.g. 128 kB) and compute capability (e.g. a CPU at 80 MHz), which limits the complexity of deployable KWS systems. We present a compact binary architecture with 60% fewer parameters and 50% fewer operations (OPs) at inference than the current state of the art for KWS applications, at the cost of a 3.4% drop in accuracy. It uses binary orthogonal codes to analyse the speech features of a voice command, yielding a model that has a minimal memory footprint and is computationally cheap, making it deployable on very resource-constrained microcontrollers with less than 30 kB of on-chip memory. Our technique offers a different perspective on how filters in neural networks can be constructed at inference time instead of being loaded directly from disk.
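
The core idea above, constructing the filter bank deterministically at inference time from binary orthogonal codes instead of loading learned weights from storage, can be illustrated with a short sketch. The snippet below is a minimal illustration under stated assumptions, not the paper's implementation: it uses the Walsh-Hadamard (Sylvester) construction as one standard source of ±1 orthogonal codes, NumPy as the runtime, and illustrative function names (hadamard, binary_filters, conv2d_valid), filter sizes, and feature-map shapes.

```python
# Minimal sketch (not the authors' implementation): deterministic +/-1 filters
# generated on the fly from a Walsh-Hadamard matrix and applied to a toy
# speech-feature map. Filter sizes, shapes and function names are assumptions.
import numpy as np


def hadamard(order: int) -> np.ndarray:
    """Return an (order x order) +/-1 Hadamard matrix via the Sylvester recursion."""
    assert order > 0 and (order & (order - 1)) == 0, "order must be a power of two"
    H = np.array([[1]], dtype=np.int8)
    while H.shape[0] < order:
        H = np.block([[H, H], [H, -H]])  # H_{2n} = [[H_n, H_n], [H_n, -H_n]]
    return H


def binary_filters(num_filters: int, kh: int, kw: int) -> np.ndarray:
    """Materialise `num_filters` binary (+/-1) filters of shape (kh, kw).

    Rows of a Hadamard matrix are mutually orthogonal, so each filter is one
    non-constant row reshaped to 2-D. Only the three integers describing the
    bank need to be stored; the weights themselves are recomputed at runtime.
    For this sketch kh*kw must be a power of two and num_filters < kh*kw.
    """
    H = hadamard(kh * kw)
    return H[1:num_filters + 1].reshape(num_filters, kh, kw)


def conv2d_valid(x: np.ndarray, w: np.ndarray) -> np.ndarray:
    """Plain 'valid' cross-correlation of a 2-D feature map with one filter."""
    kh, kw = w.shape
    out = np.empty((x.shape[0] - kh + 1, x.shape[1] - kw + 1), dtype=np.float32)
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            # With +/-1 weights this reduces to additions and subtractions.
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * w)
    return out


# Toy usage: 40 feature frames x 10 coefficients (stand-in for MFCC-style input),
# 8 filters of size 4x4 generated at "inference time" rather than loaded from disk.
features = np.random.randn(40, 10).astype(np.float32)
bank = binary_filters(num_filters=8, kh=4, kw=4)
feature_maps = [conv2d_valid(features, f.astype(np.float32)) for f in bank]
print(len(feature_maps), feature_maps[0].shape)  # 8 maps of shape (37, 7)
```

Because the bank is reproduced from a deterministic recursion, only its dimensions need to live in on-chip memory, and the ±1 weights allow the inner products to be computed with additions and subtractions only, which is consistent with the minimal footprint and low operation count claimed in the abstract.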

Information

Published In

EMDL'18: Proceedings of the 2nd International Workshop on Embedded and Mobile Deep Learning
June 2018
51 pages
ISBN: 9781450358446
DOI: 10.1145/3212725
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org

Publisher

Association for Computing Machinery, New York, NY, United States

Publication History

Published: 15 June 2018

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

MobiSys '18

Article Metrics

  • Downloads (last 12 months): 12
  • Downloads (last 6 weeks): 1
Reflects downloads up to 11 Feb 2025

Cited By

  • Mitigating Memory Wall Effects in CNN Engines with On-the-Fly Weights Generation. ACM Transactions on Design Automation of Electronic Systems 28(6), 1-31. DOI: 10.1145/3611673. Online publication date: 16-Oct-2023.
  • Hardware Acceleration for Embedded Keyword Spotting: Tutorial and Survey. ACM Transactions on Embedded Computing Systems 20(6), 1-25. DOI: 10.1145/3474365. Online publication date: 18-Oct-2021.
  • A High Accuracy Multiple-Command Speech Recognition ASIC Based on Configurable One-Dimension Convolutional Neural Network. 2021 IEEE International Symposium on Circuits and Systems (ISCAS), 1-4. DOI: 10.1109/ISCAS51556.2021.9401401. Online publication date: May-2021.
  • unzipFPGA: Enhancing FPGA-based CNN Engines with On-the-Fly Weights Generation. 2021 IEEE 29th Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM), 165-175. DOI: 10.1109/FCCM51124.2021.00027. Online publication date: May-2021.
  • Deterministic binary filters for convolutional neural networks. Proceedings of the 27th International Joint Conference on Artificial Intelligence, 2739-2747. DOI: 10.5555/3304889.3305041. Online publication date: 13-Jul-2018.
  • Approximate Computing for Energy-Constrained DNN-Based Speech Recognition. In Approximate Computing, 451-480. DOI: 10.1007/978-3-030-98347-5_18. Online publication date: 24-Feb-2012.
