DOI: 10.1145/3212725.3212731
Research article

On-the-fly deterministic binary filters for memory efficient keyword spotting applications on embedded devices

Published: 15 June 2018

Abstract

Lightweight keyword spotting (KWS) applications are often used to trigger the execution of more complex speech recognition algorithms that are too computationally demanding to run constantly on the device. KWS applications typically run on small microcontrollers with very constrained memory (e.g. 128 kB) and compute capability (e.g. a CPU at 80 MHz), which limits the complexity of deployable KWS systems. We present a compact binary architecture with 60% fewer parameters and 50% fewer operations (OPs) at inference than the current state of the art for KWS applications, at the cost of a 3.4% drop in accuracy. It uses binary orthogonal codes to analyse the speech features of a voice command, yielding a model that has a minimal memory footprint and is computationally cheap, making it deployable on very resource-constrained microcontrollers with less than 30 kB of on-chip memory. Our technique offers a different perspective on how filters in neural networks can be constructed at inference time instead of being loaded directly from disk.
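
The core idea above, constructing the filter bank deterministically at inference time from binary orthogonal codes instead of loading learned weights from storage, can be illustrated with a short sketch. The snippet below is a minimal illustration under stated assumptions, not the paper's implementation: it uses the Walsh-Hadamard (Sylvester) construction as one standard source of ±1 orthogonal codes, NumPy as the runtime, and illustrative function names (hadamard, binary_filters, conv2d_valid), filter sizes, and feature-map shapes.

```python
# Minimal sketch (not the authors' implementation): deterministic +/-1 filters
# generated on the fly from a Walsh-Hadamard matrix and applied to a toy
# speech-feature map. Filter sizes, shapes and function names are assumptions.
import numpy as np


def hadamard(order: int) -> np.ndarray:
    """Return an (order x order) +/-1 Hadamard matrix via the Sylvester recursion."""
    assert order > 0 and (order & (order - 1)) == 0, "order must be a power of two"
    H = np.array([[1]], dtype=np.int8)
    while H.shape[0] < order:
        H = np.block([[H, H], [H, -H]])  # H_{2n} = [[H_n, H_n], [H_n, -H_n]]
    return H


def binary_filters(num_filters: int, kh: int, kw: int) -> np.ndarray:
    """Materialise `num_filters` binary (+/-1) filters of shape (kh, kw).

    Rows of a Hadamard matrix are mutually orthogonal, so each filter is one
    non-constant row reshaped to 2-D. Only the three integers describing the
    bank need to be stored; the weights themselves are recomputed at runtime.
    For this sketch kh*kw must be a power of two and num_filters < kh*kw.
    """
    H = hadamard(kh * kw)
    return H[1:num_filters + 1].reshape(num_filters, kh, kw)


def conv2d_valid(x: np.ndarray, w: np.ndarray) -> np.ndarray:
    """Plain 'valid' cross-correlation of a 2-D feature map with one filter."""
    kh, kw = w.shape
    out = np.empty((x.shape[0] - kh + 1, x.shape[1] - kw + 1), dtype=np.float32)
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            # With +/-1 weights this reduces to additions and subtractions.
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * w)
    return out


# Toy usage: 40 feature frames x 10 coefficients (stand-in for MFCC-style input),
# 8 filters of size 4x4 generated at "inference time" rather than loaded from disk.
features = np.random.randn(40, 10).astype(np.float32)
bank = binary_filters(num_filters=8, kh=4, kw=4)
feature_maps = [conv2d_valid(features, f.astype(np.float32)) for f in bank]
print(len(feature_maps), feature_maps[0].shape)  # 8 maps of shape (37, 7)
```

Because the bank is reproduced from a deterministic recursion, only its dimensions need to live in on-chip memory, and the ±1 weights allow the inner products to be computed with additions and subtractions only, which is consistent with the minimal footprint and low operation count claimed in the abstract.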

Information

Published In

EMDL'18: Proceedings of the 2nd International Workshop on Embedded and Mobile Deep Learning
June 2018
51 pages
ISBN: 9781450358446
DOI: 10.1145/3212725
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org

Publisher

Association for Computing Machinery, New York, NY, United States

Publication History

Published: 15 June 2018

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

MobiSys '18

Article Metrics

  • Downloads (last 12 months): 12
  • Downloads (last 6 weeks): 1
Reflects downloads up to 11 Feb 2025

Cited By

  • Mitigating Memory Wall Effects in CNN Engines with On-the-Fly Weights Generation. ACM Transactions on Design Automation of Electronic Systems 28(6), 1-31. DOI: 10.1145/3611673. Online publication date: 16-Oct-2023.
  • Hardware Acceleration for Embedded Keyword Spotting: Tutorial and Survey. ACM Transactions on Embedded Computing Systems 20(6), 1-25. DOI: 10.1145/3474365. Online publication date: 18-Oct-2021.
  • A High Accuracy Multiple-Command Speech Recognition ASIC Based on Configurable One-Dimension Convolutional Neural Network. 2021 IEEE International Symposium on Circuits and Systems (ISCAS), 1-4. DOI: 10.1109/ISCAS51556.2021.9401401. Online publication date: May-2021.
  • unzipFPGA: Enhancing FPGA-based CNN Engines with On-the-Fly Weights Generation. 2021 IEEE 29th Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM), 165-175. DOI: 10.1109/FCCM51124.2021.00027. Online publication date: May-2021.
  • Deterministic binary filters for convolutional neural networks. Proceedings of the 27th International Joint Conference on Artificial Intelligence, 2739-2747. DOI: 10.5555/3304889.3305041. Online publication date: 13-Jul-2018.
  • Approximate Computing for Energy-Constrained DNN-Based Speech Recognition. In Approximate Computing, 451-480. DOI: 10.1007/978-3-030-98347-5_18. Online publication date: 24-Feb-2012.
