research-article

embARC MLI based Design and Implementation of Real-time Keyword Spotting

Authors:

Liangliang Lu

ICIT '20: Proceedings of the 2020 8th International Conference on Information Technology: IoT and Smart City

Pages 46 - 50

https://doi.org/10.1145/3446999.3447008

Published: 09 April 2021

Abstract

Efficient inference is essential for neural network applications on edge devices. This paper presents a neural-network-based keyword spotting (KWS) system built with the embARC MLI library and the ARC EM9D microprocessor. The embARC MLI library is an open-source, highly optimized machine learning inference library for IoT edge devices. With its unique XY architecture, the EM9D processor achieves high efficiency when executing continuous MAC instructions. The performance of this combination is analyzed in detail and compared with that of other processors. Because edge devices generally have limited computing and memory resources, many optimization tasks are required, and these are also presented in the paper. The paper shows that, with highly optimized code tailored to particular hardware, AI applications can meet real-time requirements even on low-cost edge devices.
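To make the phrase "continuous MAC instructions" concrete, the following is a minimal sketch of the kind of inner loop that dominates neural network inference: a fixed-point dot product with one multiply-accumulate (MAC) per element. It is written in portable C and is not the embARC MLI API or the paper's code. The Q15 number format, the 64-bit accumulator, and the function names `sat_q15` and `dot_q15` are assumptions made for illustration only.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: clamp a wide value to the signed 16-bit (Q15) range. */
static int16_t sat_q15(int64_t x)
{
    if (x > INT16_MAX) return INT16_MAX;
    if (x < INT16_MIN) return INT16_MIN;
    return (int16_t)x;
}

/* Hypothetical kernel: dot product of two Q15 vectors.
 * Each iteration is one multiply-accumulate (MAC). A Q15 x Q15 product is
 * Q30; products are summed in a 64-bit accumulator, then rounded, shifted
 * back to Q15, and saturated.
 * Assumption: >> on a negative value is an arithmetic shift, which holds
 * on common compilers but is implementation-defined in ISO C. */
int16_t dot_q15(const int16_t *x, const int16_t *w, size_t n)
{
    int64_t acc = 0;
    for (size_t i = 0; i < n; ++i) {
        acc += (int32_t)x[i] * (int32_t)w[i];   /* MAC step */
    }
    acc = (acc + ((int64_t)1 << 14)) >> 15;     /* round, Q30 -> Q15 */
    return sat_q15(acc);
}
```

For example, with a single element of 16384 (0.5 in Q15) in each vector, `dot_q15` returns 8192 (0.25 in Q15). A processor that can fetch both operands and perform the MAC in one cycle executes the body of this loop back to back, which is the situation the abstract describes.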

Cited By

Birke, S., Hartmann, B., Auras, D., Wloka, M., Ascheid, G., & Leupers, R. (2022). Design and Exploration of an ARC-Coprocessor for LSTM Based Audio Applications. 2022 IEEE Nordic Circuits and Systems Conference (NorCAS), 1-7. Online publication date: 25-Oct-2022. https://doi.org/10.1109/NorCAS57515.2022.9934553

Published In

ICIT '20: Proceedings of the 2020 8th International Conference on Information Technology: IoT and Smart City

December 2020

266 pages

ISBN:9781450388559

DOI:10.1145/3446999

Copyright © 2020 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article
Research
Refereed limited

Funding Sources

National Natural Science Foundation of China

Conference

ICIT 2020

ICIT 2020: IoT and Smart City

December 25 - 27, 2020

Xi'an, China
