DOI: 10.1145/2742647.2742658
research-article

AccelWord: Energy Efficient Hotword Detection through Accelerometer

Published: 18 May 2015

Abstract

Voice control has emerged as a popular method for interacting with smart devices such as smartphones and smartwatches. Popular voice control applications like Siri and Google Now are already used by a large number of smartphone and tablet users. A major challenge in designing a voice control application is that it requires continuous monitoring of the user's voice input through the microphone. Such applications use hotwords such as "Okay Google" or "Hi Galaxy" to distinguish the user's voice commands from her other conversations. A voice control application has to listen continuously for hotwords, which significantly increases the energy consumption of smart devices.
To address this energy efficiency problem of voice control, we present AccelWord. AccelWord is based on the empirical evidence that accelerometer sensors found in today's mobile devices are sensitive to the user's voice. We also demonstrate that the effect of the user's voice on accelerometer data is rich enough to detect the hotwords spoken by the user. To achieve low energy cost together with high detection accuracy, we address multiple challenges, e.g., how to extract unique signatures of the user's spoken hotwords from accelerometer data alone and how to reduce the interference caused by the user's mobility.
Finally, we implement AccelWord as a standalone application running on Android devices. Comprehensive tests show that AccelWord has a hotword detection accuracy of 85% in static scenarios and 80% in mobile scenarios. Compared to microphone-based hotword detection applications such as Google Now and Samsung S Voice, AccelWord is 2 times more energy efficient while achieving accuracies of 98% and 92% in static and mobile scenarios, respectively.
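
The abstract names two steps without detailing them: suppressing interference from the user's mobility and extracting signatures from accelerometer data. The sketch below illustrates one plausible shape for such a pipeline; it is not the authors' method. It assumes a single accelerometer axis, a first-order high-pass filter to attenuate slow motion, and a hypothetical feature set (mean, standard deviation, zero-crossing rate) per window. The function names, the 200 Hz sampling rate, `alpha`, and `window_size` are all illustrative assumptions.

```python
import math
from statistics import mean, pstdev


def high_pass(samples, alpha=0.8):
    """First-order high-pass filter (assumed, not from the paper).

    Attenuates slowly varying components such as gravity or walking motion
    and keeps faster vibrations. `alpha` close to 1 keeps more low frequencies.
    """
    if not samples:
        return []
    filtered = [0.0]
    for prev, cur in zip(samples, samples[1:]):
        filtered.append(alpha * (filtered[-1] + cur - prev))
    return filtered


def window_features(window):
    """Hypothetical signature of one window: mean, spread, zero-crossing rate."""
    crossings = sum(1 for a, b in zip(window, window[1:]) if a * b < 0)
    return {
        "mean": mean(window),
        "std": pstdev(window),
        "zcr": crossings / (len(window) - 1),
    }


def extract_signatures(samples, window_size=50, alpha=0.8):
    """Filter one accelerometer axis, then featurize non-overlapping windows."""
    filtered = high_pass(samples, alpha)
    return [
        window_features(filtered[i:i + window_size])
        for i in range(0, len(filtered) - window_size + 1, window_size)
    ]


if __name__ == "__main__":
    rate_hz = 200  # assumed sampling rate, for illustration only
    # Synthetic axis: gravity offset + slow 1 Hz sway + small 40 Hz vibration.
    axis = [
        9.8
        + 0.5 * math.sin(2 * math.pi * 1 * t / rate_hz)
        + 0.05 * math.sin(2 * math.pi * 40 * t / rate_hz)
        for t in range(2 * rate_hz)
    ]
    for features in extract_signatures(axis):
        print(features)
```

In a real system, the per-window feature dictionaries would be fed to a trained classifier that decides whether a hotword was spoken; that step is omitted here because the abstract does not describe it.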



Published In

MobiSys '15: Proceedings of the 13th Annual International Conference on Mobile Systems, Applications, and Services
May 2015
516 pages
ISBN: 9781450334945
DOI: 10.1145/2742647

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. accelerometer
  2. accelword
  3. energy
  4. hotword detection
  5. measurement

Qualifiers

  • Research-article

Conference

MobiSys'15

Acceptance Rates

MobiSys '15 Paper Acceptance Rate 29 of 219 submissions, 13%;
Overall Acceptance Rate 274 of 1,679 submissions, 16%
