research-article

AccelWord: Energy Efficient Hotword Detection through Accelerometer

Authors:

Parth H. Pathak,

Prasant Mohapatra

MobiSys '15: Proceedings of the 13th Annual International Conference on Mobile Systems, Applications, and Services

Pages 301 - 315

https://doi.org/10.1145/2742647.2742658

Published: 18 May 2015

Abstract

Voice control has emerged as a popular method for interacting with smart devices such as smartphones and smartwatches. Popular voice control applications like Siri and Google Now are already used by a large number of smartphone and tablet users. A major challenge in designing a voice control application is that it requires continuous monitoring of the user's voice input through the microphone. Such applications rely on hotwords such as "Okay Google" or "Hi Galaxy" to distinguish the user's voice commands from her other conversations. Because a voice control application has to listen for hotwords continuously, it significantly increases the energy consumption of smart devices.

To address this energy efficiency problem of voice control, we present AccelWord. AccelWord is based on the empirical evidence that the accelerometer sensors found in today's mobile devices are sensitive to the user's voice. We also demonstrate that the effect of the user's voice on accelerometer data is rich enough to detect the hotwords spoken by the user. To achieve low energy cost together with high detection accuracy, we address multiple challenges, e.g., how to extract unique signatures of the user's spoken hotwords from accelerometer data alone and how to reduce the interference caused by the user's mobility.

Finally, we implement AccelWord as a standalone application running on Android devices. Comprehensive tests show that AccelWord has a hotword detection accuracy of 85% in static scenarios and 80% in mobile scenarios. Compared to microphone-based hotword detection applications such as Google Now and Samsung S Voice, AccelWord is 2 times more energy efficient while achieving an accuracy of 98% and 92% in static and mobile scenarios, respectively.
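
The abstract names two challenges: extracting signatures from accelerometer data and reducing interference from user mobility. As a rough illustration only, the sketch below shows one generic way such a pipeline might begin: a first-order high-pass filter to suppress slow components (gravity, body motion), followed by simple per-window time-domain features. This is not taken from the paper. The function names, the filter constant `alpha`, the window size, and the feature set are all assumptions chosen for illustration.

```python
import math
from typing import Dict, List


def high_pass(samples: List[float], alpha: float = 0.9) -> List[float]:
    """First-order high-pass filter that suppresses slowly varying components.

    alpha is a hypothetical constant, not a value reported by the authors.
    """
    if not samples:
        return []
    filtered = [0.0]
    for i in range(1, len(samples)):
        # y[i] = alpha * (y[i-1] + x[i] - x[i-1])
        filtered.append(alpha * (filtered[-1] + samples[i] - samples[i - 1]))
    return filtered


def window_features(window: List[float]) -> Dict[str, float]:
    """Compute simple time-domain features for one window of single-axis readings."""
    n = len(window)
    if n == 0:
        raise ValueError("window must not be empty")
    mean = sum(window) / n
    variance = sum((x - mean) ** 2 for x in window) / n
    # Count sign changes between consecutive samples.
    crossings = sum(1 for a, b in zip(window, window[1:]) if (a < 0) != (b < 0))
    return {
        "mean": mean,
        "std": math.sqrt(variance),
        "energy": sum(x * x for x in window) / n,
        "zero_crossings": float(crossings),
    }


def extract_features(samples: List[float], window_size: int = 50) -> List[Dict[str, float]]:
    """Filter the signal, then compute features over non-overlapping windows."""
    filtered = high_pass(samples)
    return [
        window_features(filtered[start:start + window_size])
        for start in range(0, len(filtered) - window_size + 1, window_size)
    ]
```

Feature vectors like these could then be passed to any classifier trained to separate hotword windows from other windows. The choice of classifier and features here is left open, since the abstract does not specify them.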

Index Terms

AccelWord: Energy Efficient Hotword Detection through Accelerometer
1. Hardware
  1. Communication hardware, interfaces and storage
    1. Signal processing systems
    2. Sound-based input / output
2. Human-centered computing
  1. Human computer interaction (HCI)
    1. Interaction devices
      1. Sound-based input / output

Information & Contributors

Information

Published In

MobiSys '15: Proceedings of the 13th Annual International Conference on Mobile Systems, Applications, and Services

May 2015

516 pages

ISBN: 9781450334945

DOI: 10.1145/2742647

General Chairs: Gaetano Borriello (University of Washington, USA), Giovanni Pau (UPMC-LIP6, France / UCLA, USA)

Program Chairs: Marco Gruteser (Rutgers University, USA), Jason Hong (Carnegie Mellon University, USA)

Copyright © 2015 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Sponsors

SIGMOBILE: ACM Special Interest Group on Mobility of Systems, Users, Data and Computing

In-Cooperation

SIGOPS: ACM Special Interest Group on Operating Systems

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Conference

MobiSys'15

Sponsor:

SIGMOBILE

MobiSys'15: The 13th Annual International Conference on Mobile Systems, Applications, and Services

May 18 - 22, 2015

Florence, Italy

Acceptance Rates

MobiSys '15 Paper Acceptance Rate 29 of 219 submissions, 13%;

Overall Acceptance Rate 274 of 1,679 submissions, 16%

Bibliometrics

Total Citations: 92

Total Downloads: 1,008

Downloads (Last 12 months): 66

Downloads (Last 6 weeks): 2

Reflects downloads up to 11 Feb 2025

Cited By

Zhao G, Shen Y, Li F, Liu L, Cui L, Wen H (2025). Ui-Ear: On-Face Gesture Recognition Through On-Ear Vibration Sensing. IEEE Transactions on Mobile Computing, 24:3 (1482-1495). Online publication date: Mar-2025. https://doi.org/10.1109/TMC.2024.3480216

Zhang G, Fu H, Xiang Z, Zhou X, Hu P, Cheng X, Yang Y (2025). Ambient Light Reflection-Based Eavesdropping Enhanced With cGAN. IEEE Transactions on Mobile Computing, 24:1 (72-85). Online publication date: Jan-2025. https://doi.org/10.1109/TMC.2024.3460392

Chen Y, Yu J, Kong L, Zhu Y (2025). A Comprehensive Survey of Side-Channel Sound-Sensing Methods. IEEE Internet of Things Journal, 12:2 (1554-1578). Online publication date: 15-Jan-2025. https://doi.org/10.1109/JIOT.2024.3501334

Liu Y, Hu Q, Kong L, Salakhutdinov R, Kolter Z, Heller K, Weller A, Oliver N, Scarlett J, Berkenkamp F (2024). Tuning-free estimation and inference of cumulative distribution function under local differential privacy. Proceedings of the 41st International Conference on Machine Learning (31147-31164). Online publication date: 21-Jul-2024. https://dl.acm.org/doi/10.5555/3692070.3693329

Yao Q, Liu Y, Sun X, Dong X, Ji X, Ma J, Luo B, Liao X, Xu J, Kirda E, Lie D (2024). Watch the Rhythm: Breaking Privacy with Accelerometer at the Extremely-Low Sampling Rate of 5Hz. Proceedings of the 2024 on ACM SIGSAC Conference on Computer and Communications Security (1776-1790). Online publication date: 2-Dec-2024. https://dl.acm.org/doi/10.1145/3658644.3690370

Zhang T, Ji Q, Ye Z, Akanda M, Mahdad A, Shi C, Wang Y, Saxena N, Chen Y, Luo B, Liao X, Xu J, Kirda E, Lie D (2024). SAFARI: Speech-Associated Facial Authentication for AR/VR Settings via Robust VIbration Signatures. Proceedings of the 2024 on ACM SIGSAC Conference on Computer and Communications Security (153-167). Online publication date: 2-Dec-2024. https://dl.acm.org/doi/10.1145/3658644.3670358

Chen Y, Yu J, Chen Y, Kong L, Zhu Y, Chen Y, Okoshi T, Ko J, LiKamWa R (2024). RFSpy: Eavesdropping on Online Conversations with Out-of-Vocabulary Words by Sensing Metal Coil Vibration of Headsets Leveraging RFID. Proceedings of the 22nd Annual International Conference on Mobile Systems, Applications and Services (169-182). Online publication date: 3-Jun-2024. https://dl.acm.org/doi/10.1145/3643832.3661887

Liao Q, Huang Y, Huang Y, Wu K (2024). An Eavesdropping System Based on Magnetic Side-Channel Signals Leaked by Speakers. ACM Transactions on Sensor Networks, 20:2 (1-30). Online publication date: 10-Jan-2024. https://dl.acm.org/doi/10.1145/3637063

Han F, Yang P, Du H, Li X (2024). Accuth+: Accelerometer-Based Anti-Spoofing Voice Authentication on Wrist-Worn Wearables. IEEE Transactions on Mobile Computing, 23:5 (5571-5588). Online publication date: May-2024. https://doi.org/10.1109/TMC.2023.3314837

Hu P, Li W, Ma Y, Santhalingam P, Pathak P, Li H, Zhang H, Zhang G, Cheng X, Mohapatra P (2024). Towards Unconstrained Vocabulary Eavesdropping With mmWave Radar Using GAN. IEEE Transactions on Mobile Computing, 23:1 (941-954). Online publication date: Jan-2024. https://doi.org/10.1109/TMC.2022.3226690
