DSP.Ear: leveraging co-processor support for continuous audio sensing on smartphones

Authors: Petko Georgiev, Nicholas D. Lane, Kiran K. Rachuri, Cecilia Mascolo

SenSys '14: Proceedings of the 12th ACM Conference on Embedded Network Sensor Systems

Pages 295-309

https://doi.org/10.1145/2668332.2668349

Published: 03 November 2014

Abstract

The rapidly growing adoption of sensor-enabled smartphones has fueled the proliferation of applications that use phone sensors to monitor user behavior. Central among these sensors is the microphone, which enables, for instance, the detection of valence in speech or the identification of speakers. Deploying several of these applications on a mobile device to continuously monitor the audio environment yields a diverse range of sound-related contextual inferences. However, the cumulative processing burden critically impacts the phone battery.

To address this problem, we propose DSP.Ear, an integrated sensing system that takes advantage of the latest low-power DSP co-processor technology in commodity mobile devices to enable the continuous and simultaneous operation of multiple established algorithms that perform complex audio inferences. The system extracts emotions from voice, estimates the number of people in a room, identifies the speakers, and detects commonly found ambient sounds, while incurring little overhead on the device battery. This is achieved through a series of pipeline optimizations that allow the computation to remain largely on the DSP. Through detailed evaluation of our prototype implementation we show that, by exploiting a smartphone's co-processor, DSP.Ear achieves a 3 to 7 times increase in battery lifetime compared to a solution that uses only the phone's main processor. In addition, DSP.Ear is 2 to 3 times more power efficient than a naïve DSP solution without optimizations. We further analyze a large-scale dataset from 1320 Android users to show that in about 80-90% of the daily usage instances DSP.Ear can sustain a full day of operation on a single battery charge, even in the presence of other smartphone workloads.
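To make the battery-lifetime comparison concrete, the short sketch below shows how a reduction in average power draw translates into a lifetime gain. It is only an illustration of the arithmetic: the capacity and power figures are hypothetical placeholders, not measurements from the paper, and the function names are assumptions introduced here.

```python
def battery_lifetime_hours(capacity_mwh: float, avg_power_mw: float) -> float:
    """Hours a full battery lasts at a constant average power draw."""
    if avg_power_mw <= 0:
        raise ValueError("average power must be positive")
    return capacity_mwh / avg_power_mw


def lifetime_gain(baseline_power_mw: float, optimized_power_mw: float) -> float:
    """Factor by which lifetime grows when average power drops."""
    if baseline_power_mw <= 0 or optimized_power_mw <= 0:
        raise ValueError("power values must be positive")
    return baseline_power_mw / optimized_power_mw


# Hypothetical figures for illustration only; NOT taken from the paper.
CAPACITY_MWH = 8000.0   # assumed battery capacity
CPU_ONLY_MW = 400.0     # assumed average draw of a main-processor-only pipeline
DSP_MW = 80.0           # assumed average draw of a DSP-resident pipeline

cpu_hours = battery_lifetime_hours(CAPACITY_MWH, CPU_ONLY_MW)
dsp_hours = battery_lifetime_hours(CAPACITY_MWH, DSP_MW)
gain = lifetime_gain(CPU_ONLY_MW, DSP_MW)
print(f"CPU-only: {cpu_hours:.1f} h, DSP: {dsp_hours:.1f} h, gain: {gain:.1f}x")
```

Under these assumed numbers the gain is simply the ratio of the two average power draws, so a reported lifetime increase of 3 to 7 times corresponds to an average power draw roughly 3 to 7 times lower for the same battery.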

Index Terms

1. Computer systems organization
  1. Embedded and cyber-physical systems
  2. Real-time systems
2. Human-centered computing
  1. Human computer interaction (HCI)

Published In

SenSys '14: Proceedings of the 12th ACM Conference on Embedded Network Sensor Systems

November 2014

380 pages

ISBN:9781450331432

DOI:10.1145/2668332

General Chair: Ákos Lédeczi (Vanderbilt University)

Program Chairs: Prabal Dutta (University of Michigan), Chenyang Lu (Washington Univ. in St. Louis)

Copyright © 2014 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Conference

SenSys '14

SenSys '14: The 12th ACM Conference on Embedded Network Sensor Systems

November 3 - 6, 2014

Memphis, Tennessee

Acceptance Rates

Overall Acceptance Rate 198 of 990 submissions, 20%
