DOI: 10.1145/2668332.2668349
research-article

DSP.Ear: leveraging co-processor support for continuous audio sensing on smartphones

Published: 03 November 2014

Abstract

The rapidly growing adoption of sensor-enabled smartphones has fueled the proliferation of applications that use phone sensors to monitor user behavior. A central sensor among these is the microphone, which enables, for instance, the detection of valence in speech or the identification of speakers. Deploying several of these applications on a mobile device to continuously monitor the audio environment yields a diverse range of sound-related contextual inferences. However, the cumulative processing burden critically impacts the phone battery.
To address this problem, we propose DSP.Ear -- an integrated sensing system that takes advantage of the latest low-power DSP co-processor technology in commodity mobile devices to enable the continuous and simultaneous operation of multiple established algorithms that perform complex audio inferences. The system extracts emotions from voice, estimates the number of people in a room, identifies the speakers, and detects commonly found ambient sounds, while critically incurring little overhead to the device battery. This is achieved through a series of pipeline optimizations that allow the computation to remain largely on the DSP. Through detailed evaluation of our prototype implementation we show that, by exploiting a smartphone's co-processor, DSP.Ear achieves a 3 to 7 times increase in the battery lifetime compared to a solution that uses only the phone's main processor. In addition, DSP.Ear is 2 to 3 times more power efficient than a naïve DSP solution without optimizations. We further analyze a large-scale dataset from 1320 Android users to show that in about 80-90% of the daily usage instances DSP.Ear is able to sustain a full day of operation (even in the presence of other smartphone workloads) with a single battery charge.
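
As a rough illustration of how a lower average power draw translates into the lifetime multipliers quoted above, the following sketch computes battery lifetime as capacity divided by average power. The battery capacity and the two power figures are hypothetical values chosen only for the arithmetic; they are not measurements from the paper, and the function names are assumptions made for this example.

```python
def battery_lifetime_hours(capacity_mwh: float, avg_power_mw: float) -> float:
    """Hours a battery of the given capacity lasts at a constant average draw."""
    if avg_power_mw <= 0:
        raise ValueError("average power must be positive")
    return capacity_mwh / avg_power_mw


def lifetime_gain(baseline_power_mw: float, optimized_power_mw: float) -> float:
    """Lifetime multiplier obtained by lowering the average draw.

    With a fixed battery capacity, lifetime is inversely proportional to
    average power, so the gain is simply the ratio of the two draws.
    """
    return baseline_power_mw / optimized_power_mw


# Hypothetical figures, NOT taken from the paper.
CAPACITY_MWH = 8000.0   # assumed battery capacity
CPU_ONLY_MW = 350.0     # assumed average draw with CPU-only sensing
DSP_MW = 70.0           # assumed average draw with DSP-resident sensing

cpu_hours = battery_lifetime_hours(CAPACITY_MWH, CPU_ONLY_MW)
dsp_hours = battery_lifetime_hours(CAPACITY_MWH, DSP_MW)
gain = lifetime_gain(CPU_ONLY_MW, DSP_MW)

print(f"CPU-only: {cpu_hours:.1f} h, DSP: {dsp_hours:.1f} h, gain: {gain:.1f}x")
```

Under these assumed draws the gain is the ratio of the two power figures, which is why keeping computation on the low-power co-processor dominates the lifetime estimate.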

        Published In

        SenSys '14: Proceedings of the 12th ACM Conference on Embedded Network Sensor Systems
        November 2014
        380 pages
        ISBN: 9781450331432
        DOI: 10.1145/2668332

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Author Tags

        1. DSP
        2. audio
        3. co-processor
        4. energy
        5. mobile sensing

        Qualifiers

        • Research-article

        Acceptance Rates

        Overall Acceptance Rate 198 of 990 submissions, 20%
