
Characterizing the Effect of Audio Degradation on Privacy Perception and Inference Performance in Audio-Based Human Activity Recognition

Published: 05 October 2020
DOI: 10.1145/3379503.3403551

Abstract

Audio has been increasingly adopted as a sensing modality in a variety of human-centered mobile applications and in smart assistants in the home. Although acoustic features can capture complex semantic information about human activities and context, continuous audio recording often poses significant privacy concerns. An intuitive way to reduce privacy concerns is to degrade audio quality such that speech and other relevant acoustic markers become unintelligible, but this often comes at the cost of activity recognition performance. In this paper, we employ a mixed-methods approach to characterize this balance. We first conduct an online survey with 266 participants to capture their perception of privacy, qualitatively and quantitatively, under degraded audio. Given our finding that privacy concerns can be significantly reduced at high levels of audio degradation, we then investigate how intentional degradation of audio frames affects recognition of the target classes while maintaining effective privacy mitigation. Our results indicate that degrading audio frames has minimal effect on audio recognition using frame-level features. Degradation can hurt the performance of audio recognition using segment-level features to some extent, though such features may still yield superior recognition performance. Given the differing requirements for privacy mitigation and recognition performance across sensing purposes, these trade-offs need to be balanced in actual implementations.
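The paper's exact degradation procedure is described in the full text; as a rough, hedged illustration of the idea of intentionally degrading audio frames, the Python sketch below downsamples and re-quantizes each short frame so that speech becomes largely unintelligible while coarse acoustic energy is preserved. The function name, parameter values, and the zero-order-hold scheme are illustrative assumptions, not the authors' method.

```python
# Illustrative sketch only -- NOT the method from the paper. Assumes a mono
# float signal in [-1, 1]. Each short frame is degraded by zero-order-hold
# downsampling plus coarse re-quantization, removing the fine detail that
# makes speech intelligible while keeping coarse spectral energy.
import numpy as np

def degrade_frames(signal, sr=16000, frame_ms=25, keep_ratio=0.25, bits=8):
    """Degrade audio frame by frame; lower keep_ratio/bits = heavier degradation."""
    frame_len = int(sr * frame_ms / 1000)
    k = max(1, round(1.0 / keep_ratio))   # keep every k-th sample per frame
    levels = 2 ** (bits - 1)              # quantization levels per polarity
    out = np.asarray(signal, dtype=np.float32).copy()
    for start in range(0, len(out) - frame_len + 1, frame_len):
        frame = out[start:start + frame_len]
        held = np.repeat(frame[::k], k)[:frame_len]          # zero-order hold
        out[start:start + frame_len] = np.round(held * levels) / levels
    return out

# Example usage (requires the soundfile package; "clip.wav" is a placeholder):
# import soundfile as sf
# x, sr = sf.read("clip.wav")
# sf.write("clip_degraded.wav", degrade_frames(x, sr), sr)
```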

Supplementary Material

a32-liang-supplement (a32-liang-supplement.zip)
Sample audio clips with and without degradation; the degradation levels are indicated in the filenames. The clips have the same quality as those presented to the Mechanical Turk participants in the online survey.




Published In

MobileHCI '20: 22nd International Conference on Human-Computer Interaction with Mobile Devices and Services
October 2020, 418 pages
ISBN: 9781450375160
DOI: 10.1145/3379503
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery, New York, NY, United States


Author Tags

  1. Activity Recognition
  2. Audio Processing
  3. Mobile Sensing
  4. Privacy

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

MobileHCI '20

Acceptance Rates

Overall Acceptance Rate 202 of 906 submissions, 22%



Cited By

  • (2024) Audio- and Video-Based Human Activity Recognition Systems in Healthcare. IEEE Access, 12, 8230–8245. DOI: 10.1109/ACCESS.2024.3353138
  • (2023) A Dataset for Foreground Speech Analysis With Smartwatches In Everyday Home Environments. 2023 IEEE International Conference on Acoustics, Speech, and Signal Processing Workshops (ICASSPW), 1–5. DOI: 10.1109/ICASSPW59220.2023.10192949
  • (2023) AI-Based Acoustic Monitoring: Challenges and Approaches for Data-Driven Innovations Based on Audiovisual Analysis (original title: "KI-basiertes akustisches Monitoring: Herausforderungen und Lösungsansätze für datengetriebene Innovationen auf Basis audiovisueller Analyse"). In Entrepreneurship der Zukunft, 85–115. DOI: 10.1007/978-3-658-42060-4_4
  • (2022) SAMoSA. Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies, 6(3), 1–19. DOI: 10.1145/3550284
  • (2022) Deceiving Audio Design in Augmented Environments: A Systematic Review of Audio Effects in Augmented Reality. 2022 IEEE International Symposium on Mixed and Augmented Reality Adjunct (ISMAR-Adjunct), 36–43. DOI: 10.1109/ISMAR-Adjunct57072.2022.00018
  • (2022) Source Domain Selection for Cross-House Human Activity Recognition with Ambient Sensors. 2022 21st IEEE International Conference on Machine Learning and Applications (ICMLA), 754–759. DOI: 10.1109/ICMLA55696.2022.00126
  • (2021) Theophany. Proceedings of the 29th ACM International Conference on Multimedia, 2056–2064. DOI: 10.1145/3474085.3475507
