research-article

Theophany: Multimodal Speech Augmentation in Instantaneous Privacy Channels

Authors:

Abhishek Kumar,

Pan Hui

MM '21: Proceedings of the 29th ACM International Conference on Multimedia

Pages 2056 - 2064

https://doi.org/10.1145/3474085.3475507

Published: 17 October 2021

Abstract

Many factors affect speech intelligibility in face-to-face conversations. These factors lead conversation participants to speak louder and more distinctly, exposing the content to potential eavesdroppers. To address these issues, we introduce Theophany, a privacy-preserving framework for augmenting speech. Theophany establishes ad-hoc social networks between conversation participants to exchange contextual information, improving speech intelligibility in real time. At the core of Theophany, we develop the first privacy perception model that assesses the privacy risk of a face-to-face conversation based on its topic, location, and participants. The framework can serve as a basis for developing privacy-preserving applications for face-to-face conversation. We implement the framework within a prototype system that augments the speaker's speech with real-life subtitles to overcome the loss of contextual cues brought by mask-wearing and social distancing during the COVID-19 pandemic. We evaluate Theophany through a user survey with 53 participants and a user study with 17 participants. Theophany's privacy predictions match the participants' privacy preferences with an accuracy of 71.26%. Users rated Theophany as useful for protecting their privacy (3.88/5), easy to use (4.71/5), and enjoyable to use (4.24/5). We also raise the question of the role of demographic and individual differences in the design of privacy-preserving solutions.

Supplementary Material

ZIP File (mfp1828aux.zip)

The supplemental material contains further details on


Index Terms

Theophany: Multimodal Speech Augmentation in Instantaneous Privacy Channels
1. Human-centered computing
  1. Human computer interaction (HCI)
    1. Interaction paradigms
      1. Mixed / augmented reality
  2. Ubiquitous and mobile computing
    1. Ubiquitous and mobile computing theory, concepts and paradigms
      1. Ubiquitous computing
    2. Ubiquitous and mobile devices
      1. Smartphones
2. Security and privacy
  1. Human and societal aspects of security and privacy
    1. Privacy protections
    2. Usability in security and privacy


Published In

MM '21: Proceedings of the 29th ACM International Conference on Multimedia

October 2021

5796 pages

ISBN: 9781450386517

DOI: 10.1145/3474085

General Chairs: Heng Tao Shen (University of Electronic Science and Technology of China, China), Yueting Zhuang (Zhejiang University, China), John R. Smith (IBM, USA)

Program Chairs: Yang Yang (University of Electronic Science and Technology of China, China), Pablo Cesar (CWI & TU Delft, The Netherlands), Florian Metze (Facebook, Inc., USA), Balakrishnan Prabhakaran (University of Texas at Dallas, USA)

Copyright © 2021 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Sponsors

SIGMM: ACM Special Interest Group on Multimedia

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Funding Sources

Academy of Finland

Conference

MM '21

Sponsor:

SIGMM

MM '21: ACM Multimedia Conference

October 20 - 24, 2021

Virtual Event, China

Acceptance Rates

Overall Acceptance Rate 2,145 of 8,556 submissions, 25%

Bibliometrics

Total Citations: 8
Total Downloads: 276
Downloads (Last 12 months): 35
Downloads (Last 6 weeks): 4

Reflects downloads up to 01 Nov 2024

Citations

Cited By

Kavitha K and Joshith V. 2024. Factors Shaping the Adoption of AI Tools among Gen Z: An Extended UTAUT2 Model Investigation Using CB-SEM. Bulletin of Science, Technology & Society 44, 1-2 (2024), 12--32. Online publication date: 18-Sep-2024.
https://doi.org/10.1177/02704676241283362

Lee L, Hosio S, Braud T, and Zhou P. 2024. A Roadmap Toward Metaversity: Recent Developments and Perspectives in Education. In Application of the Metaverse in Education. 73--95. Online publication date: 8-May-2024.
https://doi.org/10.1007/978-981-97-1298-4_5

Wang D, Zhao T, Yu W, Chawla N, and Jiang M. 2023. Deep Multimodal Complementarity Learning. IEEE Transactions on Neural Networks and Learning Systems 34, 12 (2023), 10213--10224. Online publication date: Dec-2023.
https://doi.org/10.1109/TNNLS.2022.3165180

Yang S, He Y, and Chen Y. 2023. Spatialgaze: towards spatial gaze tracking for extended reality. CCF Transactions on Pervasive Computing and Interaction 5, 4 (2023), 430--446. Online publication date: 16-Oct-2023.
https://doi.org/10.1007/s42486-023-00139-4

Lee L, Chatzopoulos D, Zhou P, and Braud T. 2023. Metaverse: An Introduction. In Metaverse Communication and Computing Networks. 1--16. Online publication date: 6-Oct-2023.
https://doi.org/10.1002/9781394160013.ch1

Kumar A, Lee L, Chauhan J, Su X, Hoque M, Pirttikangas S, Tarkoma S, and Hui P. 2022. PassWalk: Spatial Authentication Leveraging Lateral Shift and Gaze on Mobile Headsets. In Proceedings of the 30th ACM International Conference on Multimedia, Magalhães J, del Bimbo A, Satoh S, Sebe N, Alameda-Pineda X, Jin Q, Oria V, and Toni L (Eds.). 952--960. Online publication date: 10-Oct-2022.
https://dl.acm.org/doi/10.1145/3503161.3548252

Wang Y, Lee L, Braud T, and Hui P. 2022. Re-shaping Post-COVID-19 Teaching and Learning: A Blueprint of Virtual-Physical Blended Classrooms in the Metaverse Era. In 2022 IEEE 42nd International Conference on Distributed Computing Systems Workshops (ICDCSW). 241--247. Online publication date: Jul-2022.
https://doi.org/10.1109/ICDCSW56584.2022.00053

Bermejo Fernandez C, Lee L, Nurmi P, and Hui P. 2021. PARA: Privacy Management and Control in Emerging IoT Ecosystems using Augmented Reality. In Proceedings of the 2021 International Conference on Multimodal Interaction. 478--486. Online publication date: 18-Oct-2021.
https://dl.acm.org/doi/10.1145/3462244.3479885
