DOI: 10.1145/3643832.3661860
research-article

F2Key: Dynamically Converting Your Face into a Private Key Based on COTS Headphones for Reliable Voice Interaction

Published: 04 June 2024

Abstract

In this paper, we propose F2Key, the first earable physical security system based on commercial off-the-shelf (COTS) headphones. F2Key enables impactful applications, such as strengthening voiceprint-based authentication, making voice assistants more reliable, defending against audio deepfakes, and supporting the legal validity of artifacts. The key idea of F2Key is to establish a stable acoustic sensing field across the user's face and to embed the user's facial structures and articulatory habits into a user-specific generative model that serves as a private key. The private key decrypts the Channel Impulse Response (CIR) profiles provided by the acoustic sensing field into an inferred spectrogram. The inferred spectrogram matches the real one, calculated from the corresponding speech, only if the user's CIR-spectrogram mapping relationship is consistent with the one embedded in the generative model. Extensive experiments demonstrate that F2Key resists 99.9%, 96.4%, and 95.3% of speech replay attacks, mimicry attacks, and hybrid attacks, respectively. We discuss and evaluate F2Key from several perspectives, including health considerations and an identical-twins study, to demonstrate its practicality and reliability.
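
The matching step described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the user-specific generative model is not reproduced here, so the inferred spectrogram is simply taken as an input. The cosine-similarity metric, the 0.9 threshold, the frame sizes, and all function names are hypothetical choices made only for this example.

```python
import cmath
import math


def spectrogram(signal, frame_len=64, hop=32):
    """Log-magnitude spectrogram from a naive DFT over Hann-windowed frames."""
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / frame_len)
              for n in range(frame_len)]
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = [signal[start + n] * window[n] for n in range(frame_len)]
        bins = []
        for k in range(frame_len // 2 + 1):
            acc = sum(frame[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                      for n in range(frame_len))
            bins.append(math.log1p(abs(acc)))
        frames.append(bins)
    return frames


def cosine_similarity(spec_a, spec_b):
    """Cosine similarity between two spectrograms of identical shape."""
    flat_a = [v for row in spec_a for v in row]
    flat_b = [v for row in spec_b for v in row]
    dot = sum(x * y for x, y in zip(flat_a, flat_b))
    norm_a = math.sqrt(sum(x * x for x in flat_a))
    norm_b = math.sqrt(sum(y * y for y in flat_b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def accept(inferred_spec, real_spec, threshold=0.9):
    """Accept only if the inferred spectrogram matches the real one.

    The threshold is an assumed value, not one reported in the paper.
    """
    return cosine_similarity(inferred_spec, real_spec) >= threshold


def tone(freq_hz, amplitude=1.0, sample_rate=8000, n_samples=512):
    """Synthetic stand-in for recorded speech."""
    return [amplitude * math.sin(2 * math.pi * freq_hz * n / sample_rate)
            for n in range(n_samples)]


# Real spectrogram, computed from the (synthetic) speech signal.
real = spectrogram(tone(440.0))
# Stand-in for the model output when the CIR-spectrogram mapping is consistent.
genuine = spectrogram(tone(440.0, amplitude=0.9))
# Stand-in for the model output when the mapping does not fit the speech.
attack = spectrogram(tone(2000.0))

print("genuine accepted:", accept(genuine, real))
print("attack accepted:", accept(attack, real))
```

In a real system the `genuine` and `attack` inputs would come from the generative model applied to measured CIR profiles; the synthetic tones here only exercise the comparison logic.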

Cited By

  • (2024) SonicID: User Identification on Smart Glasses with Acoustic Sensing. Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies 8:4 (1-27). DOI: 10.1145/3699734. Online publication date: 21-Nov-2024
  • (2024) Medusa3D: The Watchful Eye Freezing Illegitimate Users in Virtual Reality Interactions. Proceedings of the ACM on Human-Computer Interaction 8:MHCI (1-21). DOI: 10.1145/3676515. Online publication date: 24-Sep-2024

    Published In

    MOBISYS '24: Proceedings of the 22nd Annual International Conference on Mobile Systems, Applications and Services
    June 2024
    778 pages
    ISBN:9798400705816
    DOI:10.1145/3643832
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. acoustic sensing
    2. deepfake detection
    3. earable sensing
    4. physical security system

    Qualifiers

    • Research-article

    Acceptance Rates

    Overall Acceptance Rate 274 of 1,679 submissions, 16%

    Bibliometrics & Citations

    Bibliometrics

    Article Metrics

    • Downloads (Last 12 months): 378
    • Downloads (Last 6 weeks): 31
    Reflects downloads up to 10 Feb 2025
