research-article

Accuth: Anti-Spoofing Voice Authentication via Accelerometer

Authors:

Xiang-Yang Li

SenSys '22: Proceedings of the 20th ACM Conference on Embedded Networked Sensor Systems

Pages 637 - 650

https://doi.org/10.1145/3560905.3568522

Published: 24 January 2023

Abstract

Most existing voice-based user authentication systems rely mainly on microphones to capture an individual's unique vocal characteristics, which leaves them vulnerable to various acoustic attacks and exposes them to high security risks. In this work, we present Accuth, a novel authentication system that uses a low-cost accelerometer to verify the user's identity and resist acoustic spoofing attacks. Accuth captures the unique sound vibrations produced during human pronunciation and extracts multi-level features from them to verify the user's identity. Specifically, we analyze and model the differences between the physical sound fields of humans and loudspeakers, and extract a novel sound-field-level liveness feature to defend against spoofing attacks. Accuth is an effective complement to existing authentication approaches because it relies only on a ubiquitous, low-cost, small accelerometer. In real-world experiments, Accuth achieves over 90% identification accuracy among 15 human participants and an average equal error rate (EER) of 3.02% for spoofing attack detection.
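
The equal error rate quoted in the abstract is the operating point at which the false acceptance rate (FAR) and the false rejection rate (FRR) coincide. As a minimal sketch of how such a figure is typically computed from detector scores, and not the authors' evaluation code, the Python below scans candidate thresholds and reports the rate where the two errors are closest. The function names and the toy scores are assumptions made purely for illustration.

```python
from typing import List, Tuple


def error_rates(genuine: List[float], spoof: List[float],
                threshold: float) -> Tuple[float, float]:
    """Return (FAR, FRR) when scores >= threshold are accepted as genuine."""
    far = sum(s >= threshold for s in spoof) / len(spoof)      # spoofed samples accepted
    frr = sum(g < threshold for g in genuine) / len(genuine)   # genuine samples rejected
    return far, frr


def equal_error_rate(genuine: List[float], spoof: List[float]) -> float:
    """Approximate the EER by trying every observed score as a threshold.

    The threshold with the smallest |FAR - FRR| is selected, and the mean
    of the two rates at that threshold is returned.
    """
    best_gap, best_eer = float("inf"), 1.0
    for t in sorted(set(genuine) | set(spoof)):
        far, frr = error_rates(genuine, spoof, t)
        gap = abs(far - frr)
        if gap < best_gap:
            best_gap, best_eer = gap, (far + frr) / 2
    return best_eer


# Toy scores invented for illustration only; they are not data from the paper.
genuine_scores = [0.9, 0.8, 0.75, 0.6, 0.4]
spoof_scores = [0.7, 0.5, 0.3, 0.2, 0.1]
eer = equal_error_rate(genuine_scores, spoof_scores)
print(f"EER on toy scores: {eer:.2%}")
```

A lower EER means the genuine and spoofed score distributions overlap less. With real data, a finer threshold grid or interpolation between neighbouring thresholds gives a smoother estimate than this exhaustive scan.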

Cited By

Han F, Yang P, Du H, Li X. (2024). Accuth+: Accelerometer-Based Anti-Spoofing Voice Authentication on Wrist-Worn Wearables. IEEE Transactions on Mobile Computing 23:5 (5571-5588). Online publication date: May 2024.
https://doi.org/10.1109/TMC.2023.3314837

Index Terms

Accuth: Anti-Spoofing Voice Authentication via Accelerometer
1. Human-centered computing
  1. Ubiquitous and mobile computing
2. Security and privacy
  1. Security services
    1. Authentication

Published In

SenSys '22: Proceedings of the 20th ACM Conference on Embedded Networked Sensor Systems

November 2022

1280 pages

ISBN:9781450398862

DOI:10.1145/3560905

General Chairs: Jeremy Gummeson (University of Massachusetts Amherst), Sunghoon Ivan Lee (University of Massachusetts Amherst)

Program Chairs: Jie Gao (Rutgers University), Guoliang Xing (The Chinese University of Hong Kong)

Copyright © 2022 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery

New York, NY, United States

Funding Sources

The University Synergy Innovation Program of Anhui Province
China National Natural Science Foundation
Key Research Program of Frontier Sciences, CAS

Conference

SenSys '22: The 20th ACM Conference on Embedded Networked Sensor Systems

November 6 - 9, 2022

Boston, Massachusetts

Acceptance Rates

SenSys '22 Paper Acceptance Rate 52 of 187 submissions, 28%;

Overall Acceptance Rate 174 of 867 submissions, 20%
