DOI: 10.1145/3460120.3484742
research-article
Public Access

"Hello, It's Me": Deep Learning-based Speech Synthesis Attacks in the Real World

Published: 13 November 2021

Abstract

Advances in deep learning have introduced a new wave of voice synthesis tools capable of producing audio that sounds as if spoken by a target speaker. If successful, such tools in the wrong hands will enable a range of powerful attacks against both humans and software systems (aka machines). This paper documents efforts and findings from a comprehensive experimental study of the impact of deep learning-based speech synthesis attacks on both human listeners and machines, such as speaker recognition and voice sign-in systems. We find that both humans and machines can be reliably fooled by synthetic speech, and that existing defenses against synthesized speech fall short. These findings highlight the need to raise awareness and develop new protections against synthetic speech for both humans and machines.



      Published In

      CCS '21: Proceedings of the 2021 ACM SIGSAC Conference on Computer and Communications Security
      November 2021
      3558 pages
      ISBN:9781450384544
      DOI:10.1145/3460120
      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Author Tags

      1. biometric security
      2. neural networks
      3. speech synthesis

      Conference

      CCS '21
      CCS '21: 2021 ACM SIGSAC Conference on Computer and Communications Security
      November 15 - 19, 2021
      Virtual Event, Republic of Korea

      Acceptance Rates

      Overall Acceptance Rate 1,261 of 6,999 submissions, 18%
