DOI: 10.1145/3372297.3417254
research-article

When the Differences in Frequency Domain are Compensated: Understanding and Defeating Modulated Replay Attacks on Automatic Speech Recognition

Published: 02 November 2020

Abstract

Automatic speech recognition (ASR) systems have been widely deployed in modern smart devices to provide convenient and diverse voice-controlled services. Since ASR systems are vulnerable to audio replay attacks that can spoof and mislead them, a number of defense systems have been proposed to identify replayed audio signals based on the speakers' unique acoustic features in the frequency domain. In this paper, we uncover a new type of replay attack, called the modulated replay attack, which can bypass existing frequency-domain-based defense systems. The basic idea is to compensate for the frequency distortion of a given electronic speaker using an inverse filter that is customized to the speaker's transform characteristics. Our experiments on real smart devices confirm that modulated replay attacks can successfully escape existing detection mechanisms that rely on identifying suspicious features in the frequency domain. To defeat modulated replay attacks, we design and implement a countermeasure named DualGuard. We discover and formally prove that no matter how the replayed audio signals are modulated, a replay attack will either leave ringing artifacts in the time domain or cause spectrum distortion in the frequency domain. Therefore, by jointly checking suspicious features in both the frequency and time domains, DualGuard can successfully detect various replay attacks, including modulated replay attacks. We implement a prototype of DualGuard on a popular voice interactive platform, ReSpeaker Core v2. The experimental results show that DualGuard achieves 98% accuracy in detecting modulated replay attacks.
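The inverse-filter idea described in the abstract can be illustrated with a minimal NumPy sketch. It is not the paper's implementation: the loudspeaker response `speaker_gain`, the sampling rate, the frame length, and the three-tone test signal are all assumptions chosen for illustration, and the speaker is modeled as an ideal linear, zero-phase gain curve applied per FFT bin.

```python
import numpy as np

def apply_gain(signal, gain):
    """Scale each rfft bin of `signal` by the real-valued `gain` curve."""
    return np.fft.irfft(np.fft.rfft(signal) * gain, n=len(signal))

def spectrum_error(a, b):
    """Largest per-bin difference between two magnitude spectra."""
    return float(np.max(np.abs(np.abs(np.fft.rfft(a)) - np.abs(np.fft.rfft(b)))))

fs = 16000                      # assumed sampling rate in Hz
n = 1024                        # assumed frame length in samples
t = np.arange(n) / fs
freqs = np.fft.rfftfreq(n, d=1 / fs)

# Stand-in for a genuine voice frame: three tones placed on exact FFT bins.
signal = sum(np.sin(2 * np.pi * f * t) for f in (250.0, 1000.0, 3000.0))

# Hypothetical loudspeaker magnitude response: weak bass, mild treble roll-off.
speaker_gain = (freqs / (freqs + 300.0)) / (1.0 + (freqs / 6000.0) ** 2)
speaker_gain = np.maximum(speaker_gain, 1e-3)   # floor keeps the inverse bounded

# Inverse filter customized to the assumed speaker response.
inverse_gain = 1.0 / speaker_gain

# Plain replay: the speaker distorts the spectrum.
plain_replay = apply_gain(signal, speaker_gain)
# Modulated replay: pre-compensate with the inverse filter, then play back.
modulated_replay = apply_gain(apply_gain(signal, inverse_gain), speaker_gain)

plain_error = spectrum_error(signal, plain_replay)
modulated_error = spectrum_error(signal, modulated_replay)
print(f"plain replay spectrum error:     {plain_error:.3f}")
print(f"modulated replay spectrum error: {modulated_error:.3e}")
```

Under these assumptions the pre-compensated signal has almost the same magnitude spectrum as the original, while the plain replay does not. This idealized linear model cancels the distortion exactly, so it does not reproduce the time-domain ringing artifacts that, according to the abstract, real modulated replays leave behind.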

Supplementary Material

MOV File (Copy of CCS2020_fp205_ShuWang - Brian Hollendyke.mov)
Presentation video



      Published In

      CCS '20: Proceedings of the 2020 ACM SIGSAC Conference on Computer and Communications Security
      October 2020
      2180 pages
      ISBN:9781450370899
      DOI:10.1145/3372297
      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States


      Author Tags

      1. automatic speech recognition
      2. frequency distortion
      3. modulated replay attack
      4. ringing artifacts

      Qualifiers

      • Research-article

      Funding Sources

      • U.S. ARO
      • U.S. ONR
      • NSFC

      Conference

      CCS '20
      Acceptance Rates

      Overall Acceptance Rate 1,261 of 6,999 submissions, 18%

      Cited By

      View all
      • (2024)Fast and Lightweight Voice Replay Attack Detection via Time-Frequency Spectrum DifferenceIEEE Internet of Things Journal10.1109/JIOT.2024.340696211:18(29798-29810)Online publication date: 15-Sep-2024
      • (2024)Enrollment-Stage Backdoor Attacks on Speaker Recognition Systems via Adversarial UltrasoundIEEE Internet of Things Journal10.1109/JIOT.2023.332825311:8(13108-13124)Online publication date: 15-Apr-2024
      • (2023)Protecting Your Voice from Speech Synthesis AttacksProceedings of the 39th Annual Computer Security Applications Conference10.1145/3627106.3627183(394-408)Online publication date: 4-Dec-2023
      • (2023)Phantom-CSI Attacks against Wireless Liveness DetectionProceedings of the 26th International Symposium on Research in Attacks, Intrusions and Defenses10.1145/3607199.3607245(440-454)Online publication date: 16-Oct-2023
      • (2023)Breaking Security-Critical Voice Authentication2023 IEEE Symposium on Security and Privacy (SP)10.1109/SP46215.2023.10179374(951-968)Online publication date: May-2023
      • (2023)MagBackdoor: Beware of Your Loudspeaker as A Backdoor For Magnetic Injection Attacks2023 IEEE Symposium on Security and Privacy (SP)10.1109/SP46215.2023.10179364(3416-3431)Online publication date: May-2023
      • (2023)On the Detection of Adaptive Adversarial Attacks in Speaker Verification SystemsIEEE Internet of Things Journal10.1109/JIOT.2023.326761910:18(16271-16283)Online publication date: 15-Sep-2023
      • (2023) LiveProbe : Exploring Continuous Voice Liveness Detection via Phonemic Energy Response Patterns IEEE Internet of Things Journal10.1109/JIOT.2022.322881910:8(7215-7228)Online publication date: 15-Apr-2023
      • (2023)Voice Spoofing Detection Through Residual Network, Max Feature Map, and Depthwise Separable ConvolutionIEEE Access10.1109/ACCESS.2023.327579011(49140-49152)Online publication date: 2023
      • (2023)A Survey of PPG's Application in AuthenticationComputers & Security10.1016/j.cose.2023.103488(103488)Online publication date: Sep-2023
      • Show More Cited By
