Research article · Just Accepted
IUAC: Inaudible Universal Adversarial Attacks Against Smart Speakers

Online AM: 30 September 2024

Abstract

Intelligent voice systems are widely used to control smart home applications, which raises significant privacy and security concerns. Recent studies have revealed their vulnerability to adversarial attacks, replay attacks, and similar threats; however, these attacks rely on the victim's voice data. In this work, we investigate a stealthy, command-independent attack that does not require collecting the victim's voice. Our proposed attack, IUAC, misleads the voice system into acting against the victim's will, regardless of the commands the victim delivers. Our core idea is to train highly robust attack commands on deliberately diversified data, rendering the user's own commands negligible. To keep the attack stealthy, we leverage a high-frequency carrier to construct an inaudible universal adversarial command. Extensive experiments on real-world datasets demonstrate that our attack system attains an average attack success rate of 96% while resisting environmental interference. Moreover, our attack success rate against real-world voice systems is 4.52× higher than the state of the art. Finally, we propose an effective defense mechanism and provide experimental tests to validate its efficacy.
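The high-frequency-carrier construction mentioned in the abstract can be sketched as follows. This is a minimal illustration of the general inaudible-attack principle (amplitude-modulating a baseband command onto an ultrasonic carrier, which a microphone's nonlinear front end demodulates back into the audible band), not the paper's implementation; the sample rate, carrier frequency, modulation index, and function name are all assumptions for the example.

```python
import numpy as np

def modulate_inaudible(command, fs=192_000, fc=25_000, m=1.0):
    """Amplitude-modulate a baseband command signal onto an ultrasonic
    carrier at fc Hz.  Humans cannot hear frequencies above ~20 kHz,
    but nonlinearity in a microphone's amplifier demodulates the
    envelope back into the audible band.  m is the modulation index."""
    t = np.arange(len(command)) / fs
    carrier = np.cos(2 * np.pi * fc * t)
    # Standard AM: the carrier plus sidebands shifted to fc +/- f_baseband.
    return (1 + m * command) * carrier

# Toy baseband "command": a 400 Hz tone, one second at 192 kHz.
fs = 192_000
t = np.arange(fs) / fs
baseband = 0.5 * np.sin(2 * np.pi * 400 * t)
signal = modulate_inaudible(baseband, fs=fs, fc=25_000)
print(signal.shape)  # (192000,)
```

After modulation, all of the signal's energy sits near 25 kHz (the carrier plus sidebands at 25 kHz ± 400 Hz), so the transmitted waveform is inaudible even though the demodulated envelope carries the original command.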


Published In

ACM Transactions on Sensor Networks (Just Accepted)
EISSN: 1550-4867
Publisher: Association for Computing Machinery, New York, NY, United States

            Publication History

            Accepted: 24 September 2024
            Revised: 15 July 2024
            Received: 22 December 2023

            Author Tags

1. inaudible
2. universal adversarial
3. targeted attack
4. speech recognition
