Countering Acoustic Adversarial Attacks in Microphone-equipped Smart Home Devices

Published: 15 June 2020

Abstract

Deep neural networks (DNNs) continue to demonstrate superior generalization performance in an increasing range of applications, including speech recognition and image understanding. Recent innovations in compression algorithms, efficient architecture design, and hardware accelerators have prompted rapid growth in the deployment of DNNs on mobile and IoT devices to redefine user experiences. Relying on the superior inference quality of DNNs, voice-enabled devices have started to pervade our everyday lives and are increasingly used for tasks such as opening and closing doors, starting or stopping washing machines, ordering products online, and authenticating monetary transactions. As the popularity of these voice-enabled services increases, so does their risk of being attacked. Recently, DNNs have been shown to be extremely brittle under adversarial attacks, and people with malicious intentions can potentially exploit this vulnerability to compromise DNN-based voice-enabled systems. Although some existing work already highlights the vulnerability of audio models, very little is known about the behaviour of compressed on-device audio models under adversarial attacks. This paper bridges this gap by thoroughly investigating the vulnerabilities of compressed audio DNNs and takes a step towards making compressed models robust. In particular, we propose a stochastic compression technique that generates compressed models with greater robustness to adversarial attacks. We present an extensive set of evaluations of the adversarial vulnerability and robustness of DNNs in two diverse audio recognition tasks, considering two popular attack algorithms: FGSM and PGD. We find that error rates of conventionally trained audio DNNs under attack can be as high as 100%. Under both white- and black-box attacks, our proposed approach decreases the error rate of DNNs under attack by a large margin.
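The abstract names two attack algorithms, FGSM and PGD, without describing them. As a rough illustration only, the sketch below applies the standard textbook form of both attacks to a toy logistic classifier. It is not the paper's code or experimental setup: the model, the weights `w` and `b`, the input `x`, and the budget `eps` are all hypothetical stand-ins for an audio feature vector and an audio DNN.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def loss_grad_wrt_input(x, y, w, b):
    """Gradient of the binary cross-entropy loss w.r.t. the input x
    for a logistic model p = sigmoid(w.x + b)."""
    p = sigmoid(np.dot(w, x) + b)
    return (p - y) * w

def fgsm(x, y, w, b, eps):
    """FGSM: a single step of size eps along the sign of the loss gradient."""
    return x + eps * np.sign(loss_grad_wrt_input(x, y, w, b))

def pgd(x, y, w, b, eps, step, n_iter):
    """PGD: repeated signed-gradient steps, each followed by projection
    back into the L-infinity ball of radius eps around the clean input."""
    x_adv = x.copy()
    for _ in range(n_iter):
        x_adv = x_adv + step * np.sign(loss_grad_wrt_input(x_adv, y, w, b))
        x_adv = np.clip(x_adv, x - eps, x + eps)
    return x_adv

# Hypothetical toy values: a 3-dimensional "feature vector" with true label 1.
w = np.array([1.0, -2.0, 0.5])
b = 0.0
x = np.array([0.2, -0.1, 0.4])
y = 1.0

x_fgsm = fgsm(x, y, w, b, eps=0.3)
x_pgd = pgd(x, y, w, b, eps=0.3, step=0.1, n_iter=5)

print("clean score:", sigmoid(np.dot(w, x) + b))
print("FGSM score: ", sigmoid(np.dot(w, x_fgsm) + b))
print("PGD score:  ", sigmoid(np.dot(w, x_pgd) + b))
```

In this toy case the clean input is scored above 0.5 and both perturbed inputs fall below it, although no coordinate moves by more than `eps`. For a linear model the gradient sign never changes, so PGD ends at the same point as FGSM. On a deep network the two generally differ, and PGD is usually the stronger attack.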



Information & Contributors

Information

Published In

Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies, Volume 4, Issue 2
June 2020
771 pages
EISSN:2474-9567
DOI:10.1145/3406789
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 15 June 2020
Published in IMWUT Volume 4, Issue 2


Author Tags

  1. Compressed neural networks
  2. audio adversarial attack
  3. robust training

Qualifiers

  • Research-article
  • Research
  • Refereed

Cited By

  • (2024) TXAI-ADV: Trustworthy XAI for Defending AI Models against Adversarial Attacks in Realistic CIoT. Electronics 13:9 (1769). DOI: 10.3390/electronics13091769. Online publication date: 3-May-2024
  • (2024) Keamanan Data Internet of Things dalam Perspektif Pseudosains Mario Bunge. Jurnal Filsafat Indonesia 7:2 (207-216). DOI: 10.23887/jfi.v7i2.72435. Online publication date: 30-Jun-2024
  • (2024) AdverSPAM: Adversarial SPam Account Manipulation in Online Social Networks. ACM Transactions on Privacy and Security 27:2 (1-31). DOI: 10.1145/3643563. Online publication date: 26-Jan-2024
  • (2023) Echo. Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies 7:3 (1-24). DOI: 10.1145/3610874. Online publication date: 27-Sep-2023
  • (2023) VoiceCloak. Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies 7:2 (1-21). DOI: 10.1145/3596266. Online publication date: 12-Jun-2023
  • (2023) HIJACK: Learning-based Strategies for Sound Classification Robustness to Adversarial Noise. 2023 IEEE International Conference on Smart Computing (SMARTCOMP) (338-343). DOI: 10.1109/SMARTCOMP58114.2023.00082. Online publication date: Jun-2023
  • (2022) Safe audio AI services in smart buildings. Proceedings of the 9th ACM International Conference on Systems for Energy-Efficient Buildings, Cities, and Transportation (266-269). DOI: 10.1145/3563357.3564076. Online publication date: 9-Nov-2022
  • (2022) WiAdv. Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies 6:2 (1-25). DOI: 10.1145/3534618. Online publication date: 7-Jul-2022
  • (2022) A Survey on Voice Assistant Security: Attacks and Countermeasures. ACM Computing Surveys 55:4 (1-36). DOI: 10.1145/3527153. Online publication date: 21-Nov-2022
  • (2022) Evaluation of Smart Home Systems and Novel UV-Oriented Solution for Integration, Resilience, Inclusiveness & Sustainability. 2022 6th International Conference on Universal Village (UV) (1-386). DOI: 10.1109/UV56588.2022.10185519. Online publication date: 22-Oct-2022
