research-article

Countering Acoustic Adversarial Attacks in Microphone-equipped Smart Home Devices

Authors:

Sourav Bhattacharya,

Dionysis Manousakas,

Alberto Gil C. P. Ramos,

Stylianos I. Venieris,

Nicholas D. Lane,

Cecilia MascoloAuthors Info & Claims

Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies, Volume 4, Issue 2

Article No.: 73, Pages 1 - 24

https://doi.org/10.1145/3397332

Published: 15 June 2020 Publication History

Get Access

Abstract

Deep neural networks (DNNs) continue to demonstrate superior generalization performance in an increasing range of applications, including speech recognition and image understanding. Recent innovations in compression algorithms, design of efficient architectures and hardware accelerators have prompted a rapid growth in deploying DNNs on mobile and IoT devices to redefine user experiences. Relying on the superior inference quality of DNNs, various voice-enabled devices have started to pervade our everyday lives and are increasingly used for, e.g., opening and closing doors, starting or stopping washing machines, ordering products online, and authenticating monetary transactions. As the popularity of these voice-enabled services increases, so does their risk of being attacked. Recently, DNNs have been shown to be extremely brittle under adversarial attacks and people with malicious intentions can potentially exploit this vulnerability to compromise DNN-based voice-enabled systems. Although some existing work already highlights the vulnerability of audio models, very little is known of the behaviour of compressed on-device audio models under adversarial attacks. This paper bridges this gap by investigating thoroughly the vulnerabilities of compressed audio DNNs and makes a stride towards making compressed models robust. In particular, we propose a stochastic compression technique that generates compressed models with greater robustness to adversarial attacks. We present an extensive set of evaluations on adversarial vulnerability and robustness of DNNs in two diverse audio recognition tasks, while considering two popular attack algorithms: FGSM and PGD. We found that error rates of conventionally trained audio DNNs under attack can be as high as 100%. Under both white- and black-box attacks, our proposed approach is found to decrease the error rate of DNNs under attack by a large margin.

References

[1]

Y. LeCun, Y. Bengio, and G. Hinton, "Deep learning," Nature, vol. 521, no. 7553, p. 436, 2015.

Abstract

References

Cited By

Index Terms

Recommendations

AdvPulse: Universal, Synchronization-free, and Targeted Audio Adversarial Attacks via Subsecond Perturbations

Countering DDoS and XDoS Attacks against Web Services

IUAC: Inaudible Universal Adversarial Attacks Against Smart Speakers

Comments

Information

Published In

Publisher

Publication History

Permissions

Check for updates

Author Tags

Qualifiers

Contributors

Other Metrics

Bibliometrics

Article Metrics

Other Metrics

Citations

Cited By

Get Access

Login options

Full Access

View options

PDF

eReader

Figures

Other

Share

Share this Publication link

Share on social media

Affiliations