Robust adversarial learning via sparsifying front ends

S. Gopalakrishnan, Z. Marzi, M. Cekic, U. Madhow, et al. - arXiv preprint arXiv:1810.10625, 2018 - arxiv.org
It is by now well known that small adversarial perturbations can induce classification errors in deep neural networks. In this paper, we take a bottom-up signal processing perspective on this problem and show that systematic exploitation of sparsity in natural data is a promising tool for defense. For linear classifiers, we show that a sparsifying front end is provably effective against $\ell_{\infty}$-bounded attacks, reducing output distortion due to the attack by a factor of roughly $K/N$, where $N$ is the data dimension and $K$ is the sparsity level. We then extend this concept to deep networks, showing that a "locally linear" model can be used to develop a theoretical foundation for crafting attacks and defenses. We also devise attacks based on the locally linear model that outperform the well-known FGSM attack. We supplement our theoretical results with experiments on the MNIST and CIFAR-10 datasets, showing the efficacy of the proposed sparsity-based defense schemes.
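To make the linear-classifier claim concrete, below is a minimal illustrative sketch (not the authors' code): a top-$K$ sparsifying front end in an orthonormal basis placed before a linear classifier with weights $w$. Against an $\ell_{\infty}$-bounded perturbation $e$ with $\|e\|_{\infty} \le \epsilon$, the undefended worst-case output distortion is $|w^{\top} e| \le \epsilon \|w\|_1$, attained by $e = \epsilon\,\mathrm{sign}(w)$; the front end attenuates this, with the paper's analysis predicting a reduction of roughly $K/N$ for suitable bases. The basis, signal, and classifier used here are synthetic placeholders chosen only for illustration.

```python
# Illustrative sketch (assumed setup, not the authors' implementation):
# a top-K sparsifying front end before a linear classifier, and the
# output distortion caused by a worst-case l_inf-bounded perturbation.
import numpy as np

rng = np.random.default_rng(0)
N, K, eps = 256, 16, 0.1          # data dimension, sparsity level, attack budget

# Orthonormal sparsifying basis (columns); a random one stands in for a
# wavelet/DCT-style basis here.
Psi, _ = np.linalg.qr(rng.standard_normal((N, N)))

def sparsify(x, Psi, K):
    """Keep only the K largest-magnitude coefficients of x in the basis Psi."""
    a = Psi.T @ x                          # analysis coefficients
    idx = np.argsort(np.abs(a))[-K:]       # support of the K largest coefficients
    a_K = np.zeros_like(a)
    a_K[idx] = a[idx]
    return Psi @ a_K                       # synthesize back to the signal domain

# A K-sparse "natural" signal and a linear classifier w (synthetic placeholders).
a_true = np.zeros(N)
a_true[rng.choice(N, K, replace=False)] = rng.standard_normal(K)
x = Psi @ a_true
w = rng.standard_normal(N)

# Worst-case l_inf attack on the raw linear classifier: e = eps * sign(w),
# which yields output distortion eps * ||w||_1.
e = eps * np.sign(w)
raw_distortion = abs(w @ e)

# With the front end, only the part of the perturbation that survives the
# data-dependent K-term projection reaches the classifier.
front_end_distortion = abs(w @ (sparsify(x + e, Psi, K) - sparsify(x, Psi, K)))

print(f"distortion without front end: {raw_distortion:.3f}")
print(f"distortion with front end:    {front_end_distortion:.3f}")
```

This demo only attenuates the attack crafted against the undefended classifier; the paper's guarantee concerns attacks aware of the front end and bases in which natural data are genuinely sparse.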