Black-box adversarial attacks on video recognition models
Proceedings of the 27th ACM International Conference on Multimedia, 2019 · dl.acm.org
Deep neural networks (DNNs) are known for their vulnerability to adversarial examples. These are examples that have undergone small, carefully crafted perturbations, and which can easily fool a DNN into making misclassifications at test time. Thus far, the field of adversarial research has mainly focused on image models, under either a white-box setting, where an adversary has full access to model parameters, or a black-box setting, where an adversary can only query the target model for probabilities or labels. Whilst several white-box attacks have been proposed for video models, black-box video attacks are still unexplored. To close this gap, we propose the first black-box video attack framework, called V-BAD. V-BAD utilizes tentative perturbations transferred from image models and partition-based rectifications found by NES (Natural Evolution Strategies) to obtain good adversarial gradient estimates with fewer queries to the target model. V-BAD is equivalent to estimating the projection of the adversarial gradient on a selected subspace. Using three benchmark video datasets, we demonstrate that V-BAD can craft both untargeted and targeted attacks to fool two state-of-the-art deep video recognition models. For the targeted attack, it achieves a 93% success rate using, on average, a number of queries similar to that of state-of-the-art black-box image attacks. This is despite the fact that videos often have two orders of magnitude higher dimensionality than static images. We believe that V-BAD is a promising new tool to evaluate and improve the robustness of video recognition models to black-box adversarial attacks.
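The core idea described in the abstract (a tentative perturbation transferred from an image model, split into partitions whose rectification weights are estimated with NES, which amounts to estimating the gradient's projection onto a chosen subspace) can be illustrated with a minimal sketch. This is not the authors' implementation; the names `loss_fn`, `tentative`, `partitions`, `sigma`, and `num_samples`, and the use of boolean masks for partitions, are assumptions made for illustration only.

```python
import numpy as np

def nes_rectified_gradient(video, loss_fn, tentative, partitions,
                           sigma=1e-3, num_samples=24):
    """Illustrative sketch: estimate an adversarial gradient for `video` by
    rectifying the per-partition weights of a tentative perturbation with NES.

    loss_fn(x)  -- black-box query returning the attack loss for input x (assumed)
    tentative   -- tentative perturbation, same shape as video, e.g. transferred
                   from an image model (assumed to be precomputed)
    partitions  -- list of boolean masks over the video tensor, one per partition
    """
    k = len(partitions)
    grad_w = np.zeros(k)                      # NES gradient w.r.t. partition weights
    for _ in range(num_samples // 2):         # antithetic sampling
        u = np.random.randn(k)
        for sign in (+1.0, -1.0):
            w = sign * sigma * u
            # Rectified perturbation: scale each partition of the tentative
            # direction by its sampled weight.
            delta = sum(w[i] * (tentative * m) for i, m in enumerate(partitions))
            grad_w += sign * loss_fn(video + delta) * u
    grad_w /= (num_samples * sigma)
    # Project the weight-space gradient back to pixel space: this is the
    # estimated projection of the adversarial gradient onto the subspace
    # spanned by the partitioned tentative directions.
    return sum(grad_w[i] * (tentative * m) for i, m in enumerate(partitions))
```

In such a sketch, the returned direction would typically drive a projected sign-gradient update of the adversarial video (clipped to an L-infinity ball around the original), with only the queries inside `loss_fn` touching the black-box model.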