
Implementation of the Boundary Attack algorithm as described in Brendel, Wieland, Jonas Rauber, and Matthias Bethge. "Decision-Based Adversarial Attacks: Reliable Attacks Against Black-Box Machine Learning Models." arXiv preprint arXiv:1712.04248 (2017).


greentfrapp/boundary-attack


boundary-attack

Implementation of the Boundary Attack algorithm as described in:

Brendel, Wieland, Jonas Rauber, and Matthias Bethge. "Decision-Based Adversarial Attacks: Reliable Attacks Against Black-Box Machine Learning Models." arXiv preprint arXiv:1712.04248 (2017).

The algorithm is also implemented in Foolbox as part of a toolkit of adversarial techniques.

Grumpy Cat's secret identity is Doge

[Images: doge_1, doge_2, doge_3, doge_4, doge_5, doge_6]

All of the above images are classified as 273: 'dingo, warrigal, warragal, Canis dingo' by the Keras ResNet-50 model pretrained on ImageNet.

Instructions

For starters, just run:

$ python boundary-attack-resnet.py

This creates adversarial images from the Bad Joke Eel and Awkward Moment Seal images to attack the Keras ResNet-50 model (pretrained on ImageNet). You can also switch to other images in the images/original folder or add your own. All input images are resized to 224 x 224 x 3 arrays.

On a 1080 Ti GPU, the script takes about 10 minutes to create a decent adversarial image (similar to the second-to-last image in the series above).

Primer to "Orthogonal Perturbation" / "Projecting onto Sphere"

Here is a brief explanation of the orthogonal perturbation step and why we "draw a new random direction by drawing from an iid Gaussian and projecting on a sphere" (see Figure 2 in Brendel et al.).
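The orthogonal perturbation step can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the code from this repository: the function name `orthogonal_perturbation`, the `delta` step-size parameter, and the use of flat lists instead of image arrays are all hypothetical. The sketch draws an iid Gaussian direction, rescales it relative to the current distance, and then rescales the result so that the new point lies on the same sphere around the original image. That rescaling is one simple way to do the projection.

```python
import math
import random


def norm(v):
    """Euclidean (L2) norm of a flat vector."""
    return math.sqrt(sum(x * x for x in v))


def orthogonal_perturbation(original, adversarial, delta, rng=random):
    """Return a point near `adversarial` at the same distance from `original`.

    Hypothetical helper for illustration; names and parameters are assumptions.
    """
    # Vector from the original image to the current adversarial image.
    diff = [a - o for a, o in zip(adversarial, original)]
    dist = norm(diff)

    # 1. Draw a random direction from an iid Gaussian.
    eta = [rng.gauss(0.0, 1.0) for _ in diff]

    # 2. Rescale it so its length is `delta` times the current distance.
    scale = delta * dist / norm(eta)
    eta = [scale * e for e in eta]

    # 3. Take the step, then project back onto the sphere of radius
    #    `dist` centred on the original image.
    moved = [d + e for d, e in zip(diff, eta)]
    factor = dist / norm(moved)
    return [o + factor * m for o, m in zip(original, moved)]


# Toy usage on an 8-dimensional "image".
original = [0.0] * 8
adversarial = [1.0] * 8
candidate = orthogonal_perturbation(original, adversarial, delta=0.1)
```

Because the candidate stays on the sphere, this step explores along the decision boundary without changing the distance to the original image. Any reduction in distance then has to come from a separate step towards the original, which is kept only if the model still misclassifies the result.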

MNIST Demo

[Image: Demo Screenshot]

There is also a GUI demo (requires Python 3) for MNIST images. It uses a local convolutional model that follows the architecture described here and achieved 98% accuracy on the MNIST test set.

To run the demo, run the following commands:

$ cd demo
$ open index.html
$ python3 -m server

Wait for the following message to appear:

* Running on http://0.0.0.0:8080/ (Press CTRL+C to quit)

Then enter the labels for the attack and target and press Enter. Wait for the MNIST images to load, then click the Attack! button.

This is much faster than the ResNet-50 script above and takes far less than a minute to generate an adequate adversarial image on a MacBook Pro.
