Alleviating Adversarial Attacks on Variational Autoencoders with MCMC

Kuzina, Anna; Welling, Max; Tomczak, Jakub M.

arXiv:2203.09940 (cs)

[Submitted on 18 Mar 2022 (v1), last revised 12 Oct 2022 (this version, v2)]

Abstract: Variational autoencoders (VAEs) are latent variable models that can generate complex objects and provide meaningful latent representations, which can in turn be used in downstream tasks such as classification. As previous work has shown, VAEs are easy to fool: a slight visual modification of the input can produce unexpected latent representations and reconstructions. Here, we examine several previously proposed objective functions for constructing adversarial attacks and present a method that alleviates the effect of these attacks. Our method applies Markov Chain Monte Carlo (MCMC) in the inference step, a choice we motivate with a theoretical analysis. Because the method acts only at inference time, it adds no extra cost during training and does not decrease performance on non-attacked inputs. We validate our approach on a variety of datasets (MNIST, Fashion MNIST, Color MNIST, CelebA) and VAE configurations ($\beta$-VAE, NVAE, $\beta$-TCVAE), and show that it consistently improves model robustness to adversarial attacks.
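
To make the idea of using MCMC in the inference step concrete, here is a minimal sketch under stated assumptions, not the authors' implementation. The abstract does not name a specific sampler, so a plain random-walk Metropolis-Hastings chain is used as a stand-in. The decoder is a toy linear-Gaussian model, and every name here (`log_joint`, `mcmc_refine`, `W`, `b`, `sigma`, `z_attacked`) is hypothetical. The chain starts from a latent code that an attack has pushed away from the data, and it targets the posterior p(z | x), which is proportional to p(x | z) p(z).

```python
import numpy as np


def log_joint(z, x, W, b, sigma):
    """Unnormalized log posterior log p(x | z) + log p(z).

    Assumes a toy decoder p(x | z) = N(W z + b, sigma^2 I) and a
    standard normal prior p(z) = N(0, I).
    """
    resid = x - (W @ z + b)
    return -0.5 * resid @ resid / sigma**2 - 0.5 * z @ z


def mcmc_refine(z0, x, W, b, sigma, n_steps=2000, step_size=0.1, seed=0):
    """Random-walk Metropolis-Hastings chain started at the latent z0."""
    rng = np.random.default_rng(seed)
    z = np.asarray(z0, dtype=float).copy()
    logp = log_joint(z, x, W, b, sigma)
    samples = np.empty((n_steps, z.size))
    for t in range(n_steps):
        proposal = z + step_size * rng.standard_normal(z.size)
        logp_prop = log_joint(proposal, x, W, b, sigma)
        # Accept with probability min(1, p(proposal) / p(z)).
        if np.log(rng.uniform()) < logp_prop - logp:
            z, logp = proposal, logp_prop
        samples[t] = z
    return samples


# Toy data: an 8-dimensional observation generated from a 2-dimensional latent.
rng = np.random.default_rng(1)
W = rng.standard_normal((8, 2))
b = np.zeros(8)
sigma = 0.5
z_true = np.array([1.0, -1.0])
x = W @ z_true + b + sigma * rng.standard_normal(8)

# Hypothetical latent code returned by an attacked encoder.
z_attacked = np.array([-3.0, 3.0])

samples = mcmc_refine(z_attacked, x, W, b, sigma)
z_refined = samples[-500:].mean(axis=0)  # average over the end of the chain

# For this linear-Gaussian toy the exact posterior mean is available,
# which gives a reference point for checking the chain.
post_cov = np.linalg.inv(np.eye(2) + W.T @ W / sigma**2)
post_mean = post_cov @ W.T @ (x - b) / sigma**2

print("distance before:", np.linalg.norm(z_attacked - post_mean))
print("distance after: ", np.linalg.norm(z_refined - post_mean))
```

In this toy setting the refined latent should land much closer to the exact posterior mean than the attacked starting point. Note that the sketch touches only inference: the model parameters are never updated, which matches the abstract's claim that training incurs no extra cost.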

Subjects:	Machine Learning (cs.LG)
Cite as:	arXiv:2203.09940 [cs.LG]
	(or arXiv:2203.09940v2 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2203.09940

Submission history

From: Anna Kuzina
[v1] Fri, 18 Mar 2022 13:25:18 UTC (5,212 KB)
[v2] Wed, 12 Oct 2022 15:42:45 UTC (6,277 KB)
