Good Artists Copy, Great Artists Steal: Model Extraction Attacks Against Image Translation Models

Szyller, Sebastian; Duddu, Vasisht; Gröndahl, Tommi; Asokan, N.

arXiv:2104.12623 (cs)

[Submitted on 26 Apr 2021 (v1), last revised 28 Feb 2023 (this version, v2)]

Abstract: Machine learning models are typically made available to potential client users via inference APIs. Model extraction attacks occur when a malicious client uses information gleaned from queries to the inference API of a victim model $F_V$ to build a surrogate model $F_A$ with comparable functionality. Recent research has shown successful model extraction of image classification and natural language processing models. In this paper, we show the first model extraction attack against real-world generative adversarial network (GAN) image translation models. We present a framework for conducting such attacks, and show that an adversary can successfully extract functional surrogate models by querying $F_V$ using data from the same domain as the training data for $F_V$. The adversary need not know $F_V$'s architecture or any other information about it beyond its intended task. We evaluate the effectiveness of our attacks using three different instances of two popular categories of image translation: (1) Selfie-to-Anime and (2) Monet-to-Photo (image style transfer), and (3) Super-Resolution (super resolution). Using standard performance metrics for GANs, we show that our attacks are effective. Furthermore, we conducted a large-scale user study (125 participants) on Selfie-to-Anime and Monet-to-Photo to show that human perception of the images produced by $F_V$ and $F_A$ can be considered equivalent, within an equivalence bound of Cohen's d = 0.3. Finally, we show that existing defenses against model extraction attacks (watermarking, adversarial examples, poisoning) do not extend to image translation models.
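The query-then-train loop described in the abstract can be illustrated with a deliberately tiny sketch. This is not the paper's method: the paper extracts GAN image translation models, whereas here a hidden linear map stands in for $F_V$ and the surrogate $F_A$ is fitted by ordinary least squares. The names `query_victim`, `extract_surrogate`, and `_W_secret` are hypothetical and exist only for illustration. What the sketch preserves is the threat model: the adversary sees only input/output pairs from the inference API and knows nothing about the victim's internals.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for the victim F_V: a hidden linear "translation".
# In the paper F_V is a GAN; a linear map is an assumption made for brevity.
_W_secret = rng.normal(size=(16, 16))


def query_victim(x):
    """Black-box inference API: the adversary observes outputs only."""
    return x @ _W_secret.T


def extract_surrogate(queries):
    """Fit a surrogate F_A on (query, response) pairs via least squares."""
    responses = query_victim(queries)  # labels come from the victim's API
    w_hat, *_ = np.linalg.lstsq(queries, responses, rcond=None)
    return lambda x: x @ w_hat


# The adversary queries with data drawn from the same domain as the
# victim's inputs, then trains the surrogate on the collected responses.
queries = rng.normal(size=(200, 16))
f_a = extract_surrogate(queries)

# Compare surrogate and victim on inputs not used during extraction.
test_inputs = rng.normal(size=(50, 16))
gap = float(np.max(np.abs(f_a(test_inputs) - query_victim(test_inputs))))
print(f"max absolute output difference: {gap:.2e}")
```

For a real image translation model the surrogate would be a trained network rather than a closed-form fit, and agreement would be judged with perceptual metrics or a user study rather than an exact output difference.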

Comments:	19 pages
Subjects:	Machine Learning (cs.LG); Cryptography and Security (cs.CR); Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2104.12623 [cs.LG]
	(or arXiv:2104.12623v2 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2104.12623

Submission history

From: Sebastian Szyller
[v1] Mon, 26 Apr 2021 14:50:59 UTC (2,965 KB)
[v2] Tue, 28 Feb 2023 09:37:59 UTC (7,002 KB)
