FDINet: Protecting against DNN Model Extraction via Feature Distortion Index

Yao, Hongwei; Li, Zheng; Weng, Haiqin; Xue, Feng; Qin, Zhan; Ren, Kui

arXiv:2306.11338 (cs)

[Submitted on 20 Jun 2023 (v1), last revised 22 Oct 2024 (this version, v3)]

Abstract: Machine Learning as a Service (MLaaS) platforms have gained popularity due to their accessibility, cost-efficiency, scalability, and rapid development capabilities. However, recent research has highlighted the vulnerability of cloud-based models in MLaaS to model extraction attacks. In this paper, we introduce FDINet, a novel defense mechanism that leverages the feature distribution of deep neural network (DNN) models. Concretely, by analyzing the feature distribution of the adversary's queries, we reveal that it deviates from that of the model's training set. Based on this key observation, we propose the Feature Distortion Index (FDI), a metric designed to quantitatively measure the feature distribution deviation of received queries. FDINet uses FDI to train a binary detector and exploits FDI similarity to identify colluding adversaries in distributed extraction attacks. We conduct extensive experiments to evaluate FDINet against six state-of-the-art extraction attacks on four benchmark datasets and four popular model architectures. Empirical results demonstrate the following findings: (1) FDINet is highly effective in detecting model extraction, achieving 100% detection accuracy on DFME and DaST; (2) FDINet is highly efficient, using just 50 queries to raise an extraction alarm with an average confidence of 96.08% for GTSRB; (3) FDINet identifies colluding adversaries with an accuracy exceeding 91%. Additionally, it demonstrates the ability to detect two types of adaptive attacks.
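
The core observation in the abstract, that features of extraction queries drift away from those of the training set, can be illustrated with a small sketch. The abstract does not define FDI, so the code below is not the paper's metric: it assumes a stand-in score, the mean Euclidean distance between each query's feature vector and the centroid of training features for its predicted class. The helper names (`class_centroids`, `deviation_score`), the toy 2-D features, and the 0.3 threshold are all hypothetical, and a fixed threshold stands in for the trained binary detector the paper describes.

```python
import math
from statistics import mean


def class_centroids(features, labels):
    """Average the training feature vectors of each class.

    Assumption: a per-class centroid serves as the reference point;
    the abstract does not say how the paper builds its reference.
    """
    grouped = {}
    for vec, label in zip(features, labels):
        grouped.setdefault(label, []).append(vec)
    return {
        label: [mean(column) for column in zip(*vecs)]
        for label, vecs in grouped.items()
    }


def deviation_score(query_features, predicted_labels, centroids):
    """Mean distance from each query's features to its predicted-class centroid."""
    distances = [
        math.dist(vec, centroids[label])
        for vec, label in zip(query_features, predicted_labels)
    ]
    return mean(distances)


# Toy 2-D "features" standing in for intermediate DNN activations.
train_feats = [[0.0, 0.0], [0.2, 0.0], [1.0, 1.0], [1.0, 1.2]]
train_labels = [0, 0, 1, 1]
centroids = class_centroids(train_feats, train_labels)

# Benign queries sit close to their class centroid; the suspicious
# batch lands between classes, far from either centroid.
benign = deviation_score([[0.1, 0.1], [1.0, 1.0]], [0, 1], centroids)
suspicious = deviation_score([[0.6, 0.5], [0.4, 0.6]], [0, 1], centroids)

THRESHOLD = 0.3  # hypothetical cut-off, replacing the paper's learned detector
flagged = suspicious > THRESHOLD
```

In this toy setup the suspicious batch scores well above the benign one, which is the kind of separation a detector could learn from. A real deployment would extract features from the protected model's layers and fit a classifier on scores from known benign and adversarial query batches rather than hand-picking a threshold.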

Comments:	Accepted to IEEE Transactions on Dependable and Secure Computing
Subjects:	Cryptography and Security (cs.CR); Machine Learning (cs.LG)
Cite as:	arXiv:2306.11338 [cs.CR]
	(or arXiv:2306.11338v3 [cs.CR] for this version)
	https://doi.org/10.48550/arXiv.2306.11338

Submission history

From: Hongwei Yao
[v1] Tue, 20 Jun 2023 07:14:37 UTC (13,180 KB)
[v2] Thu, 22 Jun 2023 02:20:38 UTC (8,184 KB)
[v3] Tue, 22 Oct 2024 16:39:19 UTC (7,486 KB)
