DeBackdoor: A Deductive Framework for Detecting Backdoor Attacks on Deep Models with Limited Data

Popovic, Dorde; Sadeghi, Amin; Yu, Ting; Chawla, Sanjay; Khalil, Issa

arXiv:2503.21305 (cs)

[Submitted on 27 Mar 2025]

Abstract: Backdoor attacks are among the most effective, practical, and stealthy attacks in deep learning. We consider a practical scenario in which a developer obtains a deep model from a third party and uses it as part of a safety-critical system. The developer wants to inspect the model for potential backdoors before deploying the system. We find that most existing detection techniques make assumptions that do not hold in this scenario. We present a novel framework for detecting backdoors under realistic restrictions. We generate candidate triggers by deductively searching over the space of possible triggers, and we construct and optimize a smoothed version of Attack Success Rate as the search objective. Starting from a broad class of template attacks and using only the forward pass of a deep model, we reverse-engineer the backdoor attack. We conduct an extensive evaluation on a wide range of attacks, models, and datasets, and our technique performs almost perfectly across these settings.
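The abstract describes the search only at a high level, so the following is a minimal sketch of the general idea rather than the authors' method. It assumes a patch-style trigger template, a sigmoid-of-margin form for the smoothed Attack Success Rate, and plain hill climbing as the search procedure. None of these choices are confirmed by the abstract. The model `toy_backdoored_forward` and all function names are hypothetical and exist only to make the example self-contained. The one property taken from the source is that the search queries the model through its forward pass only.

```python
import numpy as np


def smoothed_asr(logits, target, temperature=1.0):
    """Soft stand-in for Attack Success Rate (an assumed form, not the paper's).

    Hard ASR counts inputs whose argmax is the target class. Here each input
    instead contributes sigmoid(margin / temperature), where margin is the
    target logit minus the largest non-target logit.
    """
    others = np.delete(logits, target, axis=1)
    margin = logits[:, target] - others.max(axis=1)
    return float(np.mean(1.0 / (1.0 + np.exp(-margin / temperature))))


def hard_asr(logits, target):
    """Fraction of inputs classified as the target class."""
    return float(np.mean(logits.argmax(axis=1) == target))


def apply_patch(images, patch):
    """Stamp a square patch onto the top-left corner of every image."""
    out = images.copy()
    h, w = patch.shape
    out[:, :h, :w] = patch
    return out


def search_trigger(forward, images, target, patch_size=2, steps=300, seed=0):
    """Gradient-free hill climbing over patch pixels using forward passes only."""
    rng = np.random.default_rng(seed)
    patch = rng.random((patch_size, patch_size))
    best = smoothed_asr(forward(apply_patch(images, patch)), target)
    for _ in range(steps):
        candidate = np.clip(patch + rng.normal(0.0, 0.2, patch.shape), 0.0, 1.0)
        score = smoothed_asr(forward(apply_patch(images, candidate)), target)
        if score > best:  # keep the candidate only if the objective improves
            patch, best = candidate, score
    return patch, best


def toy_backdoored_forward(images):
    """Hypothetical 3-class model: a bright top-left 2x2 patch boosts class 2."""
    brightness = images.mean(axis=(1, 2))
    logits = np.stack(
        [1.0 - brightness, brightness, np.zeros_like(brightness)], axis=1
    )
    trigger_strength = images[:, :2, :2].mean(axis=(1, 2))
    logits[:, 2] += 12.0 * np.maximum(trigger_strength - 0.5, 0.0)
    return logits


# A small clean set: 16 dim 6x6 grayscale images with pixels in [0, 0.5).
clean_images = np.random.default_rng(1).random((16, 6, 6)) * 0.5
found_patch, found_score = search_trigger(
    toy_backdoored_forward, clean_images, target=2
)
```

The smoothing matters because hard ASR is piecewise constant: a candidate trigger that almost flips the prediction scores the same as one that has no effect, which gives a forward-only search nothing to follow. The margin-based objective above still changes when no prediction flips, so small improvements to the patch remain visible. In this toy setting a detector could then compare `found_score` against a threshold, or repeat the search for each candidate target class. Both are illustrative choices, not details taken from the paper.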

Subjects:	Cryptography and Security (cs.CR); Artificial Intelligence (cs.AI)
Cite as:	arXiv:2503.21305 [cs.CR]
	(or arXiv:2503.21305v1 [cs.CR] for this version)
	https://doi.org/10.48550/arXiv.2503.21305

Submission history

From: Dorde Popovic
[v1] Thu, 27 Mar 2025 09:31:10 UTC (990 KB)
