Rethinking the Reverse-engineering of Trojan Triggers

Wang, Zhenting; Mei, Kai; Ding, Hailun; Zhai, Juan; Ma, Shiqing

arXiv:2210.15127 (cs)

[Submitted on 27 Oct 2022]

Abstract: Deep neural networks are vulnerable to Trojan (or backdoor) attacks. Reverse-engineering methods can reconstruct the trigger and thus identify affected models. Existing reverse-engineering methods consider only input-space constraints, e.g., trigger size in the input space. Specifically, they assume that triggers are static patterns in the input space, and so they fail to detect models with feature-space triggers such as image style transformations. We observe that both input-space and feature-space Trojans are associated with feature-space hyperplanes. Based on this observation, we design a novel reverse-engineering method that exploits the feature-space constraint to reverse-engineer Trojan triggers. Results on four datasets and seven different attacks demonstrate that our solution effectively defends against both input-space and feature-space Trojans. It outperforms state-of-the-art reverse-engineering methods and other types of defenses in both Trojaned-model detection and mitigation tasks. On average, the detection accuracy of our method is 93%. For Trojan mitigation, our method can reduce the ASR (attack success rate) to only 0.26% while the BA (benign accuracy) remains nearly unchanged. Our code can be found at this https URL.
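
The two mitigation metrics named in the abstract, ASR and BA, can be illustrated with a minimal sketch. This is not the authors' evaluation code: the function names and the sample predictions are hypothetical. The sketch also assumes a common convention for ASR, which counts only triggered samples whose true label differs from the attacker's target label.

```python
from typing import Sequence


def benign_accuracy(preds: Sequence[int], labels: Sequence[int]) -> float:
    """BA: fraction of clean inputs classified as their true label."""
    if len(preds) != len(labels) or not labels:
        raise ValueError("preds and labels must be non-empty and equal in length")
    correct = sum(p == y for p, y in zip(preds, labels))
    return correct / len(labels)


def attack_success_rate(
    triggered_preds: Sequence[int],
    true_labels: Sequence[int],
    target_label: int,
) -> float:
    """ASR: fraction of triggered inputs classified as the target label.

    Assumption: samples whose true label already equals the target label
    are excluded, since predicting the target for them is not an attack success.
    """
    if len(triggered_preds) != len(true_labels):
        raise ValueError("triggered_preds and true_labels must be equal in length")
    eligible = [p for p, y in zip(triggered_preds, true_labels) if y != target_label]
    if not eligible:
        raise ValueError("no samples with a true label other than the target label")
    return sum(p == target_label for p in eligible) / len(eligible)


# Hypothetical predictions, for illustration only.
ba = benign_accuracy(preds=[0, 1, 2, 2], labels=[0, 1, 2, 1])
asr = attack_success_rate(
    triggered_preds=[7, 7, 3, 7], true_labels=[1, 2, 3, 7], target_label=7
)
print(f"BA = {ba:.2%}, ASR = {asr:.2%}")
```

A successful mitigation drives ASR toward zero while leaving BA close to its value before mitigation.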

Subjects:	Cryptography and Security (cs.CR); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
Cite as:	arXiv:2210.15127 [cs.CR]
	(or arXiv:2210.15127v1 [cs.CR] for this version)
	https://doi.org/10.48550/arXiv.2210.15127

Submission history

From: Zhenting Wang
[v1] Thu, 27 Oct 2022 02:25:18 UTC (932 KB)