DOI: 10.1145/3433210.3453079
research-article
Open access

REFIT: A Unified Watermark Removal Framework For Deep Learning Systems With Limited Data

Published: 04 June 2021

Abstract

Training deep neural networks from scratch can be computationally expensive and requires large amounts of training data. Recent work has explored different watermarking techniques to protect pre-trained deep neural networks from copyright infringement. However, these techniques can be vulnerable to watermark removal attacks. In this work, we propose REFIT, a unified watermark removal framework based on fine-tuning, which does not rely on knowledge of the watermarks and is effective against a wide range of watermarking schemes. In particular, we conduct a comprehensive study of a realistic attack scenario in which the adversary has limited training data, a setting that has not been emphasized in prior work on attacks against watermarking schemes. To effectively remove the watermarks without compromising model functionality under this weak threat model, we propose two techniques that are incorporated into our fine-tuning framework: (1) an adaptation of the elastic weight consolidation (EWC) algorithm, originally proposed to mitigate the catastrophic forgetting phenomenon; and (2) unlabeled data augmentation (AU), which leverages auxiliary unlabeled data from other sources. Our extensive evaluation shows the effectiveness of REFIT against diverse watermark embedding schemes. In particular, both EWC and AU significantly decrease the amount of labeled training data needed for effective watermark removal, and the unlabeled samples used for AU need not be drawn from the same distribution as the benign data used for model evaluation. These results demonstrate that fine-tuning-based watermark removal attacks can pose real threats to the copyright of pre-trained models, and thus highlight the importance of further investigating the watermarking problem and designing watermark embedding schemes that are more robust against such attacks.
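
To make the two ingredients of the framework concrete, the following is a minimal PyTorch sketch of EWC-regularized fine-tuning combined with pseudo-labeling of auxiliary unlabeled data (the AU idea). It is not the authors' released code; the diagonal Fisher estimate, the `pseudo_label` helper, and hyperparameters such as `lambda_ewc` are illustrative assumptions rather than details taken from the paper.

```python
# Sketch of EWC-regularized fine-tuning for watermark removal (illustrative only).
import torch
import torch.nn.functional as F


def estimate_fisher(model, loader, device="cpu"):
    """Rough diagonal Fisher estimate: average squared gradients of the
    log-likelihood over the small fine-tuning set (an assumed approximation)."""
    model.eval()
    fisher = {n: torch.zeros_like(p) for n, p in model.named_parameters()
              if p.requires_grad}
    for x, y in loader:
        x, y = x.to(device), y.to(device)
        model.zero_grad()
        F.nll_loss(F.log_softmax(model(x), dim=1), y).backward()
        for n, p in model.named_parameters():
            if p.grad is not None:
                fisher[n] += p.grad.detach() ** 2
    return {n: f / max(len(loader), 1) for n, f in fisher.items()}


@torch.no_grad()
def pseudo_label(model, unlabeled_x, device="cpu"):
    """AU-style augmentation: label auxiliary unlabeled samples with the
    pre-trained model's own predictions before mixing them into fine-tuning."""
    model.eval()
    return model(unlabeled_x.to(device)).argmax(dim=1)


def finetune_with_ewc(model, loader, fisher, lambda_ewc=100.0,
                      epochs=5, lr=1e-3, device="cpu"):
    """Fine-tune on limited (labeled + pseudo-labeled) data while penalizing
    drift on parameters the Fisher marks as important, aiming to overwrite the
    watermark behavior without destroying benign accuracy."""
    anchor = {n: p.detach().clone() for n, p in model.named_parameters()}
    opt = torch.optim.SGD(model.parameters(), lr=lr, momentum=0.9)
    model.train()
    for _ in range(epochs):
        for x, y in loader:
            x, y = x.to(device), y.to(device)
            opt.zero_grad()
            task_loss = F.cross_entropy(model(x), y)
            ewc_penalty = sum((fisher[n] * (p - anchor[n]) ** 2).sum()
                              for n, p in model.named_parameters() if n in fisher)
            (task_loss + lambda_ewc * ewc_penalty).backward()
            opt.step()
    return model
```

In practice one would concatenate the small labeled fine-tuning set with the pseudo-labeled auxiliary set into `loader`, compute `fisher` on the available in-distribution data, and sweep `lambda_ewc` and the learning rate to trade off watermark removal against benign accuracy; these choices are part of the sketch's assumptions, not prescriptions from the paper.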

Supplementary Material

MP4 File (ASIA-CCS21-fp179.mp4)
Video - REFIT: A Unified Watermark Removal Framework For Deep Learning Systems With Limited Data




        Published In

        ASIA CCS '21: Proceedings of the 2021 ACM Asia Conference on Computer and Communications Security
        May 2021
        975 pages
        ISBN: 9781450382878
        DOI: 10.1145/3433210
        • General Chairs: Jiannong Cao, Man Ho Au
        • Program Chairs: Zhiqiang Lin, Moti Yung
        This work is licensed under a Creative Commons Attribution International 4.0 License.

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Author Tags

        1. fine-tuning
        2. neural networks
        3. watermark removal

        Qualifiers

        • Research-article

        Funding Sources

        • DARPA D3M
        • National Science Foundation

        Conference

        ASIA CCS '21

        Acceptance Rates

        Overall Acceptance Rate 418 of 2,322 submissions, 18%

        Article Metrics

        • Downloads (Last 12 months): 574
        • Downloads (Last 6 weeks): 101
        Reflects downloads up to 10 Nov 2024

        Cited By

        • (2024) Identifying Appropriate Intellectual Property Protection Mechanisms for Machine Learning Models: A Systematization of Watermarking, Fingerprinting, Model Access, and Attacks. IEEE Transactions on Neural Networks and Learning Systems 35(10): 13082-13100, Oct 2024. DOI: 10.1109/TNNLS.2023.3270135
        • (2024) RemovalNet: DNN Fingerprint Removal Attacks. IEEE Transactions on Dependable and Secure Computing 21(4): 2645-2658, Jul 2024. DOI: 10.1109/TDSC.2023.3315064
        • (2024) A Self-Supervised CNN for Image Watermark Removal. IEEE Transactions on Circuits and Systems for Video Technology 34(8): 7566-7576, Aug 2024. DOI: 10.1109/TCSVT.2024.3375831
        • (2024) Perceptive Self-Supervised Learning Network for Noisy Image Watermark Removal. IEEE Transactions on Circuits and Systems for Video Technology 34(8): 7069-7079, Aug 2024. DOI: 10.1109/TCSVT.2024.3349678
        • (2024) An Explainable Intellectual Property Protection Method for Deep Neural Networks Based on Intrinsic Features. IEEE Transactions on Artificial Intelligence 5(9): 4649-4659, Sep 2024. DOI: 10.1109/TAI.2024.3388389
        • (2024) Data-Free Watermark for Deep Neural Networks by Truncated Adversarial Distillation. ICASSP 2024 - IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP): 4480-4484, Apr 2024. DOI: 10.1109/ICASSP48485.2024.10446261
        • (2024) Backdoor Attacks to Deep Neural Networks: A Survey of the Literature, Challenges, and Future Research Directions. IEEE Access 12: 29004-29023, 2024. DOI: 10.1109/ACCESS.2024.3355816
        • (2024) When deep learning meets watermarking. Computer Standards & Interfaces 89(C), Jun 2024. DOI: 10.1016/j.csi.2023.103830
        • (2024) Deep neural networks watermark via universal deep hiding and metric learning. Neural Computing and Applications 36(13): 7421-7438, Feb 2024. DOI: 10.1007/s00521-024-09469-5
        • (2023) Free Fine-tuning: A Plug-and-Play Watermarking Scheme for Deep Neural Networks. Proceedings of the 31st ACM International Conference on Multimedia: 8463-8474, Oct 2023. DOI: 10.1145/3581783.3612331
