DOI: 10.1145/3133956.3134012
Research article

Deep Models Under the GAN: Information Leakage from Collaborative Deep Learning

Published: 30 October 2017

Abstract

Deep learning has recently become hugely popular in machine learning for its ability to build end-to-end learning systems, in which features and classifiers are learned simultaneously, providing significant improvements in classification accuracy on large, highly structured datasets.
Its success is due to a combination of recent algorithmic breakthroughs, increasingly powerful computers, and access to significant amounts of data.
Researchers have also considered the privacy implications of deep learning. Models are typically trained in a centralized manner, with all the data processed by the same training algorithm. If that data is a collection of users' private information, including habits, personal pictures, geographical positions, and interests, the centralized server gains access to sensitive material that could potentially be mishandled. To tackle this problem, collaborative deep learning models have recently been proposed in which parties train their networks locally and share only a subset of the parameters, in an attempt to keep their respective training sets private. The shared parameters can also be obfuscated via differential privacy (DP) to make information extraction even more challenging, as proposed by Shokri and Shmatikov at CCS'15.
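
To make the setup concrete, here is a minimal sketch, in PyTorch, of the kind of selective, noise-obfuscated parameter sharing described above. It is an illustration under stated assumptions, not Shokri and Shmatikov's implementation: the function name `share_parameters`, the Gaussian noise, the clipping rule, and the top-k coordinate selection are all placeholders for a properly calibrated DP mechanism.

    import torch

    def share_parameters(model, clip_norm=1.0, noise_std=0.1, fraction=0.1):
        """Hypothetical sketch of selective, noisy gradient sharing.

        Each party clips its local gradients, perturbs them with Gaussian
        noise (a stand-in for a calibrated DP mechanism), and uploads only
        a fraction of the coordinates to the parameter server.
        """
        shared = {}
        for name, param in model.named_parameters():
            if param.grad is None:
                continue
            g = param.grad.detach().clone()
            # Bound each update's influence before adding noise.
            scale = (clip_norm / (g.norm() + 1e-12)).clamp(max=1.0)
            g = g * scale
            # Obfuscate the update before it leaves the party.
            g = g + noise_std * torch.randn_like(g)
            # Share only the largest-magnitude coordinates (partial sharing).
            k = max(1, int(fraction * g.numel()))
            idx = g.abs().flatten().topk(k).indices
            shared[name] = (idx, g.flatten()[idx])
        return shared

Each party would call something like this after its local backward pass; the server then aggregates the sparse, noisy updates into the global model.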
Unfortunately, we show that any such privacy-preserving collaborative deep learning scheme is susceptible to a powerful attack that we devise in this paper. In particular, we show that distributed, federated, or decentralized deep learning approaches are fundamentally broken and do not protect the training sets of honest participants. The attack exploits the real-time nature of the learning process, which allows the adversary to train a Generative Adversarial Network (GAN) that generates prototypical samples of the targeted training set that was meant to be private (the samples generated by the GAN are intended to come from the same distribution as the training data). Interestingly, we show that record-level differential privacy applied to the shared parameters, as suggested in previous work, is ineffective against this attack (i.e., record-level DP is not designed to address it).
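
The core of the attack can be illustrated with a short, simplified sketch (again in PyTorch, and again hypothetical rather than the authors' code): an insider participant copies the current global model to play the discriminator role and trains a local generator until its outputs are classified as the victim's class. The `global_model` and `generator` handles, the input shapes, and the hyperparameters are assumptions.

    import torch
    import torch.nn.functional as F

    def run_gan_attack(global_model, generator, victim_class,
                       steps=1000, batch_size=64, latent_dim=100, lr=1e-4):
        # The shared global model acts as a fixed judge (the discriminator
        # role); only the local generator is updated in this sketch.
        for p in global_model.parameters():
            p.requires_grad_(False)
        opt = torch.optim.Adam(generator.parameters(), lr=lr)
        target = torch.full((batch_size,), victim_class, dtype=torch.long)
        for _ in range(steps):
            z = torch.randn(batch_size, latent_dim)   # random latent codes
            fake = generator(z)                       # candidate samples
            logits = global_model(fake)               # scores from shared model
            # Push generated samples toward the victim's class, i.e. toward
            # the distribution of the private training data.
            loss = F.cross_entropy(logits, target)
            opt.zero_grad()
            loss.backward()
            opt.step()
        return generator

In the full attack described in the paper, the adversary also participates in training and injects the generated samples under a different label, forcing the victim to release ever more information about the targeted class through the shared parameters; that feedback loop is omitted from this sketch.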

Supplemental Material

MP4 File




Published In

CCS '17: Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security
October 2017
2682 pages
ISBN:9781450349468
DOI:10.1145/3133956
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.


Publisher

Association for Computing Machinery

New York, NY, United States



Author Tags

  1. collaborative learning
  2. deep learning
  3. privacy
  4. security

Qualifiers

  • Research-article

Conference

CCS '17
Sponsor: SIGSAC

Acceptance Rates

CCS '17 paper acceptance rate: 151 of 836 submissions, 18%
Overall acceptance rate: 1,261 of 6,999 submissions, 18%




Article Metrics

  • Downloads (last 12 months): 686
  • Downloads (last 6 weeks): 35
Reflects downloads up to 30 Aug 2024


Cited By

  • Efficient secure aggregation for privacy-preserving federated learning based on secret sharing. JUSTC 54(1), 0104, 2024. DOI: 10.52396/JUSTC-2022-0116
  • Secure AI Model Sharing: A Cryptographic Approach for Encrypted Model Exchange. International Journal of Artificial Intelligence and Machine Learning 4(1), 48-60, 5 Jan 2024. DOI: 10.51483/IJAIML.4.1.2024.48-60
  • Human Resources Optimization for Public Space Security. In Enhancing Security in Public Spaces Through Generative Adversarial Networks (GANs), 274-295, 16 May 2024. DOI: 10.4018/979-8-3693-3597-0.ch019
  • Enhancing Privacy and Security in Online Education Using Generative Adversarial Networks. In Enhancing Security in Public Spaces Through Generative Adversarial Networks (GANs), 206-230, 16 May 2024. DOI: 10.4018/979-8-3693-3597-0.ch015
  • Enhancing Cyber Security Through Generative Adversarial Networks. In Enhancing Security in Public Spaces Through Generative Adversarial Networks (GANs), 177-192, 16 May 2024. DOI: 10.4018/979-8-3693-3597-0.ch013
  • Revolutionizing Healthcare: Harnessing IoT-Integrated Federated Learning for Early Disease Detection and Patient Privacy Preservation. In Federated Learning and Privacy-Preserving in Healthcare AI, 195-216, 19 Apr 2024. DOI: 10.4018/979-8-3693-1874-4.ch013
  • Secure Aggregation Protocol Based on DC-Nets and Secret Sharing for Decentralized Federated Learning. Sensors 24(4), 1299, 17 Feb 2024. DOI: 10.3390/s24041299
  • EPFed: Achieving Optimal Balance between Privacy and Efficiency in Federated Learning. Electronics 13(6), 1028, 9 Mar 2024. DOI: 10.3390/electronics13061028
  • Artificial Intelligence-Based Algorithms in Medical Image Scan Segmentation and Intelligent Visual Content Generation: A Concise Overview. Electronics 13(4), 746, 13 Feb 2024. DOI: 10.3390/electronics13040746
  • Improving Security in the Internet of Vehicles: A Blockchain-Based Data Sharing Scheme. Electronics 13(4), 714, 9 Feb 2024. DOI: 10.3390/electronics13040714
