Semi-supervised Knowledge Transfer for Deep Learning from Private Training Data

Papernot, Nicolas; Abadi, Martín; Erlingsson, Úlfar; Goodfellow, Ian; Talwar, Kunal

arXiv:1610.05755 (stat)

[Submitted on 18 Oct 2016 (v1), last revised 3 Mar 2017 (this version, v4)]

Abstract: Some machine learning applications involve training data that is sensitive, such as the medical histories of patients in a clinical trial. A model may inadvertently and implicitly store some of its training data; careful analysis of the model may therefore reveal sensitive information.
To address this problem, we demonstrate a generally applicable approach to providing strong privacy guarantees for training data: Private Aggregation of Teacher Ensembles (PATE). The approach combines, in a black-box fashion, multiple models trained with disjoint datasets, such as records from different subsets of users. Because they rely directly on sensitive data, these models are not published, but instead used as "teachers" for a "student" model. The student learns to predict an output chosen by noisy voting among all of the teachers, and cannot directly access an individual teacher or the underlying data or parameters. The student's privacy properties can be understood both intuitively (since no single teacher and thus no single dataset dictates the student's training) and formally, in terms of differential privacy. These properties hold even if an adversary can not only query the student but also inspect its internal workings.
Compared with previous work, the approach imposes only weak assumptions on how teachers are trained: it applies to any model, including non-convex models like DNNs. We achieve state-of-the-art privacy/utility trade-offs on MNIST and SVHN thanks to an improved privacy analysis and semi-supervised learning.
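
The noisy voting step described in the abstract can be illustrated with a minimal sketch. The abstract does not say which noise distribution is used, so Laplace noise is an assumption here, and the names `laplace_noise`, `noisy_vote`, and `noise_scale`, along with the example votes, are hypothetical and not taken from the authors' code. The sketch shows only the aggregation of teacher predictions into one label; it does not train teachers or a student, and it does not compute a differential privacy bound.

```python
import random
from collections import Counter


def laplace_noise(scale, rng):
    """Draw one Laplace(0, scale) sample as the difference of two
    independent exponential samples with mean `scale`."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_vote(teacher_labels, num_classes, noise_scale, rng):
    """Return the class whose noise-perturbed vote count is largest.

    teacher_labels: one predicted class index per teacher for a single query.
    noise_scale:    assumed Laplace scale; larger values add more noise.
    """
    counts = Counter(teacher_labels)
    noisy_counts = [
        counts.get(c, 0) + laplace_noise(noise_scale, rng)
        for c in range(num_classes)
    ]
    return max(range(num_classes), key=noisy_counts.__getitem__)


rng = random.Random(0)
# Hypothetical predictions from 10 teachers on one student query.
votes = [3, 3, 3, 1, 3, 3, 2, 3, 3, 3]
label = noisy_vote(votes, num_classes=4, noise_scale=1.0, rng=rng)
```

In this sketch the student would only ever see `label`, never the individual teacher votes. When the teachers agree strongly, as in the example, moderate noise rarely changes the winning class, which is the intuition behind the claim that no single teacher dictates the student's training.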

Comments:	Accepted to ICLR 17 as an oral
Subjects:	Machine Learning (stat.ML); Cryptography and Security (cs.CR); Machine Learning (cs.LG)
Cite as:	arXiv:1610.05755 [stat.ML]
	(or arXiv:1610.05755v4 [stat.ML] for this version)
	https://doi.org/10.48550/arXiv.1610.05755

Submission history

From: Nicolas Papernot
[v1] Tue, 18 Oct 2016 19:37:37 UTC (199 KB)
[v2] Wed, 2 Nov 2016 13:18:56 UTC (199 KB)
[v3] Mon, 7 Nov 2016 00:18:03 UTC (199 KB)
[v4] Fri, 3 Mar 2017 18:56:43 UTC (218 KB)
