The Secret Sharer: Evaluating and Testing Unintended Memorization in Neural Networks

Carlini, Nicholas; Liu, Chang; Erlingsson, Úlfar; Kos, Jernej; Song, Dawn

arXiv:1802.08232 (cs)

[Submitted on 22 Feb 2018 (v1), last revised 16 Jul 2019 (this version, v3)]

Abstract: This paper describes a testing methodology for quantitatively assessing the risk that rare or unique training-data sequences are unintentionally memorized by generative sequence models, a common type of machine-learning model. Because such models are sometimes trained on sensitive data (e.g., the text of users' private messages), this methodology can benefit privacy by allowing deep-learning practitioners to select means of training that minimize such memorization.
In experiments, we show that unintended memorization is a persistent, hard-to-avoid issue that can have serious consequences. Specifically, for models trained without consideration of memorization, we describe new, efficient procedures that can extract unique, secret sequences, such as credit card numbers. We show that our testing strategy is a practical and easy-to-use first line of defense, e.g., by describing its application to quantitatively limit data exposure in Google's Smart Compose, a commercial text-completion neural network trained on millions of users' email messages.
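
The abstract gives no formula, so the following is only a minimal sketch of how such a quantitative test could look. It assumes a rank-based exposure score: a known random "canary" sequence is inserted into the training data, and the trained model is then checked for how highly it ranks that canary among all candidate sequences of the same format. The names `exposure` and `make_toy_scorer`, the digit-mismatch scorer, and the tie-handling rule are hypothetical stand-ins rather than the authors' code. A real test would score candidates with the trained model's log-perplexity.

```python
import math
from typing import Callable, Sequence


def exposure(canary: str,
             candidates: Sequence[str],
             log_perplexity: Callable[[str], float]) -> float:
    """Rank-based exposure: log2(number of candidates) - log2(rank of canary).

    Rank 1 means the model finds the canary more likely than every other
    candidate. Higher exposure suggests stronger memorization.
    """
    if canary not in candidates:
        raise ValueError("canary must be one of the candidates")
    canary_score = log_perplexity(canary)
    # Assumption: ties are resolved in the canary's favor, so the rank is
    # 1 plus the number of candidates scored as strictly more likely.
    rank = 1 + sum(1 for c in candidates if log_perplexity(c) < canary_score)
    return math.log2(len(candidates)) - math.log2(rank)


def make_toy_scorer(memorized: str) -> Callable[[str], float]:
    """Hypothetical stand-in for a trained model's log-perplexity.

    Scores a string by how many positions differ from `memorized`,
    so the memorized string itself gets the lowest (best) score.
    """
    def score(s: str) -> float:
        return float(sum(a != b for a, b in zip(s, memorized)))
    return score


# Candidate space: every 4-digit string (a toy "secret" format).
candidates = [f"{i:04d}" for i in range(10_000)]
canary = "2817"

# A toy model that memorized the canary ranks it first.
memorizing_model = make_toy_scorer(canary)
high = exposure(canary, candidates, memorizing_model)

# A toy model that memorized something unrelated ranks it far lower.
unrelated_model = make_toy_scorer("0000")
low = exposure(canary, candidates, unrelated_model)

print(f"exposure with memorized canary: {high:.2f}")
print(f"exposure with unrelated model:  {low:.2f}")
```

In this toy setting the memorizing scorer yields the maximum possible exposure, log2 of the candidate-space size, while the unrelated scorer yields a much smaller value. Comparing such scores across training configurations is one way a practitioner might pick the setup that memorizes least.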

Subjects:	Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Cryptography and Security (cs.CR)
Cite as:	arXiv:1802.08232 [cs.LG]
	(or arXiv:1802.08232v3 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.1802.08232

Submission history

From: Nicholas Carlini
[v1] Thu, 22 Feb 2018 18:42:41 UTC (186 KB)
[v2] Tue, 5 Mar 2019 18:13:03 UTC (251 KB)
[v3] Tue, 16 Jul 2019 17:05:32 UTC (334 KB)
