Privacy Profiles and Amplification by Subsampling
Abstract
Differential privacy provides a robust, quantifiable methodology for measuring and controlling the privacy leakage of data analysis algorithms.
A fundamental insight is that by forcing algorithms to be randomized, their privacy leakage can be characterized by measuring the dissimilarity between the output distributions produced by applying the algorithm to pairs of datasets differing in one individual.
After the introduction of differential privacy, several variants of the original definition have been proposed by changing the measure of dissimilarity between distributions, including concentrated, zero-concentrated, and Rényi differential privacy.
The first contribution of this paper is to introduce the notion of privacy profile of a mechanism.
This profile captures all valid $(\varepsilon,\delta)$ differential privacy parameters satisfied by a given mechanism, in contrast with the usual approach of providing guarantees in terms of a single point on this curve.
We show that knowledge of this curve is equivalent to knowledge of the privacy guarantees with respect to the alternative definitions listed above.
This sheds further light into the connections between different privacy definitions, and suggests that these should be considered alternative but otherwise equivalent points of view.
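To make the notion concrete, a privacy profile can be evaluated as a curve $\varepsilon \mapsto \delta(\varepsilon)$ rather than a single pair. The sketch below does this for the Gaussian mechanism using the closed-form expression from the analytic Gaussian mechanism analysis (Balle and Wang, 2018); the function name and parameter choices are illustrative, not taken from this paper.

```python
from math import erf, exp, sqrt


def gaussian_privacy_profile(eps: float, sensitivity: float, sigma: float) -> float:
    """Smallest delta such that the Gaussian mechanism with noise scale
    sigma and L2-sensitivity `sensitivity` satisfies (eps, delta)-DP,
    via the analytic Gaussian mechanism formula:
        delta(eps) = Phi(mu/2 - eps/mu) - e^eps * Phi(-mu/2 - eps/mu),
    where mu = sensitivity / sigma and Phi is the standard normal CDF.
    """
    def Phi(x: float) -> float:
        return 0.5 * (1.0 + erf(x / sqrt(2.0)))

    mu = sensitivity / sigma
    return Phi(mu / 2 - eps / mu) - exp(eps) * Phi(-mu / 2 - eps / mu)


# The privacy profile is the whole curve, not one point:
profile = [(eps, gaussian_privacy_profile(eps, 1.0, 1.0))
           for eps in (0.0, 0.5, 1.0, 2.0)]
```

Tracing the full curve in this way is what allows conversions to the alternative definitions above, since each of them can be read off from the profile.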
The second contribution of this paper is to apply the privacy profiles machinery to study the so-called ``privacy amplification by subsampling'' principle, which ensures that a differentially private mechanism run on a random subsample of a population provides higher privacy guarantees than when run on the entire population.
Several instances of this principle have been studied for different random subsampling methods, each with an ad-hoc analysis. In this paper we set out to study this phenomenon in detail, with the aim of providing a general method capable of recovering prior analyses in a streamlined fashion.
Our method makes extensive use of coupling arguments and introduces a new tool for analysing differential privacy for mixture distributions.
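As a concrete instance of the amplification principle, the standard bound for Poisson subsampling states that running an $(\varepsilon,\delta)$-DP mechanism on a subsample that includes each record independently with probability $q$ yields a $(\log(1 + q(e^{\varepsilon}-1)),\, q\delta)$-DP mechanism. The sketch below computes this bound; it is one well-known special case of the phenomenon the paper analyses, and the function name is illustrative.

```python
from math import exp, log


def amplify_by_poisson_subsampling(eps: float, delta: float, q: float) -> tuple[float, float]:
    """Amplified (eps', delta') guarantee when an (eps, delta)-DP mechanism
    is run on a Poisson subsample with inclusion probability q:
        eps'   = log(1 + q * (e^eps - 1))
        delta' = q * delta
    For small q this gives eps' roughly q * (e^eps - 1), i.e. the privacy
    parameter shrinks roughly linearly in the sampling rate.
    """
    eps_prime = log(1.0 + q * (exp(eps) - 1.0))
    delta_prime = q * delta
    return eps_prime, delta_prime


# A mechanism with eps = 1 run on a 1% Poisson subsample:
eps2, delta2 = amplify_by_poisson_subsampling(1.0, 1e-5, 0.01)
```

The paper's coupling-based machinery recovers bounds of this shape, together with analogous statements for other subsampling schemes, from a single general argument.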
Article Details
Copyright is retained by the authors. By submitting to this journal, the author(s) license the article under the Creative Commons License – Attribution-NonCommercial-NoDerivatives 4.0 International (CC BY-NC-ND 4.0), unless choosing a more lenient license (for instance, public domain). For situations not allowed under CC BY-NC-ND, short sections of text, not to exceed two paragraphs, may be quoted without explicit permission provided that full credit, including © notice, is given to the source.
Authors of articles published by the journal grant the journal the right to store the articles in its databases for an unlimited period of time and to distribute and reproduce the articles electronically.