1 Introduction
Deep learning (DL) is the state of the art in machine learning, used to solve complex tasks in fields such as computer vision and natural language processing (NLP), with applications in domains ranging from healthcare to finance. As the performance of these models relies on large amounts of training data, the rise of DL has been accompanied by a growing interest in collecting and analyzing ever more data. However, data often includes personal or confidential information, making privacy a pressing concern. This development is also reflected in legislation recently put in place to protect personal information and prevent the identification of individuals—for example, the General Data Protection Regulation (GDPR) in the European Union or the California Consumer Privacy Act (CCPA).
In response, privacy-enhancing technologies are growing in popularity. Cryptographic techniques [
108] like homomorphic encryption and secure multi-party computation are used to protect against direct information leakage during data analysis. However, sensitive information can still be leaked indirectly via the output of the analysis and compromise privacy. For example, a machine learning model trained to diagnose a disease might make it possible to reconstruct specific information about individuals in the training dataset, even when the computation is encrypted. To avoid this kind of privacy leak, output privacy-preserving techniques can be applied. One common approach is to use anonymization techniques like
k-anonymity [
116],
l-diversity [
86] or
t-closeness [
79]. However, it is now well known that anonymization often cannot prevent re-identification [
19,
60,
94] and the privacy risk is not quantifiable. Hence,
differential privacy (DP) [
50] was proposed to provide output privacy guarantees. DP is a probabilistic mathematical definition that makes it possible to hide information about each single datapoint (e.g., about each individual) while still allowing inquiries about the dataset as a whole (e.g., a population) by adding carefully calibrated noise. While DP offers a good utility-privacy tradeoff in many cases (especially for statistical queries and traditional machine learning methods like logistic regression or support vector machines), the combination of DP and DL poses many challenges arising from the high dimensionality, the large number of training steps, and the non-convex objective functions in DL. In response, considerable progress has been made in the past few years addressing the difficulties and opportunities of
differentially private deep learning (DP-DL).
1.1 Contributions of This Survey
This survey provides a comprehensive analysis of recent advances in DP-DL, focusing on centralized DL. Distributed, federated, and collaborative DL methods and applications have their unique characteristics and challenges, and deserve a separate survey. We also specifically focus on methods that do not presume convex objective functions.
Our main contributions are as follows:
(1)
Thorough systematic literature review of DP-DL.
(2)
Identification of the research focuses of the past years (2019–2023): (1) DP-DL for specific applications (Section
4), (2) differentially private generative models (Section
5), (3) evaluating and auditing DP-DL models (Section
6), (4) DP-DL against threats other than membership and attribute inference (Section
7), and (5) improving the privacy-utility tradeoff of DP-DL (Section
8).
(3)
Analysis and contextualization of the different research trends including their potential future development (Section
9).
Previous reviews of DL and DP (Table
1) primarily cover the privacy threats specific to DL and the most common DP methods used for protection. In contrast, this survey goes beyond the basic methods and systematically investigates advances and new paths explored in the field since 2019. We provide the reader with an advanced understanding of the state-of-the-art methods, challenges, and opportunities of DP-DL.
Additional to the reviews mentioned in Table
1, a range of broader reviews exist, such as those about differentially private machine learning (without focusing on DL) [
22,
64,
151], privacy-preserving DL (without focusing on DP) [
24,
32,
68,
89,
117,
124], or privacy-preserving machine learning in general [
82,
135]. These works give a good overview, whereas our survey follows a more detailed approach focusing on recent developments of DP in centralized DL. There are also surveys on DP for specific applications or data types such as continuous [
110] or unstructured [
148] data (i.e., image, video, audio, and text data). While these surveys give more insight into the challenges of DP for the specific data type, they do not provide a comprehensive overview of the whole field of DP-DL. In contrast, our strict survey methodology leads to an application- and data type-independent analysis of the research focuses of the recent years.
2 Survey Methodology
We conducted a systematic literature search on Scopus using a predefined search query, and subsequent automatic and manual assessment according to specific inclusion/exclusion criteria. The full process is depicted in Figure
1.
The Scopus query filtered for documents with the keywords “differential privacy,” “differentially private,” or “differential private” combined with “neural network,” “deep learning,” or “machine learning,” but without mention of “federated learning,” “collaborative,” “distributed,” or “edge” in their title or abstract. The documents were restricted to articles, conference papers, reviews, or book chapters published in journals, proceedings, or books in computer science or engineering. Only documents published in English between 2019 and March 2023 were considered.
Additionally, documents were only included if they were cited at least 10 times or were in the top 10% of most cited papers of the corresponding year. This inclusion criterion allowed manual review of the most influential works in the field. The second part of the criterion was added to mitigate the bias toward older papers.
In the last step, the remaining documents were manually reviewed. Whether a document was included in this survey was decided based on the following criteria. First, DP must be a key topic of the study. Second, the methods must include neural networks with results relevant for DL. Third, we only included works about centralized learning in contrast to distributed learning. Fourth, we excluded reinforcement learning and focused on the supervised and unsupervised learning paradigms typical for centralized DL. Last, one document was excluded because of retraction.
3 Preliminaries
3.1 Differential Privacy
DP [
50] is a mathematical definition of privacy that aims at protecting details about individuals while still allowing general learnings from their data. A randomized algorithm
\(\mathcal {M}:\mathcal {D}\rightarrow \mathcal {R}\) with domain
\(\mathcal {D}\) and range
\(\mathcal {R}\) is
\(\epsilon\)-differentially private if for any two datasets
\(x, y \in \mathcal {D}\) differing on at most one datapoint, and any subset of outputs
\(\mathcal {S} \subseteq \mathcal {R}\), it holds that
\[Pr[\mathcal {M}(x) \in \mathcal {S}] \le e^{\epsilon }\, Pr[\mathcal {M}(y) \in \mathcal {S}], \quad (1)\]
where
\(Pr[]\) denotes the probability and
\(\epsilon\) is the privacy risk (also referred to as the privacy budget). Simply put, a single datapoint (usually corresponding to an individual) has only a limited impact on the output of the analysis.
\(\epsilon\) quantifies the maximum amount of information the output of the algorithm can disclose about the datapoint. Therefore, a lower
\(\epsilon\) results in stronger privacy.
DP is achieved by computing the sensitivity of the algorithm in question to the absence/presence of a single datapoint and adding random noise calibrated accordingly. The most common noise distributions used for DP are the Laplace, Gaussian, and exponential distributions. The amount of added noise relative to the sensitivity determines the privacy risk.
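To make the interplay of sensitivity, noise scale, and \(\epsilon\) concrete, the following minimal Python sketch implements the Laplace mechanism for a simple counting query; the function and parameter names are illustrative and not taken from any particular library.

```python
import numpy as np

def laplace_mechanism(value, sensitivity, epsilon, rng=None):
    """Release `value` with epsilon-DP by adding Laplace noise.

    The noise scale is sensitivity / epsilon: the larger the possible
    influence of a single record on `value`, or the smaller the privacy
    budget epsilon, the more noise is added.
    """
    rng = rng or np.random.default_rng()
    return value + rng.laplace(loc=0.0, scale=sensitivity / epsilon)

# A counting query has sensitivity 1 (adding or removing one individual
# changes the count by at most 1), so a count of 42 released with
# epsilon = 0.5 receives Laplace noise of scale 2.
noisy_count = laplace_mechanism(42, sensitivity=1.0, epsilon=0.5)
```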
DP is not only quantifiable but also has a number of other benefits. For one, it is independent of both the input data and auxiliary information such as publicly available data or future knowledge. Moreover, it is immune to post-processing and is composable—that is, the sequential composition of two \(\epsilon\)-differentially private algorithms satisfies \(2\epsilon\)-DP, and parallel composition on disjoint subsets of the data still satisfies \(\epsilon\)-DP. Advanced composition theorems can prove even tighter overall privacy bounds. Another advantage of DP is its flexibility. The noise can be applied at different points in the dataflow, such as on the input data, on the output of the algorithm, or somewhere in between (e.g., during training of a DL model). It can also be combined with other privacy-enhancing technologies like homomorphic encryption, secure multi-party computation, or federated learning.
While these advantages lead to the widespread acceptance of DP as the gold standard for privacy protection, it also comes with its challenges. First, the unitless and probabilistic privacy parameter
\(\epsilon\) is difficult to interpret, and thus choosing an appropriate value is challenging. The interested reader is referred to the work of Desfontaines [
45] for a list of values used in real-world applications.
Another challenge of DP is its influence on the algorithm’s outputs and, consequently, its properties. The dominant issue is the privacy-utility tradeoff, but DP can also affect the performance, usability, fairness, or robustness of the algorithm.
Additionally, DP is not easy to understand for laypeople. First, it is important to note that DP does not offer perfect privacy. DP provides a specific interpretation of privacy, but depending on the context, other interpretations might be expected or more relevant. Besides, the privacy protection depends both on the privacy parameter
\(\epsilon\) and the unit/granularity of privacy (i.e., on what is considered one record). That is to say, merely stating that an algorithm is DP does not ensure a substantial guarantee. Second, as DP is a mathematical definition that only defines the constraints but does not mandate an implementation, there exist many different algorithms that satisfy DP, from simple statistical analyses to machine learning algorithms. Some assume a central trusted party that has access to all the data (global DP), whereas others apply noise before data aggregation (local DP). Third, there exists a wide range of DP variants and extensions. Between 2008 and 2021 alone, more than 250 new notions were introduced [
46]. They differ, for example, in how they quantify privacy loss, which properties are protected, and what capabilities the attacker is assumed to have. The most common extension is approximate DP (often simply referred to as
\((\epsilon ,\delta)\)-DP), where the algorithm is
\(\epsilon\)-DP with a probability of
\(1-\delta\). The failure probability
\(\delta\) is typically set to less than the inverse of the dataset size. Other common relaxations include
zero-concentrated DP (zCDP) [
51] and Rényi DP [
90].
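As a brief illustration of how the failure probability \(\delta\) enters an actual mechanism, the following sketch applies the classical Gaussian mechanism, a standard way to satisfy \((\epsilon ,\delta)\)-DP for \(0 \lt \epsilon \lt 1\); the function name is ours, and the noise calibration follows the well-known textbook bound.

```python
import numpy as np

def gaussian_mechanism(value, l2_sensitivity, epsilon, delta, rng=None):
    """Release `value` with (epsilon, delta)-DP via the classical
    Gaussian mechanism (valid for 0 < epsilon < 1)."""
    rng = rng or np.random.default_rng()
    # Standard calibration: sigma >= sensitivity * sqrt(2 ln(1.25/delta)) / epsilon.
    sigma = l2_sensitivity * np.sqrt(2.0 * np.log(1.25 / delta)) / epsilon
    return np.asarray(value) + rng.normal(scale=sigma, size=np.shape(value))

# delta is typically chosen below 1/n for a dataset of n records,
# e.g. delta = 1e-5 for roughly 60,000 records.
noisy_mean = gaussian_mechanism([0.3, 0.7], l2_sensitivity=0.1,
                                epsilon=0.8, delta=1e-5)
```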
3.2 Deep Neural Networks and Stochastic Gradient Descent
Deep neural networks consist of connected nodes organized in layers, namely an input layer, multiple hidden layers, and an output layer. Each node applies a set of weights to its inputs and passes the weighted sum through a non-linear activation function. The network can learn to perform different tasks, such as classification on complex data, by minimizing the loss (i.e., the difference between the predictions of the model and the desired outputs). As the objective functions of deep neural networks are often non-convex, it is generally not feasible to solve this optimization problem analytically. Instead, iterative optimization algorithms are applied, most commonly variants of
stochastic gradient descent (SGD): At each timestep
j, the model weights
w are updated according to the gradients of the objective function
\(\mathcal {L}\) computed on a randomly selected subset of the training dataset (=batch)
\(\mathcal {B}\):
\[w_{j+1} = w_j - \frac{\eta }{|\mathcal {B}|} \sum _{x_i \in \mathcal {B}} \nabla _w \mathcal {L}(w_j; x_i, y_i) \quad (2)\]
In Equation (2),
\(x_i\) denotes a record from the training set with the corresponding target output
\(y_i\),
\(\eta\) the learning rate, and
\(|\mathcal {B}|\) the batch size.
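As a concrete illustration of Equation (2), the following minimal numpy sketch performs one minibatch SGD update for a toy linear least-squares model; all names are illustrative.

```python
import numpy as np

def sgd_step(w, batch_x, batch_y, grad_loss, lr):
    """One minibatch SGD update as in Equation (2): average the
    per-example gradients over the batch and step against them."""
    grads = [grad_loss(w, x, y) for x, y in zip(batch_x, batch_y)]
    return w - lr * np.mean(grads, axis=0)

# Toy example: squared loss for a linear model y ~ w . x.
def grad_loss(w, x, y):
    return 2.0 * (np.dot(w, x) - y) * x

w = np.zeros(2)
batch_x = np.array([[1.0, 2.0], [0.5, -1.0]])
batch_y = np.array([3.0, 0.0])
w = sgd_step(w, batch_x, batch_y, grad_loss, lr=0.1)
```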
After training the model (e.g., with SGD), neural networks can be used to make predictions for previously unseen data. Models that perform well on new data are well generalized, whereas models that perform well only on the training data are considered overfitted.
DL models are usually trained with a supervised or an unsupervised learning paradigm. In supervised learning, the training set is labeled, whereas in unsupervised learning, the model learns to identify patterns without access to a ground truth. Apart from the fully connected layers described earlier, neural networks often include other types of architectures and layers. A popular example is the convolutional layer, which uses filters to extract local patterns, such as edges in images.
3.3 Privacy Threats for DL Models
Privacy attacks on DL models target either the training data or the model itself. They can take place in the training or the inference phase. During training, the adversary can not only be a passive observer but can also actively change the training process. Moreover, the attacker can have access to the whole model—that is, its architecture, parameters, gradients, and outputs (white-box access), or only the model’s outputs (black-box access). Typical privacy attacks are presented next.
Membership Inference Attack (MIA) . In a MIA, the adversary attempts to infer whether a specific record was part of the training dataset. This type of attack exists both in the white- and black-box setting, where in the latter case one distinguishes further between access to the prediction vector and access to the label. The first (and still widely used) MIA on machine learning models was proposed by Shokri et al. [
113]. They train shadow models that imitate the target model, and based on their outputs on training and non-training data, a classification model learns to identify which records are members of the training data. The attack performs best on overfitted models [
113,
140].
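The intuition behind such attacks can be illustrated with a deliberately simplified sketch: instead of Shokri et al.’s learned shadow-model classifier, the toy attack below merely thresholds the target model’s confidence on the true label, exploiting the same member/non-member gap caused by overfitting.

```python
import numpy as np

def confidence_threshold_mia(true_label_confidences, threshold=0.9):
    """Toy membership test: flag records on which the target model is
    very confident as likely training members.

    This replaces the learned attack model of shadow-model MIAs with a
    fixed threshold, but relies on the same signal: overfitted models
    behave differently on members than on non-members.
    """
    return np.asarray(true_label_confidences) >= threshold

# The model assigns probability 0.99 to a training record's true label
# but only 0.60 to an unseen record -> only the former is flagged.
guesses = confidence_threshold_mia([0.99, 0.60, 0.95])
```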
Model Inversion Attack . In a model inversion attack, the attacker tries to infer sensitive information about the training data, such as an attribute of a specific datapoint or features that characterize a certain class. For example, they might infer genetic information of a specific patient from a drug dosage prediction model [
58], reconstruct the face of an individual from a face recognition model [
57], or extract private text sequences from a generative language model [
29,
30].
Property Inference Attack . In a property inference attack, the adversary seeks to extract information about the training data that is unrelated to the training task, such as statistical properties of the training dataset. Demonstrations in the literature include, for instance, the proportion of women in the training data of an income prediction model, or the proportion of older faces from an image classification model for gender classification [
59].
Model Extraction Attack . In a model extraction attack, the adversary learns a model that approximates the target model. While this threat only targets the model and not the training data, it can still increase the privacy risk as a successful model extraction can facilitate follow-up attacks like model inversion.
It is important to note that, especially in the case of model inversion and property inference, the terms are not used consistently. For example, De Cristofaro [
43] uses the term
property inference for inferring features of a class, whereas others [
100,
109,
146] (including us) refer to this as a type of model inversion. In addition, there exist specialized privacy attacks tailored to specific settings that are related to, yet distinct from, these attack classes. Notable examples include authorship inference attacks, which are directed at language models to uncover the identities of the individuals who authored (parts of) the training data, and author attribute attacks, which attempt to determine specific characteristics or demographics of the author(s), such as their gender or age.
Furthermore, there are attacks not specific to privacy that could still be a threat to DL models, such as those presented next.
Adversarial Attack . In an adversarial attack, during the inference phase, the attacker manipulates inputs purposefully so that the model behaves in unintended ways. For example, small perturbations to images invisible to the human eye can lead to misclassifications [
65]. In the context of generative language models, a type of attack related to adversarial attacks is prompt injection, where prompts are deliberately crafted to cause the model to generate harmful, misleading, or sensitive information.
Poisoning Attack . In a poisoning attack, the adversary perturbs training samples to manipulate the model, with the goal of reducing its accuracy or triggering specific misclassifications. Poisoning attacks pose a critical threat in environments where data is gathered from uncontrolled sources, such as crowdsourcing. The perturbations can be introduced to the training data itself or to its labels [
118]. Targeted poisoning attacks, also known as backdoor attacks, insert a hidden pattern (known as a trigger) into the model during training, so that it behaves normally on regular inputs but produces incorrect predictions when the trigger is present. For instance, the presence of specific types of glasses can cause misclassifications in a face recognition model [
40].
For a more detailed description, we refer the reader to surveys that focus on privacy attacks [
43,
109,
111]. A selection of attack implementations can, for example, be found in the Adversarial Robustness Toolbox [
96]. The use of unstructured data such as text presents additional challenges—for example, it is more difficult to identify what parts should be considered private. For a comprehensive discussion on privacy concerns related to language models, we refer the reader to the work of Brown et al. [
26].
3.4 DP Algorithms for DL
The most popular algorithm for DP-DL is
differentially private stochastic gradient descent (DP-SGD) by Abadi et al. [
11]. DP-SGD adapts classical SGD by perturbing the gradients with noise. Abadi et al. [11] built upon prior research on gradient perturbation [
21,
114] to tailor it for DL, particularly by proposing a tighter privacy accounting method. As the gradient norms are unbounded, they are clipped to ensure a finite sensitivity before Gaussian noise is added. Equation (2) thus becomes Equation (3):
\[w_{j+1} = w_j - \frac{\eta }{|\mathcal {B}|} \left(\sum _{x_i \in \mathcal {B}} clip\left[\nabla _w \mathcal {L}(w_j; x_i, y_i)\right] + \sigma \chi \right) \quad (3)\]
where
\(clip[]\) denotes the clipping function that clips the per-example gradients so that their norm does not exceed the clipping norm
C,
\(\chi\) is a random vector drawn from a standard Gaussian distribution, and
\(\sigma\) is the standard deviation of the added Gaussian noise.
While the privacy-utility tradeoff inherent to DP still applies (i.e., information is lost), DP-SGD can improve generalization and thus accuracy on the validation dataset.
It is important to note that the per-example gradient clipping in DP-SGD, used to bound each record’s sensitivity, differs from the batch-wise gradient clipping sometimes used to improve stability and convergence of the training process [
144].
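The following numpy sketch shows one DP-SGD update in the spirit of Equation (3) and highlights the per-example nature of the clipping; it is a simplified illustration (names are ours), not the implementation of Abadi et al., and omits the subsampling and privacy accounting.

```python
import numpy as np

def dp_sgd_step(w, batch_x, batch_y, grad_loss, lr, clip_norm, sigma, rng=None):
    """One DP-SGD update: clip each per-example gradient to L2 norm at
    most `clip_norm` (bounding the sensitivity), add Gaussian noise with
    standard deviation `sigma` to the clipped sum, then average."""
    rng = rng or np.random.default_rng()
    clipped = []
    for x, y in zip(batch_x, batch_y):
        g = grad_loss(w, x, y)
        g = g / max(1.0, np.linalg.norm(g) / clip_norm)  # per-example clipping
        clipped.append(g)
    noisy_sum = np.sum(clipped, axis=0) + sigma * rng.normal(size=np.shape(w))
    return w - lr * noisy_sum / len(batch_x)

# In Abadi et al.'s parameterization, sigma corresponds to the noise
# multiplier times the clipping norm C.
```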
Instead of perturbing the gradients, it is also possible to either perturb the model parameters after non-private training or the objective function [
34]. Both of these methods, however, have serious limitations in DL as they rely on strong convexity assumptions that do not hold for deep neural networks. While non-convex objective functions can be approximated with convex polynomial functions [
104,
105], these methods still considerably constrain the model’s learning capabilities [
71].
An alternative to rendering the DL model itself differentially private is the addition of noise at the prediction level [
49]. However, with this method, the privacy budget increases with every prediction made, making it necessary to limit the number of inferences that can be made with the model. For an example of prediction level perturbation in DL, see the work of Ye et al. [
139].
Another popular method for training a DL model in a differentially private manner is PATE (Private Aggregation of Teacher Ensembles) [
98]. PATE is a semi-supervised approach that needs public data. However, in contrast to DP-SGD, PATE can be applied to every machine learning technique. First, several teacher models are trained on separate private datasets. Next, the public data is labeled based on the differentially private aggregation of the predictions of the teacher models. Finally, a student model is trained on the noisily labeled public data, which can be made public afterward.
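A minimal sketch of the noisy-vote aggregation step in PATE is shown below; the Laplace noise scale is purely illustrative, and the actual privacy accounting of Papernot et al. is considerably more refined.

```python
import numpy as np

def pate_noisy_label(teacher_votes, num_classes, gamma, rng=None):
    """Label one public record by noisy-argmax aggregation of the
    teacher models' predicted class labels.

    Laplace noise added to the vote counts hides the influence of any
    single teacher (and hence of its private data shard).
    """
    rng = rng or np.random.default_rng()
    counts = np.bincount(teacher_votes, minlength=num_classes)
    noisy_counts = counts + rng.laplace(scale=1.0 / gamma, size=num_classes)
    return int(np.argmax(noisy_counts))

# Ten teachers vote on a 3-class problem; the noisy majority label is
# then used to train the public student model.
label = pate_noisy_label([0, 0, 1, 0, 2, 0, 0, 1, 0, 0], num_classes=3, gamma=0.5)
```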
7 Beyond Membership and Attribute Inference: Applying DP-DL to Protect against Other Threats
DP is usually applied to protect against re-identification. In the context of DL, this mainly includes MIAs and model inversion attacks. This section reviews cases in which DP can protect against other threats that DL models face. Table
6 lists the discussed threats and literature.
One threat against which DP can be applied, even though it was not originally intended for that purpose, is the model extraction attack. Model extraction is primarily a security and confidentiality issue, but it can also compromise privacy as it facilitates membership and attribute inference.
Zheng et al. [
149,
150] observed that most model extraction attacks infer the decision boundary of the target model via nearby inputs. They introduced the notion of
boundary differential privacy (BDP) and proposed to append a BDP layer to the machine learning model that (1) determines which outputs are close to the decision boundary (a question not straightforward for DL models, as the decision boundary has no closed form) and (2) adds noise to them. Previous input-output pairs are cached to ensure that the same input results in the same noisy output. This method guarantees that the attacker cannot learn the decision boundary with more than a predetermined level of precision (controlled by
\(\epsilon\)).
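A purely schematic Python sketch of this idea follows; the real BDP layer of Zheng et al. uses a principled, \(\epsilon\)-calibrated perturbation and a different boundary test, whereas the snippet below only illustrates the two ingredients described above: perturbing outputs near the decision boundary and caching answers so that repeated queries cannot average the noise away.

```python
import numpy as np

class NoisyBoundaryLayer:
    """Schematic output layer: perturb predictions whose top-two class
    scores are close (i.e., near the decision boundary) and cache the
    released answers so identical queries receive identical outputs."""

    def __init__(self, margin=0.1, noise_scale=0.05, rng=None):
        self.margin = margin
        self.noise_scale = noise_scale
        self.rng = rng or np.random.default_rng()
        self.cache = {}

    def release(self, x, scores):
        key = np.asarray(x).tobytes()
        if key in self.cache:            # same input -> same (noisy) output
            return self.cache[key]
        scores = np.asarray(scores, dtype=float)
        top_two = np.sort(scores)[-2:]
        if top_two[1] - top_two[0] < self.margin:   # near the decision boundary
            scores = scores + self.rng.normal(scale=self.noise_scale,
                                              size=scores.shape)
        self.cache[key] = scores
        return scores
```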
A subsequent study by Yan et al. [
136] showed that without caching, the BDP layer cannot protect against their novel model extraction attack. They proposed an alternative to caching: monitoring the privacy leakage and adapting the privacy budget accordingly.
DP can also be adapted to protect DL models against adversarial attacks. While originally DP is defined on a record level, feature-level DP can achieve robustness against adversarial examples. For example, Lecuyer et al. [
77] proposed PixelDP, a pixel-level DP layer that can be added to any type of DL model. As this approach only guarantees robustness but not DP in the original sense, Phan et al. [
103] developed the method further by combining it with DP-SGD. To deal with the tradeoff between robustness, privacy, and utility, they relied on heterogeneous noise: more noise is added to more vulnerable coordinates.
Another line of research is the relationship between interpretability/explainability and privacy. Even though interpretable/explainable AI methods are not a threat scenario on their own, they can facilitate privacy attacks. Harder et al. [
69] looked at how models can both be interpretable and guarantee DP. Models trained with DP-SGD are not vulnerable to gradient-based interpretable AI methods due to the post-processing property of DP. However, gradient-based methods can only provide local explanations (i.e., about how relevant a specific input is for the model’s decision), but Harder et al. [
69] were specifically interested in methods that can also give global explanations (i.e., about how the model works overall). They introduced DP-LLM (differentially private locally linear maps), which can approximate DL models and are inherently interpretable.
Chen et al. [
38] investigated the privacy risk of machine unlearning. Machine unlearning [
28,
125] is the process of removing the impact one or more datapoints have on a trained model. Common methods are retraining from scratch and SISA (Sharded, Isolated, Sliced, and Aggregated) [
25], where the original model consists of
k models each trained on a subset of the training set and therefore only one submodel is retrained for unlearning. While the main idea is to be able to comply with privacy regulations like the Right to Be Forgotten in the European General Data Protection Regulation (GDPR), the unlearning can disclose additional information about the removed datapoint(s). Chen et al. [
38] proposed a new black-box MIA that exploits both the original and the unlearned model. Their attack was more powerful than classical MIAs, and also worked for well-generalized models, when several datapoints were removed, when the attacker missed several intermediate unlearned models, and when the model was updated with new inputs. They showed that DP-SGD is an effective defense against the privacy risk of machine unlearning.
Instead of protecting the model directly, DP can also be used to improve the detection of attacks [
48,
137]. This approach can be viewed as a kind of anomaly detection, where the attack scenario is the outlier/novelty. Anomaly detection with DL models (e.g., autoencoders, CNNs) is based on the model’s tendency to underfit on underrepresented subgroups (i.e., the model’s error is expected to be higher for atypical inputs). Training the model with DP amplifies this effect (see Section
6.2). While this leads to negative consequences in the context of fairness and bias, here it can be used to improve the performance of anomaly detection.
Du et al. [
48] applied this approach to crowdsourcing data—that is, data stemming from many individuals. These individuals could launch a backdoor poisoning attack by maliciously adapting their contributed samples. To protect the target model, the poisoned samples need to be identified and removed from the training set. They showed that DP can improve the performance of anomaly detection in this context. Based on the same reasoning, Yang et al. [
137] proposed Griffin, a network intrusion detection system.
8 Improving the Privacy-Utility Tradeoff of DP-DL
The main challenge of DP-DL is that setting meaningful privacy guarantees often strongly deteriorates utility. In recent years, many improvements to DP-DL (mostly DP-SGD) have been proposed that can increase accuracy at the same privacy level. Table
7 gives an overview of the proposed approaches. An alternative method would be to provide tighter theoretical bounds for the privacy loss without influencing the learning algorithm. An example of such an approach evaluated on DL can be found in the work of Ding et al. [
47]. However, as we already established in Section
6, this line of research seems to have reached its limit.
One approach to increase the accuracy of DP-SGD at the same privacy is to adapt the architecture of the DL model to better suit differentially private learning. Papernot et al. [
99] observed that rendering SGD differentially private as proposed by Abadi et al. [
11] leads to exploding gradients. The larger the gradients, the more information is lost during clipping, which in turn hurts the model’s accuracy. To mitigate this effect, Papernot et al. [
99] proposed to use bounded activation functions instead of the unbounded ones commonly used in non-private training (e.g., ReLU). They introduced tempered sigmoid activation functions:
\[\phi _{s,T,o}(x) = \frac{s}{1+e^{-Tx}} - o,\]
where the parameter
s controls the scale of the activation, the inverse temperature
T regulates the gradient norms, and
o is the offset (Figure
2). The setting [
\(s=2\),
\(T=2\),
\(o=1\)] results in the hyperbolic tangent (tanh) function.
Papernot et al. [
99] showed that tempered sigmoids can increase the model’s accuracy. For the MNIST and Fashion-MNIST datasets, the tanh function performed best.
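A minimal numpy sketch of the tempered sigmoid family, together with a check that the default setting reduces to tanh, is given below.

```python
import numpy as np

def tempered_sigmoid(x, s=2.0, T=2.0, o=1.0):
    """Tempered sigmoid activation: s / (1 + exp(-T * x)) - o.

    The scale s bounds the output range, the inverse temperature T
    controls the steepness (and thereby the gradient norms), and o
    shifts the output."""
    return s / (1.0 + np.exp(-T * x)) - o

x = np.linspace(-3.0, 3.0, 7)
assert np.allclose(tempered_sigmoid(x), np.tanh(x))  # s=2, T=2, o=1 is tanh
```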
Another important aspect when tuning DL models for improved performance is hyperparameter selection (e.g., choosing the learning rate). Even though it might be tempting to transfer the choice of hyperparameters from the non-private to the private model, Papernot et al. [
99] showed that not only the model’s architecture but also the hyperparameters should be chosen specifically for the private model rather than reusing what worked well in the non-private setting. As hyperparameter tuning itself can leak additional private information, one should consider private selection of hyperparameters, as, for example, proposed by Liu and Talwar [
83].
Similar to non-private DL, DP-SGD can benefit from feature engineering, additional data, and transfer learning. Tramèr and Boneh [
120] showed that handcrafted features can significantly improve the private model’s utility compared to end-to-end learning. A comparable increase in accuracy for private end-to-end learning can be achieved by using an order of magnitude more training data or by transferring features learned from public data.
While the preceding techniques are already known from non-private DL, DP-SGD introduces two new steps that offer opportunities for improvement: gradient clipping and noise addition.
Chen et al. [
41] found that the bias introduced by gradient clipping can cause convergence issues. They discovered a relationship between the symmetry of the gradient distribution and convergence: symmetric gradient distributions lead to convergence even if a large fraction of gradients are heavily scaled down. Based on this finding, Chen et al. [
41] proposed to introduce additional noise before clipping when the gradient distribution is non-symmetric. It is important to note that this approach may lead to better but slower convergence due to the additional variance, and therefore only improves the privacy-utility tradeoff in specific use cases.
Another observation specific to DP-SGD is that the privacy-utility tradeoff worsens with growing model size. A higher number of model parameters results in a higher gradient norm, meaning that clipping the gradients to the same norm discards more information. If the clipping norm is instead increased, more noise has to be added to achieve the same privacy guarantee. This effect can be mitigated by either reducing the number of model parameters (parameter pruning) [
15,
62] or compressing the gradients (gradient pruning) [
15,
107].
The work by Gondara et al. [
62] is based on the lottery ticket hypothesis [
55,
56], which says that there exist subnetworks in large neural networks that when trained separately achieve comparable accuracy as the full network. The term
lottery ticket refers to the pruned networks and comes from the idea that finding a well-performing subnetwork is like winning the lottery. Gondara et al. [
62] altered the original lottery ticket hypothesis to comply with DP. First, the lottery tickets are created non-privately using a public dataset. Next, the accuracy of each lottery ticket is evaluated on a private validation set, and the best subnetwork is selected while preserving DP via the Exponential Mechanism. Finally, the winning ticket is trained using DP-SGD.
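The private selection step can be illustrated with a generic sketch of the Exponential Mechanism (not Gondara et al.’s exact procedure); candidates with higher validation accuracy are exponentially more likely to be chosen, while the randomness bounds the leakage about the private validation set.

```python
import numpy as np

def exponential_mechanism_select(utilities, epsilon, sensitivity, rng=None):
    """Privately select the index of the candidate (e.g., a lottery
    ticket) with the highest utility, sampling index i with probability
    proportional to exp(epsilon * utilities[i] / (2 * sensitivity))."""
    rng = rng or np.random.default_rng()
    scores = epsilon * np.asarray(utilities, dtype=float) / (2.0 * sensitivity)
    scores -= scores.max()                       # for numerical stability
    probs = np.exp(scores) / np.exp(scores).sum()
    return int(rng.choice(len(probs), p=probs))

# Validation accuracies of three candidate subnetworks; accuracy computed
# on n = 1,000 private records changes by at most 1/n when one record changes.
winner = exponential_mechanism_select([0.81, 0.84, 0.79],
                                      epsilon=1.0, sensitivity=1.0 / 1000)
```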
Adamczewski and Park [
15] proposed DP-SSGD (differentially private sparse stochastic gradient descent), which also relies on model pruning. They experimented with both parameter freezing, where only a subset of the parameters is trained, and parameter selection, where a different subset of parameters is updated in each iteration. The updated parameters were either chosen randomly or based on their magnitude.
Both of these model pruning approaches rely on publicly available data that should be as similar as possible to the private data. In contrast, Phong and Phuong [
107] proposed a gradient pruning method that works without public data. In addition to making the gradients sparse, they use memorization to maintain the direction of the gradient descent.
A further line of research deals with adapting the noise that is added to the differentially private model during training. While the original DP-SGD algorithm adds the same amount of noise to each gradient coordinate independent of the training progress, this line of work adds noise either based on the learning progress (e.g., number of executed epochs) [
142] or based on the coordinates’ impact on the model [
16,
63,
130,
134].
Yu et al. [
142] argue that with training progress and therefore convergence to the local optimum, the model profits more from smaller noise. This “dynamic privacy budget allocation” is similar to the idea behind adaptive learning rates, which is a common technique in non-private learning and can also be applied in private learning (see, e.g., [
133,
134]). Yu et al. [142] compared different variants of dynamic schemes to allocate privacy budgets, including predefined decay schedules like exponential decay, or noise scaling based on the validation accuracy on a public dataset. They showed that all dynamic schemes outperform the uniform noise allocation to a similar extent.
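As a minimal illustration of such a predefined decay schedule (names and parameterization are ours, and the overall privacy cost still has to be accounted across the varying noise levels), consider:

```python
import numpy as np

def exponential_noise_schedule(sigma_start, decay_rate, num_epochs):
    """Noise multipliers for each epoch under exponential decay: start
    with strong noise and reduce it as training converges, so that late
    fine-tuning updates are perturbed less."""
    return [sigma_start * float(np.exp(-decay_rate * t)) for t in range(num_epochs)]

# Noise multipliers for 10 epochs, decaying from 2.0.
schedule = exponential_noise_schedule(sigma_start=2.0, decay_rate=0.1, num_epochs=10)
```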
Xiang et al. [
130] treated the privacy-utility tradeoff as an optimization problem, namely minimizing the accuracy loss while satisfying the privacy constraints. Consequently, less noise is added to those gradient coordinates that have a high impact on the model’s output. While the model’s utility was improved for a range of privacy budgets and model architectures, the method is computationally expensive due to the high dimensionality of the optimization problem.
Xu et al. [
134] advanced the approach further, not only reducing the computational demand but also improving convergence (and therefore decreasing the privacy budget). The improved version called
AdaDP (
adaptive and fast convergent approach to
differentially
private deep learning) replaces the computationally expensive optimization with a heuristic approach to compute the gradient coordinates’ impact on the model’s output. The added noise is not only adaptive with regard to the coordinates’ sensitivity but also decreases with the number of training iterations. Faster convergence is achieved by incorporating an adaptive learning rate that is larger for less frequently updated coordinates.
A related research direction is the usage of explainable AI methods to calibrate the noise. Both Gong et al. [
63] and Adesuyi and Kim [
16] use
layer-wise relevance propagation (LRP) [
91] to determine the importance of the different parameters. Gong et al. [
63] proposed the ADPPL (adaptive differential privacy preserving learning) framework that adds adaptive Laplace noise [
106] to the gradient coordinates according to their relevance. In contrast, the approach by Adesuyi and Kim [
16] is based on loss function perturbation. As DL models have non-convex loss functions, the polynomial approximation of the loss function is computed before adding Laplace noise. LRP is used to classify the parameters either as high or low relevance, adding small and large noise accordingly.
Explainable AI-based approaches can also be applied in the local DP setting—for example, Wang et al. [
127] use feature importance to decide how much noise to add to the training data.
A list of examples for results that the discussed works reported can be found in Table
8. For comparison, the results for the original DP-SGD by Abadi et al. [
11] were included as well. The accuracies reported for the three privacy levels in the original DP-SGD paper clearly show the privacy-utility tradeoff typical of differentially private algorithms. Direct comparison between the methods is difficult due to the differing network architectures, hyperparameters, evaluation datasets, and privacy levels measured according to different DP notions (e.g.,
\(\epsilon\)-DP,
\((\epsilon ,\delta)\)-DP,
\(\rho\)-zCDP).
9 Discussion and Future Directions
This study reviews the latest developments on DP in centralized DL. The main research focuses of the past years were (1) the application of DP to specific domains, (2) differentially private generative models, (3) auditing and evaluation of DP models, (4) applications of DP-DL to protect against threats other than membership and attribute inference, and (5) improvements of the privacy-utility tradeoff. For each subtopic, we provided a comprehensive summary of recent advances. In this last section, we discuss the key points, interconnections, and expected future directions of the respective topics and differentially private centralized DL in general.
Our survey demonstrated, using DL models as an example, how diverse the methods and applications of DP can be. While this flexibility is one of the strengths of DP, it can also hinder broad deployment due to insufficient understanding. The efforts to make DP more accessible to a wider audience and to promote its (correct) application should continue. This includes not only discussions about how to choose the method, the unit of privacy, and the privacy budget but also about how to verify that implementations are correct (see the work of Kifer et al. [
74] and references therein for more information). An example where an implementation error led to privacy issues even though DP was proven theoretically can be found in the work of Tramer et al. [
122].
In addition to the properties inherent to DP that make it hard to understand for laypeople (see Section
3.1), names that are used ambiguously in the research field can add to the confusion. For example, model inversion can refer to inferring (1) the attribute of a single record or (2) the attribute of a class. While the first obviously implies a privacy concern, the second is primarily problematic if a class consists of instances of one individual (e.g., as is the case in face recognition). As a result of this ambiguous meaning, contradicting conclusions emerged about whether DP protects against model inversion attacks. Some (e.g., [
140]) used the first interpretation and concluded that DP naturally also protects against model inversion. Others (e.g., [
146]) used the second interpretation and showed that DP does not always do so. Interestingly, Park et al. [
100] also used the second interpretation but concluded that DP mitigates model inversion attacks. This may be because DP decreases the accuracy of the target model, and, as Zhang et al. [
146] argued, predictive power and vulnerability to model inversion go hand in hand. Future research should take care to define the terms it uses accurately, and, ideally, the research community should agree on a coherent taxonomy.
The increasing awareness of privacy concerns in combination with the many open questions regarding ethical, legal, and methodical aspects make using synthetic data a tempting alternative to applying privacy-enhancing technologies to private data. However, it is important to spread the knowledge that synthetic data alone is not by default privacy preserving [
115]. Additional protection might be necessary. Even though differentially private synthetic data can be a viable solution, future research is needed, for example, to find good ways to evaluate the usefulness of the data.
Auditing and evaluating DL models is a key research topic not only but especially for differentially private models. We expect the trend of novel attacks to continue, analyzing new threat scenarios and improving our understanding of which aspects influence the attack success. An important element of evaluation is the used dataset. Interestingly, many commonly used datasets for auditing DP-DL (e.g., MNIST, CIFAR-10 or CIFAR-100; see Table
5) are not “relevant to the privacy problem” [
42]. There is a need for more realistic benchmark datasets that include private features. It is also noticeable that attack-based evaluation has not been extensively studied on differentially private language models. While there exist many studies proposing privacy attacks on language models (e.g., [
30,
88]) and DP is increasingly applied on language models (see Section
4), the works reviewed in this survey that use attack-based evaluation to compare the resulting empirical lower bound with the upper bound provided by the DP guarantee have focused mostly on tabular and image datasets. Another point to consider regarding evaluation is that, with the rise of continual learning, auditing is not relevant only once before deployment but should be carried out repeatedly whenever the training set changes [
42]. This is also the case when machine unlearning is applied (as discussed in Section
7).
The accuracy disparity between subgroups is one of many biases that are studied in the field of fair AI—with the goal of avoiding discriminatory behavior of machine learning models. Its amplification by DP, the underlying causes, and possible mitigation strategies are actively researched. For example, de Oliviera et al. [
44] suggested that such disparities can be prevented by better hyperparameter selection.
In addition to empirical assessment, theoretical privacy analysis might be able to provide more realistic upper bounds by including additional assumptions (e.g., about the attacker’s capabilities) or features (e.g., the clipping norm or initial randomness [
70]). This could also improve the privacy-utility tradeoff.
Section
7 showed that the concept of DP can be beneficial in diverse threat scenarios. Especially, the connection to robustness and explainability might gain importance through the growing interest in trustworthy AI.
We also anticipate further research on novel strategies or advancement on existing approaches to improve the privacy-utility tradeoff. Some of the mentioned methods could be combined in the future. For example, the differentially private lottery ticket hypothesis approach by Gondara et al. [
62] can be combined with tempered sigmoid activation functions [
99]. Additional effort should be made to compare different methods to identify the best-performing method(s). Simply summarizing the reported results, as we did in Table
8, cannot provide sufficient insight. Fair comparison would require testing the methods on the same model (i.e., same architecture and hyperparameters) with the same evaluation dataset for the same privacy level.
When comparing the different approaches, it is also important to note that some rely on public data [
15,
62,
120,
142]. On the one hand, public data might not be available and therefore prevent the application of those methods. On the other hand, it is debatable whether public availability justifies disregarding all privacy considerations (see other works [
42,
121] for further information).
With the rise of DL in real-world scenarios, new challenges arise. For example, real-world datasets often contain various dependencies whose correct interpretation requires domain knowledge. Causal models [
102] may help capture and model this domain knowledge and additionally improve interpretability [
92], and act as a guide to avoid biases [
87]. However, they may have new implications for privacy. Growing interest is coming not only from the research and industrial communities—the public is also actively engaging in discussions about the impact of AI applications on society. Most recently, large language models like ChatGPT [
2] are in the spotlight, among other things due to privacy concerns. The future will show whether DP will be part of the next generation of DL deployments.
All in all, DP-DL achieved significant progress in recent years, but open questions are still numerous. We expect the interest in the topic to increase further, especially as new standards and legal frameworks arise. On the way to trustworthy AI, we need not only technical innovations but also legal and ethical discussions about what privacy preservation means in the digital age.
10 Conclusion
This survey provided a comprehensive overview of recent trends and developments of DP in centralized DL. Throughout the article, we highlighted the different research focuses of the past years, including auditing and evaluating differentially private models, improving the tradeoff between privacy and utility, applying DP methods to threats beyond membership and attribute inference, generating private synthetic data, and applying DP-DL models to different application fields. A total of six insights have been derived from the literature. First, there is a need for more realistic benchmark datasets with private features. Second, there is a necessity for repeated auditing. Third, more realistic upper privacy bounds would be possible by including additional attack assumptions and model features. Fourth, privacy-utility tradeoffs can be improved by better comparison of existing methods and a possible combination of the best approaches. Fifth, by default, synthetic data is not privacy preserving, and differentially private synthetic data requires more research. Sixth, ambiguously used terms lead to confusion in the research field, and a coherent taxonomy is needed.
In summary, we explored the advancements, remaining challenges, and future prospects of integrating mathematical privacy guarantees into DL models. By shedding light on the current state of the field and emphasizing its potential, it is our hope to inspire further research and real-world applications of DP-DL.