Private linear programming
Ellen Vitercik
Carnegie Mellon University
vitercik@cs.cmu.edu
Abstract
We show how to solve linear programs whose constraints depend on private data.
Existing techniques allow constraint violations whose magnitude is bounded in
terms of the differential privacy parameters $\varepsilon$ and $\delta$. In many applications, however,
the constraints cannot be violated under any circumstances. We demonstrate that
straightforward applications of common differential privacy tools, such as Laplace
noise and the exponential mechanism, are inadequate for guaranteeing that the
constraints are satisfied. We then present a new differentially private mechanism that
takes as input a linear program and releases a solution that satisfies the constraints
and differs from the optimal solution by only a small amount. Empirically, we
show that alternative mechanisms do violate constraints in practice.
1 Introduction
Linear programming (LP) is a fundamental tool in computer science. A diverse array of problems can
be formulated as linear programs, including those from fields such as machine learning, engineering,
manufacturing, and transportation. The past several decades have seen the development of a variety
of linear programming algorithms with provable guarantees, as well as fast commercial solvers.
The goal in linear programming is to find a vector $x$ maximizing an objective function $c^\top x$ subject to
the constraint that $Ax \le b$. The LP formulation encodes data about the specific problem at hand. In
many applications, such as those from the medical domain, this data is defined by individuals’ private
information. Releasing the LP’s solution would thereby leak information about this sensitive data.
As a concrete example, suppose there is a hospital with branches located throughout a state, each
of which has a number of patients with a certain disease. Each branch requires a specific drug to
treat these patients, which it can obtain from a number of different pharmacies. There is a cost to
transporting the drug from any one pharmacy to any one hospital branch. The goal is to determine
which pharmacies should supply which hospitals, such that the total cost is minimized. In Figure 3
in Appendix A, we present the LP formulation of this problem. The LP is defined by sensitive
information: the constraints reveal the number of diseased patients at each branch.
We provide tools with provable guarantees for solving linear programs while preserving differential
privacy (DP) [4]. This problem falls in the category of private optimization, for which there are
multiple algorithms [1, 2, 7] in the unconstrained case. To our knowledge, only Hsu et al. [6]
study differentially private linear programming — by definition, a constrained optimization problem.
They allow their algorithm’s output to violate the constraints, which can be unacceptable in certain
applications. In our transportation example, if the constraints are violated, then some hospital will not
receive the drugs it requires, or some pharmacy will be asked to supply more drugs than it has
in its inventory. The importance of satisfying constraints motivates this paper's central question:
how can we privately solve linear programs while ensuring that no constraint is violated?
Our contributions. Formally, our goal is to privately solve linear programs of the form
$$\max_{x \in \mathbb{R}^n} \left\{ c^\top x : Ax \le b(D) \right\}, \tag{1}$$
where b(D) depends on a private database D. Each database is a set of individuals’ records, each
of which is an element of a domain $\mathcal{X}$. Our algorithm privately maps $b(D)$ to a nearby vector $\bar{b}(D)$
and releases the vector maximizing $c^\top x$ such that $Ax \le \bar{b}(D)$. We ensure that $\bar{b}(D) \le b(D)$, and
therefore our algorithm's output satisfies the constraints $Ax \le b(D)$. This requirement precludes our
use of traditional DP mechanisms: perturbing each component of b(D) using the Laplace, Gaussian,
or exponential mechanisms would not result in a vector that is component-wise smaller than b(D).
We prove that if $x(D)$ is the vector our algorithm outputs and $x^*$ is the optimal solution to the original
LP (Equation (1)), then $c^\top x(D)$ is close to $c^\top x^*$. Our bound depends on the standard differential
privacy parameters $\varepsilon$ and $\delta$. It also depends on the sensitivity $\Delta$ of the private LPs, where $\Delta$ is the
maximum $\ell_\infty$-distance between any two vectors $b(D)$ and $b(D')$ when $D$ and $D'$ are neighboring, in
the sense that $D$ and $D'$ differ on at most one individual's data. Finally, our bound depends on the
"niceness" of the matrix $A$, which we quantify using the condition number $\gamma(A)$ of the LP [8, 9]. We
prove that $\left|c^\top x^* - c^\top x(D)\right| = O\left(\left(1 + \frac{1}{\varepsilon}\right) \|c\|_2\, \gamma(A)\, \Delta \ln \frac{1}{\delta}\right)$.
Figure 1: Densities of $x(D)$ and $x(D')$ for two neighboring databases $D$ and $D'$, illustrated over
their supports. To preserve DP, we ensure that the probability that $x(D)$ is in the interval $(b(D'), b(D)]$ is
at most $\delta$, since this interval is disjoint from the support of $x(D')$.
Fact 3.1. For any database $D$, with probability 1, $b(D) - O\left(\Delta + \frac{\Delta}{\varepsilon}\ln\frac{1}{\delta}\right) \le x(D) \le b(D)$.
Proof sketch. Let $V$ be an arbitrary subset of $\mathbb{R}$ and let $D$ and $D'$ be two neighboring databases.
Without loss of generality, suppose $b(D') \le b(D)$, as in Figure 1. We decompose $V$ into two subsets,
$V \cap (-\infty, b(D')]$ and $V \cap (b(D'), \infty)$. We use the exponential decay of the density function $f_D$ to
show that $\Pr[x(D) \in V \cap (-\infty, b(D')]] \le e^\varepsilon \Pr[x(D') \in V]$. Next, we show that our careful choice
of $s$ ensures that $\Pr[x(D) \in V \cap (b(D'), \infty)] \le \delta$. Therefore, $\Pr[x(D) \in V] \le e^\varepsilon \Pr[x(D') \in V] + \delta$. By a similar argument, $\Pr[x(D') \in V] \le e^\varepsilon \Pr[x(D) \in V] + \delta$, so the theorem holds.
In our approach, we map each vector $b(D)$ to a random variable $\bar{b}(D) \in \mathbb{R}^m$ and release
$$x(D) := \operatorname{argmax}_{x \in \mathbb{R}^n} \left\{ c^\top x : Ax \le \bar{b}(D) \right\}. \tag{2}$$
In essence, each component of b̄(D) is defined by the same distribution we used to perturb the
one-dimensional constraint in Section 3. We must ensure, however, that the resulting LP is feasible.
To formally describe our approach, we use the notation $\Delta = \max_{D \sim D'} \|b(D) - b(D')\|_\infty$ to
denote the constraint's sensitivity. We define $\bar{b}(D)_i = \max\left\{\tilde{b}(D)_i,\, b_i^*\right\}$, where $\tilde{b}(D)_i$ is a random
variable and $b_i^* = \inf_D b(D)_i$. As in Section 3, the density function of $\tilde{b}(D)_i$ is defined over
$[b(D)_i - 2s,\, b(D)_i]$ as $f_D^{(i)}(u) \propto \exp\left(-\frac{\varepsilon\left|u + s - b(D)_i\right|}{\Delta}\right)$, where $s = \frac{\Delta}{\varepsilon}\ln\left(\frac{e^\varepsilon - 1}{2\delta} + 1\right)$.
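To make the perturbation concrete, here is a minimal Python sketch of the mechanism (our own illustration, not the paper's reference implementation). It assumes scipy is available, samples each $\tilde{b}(D)_i$ by rejection from the corresponding untruncated Laplace distribution, and the names private_lp and sample_truncated_laplace are ours:

```python
import numpy as np
from scipy.optimize import linprog

def sample_truncated_laplace(center, scale, half_width, rng):
    """Rejection-sample a Laplace(center, scale) variable truncated to
    [center - half_width, center + half_width]."""
    while True:
        u = rng.laplace(loc=center, scale=scale)
        if abs(u - center) <= half_width:
            return u

def private_lp(c, A, b, b_star, eps, delta, Delta, rng=None):
    """Sketch of the mechanism: perturb each b(D)_i downward, clip at the
    floor b*_i so the LP stays feasible, then solve the perturbed LP."""
    rng = rng or np.random.default_rng()
    s = (Delta / eps) * np.log((np.exp(eps) - 1) / (2 * delta) + 1)
    # b~(D)_i has density proportional to exp(-eps|u + s - b_i|/Delta) on
    # [b_i - 2s, b_i], i.e., a Laplace centered at b_i - s truncated to +/- s.
    b_tilde = np.array(
        [sample_truncated_laplace(bi - s, Delta / eps, s, rng) for bi in b]
    )
    b_bar = np.maximum(b_tilde, b_star)  # bbar(D)_i = max{b~(D)_i, b*_i}
    # linprog minimizes, so negate c to maximize c^T x subject to Ax <= bbar.
    res = linprog(-np.asarray(c), A_ub=A, b_ub=b_bar, bounds=(None, None))
    return res.x
```

Since the truncation discards only the Laplace distribution's two small tails, the rejection loop accepts with high probability.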
Our mechanism is $(\varepsilon m, \delta m)$-DP, a fact that follows immediately from Theorem 3.2 and the composition
and post-processing properties of DP [3, 5]. Our quality guarantee depends on the "niceness" of
the matrix $A$, as quantified by the LP's condition number [8]: given two norms $\|\cdot\|_\beta$ and $\|\cdot\|_\nu$,
$$\gamma_{\beta,\nu}(A) = \sup\left\{ \|u\|_{\beta^*} \;:\; \left\|A^\top u\right\|_{\nu^*} = 1,\ \text{the rows of } A \text{ corresponding to nonzero entries of } u \text{ are linearly independent} \right\}.$$
When $A$ is nonsingular and $\|\cdot\|_\beta$ and $\|\cdot\|_\nu$ equal the $\ell_2$-norm, $\gamma_{\beta,\nu}(A)$ equals the inverse of the
minimum singular value, $\sigma_{\min}(A)^{-1}$. Li [8] proved that $\gamma_{\beta,\nu}(A)$ sharply characterizes the extent to
which a change in the constraint scalars $b$ causes a change in the LP's optimal solution.
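In the nonsingular $\ell_2$ case, this condition number is simply the reciprocal of the smallest singular value; a quick numerical check, assuming numpy:

```python
import numpy as np

A = np.array([[2.0, 0.0],
              [1.0, 3.0]])  # an arbitrary nonsingular constraint matrix
sigma_min = np.linalg.svd(A, compute_uv=False).min()
gamma = 1.0 / sigma_min     # gamma_{2,2}(A) = sigma_min(A)^{-1}
print(gamma)
```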
Figure 2: Comparison of our approach with a baseline based on the Laplace mechanism (Section 5).
Panels: (a) $\varepsilon = 0.25$, (b) $\varepsilon = 0.5$, (c) $\varepsilon = 0.75$.
Theorem 4.2. Suppose Assumption 4.1 holds. With probability 1, the linear program in Equation (2)
is feasible. Let $x^* \in \mathbb{R}^n$ be an optimal solution to the original LP (Equation (1)). Then
$\|x^* - x(D)\|_\nu \le \gamma_{\beta,\nu}(A) \cdot \|b(D) - \bar{b}(D)\|_\beta$.
Proof. We show that $S^* = \left\{x : Ax \le (b_1^*, \ldots, b_m^*)\right\}$, which allows us to prove that Equation (2) is
feasible (Lemmas C.2, C.3). The bound on $\|x^* - x(D)\|_\nu$ follows from Theorem 3.3 of Li [8].
By definition of $\bar{b}(D)$, for any $\ell_p$-norm $\|\cdot\|_p$, we have that $\|b(D) - \bar{b}(D)\|_p = \tilde{O}\left(\sqrt[p]{m}\,\Delta\left(\varepsilon^{-1} + 1\right)\right)$.
Theorem 4.2 therefore implies that $c^\top x^* - c^\top x(D) = \tilde{O}\left(\|c\|_{\nu^*}\,\gamma_{p,\nu}(A)\,\sqrt[p]{m}\,\Delta\left(\varepsilon^{-1} + 1\right)\right)$.
When $A$ is nonsingular, setting $\|\cdot\|_\beta = \|\cdot\|_\nu = \|\cdot\|_2$ implies that $c^\top x^* - c^\top x(D) = \tilde{O}\left(\|c\|_2\,\sigma_{\min}(A)^{-1}\,\sqrt{m}\,\Delta\left(\varepsilon^{-1} + 1\right)\right)$.
A natural question is whether we can achieve pure $(\varepsilon, 0)$-differential privacy. In Appendix C.1, we
prove that if $S^* \neq \emptyset$, then the optimal $(\varepsilon, 0)$-DP mechanism disregards the database $D$ and outputs
$\operatorname{argmax}_{x \in S^*} c^\top x$ with probability 1. If $S^* = \emptyset$, then no $(\varepsilon, 0)$-DP mechanism exists. This shows that
any non-trivial private mechanism must allow for a failure probability $\delta > 0$.
5 Experiments
We return now to the single-dimensional LP from Section 3. We compare our approach to the
following $\varepsilon$-DP baseline: given an offset $t \ge 0$, draw $\eta \sim \text{Laplace}\left(\frac{\Delta}{\varepsilon}\right)$, and release $x_{\varepsilon,t}(D) :=
b(D) - t + \eta$. This approach will violate the LP's constraint, but to what extent? We show that when
both approaches have equal expected error, $x_{\varepsilon,t}(D)$ violates the LP's constraint a non-negligible
fraction of the time, whereas our approach never does.
In Figure 2, we plot the offset $t$ along the x-axis. This offset equals the expected error of the
mechanism $x_{\varepsilon,t}(D)$. For three different values of $\varepsilon$, we draw 10,000 samples from $\text{Laplace}\left(\frac{1}{\varepsilon}\right)$. The
blue line equals the fraction of samples $\eta$ where $\eta > t$. For any such sample and any $b(D)$, we have
that $b(D) - t + \eta > b(D)$, which means the LP's constraint is violated. For three different values
of $\delta$, we compute the expected error of our mechanism, $\frac{1}{\varepsilon}\ln\left(\frac{e^\varepsilon - 1}{2\delta} + 1\right)$. We mark this value on the
x-axis. Next, we answer the question: if we were to set the offset $t$ to this expected error, how often
would the mechanism $x_{\varepsilon,t}$ violate the LP's constraint? We mark this value along the y-axis.
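The following Python sketch reproduces these quantities (a rough illustration under $\Delta = 1$; the specific $\delta$ values are our assumption, since the figure's exact settings are not listed here):

```python
import numpy as np

rng = np.random.default_rng(0)

for eps in (0.25, 0.5, 0.75):
    samples = rng.laplace(scale=1.0 / eps, size=10_000)
    for delta in (0.1, 0.01, 0.001):  # assumed values of delta
        # Expected error of our mechanism with Delta = 1.
        t = (1.0 / eps) * np.log((np.exp(eps) - 1) / (2 * delta) + 1)
        # Fraction of Laplace draws that would violate the constraint
        # when the baseline uses offset t.
        violation_rate = np.mean(samples > t)
        print(f"eps={eps}, delta={delta}: "
              f"offset={t:.3f}, violation rate={violation_rate:.4f}")
```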
As Figure 2 illustrates, the smaller $\varepsilon$ is, the more advantageous our approach. Intuitively, this is
because when $\varepsilon$ is small, $\text{Laplace}\left(\frac{1}{\varepsilon}\right)$ is less concentrated, so $\eta$ will often be greater than $t$.
Meanwhile, the smaller $\delta$ is, the greater the expected error of our mechanism. Therefore, smaller
values of $\delta$ induce fewer constraint violations when we set the offset $t$ equal to this expected error.
6 Conclusion
We presented a new differentially private method for solving linear programs, where the right-hand
side of the constraints Ax ≤ b depends on private data, and where the constraints must always be
satisfied. Natural directions for future research would be to allow the matrix A to also depend on
private data, and to generalize the constraints or objective function from linear to nonlinear functions.
References
[1] Raef Bassily, Adam Smith, and Abhradeep Thakurta. Differentially private empirical risk
minimization: Efficient algorithms and tight error bounds. In Proceedings of the IEEE Symposium
on Foundations of Computer Science (FOCS), 2014.
[2] Kamalika Chaudhuri, Claire Monteleoni, and Anand D Sarwate. Differentially private empirical
risk minimization. Journal of Machine Learning Research, 12(Mar):1069–1109, 2011.
[3] Cynthia Dwork, Krishnaram Kenthapadi, Frank McSherry, Ilya Mironov, and Moni Naor. Our
data, ourselves: Privacy via distributed noise generation. In Annual International Conference
on the Theory and Applications of Cryptographic Techniques (EUROCRYPT), 2006.
[4] Cynthia Dwork, Frank McSherry, Kobbi Nissim, and Adam Smith. Calibrating noise to
sensitivity in private data analysis. In Proceedings of the Theory of Cryptography Conference
(TCC), pages 265–284. Springer, 2006.
[5] Cynthia Dwork, Aaron Roth, et al. The algorithmic foundations of differential privacy.
Foundations and Trends in Theoretical Computer Science, 9(3–4):211–407, 2014.
[6] Justin Hsu, Aaron Roth, Tim Roughgarden, and Jonathan Ullman. Privately solving linear
programs. In Proceedings of the International Colloquium on Automata, Languages and
Programming (ICALP), pages 612–624, 2014.
[7] Daniel Kifer, Adam D. Smith, and Abhradeep Thakurta. Private convex optimization for
empirical risk minimization with applications to high-dimensional regression. In Proceedings
of the Conference on Learning Theory (COLT), pages 25.1–25.40, 2012.
[8] Wu Li. The sharp Lipschitz constants for feasible and optimal solutions of a perturbed linear
program. Linear Algebra and its Applications, 187:15–40, 1993.
[9] Olvi L Mangasarian. A condition number of linear inequalities and equalities. Methods of
Operations Research, 43:3–15, 1981.
[10] Frank McSherry and Kunal Talwar. Mechanism design via differential privacy. In Proceedings
of the IEEE Symposium on Foundations of Computer Science (FOCS), pages 94–103, 2007.
$$\text{minimize } \sum_{i,j} c_{ij} x_{ij} \quad \text{such that } \sum_{j=1}^{N} x_{ij} \le s_i \;\wedge\; \sum_{i=1}^{M} x_{ij} \ge r_j \;\wedge\; x_{ij} \ge 0 \qquad \forall\, i \le M,\ j \le N$$
Figure 3: The transportation problem formulated as a linear program. There are $N$ hospital branches
and $M$ pharmacies. Each branch $j \in \{1, \ldots, N\}$ requires $r_j \in \mathbb{R}$ units of a specific drug. Each
pharmacy $i \in \{1, \ldots, M\}$ has a supply of $s_i \in \mathbb{R}$ units. It costs $c_{ij} \in \mathbb{R}$ dollars to transport a unit
of the drug from pharmacy $i$ to hospital $j$. We use the notation $x_{ij}$ to denote the units of drug
transported from pharmacy $i$ to hospital $j$.
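As a concrete (non-private) illustration, the following sketch solves a small instance of this transportation LP with scipy.optimize.linprog; the supply, demand, and cost numbers are invented for the example:

```python
import numpy as np
from scipy.optimize import linprog

M, N = 2, 3                       # 2 pharmacies, 3 hospital branches
s = np.array([50.0, 40.0])        # pharmacy supplies s_i
r = np.array([20.0, 30.0, 25.0])  # branch demands r_j
c = np.array([[4.0, 6.0, 9.0],    # transport costs c_ij
              [5.0, 3.0, 7.0]])

# Variables x_ij are flattened row-major: index i * N + j.
# Supply rows encode sum_j x_ij <= s_i; demand rows encode
# sum_i x_ij >= r_j, rewritten as -sum_i x_ij <= -r_j.
A_ub = np.zeros((M + N, M * N))
for i in range(M):
    A_ub[i, i * N:(i + 1) * N] = 1.0
for j in range(N):
    A_ub[M + j, j::N] = -1.0
b_ub = np.concatenate([s, -r])

res = linprog(c.ravel(), A_ub=A_ub, b_ub=b_ub, bounds=(0, None))
print(res.x.reshape(M, N))  # optimal shipment plan
print(res.fun)              # minimum total cost
```

In the private version, the right-hand side $(s, -r)$ would be replaced by its perturbed counterpart $\bar{b}(D)$ before solving.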
B Omitted proofs from Section 3

Proof. Let $D$ and $D'$ be a pair of neighboring datasets and let $V \subseteq \mathbb{R}$ be a set of real values. Without
loss of generality, suppose $b(D') \le b(D)$. Since the support of $x(D)$ equals $[b(D) - 2s, b(D)]$, we
know that
$$\Pr[x(D) \in V] = \int_{b(D)-2s}^{b(D')} f_D(u)\,\mathbb{1}_{u \in V}\,du + \int_{b(D')}^{b(D)} f_D(u)\,\mathbb{1}_{u \in V}\,du.$$
Next, since $b(D') \ge b(D) - \Delta$, we know that $\int_{b(D')}^{b(D)} f_D(u)\,\mathbb{1}_{u \in V}\,du \le \int_{b(D)-\Delta}^{b(D)} f_D(u)\,\mathbb{1}_{u \in V}\,du$.
In Lemma B.1, we prove that $\int_{b(D)-\Delta}^{b(D)} f_D(u)\,\mathbb{1}_{u \in V}\,du \le \delta$, which means that
$$\Pr[x(D) \in V] \le \int_{b(D)-2s}^{b(D')} f_D(u)\,\mathbb{1}_{u \in V}\,du + \delta. \tag{3}$$
We now bound the first summand of Equation (3)'s right-hand side, $\int_{b(D)-2s}^{b(D')} f_D(u)\,\mathbb{1}_{u \in V}\,du$, by
$e^\varepsilon \Pr[x(D') \in V]$. By definition of $f_D$,
$$\int_{b(D)-2s}^{b(D')} f_D(u)\,\mathbb{1}_{u \in V}\,du = \frac{1}{Z}\int_{b(D)-2s}^{b(D')} \exp\left(-\frac{\varepsilon\left|u + s - b(D)\right|}{\Delta}\right)\mathbb{1}_{u \in V}\,du = \frac{1}{Z}\int_{b(D)-2s}^{b(D')} \exp\left(-\frac{\varepsilon\left|u + s - b(D') + b(D') - b(D)\right|}{\Delta}\right)\mathbb{1}_{u \in V}\,du.$$
By the reverse triangle inequality, $\int_{b(D)-2s}^{b(D')} f_D(u)\,\mathbb{1}_{u \in V}\,du$ is upper-bounded by
$$\frac{1}{Z}\int_{b(D)-2s}^{b(D')} \exp\left(-\frac{\varepsilon\left(\left|u + s - b(D')\right| - \left|b(D') - b(D)\right|\right)}{\Delta}\right)\mathbb{1}_{u \in V}\,du.$$
Since $\left|b(D') - b(D)\right| \le \Delta$,
$$\int_{b(D)-2s}^{b(D')} f_D(u)\,\mathbb{1}_{u \in V}\,du \le \frac{1}{Z}\int_{b(D)-2s}^{b(D')} \exp\left(-\frac{\varepsilon\left(\left|u + s - b(D')\right| - \Delta\right)}{\Delta}\right)\mathbb{1}_{u \in V}\,du = \frac{e^\varepsilon}{Z}\int_{b(D)-2s}^{b(D')} \exp\left(-\frac{\varepsilon\left|u + s - b(D')\right|}{\Delta}\right)\mathbb{1}_{u \in V}\,du = e^\varepsilon \int_{b(D)-2s}^{b(D')} f_{D'}(u)\,\mathbb{1}_{u \in V}\,du \le e^\varepsilon \Pr[x(D') \in V].$$
This inequality together with Equation (3) implies that $\Pr[x(D) \in V] \le e^\varepsilon \Pr[x(D') \in V] + \delta$.
By a similar argument, we show that $\Pr[x(D') \in V] \le e^\varepsilon \Pr[x(D) \in V] + \delta$:
$$\begin{aligned}
\Pr[x(D') \in V] &= \Pr\left[x(D') \in V \cap [b(D') - 2s,\, b(D) - 2s]\right] + \Pr\left[x(D') \in V \cap [b(D) - 2s,\, b(D')]\right] \\
&= \int_{b(D')-2s}^{b(D)-2s} f_{D'}(u)\,\mathbb{1}_{u \in V}\,du + \int_{b(D)-2s}^{b(D')} f_{D'}(u)\,\mathbb{1}_{u \in V}\,du \\
&\le \int_{b(D')-2s}^{b(D')-2s+\Delta} f_{D'}(u)\,\mathbb{1}_{u \in V}\,du + \int_{b(D)-2s}^{b(D')} f_{D'}(u)\,\mathbb{1}_{u \in V}\,du \\
&\le \delta + \int_{b(D)-2s}^{b(D')} f_{D'}(u)\,\mathbb{1}_{u \in V}\,du \\
&= \delta + \frac{1}{Z}\int_{b(D)-2s}^{b(D')} \exp\left(-\frac{\varepsilon\left|u + s - b(D')\right|}{\Delta}\right)\mathbb{1}_{u \in V}\,du \\
&= \delta + \frac{1}{Z}\int_{b(D)-2s}^{b(D')} \exp\left(-\frac{\varepsilon\left|u + s - b(D) + b(D) - b(D')\right|}{\Delta}\right)\mathbb{1}_{u \in V}\,du \\
&\le \delta + \frac{1}{Z}\int_{b(D)-2s}^{b(D')} \exp\left(-\frac{\varepsilon\left(\left|u + s - b(D)\right| - \left|b(D) - b(D')\right|\right)}{\Delta}\right)\mathbb{1}_{u \in V}\,du \\
&\le \delta + \frac{1}{Z}\int_{b(D)-2s}^{b(D')} \exp\left(-\frac{\varepsilon\left(\left|u + s - b(D)\right| - \Delta\right)}{\Delta}\right)\mathbb{1}_{u \in V}\,du \\
&= \delta + \frac{e^\varepsilon}{Z}\int_{b(D)-2s}^{b(D')} \exp\left(-\frac{\varepsilon\left|u + s - b(D)\right|}{\Delta}\right)\mathbb{1}_{u \in V}\,du \\
&= \delta + e^\varepsilon \int_{b(D)-2s}^{b(D')} f_D(u)\,\mathbb{1}_{u \in V}\,du \\
&\le e^\varepsilon \Pr[x(D) \in V] + \delta.
\end{aligned}$$
We conclude that for any pair of neighboring databases $D$ and $D'$, $\Pr[x(D') \in V] \le e^\varepsilon \Pr[x(D) \in V] + \delta$, so differential privacy holds.
We now prove that the distribution defined by fD has tails with probability mass bounded by δ, a fact
that we use in the privacy guarantee above.
Lemma B.1. The probability mass of $x(D)$ in each of the intervals $[b(D) - 2s,\, b(D) - 2s + \Delta]$ and
$[b(D) - \Delta,\, b(D)]$ is $\delta$. In other words,
$$\int_{b(D)-2s}^{b(D)-2s+\Delta} f_D(u)\,du = \int_{b(D)-\Delta}^{b(D)} f_D(u)\,du = \delta.$$
Proof. Since the density function $f_D$ is symmetric around $b(D) - s$, we prove this lemma by proving
that $\int_{b(D)-\Delta}^{b(D)} f_D(u)\,du = \delta$. By definition,
$$\int_{b(D)-\Delta}^{b(D)} f_D(u)\,du = \frac{1}{Z}\int_{b(D)-\Delta}^{b(D)} \exp\left(-\frac{\varepsilon\left|u + s - b(D)\right|}{\Delta}\right)du.$$
Since $b(D) - \Delta \ge b(D) - s$, we can remove the absolute value from this expression:
$$\int_{b(D)-\Delta}^{b(D)} f_D(u)\,du = \frac{1}{Z}\int_{b(D)-\Delta}^{b(D)} \exp\left(-\frac{\varepsilon\left(u + s - b(D)\right)}{\Delta}\right)du = \frac{\Delta\left(e^\varepsilon - 1\right)e^{-s\varepsilon/\Delta}}{Z\varepsilon}.$$
Since $Z = \frac{2\Delta\left(1 - e^{-s\varepsilon/\Delta}\right)}{\varepsilon}$, we have that
$$\int_{b(D)-\Delta}^{b(D)} f_D(u)\,du = \frac{\left(e^\varepsilon - 1\right)e^{-s\varepsilon/\Delta}}{2\left(1 - e^{-s\varepsilon/\Delta}\right)} = \frac{e^\varepsilon - 1}{2\left(e^{s\varepsilon/\Delta} - 1\right)} = \delta,$$
as claimed.
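As a sanity check (ours, not part of the paper), Lemma B.1 can be verified numerically by integrating the density, assuming scipy:

```python
import numpy as np
from scipy.integrate import quad

eps, delta, Delta, b = 0.5, 0.01, 1.0, 10.0
s = (Delta / eps) * np.log((np.exp(eps) - 1) / (2 * delta) + 1)

f = lambda u: np.exp(-eps * abs(u + s - b) / Delta)  # unnormalized density
Z, _ = quad(f, b - 2 * s, b)     # normalizing constant
tail, _ = quad(f, b - Delta, b)  # mass in [b(D) - Delta, b(D)]
print(tail / Z)                  # prints (approximately) delta = 0.01
```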
C Omitted proofs from Section 4 about multi-dimensional LPs
Proposition C.1. Suppose $D$ and $D'$ are two neighboring databases with disjoint feasible regions:
$\{x : Ax \le b(D)\} \cap \{x : Ax \le b(D')\} = \emptyset$. There is no $(\varepsilon, \delta)$-DP mechanism with $\delta < 1$ that
satisfies the constraints with probability 1.

Proof. For the sake of a contradiction, suppose $\mu : 2^{\mathcal{X}} \to \mathbb{R}^n$ is an $(\varepsilon, \delta)$-DP mechanism with
$\delta < 1$ that satisfies the constraints with probability 1. Let $V = \{x : Ax \le b(D)\}$. Since $V \cap
\{x : Ax \le b(D')\} = \emptyset$, it must be that $\Pr[\mu(D') \in V] = 0$. This means that $1 = \Pr[\mu(D) \in V] \le
e^\varepsilon \Pr[\mu(D') \in V] + \delta = \delta$, which is a contradiction. Therefore, the proposition statement holds.
Lemma C.2. With probability 1, the linear program in Equation (2) is feasible.
We now prove Lemma C.3, which we used in the proof of Lemma C.2. Lemma C.3 guarantees that
the (nonempty) intersection of the feasible regions across all databases is equal to the set of all x
such that Ax ≤ b∗ .
Lemma C.3. The set $\bigcap_{D \subseteq \mathcal{X}} \{x : Ax \le b(D)\}$ is equal to the set $\{x : Ax \le b^*\}$.
Proof. Suppose that $x \in \bigcap_{D \subseteq \mathcal{X}} \{x : Ax \le b(D)\}$. We claim that $Ax \le b^*$. To see why, let $a_i$ be
the $i$th row of the matrix $A$. For a contradiction, suppose that for some row $i \in [m]$, $a_i^\top x > b_i^*$, and
let $\gamma = a_i^\top x - b_i^*$. Since $b_i^* = \inf_{D \subseteq \mathcal{X}} b(D)_i$, there exists a database $D$ such that $b(D)_i < b_i^* + \frac{\gamma}{2}$.
Since $x \in \bigcap_{D \subseteq \mathcal{X}} \{x : Ax \le b(D)\}$, it must be that $a_i^\top x \le b(D)_i < b_i^* + \frac{\gamma}{2} = \frac{1}{2}\left(a_i^\top x + b_i^*\right)$,
which implies $a_i^\top x < b_i^*$, a contradiction. Therefore, $Ax \le b^*$. Conversely, if $Ax \le b^*$, then since
$b^* \le b(D)$ component-wise for every database $D \subseteq \mathcal{X}$, we have $Ax \le b(D)$ for all $D$, so
$x \in \bigcap_{D \subseteq \mathcal{X}} \{x : Ax \le b(D)\}$.
Finally, we prove the claim from Appendix C.1: if $S^* \neq \emptyset$, the optimal $(\varepsilon, 0)$-DP mechanism
disregards the database, and if $S^* = \emptyset$, no such mechanism exists.

Proof. Fix a mechanism, and let $P(D)$ be the set of vectors $x$ in the support of the mechanism's
output given as input the database $D$. We claim that if the mechanism is $(\varepsilon, 0)$-differentially private,
then there exists a set $P^*$ such that $P(D) = P^*$ for all databases $D$. Suppose, for the sake of
a contradiction, that there exist databases $D$ and $D'$ such that $P(D) \neq P(D')$. Let $D_1, \ldots, D_n$
be a sequence of databases such that $D_1 = D$, $D_n = D'$, and each pair of databases $D_i$ and
$D_{i+1}$ are neighbors. Then there must exist a pair of neighboring databases $D_i$ and $D_{i+1}$ such that
$P(D_i) \neq P(D_{i+1})$, which contradicts the fact that the mechanism is $(\varepsilon, 0)$-differentially private:
when $\delta = 0$, neighboring databases must induce output distributions with identical supports.
Therefore, if the mechanism is $(\varepsilon, 0)$-differentially private, then to satisfy the feasibility requirement,
we must have that $P^* \subseteq S^*$. If $S^*$ is empty, then no such mechanism exists. If $S^*$ is nonempty, then
the optimal $(\varepsilon, 0)$-differentially private mechanism outputs $\operatorname{argmax}_{x \in S^*} c^\top x$ with probability 1.