Design and Implementation of Helib: A Homomorphic Encryption Library
1 Introduction
HElib is a C++ open source library (see https://github.com/homenc/HElib) that implements
both the BGV [3] and CKKS [4] fully homomorphic encryption (FHE) schemes. This document
summarizes some of the basic design principles of HElib, and describes some of its fundamental
algorithms and data structures in significant detail. It is a work in progress, and currently focuses
exclusively on the BGV scheme. It is not intended to be an HElib “user manual”. This document
focuses on the design of HElib’s core — we refer the reader to the papers [6], [7], and [8] for more
details on higher-level algorithms in HElib.
• every mth root of unity in F can be written as ω^j for a unique j ∈ Z_m, and
• every primitive mth root of unity in F can be written as ω^j for a unique j ∈ Z_m^*.
As a special case, consider ω := e^{2πi/m} ∈ C, which is a primitive mth root of unity in C. Define
the polynomial
\[ \Phi_m(X) := \prod_{j \in \mathbb{Z}_m^*} (X - \omega^j) \in \mathbb{C}[X]. \]
The polynomial Φ_m(X) is called the mth cyclotomic polynomial. Clearly, Φ_m(X) is monic and
has degree φ(m).
The following are well-known facts:
This formula gives a recursive formulation for Φ_m(X): we have Φ_1(X) = X − 1, and for m > 1, we
have
\[ \Phi_m(X) = \frac{X^m - 1}{\prod_{d \mid m,\ d < m} \Phi_d(X)}. \]
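The recursion above can be turned directly into a small program. The following is a minimal sketch, not HElib code; the function names `divide_exact` and `cyclotomic` are illustrative assumptions. Polynomials are coefficient sequences with the lowest-degree coefficient first.

```python
from functools import lru_cache

def divide_exact(num, den):
    """Exact division of integer polynomials; den must be monic."""
    num = list(num)
    quot = [0] * (len(num) - len(den) + 1)
    for k in range(len(quot) - 1, -1, -1):
        quot[k] = num[k + len(den) - 1]
        for i, c in enumerate(den):
            num[k + i] -= quot[k] * c
    assert not any(num), "division was not exact"
    return quot

@lru_cache(maxsize=None)
def cyclotomic(m):
    """Phi_m(X) = (X^m - 1) / product of Phi_d(X) over proper divisors d of m."""
    poly = [-1] + [0] * (m - 1) + [1]          # X^m - 1
    for d in range(1, m):
        if m % d == 0:
            poly = divide_exact(poly, cyclotomic(d))
    return tuple(poly)
```

For example, `cyclotomic(12)` returns the coefficients of X^4 − X^2 + 1, whose degree is φ(12) = 4.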
The canonical embedding of a ∈ A is the vector obtained by evaluating a at all primitive
mth roots of unity:
\[ \mathrm{canon}(a) := \bigl( a(\omega^j) \bigr)_{j \in \mathbb{Z}_m^*}. \]
We will often use the infinity norm ‖canon(a)‖_∞ as the measure of the “size” of a ∈ A, so we
define
\[ \|a\| := \|\mathrm{canon}(a)\|_\infty. \]
That is,
\[ \|a\| = \max\{ |a(\omega^j)| : j \in \mathbb{Z}_m^* \}, \]
where |a(ω^j)| ∈ R denotes the usual absolute value (or norm) of the complex number a(ω^j). We
call ‖·‖ the canonical norm.
The canonical norm satisfies the usual properties satisfied by any norm:
• ‖ca‖ = |c|‖a‖,
• ‖ab‖ ≤ ‖a‖‖b‖
for all a, b ∈ A. This sub-multiplicativity property is what makes this norm especially convenient
to use.
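As a numerical illustration of the definition, the canonical norm can be computed by evaluating a polynomial at the primitive mth roots of unity. This is a sketch for small m using floating-point arithmetic; the names `canonical_norm` and `poly_mul` are assumptions made for the example. The product is left unreduced, which is harmless here because reducing modulo Φ_m(X) does not change values at primitive mth roots of unity.

```python
import cmath
from math import gcd

def canonical_norm(coeffs, m):
    """max |a(w^k)| over k in Z_m^*, where w = exp(2*pi*i/m)."""
    best = 0.0
    for k in range(1, m + 1):
        if gcd(k, m) == 1:
            w = cmath.exp(2j * cmath.pi * k / m)
            value = sum(c * w**i for i, c in enumerate(coeffs))
            best = max(best, abs(value))
    return best

def poly_mul(a, b):
    """Plain (unreduced) product of two coefficient lists."""
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out
```

With a = [1, 2, 0, −1], b = [3, 0, 1] and m = 5, one can check that `canonical_norm(poly_mul(a, b), 5)` does not exceed the product of the two individual norms.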
The notions of the canonical embedding and corresponding norm apply equally well to elements
of A_Q or A_R, and we shall use the same notation ‖a‖ = ‖canon(a)‖_∞ for a in A_Q or A_R.
For t = 1, . . . , k, let y_t be the image of Y_t in B. Then we have B = Z[y_1, . . . , y_k]. That is, every
element of B can be expressed as g(y_1, . . . , y_k) for some polynomial g(Y_1, . . . , Y_k) ∈ Z[Y_1, . . . , Y_k].
Moreover, every element of B can be expressed uniquely as g(y_1, . . . , y_k) for some polynomial
g(Y_1, . . . , Y_k) ∈ Z[Y_1, . . . , Y_k] where the degree of Y_t is less than φ(m_t) for t = 1, . . . , k. Put another
way,
\[ \bigl( y_1^{i_1} \cdots y_k^{i_k} \bigr)_{i_1 \in [\phi(m_1)], \ldots, i_k \in [\phi(m_k)]} \]
Lemma 1 holds for AQ and AR as well.
2.5.2 Bounding the standard basis infinity norm in terms of the canonical norm
We saw above that we can bound the powerful basis infinity norm in terms of the canonical norm
with a fairly simple explicit formula that gives a reasonably tight bound. It would be nice to get
a similar result for the standard basis infinity norm. Unfortunately, we know of no simple tight
formula. While we are able to compute explicit bounds, these bounds are not nearly as tight as
the bounds for the powerful basis.
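\[ \|\mathrm{std}(a)\|_\infty \le E_m \cdot \|a\|, \]
where E_m is the infinity norm of the inverse Vandermonde matrix, that is,
\[ E_m = \|V_m^{-1}\|_\infty, \quad\text{where}\quad V_m := \bigl( \omega^{ij} \bigr)_{i \in \mathbb{Z}_m^*,\ j \in [\phi(m)]}. \]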
• If m is the product of 2 distinct prime powers, E_m ranges between (roughly) 1.6 and 6.8.
• If m is the product of 3 distinct prime powers, E_m ranges between (roughly) 3.5 and 130.
• If m is the product of 4 distinct prime powers, E_m ranges between (roughly) 18 and 1,900.
• If m is the product of 5 distinct prime powers, E_m ranges between (roughly) 820 and 81,000.
where {a_i}_{i∈I} is a mutually independent family of real-valued random variables, where each a_i
has zero mean and variance σ_i^2. Let ω ∈ C be a primitive mth root of unity, and consider the
complex random variable a(ω). A simple calculation shows that a(ω) has zero mean and variance
σ^2 := Σ_i σ_i^2. Indeed,
\[ \sigma^2 = E\bigl[a(\omega) \cdot \overline{a(\omega)}\bigr] = E\Bigl[\sum_{i,j} a_i a_j \omega^{i-j}\Bigr] = \sum_i E[a_i^2] = \sum_i \sigma_i^2. \]
2.6.1 The circularly-symmetric case
In the above setting, if
• I = [m], or
then we will heuristically model a(ω) as a complex Gaussian with variance σ^2. The heuristic
aspect of this is the fact that we are using the Central Limit Theorem here qualitatively, without
any quantitative error terms. This heuristic is reasonable provided |I| is large. We may also
heuristically model a(ω) as a complex Gaussian with variance σ^2 if I is chosen as a random subset
of [m] (or [φ(m)] if m is a power of two). We require that the complex numbers ω^i are well
distributed around the unit circle.
Now, a complex Gaussian with variance σ^2 has the same distribution as a 2-D Gaussian with
variance σ^2/2. It follows² that for any B > 0, we have the following heuristic tail bound:
Setting
\[ B := \sigma \sqrt{\log(\phi(m))}, \tag{2} \]
we have
\[ \Pr\bigl[\, |a(\omega)| > B \,\bigr] = \frac{1}{\phi(m)}. \]
Furthermore, ‖a‖ > B iff |a(ω)| > B for some primitive mth root of unity ω ∈ C. However, since
the coefficients of a are real, a(ω̄) is the complex conjugate of a(ω), so the two values have the
same absolute value. Thus, to bound the probability that ‖a‖ > B,
we can apply the Union Bound to the φ(m)/2 conjugate pairs of primitive roots, rather than all
φ(m) primitive roots. Therefore, with B defined as in (2), we obtain the heuristic bound
\[ \Pr\bigl[\, \|a\| > B \,\bigr] \le \frac{1}{2}. \tag{3} \]
B/σ    −log_2(erfc(B/(σ√2)))
8      49.5
9      61.9
10     75.8
11     91.1
12     107.8
Again, applying the Union Bound, we have
\[ \Pr\bigl[\, \|a\| > B \,\bigr] \le \mathrm{erfc}\bigl(B/(\sigma\sqrt{2})\bigr) \cdot \phi(m)/2. \tag{5} \]
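The tabulated values and the bound (5) are easy to reproduce numerically. The sketch below uses only the standard `math` module; the helper names `tail_bits` and `norm_tail_bound` are assumptions introduced for this example.

```python
from math import erfc, log2, sqrt

def tail_bits(ratio):
    """-log2(erfc(ratio / sqrt(2))), where ratio = B / sigma."""
    return -log2(erfc(ratio / sqrt(2)))

def norm_tail_bound(ratio, phi_m):
    """Right-hand side of (5): erfc(B / (sigma * sqrt(2))) * phi(m) / 2."""
    return erfc(ratio / sqrt(2)) * phi_m / 2

for ratio in (8, 9, 10, 11, 12):
    print(ratio, round(tail_bits(ratio), 1))
```

For instance, with B/σ = 10 and φ(m) = 2^15, the bound (5) is roughly 2^(−75.8) · 2^14, which is below 2^(−60).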
If M is odd, then the symmetric distribution mod M is simply the uniform distribution
on the set of M integers {−⌊M/2⌋, . . . , +⌊M/2⌋}. If M is even, then the symmetric
distribution mod M assigns probability mass 1/(2M) to the integers ±M/2, while the
integers of magnitude strictly smaller than M/2 are each assigned probability mass
1/M.
Note that for the symmetric distribution mod M , each residue class mod M is equally likely, and the
distribution is symmetric about zero (and in particular, its mean is zero). Instead of the variance
M 2 /12 for the continuous distribution on [−M/2, +M/2], we have:
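\[ \sigma_M^2 \le \frac{M^2}{12} \quad \text{if $M$ is odd,} \]
and
\[ \sigma_M^2 = \frac{M^2}{12} \cdot \Bigl( 1 + \frac{2}{M^2} \Bigr) \quad \text{if $M$ is even.} \]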
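Both variance formulas can be checked exactly for small M. The following sketch builds the symmetric distribution mod M as a dictionary of exact rational probabilities; the function names are illustrative assumptions.

```python
from fractions import Fraction

def symmetric_distribution(M):
    """Return {value: probability} for the symmetric distribution mod M."""
    half = M // 2
    if M % 2 == 1:
        return {v: Fraction(1, M) for v in range(-half, half + 1)}
    dist = {v: Fraction(1, M) for v in range(-half + 1, half)}
    dist[-half] = dist[half] = Fraction(1, 2 * M)   # the two endpoints share one residue class
    return dist

def variance(dist):
    """Variance of a zero-mean distribution given as {value: probability}."""
    return sum(p * v * v for v, p in dist.items())
```

For M = 7 the variance is 4, which is below 49/12, and for M = 8 it is 11/2, which equals (64/12)(1 + 2/64).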
Proof. Let N = ⌊M/2⌋. If M is odd then we have
\[ \sigma_M^2 = \frac{2}{2N+1} \cdot \sum_{i=1}^{N} i^2 \le \frac{M^2}{12}, \]
2.6.3.1 Symmetric reduction mod M
Related to the notion of the symmetric distribution mod M is the notion of symmetric reduction
of an integer a mod M . If M is odd, this is the unique integer b ≡ a (mod M ) in the interval
(−M/2, +M/2). If M is even, then:
• if a ≢ M/2 (mod M), then this is the unique integer b ≡ a (mod M) in the interval
(−M/2, +M/2);
• otherwise, if a ≡ M/2 (mod M ), then this is an integer b chosen uniformly at random from
the set {±M/2}.
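The definition of symmetric reduction can be sketched as follows. The function name `symmetric_reduce` and the optional `rng` argument are assumptions made for the example; the random tie-break applies only when M is even and a ≡ M/2 (mod M).

```python
import random

def symmetric_reduce(a, M, rng=random):
    """Symmetric reduction of the integer a modulo M."""
    r = a % M                      # r lies in [0, M)
    if 2 * r < M:
        return r
    if 2 * r > M:
        return r - M
    # Only reachable when M is even and a = M/2 (mod M): pick +M/2 or -M/2 at random.
    return rng.choice([-M // 2, M // 2])
```

For example, `symmetric_reduce(11, 7)` gives −3, and `symmetric_reduce(4, 8)` returns either 4 or −4.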
\[ \theta_j : A \longrightarrow A, \qquad f(x) \longmapsto f(x^j) \quad (\text{for } f(X) \in \mathbb{Z}[X]). \tag{6} \]
To see why θ_j is well defined, note that over Z[X], Φ_m(X^j) is divisible by Φ_m(X). This
follows from the fact that ω := e^{2πi/m} ∈ C is a primitive mth root of unity, and so is ω^j. Therefore,
ω is a root of Φ_m(X^j). Since Φ_m(X) is the minimal polynomial of ω over Z, it must be the case
that Φ_m(X^j) is divisible by Φ_m(X). This means that if f(x) = g(x), then f(x^j) = g(x^j), and so
the map (6) is well defined. Note that for j, j′ ∈ Z_m^*, we have
\[ \theta_j \circ \theta_{j'} = \theta_{jj'} = \theta_{j'} \circ \theta_j. \]
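To make the map (6) concrete, here is a sketch for the special case where m = 2n is a power of two, so that Φ_m(X) = X^n + 1 and reduction is just a sign flip. The function name and the restriction to this case are assumptions made to keep the example short; this is not how HElib represents ring elements.

```python
def apply_automorphism(coeffs, j, n):
    """Apply f(x) -> f(x^j) in Z[X]/(X^n + 1), with m = 2n a power of two and j odd."""
    out = [0] * n
    for i, c in enumerate(coeffs):
        e = (i * j) % (2 * n)      # x has multiplicative order 2n
        if e >= n:                 # x^n = -1
            out[e - n] -= c
        else:
            out[e] += c
    return out
```

With n = 4 (m = 8), applying θ_5 and then θ_3 gives the same result as applying θ_7, since 3 · 5 ≡ 7 (mod 8).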
• The polynomial Φm (X) ∈ Zp [X] factors as
where the polynomials Fi (X) ∈ Zp [X] are distinct irreducible polynomials, each of the same
degree d; in particular, φ(m) = nd.
• The value d is the multiplicative order of p modulo m; that is, d is the smallest positive integer
such that p^d ≡ 1 (mod m).
Another way to look at A is as follows. Let E = Z_p[X]/(F_1(X)). The choice of the polynomial
F_1(X) from among the irreducible factors of Φ_m(X) is quite arbitrary. Let η := [X mod F_1(X)] ∈
E. Now, since F_1(X) is irreducible, it follows that E is a field; it is a finite field of cardinality
p^d. We also have E = Z_p[η], which means that every element of E can be expressed as f(η) for
some f(X) ∈ Z_p[X]. We naturally view Z_p as a subfield of E. By definition, η is a root of F_1(X).
It is also a well-known fact that η ∈ E is a primitive mth root of unity.
Now consider the group Z_m^*. It is a well-known fact that the polynomial Φ_m(X) has φ(m) roots
in E, namely, η^j for j ∈ Z_m^*. Thus, these φ(m) roots must be partitioned among the irreducible
factors of Φ_m(X), so that each irreducible factor F_i(X) has d roots in E.
We can say a bit more about how these roots are partitioned among these factors. Consider
the subgroup H of Z_m^* generated by p̄ := [p mod m] ∈ Z_m^*. This subgroup consists of the d distinct
elements 1̄, p̄, . . . , p̄^{d−1}. We can form the quotient group Z_m^*/H, which consists of n = φ(m)/d
distinct cosets. Each such coset is of the form
\[ kH = \{ kh : h \in H \} \subseteq \mathbb{Z}_m^* \]
for some k ∈ Z_m^*. Such a k is called a representative of the coset. Any other element of a coset can
also act as a representative of the same coset.
Now suppose we choose one representative from each coset, obtaining a complete set of
representatives k_1, . . . , k_n ∈ Z_m^* for the cosets of H in Z_m^*. Then the cosets of H in Z_m^* are
k_1H, . . . , k_nH.
• For any set of representatives k_1, . . . , k_n ∈ Z_m^* of H in Z_m^*, we can order them in such a way
that for i = 1, . . . , n, the polynomial F_i(X) has d roots in E, namely, η^k for k ∈ k_iH.
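The quantities d, H, and the cosets k_iH can be enumerated by brute force for small parameters. The sketch below assumes gcd(p, m) = 1; the function name `slot_structure` is an illustrative assumption, and it simply picks the smallest element of each coset as its representative.

```python
from math import gcd

def slot_structure(m, p):
    """Return (d, cosets), where cosets is a list of (representative, sorted coset)."""
    H, h = [], 1
    while True:                      # H = {1, p, p^2, ...} in Z_m^*
        H.append(h)
        h = (h * p) % m
        if h == 1:
            break
    d = len(H)                       # multiplicative order of p modulo m
    seen, cosets = set(), []
    for k in range(1, m):
        if gcd(k, m) == 1 and k not in seen:
            coset = sorted((k * x) % m for x in H)
            cosets.append((k, coset))
            seen.update(coset)
    return d, cosets
```

For m = 31 and p = 2 this gives d = 5 and n = 6 cosets, so φ(31) = 30 = nd.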
\[ \mathbb{Z}_p[X]/(F_i(X)) \longrightarrow E, \qquad [f(X) \bmod F_i(X)] \longmapsto f(\eta^{k_i}) \quad (\text{for } f(X) \in \mathbb{Z}_p[X]). \tag{9} \]
3
A B-algebra isomorphism is a ring isomorphism that acts as the identity function on the subring B.
Combining (8) and (9), we obtain a Z_p-algebra isomorphism
\[ A_p \longrightarrow E^n, \qquad f(x) \longmapsto \bigl( f(\eta^{k_1}), \ldots, f(\eta^{k_n}) \bigr) \quad (\text{for } f(X) \in \mathbb{Z}_p[X]). \tag{10} \]
We call E the slot algebra. The isomorphism (10) allows us to perform component-wise
additions and multiplications on vectors in E^n by performing corresponding operations on elements
of A_p. If a ∈ A_p corresponds to (α_1, . . . , α_n) ∈ E^n and b ∈ A_p corresponds to (β_1, . . . , β_n) ∈ E^n,
then a + b corresponds to (α_1 + β_1, . . . , α_n + β_n) ∈ E^n, and a · b corresponds to
(α_1 · β_1, . . . , α_n · β_n) ∈ E^n.
It is also computationally easy to map (in both directions) between a concrete representation
of an element of A_p (represented, say, as a coefficient vector with respect to the standard basis for
A_p over Z_p), and a concrete representation of E^n (where each entry in the vector is represented,
say, as a coefficient vector with respect to the standard basis for E over Z_p).
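The slot-wise behaviour of the isomorphism (10) can be observed in a tiny example. The sketch below assumes m = 5 and p = 11, a deliberately simple case in which d = 1, so E = Z_11, there are n = 4 slots, and η = 3 is a primitive 5th root of unity mod 11. All names are illustrative and unrelated to HElib's API.

```python
P = 11                        # plaintext prime p
ETA = 3                       # 3 has multiplicative order 5 modulo 11
PHI = [1, 1, 1, 1, 1]         # Phi_5(X) = X^4 + X^3 + X^2 + X + 1

def mul_mod_phi(a, b):
    """Multiply in A_p = Z_p[X]/(Phi_5(X)); inputs are length-4 coefficient lists."""
    prod = [0] * 7
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            prod[i + j] = (prod[i + j] + x * y) % P
    for k in range(6, 3, -1):             # cancel X^6, X^5, X^4 using the monic Phi_5
        c = prod[k]
        for i in range(5):
            prod[k - 4 + i] = (prod[k - 4 + i] - c * PHI[i]) % P
    return prod[:4]

def to_slots(a):
    """Evaluate a at eta^k for the coset representatives k = 1, 2, 3, 4."""
    return [sum(c * pow(ETA, k * i, P) for i, c in enumerate(a)) % P
            for k in range(1, 5)]
```

Multiplying two elements with `mul_mod_phi` and then calling `to_slots` gives the same vector as multiplying their slot vectors component-wise modulo 11.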
3.1.2 Rotations on a hypercube
As we have seen, the automorphism θp̄ just applies the Frobenius map to each slot, but does not
induce any data movement between slots. We now discuss how we can use other automorphisms
θj to implement various permutations on the slots.
We start with some simple cases.
If we have the correspondence
\[ a \in A_p \longleftrightarrow (\alpha_0, \ldots, \alpha_{n-1}) \in E^n, \]
then
\[ M_e \cdot \theta_{g^e}(a) \in A_p \longleftrightarrow (\alpha_e, \ldots, \alpha_{n-1}, \underbrace{0, \ldots, 0}_{e \text{ 0's}}) \in E^n \]
and
\[ (1 - M_e) \cdot \theta_{g^{e-n}}(a) \in A_p \longleftrightarrow (\underbrace{0, \ldots, 0}_{n-e \text{ 0's}}, \alpha_0, \ldots, \alpha_{e-1}) \in E^n. \]
Therefore,
\[ M_e \cdot \theta_{g^e}(a) + (1 - M_e) \cdot \theta_{g^{e-n}}(a) \tag{13} \]
yields an element of A_p whose slots are obtained by rotating the slots of a to the left e positions.
Instead of rotating and then masking, as in (13), we can achieve exactly the same effect by masking
and then rotating:
\[ \theta_{g^e}((1 - M_{n-e}) \cdot a) + \theta_{g^{e-n}}(M_{n-e} \cdot a). \tag{14} \]
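The two masking formulas can be simulated on plain slot vectors. The sketch below is a toy model, not HElib code: slots live in Z_101 as a stand-in for E, and `perturb` is a hypothetical stand-in for the power of Frobenius that corrupts slots that wrap around (like Frobenius, it fixes 0). The function `theta` models θ_{g^e} for −n < e < n.

```python
Q = 101                                    # toy slot field Z_101 (stand-in for E)

def perturb(x):
    return (7 * x) % Q                     # stand-in for a power of Frobenius; fixes 0

def theta(e, slots):
    """Cyclic shift by e positions to the left; slots that wrap around are perturbed."""
    n = len(slots)
    out = []
    for i in range(n):
        src = i + e
        value = slots[src % n]
        out.append(value if 0 <= src < n else perturb(value))
    return out

def mask(e, n):
    return [1] * (n - e) + [0] * e         # M_e: ones in the first n - e slots

def rotate_then_mask(a, e):                # formula (13)
    n = len(a)
    m, left, right = mask(e, n), theta(e, a), theta(e - n, a)
    return [(m[i] * left[i] + (1 - m[i]) * right[i]) % Q for i in range(n)]

def mask_then_rotate(a, e):                # formula (14)
    n = len(a)
    m = mask(n - e, n)
    hi = theta(e, [((1 - m[i]) * a[i]) % Q for i in range(n)])
    lo = theta(e - n, [(m[i] * a[i]) % Q for i in range(n)])
    return [(hi[i] + lo[i]) % Q for i in range(n)]
```

In this model, `theta(1, a)` alone is not a true rotation, because the wrapped slot is perturbed, while both combined formulas return the slots of a rotated left by e positions.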
In this case, the effect of θ_{g_2} is to rotate the slots of each row one position to the left. More generally,
for e_2 = 1, . . . , n_2 − 1, applying θ_{g_2^{e_2}} rotates the slots of each row to the left by e_2 positions, and
applying θ_{g_2^{−e_2}} rotates the slots of each row to the right by e_2 positions.
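Suppose we are unlucky, and g_2^{n_2} ≠ 1 but g_2^{n_2} ∈ H. If g_2^{n_2} = p̄^s, then we have:
\[
\theta_{g_2}(f(x)) \in A_p \longleftrightarrow
\begin{pmatrix}
f(\eta^{g_2}) & f(\eta^{g_2^2}) & \cdots & f(\eta^{g_2^{n_2-1}}) & \sigma^s(f(\eta^{1})) \\
f(\eta^{g_1 g_2}) & f(\eta^{g_1 g_2^2}) & \cdots & f(\eta^{g_1 g_2^{n_2-1}}) & \sigma^s(f(\eta^{g_1})) \\
\vdots & \vdots & & \vdots & \vdots \\
f(\eta^{g_1^{n_1-2} g_2}) & f(\eta^{g_1^{n_1-2} g_2^2}) & \cdots & f(\eta^{g_1^{n_1-2} g_2^{n_2-1}}) & \sigma^s(f(\eta^{g_1^{n_1-2}})) \\
f(\eta^{g_1^{n_1-1} g_2}) & f(\eta^{g_1^{n_1-1} g_2^2}) & \cdots & f(\eta^{g_1^{n_1-1} g_2^{n_2-1}}) & \sigma^s(f(\eta^{g_1^{n_1-1}}))
\end{pmatrix}
\in E^{n_1 \times n_2}.
\]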
This is not a true rotation. Rather, applying θ_{g_2} to a ∈ A effectively rotates the slots in each row
of a to the left by one position, and then the slots in the last column are perturbed by powers of
Frobenius. However, if the slots of the first column of a happen to lie in Z_p, this is a true rotation.
More generally, for e_2 = 1, . . . , n_2 − 1, applying θ_{g_2^{e_2}} rotates the slots of each row to the left
by e_2 positions, and then the slots in the last e_2 columns are perturbed by powers of Frobenius;
applying θ_{g_2^{−e_2}} rotates the slots of each row to the right by e_2 positions, and then the slots in the
first e_2 columns are perturbed by powers of Frobenius.
Now suppose we are even more unlucky, and g_2^{n_2} ∉ H. We claim that for every i ∈ [n_1], we
must have g_1^i g_2^{n_2} = g_1^{t_i} · p̄^{s_i} for some t_i ∈ [n_1] and s_i ∈ [d]. To see why, observe that we must have
g_1^i g_2^{n_2} = g_1^{t_i} g_2^{t′_i} · p̄^{s_i} for some t_i ∈ [n_1], t′_i ∈ [n_2], and s_i ∈ [d], since the group elements (15) form
a complete system of representatives for the cosets of H in Z_m^*. Moreover, if we had t′_i ≠ 0, then
g_1^i g_2^{n_2 − t′_i} = g_1^{t_i} · p̄^{s_i}, contradicting the fact that the group elements (15) lie in distinct cosets of H
in Z_m^*. It is also not hard to see that (t_0, . . . , t_{n_1−1}) is a permutation of (0, . . . , n_1 − 1).
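So we have
\[
\theta_{g_2}(f(x)) \in A_p \longleftrightarrow
\begin{pmatrix}
f(\eta^{g_2}) & f(\eta^{g_2^2}) & \cdots & f(\eta^{g_2^{n_2-1}}) & \sigma^{s_0}(f(\eta^{g_1^{t_0}})) \\
f(\eta^{g_1 g_2}) & f(\eta^{g_1 g_2^2}) & \cdots & f(\eta^{g_1 g_2^{n_2-1}}) & \sigma^{s_1}(f(\eta^{g_1^{t_1}})) \\
\vdots & \vdots & & \vdots & \vdots \\
f(\eta^{g_1^{n_1-2} g_2}) & f(\eta^{g_1^{n_1-2} g_2^2}) & \cdots & f(\eta^{g_1^{n_1-2} g_2^{n_2-1}}) & \sigma^{s_{n_1-2}}(f(\eta^{g_1^{t_{n_1-2}}})) \\
f(\eta^{g_1^{n_1-1} g_2}) & f(\eta^{g_1^{n_1-1} g_2^2}) & \cdots & f(\eta^{g_1^{n_1-1} g_2^{n_2-1}}) & \sigma^{s_{n_1-1}}(f(\eta^{g_1^{t_{n_1-1}}}))
\end{pmatrix}
\in E^{n_1 \times n_2}.
\]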
This is not a true rotation. Rather, applying θg2 to a ∈ A effectively rotates the slots in each row
of a to the left by one position, and then the slots in the last column are perturbed by powers of
Frobenius and permuted. However, if the slots of the first column of a happen to be some constant
in Zp , this is a true rotation.
More generally, for e_2 = 1, . . . , n_2 − 1, applying θ_{g_2^{e_2}} rotates the slots of each row to the left by
e_2 positions, and then the slots in the last e_2 columns are perturbed by powers of Frobenius and
permuted; applying θ_{g_2^{−e_2}} rotates the slots of each row to the right by e_2 positions, and then the
slots in the first e_2 columns are permuted and perturbed by powers of Frobenius.
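Just as we did in the one-dimensional case, we can use masking to implement true rotations,
even if g_2^{n_2} ≠ 1. To rotate the slots in each row to the left e_2 positions, we can form a “masking
element” M_{e_2}^{(2)} with the correspondence
\[
M_{e_2}^{(2)} \in A_p \longleftrightarrow
\begin{pmatrix}
1 & \cdots & 1 & 0 & \cdots & 0 \\
1 & \cdots & 1 & 0 & \cdots & 0 \\
\vdots & & \vdots & \vdots & & \vdots \\
\underbrace{1 \ \cdots \ 1}_{n_2 - e_2} & & & \underbrace{0 \ \cdots \ 0}_{e_2} & &
\end{pmatrix}
\in E^{n_1 \times n_2}.
\]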
Then, we can rotate the slots in each row of a ∈ A to the left by e_2 positions by either computing
\[ M_{e_2}^{(2)} \cdot \theta_{g_2^{e_2}}(a) + (1 - M_{e_2}^{(2)}) \cdot \theta_{g_2^{e_2 - n_2}}(a) \tag{16} \]
or
\[ \theta_{g_2^{e_2}}((1 - M_{n_2 - e_2}^{(2)}) \cdot a) + \theta_{g_2^{e_2 - n_2}}(M_{n_2 - e_2}^{(2)} \cdot a). \tag{17} \]
Rotating the slots in each column. Besides rotating the slots in each row by a given amount, we can
also use Galois automorphisms to rotate the slots in each column by a given amount. Specifically,
applying θ_{g_1} to a ∈ A rotates the slots in each column up one position. If g_1^{n_1} = 1, then this results
in a true rotation. Otherwise, this results in a rotation, followed by a Frobenius perturbation and
possibly a permutation of the slots in the last row. Just as we did above, we can combine Galois
automorphisms and masking to implement true rotations, even if g_1^{n_1} ≠ 1. If we define the masking
element M_{e_1}^{(1)} ∈ A_p to have all 1’s in the slots in its first n_1 − e_1 rows and all 0’s in the slots in its
last e_1 rows, then we can rotate the slots in each column up e_1 positions by computing either
\[ M_{e_1}^{(1)} \cdot \theta_{g_1^{e_1}}(a) + (1 - M_{e_1}^{(1)}) \cdot \theta_{g_1^{e_1 - n_1}}(a) \tag{18} \]
or
\[ \theta_{g_1^{e_1}}((1 - M_{n_1 - e_1}^{(1)}) \cdot a) + \theta_{g_1^{e_1 - n_1}}(M_{n_1 - e_1}^{(1)} \cdot a). \tag{19} \]
[Figure: (a) a hypercolumn; (b) a slice]
• If g_i^{n_i} = 1, we say that i is a good dimension. In this case, applying θ_{g_i} to a ∈ A effectively
rotates the slots in each hypercolumn in the ith dimension by one position. One can also say
that it rotates the slices orthogonal to the ith dimension by one position.
• If g_i^{n_i} ≠ 1 but g_i^{n_i} ∈ H, then we say that i is a bad dimension. In this case, applying θ_{g_i} to
a ∈ A effectively rotates the slots in each hypercolumn in the ith dimension by one position,
and then perturbs the slot that wrapped around by a power of Frobenius.
• If g_i^{n_i} ∉ H, then we say that i is a very bad dimension. In this case, applying θ_{g_i} to
a ∈ A effectively rotates the slots in each hypercolumn in the ith dimension by one position,
and then perturbs the slot that wrapped around by a power of Frobenius, and in addition,
permutes the slots within the corresponding slice.
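The three cases above depend only on where g_i^{n_i} lands relative to H, which is simple to test for small parameters. The following sketch assumes gcd(p, m) = 1; the function name `classify_dimension` is hypothetical and this is not the procedure HElib uses to select generators.

```python
def classify_dimension(m, p, g, n):
    """Classify a dimension with generator g and size n as good, bad, or very bad."""
    H, h = set(), 1
    while h not in H:              # H = subgroup of Z_m^* generated by p
        H.add(h)
        h = (h * p) % m
    power = pow(g, n, m)           # g^n in Z_m^*
    if power == 1:
        return "good"
    if power in H:
        return "bad"
    return "very bad"
```

For m = 15 and p = 2 we have H = {1, 2, 4, 8}: the generator 14 with n = 2 is good (14^2 ≡ 1), while the generator 7 with n = 2 is bad (7^2 ≡ 4 ∈ H).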
If i is a bad (or very bad) dimension, we can still implement rotations and masks, analogous to
what we did in the case of one or two dimensions by using the formula
\[ M_{e_i}^{(i)} \cdot \theta_{g_i^{e_i}}(a) + (1 - M_{e_i}^{(i)}) \cdot \theta_{g_i^{e_i - n_i}}(a) \tag{21} \]
or
\[ \theta_{g_i^{e_i}}((1 - M_{n_i - e_i}^{(i)}) \cdot a) + \theta_{g_i^{e_i - n_i}}(M_{n_i - e_i}^{(i)} \cdot a). \tag{22} \]
Here, M_{e_i}^{(i)} denotes the element of A that has 1 in the first n_i − e_i slots of each hypercolumn in the
ith dimension, and 0 in the last e_i slots of each hypercolumn in the ith dimension.
So for each dimension i and each ei ∈ [ni ], we get a permutation on the slots of the hypercube.
The collection of all of these permutations is sharply transitive, which means that for every two
slots, there is a unique permutation that moves the first slot to the second.
Note that it is always possible to choose a set of generators where each generator is either
good or bad, but not very bad. This follows from the Fundamental Theorem of Finite Abelian
Groups, applied to the group Z_m^*/H. See Appendix B for more details on the default procedure
used by HElib to find generators. However, HElib also allows for very bad generators (which
are currently used for bootstrapping).
3.2 Working in subfields of E
Instead of working with the slot algebra E, we can work in any subfield E′ of E. Such a subfield
may be specified by an arbitrary polynomial G(X) ∈ Z_p[X] whose degree d′ divides d, and E′ is
isomorphic to Z_p[X]/(G(X)).
Working in such a subfield does not change at all the algebra for performing intra-slot data
movement. It only affects how data gets encoded and decoded in the slots.
\[ \sigma : E \longrightarrow E, \qquad f(\eta) \longmapsto f(\eta^p) \quad (\text{for } f(X) \in \mathbb{Z}_P[X]). \tag{23} \]
Note that this map sends η to η^p (and not η^P). (Also note that unlike the case when r = 1, it is
not the case that σ(α) = α^p for α ∈ E.)
The Galois automorphism θp̄ still effectively applies the Frobenius automorphism to each slot,
and all of the techniques discussed in Section 3.1.2 still work without any modification.
If r > 1, then unlike in Section 3.2, HElib does not allow one to work in any subring of E other
than E itself and Z_P. (Currently, there are no compelling applications to do so, and the math for
doing so is much more complicated.)
lies in the interval [−q/2, q/2). For every c̄ ∈ A_q, there is a unique c ∈ A such that c̄ = [c mod q]
and c is q-reduced on that basis (this is just the usual division with remainder property over Z).
We call c the canonical representative on that basis of c̄.⁴
In addition to the enciphering family {c̄i }i∈I , an HElib ciphertext ψ holds also some bookkeeping
information:
• the plaintext modulus P, which is a prime power of the form P = p^r (and relatively prime
to both m and q),
Some quantities related to the process of decrypting ψ with the secret key {si }i∈I are:
It is convenient to define the capacity of such a ciphertext as q/ε, where ε is the noise bound; this
intuitively represents how much more noise can be tolerated before all information about the plaintext is lost.
where the coefficients of bi on the standard basis lie in the interval [−Q/2, Q/2].
⁴ The choice of interval [−q/2, q/2) rather than (−q/2, q/2] is quite arbitrary. In fact, in HElib, the ciphertext
modulus q is typically odd, in which case, there is no difference at all.
⁵ The choice of basis is somewhat arbitrary, but in HElib, the powerful basis is used here, rather than the standard
basis, because of the tighter relationship between the canonical norm and the powerful basis infinity norm.
⁶ The choice of Z-basis here is somewhat arbitrary and may change in the future. In fact, HElib mod switches on
the powerful basis in the bootstrapping routine.
For each i ∈ I, we construct d_i ∈ A so that
\[ Q d_i \equiv b_i \pmod{P} \]
and the coefficients of d_i on the standard basis lie in the interval [−P/2, P/2].
• If P is even, then for some coefficients we may have a choice between −P/2 and P/2 (note that
−P/2 ≡ P/2 (mod P)); for such a coefficient, the value chosen has the same sign
as the corresponding coefficient of b_i (or is chosen at random if the corresponding coefficient
of b_i is zero).
The new ciphertext ψ′ consists of the enciphering family {c̄′_i}_{i∈I}, and its correction factor κ′ ∈ Z_P
is set to
\[ \kappa' := [\,(q/Q) \bmod P\,] \cdot \kappa. \]
Note that by construction, we have
It is evident from (25) that qe ≡ Qe′ (mod P). To show that ψ′ decrypts to the same plaintext
as ψ, it suffices to show that e′ is itself q-reduced on the powerful basis. To do that, it suffices to
show that ‖e′‖ is sufficiently small.
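To this end, observe that
\[
e' = \sum_i c'_i s_i - qf
   = \sum_i (a_i + d_i) s_i - qf
   = \sum_i \Bigl( \frac{q}{Q} c_i - b_i/Q + d_i \Bigr) s_i - qf
   = \frac{q}{Q} e + \sum_i (d_i - b_i/Q) s_i
   = \frac{q}{Q} e + \sum_i \hat{e}_i s_i,
\]
where
\[ \hat{e}_i := d_i - b_i/Q \]
for i ∈ I. In particular,
\[ \|e'\| \le \frac{q}{Q} \|e\| + \Bigl\| \sum_i \hat{e}_i s_i \Bigr\|. \tag{26} \]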
We call the first term in (26) the mod-switch scaled noise, and the second term the mod-switch
added noise. Given upper bounds τ_i on ‖s_i‖, we can bound the added noise by
\[ \Bigl\| \sum_i \hat{e}_i s_i \Bigr\| \le \sum_i \|\hat{e}_i\| \tau_i. \tag{27} \]
With D_m as in Lemma 1, if D_m ε′ < q/2, then ‖pwfl(e′)‖_∞ < q/2, and hence e′ is q-reduced on the
powerful basis, as required. It follows that ψ′ decrypts to the same plaintext as ψ, and, moreover,
ε′ is an upper bound on the noise of ψ′.
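The rounding step behind mod switching can be illustrated on a single integer coefficient. The sketch below is a scalar toy under stated assumptions: Q is odd and coprime to P, and b is taken to be the symmetric residue of q·c modulo Q, which is what makes (q·c − b)/Q an integer in the derivation of (26). The function names are illustrative, not HElib's.

```python
def sym(a, M):
    """Representative of a mod M in [-M/2, M/2); unambiguous when M is odd."""
    r = a % M
    return r - M if 2 * r >= M else r

def mod_switch_coeff(c, Q, q, P):
    """Switch one coefficient c from modulus Q to modulus q; returns (c_new, b, d)."""
    b = sym(q * c, Q)                  # assumed: b = q*c (mod Q), |b| < Q/2
    d = sym(b * pow(Q, -1, P), P)      # Q*d = b (mod P), |d| <= P/2
    a = (q * c - b) // Q               # exact division by construction
    return a + d, b, d
```

The new coefficient c′ satisfies Q·c′ ≡ q·c (mod P) and differs from (q/Q)·c by at most 1/2 + P/2, mirroring the term ê_i = d_i − b_i/Q.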
where b′_i = b_i/q and each coefficient of b′_i lies in the interval [−R/2, R/2].
We then have
\[ c'_i = a_i + d_i = \frac{c_i - b'_i}{R} + d_i. \]
Let ν : A → A_q be the natural map from A to A_q (which sends a ∈ A to [a mod q] ∈ A_q), and let
ρ := [R mod q] ∈ Z_q^*. Then we have
Recall that q | Q, and let ν′ : A_Q → A_q be the natural map from A_Q to A_q (which sends [a
mod Q] ∈ A_Q to [a mod q] ∈ A_q). Then we also have
The advantage of this formulation is that we do not have to explicitly compute ai , which allows
for certain optimizations that we shall discuss later.
Assuming that the coefficients of c̄_i are independently and uniformly distributed over Z_Q, we
can also say something about the distributions of the coefficients of c̄′_i and ê_i.
First, consider the distribution of the coefficients of c̄′_i.
• For the settings in Section 4.1.1, it is easy to see that the coefficients of c̄′_i are independently
and uniformly distributed over Z_q. In (30), each coefficient u of ν′(c̄_i) is uniformly distributed
over Z_q. Moreover, the corresponding coefficient v of ν(b′_i + R d_i) depends only on the
corresponding coefficient of b_i, which by the Chinese Remainder Theorem, is independent of u. It
follows that w = (u − v)ρ^{−1} is uniformly distributed over Z_q.
• More generally (and, in particular, in the mod switching that occurs during bootstrapping),
we have c′_i = a_i + d_i. In this case, assuming Q ≫ q, the distribution of each coefficient
u of [a_i mod q] ∈ A_q will be close to the uniform distribution over Z_q. Moreover, assuming
Q/q ≫ P, then conditioned on a fixed value of u, the distribution of the corresponding
coefficient v of d_i is close to the symmetric distribution mod P (see Section 2.6.3). Thus,
u and v can be reasonably modeled as independent random variables. It follows that the
coefficients of c̄′_i are independently distributed, and assuming that Q/q ≫ P, each coefficient
of c̄′_i has a distribution that is close to the uniform distribution over Z_q.
Thus, in either case, we see that the coefficients of c̄′_i are independently distributed; in the first
case, each coefficient is uniformly distributed over Z_q; in the second case, each coefficient has a
distribution that is close to the uniform distribution over Z_q, assuming Q/q ≫ P.
Second, consider the distribution of the coefficients of ê_i. Assume that Q is odd.⁷ Let t =
gcd(Q, q), and set Q̃ := Q/t and q̃ := q/t, so that gcd(Q̃, q̃) = 1.⁸ In (24), we have t | b_i, and
setting b̃_i := b_i/t, we can rewrite (24) as
It follows that each coefficient of b̃_i is symmetrically distributed mod Q̃. From this, it follows that
if Q̃ ≫ P, each coefficient of d_i is close to the symmetric distribution mod P, and that ê_i can be
reasonably modeled by the uniform distribution over [−P/2, P/2].
4.2 Scaling up
The scaling-up operation is in some sense “the opposite of mod switching”, in that it converts a
ciphertext modulo q into another ciphertext modulo a larger modulus Q (which has to be a multiple
of q), with the noise growing by a factor Q/q.
Suppose we have a ciphertext with:
• a plaintext modulus P = p^r,
• a ciphertext modulus q,
Let Q := Rq, where R is a positive integer, not divisible by p.
We can define the scale-by-R map
\[ \mathrm{scale}_R : \mathbb{Z}_q \longrightarrow \mathbb{Z}_Q, \qquad [a \bmod q] \longmapsto [Ra \bmod Q] \quad (\text{for } a \in \mathbb{Z}). \]
This map is well defined. Moreover, it extends naturally to a map from Aq to AQ , applying it
coordinate-wise on any Z-basis for A (the choice does not matter). We can further extend this map
element-wise to families of elements of Aq .
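A small integer sketch shows why the scale-by-R map is well defined: two representatives of the same residue class mod q differ by a multiple of q, so after multiplication by R they differ by a multiple of Q = Rq. The function names below are assumptions made for the example.

```python
def scale_by_R(a, q, R):
    """Image of [a mod q] under the scale-by-R map, as an integer in [0, R*q)."""
    return (R * a) % (R * q)

def sym(a, M):
    """Representative of a mod M in [-M/2, M/2)."""
    r = a % M
    return r - M if 2 * r >= M else r
```

For example, with q = 7 and R = 5, the integers 3 and 31 (both congruent to 3 mod 7) map to the same residue 15 mod 35, and the symmetric representative of the image is R times the symmetric representative of the input.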
Using this map, we can define a new ciphertext with:
• ciphertext modulus Q,
• the enciphering family scale_R(C) which encrypts the same plaintext as the original ciphertext
relative to S using correction factor κ′ := [R mod P] · κ,
The process of key switching, or re-linearization, allows us to compute a new ciphertext (c̄′_0, c̄′_1)
that decrypts to the same plaintext under a different secret key of the form S := (1, s). To do this, we
will need access to so-called key-switching matrices, whose structure is described below.
We shall always ensure that the ciphertext modulus q can be factored as q = ∏_{j=1}^{ℓ} D_j, where
the “digits” D_j are coprime and odd. For j = 1, . . . , ℓ, let D_j^* be the product of all the digits up to
but not including D_j,
\[ D_j^* := D_1 \cdots D_{j-1}. \]
For i ∈ I, let c_i ∈ A be the canonical representative of c̄_i ∈ A_q on the standard basis,⁹ so each
coefficient of c_i on the standard basis lies in the interval [−q/2, q/2).
Recall that S := (1, s), and let T ⊆ I be the set of “trivial” indices i such that s_i ∈ {1, s}.¹⁰ The
indices i ∈ T will be treated in a special, simplified manner (see below). Consider i ∈ I \ T. We
decompose each coefficient of c_i into “digits”, using the mixed-radix system D_1, . . . , D_ℓ, so that
\[ c_i = \sum_{j=1}^{\ell} D_j^* c_{ij}, \]
where each coefficient of c_{ij} ∈ A lies in the interval (−D_j/2, +D_j/2).
⁹ The choice of Z-basis here is somewhat arbitrary.
¹⁰ In the computation, the actual values s_i are never used, but it is enough to know when (by construction) s_i = 1
or s_i = s.
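The mixed-radix decomposition can be sketched for a single integer coefficient. The code below assumes odd digits, as in the text, and peels off one symmetric digit at a time; the names `decompose` and `recompose` are illustrative and not taken from HElib.

```python
def sym_odd(a, D):
    """Symmetric residue of a modulo an odd D, in [-(D-1)/2, (D-1)/2]."""
    r = a % D
    return r - D if r > D // 2 else r

def decompose(c, digits):
    """Write c = sum_j D_j^* * c_j with |c_j| < D_j / 2, for c in [-q/2, q/2)."""
    parts = []
    for D in digits:
        cj = sym_odd(c, D)
        parts.append(cj)
        c = (c - cj) // D              # exact, since c - cj is divisible by D
    assert c == 0, "c was outside [-q/2, q/2)"
    return parts

def recompose(parts, digits):
    total, weight = 0, 1               # weight runs through D_j^* = D_1 * ... * D_{j-1}
    for cj, D in zip(parts, digits):
        total += weight * cj
        weight *= D
    return total
```

With digits (3, 5, 7), so q = 105, the coefficient 52 decomposes as (1, 2, 3), since 1 + 3·2 + 15·3 = 52.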
The key-switching matrix for s_i ↦ s is a 2 × ℓ matrix whose jth column is essentially an
encryption of R D_j^* s_i under s, but with respect to a larger ciphertext modulus of the form Q = Rq,
where R is also odd and coprime to q. More precisely, for j = 1, . . . , ℓ, the jth column consists of
two elements a_{ij}^{(0)}, a_{ij}^{(1)} ∈ A such that
\[ a_{ij}^{(0)} + a_{ij}^{(1)} s \equiv R D_j^* s_i + P e_{ij} \pmod{Q}. \]
Using this key-switching matrix, we can compute (c_i^{(0)}, c_i^{(1)}) ∈ A^2 such that
\[ (c_i^{(0)}, c_i^{(1)}) \equiv \sum_{j=1}^{\ell} \bigl( c_{ij} a_{ij}^{(0)},\ c_{ij} a_{ij}^{(1)} \bigr) \pmod{Q}. \]
If s_i = s, we define
\[ (c_i^{(0)}, c_i^{(1)}) = (0, R c_i). \]
Finally, we compute (c′_0, c′_1) ∈ A^2 such that
\[ (c'_0, c'_1) \equiv \sum_{i \in I} (c_i^{(0)}, c_i^{(1)}) \pmod{Q}. \]
where in the sum over i, j, index i ranges over I \ T. Thus, if (c̄′_0, c̄′_1) is the image of (c′_0, c′_1) in A_Q^2,
and we set the correction factor κ′ := [R mod P] · κ ∈ Z_P^*, we get a ciphertext ψ′ with ciphertext
modulus Q that decrypts to the same thing as the original ciphertext, provided that the noise in
ψ′ is not too large relative to Q. If the noise ‖e‖ in the original ciphertext ψ is bounded by ε, and
we have bounds ε_{ij} on the canonical norms ‖e_{ij}‖, then the noise in ψ′ is bounded by
\[ \epsilon' := R\epsilon + P \sum_{i,j} \epsilon_{ij} \|c_{ij}\|. \tag{31} \]
The first term Rε is the key-switch scaled noise, and the second term P Σ_{i,j} ε_{ij} ‖c_{ij}‖ is the
key-switch added noise. Parameters are typically selected so that the key-switch added noise is
dominated by the key-switch scaled noise. See Section 5.3.4 for more details.
• a ciphertext modulus q_ℓ,
Before we can add these two ciphertexts, we have to adjust them so that the plaintext moduli,
ciphertext moduli, and correction factors match.
1. First, we make the plaintext moduli match by making them both equal to P := gcd(P_1, P_2) =
p^{min(r_1, r_2)}.
2. Second, we make the ciphertext moduli match by making them both equal to Q :=
lcm(Q1 , Q2 ). To do this, we apply the up-scaling procedure in Section 4.2.
3. Third, to make the correction factors match, we choose integers e_1, e_2, relatively prime to P,
such that
\[ [e_1 \bmod P] \cdot \kappa_1 = \kappa = [e_2 \bmod P] \cdot \kappa_2. \]
The values e_1 and e_2 are chosen using a heuristic procedure, based on the extended Euclidean
algorithm, that attempts to make |e_1|ε_1 + |e_2|ε_2 as small as possible.
Specifically, we compute an integer ratio ∈ Z such that
and then run the extended Euclidean algorithm on inputs P and ratio. This generates a list
of pairs of integers (e_1^{(i)}, e_2^{(i)}), where
\[ e_1^{(i)} \equiv e_2^{(i)} \cdot \mathrm{ratio} \pmod{P}, \qquad e_1^{(i)}, e_2^{(i)} \in [-P/2, +P/2], \qquad\text{and}\qquad \gcd(e_1^{(i)}, P) = \gcd(e_2^{(i)}, P) = 1. \]
Then, for ℓ = 1, 2, we replace
• C_ℓ by e_ℓ C_ℓ,
• ε_ℓ by |e_ℓ| ε_ℓ.
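The search for e_1 and e_2 can be sketched with the extended Euclidean algorithm. This is an illustration under assumptions, not HElib's procedure: in particular, it assumes ratio ≡ κ_2 · κ_1^{−1} (mod P), which is one choice that makes [e_1 mod P] · κ_1 = [e_2 mod P] · κ_2 hold whenever e_1 ≡ e_2 · ratio (mod P). The function names and the cost function argument order are hypothetical.

```python
from math import gcd

def sym(a, M):
    """Representative of a mod M in [-M/2, M/2]."""
    r = a % M
    return r - M if 2 * r > M else r

def candidate_pairs(P, ratio):
    """Pairs (e1, e2) with e1 = e2 * ratio (mod P), both coprime to P."""
    pairs = []
    r0, t0, r1, t1 = P, 0, ratio % P, 1        # invariant: r = t * ratio (mod P)
    while r1 != 0:
        e1, e2 = sym(r1, P), sym(t1, P)
        if gcd(e1, P) == 1 and gcd(e2, P) == 1:
            pairs.append((e1, e2))
        quo = r0 // r1
        r0, t0, r1, t1 = r1, t1, r0 - quo * r1, t0 - quo * t1
    return pairs

def match_correction_factors(P, kappa1, kappa2, eps1, eps2):
    """Pick the candidate minimizing |e1|*eps1 + |e2|*eps2."""
    ratio = (kappa2 * pow(kappa1, -1, P)) % P  # assumed definition of ratio
    return min(candidate_pairs(P, ratio),
               key=lambda e: abs(e[0]) * eps1 + abs(e[1]) * eps2)
```

The candidate list is never empty when κ_1 and κ_2 are units, because the first pair examined is (ratio, 1).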
So now both ciphertexts have the same plaintext modulus P , the same ciphertext modulus Q,
and the same correction factor κ. Suppose that
Note that we are assuming that the secret keys for the two ciphertexts are indexed in a consistent
way, so that if two indices are equal, then the components themselves are equal. The secret key for
the resulting ciphertext is the union of the two keys,
\[ C := \{ \bar{c}''_k \}_{k \in K}, \]
where
\[
\bar{c}''_k =
\begin{cases}
\bar{c}_k & \text{for } k \in I \setminus J, \\
\bar{c}'_k & \text{for } k \in J \setminus I, \\
\bar{c}_k + \bar{c}'_k & \text{for } k \in I \cap J.
\end{cases}
\]
The noise bound in the resulting ciphertext is ε := ε_1 + ε_2. The resulting ciphertext decrypts to
the sum of the plaintexts to which the two given ciphertexts decrypt (with respect to the
new plaintext modulus P).
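The case analysis for the combined enciphering family amounts to merging two indexed families and adding the entries whose indices coincide. The sketch below uses integers mod Q as stand-ins for ring elements and string labels as stand-ins for the index sets I and J; both are assumptions made for illustration.

```python
def add_families(C1, C2, Q):
    """Merge two enciphering families given as {index: value mod Q}."""
    merged = {}
    for k in set(C1) | set(C2):
        # An index present in only one family contributes just that component.
        merged[k] = (C1.get(k, 0) + C2.get(k, 0)) % Q
    return merged
```

For example, adding a family indexed by {"1", "s"} to one indexed by {"1", "s", "s^2"} yields a family indexed by the union, with the "s^2" component copied over unchanged.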
• a ciphertext modulus q_ℓ,
Before we can multiply two ciphertexts, we have to adjust them so that the plaintext moduli
and ciphertext moduli match.
1. First, we make the plaintext moduli match by making them both equal to P := gcd(P_1, P_2) =
p^{min(r_1, r_2)}.
2. Second, we make the ciphertext moduli match by applying appropriate upscaling and mod
switching to bring them to a common ciphertext modulus Q. In selecting Q, an attempt is
made to reduce the noise in each ciphertext somewhat. See Section 5.3.2 for details.
So now both ciphertexts have the same plaintext modulus P and the same ciphertext modulus
Q. Suppose that
S1 = {si }i∈I and S2 = {sj }j∈J .
Note that we are assuming that the secret keys for the two ciphertexts are indexed in a consistent
way, so that if two indices are equal, then the components themselves are equal. The secret key for
the resulting ciphertext is
S := {si sj }(i,j)∈I×J .
Now suppose that
C_1 = {c̄_i}_{i∈I} and C_2 = {c̄′_j}_{j∈J}.
The enciphering family of the resulting ciphertext is
The correction factor of the resulting ciphertext is κ := κ_1 κ_2. The noise bound of the resulting
ciphertext is ε := ε_1 ε_2.
Note that if there are known identities among the components s_i s_j of S, then identical secret
key components may be replaced by a single component, and the corresponding components of C
are added together to form a single component.
Example. Suppose the input ciphertexts are both defined with respect to a secret key of the
form (1, s). Let C1 = (c̄_0, c̄_1) and C2 = (c̄'_0, c̄'_1). The product ciphertext is defined with respect
to the secret key (1, s, s²), and its enciphering tuple is (c̄_0 c̄'_0, c̄_0 c̄'_1 + c̄_1 c̄'_0, c̄_1 c̄'_1). After such a
multiplication, if a key-switching matrix for s² ↦ s is available, then the product ciphertext can
be key-switched back to a ciphertext relative to the secret key (1, s).
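The example above can be checked numerically. The following is a minimal sketch, not HElib code: it assumes a toy ring Z_Q[X]/(X^N + 1) with hypothetical parameters N = 8 and Q = 12289, and the helper names (mul, tensor, inner) are invented for illustration. It verifies that the product tuple, evaluated against (1, s, s²), equals the product of the two input tuples evaluated against (1, s).

```python
import random

N, Q = 8, 12289  # hypothetical toy parameters: the ring is Z_Q[X]/(X^N + 1)

def add(a, b):
    return [(x + y) % Q for x, y in zip(a, b)]

def mul(a, b):
    """Multiply in Z_Q[X]/(X^N + 1) (negacyclic convolution)."""
    out = [0] * N
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            sign = 1 if i + j < N else -1      # X^N = -1
            out[(i + j) % N] = (out[(i + j) % N] + sign * x * y) % Q
    return out

def tensor(c, d):
    """Product of two tuples over key (1, s); the result is over key (1, s, s^2)."""
    (c0, c1), (d0, d1) = c, d
    return (mul(c0, d0), add(mul(c0, d1), mul(c1, d0)), mul(c1, d1))

def inner(ct, s):
    """Evaluate sum_i ct[i] * s^i, the quantity that decryption starts from."""
    acc, power = [0] * N, [1] + [0] * (N - 1)
    for comp in ct:
        acc = add(acc, mul(comp, power))
        power = mul(power, s)
    return acc

rng = random.Random(1)
s = [rng.choice((-1, 0, 1)) % Q for _ in range(N)]
c = ([rng.randrange(Q) for _ in range(N)], [rng.randrange(Q) for _ in range(N)])
d = ([rng.randrange(Q) for _ in range(N)], [rng.randrange(Q) for _ in range(N)])

lhs = inner(tensor(c, d), s)            # product tuple against (1, s, s^2)
rhs = mul(inner(c, s), inner(d, s))     # product of the two inner products
```

The identity is purely algebraic, so it holds for arbitrary tuples; noise and correction factors are deliberately left out of this sketch.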
• a plaintext modulus P = p^r,
• a ciphertext modulus q,
Now suppose we want to homomorphically apply a Galois automorphism θ_j, where j ∈ Z*_m (see
Section 2.7), to ψ.
Suppose that C = {c̄_i}_{i∈I} and S = {s_i}_{i∈I}. The enciphering tuple for the resulting ciphertext
ψ' is C' = {θ_j(c̄_i)}_{i∈I}. The secret key for ψ' is S' = {θ_j(s_i)}_{i∈I}. The plaintext modulus, ciphertext
modulus, correction factor, and noise bound for ψ' are the same as for ψ. One can verify that if ψ
decrypts to ē ∈ A_P, then ψ' decrypts to θ_j(ē) ∈ A_P.
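As an illustration of why applying θ_j componentwise is consistent, here is a minimal sketch under simplifying assumptions: m = 16 is a power of two, so Φ_m(X) = X^8 + 1, and coefficients are taken modulo a hypothetical toy modulus Q = 97. The function names are invented for this sketch. It checks that θ_j respects multiplication, so θ_j(c̄_0 + c̄_1 s) = θ_j(c̄_0) + θ_j(c̄_1) θ_j(s).

```python
import random

M = 16          # m, assumed a power of two so that Phi_m(X) = X^(m/2) + 1
N = M // 2      # phi(m)
Q = 97          # hypothetical toy coefficient modulus

def mul(a, b):
    """Multiply in Z_Q[X]/(X^N + 1)."""
    out = [0] * N
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            sign = 1 if i + j < N else -1
            out[(i + j) % N] = (out[(i + j) % N] + sign * x * y) % Q
    return out

def add(a, b):
    return [(x + y) % Q for x, y in zip(a, b)]

def theta(a, j):
    """Apply the automorphism X -> X^j for odd j (that is, j in Z_m^*)."""
    out = [0] * N
    for i, coeff in enumerate(a):
        e = (i * j) % M
        if e < N:
            out[e] = (out[e] + coeff) % Q
        else:                                  # X^e = -X^(e - N) because X^N = -1
            out[e - N] = (out[e - N] - coeff) % Q
    return out

rng = random.Random(3)
c0, c1, s = ([rng.randrange(Q) for _ in range(N)] for _ in range(3))

before = add(c0, mul(c1, s))                          # c0 + c1 * s
after = add(theta(c0, 3), mul(theta(c1, 3), theta(s, 3)))
```

Since 3 · 11 ≡ 1 (mod 16), applying θ_11 after θ_3 returns the original element, which gives a quick sanity check on the index arithmetic.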
4.7 Key generation and encryption
4.7.1 Secret key generation
A generated secret key is always in the canonical form (1, s), where s ∈ A. Recall that A = Z[x]
where x := [X mod Φm (X)] is the image of the indeterminate X in A.
Define
m̂ := φ(m), if m is a power of two, and m̂ := m, otherwise. (32)
The value s ∈ A, with coefficients 0, ±1, is generated using one of two methods, depending on
an application-defined parameter.
where each a_i is chosen at random from the set {±1}. A bound Bsk is computed such that ‖s‖ ≤ Bsk
with probability at least 1/2, and the above procedure for generating s is repeated until ‖s‖ ≤ Bsk.
\[ \alpha := \frac{\phi(m)}{2\hat{m}}. \]
Then
\[ s := \sum_{i \in I} a_i x^i, \]
where each a_i is chosen at random from the set {±1}. A bound Bsk is computed such that ‖s‖ ≤ Bsk
with probability at least 1/2, and the above procedure for generating s is repeated until ‖s‖ ≤ Bsk.
The only difference between the two methods is in the selection of the set of indices I and in
the value of the bound Bsk .
These bounds have also been experimentally validated to ensure that Pr[‖s‖ > Bsk] for a
randomly sampled s is at most roughly 1/2. One advantage of generating
s via this type of “rejection sampling”, rather than just generating a single s, is that we can use a
smaller bound Bsk, rather than a significantly larger high-probability bound. Another
advantage is that we are guaranteed that ‖s‖ ≤ Bsk with probability 1, rather than with high probability.
The disadvantage is that we lose, essentially, one bit of security.
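The rejection-sampling loop can be sketched as follows. This is an illustration under stated assumptions rather than HElib's routine: the index set I is taken to be a uniformly random subset of h positions among the first φ(m) powers of x, the bound is taken to be √(h · ln φ(m)) (one reading of the bound quoted later for τ_i), and the function names are hypothetical.

```python
import cmath
import math
import random

def units(m):
    """Representatives of Z_m^*."""
    return [k for k in range(1, m) if math.gcd(k, m) == 1]

def canonical_norm(coeffs, m):
    """max |a(w^k)| over k in Z_m^*, where w = exp(2*pi*i/m)."""
    best = 0.0
    for k in units(m):
        w = cmath.exp(complex(0.0, 2.0 * math.pi * k / m))
        value = sum(c * w ** i for i, c in enumerate(coeffs))
        best = max(best, abs(value))
    return best

def sample_secret(m, h, bound, rng, max_tries=1000):
    """Resample a weight-h secret with +-1 entries until its norm is within bound."""
    phi = len(units(m))
    for _ in range(max_tries):
        coeffs = [0] * phi
        for i in rng.sample(range(phi), h):    # assumed choice of the index set I
            coeffs[i] = rng.choice((-1, 1))
        if canonical_norm(coeffs, m) <= bound:
            return coeffs
    raise RuntimeError("bound too small for these toy parameters")

m, h = 32, 8
bound = math.sqrt(h * math.log(len(units(m))))   # assumed form of B_sk
s = sample_secret(m, h, bound, random.Random(0))
```

Every returned s satisfies the bound by construction, which is the probability-1 guarantee described above; the price is that an adversary learns that s lies in the accepted (roughly half-size) subset.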
• Second, we generate a random element e∗ ∈ A with small norm and Gaussian coefficients, in
a manner to be described below (see Section 4.7.3).
One can see that c̄*_0 + c̄*_1 s = [P e* mod q]. That is, (c̄*_0, c̄*_1) is an encryption of zero with
respect to the secret key (1, s). We also set
Bpk = P · Bgauss,
where Bgauss is a bound on ‖e*‖ (see Section 4.7.3 below). The bound Bpk is called the public-key
noise bound, and is stored with the public key, along with the specified plaintext modulus P.
A bound Bgauss is computed such that ‖e‖ ≤ Bgauss with probability at least 1/2, and the above
procedure for generating e is repeated until ‖e‖ ≤ Bgauss.
To estimate Bgauss, we use (2) and (3) with σ := √m̂ · σ̂0 (see footnote 11). Based on this, we get the bound
\[ B_{\mathrm{gauss}} = \hat{\sigma}_0 \cdot \sqrt{\hat{m} \log(\phi(m))}. \tag{35} \]
Again, these bounds have also been experimentally validated to ensure that
Pr[‖e‖ > Bgauss] for a randomly sampled e is at most roughly 1/2.
11. Here, we are using a rounded Gaussian distribution, which instead of having a variance of σ̂0², actually has a
variance that is a bit larger, namely ≈ σ̂0² + 1/12. See, for example,
https://mathoverflow.net/questions/178964/estimating-the-variance-of-a-discrete-normal-distribution.
Hopefully, we can find a better reference.
4.7.4 Encryption using the public key
Let (c̄*_0, c̄*_1) be a public key as above, so that c̄*_0, c̄*_1 ∈ A_q and c̄*_0 + c̄*_1 s = [P e* mod q], where P = p^r
is the plaintext modulus associated with the public key. Let Bpk be the public-key noise bound,
which is a bound on ‖P e*‖.
In addition to the public key, the encryption routine takes as input a polynomial a ∈ A,
representing the plaintext, along with a plaintext modulus P' | P (by default, P' = P, but it is
possible to override this default behavior).
The ciphertext ψ produced will have a ciphertext modulus q (the same q used in the public
key). For historical reasons (see Section 5.4.1), the correction factor κ associated with ψ will be
κ := [q mod P'] ∈ Z_{P'}.
As a first step, the input a is replaced by an element b ∈ A such that
• b ≡ q · a (mod P'),
• the coefficients of b on the standard basis are symmetrically reduced mod P' (see
Section 2.6.3.1).
It is assumed that ‖b‖ ≤ Bptxt, where Bptxt is a high-probability bound computed based on
the analysis in Sections 2.6.2 and 2.6.3. Specifically, we estimate the probability that ‖b‖ >
Bptxt assuming (heuristically) that the coefficients of b on the standard basis are symmetrically
distributed mod P'. We apply (5) with σ = φ(m)σ_{P'}, where σ_{P'} is bounded as in Lemma 3, and
estimate the probability that ‖b‖ > Bptxt as in (5). By default, HElib chooses Bptxt = 10σ, so that
this probability is (heuristically) bounded by ≈ 2^−75.8 · φ(m)/2. In the current implementation,
HElib will print a warning if ‖b‖ > Bptxt. Another strategy under consideration is a randomized
sampling and rejection strategy.
Next, the enciphering tuple (c̄0 , c̄1 ) ∈ Aq × Aq of ψ is computed as follows:
where
• each of e0 and e1 is generated with small norm and Gaussian coefficients as in Section 4.7.3,
so that
‖e_i‖ ≤ Bgauss (for i = 0, 1),
with Bgauss as in (35).
We have
c̄_0 + c̄_1 s = [P e* r + P'(e0 + e1 s) + b mod q].
It follows that the noise of ψ is bounded by
4.7.5 Encryption using the secret key
In some applications, the encrypting entity may have access to the secret key (1, s). In this case,
the above encryption procedure can be modified to produce a ciphertext with somewhat less noise.
The first steps of the encryption procedure are identical. The enciphering tuple (c̄0 , c̄1 ) is
computed as follows:
• c̄_0 := b + P'e − s c̄_1,
where e is generated with small norm and Gaussian coefficients as in Section 4.7.3, so that ‖e‖ ≤
Bgauss, with Bgauss as in (35).
We have
c̄_0 + c̄_1 s = [b + P'e mod q].
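The identity above can be exercised on a toy instance. The sketch below is not HElib's implementation: it assumes the ring Z_Q[X]/(X^N + 1) with hypothetical parameters N = 8, Q = 12289, P' = 17, it assumes that c̄_1 is drawn uniformly at random from A_q (the rule for c̄_1 is not shown in the text above), and it uses a rounded Gaussian with a hypothetical standard deviation 3.2 for e. The correction factor is ignored, so the "plaintext" recovered is b itself.

```python
import random

N, Q, P1 = 8, 12289, 17   # hypothetical toy ring Z_Q[X]/(X^N + 1) and plaintext modulus P'

def mul(a, b):
    """Multiply in Z_Q[X]/(X^N + 1)."""
    out = [0] * N
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            sign = 1 if i + j < N else -1
            out[(i + j) % N] = (out[(i + j) % N] + sign * x * y) % Q
    return out

def centered(x, mod):
    """Symmetric reduction of x modulo an odd modulus."""
    x %= mod
    return x - mod if x > mod // 2 else x

rng = random.Random(7)
s = [rng.choice((-1, 0, 1)) for _ in range(N)]                   # secret key component
b = [rng.randrange(-(P1 // 2), P1 // 2 + 1) for _ in range(N)]   # symmetrically reduced mod P'
e = [round(rng.gauss(0.0, 3.2)) for _ in range(N)]               # rounded Gaussian error
c1 = [rng.randrange(Q) for _ in range(N)]                        # assumed: uniform in A_q

s_c1 = mul(s, c1)
c0 = [(b[i] + P1 * e[i] - s_c1[i]) % Q for i in range(N)]        # c0 := b + P'e - s*c1

# Decryption side: c0 + c1*s, reduced symmetrically mod q, then mod P'.
c1_s = mul(c1, s)
phase = [centered(c0[i] + c1_s[i], Q) for i in range(N)]
recovered = [centered(v, P1) for v in phase]
```

Because the public-key term P e* r is absent, the only noise is b + P'e, which is why this variant yields a smaller noise bound than public-key encryption.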
It follows that the noise of the resulting ciphertext ψ is bounded by
where e_ij is generated with small norm and Gaussian coefficients as in Section 4.7.3, so that
‖e_ij‖ ≤ Bgauss, with Bgauss as in (35). Thus, the bounds ε_ij appearing in (31) are all set to Bgauss.
See Section 5.3.4 for more details on the values of q, R, and P.
5.1 Ciphertext prime sets
A ciphertext modulus q is always defined as a product of word-sized primes. On a 64-bit machine,
such a word-sized prime π is typically at most 60 bits, so as to allow for efficient modular arithmetic
modulo π.12
For reasons to be described below (see Section 5.2), we will always choose word-sized primes π
with π ≡ 1 (mod m). In addition, for different reasons, also to be described below (see Section 5.2),
we will attempt to choose π such that π ≡ 1 (mod 2^t) for t as large as possible.
When initialized on a given set of parameters, HElib defines three disjoint sets of word-sized
primes: smallPrimes, normalPrimes, specialPrimes. The set normalPrimes is also ordered:
normalPrimes = {π_1, . . . , π_K}.
S = Ssmall ∪ {π1 , . . . , πk }
or
S = Ssmall ∪ {π1 , . . . , πk } ∪ specialPrimes,
where Ssmall ⊆ smallPrimes and k ∈ {0, . . . , K}. That is, S consists of
We call S a ciphertext prime set. In HElib, every ciphertext carries with it a ciphertext prime
set, which defines a corresponding ciphertext modulus.
Each of the primes in normalPrimes is chosen to be of the same bit length. In contrast, the
primes in smallPrimes have a variety of bit lengths, all of which are smaller than the bit
length of the normalPrimes, and are chosen so that various subsets of smallPrimes can be
used to form ciphertext moduli of a wide variety of bit lengths. This is discussed in more detail
below in Section 5.3.2.2.
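To make the congruence conditions concrete, the following sketch searches for primes π with π ≡ 1 (mod m) and π ≡ 1 (mod 2^t). It is only an illustration: the search strategy, the function names, and the toy sizes (30-bit primes, m = 105, t = 10) are assumptions, and trial division is used only because the numbers are small. HElib's actual prime-selection logic is not described in this section.

```python
import math

def is_prime(n):
    """Trial division; adequate only for the toy sizes used here."""
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2
    d = 3
    while d * d <= n:
        if n % d == 0:
            return False
        d += 2
    return True

def find_primes(m, bits, count, t):
    """Return up to `count` primes p < 2^bits with p = 1 mod m and p = 1 mod 2^t."""
    step = m * 2 ** t // math.gcd(m, 2 ** t)     # lcm(m, 2^t)
    p = (2 ** bits - 2) // step * step + 1       # largest candidate below 2^bits
    found = []
    while p > 1 and len(found) < count:
        if is_prime(p):
            found.append(p)
        p -= step
    return found

primes = find_primes(m=105, bits=30, count=3, t=10)
```

Searching downward from 2^bits keeps all returned primes at (nearly) the same bit length, which mirrors the requirement on normalPrimes; a collection like smallPrimes would call the same routine with several smaller values of bits.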
An element c̄ ∈ Aq can be represented in a number of ways. One natural way is as a vector
over Zq representing the coefficients of c̄ on the standard basis (see Section 2.4). We call this the
standard representation of Aq .
HElib generally uses a different representation of A_q, which we call the Double-CRT
representation, defined as follows. For each π ∈ S, we choose a primitive mth root of unity ω_π ∈ Z_π
(recall that we are assuming that π ≡ 1 (mod m), which guarantees the existence of such an ω_π).
We also have a natural map from Z_q to Z_π, which we can extend to a natural map from A_q to A_π.
For c̄ ∈ A_q and j ∈ Z_m, we define c̄(ω_π^j) ∈ Z_π to be the element in Z_π obtained by evaluating the
image of c̄ in A_π at the value ω_π^j. The Double-CRT representation of c̄ ∈ A_q is defined to be the
collection of values
\[ \bigl\{ \bar{c}(\omega_\pi^j) \bigr\}_{(\pi, j) \in S \times \mathbb{Z}_m^*}. \tag{37} \]
In the Double-CRT representation, elements of Aq can be added and multiplied very efficiently,
indeed, in linear time.
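A small worked instance may help. The sketch below assumes m = 8 (so Φ_m(X) = X^4 + 1) and a hypothetical prime set S = {17, 41, 73}; the function names are invented for illustration and the evaluation is done naively rather than with an FFT. It checks that multiplication in A_q corresponds to componentwise multiplication of Double-CRT values.

```python
import math

M = 8                      # m, a power of two here, so Phi_m(X) = X^4 + 1
N = 4                      # phi(m)
PRIMES = [17, 41, 73]      # hypothetical prime set S; each prime is 1 mod m
UNITS = [j for j in range(1, M) if math.gcd(j, M) == 1]   # Z_m^*
Q = math.prod(PRIMES)

def root_of_unity(p):
    """Smallest primitive m-th root of unity mod p (test is valid for m a power of two)."""
    for w in range(2, p):
        if pow(w, M, p) == 1 and pow(w, M // 2, p) != 1:
            return w
    raise ValueError("no primitive root of unity found")

def to_dcrt(coeffs):
    """Evaluate a standard-basis element at w_p^j for every (p, j) in S x Z_m^*."""
    rep = {}
    for p in PRIMES:
        w = root_of_unity(p)
        for j in UNITS:
            x = pow(w, j, p)
            rep[(p, j)] = sum(c * pow(x, i, p) for i, c in enumerate(coeffs)) % p
    return rep

def dcrt_mul(x, y):
    """Componentwise product: one modular multiplication per stored value."""
    return {key: x[key] * y[key] % key[0] for key in x}

def poly_mul(a, b):
    """Reference multiplication in Z_Q[X]/(X^4 + 1) on the standard basis."""
    out = [0] * N
    for i, u in enumerate(a):
        for j, v in enumerate(b):
            sign = 1 if i + j < N else -1
            out[(i + j) % N] = (out[(i + j) % N] + sign * u * v) % Q
    return out

a, b = [3, 1, 4, 1], [2, 7, 1, 8]
product_rep = dcrt_mul(to_dcrt(a), to_dcrt(b))
```

The standard-basis product costs a quadratic number of coefficient multiplications in this naive form, while dcrt_mul touches each of the |S| · φ(m) stored values once.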
One can convert between the standard and Double-CRT representations. Conversion from
standard to Double-CRT essentially consists of doing the following:
• for each π ∈ S, reduce modulo π the coefficients of c̄ ∈ A_q on the standard basis to get an
element c̄_π ∈ A_π, represented on the standard basis;
is the modulus used to form the public key (see Section 4.7.2) and in the encryption routines (see
Sections 4.7.4 and 4.7.5).
We would like to choose q so that for each ℓ = 1, 2, the first term (q/q_ℓ)ε_ℓ is about the same as
(or perhaps a bit smaller than) the second term B_ams^(ℓ). So the logic attempts to choose a common
ciphertext modulus q such that
\[ \log_2(q) \approx \mathit{target} := \min\bigl( \log_2(B_{\mathrm{ams}}^{(1)}) + \log_2(q_1) - \log_2(\epsilon_1),\ \log_2(B_{\mathrm{ams}}^{(2)}) + \log_2(q_2) - \log_2(\epsilon_2) \bigr). \]
The logic to determine such a q essentially performs a brute-force search among all possible
ciphertext moduli that are the product of normalPrimes and smallPrimes, but not specialPrimes,
as defined in Section 5.1. The number of such moduli is bounded by
(|normalPrimes| + 1) · 2^|smallPrimes|.
The cardinality of smallPrimes is chosen by design to be small enough so that this quantity is
not outrageously large. In addition, for each such modulus q, the quantity log2(q) is pre-computed
and stored in a table (along with the corresponding prime set), and the table is sorted in order of
increasing log2(q). Given the value target, a search for a value of log2(q) is done in a small interval
Typically, there will be several values of log2(q) to choose from, and among these one is chosen that
(heuristically) minimizes the cost of the mod switching operation (and ties are broken in favor of
the largest log2(q) value in the interval). If no such value is found, a log2(q) value is chosen that is
as large as possible while still being bounded from above by target − a.
The current values used in HElib for a and b are a = 4 and b = 1.
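The table-driven search can be sketched as follows. Two points are assumptions of this sketch rather than statements about HElib: the search interval is taken to be [target − a, target + b], and the cost heuristic is omitted, so only the tie-breaking rule (largest log2(q) in the interval) is applied. The prime values and function names are hypothetical toy choices.

```python
import bisect
import itertools
import math

NORMAL = [40961, 61441]     # hypothetical ordered normalPrimes (same bit length)
SMALL = [257, 769, 3329]    # hypothetical smallPrimes (varying bit lengths)

def build_table(normal, small):
    """All candidate prime sets: a prefix of the normal primes plus any subset of small primes."""
    table = []
    for k in range(len(normal) + 1):
        for r in range(len(small) + 1):
            for subset in itertools.combinations(small, r):
                primes = tuple(subset) + tuple(normal[:k])
                table.append((sum(math.log2(p) for p in primes), primes))
    table.sort()                      # increasing log2(q)
    return table

def pick(table, target, a=4.0, b=1.0):
    """Choose an entry with log2(q) in [target - a, target + b] (assumed interval)."""
    logs = [entry[0] for entry in table]
    lo = bisect.bisect_left(logs, target - a)
    hi = bisect.bisect_right(logs, target + b)
    if lo < hi:
        return table[hi - 1]          # largest log2(q) in the interval; cost heuristic omitted
    if lo > 0:
        return table[lo - 1]          # fallback: largest value below target - a
    raise ValueError("no candidate modulus at or below the target")

table = build_table(NORMAL, SMALL)
log_q, prime_set = pick(table, target=30.0)
```

Sorting once and using binary search keeps each lookup logarithmic in the table size, which is why the text stresses that |smallPrimes| must stay small: the table itself grows as 2^|smallPrimes|.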
Here, each τ_i is a bound on ‖s_i‖, which can be derived from the bound Bsk in Section 4.7.1.3.
While each value ‖ê_i‖ can be computed at the time we do the mod switching, in determining
which modulus to switch to, it is convenient to use an easy-to-compute high-probability bound
on ‖ê_i‖, rather than the value ‖ê_i‖ itself. As discussed in Section 4.1.2, the coefficients of ê_i on
the standard basis lie in the interval [−P/2, +P/2], where P is the plaintext modulus, and can
reasonably be modeled as independently and uniformly distributed over this same interval. We
therefore use a high-probability bound Bround on ‖ê_i‖ based on the analysis in Section 2.6.2.
Specifically, we estimate the probability that ‖ê_i‖ > Bround assuming (heuristically) that the coefficients
of ê_i on the standard basis are uniformly distributed over the interval [−P/2, +P/2]. We apply (5)
with σ² = φ(m) · P²/12, and estimate the probability that ‖ê_i‖ > Bround as in (5). By default, HElib
chooses Bround = 10σ, so that this probability is (heuristically) bounded by ≈ 2^−75.8 · φ(m)/2.
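The constants in this paragraph can be reproduced with a few lines. The sketch below assumes that the per-coordinate tail used in (5) is the two-sided Gaussian tail erfc(k/√2); equation (5) is not shown in this section, so this is an inference from the fact that erfc(10/√2) ≈ 2^−75.8. The function names and the sample parameters are hypothetical.

```python
import math

def round_bound(phi_m, P, k=10.0):
    """B_round = k * sigma with sigma^2 = phi(m) * P^2 / 12."""
    sigma = math.sqrt(phi_m * P * P / 12.0)
    return k * sigma

def log2_tail(k=10.0):
    """log2 of the two-sided Gaussian tail Pr[|X| > k * sigma] = erfc(k / sqrt(2))."""
    return math.log2(math.erfc(k / math.sqrt(2.0)))

def log2_failure_estimate(phi_m, k=10.0):
    """log2 of the heuristic estimate tail * phi(m) / 2 quoted in the text."""
    return log2_tail(k) + math.log2(phi_m / 2.0)

phi_m, P = 16384, 2                      # hypothetical example: m = 32768, P = 2
b_round = round_bound(phi_m, P)
estimate = log2_failure_estimate(phi_m)  # about -75.8 + 13 for this phi(m)
```

For these sample parameters the estimate stays far below 2^−60, which illustrates why a fixed multiplier k = 10 is comfortable across the usual range of φ(m).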
where
\[ q' := \prod_{i=1}^{k'} \pi_i. \]
We then scale up, as in Section 4.2, adding the primes π_{k+1}, . . . , π_{k'}, and then mod switch down to
the prime set π_1, . . . , π_{k'}, as in Section 4.1, dropping the small and special primes in S. After this
is done, the noise in the ciphertext will be bounded by
ε_scaled + Bams,
and the choice of k' ensures that the second term is smaller than the first term, which ensures that
very little capacity is lost. In fact, when we actually perform the mod switching operation, the
bound Bams is replaced by a more precise and somewhat smaller probability-1 bound.
Note that in some circumstances, there might not be enough normal primes to allow us to find
a k' so that ε_scaled is large enough. In this case (which is very atypical in practice), some capacity
will be lost.
DS 1 , . . . , DS L ,
where
where
0 < k1 < k2 < · · · < kL = K.
This defines corresponding digits
\[ \hat{D}_j := \prod_{\pi \in DS_j} \pi \qquad (j = 1, \ldots, L). \]
We know that q is of the form q = ∏_{i=1}^{k} π_i, which means we can factor q as
\[ q = D_1 \cdots D_\ell, \]
where
\[ D_1 = \hat{D}_1, \ \ldots, \ D_{\ell-1} = \hat{D}_{\ell-1}, \ \text{and} \ D_\ell \mid \hat{D}_\ell. \]
The significance of this is that the values
D*_j := D_1 · · · D_{j−1} (j = 1, . . . , ℓ)
only depend on the values D̂_j, for j = 1, . . . , L. Recall that a key-switching matrix for s_i ↦ s
consists of encryptions of R D*_j s_i under s, for j = 1, . . . , ℓ, and so these key-switching matrices can
be computed at key-generation time, independent of the particular value of the current ciphertext
modulus q.
In more detail, at key-generation time, key-switching matrices are computed using the digits
D̂_1, . . . , D̂_L, and the ciphertext modulus Q̂ = R q̂, where R is the product of all of the special primes,
and q̂ is the product of all of the normal primes. When performing key switching on a ciphertext
with modulus q | q̂, we simply drop the primes dividing q̂/q from the key-switching matrices. Indeed,
if we have an encryption modulo Q̂ of R D*_j s_i under s, just reducing everything mod Q = Rq
gives us an encryption modulo Q (with the same noise). Note also that at key-generation time, the
key-switching matrices must be generated using a value of P = p^r for the plaintext modulus that
is at least as large as any plaintext modulus that may arise during the lifetime of the public key.
At public-key generation time, the normal primes are decomposed into digit sets DS_1, . . . , DS_L of
roughly equal cardinality, so that each resulting digit has roughly the same bit length. The number
of digits L is a parameter that can be selected at system initialization. The default is L = 3.
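The digit decomposition can be illustrated with a short sketch. It assumes the digit sets are contiguous runs of the ordered normal primes with cut points k_j = ⌊jK/L⌋, which is one way to get "roughly equal cardinality"; the exact rule HElib uses is not stated here, and the primes and function names are hypothetical toy choices.

```python
from math import prod

def digit_sets(normal_primes, L=3):
    """Split the ordered normal primes into L contiguous sets of roughly equal size."""
    K = len(normal_primes)
    cuts = [j * K // L for j in range(L + 1)]        # 0 = k_0 < k_1 < ... < k_L = K (needs K >= L)
    return [normal_primes[cuts[j]:cuts[j + 1]] for j in range(L)]

def digits_for(k, sets):
    """Digits D_1, ..., D_l of q = pi_1 * ... * pi_k: full digits, then a partial last one."""
    digits, remaining = [], k
    for ds in sets:
        if remaining <= 0:
            break
        used = ds[:remaining]
        digits.append(prod(used))
        remaining -= len(used)
    return digits

primes = [17, 41, 73, 89, 97, 113, 137]   # hypothetical toy normal primes
sets = digit_sets(primes, L=3)            # sizes 2, 2, 3
digits = digits_for(5, sets)              # q = 17 * 41 * 73 * 89 * 97
```

Every digit except the last coincides with a full digit D̂_j and the last one divides D̂_ℓ, which is the property that lets the key-switching matrices be generated once, for the largest modulus, and reused for every smaller q.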
The special primes are chosen so that their product R is large enough so that in (31) the first
term εR, which represents the key-switching scaled noise, will likely dominate the second term
P Σ_{i,j} ε_ij ‖c_ij‖, which represents the key-switching added noise. One sees that as the parameter L
increases, the bit-length of the digits will decrease, and hence the values ‖c_ij‖ in (31) will decrease,
and hence we can get by with a smaller value of R, which will imply a higher level of security
(which degrades as Q̂ increases). In the current implementation of HElib, the bit-length of R is
determined by default using a somewhat heuristic formula that depends on (among other things)
the bit-length of the digits. This default behavior can be overridden, so that a user can specify
the bit-length of R explicitly. Indeed, experimentally, we have found that in many applications,
it is possible to use a somewhat smaller bit-length for R, resulting in a higher security level, but
without degrading capacity significantly.
As we saw above, by increasing the value of L, we can increase the security level. However,
this comes at a cost: larger values of L increase the space consumed by the key-switching matrices,
and increase the running time of the key-switching operation. These space and time costs increase
linearly in L.
As discussed in Section 4.7.6, each ε_ij is Bgauss as in (35), which is a bound on ‖e_ij‖, where e_ij
is generated with small norm and Gaussian coefficients as in Section 4.7.3. As per (35), we have
\[ B_{\mathrm{gauss}} = \hat{\sigma}_0 \cdot \sqrt{\hat{m} \log(\phi(m))}, \]
where if m is a power of two then σ̂0 = σ0 and m̂ = φ(m), and otherwise σ̂0 = σ0 √m and m̂ = m.
Now consider the terms ‖c_ij‖ appearing in (38). If we heuristically model the coefficients of c_ij as
uniformly distributed on the interval [−D_j/2, D_j/2], then we can use the heuristic estimate (5) with
σ² = φ(m) · D_j²/12. From this, we can bound the size of the j'th digit ‖c_ij‖ with high probability
by
\[ B_j = k \cdot D_j \sqrt{\phi(m)/12}, \]
where k is a suitable parameter (see the table of values in Section 2.6.2; we typically use k = 10).
Thus, if D = max_j D_j, we have the following bound on the key-switch added noise α:
\[ \alpha \le (\#\mathrm{terms}) \cdot P \cdot \hat{\sigma}_0 \sqrt{\hat{m} \log(\phi(m))} \cdot k \cdot D \sqrt{\phi(m)/12}
   = D \cdot \sqrt{\hat{m} \cdot \phi(m) \cdot \log(\phi(m))} \cdot \frac{P \hat{\sigma}_0 k}{\sqrt{12}} \cdot (\#\mathrm{terms}). \]
Here, #terms denotes the number of terms in the summation (38). Therefore,
\[ \alpha \le \begin{cases}
   D \cdot \phi(m) \sqrt{\log(\phi(m))} \cdot \dfrac{P \sigma_0 k}{\sqrt{12}} \cdot (\#\mathrm{terms}), & \text{if } m \text{ is a power of two}, \\[2ex]
   D \cdot m \sqrt{\phi(m) \log(\phi(m))} \cdot \dfrac{P \sigma_0 k}{\sqrt{12}} \cdot (\#\mathrm{terms}), & \text{otherwise}.
   \end{cases} \]
As discussed above, we want to choose the special primes so that their product R satisfies
εR ≫ α, where α is the key-switch added noise (38) and ε is the noise in a ciphertext before key
switching. Of course, we have to choose R ahead of time, and we will not know the relevant value
of ε at this time. We instead choose to estimate ε = β², where β is an estimate of the mod-switch
added noise.
The reason for estimating ε in this way is as follows. First, key switching often happens
right after a ciphertext multiplication. As a part of the multiplication process, the noise in each
multiplicand is reduced to roughly β, and after multiplication, it becomes β². Second, even for
key switching operations performed after other operations (such as automorphisms), in which the
ciphertext has less noise than β² (and typically, it will be about β), choosing R as we do may
cause the key switching operation to decrease the capacity a bit, but this is a “self correcting”
process: if repeated, since the noise is larger, subsequent key switches will decrease the capacity
less substantially.
As per (27), the mod-switch added noise is bounded by
\[ \sum_i \|\hat{e}_i\| \tau_i. \]
Here, each τ_i is a bound on the norm of the relevant secret key, and as discussed in Section 4.1.2 the
coefficients of each ê_i can be modeled as uniformly and independently distributed over [−P'/2, P'/2].
Note that we use P' here to distinguish it from P above. In general, P ≥ P', and while in some
applications, we may have P = P', in others (for example, when bootstrapping), we may have
P > P'.
As we did above, we can use the heuristic estimate (5) with σ² = φ(m) · (P')²/12. From this,
we can bound ‖ê_i‖ with high probability by
\[ k \cdot P' \sqrt{\phi(m)/12}. \]
As for the terms τ_i appearing above, we use the results of Section 4.7.1.3, and specifically, (33)
and (34), obtaining a bound of
\[ \sqrt{h \log(\phi(m))} \]
for τ_i. Here, h is the specified Hamming weight, or h = φ(m)/2 in the unbounded Hamming weight
case.
So we will use the estimate
\[ \beta = k \cdot P' \sqrt{\phi(m)/12} \cdot \sqrt{h \log(\phi(m))}. \]
While this estimate is really an upper bound, it is actually a very tight upper bound, and so it is
not so bad.
So we want R at least as big as α/β².
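The estimates for α and β can be combined into a rough bit-length for R. The sketch below is only an illustration of the formulas above, not HElib's default heuristic: it assumes that log denotes the natural logarithm, treats #terms as a free input, works in log2 so that a large digit D does not overflow, and uses hypothetical sample parameters and function names.

```python
import math

def log2_alpha(m, phi_m, P, sigma0, log2_D, n_terms, k=10.0):
    """log2 of the bound on the key-switch added noise alpha."""
    power_of_two = (m & (m - 1)) == 0
    sigma_hat = sigma0 if power_of_two else sigma0 * math.sqrt(m)
    m_hat = phi_m if power_of_two else m
    b_gauss = sigma_hat * math.sqrt(m_hat * math.log(phi_m))          # as in (35)
    log2_digit_bound = math.log2(k) + log2_D + 0.5 * math.log2(phi_m / 12.0)
    return math.log2(n_terms * P * b_gauss) + log2_digit_bound

def log2_beta(phi_m, P_prime, h, k=10.0):
    """log2 of the estimate beta of the mod-switch added noise."""
    return (math.log2(k * P_prime) + 0.5 * math.log2(phi_m / 12.0)
            + 0.5 * math.log2(h * math.log(phi_m)))

def min_log2_R(m, phi_m, P, P_prime, sigma0, log2_D, n_terms, h, k=10.0):
    """Smallest log2(R) with R >= alpha / beta^2 under these estimates."""
    return log2_alpha(m, phi_m, P, sigma0, log2_D, n_terms, k) - 2.0 * log2_beta(phi_m, P_prime, h, k)

# Hypothetical example: m = 2^15, P = P' = 2, sigma0 = 3.2, 180-bit digits, 3 terms, dense key.
m, phi_m = 32768, 16384
bits_R = min_log2_R(m, phi_m, P=2, P_prime=2, sigma0=3.2, log2_D=180.0,
                    n_terms=3, h=phi_m // 2)
```

In this model halving the digit bit-length lowers log2(R) by the same number of bits, which is the trade-off between L and R discussed above.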
In the case where m is a power of two, we have
We compute c̄'_i ∈ A_q according to the formula (30):
Here, b'_i ∈ A is the canonical representative on the standard basis of the image of c_i under the
natural map from A_Q to A_R. Computationally speaking, this requires a conversion of the
Double-CRT representation of an element in A_R to its corresponding standard representation. The value
d_i ∈ A is easily derived from b'_i with negligible computational cost. The computation of ν(b'_i + R d_i),
where ν is the natural map from A to A_q, requires a conversion of the standard representation of
an element in A_q to its corresponding Double-CRT representation.
Thus, the cost of mod-switching from Q to q is dominated by these two operations:
Note the savings over a more straightforward implementation in which the first operation would
instead be a Double-CRT to standard conversion in AQ .
Note: We do not modify the stored correction factor κ̃, but the implied correction factor κ gets
divided by [R mod P].
5.4.3 Scaling up
The up-scaling operation described in Section 4.2, which scales up a ciphertext with modulus q to
one with modulus Q = Rq, is trivial to implement.
We are starting with a ciphertext with an enciphering family {c̄_i}_{i∈I}, each c̄_i ∈ A_q, and are
computing an enciphering family {c̄'_i}_{i∈I}, with each c̄'_i ∈ A_Q. As discussed above, all of these ring
elements c̄_i and c̄'_i are represented in Double-CRT format.
Consider one such ring element c̄_i ∈ A_q, and suppose its Double-CRT representation, as in (37),
is
\[ \bigl\{ \bar{c}_i(\omega_\pi^j) \bigr\}_{(\pi, j) \in S \times \mathbb{Z}_m^*}, \]
• for each π dividing R and each j ∈ Z*_m, we set the value c̄_i(ω_π^j) to 0, and
• for each π dividing q and each j ∈ Z*_m, we multiply the value c̄_i(ω_π^j) by [R mod π].
Note: We do not modify the stored correction factor κ̃, but the implied correction factor κ gets
multiplied by [R mod P].
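The two rules above can be checked on a toy instance. The sketch below assumes m = 8, a hypothetical modulus q = 17 · 41 and a single hypothetical special prime R = 73; the helper names are invented. It compares the result of the two rules with the Double-CRT representation of R · c computed directly modulo Q = Rq.

```python
import math

M = 8                  # m, a power of two here, so Phi_m(X) = X^4 + 1
UNITS = [1, 3, 5, 7]   # Z_m^*

def root_of_unity(p):
    """Smallest primitive m-th root of unity mod p (test is valid for m a power of two)."""
    for w in range(2, p):
        if pow(w, M, p) == 1 and pow(w, M // 2, p) != 1:
            return w
    raise ValueError("no primitive root of unity found")

def to_dcrt(coeffs, primes):
    """Double-CRT values {(p, j): c(w_p^j) mod p} of a standard-basis element."""
    rep = {}
    for p in primes:
        w = root_of_unity(p)
        for j in UNITS:
            x = pow(w, j, p)
            rep[(p, j)] = sum(c * pow(x, i, p) for i, c in enumerate(coeffs)) % p
    return rep

def scale_up(rep, special):
    """Scale from modulus q to Q = R*q: multiply old rows by R, add zero rows for primes in R."""
    R = math.prod(special)
    out = {(p, j): v * (R % p) % p for (p, j), v in rep.items()}
    for p in special:
        for j in UNITS:
            out[(p, j)] = 0
    return out

q_primes, special = [17, 41], [73]          # hypothetical prime sets
c = [5, 600, 123, 42]                       # an element of A_q on the standard basis
R = math.prod(special)

scaled = scale_up(to_dcrt(c, q_primes), special)
direct = to_dcrt([R * x for x in c], q_primes + special)
```

No conversion out of Double-CRT format is needed, which is why the text calls the operation trivial to implement.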
A Computing Em
With notation as in Section 2.5.2, we discuss techniques to efficiently compute Em in time O(φ(m)²)
with satisfactory accuracy (see footnote 13).
13. The algorithm presented here is similar to the Parker-Traub algorithm in [5]. However, it is specialized to take
advantage of the fact that the evaluation points are the roots of Φm(X).
First, observe that if we index the columns of V_m^{-1} by j ∈ Z*_m, then the jth column is the
coefficient vector of the Lagrange basis polynomial
\[ \frac{1}{\Phi'_m(\omega^j)} \cdot \frac{\Phi_m(X)}{X - \omega^j}. \]
We also have
\[ \Phi'_m(\omega^j) = \prod_{k \in \mathbb{Z}_m^*,\ k \ne j} (\omega^j - \omega^k). \]
To compute Em, we only need the absolute values |Φ'_m(ω^j)|, and using the standard formula
2 sin(θ/2) for the length of a chord of angle θ on the unit circle, we have
\[ |\Phi'_m(\omega^j)| = \prod_{k \in \mathbb{Z}_m^*,\ k \ne j} 2\,|\sin((j - k)\pi/m)|. \]
By computing a table of all relevant values |sin(kπ/m)| for |k| ∈ [m], we can compute each value
|Φ'_m(ω^j)| using roughly φ(m) multiplications. Some care must be taken to avoid floating point
overflow/underflow, however. With this approach, the relative error of the result is guaranteed to
be at most ≈ (ε + δ)φ(m), where ε is the machine precision (usually 2^−53) and δ is the relative
precision to which the values |sin((k − j)π/m)| are computed (which should also be very close to
2^−53 if the computation is done in extended double precision and then rounded to double precision).
The total cost of computing all of the values |Φ'_m(ω^j)| is therefore O(φ(m)²).
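A direct transcription of the chord-length product is sketched below. One design choice is an assumption of this sketch: it sums logarithms instead of multiplying the factors, as a simple guard against overflow and underflow, so the error bound stated above (which is for the plain product) does not carry over literally. The function name is hypothetical.

```python
import math

def abs_phi_prime(m):
    """Return {j: |Phi_m'(w^j)|} for j in Z_m^*, via the product of chord lengths."""
    units = [j for j in range(1, m) if math.gcd(j, m) == 1]
    sin_table = [abs(math.sin(k * math.pi / m)) for k in range(m)]   # |sin(k*pi/m)|, k in [m]
    result = {}
    for j in units:
        log_sum = 0.0                      # accumulate logs to sidestep overflow/underflow
        for k in units:
            if k != j:
                log_sum += math.log(2.0 * sin_table[abs(j - k)])
        result[j] = math.exp(log_sum)
    return result

values_7 = abs_phi_prime(7)
values_8 = abs_phi_prime(8)
```

Two closed forms give a quick check: for prime m, differentiating X^m − 1 = (X − 1)Φ_m(X) yields |Φ'_m(ω^j)| = m / (2 sin(πj/m)), and for m = 8 we have Φ'_8(X) = 4X³, so every value equals 4.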
To compute the polynomials Φm(X)/(X − ω^j), one can first compute the polynomial Φm(X)
exactly (see footnote 14). If
\[ \Phi_m(X) = X^{\phi(m)} + \sum_{i \in [\phi(m)]} a_i X^i, \]
Computing all of these coefficients for all j ∈ Z*_m takes time O(φ(m)²). Experimentally, if computed
in double precision, this formula yields accurate results with relative error at most 5 × 10^−7 for all
m up to 32,000, at most 5 × 10^−6 for all m up to 64,000, and at most 2.5 × 10^−5 for all m up to
128,000 (see footnote 15).
The above computations are trivially parallelized, as the columns of V_m^{-1} can be computed
independently of one another. The computation can be further simplified, based on the following.
Lemma 4. If m is a positive integer and n is the product of the distinct primes dividing m, then
Em = En .
14. Efficient algorithms for computing Φm(X) may be found, for example, in [1]. These algorithms run much faster
than O(φ(m)²).
15. This was verified by computing each value |q_i| in both double precision and extended double precision. With
little additional cost, one can get better accuracy by performing the computation in extended double precision, if the
hardware supports it.
Proof. Let q := m/n. Then we have the following well-known identity:
\[ \Phi_m(X) = \Phi_n(X^q). \]
Based on this, if ω is a primitive mth root of unity and j ∈ Z*_m, then we have
\[ \frac{\Phi_m(X)}{X - \omega^j} = \frac{\Phi_n(X^q)}{X^q - \omega^{jq}} \sum_{i \in [q]} \omega^{j(q-1-i)} X^i. \]
Moreover,
\[ \Phi'_m(X) = \Phi'_n(X^q) \cdot q X^{q-1} \]
and so
\[ |\Phi'_m(\omega^j)| = q\,|\Phi'_n(\omega^{jq})|. \]
Also note that ω^q is a primitive nth root of unity, and that as j runs over Z*_m, the value ω^{jq} runs
over all primitive nth roots of unity, each repeated q times.
Based on the above observations, it is not hard to see that we can permute rows and columns
of V_m^{-1} and V_n^{-1}, respectively, to obtain matrices (a_{i,j}) and (b_{i,j}), such that for each (i, j) ∈
[φ(m)] × [φ(m)], we have
\[ q\,|a_{i,j}| = |b_{\lfloor i/q \rfloor, \lfloor j/q \rfloor}|. \]
It follows that Em = En.
Proof. If m = 1, then E2 = E1 = 1. So suppose m > 1 and m is odd. Then we have the following
well-known identity:
Φ2m (X) = Φm (−X).
Note that if ω is a primitive mth root of unity, then as j runs over Z*_m, the value −ω^j runs over all
primitive 2mth roots of unity.
Based on the above observations, it is not hard to see that we can permute rows and columns
of V_m^{-1} and V_{2m}^{-1}, respectively, to obtain matrices (a_{i,j}) and (b_{i,j}), such that for each (i, j) ∈
[φ(m)] × [φ(m)], we have
|a_{i,j}| = |b_{i,j}|.
It follows that Em = E2m.
Theorem 1. Let m be a positive integer and let n be the product of the distinct odd primes dividing
m. Then Em = En .
Using the above theorem, instead of computing Em , we can just compute En , where n is the
product of the distinct odd primes dividing m.
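In practice, applying the theorem only requires the product of the distinct odd primes dividing m. A minimal sketch follows (the function name is hypothetical, and trial division is used because m is small in this setting):

```python
def odd_radical(m):
    """Product of the distinct odd primes dividing m (1 if m is a power of two)."""
    while m % 2 == 0:          # discard the power of two
        m //= 2
    n, p = 1, 3
    while p * p <= m:
        if m % p == 0:
            n *= p
            while m % p == 0:  # strip repeated factors of p
                m //= p
        p += 2
    if m > 1:                  # whatever remains is a single odd prime
        n *= m
    return n

n = odd_radical(2 ** 4 * 3 ** 2 * 5)   # 15
```

For example, for m = 720 the quantity Em can be obtained from a computation with φ(15) = 8 evaluation points instead of φ(720) = 192, and for m a power of two the theorem reduces the problem to n = 1.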
B Selecting generators for the hypercube
With notation as in Section 3.1.2.3, by default, HElib will construct generators g_1, . . . , g_ℓ of orders
n_1, . . . , n_ℓ for the hypercube as in (20) using the following procedure.
    H_0 ← H
    i ← 1
    while H_{i−1} ≠ Z*_m do
        let n_i be the maximal order of any element in the quotient group Z*_m/H_{i−1}
        choose g_i ∈ Z*_m such that
            (a) the order of g_i mod H_{i−1} is n_i, and
            (b) g_i^{n_i} ∈ H (and, if possible, g_i^{n_i} = 1)
        let H_i be the subgroup generated by H_{i−1} ∪ {g_i}
        i ← i + 1
If we ignore condition (b) in the choice of g_i, it is clear that the algorithm succeeds in producing
g_1, . . . , g_ℓ that give us a complete set of representatives as required. However, by always choosing
n_i maximal, we can be assured of the existence of an element g_i that also satisfies both conditions
(a) and (b).
To see why, first notice that, in fact, n_i is the exponent of the group Z*_m/H_{i−1}: this follows
from the well-known fact that any finite abelian group contains an element whose order is equal to
the exponent of the group. Because of this, it follows from elementary properties of exponents and
quotient groups that
\[ n_i \mid n_{i-1} \mid \cdots \mid n_1. \]
Now suppose we choose g_i satisfying (a), which we can always do. We can modify g_i, if necessary,
to satisfy (b) as follows. If i = 1, there is nothing to do, so assume i ≥ 2. We know g_i^{n_i} ∈ H_{i−1},
which means g_i^{n_i} = g_{i−1}^s h for some s ∈ Z and h ∈ H_{i−2}. Because n_{i−1} is the exponent of Z*_m/H_{i−2},
we know that g_i^{n_{i−1}} ∈ H_{i−2}. Therefore,
\[ H_{i-2} \ni g_i^{n_{i-1}} = \bigl(g_i^{n_i}\bigr)^{n_{i-1}/n_i} = g_{i-1}^{s\,n_{i-1}/n_i}\, h' \quad \text{for some } h' \in H_{i-2}. \]
Since g_{i−1} has order n_{i−1} mod H_{i−2}, we must have n_i | s. So let g'_i := g_i · g_{i−1}^{−s/n_i}. Observe that
g'_i is in the same coset of H_{i−1} as g_i, and therefore g'_i also has order n_i mod H_{i−1}. However,
(g'_i)^{n_i} ∈ H_{i−2}. So we replace g_i by g'_i. If i = 2, we are done. Otherwise, we can repeat the same
procedure to replace g_i by an element whose order mod H_{i−1} is n_i but with g_i^{n_i} ∈ H_{i−3}. Continuing
in this way, we arrive at an element g_i that satisfies both (a) and (b).
The above procedure is basically just a proof of the Fundamental Theorem of Finite Abelian
Groups, applied to the group Z*_m/H. Recall that in Section 3.1.2.3, in discussing the hypercube,
we can have good, bad, or very bad dimensions. Good means g_i^{n_i} = 1, bad means g_i^{n_i} ≠ 1 but
g_i^{n_i} ∈ H, and very bad means g_i^{n_i} ∉ H. The routine used in HElib will always try to choose
a generator that yields a good dimension, if that is possible. Nevertheless, it may produce bad
dimensions. However, it will never produce a very bad dimension.
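The procedure can be prototyped by brute force for small m. The sketch below is an illustration, not HElib's routine: it assumes that H is the subgroup of Z*_m generated by the plaintext prime p (H is defined in Section 3.1.2.3, outside this appendix), it searches all of Z*_m for a suitable g_i instead of applying the correction step from the proof, and the function names are hypothetical.

```python
import math

def generated_subgroup(m, gens):
    """Subgroup of Z_m^* generated by gens, as a set of residues."""
    elems, stack = {1}, [1]
    while stack:
        x = stack.pop()
        for g in gens:
            y = x * g % m
            if y not in elems:
                elems.add(y)
                stack.append(y)
    return elems

def order_mod(g, H, m):
    """Least n >= 1 such that g^n lies in the subgroup H."""
    n, x = 1, g % m
    while x not in H:
        x = x * g % m
        n += 1
    return n

def hypercube_generators(m, p):
    """Return (H, [(g_i, n_i, kind)]) following the procedure above; kind is 'good' or 'bad'."""
    units = [a for a in range(1, m) if math.gcd(a, m) == 1]
    H = generated_subgroup(m, [p])           # assumed definition of H
    gens, current, dims = [p], H, []
    while len(current) < len(units):
        orders = {g: order_mod(g, current, m) for g in units}
        n = max(orders.values())                                             # n_i
        ok = [g for g in units if orders[g] == n and pow(g, n, m) in H]      # (a) and (b)
        good = [g for g in ok if pow(g, n, m) == 1]                          # prefer g^n = 1
        g = (good or ok)[0]
        dims.append((g, n, "good" if good else "bad"))
        gens.append(g)
        current = generated_subgroup(m, gens)                                # H_i
    return H, dims

H, dims = hypercube_generators(105, 2)
```

Since each step enlarges the current subgroup by a factor of exactly n_i, the orders must multiply to φ(m)/|H|, which gives a convenient consistency check on any implementation.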
References
[1] A. Arnold and M. Monagan. Calculating cyclotomic polynomials. Mathematics of Computation,
80(276):2359–2379, 2011. Available at
https://www.ams.org/journals/mcom/2011-80-276/S0025-5718-2011-02467-1.
[2] L. I. Bluestein. A linear filtering approach to the computation of the discrete fourier transform.
Northeast Electronics Research and Engineering Meeting Record 10, 1968.
[3] Z. Brakerski, C. Gentry, and V. Vaikuntanathan. Fully homomorphic encryption without
bootstrapping. In Innovations in Theoretical Computer Science (ITCS’12), 2012. Available at
http://eprint.iacr.org/2011/277.
[4] J. H. Cheon, A. Kim, M. Kim, and Y. S. Song. Homomorphic encryption for arithmetic of
approximate numbers. In T. Takagi and T. Peyrin, editors, Advances in Cryptology -
ASIACRYPT 2017 - 23rd International Conference on the Theory and Applications of Cryptology
and Information Security, Hong Kong, China, December 3-7, 2017, Proceedings, Part I, volume
10624 of Lecture Notes in Computer Science, pages 409–437. Springer, 2017.
[5] I. Gohberg and V. Olshevsky. The fast generalized Parker-Traub algorithm for inversion of
Vandermonde and related matrices. Journal of Complexity, 13(2):208–234, 1997. Available at
https://pdfs.semanticscholar.org/9233/77ec0483df93af85eb60d108e16cd648f273.pdf.
[6] S. Halevi and V. Shoup. Algorithms in helib. Cryptology ePrint Archive, Report 2014/106,
2014. https://eprint.iacr.org/2014/106.
[7] S. Halevi and V. Shoup. Bootstrapping for helib. Cryptology ePrint Archive, Report 2014/873,
2014. Available at https://eprint.iacr.org/2014/873.
[8] S. Halevi and V. Shoup. Faster homomorphic linear transformations in helib. Cryptology ePrint
Archive, Report 2018/244, 2018. https://eprint.iacr.org/2018/244.
[9] V. Lyubashevsky, C. Peikert, and O. Regev. On ideal lattices and learning with errors over
rings. In H. Gilbert, editor, Advances in Cryptology - EUROCRYPT’10, volume 6110 of Lecture
Notes in Computer Science, pages 1–23. Springer, 2010.