Design and implementation of HElib: a homomorphic encryption library


Shai Halevi (Algorand Foundation)
Victor Shoup (NYU, IBM Research)
November 25, 2020

1 Introduction
HElib is a C++ open source library (see https://github.com/homenc/HElib) that implements
both the BGV [3] and CKKS [4] fully homomorphic encryption (FHE) schemes. This document
summarizes some of the basic design principles of HElib, and describes some of its fundamental
algorithms and data structures in significant detail. It is a work in progress, and currently focuses
exclusively on the BGV scheme. It is not intended to be an HElib “user manual”. This document
focuses on the design of HElib’s core — we refer the reader to the papers [6], [7], and [8] for more
details on higher-level algorithms in HElib.

2 Mathematical background and notation


We denote by Z the ring of integers, by Q the field of rational numbers, by R the field of real
numbers, and by C the field of complex numbers.
For a positive integer m, we denote by Zm the quotient ring Z/(m), i.e., the ring of integers
modulo m. We denote by Z∗m the group of units (i.e., elements with multiplicative inverses) in Zm .
Recall that Z∗m consists of those residue classes whose representatives are relatively prime to m.
We have |Z∗m | = φ(m), where φ is Euler’s totient function.
For a positive integer m we denote by [m] the set of integers {0, . . . , m − 1}. Note that Zm and
[m] are not the same thing: the former is a set of residue classes, which forms a ring, and the latter
is a subset of the integers, which does not form a ring.

2.1 Roots of unity and cyclotomic polynomials


Let m be a positive integer and F a field. An element ω ∈ F is called an mth root of unity if
ω^m = 1, which is the same as saying that ω is a root of the polynomial X^m − 1 ∈ F[X]. Such an
mth root of unity ω is called primitive if no smaller positive power of ω is equal to 1.
Suppose that ω ∈ F is an mth root of unity. Note that if k ≡ ℓ (mod m), then ω^k = ω^ℓ. This
means that if j = [k mod m] ∈ Zm, where k ∈ Z, we can unambiguously define ω^j := ω^k.¹
If ω ∈ F is a primitive mth root of unity, then

• every mth root of unity in F can be written as ω^j for a unique j ∈ Zm, and

• every primitive mth root of unity in F can be written as ω^j for a unique j ∈ Z∗m.

¹ [k mod m] denotes the residue class k + mZ ∈ Zm, i.e., the residue class modulo m containing k. The same
notation for residue classes is also used for other rings.

As a special case, consider ω := e^{2πi/m} ∈ C, which is a primitive mth root of unity in C. Define
the polynomial
\[ \Phi_m(X) := \prod_{j \in \mathbb{Z}_m^*} (X - \omega^j) \in \mathbb{C}[X]. \]

The polynomial Φm (X) is called the mth cyclotomic polynomial. Clearly, Φm (X) is monic and
has degree φ(m).
The following are well-known facts:

• Φm (X) ∈ Z[X], and

• Φm (X) is irreducible over Q.

The following formula is also well known:
\[ X^m - 1 = \prod_{d \mid m} \Phi_d(X). \]
This formula gives a recursive formulation for Φm(X): we have Φ1(X) = X − 1, and for m > 1, we
have
\[ \Phi_m(X) = \frac{X^m - 1}{\prod_{d \mid m,\ d < m} \Phi_d(X)}. \]
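The recursion above can be turned into a few lines of code. The following is a minimal sketch, not HElib code: polynomials are plain Python coefficient tuples (lowest degree first), and the helper names `poly_divide_exact` and `cyclotomic` are illustrative assumptions.

```python
from functools import lru_cache


def poly_divide_exact(num, den):
    """Exact division of integer polynomials (coefficients low degree first).

    Assumes den is monic and divides num; raises AssertionError otherwise.
    """
    num = list(num)
    quot = [0] * (len(num) - len(den) + 1)
    for shift in range(len(quot) - 1, -1, -1):
        coeff = num[shift + len(den) - 1]   # current leading coefficient
        quot[shift] = coeff
        for i, d in enumerate(den):
            num[shift + i] -= coeff * d
    assert not any(num), "division was not exact"
    return quot


@lru_cache(maxsize=None)
def cyclotomic(m):
    """Phi_m(X) via Phi_m = (X^m - 1) / prod_{d | m, d < m} Phi_d."""
    poly = [-1] + [0] * (m - 1) + [1]       # X^m - 1
    for d in range(1, m):
        if m % d == 0:
            poly = poly_divide_exact(poly, cyclotomic(d))
    return tuple(poly)


if __name__ == "__main__":
    for m in (1, 2, 6, 12, 15):
        print(m, cyclotomic(m))
```

The recursion is simple rather than fast; for large m one would normally use a computer algebra library instead.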

2.2 The cyclotomic rings A, Aq , AQ , and AR


Let m > 1 be an integer. Let Φm (X) ∈ Z[X] denote the mth cyclotomic polynomial.
We shall be working with the quotient ring A := Z[X]/(Φm (X)), i.e., the ring of polynomials
with integer coefficients modulo the polynomial Φm (X).
Let q > 1 be an integer. We shall also be working with the quotient ring Aq := A/(q).
Equivalently, Aq = Zq[X]/(Φm(X)), where Φm(X) here stands for the image of the cyclotomic
polynomial Φm(X) in Zq[X] (that is, the integer coefficients of Φm(X) are mapped to their
corresponding residue classes in Zq).
We will sometimes assume that q is relatively prime to m, which avoids some algebraic
awkwardness. For example, with this restriction, the polynomial X^m − 1 has m distinct roots in
the algebraic closure of Zq, and the image of Φm(X) in Zq[X] has φ(m) distinct roots in that
algebraic closure, all of which are primitive mth roots of unity.
We will also occasionally work with the rings AQ := Q[X]/(Φm (X)) and AR := R[X]/(Φm (X)).
These are the same as A, but where the coefficients are allowed to lie, respectively, in Q and R,
instead of Z.

2.3 The canonical embedding and associated infinity norm


Let ω := e^{2πi/m} ∈ C, which is a primitive mth root of unity.
Consider two polynomials f(X), g(X) ∈ Z[X] such that f(X) ≡ g(X) (mod Φm(X)). Consider
a primitive mth root of unity ω^j, where j ∈ Z∗m. Since ω^j is a root of Φm(X), it follows that
f(ω^j) = g(ω^j). This means that if a = [f(X) mod Φm(X)] ∈ A for some f(X) ∈ Z[X], we can
unambiguously define a(ω^j) := f(ω^j) (and the specific choice of f(X) does not matter).

The canonical embedding of a ∈ A is the vector obtained by evaluating a at all primitive
mth roots of unity:
\[ \mathrm{canon}(a) := \bigl( a(\omega^j) \bigr)_{j \in \mathbb{Z}_m^*}. \]

We will often use the infinity norm ‖canon(a)‖∞ as the measure of the “size” of a ∈ A, so we
define
\[ \|a\| := \|\mathrm{canon}(a)\|_\infty. \]
That is,
\[ \|a\| = \max\{ |a(\omega^j)| : j \in \mathbb{Z}_m^* \}, \]
where |a(ω^j)| ∈ R denotes the usual absolute value (or norm) of the complex number a(ω^j). We
call ‖·‖ the canonical norm.
The canonical norm satisfies the usual properties satisfied by any norm:

• ‖a + b‖ ≤ ‖a‖ + ‖b‖ (i.e., subadditivity), and

• ‖ca‖ = |c| ‖a‖,

for all a, b ∈ A and c ∈ Z. In addition, it is sub-multiplicative:

• ‖ab‖ ≤ ‖a‖ ‖b‖

for all a, b ∈ A. This sub-multiplicativity property is what makes this norm especially convenient
to use.
The notions of the canonical embedding and corresponding norm apply equally well to elements
of AQ or AR, and we shall use the same notation ‖a‖ = ‖canon(a)‖∞ for a in AQ or AR.

2.4 Standard and powerful bases


It is useful to make a syntactic distinction between the indeterminate X and its image x :=
[X mod Φm (X)] in the cyclotomic ring A = Z[X]/(Φm (X)). We can write A = Z[x]. That is,
A = {f (x) : f (X) ∈ Z[X]}.
The element x ∈ A satisfies the equation Φm (x) = 0. Moreover, every element of A can be
expressed uniquely as f (x) where f (X) ∈ Z[X] has degree less than φ(m). This follows from
division with remainder for polynomials: for f (X) ∈ Z[X], there exists a unique polynomial r(X)
of degree less than φ(m) such that f (X) ≡ r(X) (mod Φm (X)), and since Φm (x) = 0, we have
f (x) = r(x).
It follows that every element a ∈ A can be expressed uniquely as
\[ a = \sum_{i \in [\phi(m)]} a_i x^i \]
with integer coefficients a_i. Put another way,
\[ \{ x^i \}_{i \in [\phi(m)]} \]
forms a Z-basis for A. This is called the standard basis for A (it is also sometimes called the
power basis).
Suppose m = m_1 ⋯ m_k is the factorization of m into prime powers. We next define a new
quotient ring
\[ B := \mathbb{Z}[Y_1, \ldots, Y_k] / (\Phi_{m_1}(Y_1), \ldots, \Phi_{m_k}(Y_k)). \]
For t = 1, . . . , k, let y_t be the image of Y_t in B. Then we have B = Z[y_1, . . . , y_k]. That is, every
element of B can be expressed as g(y_1, . . . , y_k) for some polynomial g(Y_1, . . . , Y_k) ∈ Z[Y_1, . . . , Y_k].
Moreover, every element of B can be expressed uniquely as g(y_1, . . . , y_k) for some polynomial
g(Y_1, . . . , Y_k) ∈ Z[Y_1, . . . , Y_k] where the degree in Y_t is less than φ(m_t) for t = 1, . . . , k. Put another
way,
\[ \{ y_1^{i_1} \cdots y_k^{i_k} \}_{i_1 \in [\phi(m_1)], \ldots, i_k \in [\phi(m_k)]} \]
forms a Z-basis for B.


The rings A and B are in fact isomorphic. Let x_t := x^{m/m_t} ∈ A for t = 1, . . . , k.
Then the isomorphism is given by the map that sends g(y_1, . . . , y_k) ∈ B to g(x_1, . . . , x_k) ∈ A for
g(Y_1, . . . , Y_k) ∈ Z[Y_1, . . . , Y_k].
This isomorphism determines another Z-basis for A, namely,
\[ \{ x_1^{i_1} \cdots x_k^{i_k} \}_{i_1 \in [\phi(m_1)], \ldots, i_k \in [\phi(m_k)]}. \]
Following [9], we call this the powerful basis for A.


Besides the norm based on the canonical embedding, other norms on A that are useful to
consider are defined in terms of the standard and powerful bases. For a ∈ A, we denote by std(a)
its coordinate vector on the standard basis, and by pwfl(a) its coordinate vector on the powerful
basis. Such a coordinate vector has φ(m) components, where each component is an integer. The
norms of most interest are the infinity norms on these bases, ‖std(·)‖∞ and ‖pwfl(·)‖∞.
Unlike the canonical norm ‖·‖, the norms ‖std(·)‖∞ and ‖pwfl(·)‖∞ are not (in general)
sub-multiplicative.
We can also consider the rings AQ, AR, and Aq. Each of these rings has a corresponding
standard and powerful basis (over Q, R, and Zq, respectively). We can also naturally extend the
norms ‖std(·)‖∞ and ‖pwfl(·)‖∞ to AQ and AR.
The rings BQ, BR, and Bq are defined in the same way as B, except the underlying ring of
coefficients is Q, R, and Zq, respectively, rather than Z. We also have corresponding ring isomorphisms
AQ ≅ BQ, AR ≅ BR, and Aq ≅ Bq for every q ∈ Z, q > 1.

2.5 Relations between norms


2.5.1 Bounding the powerful basis infinity norm in terms of the canonical norm
We can nicely bound ‖pwfl(·)‖∞ in terms of the canonical norm ‖·‖, as established in the following
lemma (see [7] for a proof). To state this lemma, for a real number x, define
\[ P(x) := \frac{2}{x \cdot \tan(\pi / 2x)}. \]
Below we use the values of P(x) at prime numbers x. One can verify that P(x) approaches
4/π ≈ 1.273 from below as x → ∞, and the (approximate) values of P(x) for the first few primes
are:

x       2      3      5      7      11
P(x)    1      1.155  1.231  1.252  1.265

Lemma 1. For all a ∈ A, we have
\[ \|\mathrm{pwfl}(a)\|_\infty \le D_m \cdot \|a\|, \]
where
\[ D_m := \prod_{\text{prime } x \mid m} P(x). \]
Lemma 1 holds for AQ and AR as well.
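As a quick numerical check of the table and of the constant D_m in Lemma 1, the following sketch evaluates P(x) and D_m directly from their definitions. The function names are illustrative assumptions; the argument of the tangent is read as π/(2x), which is the reading that reproduces the tabulated values.

```python
import math


def P(x):
    """P(x) = 2 / (x * tan(pi / (2x)))."""
    return 2.0 / (x * math.tan(math.pi / (2 * x)))


def prime_divisors(m):
    """Distinct prime divisors of m, by trial division."""
    primes, q = [], 2
    while q * q <= m:
        if m % q == 0:
            primes.append(q)
            while m % q == 0:
                m //= q
        q += 1
    if m > 1:
        primes.append(m)
    return primes


def D(m):
    """D_m = product of P(x) over the primes x dividing m."""
    result = 1.0
    for x in prime_divisors(m):
        result *= P(x)
    return result


if __name__ == "__main__":
    for x in (2, 3, 5, 7, 11):
        print(x, round(P(x), 3))
```

Since each factor is below 4/π, D_m grows at most like (4/π)^k for an m with k distinct prime divisors.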

2.5.2 Bounding the standard basis infinity norm in terms of the canonical norm
We saw above that we can bound the powerful basis infinity norm in terms of the canonical norm
with a fairly simple explicit formula that gives a reasonably tight bound. It would be nice to get
a similar result for the standard basis infinity norm. Unfortunately, we know of no simple tight
formula. While we are able to compute explicit bounds, these bounds are not nearly as tight as
the bounds for the powerful basis.

Lemma 2. For all a ∈ A, we have
\[ \|\mathrm{std}(a)\|_\infty \le E_m \cdot \|a\|, \]
where E_m is the infinity norm of the inverse Vandermonde matrix, that is,
\[ E_m = \|V_m^{-1}\|_\infty, \quad \text{where } V_m := \bigl( \omega^{ij} \bigr)_{i \in \mathbb{Z}_m^*,\ j \in [\phi(m)]}. \]
Here, ω := e^{2πi/m} ∈ C, and denoting V_m^{-1} = (a_{ij}) we have
\[ \|V_m^{-1}\|_\infty := \max_i \sum_j |a_{ij}|. \]

When m is a prime power, then E_m = D_m, where D_m is as in Lemma 1. We have computed
E_m for all other m up to 64,000. We found the following:

• If m is the product of 2 distinct prime powers, Em ranges between (roughly) 1.6 and 6.8.

• If m is the product of 3 distinct prime powers, Em ranges between (roughly) 3.5 and 130.

• If m is the product of 4 distinct prime powers, Em ranges between (roughly) 18 and 1,900.

• If m is the product of 5 distinct prime powers, Em ranges between (roughly) 820 and 81,000.

See Appendix A for some implementation details.
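For small m, the constant E_m of Lemma 2 can be computed directly from its definition. The sketch below is only an illustration and is not the method of Appendix A: it builds V_m in floating point and inverts it with a textbook Gauss-Jordan elimination, which is adequate for tiny m only. The names `invert` and `E` are assumptions.

```python
import cmath
import math


def invert(matrix):
    """Invert a square complex matrix by Gauss-Jordan elimination with pivoting."""
    n = len(matrix)
    aug = [list(row) + [1.0 if r == c else 0.0 for c in range(n)]
           for r, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def E(m):
    """Infinity norm (maximum absolute row sum) of the inverse of V_m."""
    units = [i for i in range(1, m) if math.gcd(i, m) == 1]
    omega = cmath.exp(2j * cmath.pi / m)
    V = [[omega ** (i * j) for j in range(len(units))] for i in units]
    return max(sum(abs(v) for v in row) for row in invert(V))


if __name__ == "__main__":
    for m in (3, 4, 8, 15):
        print(m, E(m))
```

For the prime powers m = 3, 4, 8 the result should agree with D_m, namely 2/√3, 1 and 1.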

2.6 Probabilistic norm bounds


We consider here probabilistic bounds on randomly generated elements of AR.
Suppose
\[ a = \sum_{i \in I} a_i x^i, \]
where {a_i}_{i∈I} is a mutually independent family of real-valued random variables, where each a_i
has zero mean and variance σ_i². Let ω ∈ C be a primitive mth root of unity, and consider the
complex random variable a(ω). A simple calculation shows that a(ω) has zero mean and variance
σ² := Σ_i σ_i². Indeed,
\[ \sigma^2 = E\bigl[a(\omega) \cdot \overline{a(\omega)}\bigr] = E\Bigl[\sum_{i,j} a_i a_j \omega^{i-j}\Bigr] = \sum_i E[a_i^2] = \sum_i \sigma_i^2. \]
2.6.1 The circularly-symmetric case
In the above setting, if

• I = [m], or

• m is a power of two and I = [φ(m)],

then we will heuristically model a(ω) as a complex Gaussian with variance σ². The heuristic
aspect of this is the fact that we are using the Central Limit Theorem here qualitatively, without
any quantitative error terms. This heuristic is reasonable provided |I| is large. We may also
heuristically model a(ω) as a complex Gaussian with variance σ² if I is chosen as a random subset
of [m] (or [φ(m)] if m is a power of two). We require that the complex numbers ω^i are well
distributed around the unit circle.
Now, a complex Gaussian with variance σ² has the same distribution as a 2-D Gaussian with
variance σ²/2. It follows² that for any B > 0, we have the following heuristic tail bound:
\[ \Pr\bigl[ |a(\omega)| > B \bigr] = \exp(-B^2/\sigma^2). \tag{1} \]
Setting
\[ B := \sigma \sqrt{\log(\phi(m))}, \tag{2} \]
we have
\[ \Pr\bigl[ |a(\omega)| > B \bigr] = \frac{1}{\phi(m)}. \]
Furthermore, ‖a‖ > B iff |a(ω)| > B for some primitive mth root of unity ω ∈ C. However, since
the coefficients of a are real, a(ω̄) is the complex conjugate of a(ω), so |a(ω̄)| = |a(ω)|. Thus, to
bound the probability that ‖a‖ > B, we can apply the Union Bound to the φ(m)/2 conjugate pairs
of primitive roots, rather than all φ(m) primitive roots. Therefore, with B defined as in (2), we
obtain the heuristic bound
\[ \Pr\bigl[ \|a\| > B \bigr] \le \frac{1}{2}. \tag{3} \]

² See, for example, M. Brown, “A Generalized Error Function in n Dimensions”, 1963.

2.6.2 The general case


Now suppose that the above assumptions on the index set I are not necessarily met. A typical
example of this is the setting where I = [φ(m)] but m is not a power of two.
For any B > 0, we have the following heuristic tail bound:
\[ \Pr\bigl[ |a(\omega)| > B \bigr] = \mathrm{erfc}\bigl(B / (\sigma\sqrt{2})\bigr). \tag{4} \]
Here, erfc is the standard complementary error function
\[ \mathrm{erfc}(z) := 1 - \frac{2}{\sqrt{\pi}} \int_0^z \exp(-t^2)\, dt. \]
Note that erfc(B/(σ√2)) is the probability that a real Gaussian with zero mean and variance σ²
exceeds B in absolute value. The bound (4) is a conservative estimate, as it rather pessimistically
assumes that the roots of unity ω^i are concentrated near the real axis. The following table gives
an idea of how the erfc function behaves:

B/σ     −log2(erfc(B/(σ√2)))
8       49.5
9       61.9
10      75.8
11      91.1
12      107.8

Again, applying the Union Bound, we have
\[ \Pr\bigl[ \|a\| > B \bigr] \le \mathrm{erfc}\bigl(B / (\sigma\sqrt{2})\bigr) \cdot \phi(m)/2. \tag{5} \]
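The numbers in the erfc table, and the choice of B in (2), are easy to reproduce with the standard library. This is a small sketch; the function names are illustrative assumptions, and `ratio` stands for B/σ.

```python
import math


def tail_bits(ratio):
    """-log2 of erfc(ratio / sqrt(2)), where ratio = B / sigma, as in bound (4)."""
    return -math.log2(math.erfc(ratio / math.sqrt(2)))


def general_norm_bound(ratio, phi_m):
    """Right-hand side of the union bound (5)."""
    return math.erfc(ratio / math.sqrt(2)) * phi_m / 2


def circular_B(sigma, phi_m):
    """The value B from (2) for the circularly-symmetric case."""
    return sigma * math.sqrt(math.log(phi_m))


if __name__ == "__main__":
    for ratio in range(8, 13):
        print(ratio, round(tail_bits(ratio), 1))
```

For instance, with B/σ = 10 the bound (5) stays below 2^-60 for any φ(m) up to 2^16.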

2.6.3 The symmetric distribution mod M


In applying the above bounds, as well as in other settings, it is convenient to introduce a special
distribution.
Let M ≥ 2 be an integer, and consider the following probability distribution over the integers
in the range [−M/2, +M/2] called the symmetric distribution mod M :

If M is odd, then the symmetric distribution mod M is simply the uniform distribution
on the set of M integers {−⌊M/2⌋, . . . , +⌊M/2⌋}. If M is even, then the symmetric
distribution mod M assigns probability mass 1/(2M) to the integers ±M/2, while the
integers of magnitude strictly smaller than M/2 are each assigned probability mass
1/M.

Note that for the symmetric distribution mod M, each residue class mod M is equally likely, and the
distribution is symmetric about zero (and in particular, its mean is zero). Instead of the variance
M²/12 of the continuous uniform distribution on [−M/2, +M/2], we have:

Lemma 3. Let M ≥ 2 be an integer and X be a random variable that is symmetrically distributed
mod M. If σ_M² is the variance of X, then
\[ \sigma_M^2 \le \frac{M^2}{12} \quad \text{if } M \text{ is odd,} \]
and
\[ \sigma_M^2 = \frac{M^2}{12} \Bigl( 1 + \frac{2}{M^2} \Bigr) \quad \text{if } M \text{ is even.} \]
Proof. Let N = ⌊M/2⌋. If M is odd then we have
\[ \sigma_M^2 = \frac{2}{2N+1} \sum_{i=1}^{N} i^2 \le \frac{M^2}{12}, \]
and if M is even then
\[ \sigma_M^2 = \frac{N^2}{2N} + \frac{2}{2N} \sum_{i=1}^{N-1} i^2 = \frac{M^2}{12} \Bigl( 1 + \frac{2}{M^2} \Bigr). \]
2.6.3.1 Symmetric reduction mod M
Related to the notion of the symmetric distribution mod M is the notion of symmetric reduction
of an integer a mod M . If M is odd, this is the unique integer b ≡ a (mod M ) in the interval
(−M/2, +M/2). If M is even, then:

• if a ≢ M/2 (mod M), then this is the unique integer b ≡ a (mod M) in the interval
(−M/2, +M/2);

• otherwise, if a ≡ M/2 (mod M), then this is an integer b chosen uniformly at random from
the set {±M/2}.
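The symmetric distribution mod M and symmetric reduction mod M can be sketched as follows. This is a minimal illustration using exact rational arithmetic for the variance; the function names and the optional `rng` parameter are assumptions, not HElib interfaces.

```python
import random
from fractions import Fraction


def symmetric_distribution(M):
    """Return {value: probability} for the symmetric distribution mod M."""
    half = M // 2
    if M % 2 == 1:
        return {v: Fraction(1, M) for v in range(-half, half + 1)}
    dist = {v: Fraction(1, M) for v in range(-half + 1, half)}
    dist[-half] = dist[half] = Fraction(1, 2 * M)   # split the class of M/2
    return dist


def variance(dist):
    """Exact variance of a finite distribution given as {value: probability}."""
    mean = sum(v * p for v, p in dist.items())
    return sum((v - mean) ** 2 * p for v, p in dist.items())


def symmetric_reduce(a, M, rng=random):
    """Symmetric reduction of the integer a mod M."""
    b = a % M                      # representative in [0, M)
    if 2 * b > M:
        b -= M                     # move into (-M/2, 0)
    elif 2 * b == M:
        b = rng.choice((-b, b))    # M even and a = M/2 (mod M): random sign
    return b


if __name__ == "__main__":
    for M in (7, 8):
        print(M, variance(symmetric_distribution(M)), Fraction(M * M, 12))
```

Computing the sums in the proof of Lemma 3 gives (M² − 1)/12 for odd M and (M² + 2)/12 for even M, which is what the exact variances printed above should match.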

2.7 The Galois automorphisms


Let A = Z[X]/(Φm (X)), and recall that A = Z[x] where x := [X mod Φm (X)] is the image of the
indeterminate X in A.
For j ∈ Z∗m, the jth Galois automorphism θj is a ring automorphism:
\[ \theta_j : A \to A, \qquad f(x) \mapsto f(x^j) \quad (\text{for } f(X) \in \mathbb{Z}[X]). \tag{6} \]

To see why θj is well defined, note that over Z[X], the polynomial Φm(X^j) is divisible by Φm(X). This
follows from the fact that ω := e^{2πi/m} ∈ C is a primitive mth root of unity, and so is ω^j. Therefore,
ω is a root of Φm(X^j). Since Φm(X) is the minimal polynomial of ω over Q and is monic with
integer coefficients, it must be the case that Φm(X^j) is divisible by Φm(X) in Z[X]. This means
that if f(x) = g(x), then f(x^j) = g(x^j), and so the map (6) is well defined. Note that for
j, j′ ∈ Z∗m, we have
\[ \theta_j \circ \theta_{j'} = \theta_{jj'} = \theta_{j'} \circ \theta_j. \]
In particular, if j′ = j^{−1} ∈ Z∗m, then θj′ is the inverse of θj, and so we see that θj is bijective.

For any a ∈ A, observe that canon(θj(a)) is just a permutation of canon(a), from which it
follows that ‖θj(a)‖ = ‖a‖.
The Galois automorphisms are defined in exactly the same way for the rings Aq , AQ , and AR .
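The claim that canon(θj(a)) is a permutation of canon(a) can be checked numerically. The sketch below represents an element of A by any integer polynomial in its residue class; for simplicity it reduces f(X^j) modulo X^m − 1 rather than modulo Φm(X), which is a valid representative because Φm(X) divides X^m − 1. All function names are illustrative assumptions.

```python
import cmath
import math


def primitive_roots(m):
    """The primitive m-th roots of unity omega^k, k in Z_m^*."""
    return [cmath.exp(2j * cmath.pi * k / m)
            for k in range(1, m + 1) if math.gcd(k, m) == 1]


def evaluate(coeffs, z):
    """Horner evaluation of a polynomial (coefficients low degree first)."""
    acc = 0
    for c in reversed(coeffs):
        acc = acc * z + c
    return acc


def canon(coeffs, m):
    """Canonical embedding: evaluations at all primitive m-th roots of unity."""
    return [evaluate(coeffs, z) for z in primitive_roots(m)]


def canon_norm(coeffs, m):
    """Canonical norm: infinity norm of the canonical embedding."""
    return max(abs(v) for v in canon(coeffs, m))


def galois(coeffs, j, m):
    """A representative of f(X^j), reduced modulo X^m - 1."""
    out = [0] * m
    for i, c in enumerate(coeffs):
        out[(i * j) % m] += c
    return out


if __name__ == "__main__":
    m, f = 12, [3, -1, 4, 1]
    print(canon_norm(f, m), canon_norm(galois(f, 5, m), m))
```

The two printed norms should agree up to floating-point rounding, since θ5 only permutes the evaluation points.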

3 The plaintext algebra


Again, let A = Z[X]/(Φm(X)). In HElib, plaintexts can be viewed as elements of the ring AP,
where P = p^r is a prime power and p does not divide m. This ring is actually a ZP-algebra, which
means that it contains a copy (or, really, an isomorphic copy) of the ring ZP as a subring.
We will initially focus on the case where r = 1, so P = p is a prime, and Zp is a field. Since Ap
is a Zp-algebra and contains the field Zp as a subring, we can naturally view Ap as a vector space
over Zp.
Recall that Ap = Zp[X]/(Φm(X)), where Φm(X) here stands for the image of the cyclotomic
polynomial Φm(X) in Zp[X]. We are assuming that p does not divide m. We define x := [X mod Φm(X)],
so that Ap = Zp[x], which means that every element of Ap can be expressed as f(x) for some
f(X) ∈ Zp[X].
The following are well-known facts:

• The polynomial Φm (X) ∈ Zp [X] factors as

Φm (X) = F1 (X) · · · Fn (X), (7)

where the polynomials Fi (X) ∈ Zp [X] are distinct irreducible polynomials, each of the same
degree d; in particular, φ(m) = nd.

• The value d is the multiplicative order of p modulo m; that is, d is the smallest positive integer
such that p^d ≡ 1 (mod m).

By the Chinese Remainder Theorem for polynomials, we have a Zp-algebra isomorphism³
\[ A_p \to \mathbb{Z}_p[X]/(F_1(X)) \times \cdots \times \mathbb{Z}_p[X]/(F_n(X)), \qquad f(x) \mapsto \bigl( [f(X) \bmod F_1(X)], \ldots, [f(X) \bmod F_n(X)] \bigr) \quad (\text{for } f(X) \in \mathbb{Z}_p[X]). \tag{8} \]

Another way to look at Ap is as follows. Let E = Zp[X]/(F1(X)). The choice of the polynomial
F1(X) from among the irreducible factors of Φm(X) is quite arbitrary. Let η := [X mod F1(X)] ∈
E. Now, since F1(X) is irreducible, it follows that E is a field; specifically, it is a finite field of
cardinality p^d. We also have E = Zp[η], which means that every element of E can be expressed as f(η) for
some f(X) ∈ Zp[X]. We naturally view Zp as a subfield of E. By definition, η is a root of F1(X).
It is also a well-known fact that η ∈ E is a primitive mth root of unity.
Now consider the group Z∗m. It is a well-known fact that the polynomial Φm(X) has φ(m) roots
in E, namely, η^j for j ∈ Z∗m. Thus, these φ(m) roots must be partitioned among the irreducible
factors of Φm(X), so that each irreducible factor Fi(X) has d roots in E.
We can say a bit more about how these roots are partitioned among these factors. Consider
the subgroup H of Z∗m generated by p̄ := [p mod m] ∈ Z∗m. This subgroup consists of the d distinct
elements 1̄, p̄, . . . , p̄^{d−1}. We can form the quotient group Z∗m/H, which consists of n = φ(m)/d
distinct cosets. Each such coset is of the form

kH = {kh : h ∈ H} ⊆ Z∗m

for some k ∈ Z∗m . Such a k is called a representative of the coset. Any other element of a coset can
also act as a representative of the same coset.
Now suppose we choose one representative from each coset, obtaining a complete set of repre-
sentatives k1 , . . . , kn ∈ Z∗m for the cosets of H in Z∗m . Then the cosets of H in Z∗m are

k1 H, . . . , kn H.

Every element of Z∗m lies in exactly one of these cosets.
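The quantities d and n and the cosets of H are straightforward to enumerate for small parameters. The following sketch assumes gcd(p, m) = 1 (otherwise the order computation would not terminate); the function names and the toy parameters m = 31, p = 2 are illustrative assumptions.

```python
import math


def multiplicative_order(p, m):
    """Smallest d >= 1 with p^d = 1 (mod m); assumes gcd(p, m) == 1."""
    d, power = 1, p % m
    while power != 1:
        power = (power * p) % m
        d += 1
    return d


def cosets_of_H(p, m):
    """Return (d, cosets), where the cosets of H = <p> partition Z_m^*."""
    units = [k for k in range(1, m) if math.gcd(k, m) == 1]
    d = multiplicative_order(p, m)
    H = [pow(p, e, m) for e in range(d)]
    remaining, cosets = set(units), []
    for k in units:                      # smallest unused element represents
        if k in remaining:
            coset = sorted((k * h) % m for h in H)
            cosets.append(coset)
            remaining -= set(coset)
    return d, cosets


if __name__ == "__main__":
    d, cosets = cosets_of_H(2, 31)
    print("d =", d, "n =", len(cosets))
    for coset in cosets:
        print(coset[0], "H =", coset)
```

Here the smallest element of each coset is used as its representative; as the text notes, any other element of the coset would serve equally well.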


The following is a well-known fact:

• For any set of representatives k_1, . . . , k_n ∈ Z∗m of H in Z∗m, we can order them in such a way
that for i = 1, . . . , n, the polynomial Fi(X) has d roots in E, namely, η^k for k ∈ k_i H.

Because of this, for each i = 1, . . . , n, we have a Zp-algebra isomorphism
\[ \mathbb{Z}_p[X]/(F_i(X)) \to E, \qquad [f(X) \bmod F_i(X)] \mapsto f(\eta^{k_i}) \quad (\text{for } f(X) \in \mathbb{Z}_p[X]). \tag{9} \]
Combining (8) and (9), we obtain a Zp-algebra isomorphism
\[ A_p \to E^n, \qquad f(x) \mapsto \bigl( f(\eta^{k_1}), \ldots, f(\eta^{k_n}) \bigr) \quad (\text{for } f(X) \in \mathbb{Z}_p[X]). \tag{10} \]

³ A B-algebra isomorphism is a ring isomorphism that acts as the identity function on the subring B.

We call E the slot algebra. The isomorphism (10) allows us to perform component-wise
additions and multiplications on vectors in E^n by performing corresponding operations on elements
of Ap. If a ∈ Ap corresponds to (α_1, . . . , α_n) ∈ E^n and b ∈ Ap corresponds to (β_1, . . . , β_n) ∈ E^n,
then a + b corresponds to (α_1 + β_1, . . . , α_n + β_n) ∈ E^n, and a · b corresponds to
(α_1 · β_1, . . . , α_n · β_n) ∈ E^n.
It is also computationally easy to map (in both directions) between a concrete representation
of an element of Ap (represented, say, as a coefficient vector with respect to the standard basis for
Ap over Zp ), and a concrete representation of E n (where each entry in the vector is represented,
say, as a coefficient vector with respect to the standard basis for E over Zp ).
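The slot-wise behaviour of addition and multiplication can be seen in a toy instance. The sketch below assumes m = 5 and p = 11, chosen so that p ≡ 1 (mod m); then d = 1, E = Zp, n = 4, and the slots are simply the values of f at the four primitive 5th roots of unity in Z11. These parameters and all names are illustrative assumptions, far too small for any real use.

```python
P, M = 11, 5          # toy parameters: d = 1, n = phi(5) = 4
ETA = 3               # 3 has multiplicative order 5 modulo 11
REPS = [1, 2, 3, 4]   # H = {1}, so every unit is its own coset


def reduce_mod_phi5(c):
    """Reduce 5 coefficients (mod X^5 - 1) modulo Phi_5 = 1 + X + X^2 + X^3 + X^4."""
    top = c[4]        # X^4 = -(1 + X + X^2 + X^3) modulo Phi_5
    return [(v - top) % P for v in c[:4]]


def add(a, b):
    """Addition in A_p on standard-basis coefficient vectors."""
    return [(x + y) % P for x, y in zip(a, b)]


def mul(a, b):
    """Multiplication in A_p = Z_p[X]/(Phi_5) on coefficient vectors."""
    prod = [0] * M
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            prod[(i + j) % M] += ai * bj
    return reduce_mod_phi5(prod)


def slots(a):
    """The isomorphism (10): evaluate at eta^k for each representative k."""
    return [sum(c * pow(ETA, k * i, P) for i, c in enumerate(a)) % P
            for k in REPS]


if __name__ == "__main__":
    a, b = [1, 2, 3, 4], [5, 0, 7, 1]
    print(slots(a), slots(b), slots(mul(a, b)))
```

Multiplying the coefficient vectors once in Ap multiplies all four slots at the same time, which is the SIMD effect described above.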

3.1 Galois automorphisms and intra-slot data movement


Using isomorphism (10), we can implement simple SIMD operations on vectors of slots. However,
it is also very useful to be able to move data between the slots within a vector. This can be achieved
using the Galois automorphisms on Ap, which were introduced in Section 2.7.
Recall that for j ∈ Z∗m, the jth Galois automorphism θj is the Zp-algebra automorphism:
\[ \theta_j : A_p \to A_p, \qquad f(x) \mapsto f(x^j) \quad (\text{for } f(X) \in \mathbb{Z}_p[X]). \tag{11} \]
Under the correspondence
\[ f(x) \in A_p \longleftrightarrow \bigl( f(\eta^{k_1}), \ldots, f(\eta^{k_n}) \bigr) \in E^n, \]
as in (10), we have
\[ \theta_j(f(x)) \in A_p \longleftrightarrow \bigl( f(\eta^{j k_1}), \ldots, f(\eta^{j k_n}) \bigr) \in E^n. \]
Thus, θj acts on the slots in a certain way. By carefully choosing representatives k_1, . . . , k_n, we can
use certain Galois mappings to perform useful operations on the slots, including various
permutations on the slots.

3.1.1 The Frobenius automorphism


The map
\[ \sigma : E \to E, \qquad f(\eta) \mapsto f(\eta^p) \quad (\text{for } f(X) \in \mathbb{Z}_p[X]) \tag{12} \]
is a Zp-algebra automorphism, which is called the Frobenius automorphism. The Frobenius
automorphism plays a central role in the theory of finite fields. One key fact is that for all α ∈ E,
we have σ(α) = α^p. Another key fact is that for all α ∈ E, we have
\[ \alpha \in \mathbb{Z}_p \iff \sigma(\alpha) = \alpha. \]
Under the correspondence
\[ f(x) \in A_p \longleftrightarrow \bigl( f(\eta^{k_1}), \ldots, f(\eta^{k_n}) \bigr) \in E^n, \]
as in (10), we have
\[ \theta_{\bar p}(f(x)) \in A_p \longleftrightarrow \bigl( f(\eta^{p k_1}), \ldots, f(\eta^{p k_n}) \bigr) = \bigl( \sigma(f(\eta^{k_1})), \ldots, \sigma(f(\eta^{k_n})) \bigr) \in E^n, \]
where p̄ = [p mod m] ∈ Z∗m. Thus, the map θp̄ acts slot-wise as the Frobenius map on E.

3.1.2 Rotations on a hypercube
As we have seen, the automorphism θp̄ just applies the Frobenius map to each slot, but does not
induce any data movement between slots. We now discuss how we can use other automorphisms
θj to implement various permutations on the slots.
We start with some simple cases.

3.1.2.1 One-dimensional rotations


Let g ∈ Z∗m and suppose that 1, g, g², . . . , g^{n−1} is a complete system of representatives for the cosets
of H in Z∗m. It must be the case that g^n ∈ H. To see why, observe that we must have g^n = g^e h for
some e ∈ {0, . . . , n − 1} and some h ∈ H. Moreover, we must have e = 0 (as otherwise, g^{n−e} ∈ H
would imply that two distinct representatives among 1, . . . , g^{n−1} lie in H, which is impossible). So
we have g^n = h ∈ H.
Suppose that we are lucky, and that g^n = 1. For the isomorphism (10), let us use the
complete system of representatives 1, g, . . . , g^{n−1} ∈ Z∗m for the cosets of H in Z∗m. Then we have the
correspondence
\[ f(x) \in A_p \longleftrightarrow \bigl( f(\eta^{1}), f(\eta^{g}), \ldots, f(\eta^{g^{n-2}}), f(\eta^{g^{n-1}}) \bigr) \in E^n. \]
Applying θg to f(x), we have the correspondence
\[ \theta_g(f(x)) \in A_p \longleftrightarrow \bigl( f(\eta^{g}), f(\eta^{g^2}), \ldots, f(\eta^{g^{n-1}}), f(\eta^{g^n}) \bigr) \in E^n. \]
Moreover, since we are assuming g^n = 1, we have the correspondence
\[ \theta_g(f(x)) \in A_p \longleftrightarrow \bigl( f(\eta^{g}), f(\eta^{g^2}), \ldots, f(\eta^{g^{n-1}}), f(\eta^{1}) \bigr) \in E^n. \]
Thus, we see that applying θg to a ∈ A effectively rotates the slots of a to the left by one position.
More generally, for e = 1, . . . , n − 1, applying θ_{g^e} rotates the slots to the left by e positions;
moreover, applying θ_{g^{−e}} rotates the slots to the right by e positions.
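The lucky case can be tried out on a toy instance. The sketch below assumes m = 7, p = 29 and g = 3: since p ≡ 1 (mod m) we have d = 1 and H = {1}, and g generates Z∗7 with g^6 = 1, so n = 6. These parameters and the helper names are illustrative assumptions; the reduction modulo Φm used here is only valid for prime m.

```python
P, M, G = 29, 7, 3            # toy parameters: d = 1, g^n = 1 with n = 6
N = M - 1                     # n = phi(m) / d = 6
# An element of order m in Z_p^* (m is prime, so any non-trivial m-th root works).
ETA = next(pow(x, (P - 1) // M, P) for x in range(2, P)
           if pow(x, (P - 1) // M, P) != 1)
REPS = [pow(G, i, M) for i in range(N)]    # 1, g, g^2, ..., g^(n-1)


def slots(f):
    """Slot vector (f(eta^1), f(eta^g), ..., f(eta^(g^(n-1))))."""
    return [sum(c * pow(ETA, k * i, P) for i, c in enumerate(f)) % P
            for k in REPS]


def theta(f, j):
    """Galois automorphism f(x) -> f(x^j) on standard-basis coefficients."""
    out = [0] * M
    for i, c in enumerate(f):
        out[(i * j) % M] += c              # reduce exponents modulo X^m - 1
    top = out[M - 1]                       # then modulo Phi_m (prime m only)
    return [(v - top) % P for v in out[:M - 1]]


def rotate_left(v, e):
    """Rotate a list left by e positions (negative e rotates right)."""
    e %= len(v)
    return v[e:] + v[:e]


if __name__ == "__main__":
    f = [1, 2, 3, 4, 5, 6]
    print(slots(f))
    print(slots(theta(f, G)))              # rotated left by one position
    print(slots(theta(f, pow(G, -1, M))))  # rotated right (needs Python 3.8+)
```

Because g^n = 1 here, no Frobenius perturbation appears and every power of θg gives a true rotation.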
Now suppose we are not so lucky: g^n ∈ H but g^n ≠ 1. This means that g^n = p̄^s ∈ Z∗m for some
s ∈ {1, . . . , d − 1}. Then we have the correspondence
\[ \theta_g(f(x)) \in A_p \longleftrightarrow \bigl( f(\eta^{g}), f(\eta^{g^2}), \ldots, f(\eta^{g^{n-1}}), \sigma^s(f(\eta^{1})) \bigr) \in E^n. \]
Thus, applying θg to a effectively rotates the slots of a one position to the left, and then “perturbs”
the last slot by applying the Frobenius-power σ^s to that slot. So, in general, we do not get a true
rotation. However, if the first slot of a happens to lie in Zp, then as σ^s is the identity on Zp, we do
get a true rotation.
More generally, for e = 1, . . . , n − 1, applying θ_{g^e} rotates the slots to the left by e positions, and
then perturbs the last e slots with the Frobenius-power σ^s; moreover, applying θ_{g^{−e}} perturbs the
last e slots with the Frobenius-power σ^s, and then rotates the slots to the right by e positions.
If the slots do contain elements outside of Zp, we can still effectively implement rotations as
follows. To rotate the slots of a to the left e positions, we can form a “masking element” M_e with
the correspondence
\[ M_e \in A_p \longleftrightarrow ( \underbrace{1, \ldots, 1}_{n-e \text{ 1's}}, \underbrace{0, \ldots, 0}_{e \text{ 0's}} ) \in E^n. \]
Note that we also have the correspondence
\[ 1 - M_e \in A_p \longleftrightarrow ( \underbrace{0, \ldots, 0}_{n-e \text{ 0's}}, \underbrace{1, \ldots, 1}_{e \text{ 1's}} ) \in E^n. \]
If we have the correspondence
\[ a \in A_p \longleftrightarrow (\alpha_0, \ldots, \alpha_{n-1}) \in E^n, \]
then
\[ M_e \cdot \theta_{g^e}(a) \in A_p \longleftrightarrow ( \alpha_e, \ldots, \alpha_{n-1}, \underbrace{0, \ldots, 0}_{e \text{ 0's}} ) \in E^n \]
and
\[ (1 - M_e) \cdot \theta_{g^{e-n}}(a) \in A_p \longleftrightarrow ( \underbrace{0, \ldots, 0}_{n-e \text{ 0's}}, \alpha_0, \ldots, \alpha_{e-1} ) \in E^n. \]
Therefore,
\[ M_e \cdot \theta_{g^e}(a) + (1 - M_e) \cdot \theta_{g^{e-n}}(a) \tag{13} \]
yields an element of Ap whose slots are obtained by rotating the slots of a to the left e positions.
Instead of rotating and then masking, as in (13), we can achieve exactly the same effect by masking
and then rotating:
\[ \theta_{g^e}((1 - M_{n-e}) \cdot a) + \theta_{g^{e-n}}(M_{n-e} \cdot a). \tag{14} \]
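To see why (13) and (14) give a true rotation even when g^n = p̄^s ≠ 1, it helps to track the slots symbolically. The sketch below is a bookkeeping model only and does no field arithmetic: a slot is either `None` (the zero element) or a pair (i, k) standing for σ^k(α_i). It uses the identity g^{i+t} = g^r · p̄^{sq} with i + t = qn + r, which follows from g^n = p̄^s. All names and the toy values of n, s, d are assumptions.

```python
def theta(slots, t, n, s, d):
    """Model of theta_{g^t}: slot i receives sigma^(s*q) of old slot r,
    where i + t = q*n + r with 0 <= r < n."""
    out = []
    for i in range(n):
        q, r = divmod(i + t, n)
        src = slots[r]
        out.append(None if src is None else (src[0], (src[1] + s * q) % d))
    return out


def mask(e, n):
    """Slot vector of M_e: n - e ones followed by e zeros."""
    return [1] * (n - e) + [0] * e


def complement(m):
    """Slot vector of 1 - M."""
    return [1 - bit for bit in m]


def mul(m, slots):
    """Multiply by a 0/1 mask slot-wise."""
    return [v if bit else None for bit, v in zip(m, slots)]


def add(u, v):
    """Slot-wise sum, defined here only when one of the two slots is zero."""
    out = []
    for x, y in zip(u, v):
        if x is not None and y is not None:
            raise ValueError("model cannot add two non-zero slots")
        out.append(x if y is None else y)
    return out


def rotate_13(a, e, n, s, d):
    """Formula (13): rotate, then mask."""
    m_e = mask(e, n)
    return add(mul(m_e, theta(a, e, n, s, d)),
               mul(complement(m_e), theta(a, e - n, n, s, d)))


def rotate_14(a, e, n, s, d):
    """Formula (14): mask, then rotate."""
    m = mask(n - e, n)
    return add(theta(mul(complement(m), a), e, n, s, d),
               theta(mul(m, a), e - n, n, s, d))


if __name__ == "__main__":
    n, s, d = 6, 2, 5
    a = [(i, 0) for i in range(n)]
    print(theta(a, 1, n, s, d))       # last slot carries a Frobenius power
    print(rotate_13(a, 1, n, s, d))   # clean rotation
```

In both formulas the masks discard exactly the slots that wrapped around, which are the only ones touched by a Frobenius power.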

3.1.2.2 Two-dimensional rotations


Now suppose that n = n_1 n_2 and we choose a complete system of representatives for the cosets of
H in Z∗m of the form
\[ g_1^{e_1} g_2^{e_2} \quad (\text{for } e_1 \in [n_1],\ e_2 \in [n_2]). \tag{15} \]
Organizing these representatives in a natural way as a two-dimensional array, we can write the
correspondence arising from (10) as follows:
\[ f(x) \in A_p \longleftrightarrow \begin{pmatrix}
f(\eta^{1}) & f(\eta^{g_2}) & \cdots & f(\eta^{g_2^{n_2-2}}) & f(\eta^{g_2^{n_2-1}}) \\
f(\eta^{g_1}) & f(\eta^{g_1 g_2}) & \cdots & f(\eta^{g_1 g_2^{n_2-2}}) & f(\eta^{g_1 g_2^{n_2-1}}) \\
\vdots & & & & \vdots \\
f(\eta^{g_1^{n_1-2}}) & f(\eta^{g_1^{n_1-2} g_2}) & \cdots & f(\eta^{g_1^{n_1-2} g_2^{n_2-2}}) & f(\eta^{g_1^{n_1-2} g_2^{n_2-1}}) \\
f(\eta^{g_1^{n_1-1}}) & f(\eta^{g_1^{n_1-1} g_2}) & \cdots & f(\eta^{g_1^{n_1-1} g_2^{n_2-2}}) & f(\eta^{g_1^{n_1-1} g_2^{n_2-1}})
\end{pmatrix} \in E^{n_1 \times n_2}. \]
Applying the automorphism θ_{g_2} to f(x), we have:
\[ \theta_{g_2}(f(x)) \in A_p \longleftrightarrow \begin{pmatrix}
f(\eta^{g_2}) & f(\eta^{g_2^2}) & \cdots & f(\eta^{g_2^{n_2-1}}) & f(\eta^{g_2^{n_2}}) \\
f(\eta^{g_1 g_2}) & f(\eta^{g_1 g_2^2}) & \cdots & f(\eta^{g_1 g_2^{n_2-1}}) & f(\eta^{g_1 g_2^{n_2}}) \\
\vdots & & & & \vdots \\
f(\eta^{g_1^{n_1-2} g_2}) & f(\eta^{g_1^{n_1-2} g_2^2}) & \cdots & f(\eta^{g_1^{n_1-2} g_2^{n_2-1}}) & f(\eta^{g_1^{n_1-2} g_2^{n_2}}) \\
f(\eta^{g_1^{n_1-1} g_2}) & f(\eta^{g_1^{n_1-1} g_2^2}) & \cdots & f(\eta^{g_1^{n_1-1} g_2^{n_2-1}}) & f(\eta^{g_1^{n_1-1} g_2^{n_2}})
\end{pmatrix} \in E^{n_1 \times n_2}. \]
If we are lucky, we have g_2^{n_2} = 1, in which case:
\[ \theta_{g_2}(f(x)) \in A_p \longleftrightarrow \begin{pmatrix}
f(\eta^{g_2}) & f(\eta^{g_2^2}) & \cdots & f(\eta^{g_2^{n_2-1}}) & f(\eta^{1}) \\
f(\eta^{g_1 g_2}) & f(\eta^{g_1 g_2^2}) & \cdots & f(\eta^{g_1 g_2^{n_2-1}}) & f(\eta^{g_1}) \\
\vdots & & & & \vdots \\
f(\eta^{g_1^{n_1-2} g_2}) & f(\eta^{g_1^{n_1-2} g_2^2}) & \cdots & f(\eta^{g_1^{n_1-2} g_2^{n_2-1}}) & f(\eta^{g_1^{n_1-2}}) \\
f(\eta^{g_1^{n_1-1} g_2}) & f(\eta^{g_1^{n_1-1} g_2^2}) & \cdots & f(\eta^{g_1^{n_1-1} g_2^{n_2-1}}) & f(\eta^{g_1^{n_1-1}})
\end{pmatrix} \in E^{n_1 \times n_2}. \]
In this case, the effect of θ_{g_2} is to rotate the slots of each row one position to the left. More generally,
for e_2 = 1, . . . , n_2 − 1, applying θ_{g_2^{e_2}} rotates the slots of each row to the left by e_2 positions, and
applying θ_{g_2^{−e_2}} rotates the slots of each row to the right by e_2 positions.
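A concrete two-dimensional example of the lucky case: the sketch below assumes m = 15, p = 31, g_1 = 11 and g_2 = 2. Since p ≡ 1 (mod m), we have d = 1 and H = {1}; g_1 has order n_1 = 2, g_2 has order n_2 = 4, and the products g_1^{e_1} g_2^{e_2} run over all 8 elements of Z∗15. Polynomials are kept as coefficient lists modulo X^m − 1 rather than reduced modulo Φm(X), which does not change their values at the roots of Φm. The parameters and names are illustrative assumptions.

```python
P, M = 31, 15                 # toy parameters with p = 1 (mod m), so d = 1
G1, N1 = 11, 2                # g1 has order 2 modulo 15
G2, N2 = 2, 4                 # g2 has order 4 modulo 15
# An element of multiplicative order exactly 15 in Z_31^*.
ETA = next(x for x in range(2, P)
           if pow(x, M, P) == 1 and pow(x, 3, P) != 1 and pow(x, 5, P) != 1)


def evaluate(f, k):
    """f(eta^k) in Z_p for a coefficient list f (low degree first)."""
    return sum(c * pow(ETA, k * i, P) for i, c in enumerate(f)) % P


def slot_grid(f):
    """n1 x n2 array whose (e1, e2) entry is f(eta^(g1^e1 * g2^e2))."""
    return [[evaluate(f, (pow(G1, e1, M) * pow(G2, e2, M)) % M)
             for e2 in range(N2)] for e1 in range(N1)]


def theta(f, j):
    """f(X) -> f(X^j), kept as a representative modulo X^m - 1."""
    out = [0] * M
    for i, c in enumerate(f):
        out[(i * j) % M] = (out[(i * j) % M] + c) % P
    return out


if __name__ == "__main__":
    f = list(range(1, 9))     # degree below phi(15) = 8
    for row in slot_grid(f):
        print(row)
    print("after theta_{g2}:")
    for row in slot_grid(theta(f, G2)):
        print(row)
```

Applying θ_{g_2} rotates every row left by one position, while θ_{g_1} moves each row up by one (here, with n_1 = 2, it swaps the two rows).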
Suppose we are unlucky, and g_2^{n_2} ≠ 1 but g_2^{n_2} ∈ H. If g_2^{n_2} = p̄^s, then we have:
\[ \theta_{g_2}(f(x)) \in A_p \longleftrightarrow \begin{pmatrix}
f(\eta^{g_2}) & f(\eta^{g_2^2}) & \cdots & f(\eta^{g_2^{n_2-1}}) & \sigma^s(f(\eta^{1})) \\
f(\eta^{g_1 g_2}) & f(\eta^{g_1 g_2^2}) & \cdots & f(\eta^{g_1 g_2^{n_2-1}}) & \sigma^s(f(\eta^{g_1})) \\
\vdots & & & & \vdots \\
f(\eta^{g_1^{n_1-2} g_2}) & f(\eta^{g_1^{n_1-2} g_2^2}) & \cdots & f(\eta^{g_1^{n_1-2} g_2^{n_2-1}}) & \sigma^s(f(\eta^{g_1^{n_1-2}})) \\
f(\eta^{g_1^{n_1-1} g_2}) & f(\eta^{g_1^{n_1-1} g_2^2}) & \cdots & f(\eta^{g_1^{n_1-1} g_2^{n_2-1}}) & \sigma^s(f(\eta^{g_1^{n_1-1}}))
\end{pmatrix} \in E^{n_1 \times n_2}. \]

This is not a true rotation. Rather, applying θ_{g_2} to a ∈ A effectively rotates the slots in each row
of a to the left by one position, and then the slots in the last column are perturbed by powers of
Frobenius. However, if the slots of the first column of a happen to lie in Zp, this is a true rotation.
More generally, for e_2 = 1, . . . , n_2 − 1, applying θ_{g_2^{e_2}} rotates the slots of each row to the left
by e_2 positions, and then the slots in the last e_2 columns are perturbed by powers of Frobenius;
likewise, applying θ_{g_2^{−e_2}} rotates the slots of each row to the right by e_2 positions, and then the
slots in the first e_2 columns are perturbed by powers of Frobenius.
Now suppose we are even more unlucky, and g_2^{n_2} ∉ H. We claim that for every i ∈ [n_1], we
must have g_1^i g_2^{n_2} = g_1^{t_i} · p̄^{s_i} for some t_i ∈ [n_1] and s_i ∈ [d]. To see why, observe that we must have
g_1^i g_2^{n_2} = g_1^{t_i} g_2^{t′_i} · p̄^{s_i} for some t_i ∈ [n_1], t′_i ∈ [n_2], and s_i ∈ [d], since the group elements (15) form
a complete system of representatives for the cosets of H in Z∗m. Moreover, if we had t′_i ≠ 0, then
g_1^i g_2^{n_2 − t′_i} = g_1^{t_i} · p̄^{s_i}, contradicting the fact that the group elements (15) lie in distinct cosets of H
in Z∗m. It is also not hard to see that (t_0, . . . , t_{n_1−1}) is a permutation of (0, . . . , n_1 − 1).
So we have
\[ \theta_{g_2}(f(x)) \in A_p \longleftrightarrow \begin{pmatrix}
f(\eta^{g_2}) & f(\eta^{g_2^2}) & \cdots & f(\eta^{g_2^{n_2-1}}) & \sigma^{s_0}(f(\eta^{g_1^{t_0}})) \\
f(\eta^{g_1 g_2}) & f(\eta^{g_1 g_2^2}) & \cdots & f(\eta^{g_1 g_2^{n_2-1}}) & \sigma^{s_1}(f(\eta^{g_1^{t_1}})) \\
\vdots & & & & \vdots \\
f(\eta^{g_1^{n_1-2} g_2}) & f(\eta^{g_1^{n_1-2} g_2^2}) & \cdots & f(\eta^{g_1^{n_1-2} g_2^{n_2-1}}) & \sigma^{s_{n_1-2}}(f(\eta^{g_1^{t_{n_1-2}}})) \\
f(\eta^{g_1^{n_1-1} g_2}) & f(\eta^{g_1^{n_1-1} g_2^2}) & \cdots & f(\eta^{g_1^{n_1-1} g_2^{n_2-1}}) & \sigma^{s_{n_1-1}}(f(\eta^{g_1^{t_{n_1-1}}}))
\end{pmatrix} \in E^{n_1 \times n_2}. \]

This is not a true rotation. Rather, applying θ_{g_2} to a ∈ A effectively rotates the slots in each row
of a to the left by one position, and then the slots in the last column are perturbed by powers of
Frobenius and permuted. However, if the slots of the first column of a happen to be some constant
in Zp, this is a true rotation.
More generally, for e_2 = 1, . . . , n_2 − 1, applying θ_{g_2^{e_2}} rotates the slots of each row to the left by
e_2 positions, and then the slots in the last e_2 columns are perturbed by powers of Frobenius and
permuted; likewise, applying θ_{g_2^{−e_2}} rotates the slots of each row to the right by e_2 positions, and then the
slots in the first e_2 columns are permuted and perturbed by powers of Frobenius.
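Just as we did in the one-dimensional case, we can use masking to implement true rotations,
even if g_2^{n_2} ≠ 1. To rotate the slots in each row to the left e_2 positions, we can form a “masking
element” M^{(2)}_{e_2} with the correspondence
\[ M^{(2)}_{e_2} \in A_p \longleftrightarrow \begin{pmatrix}
1 & \cdots & 1 & 0 & \cdots & 0 \\
1 & \cdots & 1 & 0 & \cdots & 0 \\
\vdots & & & & & \vdots \\
\underbrace{1 \ \cdots \ 1}_{n_2 - e_2} & & & \underbrace{0 \ \cdots \ 0}_{e_2} & &
\end{pmatrix} \in E^{n_1 \times n_2}, \]
that is, every row consists of n_2 − e_2 ones followed by e_2 zeros.

Then, we can rotate the slots in each row of a ∈ A to the left by e_2 positions by either computing
\[ M^{(2)}_{e_2} \cdot \theta_{g_2^{e_2}}(a) + (1 - M^{(2)}_{e_2}) \cdot \theta_{g_2^{e_2 - n_2}}(a) \tag{16} \]
or
\[ \theta_{g_2^{e_2}}((1 - M^{(2)}_{n_2 - e_2}) \cdot a) + \theta_{g_2^{e_2 - n_2}}(M^{(2)}_{n_2 - e_2} \cdot a). \tag{17} \]
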

Rotating the slots in each column. Besides rotating the slots in each row by a given amount, we can
also use Galois automorphisms to rotate the slots in each column by a given amount. Specifically,
applying θ_{g_1} to a ∈ A rotates the slots in each column up one position. If g_1^{n_1} = 1, then this results
in a true rotation. Otherwise, this results in a rotation, followed by a Frobenius perturbation and
possibly a permutation of the slots in the last row. Just as we did above, we can combine Galois
automorphisms and masking to implement true rotations, even if g_1^{n_1} ≠ 1. If we define the masking
element M^{(1)}_{e_1} ∈ Ap to have all 1’s in the slots in its first n_1 − e_1 rows and all 0’s in the slots in its
last e_1 rows, then we can rotate the slots in each column up e_1 positions by computing either
\[ M^{(1)}_{e_1} \cdot \theta_{g_1^{e_1}}(a) + (1 - M^{(1)}_{e_1}) \cdot \theta_{g_1^{e_1 - n_1}}(a) \tag{18} \]
or
\[ \theta_{g_1^{e_1}}((1 - M^{(1)}_{n_1 - e_1}) \cdot a) + \theta_{g_1^{e_1 - n_1}}(M^{(1)}_{n_1 - e_1} \cdot a). \tag{19} \]


3.1.2.3 The general hypercube


Now suppose that $n = n_1 \cdots n_\ell$ and we choose a complete system of representatives for the cosets of $H$ in $\mathbb{Z}_m^*$ of the form
\[
g_1^{e_1} \cdots g_\ell^{e_\ell} \quad (\text{for } e_1 \in [n_1], \ldots, e_\ell \in [n_\ell]). \tag{20}
\]
We can organize these representatives in a natural way as an $\ell$-dimensional hypercube. Figure 1 shows a 3-dimensional hypercube.
Let $i \in \{1, \ldots, \ell\}$. A hypercolumn in the $i$th dimension is a one-dimensional sub-cube specified by fixed indices $e_1, \ldots, e_{i-1}$ and $e_{i+1}, \ldots, e_\ell$, and consists of the $n_i$ slots
\[
(e_1, \ldots, e_{i-1}, e_i, e_{i+1}, \ldots, e_\ell) \quad \text{for } e_i \in [n_i].
\]
We denote such a hypercolumn by
\[
(e_1, \ldots, e_{i-1}, *, e_{i+1}, \ldots, e_\ell).
\]
Figure 1(a) shows a hypercolumn. A slice orthogonal to the $i$th dimension is an $(\ell - 1)$-dimensional sub-cube specified by a fixed index $e_i \in [n_i]$ and consists of the $n/n_i$ slots
\[
(e_1, \ldots, e_{i-1}, e_i, e_{i+1}, \ldots, e_\ell) \quad \text{for } e_j \in [n_j] \text{ where } j \neq i.
\]
Figure 1(b) shows a slice.
We call $g_1, \ldots, g_\ell$ generators of the hypercube.
Fix $i \in \{1, \ldots, \ell\}$. We say $n_i$ is the order of $g_i$.

Figure 1: A 3-dimensional hypercube, showing (a) a hypercolumn and (b) a slice.

• If $g_i^{n_i} = 1$, we say that $i$ is a good dimension. In this case, applying $\theta_{g_i}$ to $a \in A$ effectively rotates the slots in each hypercolumn in the $i$th dimension by one position. One can also say that it rotates the slices orthogonal to the $i$th dimension by one position.

• If $g_i^{n_i} \neq 1$ but $g_i^{n_i} \in H$, then we say that $i$ is a bad dimension. In this case, applying $\theta_{g_i}$ to $a \in A$ effectively rotates the slots in each hypercolumn in the $i$th dimension by one position, and then perturbs the slot that wrapped around by a power of Frobenius.

• If $g_i^{n_i} \notin H$, then we say that $i$ is a very bad dimension. In this case, applying $\theta_{g_i}$ to $a \in A$ effectively rotates the slots in each hypercolumn in the $i$th dimension by one position, and then perturbs the slot that wrapped around by a power of Frobenius, and in addition, permutes the slots within the corresponding slice.

If $i$ is a bad (or very bad) dimension, we can still implement true rotations using masks, analogous to what we did in the case of one or two dimensions, by using the formula
\[
M^{(i)}_{e_i} \cdot \theta_{g_i^{e_i}}(a) + \bigl(1 - M^{(i)}_{e_i}\bigr) \cdot \theta_{g_i^{e_i - n_i}}(a) \tag{21}
\]
or
\[
\theta_{g_i^{e_i}}\bigl((1 - M^{(i)}_{n_i - e_i}) \cdot a\bigr) + \theta_{g_i^{e_i - n_i}}\bigl(M^{(i)}_{n_i - e_i} \cdot a\bigr). \tag{22}
\]
Here, $M^{(i)}_{e_i}$ denotes the element of $A$ that has 1 in the first $n_i - e_i$ slots of each hypercolumn in the $i$th dimension, and 0 in the last $e_i$ slots of each hypercolumn in the $i$th dimension.
So for each dimension i and each ei ∈ [ni ], we get a permutation on the slots of the hypercube.
The collection of all of these permutations is sharply transitive, which means that for every two
slots, there is a unique permutation that moves the first slot to the second.
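The hypercube indexing and the resulting slot permutations can be sketched in a few lines of Python. This is an illustrative model, not HElib's data structure: slots are stored in a flat list in row-major order, dimensions are numbered from 0 rather than 1, and the names `coords_to_index` and `rotate_dim` are hypothetical.

```python
from itertools import product

def coords_to_index(coords, dims):
    """Row-major linear index of the hypercube coordinates (e_1, ..., e_l)."""
    idx = 0
    for e, n in zip(coords, dims):
        idx = idx * n + e
    return idx

def rotate_dim(slots, dims, i, e):
    """True rotation by e positions along dimension i (0-based here).

    The content at coordinate e_i moves to coordinate (e_i - e) mod n_i,
    so every hypercolumn in dimension i is rotated by e positions.
    """
    out = [None] * len(slots)
    for coords in product(*(range(n) for n in dims)):
        dst = list(coords)
        dst[i] = (coords[i] - e) % dims[i]
        out[coords_to_index(dst, dims)] = slots[coords_to_index(coords, dims)]
    return out
```

For a 2 × 3 hypercube, composing one rotation per dimension over all $(e_1, e_2) \in [2] \times [3]$ sends any fixed slot to each of the 6 positions exactly once, which is the sharp transitivity property described above.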
Note that it is always possible to choose a set of generators where each generator is either good or bad, but not very bad. This follows from the Fundamental Theorem of Finite Abelian Groups, applied to the group $\mathbb{Z}_m^*/H$. See Appendix B for more details on the default procedure used by HElib to find generators. However, HElib also allows for very bad generators (which are currently used for bootstrapping).

3.2 Working in subfields of E
Instead of working with the slot algebra $E$, we can work in any subfield $E'$ of $E$. Such a subfield may be specified by an arbitrary polynomial $G(X) \in \mathbb{Z}_p[X]$ whose degree $d'$ divides $d$, and $E'$ is isomorphic to $\mathbb{Z}_p[X]/(G(X))$.
Working in such a subfield does not change at all the algebra for performing intra-slot data
movement. It only affects how data gets encoded and decoded in the slots.

3.3 Prime-power plaintext modulus


The plaintext modulus can also be of the form $P = p^r$, where $p$ is prime and $r > 1$. In this case, the factorization of $\Phi_m(X)$ modulo $p$ in (7) can be “lifted”, via a procedure known as Hensel lifting, to a corresponding factorization mod $P$. The number of factors and their degrees remain the same. In fact, the isomorphisms (8), (9), and (10) are all still valid (just replace $p$ by $P$ everywhere). Note that the subgroup $H$ that we use in these definitions is still the subgroup generated by $\bar{p} \in \mathbb{Z}_m^*$ (and not the subgroup generated by $\bar{P}$). The field $E$ is now a ring (actually, a $\mathbb{Z}_P$-algebra). We also have Galois automorphisms $\theta_j$ on $A_P$, just as in (11).
We still have a Frobenius automorphism $\sigma$ on $E$, corresponding to (12):
\[
\sigma : E \to E, \qquad f(\eta) \mapsto f(\eta^p) \quad (\text{for } f(X) \in \mathbb{Z}_P[X]). \tag{23}
\]
Note that this map sends $\eta$ to $\eta^p$ (and not $\eta^P$). (Also note that unlike the case when $r = 1$, it is not the case that $\sigma(\alpha) = \alpha^p$ for $\alpha \in E$.)
The Galois automorphism θp̄ still effectively applies the Frobenius automorphism to each slot,
and all of the techniques discussed in Section 3.1.2 still work without any modification.
If $r > 1$, then unlike in Section 3.2, HElib does not allow one to work in any subring of $E$ other than $E$ itself and $\mathbb{Z}_P$. (Currently, there are no compelling applications to do so, and the math for doing so is much more complicated.)

4 Secret keys and ciphertexts: basic structure and operations


Recall the cyclotomic ring A := Z[X]/(Φm (X)), and for q > 1, the ring Aq := A/(q).
Secret keys and ciphertexts in the BGV/CKKS cryptosystems are essentially vectors of elements
over A or Aq , and BGV/CKKS decryption is essentially an inner-product between them followed by
some rounding operations. HElib uses a more flexible structure, where ciphertext objects contain
a changing set of Aq -elements, and each element carries a descriptor of which secret-key element it
should multiply upon decryption. Below we use an abstract “index set I” to refer to the information
needed to match elements in the ciphertext to those in the secret key.
A secret-key object in HElib is a family of elements of A, indexed by some index set I, which
we write as S := {si }i∈I with each si ∈ A. A ciphertext object in HElib, relative to this secret
key, includes a corresponding family of elements of Aq for some integer q > 1 and relatively prime
to m, indexed by the same index set I, which we write as C := {c̄i }i∈I , with each c̄i ∈ Aq . We
sometimes refer to the family C as the enciphering family of this ciphertext. The integer q is
called the ciphertext modulus associated with this ciphertext. Note that the value of q is not
fixed, and may change over the course of a homomorphic computation.
For a given $\mathbb{Z}$-basis for $A$ (which will typically be either the standard basis or the powerful basis), an element $a \in A$ is called $q$-reduced on that basis if every coefficient of $a$ on that basis lies in the interval $[-q/2, q/2)$. For every $\bar{c} \in A_q$, there is a unique $c \in A$ such that $\bar{c} = [c \bmod q]$ and $c$ is $q$-reduced on that basis (this is just the usual division with remainder property over $\mathbb{Z}$). We call $c$ the canonical representative on that basis of $\bar{c}$.$^4$
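On a single coefficient, the canonical representative is just the remainder shifted into $[-q/2, q/2)$. A minimal Python sketch (the function name is hypothetical, and an element of $A$ is assumed to be given as a list of integer coefficients on the chosen basis):

```python
def canonical_rep(c, q):
    """Return the unique integer r with r = c (mod q) and -q/2 <= r < q/2."""
    r = c % q              # Python's % yields a value in [0, q) for q > 0
    if 2 * r >= q:
        r -= q
    return r

def canonical_rep_vec(coeffs, q):
    """Apply canonical_rep to every coefficient of an element on a fixed basis."""
    return [canonical_rep(c, q) for c in coeffs]
```

For even $q$ the value $-q/2$ is used rather than $+q/2$, matching the half-open interval above.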
In addition to the enciphering family {c̄i }i∈I , an HElib ciphertext ψ holds also some bookkeeping
information:

• a descriptor that identifies the secret key,

• an indicator of the ciphertext modulus q,

• the plaintext modulus $P$, which is a prime power of the form $P = p^r$ (and relatively prime to both $m$ and $q$),

• a correction factor $\kappa \in \mathbb{Z}_P^*$,

• and an upper bound $\epsilon$ on the noise (which is defined below).

Some quantities related to the process of decrypting ψ with the secret key {si }i∈I are:

• the pre-decryption of ψ is the canonical representative $e \in A$ on the powerful basis of
\[
\bar{e} := \sum_{i \in I} \bar{c}_i s_i \in A_q;\,{}^{5}
\]

• the decryption of ψ is $\tilde{e} \cdot \kappa^{-1} \in A_P$, where $\tilde{e} = [e \bmod P] \in A_P$;

• the noise of ψ is $\|e\|$.

It is convenient to define the capacity of such a ciphertext as $q/\epsilon$, which intuitively represents how much more noise can be tolerated before all information about the plaintext is lost.

4.1 Modulus switching


We describe the modulus-switching procedure, which is a crucial maintenance-type operation in the BGV/CKKS cryptosystems. Modulus switching converts a ciphertext relative to one ciphertext modulus $Q$ into an equivalent ciphertext (i.e., one that decrypts to the same plaintext), but with respect to a smaller ciphertext modulus $q < Q$, and whose noise is reduced by a factor of nearly $Q/q$. (This operation requires that both $Q$ and $q$ are coprime with the plaintext modulus $P$.)
Consider a ciphertext ψ, with enciphering family $C = \{\bar{c}_i\}_{i \in I}$, ciphertext modulus $Q$, plaintext modulus $P$, and a correction factor $\kappa \in \mathbb{Z}_P^*$, defined with respect to a secret key $S = \{s_i\}_{i \in I}$. Let $c_i \in A$ be the canonical representative of $\bar{c}_i$ on the standard basis for each $i \in I$.$^6$ Consider scaling down each coefficient (on the standard basis) in each $c_i$ by a $q/Q$ factor and rounding the resulting rational number to the nearest integer. Denoting the $A$-element so obtained by $a_i := \lceil \tfrac{q}{Q} \cdot c_i \rfloor \in A$, and the “rounding error element” $b_i := q c_i - Q a_i \in A$, we can write
\[
q c_i = Q a_i + b_i, \tag{24}
\]
where the coefficients of $b_i$ on the standard basis lie in the interval $[-Q/2, Q/2]$.
Footnote 4. The choice of interval $[-q/2, q/2)$ rather than $(-q/2, q/2]$ is quite arbitrary. In fact, in HElib, the ciphertext modulus $q$ is typically odd, in which case there is no difference at all.
Footnote 5. The choice of basis is somewhat arbitrary, but in HElib, the powerful basis is used here, rather than the standard basis, because of the tighter relationship between the canonical norm and the powerful basis infinity norm.
Footnote 6. The choice of $\mathbb{Z}$-basis here is somewhat arbitrary and may change in the future. In fact, HElib mod switches on the powerful basis in the bootstrapping routine.

For each $i \in I$, we construct $d_i \in A$ so that
\[
Q d_i \equiv b_i \pmod{P}
\]
and the coefficients of $d_i$ on the standard basis lie in the interval $[-P/2, P/2]$.

• If $P$ is odd, each coefficient of $d_i$ is uniquely determined.

• If $P$ is even, for some coefficients, we may have a choice between $-P/2$ and $P/2$ (note that $-P/2 \equiv P/2 \pmod{P}$); for such a coefficient, we choose the value that has the same sign as the corresponding coefficient of $b_i$ (or choose at random if the corresponding coefficient of $b_i$ is zero).

For each $i \in I$, we set
\[
c'_i := a_i + d_i \in A, \quad \text{and} \quad \bar{c}'_i := [c'_i \bmod q] \in A_q.
\]
The new ciphertext ψ′ consists of the enciphering family $\{\bar{c}'_i\}_{i \in I}$, and its correction factor $\kappa' \in \mathbb{Z}_P^*$ is set to
\[
\kappa' := [\tfrac{q}{Q} \bmod P] \cdot \kappa.
\]
Note that by construction, we have
\[
q c_i = Q a_i + b_i \equiv Q c'_i \pmod{P}. \tag{25}
\]
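The per-coefficient arithmetic behind (24) and (25) can be sketched in Python as follows. This is a minimal illustration on a single integer coefficient, assuming Python 3.8+ (for `pow` with a negative exponent) and $\gcd(Q, P) = 1$; the helper names are hypothetical and this is not HElib's implementation.

```python
def round_div(num, den):
    """Nearest integer to num/den for den > 0 (exact ties round up)."""
    return (2 * num + den) // (2 * den)

def sym_mod(x, P, sign_hint):
    """Representative of x mod P in [-P/2, P/2]; when P is even and the
    value is +-P/2, the sign follows sign_hint (the tie-breaking rule)."""
    r = x % P
    if 2 * r > P:
        r -= P
    elif 2 * r == P and sign_hint < 0:
        r = -r
    return r

def mod_switch_coeff(c, Q, q, P):
    """Switch one coefficient from modulus Q to q; returns (c_new, a, b, d)."""
    a = round_div(q * c, Q)                   # scaled and rounded: a = round(q*c/Q)
    b = q * c - Q * a                         # rounding error, |b| <= Q/2, as in (24)
    d = sym_mod(b * pow(Q, -1, P), P, b)      # solves Q*d = b (mod P)
    return a + d, a, b, d
```

The returned `c_new` satisfies $q c \equiv Q \cdot c_{\text{new}} \pmod P$, which is identity (25) on one coefficient.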

Let us denote the pre-decryption of ψ by $e \in A$, which means that
\[
e = \sum_i c_i s_i - Q f,
\]
where $f \in A$ and $e$ is $Q$-reduced on the powerful basis. Now let
\[
e' := \sum_i c'_i s_i - q f.
\]

It is evident from (25) that $q e \equiv Q e' \pmod{P}$. To show that ψ′ decrypts to the same plaintext as ψ, it suffices to show that $e'$ is itself $q$-reduced on the powerful basis. To do that, it suffices to show that $\|e'\|$ is sufficiently small.
To this end, observe that
\[
\begin{aligned}
e' &= \sum_i c'_i s_i - q f = \sum_i (a_i + d_i) s_i - q f = \sum_i \Bigl(\frac{q}{Q} c_i - b_i/Q + d_i\Bigr) s_i - q f \\
   &= \frac{q}{Q} e + \sum_i (d_i - b_i/Q) s_i = \frac{q}{Q} e + \sum_i \hat{e}_i s_i,
\end{aligned}
\]
where
\[
\hat{e}_i := d_i - b_i/Q
\]
for $i \in I$. In particular,
\[
\|e'\| \le \frac{q}{Q} \|e\| + \Bigl\| \sum_i \hat{e}_i s_i \Bigr\|. \tag{26}
\]

We call the first term in (26) the mod-switch scaled noise, and the second term the mod-switch added noise. Given upper bounds $\tau_i$ on $\|s_i\|$, we can bound the added noise by
\[
\Bigl\| \sum_i \hat{e}_i s_i \Bigr\| \le \sum_i \|\hat{e}_i\| \tau_i. \tag{27}
\]
Note that the values $\|\hat{e}_i\|$ can be computed explicitly.

Finally, if $\epsilon$ is an upper bound on the noise $\|e\|$ of ψ, let
\[
\epsilon' := \frac{q}{Q} \epsilon + \sum_i \|\hat{e}_i\| \tau_i. \tag{28}
\]
With $D_m$ as in Lemma 1, if $D_m \epsilon' < q/2$, then $\|\mathrm{pwfl}(e')\|_\infty < q/2$, and hence $e'$ is $q$-reduced on the powerful basis, as required. It follows that ψ′ decrypts to the same plaintext as ψ, and, moreover, $\epsilon'$ is an upper bound on the noise of ψ′.

4.1.1 Typical settings for Q and q


In HElib, except for mod switching that occurs during bootstrapping, the ciphertext modulus $Q$ is a multiple of $q$, both $Q$ and $q$ are odd, and $R := Q/q \in \mathbb{Z}$ is relatively prime to $q$. In equation (24), we see that since $q \mid Q$, we must have $q \mid b_i$. So we can rewrite (24) as
\[
c_i = R a_i + b'_i, \tag{29}
\]
where $b'_i = b_i/q$ and each coefficient of $b'_i$ lies in the interval $[-R/2, R/2]$.
We then have
\[
c'_i = a_i + d_i = \frac{c_i - b'_i}{R} + d_i.
\]
Let $\nu : A \to A_q$ be the natural map from $A$ to $A_q$ (which sends $a \in A$ to $[a \bmod q] \in A_q$), and let $\rho := [R \bmod q] \in \mathbb{Z}_q^*$. Then we have
\[
\bar{c}'_i = \nu(c_i - b'_i) \cdot \rho^{-1} + \nu(d_i) = \bigl(\nu(c_i) - \nu(b'_i - R d_i)\bigr) \cdot \rho^{-1} \in A_q.
\]
Recall that $q \mid Q$, and let $\nu' : A_Q \to A_q$ be the natural map from $A_Q$ to $A_q$ (which sends $[a \bmod Q] \in A_Q$ to $[a \bmod q] \in A_q$). Then we also have
\[
\bar{c}'_i = \bigl(\nu'(\bar{c}_i) - \nu(b'_i - R d_i)\bigr) \cdot \rho^{-1} \in A_q. \tag{30}
\]

The advantage of this formulation is that we do not have to explicitly compute ai , which allows
for certain optimizations that we shall discuss later.
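A quick way to see that $a_i$ is not needed is to compare, on a single coefficient, the direct definition $c' = a + d$ with the computation $(c - b' + R d) \cdot R^{-1} \bmod q$ that follows from (29). The Python sketch below does this under simplifying assumptions: $R$ is odd and coprime to both $q$ and $P$, Python 3.8+ is used, and the tie-breaking rule for even $P$ is replaced by a fixed choice. The function names are hypothetical.

```python
def sym_mod(x, n):
    """Representative of x mod n in (-n/2, n/2] (no special tie-breaking)."""
    r = x % n
    return r - n if 2 * r > n else r

def mod_switch_direct(c, R, q, P):
    """c' = a + d reduced mod q, with c = R*a + b' as in (29) and R*d = b' (mod P)."""
    b1 = sym_mod(c, R)                      # b' from (29)
    a = (c - b1) // R                       # exact division
    d = sym_mod(b1 * pow(R, -1, P), P)
    return (a + d) % q

def mod_switch_without_a(c_bar, R, q, P):
    """Same result computed as (c - b' + R*d) * R^{-1} mod q, never forming a.

    c_bar may be any integer representative of the residue mod Q = R*q.
    """
    b1 = sym_mod(c_bar, R)
    d = sym_mod(b1 * pow(R, -1, P), P)
    rho_inv = pow(R, -1, q)                 # inverse of rho = [R mod q]
    return ((c_bar - b1 + R * d) * rho_inv) % q
```

Because only residues of $c$ modulo $R$ and modulo $q$ enter the second function, it can be evaluated without ever lifting to the rounded quotient.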

4.1.2 On the coefficients of êi and c̄0i


Recall that $\hat{e}_i := d_i - b_i/Q$. We know that each coefficient of $b_i/Q$ lies in the interval $[-1/2, 1/2]$. Moreover, because of the procedure used to choose the coefficients of $d_i$ (specifically, the tie-breaking rule for $\pm P/2$ when $P$ is even), each coefficient of $\hat{e}_i$ lies in the interval $[-P/2, P/2]$.
For example, suppose that $P = 3$ and consider a single coefficient $z \in \mathbb{Q}$ of $\hat{e}_i$. We know that $z = z_1 - z_2$, where $z_1$ is the corresponding coefficient of $d_i$ and $z_2$ the corresponding coefficient of $b_i/Q$. We know that $z_1 \in \{-1, 0, 1\}$ and $z_2 \in [-1/2, 1/2]$, and so $z \in [-1.5, 1.5]$, as claimed.
As another example, suppose $P = 4$. We know that $z_1 \in \{-2, -1, 0, 1, 2\}$ and $z_2 \in [-1/2, 1/2]$. Moreover, we know that if $z_1 = \pm 2$, then the sign of $z_1$ agrees with the sign of $z_2$, which means that $z = z_1 - z_2 \in [-2, 2]$.
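The two examples can be checked exhaustively for small parameters. The following Python sketch enumerates every possible rounding error $b$ for one coefficient and computes $d - b/Q$ exactly with rational arithmetic. It assumes $Q$ is odd and coprime to $P$ and uses Python 3.8+; the function name is hypothetical.

```python
from fractions import Fraction

def e_hat_values(Q, P):
    """All values of d - b/Q over integer rounding errors b with |b| <= Q/2."""
    out = []
    for b in range(-(Q // 2), Q // 2 + 1):
        r = (b * pow(Q, -1, P)) % P        # Q*d = b (mod P), r in [0, P)
        if 2 * r > P:
            d = r - P
        elif 2 * r == P:                   # tie for even P: d takes the sign of b
            d = r if b > 0 else -r
        else:
            d = r
        out.append(d - Fraction(b, Q))
    return out
```

Dropping the sign rule in the tie case (always taking $+P/2$) makes some values exceed $P/2$, which is why the rule matters for even $P$.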

Assuming that the coefficients of $\bar{c}_i$ are independently and uniformly distributed over $\mathbb{Z}_Q$, we can also say something about the distributions of the coefficients of $\bar{c}'_i$ and $\hat{e}_i$.
First, consider the distribution of the coefficients of $\bar{c}'_i$.

• For the settings in Section 4.1.1, it is easy to see that the coefficients of $\bar{c}'_i$ are independently and uniformly distributed over $\mathbb{Z}_q$. In (30), each coefficient $u$ of $\nu'(\bar{c}_i)$ is uniformly distributed over $\mathbb{Z}_q$. Moreover, the corresponding coefficient $v$ of $\nu(b'_i - R d_i)$ depends only on the corresponding coefficient of $b_i$, which by the Chinese Remainder Theorem, is independent of $u$. It follows that $w = (u - v)\rho^{-1}$ is uniformly distributed over $\mathbb{Z}_q$.

• More generally (and, in particular, in the mod switching that occurs during bootstrapping), we have $c'_i = a_i + d_i$. In this case, assuming $Q \gg q$, the distribution of each coefficient $u$ of $[a_i \bmod q] \in A_q$ will be close to the uniform distribution over $\mathbb{Z}_q$. Moreover, assuming $Q/q \gg P$, then conditioned on a fixed value of $u$, the distribution of the corresponding coefficient $v$ of $d_i$ is close to the symmetric distribution mod $P$ (see Section 2.6.3). Thus, $u$ and $v$ can be reasonably modeled as independent random variables. It follows that the coefficients of $\bar{c}'_i$ are independently distributed, and assuming that $Q/q \gg P$, each coefficient of $\bar{c}'_i$ has a distribution that is close to the uniform distribution over $\mathbb{Z}_q$.

Thus, in either case, we see that the coefficients of $\bar{c}'_i$ are independently distributed; in the first case, each coefficient is uniformly distributed over $\mathbb{Z}_q$; in the second case, each coefficient has a distribution that is close to the uniform distribution over $\mathbb{Z}_q$, assuming $Q/q \gg P$.
Second, consider the distribution of the coefficients of $\hat{e}_i$. Assume that $Q$ is odd.$^7$ Let $t = \gcd(Q, q)$, and set $\tilde{Q} := Q/t$ and $\tilde{q} := q/t$, so that $\gcd(\tilde{Q}, \tilde{q}) = 1$.$^8$ In (24), we have $t \mid b_i$, and setting $\tilde{b}_i := b_i/t$, we can rewrite (24) as
\[
\tilde{q} c_i = \tilde{Q} a_i + \tilde{b}_i.
\]
It follows that each coefficient of $\tilde{b}_i$ is symmetrically distributed mod $\tilde{Q}$. From this, it follows that if $\tilde{Q} \gg P$, each coefficient of $d_i$ is close to the symmetric distribution mod $P$, and that $\hat{e}_i$ can be reasonably modeled by the uniform distribution over $[-P/2, P/2]$.

4.2 Scaling up
The scaling-up operation is in some sense “the opposite of mod switching”, in that it converts a
ciphertext modulo q into another ciphertext modulo a larger modulus Q (which has to be a multiple
of q), with the noise growing by a factor Q/q.
Suppose we have a ciphertext with:

• a plaintext modulus $P = p^r$,

• a ciphertext modulus $q$,

• an enciphering family $C$, relative to a corresponding secret key $S$,

• a correction factor $\kappa \in \mathbb{Z}_P^*$,

• a bound $\epsilon$ on the noise.


Footnote 7. This is always the case in HElib and avoids some corner cases.
Footnote 8. When bootstrapping, we usually have $t = 1$, and when not bootstrapping, we have $t = q$.

Let $Q := Rq$, where $R$ is a positive integer, not divisible by $p$.
We can define the scale-by-$R$ map
\[
\mathrm{scale}_R : \mathbb{Z}_q \to \mathbb{Z}_Q, \qquad [a \bmod q] \mapsto [Ra \bmod Q] \quad (\text{for } a \in \mathbb{Z}).
\]
This map is well defined. Moreover, it extends naturally to a map from $A_q$ to $A_Q$, applying it coordinate-wise on any $\mathbb{Z}$-basis for $A$ (the choice does not matter). We can further extend this map element-wise to families of elements of $A_q$.
Using this map, we can define a new ciphertext with:

• plaintext modulus $P = p^r$ (same as before),

• ciphertext modulus $Q$,

• the enciphering family $\mathrm{scale}_R(C)$, which encrypts the same plaintext as the original ciphertext relative to $S$ using correction factor $\kappa' := [R \bmod P] \cdot \kappa$,

• noise bound $\epsilon' := R\epsilon$.
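To see why the correction factor must be multiplied by $R$, it helps to run the scaling-up step on a degenerate toy instance. The Python sketch below takes $A = \mathbb{Z}$ (scalars instead of ring elements) and a secret key $(1, s)$ with integer $s$. This is purely illustrative: all names are hypothetical, and Python 3.8+ is assumed for modular inverses via `pow`.

```python
def canonical_rep(c, q):
    """Representative of c mod q in [-q/2, q/2)."""
    r = c % q
    return r - q if 2 * r >= q else r

def decrypt(c0, c1, s, q, P, kappa):
    """Toy decryption over A = Z with secret key (1, s)."""
    e = canonical_rep(c0 + c1 * s, q)          # pre-decryption
    return (e * pow(kappa, -1, P)) % P         # apply kappa^{-1} mod P

def scale_up(c0, c1, q, P, kappa, R):
    """Apply scale_R to both components and update the correction factor."""
    Q = R * q
    return (R * c0) % Q, (R * c1) % Q, Q, (R * kappa) % P
```

Usage: with $q = 1009$, $P = 9$, $\kappa = 2$, $s = 1$ and pre-decryption $e = 32$ (which encodes the plaintext 7), scaling by $R = 5$ gives pre-decryption $160$ and $\kappa' = 1$, and both ciphertexts decrypt to 7.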

4.3 Key switching (or re-linearization)


Another important maintenance-type operation in BGV/CKKS is key-switching, converting a ci-
phertext relative to one key S 0 into an equivalent ciphertext relative to another secret key S. (This
operation needs access to additional gadgets (called key-switching matrices), which would typically
be included with the public key.)
Suppose we have a ciphertext ψ defined with respect to a secret key $S' := \{s_i\}_{i \in I}$. Suppose that the ciphertext modulus is $q$ and the plaintext modulus is $P = p^r$. Such a ciphertext consists of an enciphering family $\{\bar{c}_i\}_{i \in I}$, with each $\bar{c}_i \in A_q$, along with a correction factor $\kappa \in \mathbb{Z}_P^*$.

• The pre-decryption of ψ is the canonical representative $e \in A$ of $\sum_i \bar{c}_i s_i \in A_q$ on the powerful basis.

• The decryption of ψ is $\tilde{e} \cdot \kappa^{-1} \in A_P$, where $\tilde{e} = [e \bmod P] \in A_P$.

• The noise of ψ is $\|e\|$.

The process of key switching, or re-linearization, allows us to compute a new ciphertext $(\bar{c}'_0, \bar{c}'_1)$ that decrypts to the same plaintext under a different secret key of the form $S := (1, s)$. To do this, we will need access to so-called key-switching matrices, whose structure is described below.
We shall always ensure that the ciphertext modulus $q$ can be factored as $q = \prod_{j=1}^{\ell} D_j$, where the “digits” $D_j$ are coprime and odd. For $j = 1, \ldots, \ell$, let $D_j^*$ be the product of all the digits up to but not including $D_j$,
\[
D_j^* := D_1 \cdots D_{j-1}.
\]
For $i \in I$, let $c_i \in A$ be the canonical representative of $\bar{c}_i \in A_q$ on the standard basis,$^9$ so each coefficient of $c_i$ on the standard basis lies in the interval $[-q/2, q/2)$.
Recall that $S := (1, s)$, and let $T \subseteq I$ be the set of “trivial” indices $i$ such that $s_i \in \{1, s\}$.$^{10}$ The indices $i \in T$ will be treated in a special, simplified manner (see below). Consider $i \in I \setminus T$. We decompose each coefficient of $c_i$ into “digits”, using the mixed-radix system $D_1, \ldots, D_\ell$, so that
\[
c_i = \sum_{j=1}^{\ell} D_j^* c_{ij},
\]
where each coefficient of $c_{ij} \in A$ lies in the interval $(-D_j/2, +D_j/2)$.
Footnote 9. The choice of $\mathbb{Z}$-basis here is somewhat arbitrary.
Footnote 10. In the computation, the actual values $s_i$ are never used, but it is enough to know when (by construction) $s_i = 1$ or $s_i = s$.
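On a single coefficient, the digit decomposition is a balanced mixed-radix expansion. A minimal Python sketch, assuming all digits $D_j$ are odd and the input lies in $[-q/2, q/2)$ (function names are hypothetical):

```python
def sym_mod_odd(x, D):
    """Representative of x mod D in (-D/2, D/2), for odd D."""
    r = x % D
    return r - D if 2 * r > D else r

def digit_decompose(c, digits):
    """Balanced mixed-radix digits c_j with c = sum_j D*_j * c_j."""
    out = []
    for D in digits:
        cj = sym_mod_odd(c, D)     # lowest remaining digit, |cj| < D/2
        out.append(cj)
        c = (c - cj) // D          # exact division; continue with the quotient
    if c != 0:
        raise ValueError("input must lie in [-q/2, q/2)")
    return out

def recompose(cs, digits):
    """Inverse of digit_decompose: sum_j D*_j * c_j with D*_j = D_1 ... D_{j-1}."""
    total, d_star = 0, 1
    for cj, D in zip(cs, digits):
        total += d_star * cj
        d_star *= D
    return total
```

For example, with digits $(3, 5, 7)$ and $q = 105$, every integer in $[-52, 52]$ has exactly one such expansion.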
The key-switching matrix for $s_i \mapsto s$ is a $2 \times \ell$ matrix whose $j$th column is essentially an encryption of $R D_j^* s_i$ under $s$, but with respect to a larger ciphertext modulus of the form $Q = Rq$, where $R$ is also odd and coprime to $q$. More precisely, for $j = 1, \ldots, \ell$, the $j$th column consists of two elements $a^{(0)}_{ij}, a^{(1)}_{ij} \in A$ such that
\[
a^{(0)}_{ij} + a^{(1)}_{ij} s \equiv R D_j^* s_i + P e_{ij} \pmod{Q}.
\]
Using this key-switching matrix, we can compute $(c^{(0)}_i, c^{(1)}_i) \in A^2$ such that
\[
(c^{(0)}_i, c^{(1)}_i) \equiv \sum_{j=1}^{\ell} \bigl(c_{ij} a^{(0)}_{ij},\; c_{ij} a^{(1)}_{ij}\bigr) \pmod{Q}.
\]

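Working mod $Q$, observe that
\[
\begin{aligned}
c^{(0)}_i + c^{(1)}_i s &= \sum_j \bigl(a^{(0)}_{ij} + a^{(1)}_{ij} s\bigr) c_{ij} \\
&\equiv \sum_j \bigl(R D_j^* s_i + P e_{ij}\bigr) c_{ij} \pmod{Q} \\
&= R \Bigl(\sum_j D_j^* c_{ij}\Bigr) s_i + P \sum_j e_{ij} c_{ij} \\
&= R c_i s_i + P \sum_j e_{ij} c_{ij}.
\end{aligned}
\]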

Now consider the “trivial” indices $i \in T$. If $s_i = 1$, we define
\[
(c^{(0)}_i, c^{(1)}_i) = (R c_i, 0).
\]
If $s_i = s$, we define
\[
(c^{(0)}_i, c^{(1)}_i) = (0, R c_i).
\]
Finally, we compute $(c'_0, c'_1) \in A^2$ such that
\[
(c'_0, c'_1) \equiv \sum_{i \in I} (c^{(0)}_i, c^{(1)}_i) \pmod{Q}.
\]

From the above calculations, we see that
\[
c'_0 + c'_1 s \equiv R e + P \sum_{i,j} e_{ij} c_{ij} \pmod{Q},
\]
where in the sum over $i, j$, index $i$ ranges over $I \setminus T$. Thus, if $(\bar{c}'_0, \bar{c}'_1)$ is the image of $(c'_0, c'_1)$ in $A_Q^2$, and we set the correction factor $\kappa' := [R \bmod P] \cdot \kappa \in \mathbb{Z}_P^*$, we get a ciphertext ψ′ with ciphertext modulus $Q$ that decrypts to the same thing as the original ciphertext, provided that the noise in ψ′ is not too large relative to $Q$. If the noise $\|e\|$ in the original ciphertext ψ is bounded by $\epsilon$, and we have bounds $\epsilon_{ij}$ on the canonical norms $\|e_{ij}\|$, then the noise in ψ′ is bounded by
\[
\epsilon' := R\epsilon + P \sum_{i,j} \epsilon_{ij} \|c_{ij}\|. \tag{31}
\]
The first term $R\epsilon$ is the key-switch scaled noise, and the second term $P \sum_{i,j} \epsilon_{ij} \|c_{ij}\|$ is the key-switch added noise. Parameters are typically selected so that the key-switch added noise is dominated by the key-switch scaled noise. See Section 5.3.4 for more details.
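The core congruence of key switching, $c^{(0)}_i + c^{(1)}_i s \equiv R c_i s_i + P \sum_j e_{ij} c_{ij} \pmod Q$, can be checked on a scalar toy instance. The Python sketch below again takes $A = \mathbb{Z}$, so it says nothing about noise growth in the real ring; the matrix generation, the noise range and all names are assumptions made for illustration only.

```python
import random

def sym_mod_odd(x, D):
    """Representative of x mod D in (-D/2, D/2), for odd D."""
    r = x % D
    return r - D if 2 * r > D else r

def digit_decompose(c, digits):
    """Balanced mixed-radix digits c_j with c = sum_j D*_j * c_j."""
    out = []
    for D in digits:
        cj = sym_mod_odd(c, D)
        out.append(cj)
        c = (c - cj) // D
    return out

def make_ks_matrix(s_i, s, digits, R, P, rng):
    """Columns (a0_j, a1_j) with a0_j + a1_j*s = R*D*_j*s_i + P*e_j (mod Q)."""
    q = 1
    for D in digits:
        q *= D
    Q = R * q
    cols, errs, d_star = [], [], 1
    for D in digits:
        a1 = rng.randrange(Q)                  # random second component
        e = rng.randint(-3, 3)                 # small toy noise e_ij
        a0 = (R * d_star * s_i + P * e - a1 * s) % Q
        cols.append((a0, a1))
        errs.append(e)
        d_star *= D
    return cols, errs, Q

def key_switch_part(c_i, cols, digits, Q):
    """Compute (c^(0)_i, c^(1)_i) from the digits of c_i and the matrix columns."""
    cs = digit_decompose(c_i, digits)
    c0 = sum(cj * a0 for cj, (a0, _) in zip(cs, cols)) % Q
    c1 = sum(cj * a1 for cj, (_, a1) in zip(cs, cols)) % Q
    return c0, c1, cs
```

The digits $c_{ij}$ are small, so the added term $P \sum_j e_{ij} c_{ij}$ stays small even though the columns themselves are uniformly random mod $Q$.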

4.4 Homomorphic addition


Suppose we are given two ciphertexts to add, which, for $\ell = 1, 2$, comprise the following:

• a plaintext modulus $P_\ell = p^{r_\ell}$,

• a ciphertext modulus $q_\ell$,

• an enciphering family $C_\ell$ relative to a corresponding secret key $S_\ell$,

• a correction factor $\kappa_\ell \in \mathbb{Z}_{P_\ell}^*$,

• a bound $\epsilon_\ell$ on the noise.

Before we can add these two ciphertexts, we have to adjust them so that the plaintext moduli,
ciphertext moduli, and correction factors match.

1. First, we make the plaintext moduli match by making them both equal to $P := \gcd(P_1, P_2) = p^{\min(r_1, r_2)}$.

2. Second, we make the ciphertext moduli match by making them both equal to $Q := \mathrm{lcm}(q_1, q_2)$. To do this, we apply the up-scaling procedure in Section 4.2.

3. Third, to make the correction factors match, we choose integers $e_1, e_2$, relatively prime to $P$, such that
\[
[e_1 \bmod P] \cdot \kappa_1 = \kappa = [e_2 \bmod P] \cdot \kappa_2.
\]
The values $e_1$ and $e_2$ are chosen using a heuristic procedure, based on the extended Euclidean algorithm, that attempts to make $|e_1|\epsilon_1 + |e_2|\epsilon_2$ as small as possible.
Specifically, we compute an integer $\mathit{ratio} \in \mathbb{Z}$ such that
\[
[\mathit{ratio} \bmod P] = \kappa_2/\kappa_1 \quad \text{and} \quad \mathit{ratio} \in [P],
\]
and then run the extended Euclidean algorithm on inputs $P$ and $\mathit{ratio}$. This generates a list of pairs of integers $(e_1^{(i)}, e_2^{(i)})$, where
\[
e_1^{(i)} \equiv e_2^{(i)} \cdot \mathit{ratio} \pmod{P}, \quad e_1^{(i)}, e_2^{(i)} \in [-P/2, +P/2], \quad \text{and} \quad \gcd(e_1^{(i)}, P) = \gcd(e_2^{(i)}, P) = 1.
\]
Among these, a pair $(e_1^{(i)}, e_2^{(i)})$ that minimizes the value $|e_1^{(i)}|\epsilon_1 + |e_2^{(i)}|\epsilon_2$ is chosen, and we set $(e_1, e_2) := (e_1^{(i)}, e_2^{(i)})$.

Then, for $\ell = 1, 2$, we replace

• $C_\ell$ by $e_\ell C_\ell$,

• $\kappa_\ell$ by $\kappa = [e_\ell \bmod P] \cdot \kappa_\ell$, and

• $\epsilon_\ell$ by $|e_\ell| \epsilon_\ell$.
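One plausible way to realize the search for $(e_1, e_2)$ is sketched below in Python. It walks the remainder sequence of the extended Euclidean algorithm on $(P, \mathit{ratio})$, where each step yields a pair $(r, t)$ with $r \equiv t \cdot \mathit{ratio} \pmod P$, and keeps the admissible pair of least cost. This is an assumption-laden illustration of the idea described above, not a transcription of HElib's heuristic; the function names are hypothetical and Python 3.8+ is assumed.

```python
from math import gcd

def sym_mod(x, P):
    """Representative of x mod P in (-P/2, P/2]."""
    r = x % P
    return r - P if 2 * r > P else r

def match_correction_factors(kappa1, kappa2, P, eps1, eps2):
    """Pick (e1, e2), both coprime to P, with e1*kappa1 = e2*kappa2 (mod P),
    trying to keep |e1|*eps1 + |e2|*eps2 small."""
    ratio = (kappa2 * pow(kappa1, -1, P)) % P
    r_prev, r_cur = P, ratio
    t_prev, t_cur = 0, 1                   # invariant: r = t * ratio (mod P)
    best = None
    while r_cur != 0:
        e1, e2 = sym_mod(r_cur, P), sym_mod(t_cur, P)
        if gcd(e1, P) == 1 and gcd(e2, P) == 1:
            cost = abs(e1) * eps1 + abs(e2) * eps2
            if best is None or cost < best[0]:
                best = (cost, e1, e2)
        k = r_prev // r_cur
        r_prev, r_cur = r_cur, r_prev - k * r_cur
        t_prev, t_cur = t_cur, t_prev - k * t_cur
    return best[1], best[2]
```

The first candidate, $(\mathit{ratio}, 1)$ after symmetric reduction, is always admissible because $\mathit{ratio}$ is a unit mod $P$, so the function always returns a pair.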

So now both ciphertexts have the same plaintext modulus $P$, the same ciphertext modulus $Q$, and the same correction factor $\kappa$. Suppose that
\[
S_1 = \{s_i\}_{i \in I} \quad \text{and} \quad S_2 = \{s_j\}_{j \in J}.
\]
Note that we are assuming that the secret keys for the two ciphertexts are indexed in a consistent way, so that if two indices are equal, then the components themselves are equal. The secret key for the resulting ciphertext is the union of the two keys,
\[
S := \{s_k\}_{k \in K}, \quad \text{where } K := I \cup J.
\]
Now suppose that
\[
C_1 = \{\bar{c}_i\}_{i \in I} \quad \text{and} \quad C_2 = \{\bar{c}'_j\}_{j \in J}.
\]
Then the enciphering family for the resulting ciphertext is
\[
C := \{\bar{c}''_k\}_{k \in K},
\]
where
\[
\bar{c}''_k =
\begin{cases}
\bar{c}_k & \text{for } k \in I \setminus J, \\
\bar{c}'_k & \text{for } k \in J \setminus I, \\
\bar{c}_k + \bar{c}'_k & \text{for } k \in I \cap J.
\end{cases}
\]
The noise bound in the resulting ciphertext is $\epsilon := \epsilon_1 + \epsilon_2$. The resulting ciphertext decrypts to the sum of the decryptions of the two given ciphertexts (with respect to the new plaintext modulus $P$).
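The index-set bookkeeping for addition maps naturally onto dictionaries keyed by the secret-key index. The Python sketch below uses scalars mod $Q$ in place of elements of $A_Q$ and string labels as indices; both choices, and the function names, are assumptions made only for illustration.

```python
def add_families(C1, C2, Q):
    """Merge two enciphering families given as {index: value mod Q} dicts.

    Indices present in only one family are copied; shared indices are added.
    """
    out = {}
    for k in C1.keys() | C2.keys():
        out[k] = (C1.get(k, 0) + C2.get(k, 0)) % Q
    return out

def inner_product(C, S, Q):
    """sum_k C[k] * S[k] mod Q, using only the indices present in C."""
    return sum(C[k] * S[k] for k in C) % Q
```

Because the inner product is linear, the merged family's inner product with the union key equals the sum of the two original inner products mod $Q$.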

4.5 Homomorphic multiplication


Suppose we are given two ciphertexts to multiply, which, for $\ell = 1, 2$, comprise the following:

• a plaintext modulus $P_\ell = p^{r_\ell}$,

• a ciphertext modulus $q_\ell$,

• an enciphering family $C_\ell$ relative to a corresponding secret key $S_\ell$,

• a correction factor $\kappa_\ell \in \mathbb{Z}_{P_\ell}^*$,

• a bound $\epsilon_\ell$ on the noise.

Before we can multiply two ciphertexts, we have to adjust them so that the plaintext moduli and ciphertext moduli match.

1. First, we make the plaintext moduli match by making them both equal to $P := \gcd(P_1, P_2) = p^{\min(r_1, r_2)}$.

2. Second, we make the ciphertext moduli match by applying appropriate upscaling and mod
switching to bring them to a common ciphertext modulus Q. In selecting Q, an attempt is
made to reduce the noise in each ciphertext somewhat. See Section 5.3.2 for details.

So now both ciphertexts have the same plaintext modulus $P$ and the same ciphertext modulus $Q$. Suppose that
\[
S_1 = \{s_i\}_{i \in I} \quad \text{and} \quad S_2 = \{s_j\}_{j \in J}.
\]
Note that we are assuming that the secret keys for the two ciphertexts are indexed in a consistent way, so that if two indices are equal, then the components themselves are equal. The secret key for the resulting ciphertext is
\[
S := \{s_i s_j\}_{(i,j) \in I \times J}.
\]
Now suppose that
\[
C_1 = \{\bar{c}_i\}_{i \in I} \quad \text{and} \quad C_2 = \{\bar{c}'_j\}_{j \in J}.
\]
The enciphering family of the resulting ciphertext is
\[
C := \{\bar{c}_i \bar{c}'_j\}_{(i,j) \in I \times J}.
\]
The correction factor of the resulting ciphertext is $\kappa := \kappa_1 \kappa_2$. The noise bound of the resulting ciphertext is $\epsilon := \epsilon_1 \epsilon_2$.
Note that if there are known identities among the components $s_i s_j$ of $S$, then identical secret key components may be replaced by a single component, and the corresponding components of $C$ are added together to form a single component.

Example. Suppose the input ciphertexts are both defined with respect to a secret key of the form $(1, s)$. Let $C_1 = (\bar{c}_0, \bar{c}_1)$ and $C_2 = (\bar{c}'_0, \bar{c}'_1)$. The product ciphertext is defined with respect to the secret key $(1, s, s^2)$, and its enciphering tuple is $(\bar{c}_0 \bar{c}'_0,\; \bar{c}_0 \bar{c}'_1 + \bar{c}_1 \bar{c}'_0,\; \bar{c}_1 \bar{c}'_1)$. After such a multiplication, if a key-switching matrix for $s^2 \mapsto s$ is available, then the product ciphertext can be key-switched back to a ciphertext relative to the secret key $(1, s)$.
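The example can be verified numerically in a small ring. The Python sketch below works in $\mathbb{Z}_Q[X]/(X^n + 1)$, which is $A_Q$ for the power-of-two case $m = 2n$, and checks that the three-component product has inner product with $(1, s, s^2)$ equal to the product of the two original inner products. The quadratic-time multiplication and all names are illustrative assumptions, not HElib's representation.

```python
def polymul(a, b, Q):
    """Product in Z_Q[X]/(X^n + 1) (negacyclic convolution), n = len(a)."""
    n = len(a)
    out = [0] * n
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            k = i + j
            if k < n:
                out[k] = (out[k] + ai * bj) % Q
            else:                       # X^n = -1, so wrap with a sign flip
                out[k - n] = (out[k - n] - ai * bj) % Q
    return out

def polyadd(a, b, Q):
    return [(x + y) % Q for x, y in zip(a, b)]

def multiply(ct1, ct2, Q):
    """Tensor two ciphertexts over key (1, s) into one over key (1, s, s^2)."""
    (c0, c1), (d0, d1) = ct1, ct2
    return (polymul(c0, d0, Q),
            polyadd(polymul(c0, d1, Q), polymul(c1, d0, Q), Q),
            polymul(c1, d1, Q))

def inner(ct, key, Q):
    """sum_i ct[i] * key[i] in the ring."""
    acc = [0] * len(key[0])
    for c, s in zip(ct, key):
        acc = polyadd(acc, polymul(c, s, Q), Q)
    return acc
```

The cross terms for the index pairs $(0, 1)$ and $(1, 0)$ both multiply $s$, which is why they are added into a single middle component.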

4.6 Homomorphic Galois automorphisms


Suppose we are given a ciphertext ψ, which comprises

• a plaintext modulus $P = p^r$,

• a ciphertext modulus $q$,

• an enciphering family $C$ relative to a corresponding secret key $S$,

• a correction factor $\kappa \in \mathbb{Z}_P^*$,

• a bound $\epsilon$ on the noise.

Now suppose we want to homomorphically apply a Galois automorphism $\theta_j$, where $j \in \mathbb{Z}_m^*$ (see Section 2.7), to ψ.
Suppose that $C = \{\bar{c}_i\}_{i \in I}$ and $S = \{s_i\}_{i \in I}$. The enciphering tuple for the resulting ciphertext ψ′ is $C' = \{\theta_j(\bar{c}_i)\}_{i \in I}$. The secret key for ψ′ is $S' = \{\theta_j(s_i)\}_{i \in I}$. The plaintext modulus, ciphertext modulus, correction factor, and noise bound for ψ′ are the same as for ψ. One can verify that if ψ decrypts to $\bar{e} \in A_P$, then ψ′ decrypts to $\theta_j(\bar{e}) \in A_P$.
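Since $\theta_j$ is a ring automorphism, applying it to every ciphertext component and every key component commutes with the inner product. The Python sketch below checks this in $\mathbb{Z}_Q[X]/(X^n + 1)$ (the power-of-two case $m = 2n$, where $\theta_j$ sends $X$ to $X^j$ for odd $j$). It is an illustration with hypothetical names, not HElib's implementation.

```python
def polymul(a, b, Q):
    """Product in Z_Q[X]/(X^n + 1), n = len(a)."""
    n = len(a)
    out = [0] * n
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            k = i + j
            if k < n:
                out[k] = (out[k] + ai * bj) % Q
            else:
                out[k - n] = (out[k - n] - ai * bj) % Q
    return out

def polyadd(a, b, Q):
    return [(x + y) % Q for x, y in zip(a, b)]

def apply_galois(a, j, Q):
    """theta_j on Z_Q[X]/(X^n + 1): X -> X^j, for odd j (here m = 2n)."""
    n = len(a)
    out = [0] * n
    for i, ai in enumerate(a):
        k = (i * j) % (2 * n)           # X^(2n) = 1
        if k < n:
            out[k] = (out[k] + ai) % Q
        else:                           # X^n = -1
            out[k - n] = (out[k - n] - ai) % Q
    return out
```

The identity $\theta_j(c_0 + c_1 s) = \theta_j(c_0) + \theta_j(c_1)\,\theta_j(s)$ is exactly why the new ciphertext must be interpreted relative to the key $(1, \theta_j(s))$ until it is key-switched back.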

4.7 Key generation and encryption
4.7.1 Secret key generation
A generated secret key is always in the canonical form $(1, s)$, where $s \in A$. Recall that $A = \mathbb{Z}[x]$, where $x := [X \bmod \Phi_m(X)]$ is the image of the indeterminate $X$ in $A$.
Define
\[
\hat{m} :=
\begin{cases}
\phi(m), & \text{if } m \text{ is a power of two}, \\
m, & \text{otherwise}.
\end{cases} \tag{32}
\]
The value $s \in A$, with coefficients $0, \pm 1$, is generated using one of two methods, depending on an application-defined parameter.

4.7.1.1 Bounded Hamming weight


In this method, an application-defined parameter $h$ is specified. The value $h$ is a “Hamming weight bound”. A random subset $I \subseteq [\hat{m}]$ of cardinality $h$ is chosen. Then
\[
s := \sum_{i \in I} a_i x^i,
\]
where each $a_i$ is chosen at random from the set $\{\pm 1\}$. A bound $B_{sk}$ is computed such that $\|s\| \le B_{sk}$ with probability at least 1/2, and the above procedure for generating $s$ is repeated until $\|s\| \le B_{sk}$.

4.7.1.2 Unbounded Hamming weight


A subset $I \subseteq [\hat{m}]$ is chosen, where each index $i \in [\hat{m}]$ is selected with probability
\[
\alpha := \frac{\phi(m)}{2\hat{m}}.
\]
Then
\[
s := \sum_{i \in I} a_i x^i,
\]
where each $a_i$ is chosen at random from the set $\{\pm 1\}$. A bound $B_{sk}$ is computed such that $\|s\| \le B_{sk}$ with probability at least 1/2, and the above procedure for generating $s$ is repeated until $\|s\| \le B_{sk}$.

The only difference between the two methods is in the selection of the set of indices I and in
the value of the bound Bsk .
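The two sampling methods and the surrounding rejection loop can be sketched as follows in Python. The sketch works on the coefficient vector $(a_0, \ldots, a_{\hat{m}-1})$ only: computing the canonical norm and reducing modulo $\Phi_m(X)$ are left to the caller, the `norm` and `bound` arguments are placeholders, and `random` is used for brevity where a real implementation would need a cryptographic generator. All names are hypothetical.

```python
import random

def sample_bounded_hwt(m_hat, h, rng):
    """Choose exactly h positions in [m_hat] and give each a random sign."""
    s = [0] * m_hat
    for i in rng.sample(range(m_hat), h):
        s[i] = rng.choice((-1, 1))
    return s

def sample_unbounded_hwt(m_hat, phi_m, rng):
    """Select each position independently with probability phi(m)/(2*m_hat)."""
    alpha = phi_m / (2 * m_hat)
    return [rng.choice((-1, 1)) if rng.random() < alpha else 0
            for _ in range(m_hat)]

def sample_until(sampler, norm, bound):
    """Rejection loop: resample until norm(s) <= bound."""
    while True:
        s = sampler()
        if norm(s) <= bound:
            return s
```

When $m$ is a power of two, $\hat{m} = \phi(m)$ and the selection probability is exactly 1/2, so the unbounded method yields a uniformly random ternary pattern with zero twice as likely as each sign.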

4.7.1.3 Establishing the bounds Bsk


For the bounds $B_{sk}$, we use the heuristic analysis in Section 2.6.1. We can apply that analysis as follows. For the bounded Hamming weight method, we may apply (2) and (3) with $\sigma^2 = h$, so that the bound $B_{sk}$ is estimated as
\[
B_{sk} = \sqrt{h \log(\phi(m))}. \tag{33}
\]
For the unbounded Hamming weight method, we may apply (2) and (3) with $\sigma^2 = \phi(m)/2$, so that the bound $B_{sk}$ is estimated as
\[
B_{sk} = \sqrt{\phi(m) \log(\phi(m))/2}. \tag{34}
\]
The bound $B_{sk}$ is called the secret-key noise bound, and is stored with the secret key, as well as with the corresponding public key.
These bounds have also been experimentally validated to ensure that the probability $\Pr[\|s\| \le B_{sk}]$ for a randomly sampled $s$ is at least roughly 1/2. One advantage of generating $s$ via this type of “rejection sampling”, rather than just generating a single $s$, is that we can use a smaller bound $B_{sk}$, rather than a significantly larger high-probability bound $B_{sk}$. Another advantage is that we are guaranteed that $\|s\| \le B_{sk}$ with probability 1, rather than with high probability. The disadvantage is that we lose, essentially, one bit of security.

4.7.2 Public key generation


Let (1, s) be a secret key as generated above, where s ∈ A.
When a public key is generated, a plaintext modulus $P = p^r$ is specified. A public key essentially comprises a ciphertext that encrypts zero (modulo $P$), relative to the secret key $(1, s)$. An appropriate ciphertext modulus $q$ is chosen (see below in Section 5.3.1).

• First, we generate a random element $\bar{c}^*_1 \in A_q$.

• Second, we generate a random element $e^* \in A$ with small norm and Gaussian coefficients, in a manner to be described below (see Section 4.7.3).

• Third, we set $\bar{c}^*_0 := P e^* - s \bar{c}^*_1 \in A_q$.

One can see that $\bar{c}^*_0 + \bar{c}^*_1 s = [P e^* \bmod q]$. That is, $(\bar{c}^*_0, \bar{c}^*_1)$ is an encryption of zero with respect to the secret key $(1, s)$. We also set
\[
B_{pk} = P \cdot B_{gauss},
\]
where $B_{gauss}$ is a bound on $\|e^*\|$ (see Section 4.7.3 below). The bound $B_{pk}$ is called the public-key noise bound, and is stored with the public key, along with the specified plaintext modulus $P$.

4.7.3 Generating e with a Gaussian distribution


Let $\sigma_0$ be a given standard deviation ($\sigma_0 = 3.2$ by default). Let $\hat{\sigma}_0 := \sigma_0$, if $m$ is a power of two, and $\hat{\sigma}_0 := \sqrt{m} \cdot \sigma_0$, otherwise. Let $\hat{m}$ be defined as in (32). Let $I := [\hat{m}]$. For each $i \in I$, we choose $c_i \in \mathbb{R}$ according to a continuous Gaussian distribution with mean 0 and variance $\hat{\sigma}_0^2$, and then round $c_i$ to the nearest integer $a_i$. Then
\[
e := \sum_{i \in I} a_i x^i.
\]
A bound $B_{gauss}$ is computed such that $\|e\| \le B_{gauss}$ with probability at least 1/2, and the above procedure for generating $e$ is repeated until $\|e\| \le B_{gauss}$.
To estimate $B_{gauss}$, we use (2) and (3) with $\sigma := \sqrt{\hat{m}} \cdot \hat{\sigma}_0$.$^{11}$ Based on this, we get the bound
\[
B_{gauss} = \hat{\sigma}_0 \cdot \sqrt{\hat{m} \log(\phi(m))}. \tag{35}
\]
Again, these bounds have also been experimentally validated to ensure that the probability $\Pr[\|e\| \le B_{gauss}]$ for a randomly sampled $e$ is at least roughly 1/2.
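The coefficient sampling step is simple to sketch, and doing so also gives an empirical check of the remark in footnote 11 that rounding inflates the variance by roughly 1/12. The Python code below is an illustration only: it uses the standard library's `random.gauss`, which is not suitable for cryptographic use, and the function names are hypothetical.

```python
import random

def sample_rounded_gaussian(n, sigma, rng):
    """n coefficients: continuous N(0, sigma^2) samples rounded to the nearest integer."""
    return [round(rng.gauss(0.0, sigma)) for _ in range(n)]

def empirical_variance(xs):
    """Population variance of a list of numbers."""
    mean = sum(xs) / len(xs)
    return sum((x - mean) ** 2 for x in xs) / len(xs)
```

With $\sigma = 3.2$ the expected variance is about $3.2^2 + 1/12 \approx 10.32$; a few hundred thousand samples are enough to see the empirical value settle near that figure rather than near $10.24$.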
Footnote 11. Here, we are using a rounded Gaussian distribution, which instead of having a variance of $\hat{\sigma}_0^2$, actually has a variance that is a bit larger, namely $\approx \hat{\sigma}_0^2 + 1/12$. See, for example, https://mathoverflow.net/questions/178964/estimating-the-variance-of-a-discrete-normal-distribution. Hopefully, we can find a better reference.

4.7.4 Encryption using the public key
Let $(\bar{c}^*_0, \bar{c}^*_1)$ be a public key as above, so that $\bar{c}^*_0, \bar{c}^*_1 \in A_q$ and $\bar{c}^*_0 + \bar{c}^*_1 s = [P e^* \bmod q]$, where $P = p^r$ is the plaintext modulus associated with the public key. Let $B_{pk}$ be the public-key noise bound, which is a bound on $\|P e^*\|$.
In addition to the public key, the encryption routine takes as input a polynomial $a \in A$, representing the plaintext, along with a plaintext modulus $P' \mid P$ (by default, $P' = P$, but it is possible to override this default behavior).
The ciphertext ψ produced will have a ciphertext modulus $q$ (the same $q$ used in the public key). For historical reasons (see Section 5.4.1), the correction factor $\kappa$ associated with ψ will be $\kappa := [q \bmod P'] \in \mathbb{Z}_{P'}$.
As a first step, the input a is replaced by an element b ∈ A such that

• b ≡ q · a (mod P 0 ),

• the coefficients of b on the standard basis are symmetrically reduced mod P 0 (see Sec-
tion 2.6.3.1).

It is assumed that kbk ≤ Bptxt , where Bptxt is a high-probability bound computed based on
the analysis in Sections 2.6.2 and 2.6.3. Specifically, we estimate the probability that kbk >
Bptxt assuming (heuristically) that the coefficients of b on the standard basis are symmetrically
distributed mod P 0 . We apply (5) with σ = φ(m)σP 0 , where σP 0 is bounded as in Lemma 3, and
estimate the probability that kbk > Bptxt as in (5). By default, HElib chooses Bptxt = 10σ, so that
this probability is (heuristically) bounded by ≈ 2−75.8 · φ(m)/2. In the current implementation,
HElib will print a warning if kbk > Bptxt . Another strategy under consideration is a randomized
sampling and rejection strategy.
Next, the enciphering tuple (c̄0 , c̄1 ) ∈ Aq × Aq of ψ is computed as follows:

(c̄0 , c̄1 ) := r · (c̄∗0 , c̄∗1 ) + P 0 · (e0 , e1 ) + (b, 0),

where

• r is generated with coefficients 0, ±1 as in Section 4.7.1.2, where


p
krk ≤ Bsmall := φ(m) log(φ(m))/2, (36)

corresponding to (34), and

• each of e0 and e1 are generated with small norm and Gaussian coefficients as in Section 4.7.3,
so that
kei k ≤ Bgauss (for i = 0, 1),
with Bgauss as in (35).

We have
c̄0 + c̄1 s = [P e∗ r + P 0 (e0 + e1 s) + b mod q].
It follows that the noise of ψ is bounded by

Bpkenc := Bpk Bsmall + P 0 Bgauss (1 + Bsk ) + Bptxt .

4.7.5 Encryption using the secret key
In some applications, the encrypting entity may have access to the secret key $(1, s)$. In this case, the above encryption procedure can be modified to produce a ciphertext with somewhat less noise.
The first steps of the encryption procedure are identical. The enciphering tuple $(\bar c_0, \bar c_1)$ is computed as follows:

• $\bar c_1$ is chosen at random from $A_q$,

• $\bar c_0 := b + P' e - s \bar c_1$,

where $e$ is generated with small norm and Gaussian coefficients as in Section 4.7.3, so that $\|e\| \le B_{\mathrm{gauss}}$, with $B_{\mathrm{gauss}}$ as in (35).
We have

$$\bar c_0 + \bar c_1 s = [b + P' e \bmod q].$$

It follows that the noise of the resulting ciphertext $\psi$ is bounded by

$$B_{\mathrm{skenc}} := B_{\mathrm{ptxt}} + P' B_{\mathrm{gauss}}.$$
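To make the comparison between the two encryption modes concrete, the following sketch evaluates the bound from Section 4.7.4 and the bound above for given input bounds. The function names and the numeric inputs are hypothetical and purely illustrative; in HElib the individual bounds come from the analyses referenced in the text.

```python
def pk_enc_noise_bound(b_pk, b_small, b_gauss, b_sk, b_ptxt, p_prime):
    """Noise bound for public-key encryption (Section 4.7.4)."""
    return b_pk * b_small + p_prime * b_gauss * (1 + b_sk) + b_ptxt

def sk_enc_noise_bound(b_gauss, b_ptxt, p_prime):
    """Noise bound for secret-key encryption (Section 4.7.5)."""
    return b_ptxt + p_prime * b_gauss

# Illustrative (made-up) magnitudes, only to show the relative sizes.
example_pk = pk_enc_noise_bound(b_pk=2.0e3, b_small=1.5e2, b_gauss=1.0e3,
                                b_sk=1.5e2, b_ptxt=5.0e2, p_prime=2)
example_sk = sk_enc_noise_bound(b_gauss=1.0e3, b_ptxt=5.0e2, p_prime=2)
```

Since every term is nonnegative, the secret-key bound never exceeds the public-key bound for the same inputs.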

4.7.6 Generating key-switching matrices


The generation of key-switching matrices (see Section 4.3) is very similar to the process of encrypting using a secret key (see Section 4.7.5). Recall from Section 4.3 that a key-switching matrix for $s_i \mapsto s$ is a $2 \times \ell$ matrix whose $j$th column, for $j = 1, \ldots, \ell$, is an encryption of $R D_j^* s_i$ under $s$, with respect to a ciphertext modulus of the form $Q = qR$, where $q$ and $R$ are coprime. We also have $q = \prod_{j=1}^{\ell} D_j$, where the $D_j$ are coprime, and $D_j^* = D_1 \cdots D_{j-1}$ for $j = 1, \ldots, \ell$. For each $j = 1, \ldots, \ell$, we compute $\bar a_{ij}^{(0)}, \bar a_{ij}^{(1)} \in A_Q$ such that

$$\bar a_{ij}^{(0)} + \bar a_{ij}^{(1)} s = [R D_j^* s_i + P e_{ij} \bmod Q]$$

as follows:

• $\bar a_{ij}^{(1)}$ is chosen at random from $A_Q$,

• $\bar a_{ij}^{(0)} := R D_j^* s_i + P e_{ij} - s \bar a_{ij}^{(1)}$,

where $e_{ij}$ is generated with small norm and Gaussian coefficients as in Section 4.7.3, so that $\|e_{ij}\| \le B_{\mathrm{gauss}}$, with $B_{\mathrm{gauss}}$ as in (35). Thus, the bounds $\epsilon_{ij}$ appearing in (31) are all set to $B_{\mathrm{gauss}}$.
See Section 5.3.4 for more details on the values of $q$, $R$, and $P$.

5 Managing ciphertext moduli


Again, let $A = \mathbb{Z}[X]/(\Phi_m(X))$. Recall that a ciphertext is defined in terms of several quantities, including a ciphertext modulus $q$ and an enciphering family $\{\bar c_i\}_{i \in I}$, where each $\bar c_i \in A_q$.
Over the course of a computation, the ciphertext modulus associated with a ciphertext may change, through a sequence of mod switching (or "down-scaling") operations (see Section 4.1) or up-scaling operations (see Section 4.2).
This section provides more details on how different values of $q$ are chosen.

5.1 Ciphertext prime sets
A ciphertext modulus $q$ is always defined as a product of word-sized primes. On a 64-bit machine, such a word-sized prime $\pi$ is typically at most 60 bits, so as to allow for efficient modular arithmetic modulo $\pi$ (see footnote 12).
For reasons to be described below (see Section 5.2), we will always choose word-sized primes $\pi$ with $\pi \equiv 1 \pmod{m}$. In addition, for different reasons, also to be described below (see Section 5.2), we will attempt to choose $\pi$ such that $\pi \equiv 1 \pmod{2^t}$ for $t$ as large as possible.
When initialized on a given set of parameters, HElib defines three disjoint sets of word-sized primes: smallPrimes, normalPrimes, specialPrimes. The set normalPrimes is also ordered:

$$\text{normalPrimes} = \{\pi_1, \ldots, \pi_K\}.$$

A ciphertext modulus $q$ is always of the form

$$q = \prod_{\pi \in S} \pi,$$

where $S$ is a set of word-sized primes of the form

$$S = S_{\mathrm{small}} \cup \{\pi_1, \ldots, \pi_k\}$$

or

$$S = S_{\mathrm{small}} \cup \{\pi_1, \ldots, \pi_k\} \cup \text{specialPrimes},$$

where $S_{\mathrm{small}} \subseteq \text{smallPrimes}$ and $k \in \{0, \ldots, K\}$. That is, $S$ consists of

• any number of primes in smallPrimes,

• an initial sequence $\pi_1, \ldots, \pi_k$ of the primes in normalPrimes, and

• either all of the primes in specialPrimes, or none of them.

We call $S$ a ciphertext prime set. In HElib, every ciphertext carries with it a ciphertext prime set, which defines a corresponding ciphertext modulus.
All of the primes in normalPrimes are chosen to be of the same bit length. In contrast, the primes in smallPrimes have a variety of bit lengths, all of which are smaller than the bit length of the primes in normalPrimes, and are chosen so that various subsets of smallPrimes can be used to form ciphertext moduli of a wide variety of bit lengths. This is discussed in more detail below in Section 5.3.2.2.
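The shape of a ciphertext prime set can be illustrated with a short enumeration. The sketch below is not HElib code: the function name is hypothetical, and the tiny primes in the usage note stand in for word-sized primes only to keep the example readable.

```python
from itertools import combinations

def ciphertext_prime_sets(small, normal, special, include_special=False):
    """Yield every prime set S = S_small + (pi_1..pi_k) [+ all special primes].

    small   : list of primes in smallPrimes (any subset may be used)
    normal  : ordered list pi_1..pi_K (only initial segments may be used)
    special : list of primes in specialPrimes (all or none)
    """
    for r in range(len(small) + 1):
        for s_small in combinations(small, r):
            for k in range(len(normal) + 1):
                s = set(s_small) | set(normal[:k])
                if include_special:
                    s |= set(special)
                yield frozenset(s)
```

For example, with two small primes and three normal primes there are $(3 + 1) \cdot 2^2 = 16$ prime sets without special primes, and another 16 with them.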

5.2 Standard and Double-CRT representation of Aq


Suppose that $q$ is a ciphertext modulus as above, so that

$$q = \prod_{\pi \in S} \pi,$$

where $S$ is a set of word-sized primes.


12. This is an NTL-imposed restriction. It is possible to configure NTL (and hence HElib) to allow 62-bit word-sized primes. Also, on 32-bit machines, the limit on a word-sized prime is 30 bits. Finally, note that when we say "64-bit machine" or "32-bit machine", the bit count refers to the size of a long int.

An element $\bar c \in A_q$ can be represented in a number of ways. One natural way is as a vector over $\mathbb{Z}_q$ representing the coefficients of $\bar c$ on the standard basis (see Section 2.4). We call this the standard representation of $A_q$.
HElib generally uses a different representation of $A_q$, which we call the Double-CRT representation, defined as follows. For each $\pi \in S$, we choose a primitive $m$th root of unity $\omega_\pi \in \mathbb{Z}_\pi$ (recall that we are assuming that $\pi \equiv 1 \pmod{m}$, which guarantees the existence of such an $\omega_\pi$). We also have a natural map from $\mathbb{Z}_q$ to $\mathbb{Z}_\pi$, which we can extend to a natural map from $A_q$ to $A_\pi$. For $\bar c \in A_q$ and $j \in \mathbb{Z}_m$, we define $\bar c(\omega_\pi^j) \in \mathbb{Z}_\pi$ to be the element in $\mathbb{Z}_\pi$ obtained by evaluating the image of $\bar c$ in $A_\pi$ at the value $\omega_\pi^j$. The Double-CRT representation of $\bar c \in A_q$ is defined to be the collection of values

$$\left\{ \bar c(\omega_\pi^j) \right\}_{(\pi, j) \in S \times \mathbb{Z}_m^*}. \qquad (37)$$

In the Double-CRT representation, elements of $A_q$ can be added and multiplied very efficiently, indeed, in linear time.
One can convert between the standard and Double-CRT representations. Conversion from standard to Double-CRT essentially consists of doing the following:

• for each $\pi \in S$, reduce modulo $\pi$ the coefficients of $\bar c \in A_q$ on the standard basis to get an element $\bar c_\pi \in A_\pi$, represented on the standard basis;

• for each $\pi \in S$, evaluate $\bar c_\pi \in A_\pi$ at the values $\omega_\pi^j$ for $j \in \mathbb{Z}_m^*$.

Conversion from Double-CRT to standard representation essentially reverses these steps:

• for each $\pi \in S$, interpolate the values $\bar c_\pi(\omega_\pi^j)$ for $j \in \mathbb{Z}_m^*$ to obtain $\bar c_\pi \in A_\pi$, represented on the standard basis;

• for each coefficient on the standard basis, apply the Chinese Remainder Theorem (CRT) to the corresponding coefficient of $\bar c_\pi \in A_\pi$, for each $\pi \in S$, to obtain the corresponding coefficient of $\bar c \in A_q$.

The polynomial evaluation and interpolation steps are done using an FFT. When $m$ is a power of two, since each $\pi$ is $\equiv 1 \bmod m$, these steps can be implemented (essentially) directly using an $m/2$-point radix-2 FFT over $\mathbb{Z}_\pi$. When $m$ is not a power of two, Bluestein's technique [2] is used to reduce each of these steps to a convolution, which itself can be implemented using radix-2 FFTs provided $\mathbb{Z}_\pi$ contains the necessary roots of unity. This is why we always try to choose $\pi$ so that $\pi \equiv 1 \bmod 2^t$ for $t$ as large as possible. If $t$ is not large enough, these convolutions are implemented using the Chinese Remainder Theorem and a small number (usually 2 or 3) of FFTs modulo other primes that do contain the necessary roots of unity.
These conversions are somewhat expensive computationally, even when using FFTs.
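The two conversions and the pointwise arithmetic can be checked on a toy instance. The sketch below is an illustration under stated assumptions, not HElib's implementation: it fixes $m = 8$ (so $\Phi_8(X) = X^4 + 1$), uses the tiny primes 17 and 41 in place of word-sized primes, evaluates polynomials naively instead of with an FFT, and uses an interpolation formula that is valid only when $m$ is a power of two. All names are hypothetical.

```python
from math import gcd

M = 8                       # toy index; Phi_8(X) = X^4 + 1
PRIMES = [17, 41]           # toy stand-ins for word-sized primes, each = 1 (mod 8)
Q = 17 * 41
UNITS = [j for j in range(1, M) if gcd(j, M) == 1]   # representatives of Z_m^*

def primitive_root_of_unity(m, p):
    """Smallest element of multiplicative order exactly m in Z_p."""
    for w in range(2, p):
        if pow(w, m, p) == 1 and all(pow(w, d, p) != 1 for d in range(1, m)):
            return w
    raise ValueError("no primitive m-th root of unity mod p")

OMEGA = {p: primitive_root_of_unity(M, p) for p in PRIMES}

def to_double_crt(coeffs):
    """Reduce mod each prime and evaluate at omega_p^j for j in Z_m^*."""
    return {(p, j): sum(c * pow(OMEGA[p], i * j, p) for i, c in enumerate(coeffs)) % p
            for p in PRIMES for j in UNITS}

def from_double_crt(rep):
    """Interpolate per prime (power-of-two m only), then CRT per coefficient."""
    n = len(UNITS)
    coeffs = [0] * n
    for p in PRIMES:
        n_inv, w_inv = pow(n, -1, p), pow(OMEGA[p], -1, p)
        c_p = [n_inv * sum(rep[(p, j)] * pow(w_inv, i * j, p) for j in UNITS) % p
               for i in range(n)]
        q_p = Q // p
        lift = q_p * pow(q_p, -1, p)          # = 1 mod p, = 0 mod the other prime
        coeffs = [(c + cp * lift) % Q for c, cp in zip(coeffs, c_p)]
    return coeffs

def mul_double_crt(x, y):
    """Pointwise product: linear time in the size of the representation."""
    return {key: x[key] * y[key] % key[0] for key in x}

def mul_standard(a, b, q):
    """Schoolbook product in Z_q[X]/(X^n + 1), used only as a cross-check."""
    n = len(a)
    out = [0] * n
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            k, sign = (i + j, 1) if i + j < n else (i + j - n, -1)
            out[k] = (out[k] + sign * ai * bj) % q
    return out
```

Multiplying in Double-CRT form and converting back agrees with the schoolbook product in $A_q$, which is the property that makes the representation attractive despite the conversion cost.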

5.3 Prime set evolution


We next discuss how the prime set S associated with a ciphertext evolves over the lifetime of the
ciphertext.

5.3.1 Freshly encrypted ciphertexts


A freshly encrypted ciphertext has a prime set $S = \text{normalPrimes}$. That is,

$$q = \prod_{\pi \in \text{normalPrimes}} \pi$$

is the modulus used to form the public key (see Section 4.7.2) and in the encryption routines (see Sections 4.7.4 and 4.7.5).

5.3.2 Homomorphic multiplication


Recall from Section 4.5 that in order to multiply two ciphertexts, with ciphertext moduli $q_\ell$ for $\ell = 1, 2$, the two ciphertexts were made to have a common ciphertext modulus $q$, by applying appropriate upscaling and mod switching to each of the input ciphertexts. One goal was to reduce the noise of each as much as possible.
To start with, we compute a high-probability bound $B_{\mathrm{ams}}^{(\ell)}$ on the added noise from mod switching, for $\ell = 1, 2$. See Section 4.1, and in particular (26). We discuss the computation of these bounds below.
Now suppose the two ciphertexts originally have noise bounds $\epsilon_\ell$ for $\ell = 1, 2$. After mod switching to $q$, according to (26), the noise in the $\ell$th ciphertext, for $\ell = 1, 2$, will be bounded by

$$(q/q_\ell)\,\epsilon_\ell + B_{\mathrm{ams}}^{(\ell)}.$$

We would like to choose $q$ so that for each $\ell = 1, 2$, the first term $(q/q_\ell)\,\epsilon_\ell$ is about the same as (or perhaps a bit smaller than) the second term $B_{\mathrm{ams}}^{(\ell)}$. So the logic attempts to choose a common ciphertext modulus $q$ such that

$$\log_2(q) \approx \mathit{target} := \min\left\{ \log_2(B_{\mathrm{ams}}^{(1)}) + \log_2(q_1) - \log_2(\epsilon_1),\ \log_2(B_{\mathrm{ams}}^{(2)}) + \log_2(q_2) - \log_2(\epsilon_2) \right\}.$$

The logic to determine such a $q$ essentially performs a brute-force search among all possible ciphertext moduli that are the product of normalPrimes and smallPrimes, but not specialPrimes, as defined in Section 5.1. The number of such moduli is bounded by

$$(|\text{normalPrimes}| + 1) \cdot 2^{|\text{smallPrimes}|}.$$

The cardinality of smallPrimes is chosen by design to be small enough so that this quantity is not outrageously large. In addition, for each such modulus $q$, the quantity $\log_2(q)$ is pre-computed and stored in a table (along with the corresponding prime set), and the table is sorted in order of increasing $\log_2(q)$. Given the value $\mathit{target}$, a search for a value of $\log_2(q)$ is done in a small interval

$$[\mathit{target} - a,\ \mathit{target} - b].$$

Typically, there will be several values of $\log_2(q)$ to choose from, and among these one is chosen that (heuristically) minimizes the cost of the mod switching operation (and ties are broken in favor of the largest $\log_2(q)$ value in the interval). If no such value is found, a $\log_2(q)$ value is chosen that is as large as possible while still being bounded from above by $\mathit{target} - a$.
The current values used in HElib for $a$ and $b$ are $a = 4$ and $b = 1$.
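The table-based search can be sketched as follows. This is a simplified illustration with hypothetical names: the cost heuristic HElib uses to pick among candidates in the interval is not specified here, so the sketch simply takes the largest $\log_2(q)$ in the interval (which is the stated tie-breaking rule), and the primes in the usage note are small stand-ins.

```python
import math
from itertools import combinations

def build_modulus_table(small, normal):
    """All (log2(q), primes) pairs without special primes, sorted by log2(q)."""
    table = []
    for r in range(len(small) + 1):
        for s_small in combinations(small, r):
            for k in range(len(normal) + 1):
                primes = tuple(s_small) + tuple(normal[:k])
                table.append((sum(math.log2(p) for p in primes), primes))
    table.sort(key=lambda entry: entry[0])
    return table

def choose_modulus(table, target, a=4.0, b=1.0):
    """Pick an entry with log2(q) in [target - a, target - b], if any."""
    in_window = [e for e in table if target - a <= e[0] <= target - b]
    if in_window:
        # Assumption: stand-in for the cost heuristic; take the largest candidate.
        return max(in_window, key=lambda e: e[0])
    below = [e for e in table if e[0] <= target - a]
    # Fallback from the text: as large as possible, bounded above by target - a.
    return max(below, key=lambda e: e[0]) if below else table[0]
```

With `small = [1021, 4093]` and `normal = [65521, 65537]`, a target of 30 bits selects the prime set `{4093, 65521}`, whose product has about 28 bits.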

5.3.2.1 Mod-switch added noise bound estimation


As we saw in Section 4.1, specifically, in (27), the added noise from mod switching is bounded by

$$\sum_{i \in I} \|\hat e_i\|\,\tau_i.$$

Here, each $\tau_i$ is a bound on $\|s_i\|$, which can be derived from the bound $B_{\mathrm{sk}}$ in Section 4.7.1.3.

While each value $\|\hat e_i\|$ can be computed at the time we do the mod switching, in determining which modulus to switch to, it is convenient to use an easy-to-compute high-probability bound on $\|\hat e_i\|$, rather than the value $\|\hat e_i\|$ itself. As discussed in Section 4.1.2, the coefficients of $\hat e_i$ on the standard basis lie in the interval $[-P/2, +P/2]$, where $P$ is the plaintext modulus, and can reasonably be modeled as independently and uniformly distributed over this same interval. We therefore use a high-probability bound $B_{\mathrm{round}}$ on $\|\hat e_i\|$ based on the analysis in Section 2.6.2. Specifically, we estimate the probability that $\|\hat e_i\| > B_{\mathrm{round}}$ assuming (heuristically) that the coefficients of $\hat e_i$ on the standard basis are uniformly distributed over the interval $[-P/2, +P/2]$. We apply (5) with $\sigma^2 = \varphi(m) \cdot P^2/12$, and estimate the probability that $\|\hat e_i\| > B_{\mathrm{round}}$ as in (5). By default, HElib chooses $B_{\mathrm{round}} = 10\sigma$, so that this probability is (heuristically) bounded by $\approx 2^{-75.8} \cdot \varphi(m)/2$.

5.3.2.2 How small primes are chosen


The set smallPrimes is chosen so that we can always find a ciphertext modulus $q$ such that $\log_2(q)$ is fairly close to any desired target value. A "precision parameter" $\delta$ is chosen (by default, $\delta = 3$). The idea is that we want to be able to always construct a ciphertext modulus whose bit length is within (about) $\delta/2$ of any given target value. To this end, the primes $\pi$ in smallPrimes should have a diversity of sizes. However, we require that $\pi \equiv 1 \pmod{m}$ (and for efficiency reasons, we want $\pi - 1$ to be divisible by a large power of 2), and so the size of $\pi$ cannot be too small.
Here is how the current logic in HElib works (which is subject to change). On a 64-bit machine, the primes in normalPrimes will always be $b$-bit primes, for some $b \in \{54, \ldots, 60\}$. The logic for generating smallPrimes then chooses a bound $c := \lceil 2b/3 \rceil$ on the bit length of the smallest prime in smallPrimes. So $c$ will always be at least 36. The set smallPrimes consists of two primes of size $c$, along with one prime of each size $b - \delta 2^t$ for $t = 0, 1, \ldots$, with the restriction that $b - \delta 2^t > c$.
The design principle here is that any integer should be congruent mod $b$ to some integer that is within (about) $\delta/2$ of an integer that can be expressed as a sum of the bit lengths of primes in smallPrimes. So to get close to any given target value for $\log_2(q)$, we can combine an initial segment of the primes in normalPrimes whose bit lengths add up to an appropriate multiple of $b$, together with a corresponding subset of primes in smallPrimes, to get a set of primes whose bit lengths add up to a number within (about) $\delta/2$ of the target value. This is a bit heuristic, as the bit length of the product will not (in general) be the same as the sum of the bit lengths. Nevertheless, experimentally, it works well enough.
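The rule for the bit lengths of the small primes is easy to tabulate. The following sketch (with a hypothetical function name) only computes the target bit sizes described above; finding actual primes of those sizes that satisfy the congruence conditions is a separate step not shown here.

```python
def small_prime_bit_sizes(b, delta=3):
    """Bit sizes of smallPrimes for b-bit normal primes, per the rule in the text."""
    c = -(-2 * b // 3)              # ceil(2b/3), computed with integers only
    sizes = [c, c]                  # two primes of size c
    t = 0
    while b - delta * 2**t > c:     # one prime of each size b - delta * 2^t > c
        sizes.append(b - delta * 2**t)
        t += 1
    return sorted(sizes)
```

For $b = 60$ and $\delta = 3$ this gives the sizes 40, 40, 48, 54, 57, and for $b = 54$ it gives 36, 36, 42, 48, 51.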

5.3.3 Dropping small and special primes


In some situations, we need to take a ciphertext and drop all of the small and special primes from the prime set associated with a ciphertext. In so doing, we also want to add a suitable number of normal primes so as to preserve (to the extent possible) the capacity of the ciphertext, i.e., the quantity $q/\epsilon$, where $q$ is the ciphertext modulus and $\epsilon$ is the noise bound.
This is done as follows. Suppose

$$q = \prod_{\pi \in S} \pi,$$

where $S$ is the prime set associated with the ciphertext. Let $S' := S \cap \text{normalPrimes}$, and suppose

$$S' = \{\pi_1, \ldots, \pi_k\}.$$

First, we compute a high-probability bound $B_{\mathrm{ams}}$ on the mod-switching added noise, as in Section 5.3.2.1. Next, we choose $k'$ to be the smallest integer greater than or equal to $k$ such that

$$\epsilon_{\mathrm{scaled}} := \epsilon \cdot q'/q \gg B_{\mathrm{ams}},$$

where

$$q' := \prod_{i=1}^{k'} \pi_i.$$

We then scale up, as in Section 4.2, adding the primes $\pi_{k+1}, \ldots, \pi_{k'}$, and then mod switch down to the prime set $\pi_1, \ldots, \pi_{k'}$, as in Section 4.1, dropping the small and special primes in $S$. After this is done, the noise in the ciphertext will be bounded by

$$\epsilon_{\mathrm{scaled}} + B_{\mathrm{ams}},$$

and the choice of $k'$ ensures that the second term is smaller than the first term, which ensures that very little capacity is lost. In fact, when we actually perform the mod switching operation, the bound $B_{\mathrm{ams}}$ is replaced by a more precise and somewhat smaller probability-1 bound.
Note that in some circumstances, there might not be enough normal primes to allow us to find a $k'$ so that $\epsilon_{\mathrm{scaled}}$ is large enough. In this case (which is very atypical in practice), some capacity will be lost.

5.3.4 Key switching


We now consider the key-switching operation (see Section 4.3).
As a first step, we drop all small and special primes, as in Section 5.3.3.
Next, suppose the ciphertext modulus is $q$, which is itself a product of normal primes. In Section 4.3, we work with a larger ciphertext modulus of the form $Q = Rq$. The factor $R$ will always be the product of all the special primes.
We now discuss the decomposition of $q$ into "digits", $q = \prod_{j=1}^{\ell} D_j$. Suppose the set of all normal primes is

$$\{\pi_1, \ldots, \pi_K\}.$$

At system initialization, we partition this set into corresponding "digit sets"

$$DS_1, \ldots, DS_L,$$

where

$$DS_1 = \{\pi_1, \ldots, \pi_{k_1}\},\quad DS_2 = \{\pi_{k_1+1}, \ldots, \pi_{k_2}\},\quad \ldots,\quad DS_L = \{\pi_{k_{L-1}+1}, \ldots, \pi_{k_L}\},$$

where

$$0 < k_1 < k_2 < \cdots < k_L = K.$$

This defines corresponding digits

$$\hat D_j := \prod_{\pi \in DS_j} \pi \qquad (j = 1, \ldots, L).$$

We know that $q$ is of the form $q = \prod_{i=1}^{k} \pi_i$, which means we can factor $q$ as

$$q = D_1 \cdots D_\ell,$$

where

$$D_1 = \hat D_1,\ \ldots,\ D_{\ell-1} = \hat D_{\ell-1},\ \text{and}\ D_\ell \mid \hat D_\ell.$$

The significance of this is that the values

$$D_j^* := D_1 \cdots D_{j-1} \qquad (j = 1, \ldots, \ell)$$

only depend on the values $\hat D_j$, for $j = 1, \ldots, L$. Recall that a key-switching matrix for $s_i \mapsto s$ consists of encryptions of $R D_j^* s_i$ under $s$, for $j = 1, \ldots, \ell$, and so these key-switching matrices can be computed at key-generation time, independent of the particular value of the current ciphertext modulus $q$.
In more detail, at key-generation time, key-switching matrices are computed using the digits $\hat D_1, \ldots, \hat D_L$, and the ciphertext modulus $\hat Q = R \hat q$, where $R$ is the product of all of the special primes, and $\hat q$ is the product of all of the normal primes. When performing key switching on a ciphertext with modulus $q \mid \hat q$, we simply drop the primes dividing $\hat q/q$ from the key-switching matrices. Indeed, if we have an encryption modulo $\hat Q$ of $R D_j^* s_i$ under $s$, just reducing everything mod $Q = Rq$ gives us an encryption modulo $Q$ (with the same noise). Note also that at key-generation time, the key-switching matrices must be generated using a value of $P = p^r$ for the plaintext modulus that is at least as large as any plaintext modulus that may arise during the lifetime of the public key.
At public-key generation time, the normal primes are decomposed into digit sets $DS_1, \ldots, DS_L$ of roughly equal cardinality, so that each resulting digit has roughly the same bit length. The number of digits $L$ is a parameter that can be selected at system initialization. The default is $L = 3$.
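The digit decomposition can be illustrated as follows. The sketch uses hypothetical function names and one plausible way of splitting into sets of roughly equal cardinality (larger sets first); the exact split used by HElib is not specified in the text, so that choice is an assumption. Small primes are used in the usage note for readability.

```python
def digit_sets(normal_primes, num_digits=3):
    """Split the ordered normal primes into L contiguous, nearly equal sets."""
    base, extra = divmod(len(normal_primes), num_digits)
    sets, start = [], 0
    for j in range(num_digits):
        size = base + (1 if j < extra else 0)   # assumption: larger sets first
        sets.append(normal_primes[start:start + size])
        start += size
    return sets

def digits_for_modulus(sets, k):
    """Digits D_1..D_l of q = pi_1 * ... * pi_k, given the digit sets."""
    digits, used = [], 0
    for ds in sets:
        if used >= k:
            break
        d = 1
        for p in ds[:k - used]:    # the last digit may use only part of its set
            d *= p
        digits.append(d)
        used += min(len(ds), k - used)
    return digits
```

With the normal primes `[3, 5, 7, 11, 13, 17, 19]` and $L = 3$, the digit sets are `[3, 5, 7]`, `[11, 13]`, `[17, 19]`; for $k = 4$ the digits are $D_1 = 105$ and $D_2 = 11$, and $D_2$ divides $\hat D_2 = 143$.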
The special primes are chosen so that their product $R$ is large enough so that in (31) the first term $\epsilon R$, which represents the key-switching scaled noise, will likely dominate the second term $P \sum_{i,j} \epsilon_{ij} \|c_{ij}\|$, which represents the key-switching added noise. One sees that as the parameter $L$ increases, the bit-length of the digits will decrease, and hence the values $\|c_{ij}\|$ in (31) will decrease, and hence we can get by with a smaller value of $R$, which will imply a higher level of security (which degrades as $\hat Q$ increases). In the current implementation of HElib, the bit-length of $R$ is determined by default using a somewhat heuristic formula that depends on (among other things) the bit-length of the digits. This default behavior can be overridden, so that a user can specify the bit-length of $R$ explicitly. Indeed, experimentally, we have found that in many applications, it is possible to use a somewhat smaller bit-length for $R$, resulting in a higher security level, but without degrading capacity significantly.
As we saw above, by increasing the value of $L$, we can increase the security level. However, this comes at a cost: larger values of $L$ increase the space consumed by the key-switching matrices, and increase the running time of the key-switching operation. These space and time costs increase linearly in $L$.

5.3.4.1 Heuristic calculation of the bit-length of R


As stated above, we want to choose $R$ so that $\epsilon R \gg P \sum_{i,j} \epsilon_{ij} \|c_{ij}\|$, in the context of (31). Let us focus on the key-switch added noise

$$\alpha := P \sum_{i,j} \epsilon_{ij} \|c_{ij}\|. \qquad (38)$$

As discussed in Section 4.7.6, each $\epsilon_{ij}$ is $B_{\mathrm{gauss}}$ as in (35), which is a bound on $\|e_{ij}\|$, where $e_{ij}$ is generated with small norm and Gaussian coefficients as in Section 4.7.3. As per (35), we have

$$B_{\mathrm{gauss}} = \hat\sigma_0 \cdot \sqrt{\hat m \log(\varphi(m))},$$

where if $m$ is a power of two then $\hat\sigma_0 = \sigma_0$ and $\hat m = \varphi(m)$, and otherwise $\hat\sigma_0 = \sigma_0 \sqrt{m}$ and $\hat m = m$.
Now consider the terms $\|c_{ij}\|$ appearing in (38). If we heuristically model the coefficients of $c_{ij}$ as uniformly distributed on the interval $[-D_j/2, D_j/2]$, then we can use the heuristic estimate (5) with $\sigma^2 = \varphi(m) \cdot D_j^2/12$. From this, we can bound the size of the $j$th digit $\|c_{ij}\|$ with high probability by

$$B_j = k \cdot D_j \sqrt{\varphi(m)/12},$$

where $k$ is a suitable parameter (see the table of values in Section 2.6.2; we typically use $k = 10$). Thus, if $D = \max_j D_j$, we have the following bound on the key-switch added noise $\alpha$:

$$\alpha \le (\#\text{terms}) \cdot P \cdot \hat\sigma_0 \sqrt{\hat m \log(\varphi(m))} \cdot k \cdot D \sqrt{\varphi(m)/12} = D \cdot \sqrt{\hat m \cdot \varphi(m) \cdot \log(\varphi(m))} \cdot \frac{P \hat\sigma_0 k}{\sqrt{12}} \cdot (\#\text{terms}).$$

Here, $\#\text{terms}$ denotes the number of terms in the summation (38). Therefore,

$$\alpha \le \begin{cases} D \cdot \varphi(m) \sqrt{\log(\varphi(m))} \cdot \dfrac{P \sigma_0 k}{\sqrt{12}} \cdot (\#\text{terms}), & \text{if $m$ is a power of two,} \\[2ex] D \cdot m \sqrt{\varphi(m) \log(\varphi(m))} \cdot \dfrac{P \sigma_0 k}{\sqrt{12}} \cdot (\#\text{terms}), & \text{otherwise.} \end{cases}$$

As discussed above, we want to choose the special primes so that their product $R$ satisfies $\epsilon R \gg \alpha$, where $\alpha$ is the key-switch added noise (38) and $\epsilon$ is the noise in a ciphertext before key switching. Of course, we have to choose $R$ ahead of time, and we will not know the relevant value of $\epsilon$ at this time. We instead choose to estimate $\epsilon = \beta^2$, where $\beta$ is an estimate of the mod-switch added noise.
The reason for estimating $\epsilon$ in this way is as follows. First, key switching often happens right after a ciphertext multiplication. As a part of the multiplication process, the noise in each multiplicand is reduced to roughly $\beta$, and after multiplication, it becomes $\beta^2$. Second, even for key switching operations performed after other operations (such as automorphisms), in which the ciphertext has less noise than $\beta^2$ (and typically, it will be about $\beta$), choosing $R$ as we do may cause the key switching operation to decrease the capacity a bit, but this is a "self correcting" process: if repeated, since the noise is larger, subsequent key switches will decrease the capacity less substantially.
As per (27), the mod-switch added noise is bounded by

$$\sum_i \|\hat e_i\|\,\tau_i.$$

Here, each $\tau_i$ is a bound on the norm of the relevant secret key, and as discussed in Section 4.1.2 the coefficients of each $\hat e_i$ can be modeled as uniformly and independently distributed over $[-P'/2, P'/2]$. Note that we use $P'$ here to distinguish it from $P$ above. In general, $P \ge P'$, and while in some applications we may have $P = P'$, in others (for example, when bootstrapping), we may have $P > P'$.
As we did above, we can use the heuristic estimate (5) with $\sigma^2 = \varphi(m) \cdot (P')^2/12$. From this, we can bound $\|\hat e_i\|$ with high probability by

$$k \cdot P' \sqrt{\varphi(m)/12}.$$

As for the terms $\tau_i$ appearing above, we use the results of Section 4.7.1.3, and specifically, (33) and (34), obtaining a bound of

$$\sqrt{h \log(\varphi(m))}$$

for $\tau_i$. Here, $h$ is the specified Hamming weight, or $h = \varphi(m)/2$ in the unbounded Hamming weight case.
So we will use the estimate

$$\beta = k \cdot P' \sqrt{\varphi(m)/12} \cdot \sqrt{h \log(\varphi(m))}.$$

While this estimate is really an upper bound, it is actually a very tight upper bound, and so it is not so bad.
So we want $R$ at least as big as $\alpha/\beta^2$.
In the case where $m$ is a power of two, we have

$$\alpha/\beta^2 = \frac{D \cdot \varphi(m) \sqrt{\log(\varphi(m))} \cdot \frac{P \sigma_0 k}{\sqrt{12}} \cdot (\#\text{terms})}{k^2 \cdot (P')^2 \cdot (\varphi(m)/12) \cdot (h \log(\varphi(m)))} = \frac{D \cdot P \cdot \sigma_0 \cdot \sqrt{12} \cdot (\#\text{terms})}{\sqrt{\log(\varphi(m))} \cdot (P')^2 \cdot h \cdot k}.$$

In the case where $m$ is not a power of two, we have to multiply this by $F(m) = m/\sqrt{\varphi(m)}$, obtaining

$$\alpha/\beta^2 = \frac{D \cdot m \cdot P \cdot \sigma_0 \cdot \sqrt{12} \cdot (\#\text{terms})}{\sqrt{\varphi(m) \log(\varphi(m))} \cdot (P')^2 \cdot h \cdot k}.$$
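The simplification of $\alpha/\beta^2$ can be checked numerically. The sketch below is illustrative only: the function names are hypothetical, `log` is assumed to be the natural logarithm, $\varphi(m)$ is passed in as a parameter, and the numeric inputs in the usage note are made up. It computes the ratio once directly from the bounds on $\alpha$ and $\beta$ and once from the closed forms.

```python
import math

def is_power_of_two(m):
    return m & (m - 1) == 0

def alpha_bound(D, m, phi, P, sigma0, k, n_terms):
    """Bound on the key-switch added noise, built from B_gauss and B_j."""
    sigma_hat = sigma0 if is_power_of_two(m) else sigma0 * math.sqrt(m)
    m_hat = phi if is_power_of_two(m) else m
    b_gauss = sigma_hat * math.sqrt(m_hat * math.log(phi))
    b_digit = k * D * math.sqrt(phi / 12)
    return n_terms * P * b_gauss * b_digit

def beta_estimate(phi, P_prime, h, k):
    """Estimate of the mod-switch added noise."""
    return k * P_prime * math.sqrt(phi / 12) * math.sqrt(h * math.log(phi))

def ratio_closed_form(D, m, phi, P, P_prime, sigma0, k, h, n_terms):
    """alpha / beta^2 from the closed-form expressions in the text."""
    r = (D * P * sigma0 * math.sqrt(12) * n_terms
         / (math.sqrt(math.log(phi)) * P_prime**2 * h * k))
    if not is_power_of_two(m):
        r *= m / math.sqrt(phi)       # the factor F(m)
    return r

def r_bit_length(D, m, phi, P, P_prime, sigma0=3.2, k=10, h=None, n_terms=1):
    """Heuristic bit length of R: log2(alpha / beta^2), rounded up."""
    h = phi / 2 if h is None else h
    return math.ceil(math.log2(
        ratio_closed_form(D, m, phi, P, P_prime, sigma0, k, h, n_terms)))
```

For instance, `r_bit_length(D=2.0**60, m=16, phi=8, P=2, P_prime=2, n_terms=3)` evaluates the power-of-two formula for a 60-bit digit.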

5.4 Some implementation details


Generally speaking, each element $\bar c_i \in A_q$ of an enciphering family $\{\bar c_i\}_{i \in I}$ is represented in Double-CRT format.
Similarly, elements $\bar a_{ij}^{(0)}, \bar a_{ij}^{(1)} \in A_Q$ of a key-switching matrix are also represented in Double-CRT format. As a space optimization, since $\bar a_{ij}^{(1)}$ is chosen at random from $A_Q$ (see Section 4.7.6), instead of storing the element $\bar a_{ij}^{(1)} \in A_Q$, a seed to a PRG is stored, which is expanded into an element of $A_Q$ when necessary.

5.4.1 The correction factor: stored and implied


For historical reasons, the correction factor $\kappa$ associated with a ciphertext with ciphertext modulus $q$ is always of the form $\kappa = [q \bmod P] \cdot \tilde\kappa$. It is the value $\tilde\kappa$ that is actually stored with the ciphertext. We call $\tilde\kappa$ the stored correction factor, and $\kappa$ the implied correction factor. For a freshly encrypted ciphertext, the stored correction factor $\tilde\kappa$ is set to 1, which is why the implied correction factor $\kappa$ is initially $[q \bmod P]$ (as discussed in Section 4.7.4).
This is really an artifact of the implementation, and is subject to change.

5.4.2 Mod switching


Consider the mod-switching operation, described in Section 4.1. As discussed in Section 4.1.1, there are certain optimizations that can be implemented when mod-switching from a ciphertext modulus $Q$ to a ciphertext modulus $q$, where $Q = Rq$.
We are starting with a ciphertext with an enciphering family $\{\bar c_i\}_{i \in I}$, each $\bar c_i \in A_Q$, and are computing an enciphering family $\{\bar c'_i\}_{i \in I}$, with each $\bar c'_i \in A_q$. As discussed above, all of these ring elements $\bar c_i$ and $\bar c'_i$ are represented in Double-CRT format.

We compute $\bar c'_i \in A_q$ according to the formula (30):

$$\bar c'_i = \big( \nu'(\bar c_i) - \nu(b'_i + R d_i) \big)\, \rho^{-1} \in A_q.$$

Here, $b'_i \in A$ is the canonical representative on the standard basis of the image of $\bar c_i$ under the natural map from $A_Q$ to $A_R$. Computationally speaking, this requires a conversion of the Double-CRT representation of an element in $A_R$ to its corresponding standard representation. The value $d_i \in A$ is easily derived from $b'_i$ with negligible computational cost. The computation of $\nu(b'_i + R d_i)$, where $\nu$ is the natural map from $A$ to $A_q$, requires a conversion of the standard representation of an element in $A_q$ to its corresponding Double-CRT representation.
Thus, the cost of mod-switching from $Q$ to $q$ is dominated by these two operations:

• Double-CRT to standard conversion in $A_R$;

• standard to Double-CRT conversion in $A_q$.

Note the savings over a more straightforward implementation in which the first operation would instead be a Double-CRT to standard conversion in $A_Q$.
Note: We do not modify the stored correction factor $\tilde\kappa$, but the implied correction factor $\kappa$ gets divided by $[R \bmod P]$.

5.4.3 Scaling up
The up-scaling operation described in Section 4.2, which scales up a ciphertext with modulus $q$ to one with modulus $Q = Rq$, is trivial to implement.
We are starting with a ciphertext with an enciphering family $\{\bar c_i\}_{i \in I}$, each $\bar c_i \in A_q$, and are computing an enciphering family $\{\bar c'_i\}_{i \in I}$, with each $\bar c'_i \in A_Q$. As discussed above, all of these ring elements $\bar c_i$ and $\bar c'_i$ are represented in Double-CRT format.
Consider one such ring element $\bar c_i \in A_q$, and suppose its Double-CRT representation, as in (37), is

$$\left\{ \bar c_i(\omega_\pi^j) \right\}_{(\pi, j) \in S \times \mathbb{Z}_m^*},$$

where $S$ is the associated prime set, i.e., $q = \prod_{\pi \in S} \pi$. To get the Double-CRT representation of $\bar c'_i$, we simply

• add to $S$ all the primes $\pi$ dividing $R$,

• for each $\pi$ dividing $R$ and each $j \in \mathbb{Z}_m^*$, we set the value $\bar c_i(\omega_\pi^j)$ to 0, and

• for each $\pi$ dividing $q$ and each $j \in \mathbb{Z}_m^*$, we multiply the value $\bar c_i(\omega_\pi^j)$ by $[R \bmod \pi]$.

Note: We do not modify the stored correction factor $\tilde\kappa$, but the implied correction factor $\kappa$ gets multiplied by $[R \bmod P]$.
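The three steps above can be checked on a toy instance. The sketch below makes the following assumptions, none of which come from HElib itself: $m = 8$, the ciphertext primes are 17 and 41, the single prime dividing $R$ is 73 (all $\equiv 1 \bmod 8$), evaluation is done naively, and all names are hypothetical. It verifies that the scaled-up representation equals the Double-CRT representation of $R \cdot c$ modulo $Q$.

```python
from math import gcd

M = 8
UNITS = [j for j in range(1, M) if gcd(j, M) == 1]   # representatives of Z_m^*

def root_of_unity(m, p):
    """Smallest element of multiplicative order exactly m in Z_p."""
    for w in range(2, p):
        if pow(w, m, p) == 1 and all(pow(w, d, p) != 1 for d in range(1, m)):
            return w
    raise ValueError("no primitive m-th root of unity mod p")

def double_crt(coeffs, primes):
    """Naive Double-CRT: evaluate at omega_p^j for each prime p, j in Z_m^*."""
    rep = {}
    for p in primes:
        w = root_of_unity(M, p)
        for j in UNITS:
            rep[(p, j)] = sum(c * pow(w, i * j, p) for i, c in enumerate(coeffs)) % p
    return rep

def scale_up(rep, q_primes, r_primes):
    """Scale a Double-CRT element from modulus q to Q = R*q."""
    R = 1
    for p in r_primes:
        R *= p
    out = {}
    for p in r_primes:                       # new primes: all values are 0
        for j in UNITS:
            out[(p, j)] = 0
    for p in q_primes:                       # old primes: multiply by R mod p
        for j in UNITS:
            out[(p, j)] = rep[(p, j)] * (R % p) % p
    return out
```

No conversion to the standard representation is needed, which is why the operation is cheap compared with mod switching.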

A Computing Em
With notation as in Section 2.5.2, we discuss techniques to efficiently compute $E_m$ in time $O(\varphi(m)^2)$ with satisfactory accuracy (see footnote 13).
13. The algorithm presented here is similar to the Parker-Traub algorithm in [5]. However, it is specialized to take advantage of the fact that the evaluation points are the roots of $\Phi_m(X)$.

First, observe that if we index the columns of $V_m^{-1}$ by $j \in \mathbb{Z}_m^*$, then the $j$th column is the coefficient vector of the Lagrange basis polynomial

$$\frac{1}{\Phi'_m(\omega^j)} \cdot \frac{\Phi_m(X)}{X - \omega^j}.$$

We also have

$$\Phi'_m(\omega^j) = \prod_{\substack{k \in \mathbb{Z}_m^* \\ k \ne j}} (\omega^j - \omega^k).$$

To compute $E_m$, we only need the absolute values $|\Phi'_m(\omega^j)|$, and using the standard formula $2\sin(\theta/2)$ for the length of a chord of angle $\theta$ on the unit circle, we have

$$|\Phi'_m(\omega^j)| = \prod_{\substack{k \in \mathbb{Z}_m^* \\ k \ne j}} 2\,|\sin((j - k)\pi/m)|.$$

By computing a table of all relevant values $|\sin(k\pi/m)|$ for $|k| \in [m]$, we can compute each value $|\Phi'_m(\omega^j)|$ using roughly $\varphi(m)$ multiplications. Some care must be taken to avoid floating point overflow/underflow, however. With this approach, the relative error of the result is guaranteed to be at most $\approx (\epsilon + \delta)\varphi(m)$, where $\epsilon$ is the machine precision (usually $2^{-53}$) and $\delta$ is the relative precision to which the values $|\sin((k - j)\pi/m)|$ are computed (which should also be very close to $2^{-53}$ if the computation is done in extended double precision and then rounded to double precision). The total cost of computing all of the values $|\Phi'_m(\omega^j)|$ is therefore $O(\varphi(m)^2)$.
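The sine-product formula can be sketched and cross-checked against a direct complex-arithmetic product. This is an illustration with hypothetical names, not the routine used in HElib. To sidestep overflow and underflow it sums logarithms instead of multiplying, which is one possible precaution and may have slightly different rounding behaviour from the error estimate quoted above.

```python
import cmath
import math
from math import gcd

def units_mod(m):
    return [a for a in range(1, m) if gcd(a, m) == 1]

def abs_derivative_via_sines(m):
    """|Phi_m'(omega^j)| for j in Z_m^*, from a table of 2*|sin(k*pi/m)|."""
    units = units_mod(m)
    # Table of log(2*sin(k*pi/m)) for k = 1..m-1 (index 0 is unused).
    log_chord = [None] + [math.log(2.0 * math.sin(k * math.pi / m))
                          for k in range(1, m)]
    return {a: math.exp(sum(log_chord[abs(a - k)] for k in units if k != a))
            for a in units}

def abs_derivative_direct(m):
    """Same quantity via explicit complex roots of unity (cross-check only)."""
    units = units_mod(m)
    root = {a: cmath.exp(2j * cmath.pi * a / m) for a in units}
    out = {}
    for a in units:
        prod = 1.0
        for k in units:
            if k != a:
                prod *= abs(root[a] - root[k])
        out[a] = prod
    return out
```

For prime $m$ the result can also be compared with the closed form $m / (2 \sin(j\pi/m))$, which follows from $\Phi_m(X)(X - 1) = X^m - 1$.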
To compute the polynomials $\Phi_m(X)/(X - \omega^j)$, one can first compute the polynomial $\Phi_m(X)$ exactly (see footnote 14). If

$$\Phi_m(X) = X^{\varphi(m)} + \sum_{i \in [\varphi(m)]} a_i X^i,$$

then one can compute the coefficients of the polynomial

$$\frac{\Phi_m(X)}{X - \omega^j} = \sum_{i \in [\varphi(m)]} q_i X^i$$

by the Horner scheme:

$$q_{\varphi(m)-1} = 1, \qquad q_i = q_{i+1}\,\omega^j + a_{i+1} \quad \text{for } i = \varphi(m) - 2 \text{ down to } 0.$$

Computing all of these coefficients for all $j \in \mathbb{Z}_m^*$ takes time $O(\varphi(m)^2)$. Experimentally, if computed in double precision, this formula yields accurate results with relative error at most $5 \times 10^{-7}$ for all $m$ up to 32,000, at most $5 \times 10^{-6}$ for all $m$ up to 64,000, and at most $2.5 \times 10^{-5}$ for all $m$ up to 128,000 (see footnote 15).
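The Horner scheme is ordinary synthetic division by a known root. A minimal sketch follows, with a hypothetical function name; the input list holds the non-leading coefficients $a_0, \ldots, a_{n-1}$ of a monic polynomial of degree $n$, and `root` is assumed to be a zero of that polynomial.

```python
import cmath

def divide_by_root(a, root):
    """Coefficients q_0..q_{n-1} of (X^n + sum_i a_i X^i) / (X - root)."""
    n = len(a)
    q = [0j] * n
    q[n - 1] = 1 + 0j                      # leading coefficient of the quotient
    for i in range(n - 2, -1, -1):         # i = n-2 down to 0
        q[i] = q[i + 1] * root + a[i + 1]
    return q
```

For example, with $\Phi_5(X) = X^4 + X^3 + X^2 + X + 1$ one would call `divide_by_root([1, 1, 1, 1], cmath.exp(2j * cmath.pi / 5))`; multiplying the result back by $X - \omega$ recovers $\Phi_5$ up to rounding error.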
The above computations are trivially parallelized, as the columns of $V_m^{-1}$ can be computed independently of one another. The computation can be further simplified, based on the following.
Lemma 4. If $m$ is a positive integer and $n$ is the product of the distinct primes dividing $m$, then $E_m = E_n$.
14. Efficient algorithms for computing $\Phi_m(X)$ may be found, for example, in [1]. These algorithms run much faster than $O(\varphi(m)^2)$.
15. This was verified by computing each value $|q_i|$ in both double precision and extended double precision. With little additional cost, one can get better accuracy by performing the computation in extended double precision, if the hardware supports it.

Proof. Let $q := m/n$. Then we have the following well-known identity:

$$\Phi_m(X) = \Phi_n(X^q).$$

Based on this, if $\omega$ is a primitive $m$th root of unity and $j \in \mathbb{Z}_m^*$, then we have

$$\frac{\Phi_m(X)}{X - \omega^j} = \frac{\Phi_n(X^q)}{X^q - \omega^{jq}} \sum_{i \in [q]} \omega^{j(q-1-i)} X^i.$$

Moreover,

$$\Phi'_m(X) = \Phi'_n(X^q) \cdot q X^{q-1}$$

and so

$$|\Phi'_m(\omega^j)| = q\,|\Phi'_n(\omega^{jq})|.$$

Also note that $\omega^q$ is a primitive $n$th root of unity, and that as $j$ runs over $\mathbb{Z}_m^*$, the value $\omega^{jq}$ runs over all primitive $n$th roots of unity, each repeated $q$ times.
Based on the above observations, it is not hard to see that we can permute rows and columns of $V_m^{-1}$ and $V_n^{-1}$, respectively, to obtain matrices $(a_{i,j})$ and $(b_{i,j})$, such that for each $(i, j) \in [\varphi(m)] \times [\varphi(m)]$, we have

$$q\,|a_{i,j}| = |b_{\lfloor i/q \rfloor, \lfloor j/q \rfloor}|.$$

It follows that $E_m = E_n$.

Lemma 5. If $m$ is an odd positive integer, then $E_{2m} = E_m$.

Proof. If $m = 1$, then $E_2 = E_1 = 1$. So suppose $m > 1$ and $m$ is odd. Then we have the following well-known identity:

$$\Phi_{2m}(X) = \Phi_m(-X).$$

Note that if $\omega$ is a primitive $m$th root of unity, then as $j$ runs over $\mathbb{Z}_m^*$, the value $-\omega^j$ runs over all primitive $2m$th roots of unity.
Based on the above observations, it is not hard to see that we can permute rows and columns of $V_m^{-1}$ and $V_{2m}^{-1}$, respectively, to obtain matrices $(a_{i,j})$ and $(b_{i,j})$, such that for each $(i, j) \in [\varphi(m)] \times [\varphi(m)]$, we have

$$|a_{i,j}| = |b_{i,j}|.$$

It follows that $E_m = E_{2m}$.

Combining the above two lemmas, we have:

Theorem 1. Let m be a positive integer and let n be the product of the distinct odd primes dividing
m. Then Em = En .

Using the above theorem, instead of computing Em , we can just compute En , where n is the
product of the distinct odd primes dividing m.
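The reduction from $m$ to $n$ is a small computation. A minimal sketch, with a hypothetical function name, using trial division:

```python
def odd_radical(m):
    """Product of the distinct odd primes dividing m (1 if there are none)."""
    while m % 2 == 0:          # discard the power of two
        m //= 2
    n, p = 1, 3
    while p * p <= m:
        if m % p == 0:
            n *= p             # count each odd prime once
            while m % p == 0:
                m //= p
        p += 2
    if m > 1:                  # remaining factor is an odd prime
        n *= m
    return n
```

For example, `odd_radical(360)` is 15, so by the theorem $E_{360} = E_{15}$, and for any power of two the result is 1.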

B Selecting generators for the hypercube
With notation as in Section 3.1.2.3, by default, HElib will construct generators $g_1, \ldots, g_\ell$ of orders
$n_1, \ldots, n_\ell$ for the hypercube as in (20) using the following procedure.

$H_0 \leftarrow H$
$i \leftarrow 1$
while $H_{i-1} \ne \mathbb{Z}_m^*$ do
    let $n_i$ be the maximal order of any element in the quotient group $\mathbb{Z}_m^*/H_{i-1}$
    choose $g_i \in \mathbb{Z}_m^*$ such that
        (a) the order of $g_i$ mod $H_{i-1}$ is $n_i$, and
        (b) $g_i^{n_i} \in H$ (and, if possible, $g_i^{n_i} = 1$)
    let $H_i$ be the subgroup generated by $H_{i-1} \cup \{g_i\}$
    $i \leftarrow i + 1$
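
The procedure can be sketched as a brute-force search in Python. This is an illustration under stated assumptions
and not HElib's actual routine: the function names are hypothetical, the subgroup H is passed in explicitly as a
set of residues, and the search is only practical for small m. The sketch relies on the fact, argued in the text
that follows, that a candidate satisfying both (a) and (b) always exists.

```python
from math import gcd


def subgroup_generated(m, gens):
    """Subgroup of Z_m^* generated by gens, as a frozenset of residues."""
    group, frontier = {1}, [1]
    while frontier:
        x = frontier.pop()
        for g in gens:
            y = (x * g) % m
            if y not in group:
                group.add(y)
                frontier.append(y)
    return frozenset(group)


def order_mod(g, subgroup, m):
    """Order of g modulo subgroup: the least k >= 1 with g^k in subgroup."""
    k, x = 1, g % m
    while x not in subgroup:
        x = (x * g) % m
        k += 1
    return k


def select_generators(m, H):
    """Return [(g_1, n_1), ..., (g_l, n_l)] chosen as in the procedure above."""
    units = frozenset(a for a in range(1, m) if gcd(a, m) == 1)
    H0 = frozenset(H)
    H_prev, dims = H0, []
    while H_prev != units:
        orders = {g: order_mod(g, H_prev, m) for g in units}
        n_i = max(orders.values())
        # Condition (a): the order of g mod H_{i-1} is n_i.
        candidates = sorted(g for g in units if orders[g] == n_i)
        # Condition (b): g^{n_i} lies in H, preferring g^{n_i} = 1 when possible.
        good = [g for g in candidates if pow(g, n_i, m) == 1]
        bad = [g for g in candidates if pow(g, n_i, m) in H0]
        g_i = good[0] if good else bad[0]
        dims.append((g_i, n_i))
        # H_i is generated by H_{i-1} and g_i; the group is abelian, so it is H_{i-1} * <g_i>.
        H_prev = frozenset((h * pow(g_i, k, m)) % m for h in H_prev for k in range(n_i))
    return dims
```

As an assumed example, take m = 15 and H generated by 2, so H = {1, 2, 4, 8}. The quotient has order 2, and the
sketch returns the single pair (11, 2), with $11^2 \equiv 1 \pmod{15}$. With m = 17 and H generated by 2, no element
outside H squares to 1, so the sketch returns (3, 2), where $3^2 = 9$ lies in H but is not 1.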

If we ignore condition (b) in the choice of $g_i$, it is clear that the algorithm succeeds in producing
$g_1, \ldots, g_\ell$ that give us a complete set of representatives as required. Moreover, because we always
choose $n_i$ maximal, we are assured of the existence of an element $g_i$ that satisfies both conditions
(a) and (b).
To see why, first notice that, in fact, $n_i$ is the exponent of the group $\mathbb{Z}_m^*/H_{i-1}$: this follows
from the well-known fact that any finite abelian group contains an element whose order is equal to
the exponent of the group. Because of this, it follows from elementary properties of exponents and
quotient groups that
$$n_i \mid n_{i-1} \mid \cdots \mid n_1.$$
Now suppose we choose $g_i$ satisfying (a), which we can always do. We can modify $g_i$, if necessary,
to satisfy (b) as follows. If $i = 1$, there is nothing to do, so assume $i \ge 2$. We know $g_i^{n_i} \in H_{i-1}$,
which means $g_i^{n_i} = g_{i-1}^s h$ for some $s \in \mathbb{Z}$ and $h \in H_{i-2}$. Because $n_{i-1}$ is the
exponent of $\mathbb{Z}_m^*/H_{i-2}$, we know that $g_i^{n_{i-1}} \in H_{i-2}$. Therefore,
$$H_{i-2} \ni g_i^{n_{i-1}} = \left(g_i^{n_i}\right)^{n_{i-1}/n_i} = g_{i-1}^{s\,n_{i-1}/n_i}\, h' \quad \text{for some } h' \in H_{i-2}.$$
Since $g_{i-1}$ has order $n_{i-1}$ mod $H_{i-2}$, we must have $n_i \mid s$. So let $g_i' := g_i \cdot g_{i-1}^{-s/n_i}$.
Observe that $g_i'$ is in the same coset of $H_{i-1}$ as $g_i$, and therefore $g_i'$ also has order $n_i$ mod $H_{i-1}$.
However, $(g_i')^{n_i} = g_i^{n_i} g_{i-1}^{-s} = h \in H_{i-2}$. So we replace $g_i$ by $g_i'$. If $i = 2$, we are done.
Otherwise, we can repeat the same procedure to replace $g_i$ by an element whose order mod $H_{i-1}$ is $n_i$ but
with $g_i^{n_i} \in H_{i-3}$. Continuing in this way, we arrive at an element $g_i$ that satisfies both (a) and (b).
The above procedure is basically just a proof of the Fundamental Theorem of Finite Abelian
Groups, applied to the group $\mathbb{Z}_m^*/H$. Recall that in Section 3.1.2.3, in discussing the hypercube,
we can have good, bad, or very bad dimensions. Good means $g_i^{n_i} = 1$, bad means $g_i^{n_i} \ne 1$ but
$g_i^{n_i} \in H$, and very bad means $g_i^{n_i} \notin H$. The routine used in HElib will always try to choose
a generator that yields a good dimension, if that is possible. Nevertheless, it may produce bad
dimensions. However, it will never produce a very bad dimension.
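
The three cases can be expressed directly as a small classifier. This is a minimal Python sketch, not HElib code;
the function name classify_dimension and the representation of H as a Python set of residues are assumptions made
for illustration.

```python
def classify_dimension(g, n, m, H):
    """Classify a hypercube dimension with generator g of order n (mod the previous subgroup).

    H is the subgroup of Z_m^* from the text, given as a set of residues mod m.
    """
    x = pow(g, n, m)
    if x == 1:
        return "good"      # g^n = 1
    if x in H:
        return "bad"       # g^n != 1, but g^n is in H
    return "very bad"      # g^n is not in H


# Assumed example: m = 15 with H generated by 2, and m = 17 with H generated by 2.
H15 = {1, 2, 4, 8}
H17 = {1, 2, 4, 8, 9, 13, 15, 16}
print(classify_dimension(11, 2, 15, H15))  # 11^2 = 121 = 1 mod 15
print(classify_dimension(3, 2, 17, H17))   # 3^2 = 9, which is in H17 but is not 1
```

The first call reports a good dimension and the second a bad one.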

References
[1] A. Arnold and M. Monagan. Calculating cyclotomic polynomials. Mathematics of Computation,
80(276):2359–2379, 2011. Available at
https://www.ams.org/journals/mcom/2011-80-276/S0025-5718-2011-02467-1.

[2] L. I. Bluestein. A linear filtering approach to the computation of the discrete Fourier transform.
Northeast Electronics Research and Engineering Meeting Record 10, 1968.

[3] Z. Brakerski, C. Gentry, and V. Vaikuntanathan. Fully homomorphic encryption without
bootstrapping. In Innovations in Theoretical Computer Science (ITCS’12), 2012. Available at
http://eprint.iacr.org/2011/277.

[4] J. H. Cheon, A. Kim, M. Kim, and Y. S. Song. Homomorphic encryption for arithmetic of
approximate numbers. In T. Takagi and T. Peyrin, editors, Advances in Cryptology - ASIACRYPT
2017 - 23rd International Conference on the Theory and Applications of Cryptology
and Information Security, Hong Kong, China, December 3-7, 2017, Proceedings, Part I, volume
10624 of Lecture Notes in Computer Science, pages 409–437. Springer, 2017.

[5] I. Gohberg and V. Olshevsky. The fast generalized Parker-Traub algorithm for inversion of
Vandermonde and related matrices. Journal of Complexity, 13(2):208–234, 1997. Available at
https://pdfs.semanticscholar.org/9233/77ec0483df93af85eb60d108e16cd648f273.pdf.

[6] S. Halevi and V. Shoup. Algorithms in HElib. Cryptology ePrint Archive, Report 2014/106,
2014. Available at https://eprint.iacr.org/2014/106.

[7] S. Halevi and V. Shoup. Bootstrapping for HElib. Cryptology ePrint Archive, Report 2014/873,
2014. Available at https://eprint.iacr.org/2014/873.

[8] S. Halevi and V. Shoup. Faster homomorphic linear transformations in HElib. Cryptology ePrint
Archive, Report 2018/244, 2018. Available at https://eprint.iacr.org/2018/244.

[9] V. Lyubashevsky, C. Peikert, and O. Regev. On ideal lattices and learning with errors over
rings. In H. Gilbert, editor, Advances in Cryptology - EUROCRYPT’10, volume 6110 of Lecture
Notes in Computer Science, pages 1–23. Springer, 2010.
