Abstract
The ring learning with errors (RLWE) problem can be used to construct efficient post-quantum public key encryption schemes. The RLWE problem involves an error distribution, normally a Gaussian-like distribution. In this work we focus on using polar codes to alleviate a natural trade-off present in RLWE public key encryption schemes: a wider error distribution increases security, but it comes at the cost of an increased probability of decryption error. The motivation of this work is to improve the bit-security level by using a wider error distribution while keeping the target decryption failure rate achievable. The approach we propose is twofold. Firstly, we formulate RLWE public key encryption as a channel model with some noise terms known by the decoder. This distinguishes our approach from existing research of this kind, which ignores these known terms. Secondly, we design polar codes for the derived channel model. Theoretically and numerically, we show that the proposed modeling and polar coding scheme contribute to a considerable bit-security level improvement compared with NewHope, a submission to the National Institute of Standards and Technology (NIST), with almost the same parameters. Moreover, polar encoding and decoding support isochronous implementations in the sense that the timings of the associated operations are independent of the sensitive information.
1 Introduction
1.1 Error correction for Ring-LWE-based public key encryption
As the world’s top tech companies and research labs compete in the race to build a quantum computer, real-world public key cryptography, such as digital signatures, public key encryption (PKE), and key exchange protocols, must be made quantum resistant. The ring learning with errors (RLWE) problem was introduced in [20] in 2010, expanding on the classical version of the learning with errors problem (LWE) introduced by Regev in [30]. Since then, cryptography based on the RLWE problem has become one of the most attractive post-quantum candidates. Its security relies on the worst-case approximate shortest independent vector problem (SIVP) on ideal lattices, and it is more efficient than plain LWE because of the ring structure. Many of the prominent submissions to the National Institute of Standards and Technology’s (NIST) call for proposals [26], for example NewHope [4] and LAC [19] (see Footnote 1), are based on RLWE. Although neither of the two advanced to NIST’s third round, academic and industrial study of RLWE cryptography and its applications continues. In this work, we focus on the issue of error correction for RLWE-based public key encryption.
Among the RLWE-based public key exchange protocols, there are essentially two major approaches to the problem of sharing a session key which is used to protect the communication payload: the reconciliation approach of [10] and the encryption approach of [3]. In the first approach, both participants agree on a shared value from some pseudorandom signals with the help of a robust extractor. This work focuses on the other approach, which resembles the compact RLWE public key encryption scheme proposed in [21]. Taking NewHope as an example, the binary secret to be shared is encoded using a repetition code, mapped to \(\{0,\lfloor q/2\rfloor \}^{n}\) and then wrapped by an encryption function. A residue noise term remains after decryption. Upon getting a decrypted codeword, the decoder sums up the symbols corresponding to the repeated digits and infers whether the digit should be 0 or 1 according to a threshold. In telecommunication terms, this process is exactly a hard-decision decoding process, which does not offer the optimal decoding performance. Decreasing the decryption failure rate (DFR) is of vital importance to RLWE-based PKE. Firstly, if we seek chosen-ciphertext attack (CCA) security of the above cryptosystem using the classical Fujisaki-Okamoto transform [14], a very low DFR is required, and the NIST standardization targets a failure rate lower than \(2^{-128}\). Secondly, more capable error correction allows larger error terms in RLWE, increasing the hardness of the underlying lattice problem and therefore the security of the cryptosystem.
To improve the error correction and security of RLWE-based PKE, some researchers have exploited the goodness of multidimensional lattices. For example, Leech lattice encoding and decoding are used in LWE-based PKE [28]. The Leech lattice gives the densest sphere packing in its dimension, which means a good trade-off between information transmission rate and error correction capability. An alternative way to decrease the DFR is to apply error-correcting codes (ECC). In [13], Fritzmann et al. considered how much the RLWE-based PKE protocol NewHope Simple could profit, with respect to the DFR, from Bose–Chaudhuri–Hocquenghem (BCH) codes, low-density parity-check (LDPC) codes, and a hybrid of the two. They achieved a DFR of \(2^{-140}\) using these codes, but their decoding algorithms were not intrinsically constant-time, although this defect could likely be managed if proper measures (e.g., fixing the number of iterations) were taken. In an independent line of work, Saarinen designed a linear block code called XE5 and implemented it in an RLWE-based PKE scheme called Hila5 [31] (see Footnote 2). This method is able to share 256 bits of message and an additional 240 bits of redundancy at a DFR below \(2^{-128}\). The decoding algorithm runs in constant time, which provides resilience to timing-based side-channel attacks.
How to deal with the dependency in the residue noise term of RLWE-based PKE is closely related to the soundness of DFR estimation [9, 13, 32]. For example, in the ring \(\mathbb {Z}_{q}[X]/(X^{n}+1)\), the multiplication, denoted by ⋅, of two ring elements results in a polynomial with correlated coefficients. In the case of RLWE-based PKE, the residue noise \(e \cdot t - s \cdot e_{1} + e_{2}\) (see Section 2.2) has correlated coefficients, such that an incorrect decryption of one bit may increase the probability of decryption error of other bits. As a result, the DFR estimation will be inaccurate if we assume the noise term has independent and identically distributed (i.i.d.) coefficients [9]. Moreover, advanced decoding algorithms (e.g. soft-decision decoding) presume an i.i.d. channel. That is the reason why we seek an i.i.d. noise model. We have found a few “independence” assumptions in the literature. Fritzmann et al. gave upper bounds on the DFR using their error-correcting codes, assuming that the coefficients of the residue noise can be seen as independent [13]. They improved the bit-security of NewHope to 309 bits for n = 1024, q = 12289, targeting a DFR of \(2^{-140}\). D’Anvers et al. assume the residue noise has independent coefficients conditional on its norm. This method was used to deal with ternary error terms, but it is impractical for true discrete Gaussian errors [9]. Song et al. formulated NewHope as a digital communication system and resolved part of the dependency. They improved the bit-security to 252 bits (n = 1024, q = 12289), also targeting a DFR of \(2^{-140}\) [32].
1.2 Originality and contribution
Distinguished from existing error-correcting schemes for RLWE-based PKE in the literature, the originality of the proposed polar coding scheme is threefold.
1. Firstly, we take advantage of the fact that the secrets s and e in the residue noise \(e \cdot t - s \cdot e_{1} + e_{2}\) are known by Alice at the decryption stage. This is leveraged to improve decoding performance. To this end, this work formalizes the mathematical model of RLWE PKE as a fading channel with channel state information (CSI) available. Existing works (e.g. [32]) also treat RLWE PKE as a telecommunication system but do not exploit the knowledge about s and e, which are seen as CSI in this work.
2. Secondly, we resolve the correlation between the coefficients of the residue noise using the canonical embedding, under which polynomial multiplications are turned into coordinate-wise multiplications (see Footnote 3), and we derive an i.i.d. channel model in the end. This allows us to carry out soft-decision decoding and accurate DFR estimation.
3. In addition to providing an error-correcting approach, polar codes exhibit some salient features. Compared with BCH and LDPC codes, polar codes provide competitive and well-understood decoding performance limits measured by the Bhattacharyya parameter. Moreover, their decoding is not affected by error floors [23]. Besides, the encoding and decoding of polar codes are isochronous in the sense that the timings of encoding and decoding are independent of the secret and the plaintext.
The contributions of this paper are summarized as follows.
1. We formulate the RLWE-based PKE as an i.i.d. fading channel with CSI available to the receiver without any “independence” assumptions. These are the prerequisites of the proposed polar coding scheme.
   (a) As explained earlier in this section, the coefficient correlation of the residue noise term \(e \cdot t - s \cdot e_{1} + e_{2}\) is removed by the canonical embedding, leading to an i.i.d. channel model. We view e and s as CSI, which is known by Alice at the decryption stage, whilst Bob on the other side only knows its distribution.
   (b) In telecommunication terms, mapping a single bit 0 or 1 of the plaintext to a symbol on the constellation \(\{0,\lfloor q/2\rfloor \}\) is called modulation. To make the modulation scheme fit the i.i.d. fading channel in the canonical basis, we propose a new modulation scheme at the cost of error tolerance.
2. Then we give the explicit construction of polar codes for the RLWE-based PKE channel model. Experimental results and a theoretical estimation of the DFR are also given. Specifically, we derive a new DFR of \(2^{-298}\) for q = 12289, n = 1024, r = 2 (\(r=\sqrt {k/2}\)) and code rate 0.25, while NewHope gives a DFR of \(2^{-216}\) in the same setting; we derive a new DFR of \(2^{-156}\) for r = 2.83 (k = 16) and code rate 0.25, while NewHope is proved to give a DFR of \(2^{-137}\) in almost the same setting [32]. Thanks to the new DFR margin, the proposed RLWE-based PKE achieves a better bit-security level than NewHope while achieving the same target DFR. Besides, the encoding and decoding of polar codes support quasi-linear (i.e., \(O(n\log n)\), with n the degree of the cyclotomic field of RLWE) and isochronous implementations, which will be discussed in detail in Section 7.2.
1.3 Roadmap
This paper is organized as follows. A review of the necessary algebraic number theory, fading channels and polar codes can be found in Section 2. In Section 3 we explain how to formulate a typical RLWE-based PKE scheme as an i.i.d. fading channel. How to handle the dependency in canonical basis is also demonstrated. Section 4 gives a high-level description of RLWE-based PKE with the proposed polar coding scheme. Section 5 gives the explicit construction of polar codes for RLWE. Section 6 analyzes the DFR theoretically and experimentally when polar coding is applied. Section 7 discusses the bit-security improvement derived by the new DFR margin as well as the isochrony of polar codes. Section 8 concludes this paper.
2 Preliminaries
2.1 Algebraic number theory
We review the necessary concepts from algebraic number theory required for our discussion of ring-LWE. In particular, we will relate many of our definitions to power-of-two cyclotomic fields, which are popular in modern cryptography.
A number field \(K= \mathbb {Q}(\zeta )\) can be defined by adjoining an element \(\zeta \in \mathbb {C}\) to the field \(\mathbb {Q}\), where ζ satisfies f(ζ) = 0 for some irreducible polynomial \(f(X) \in \mathbb {Q}[X]\). Then, the degree of K over \(\mathbb {Q}\) is precisely the degree n of f(X). Because f(ζ) = 0, K can be seen as a vector space over \(\mathbb {Q}\) endowed with a basis \(\{1,\zeta ,\ldots ,\zeta ^{n-1}\}\) known as the power basis of K. Let \(\zeta _{m}\) be a primitive m-th complex root of unity with minimal polynomial
\[f(X)=\prod _{k\in \mathbb {Z}^{*}_{m}}\left (X-\zeta _{m}^{k}\right ),\]
where \(\mathbb {Z}^{*}_{m}\) is the group of invertible elements in \(\mathbb {Z}_{m}\). Then, the m-th cyclotomic number field is defined as \(K=\mathbb {Q}(\zeta _{m})\). When m ≥ 2 is a power of two, \(f(X) = X^{n} + 1\) and n = m/2.
A number field K of degree n permits n distinct ring embeddings \(\sigma _{i}: K \rightarrow \mathbb {C}, i =1,...,n,\) which correspond to n automorphisms of K mapping ζ to each root of its minimal polynomial f(X). The n embeddings include \(s_{1}\) real embeddings and \(s_{2}\) pairs of complex conjugate embeddings, so that \(n=s_{1}+2s_{2}\). The concatenation of the n embeddings is called the canonical embedding σ(⋅), which is a map from K into the space
\[H=\left \{(x_{1},\ldots ,x_{n})\in \mathbb {R}^{s_{1}}\times \mathbb {C}^{2s_{2}} : x_{s_{1}+s_{2}+j}=\overline {x_{s_{1}+j}},\ j=1,\ldots ,s_{2}\right \}.\]
For power-of-two cyclotomics, \(s_{1} = 0\). Because the complex embeddings come in pairs of conjugates, H is isomorphic to \(\mathbb {R}^{n}\). We also remark that under the embedding σ, multiplication in K maps to coordinate-wise multiplication in H.
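The remark that σ turns ring multiplication into coordinate-wise multiplication can be checked numerically. The following is a minimal sketch, assuming the embeddings are taken as evaluations at the roots \(e^{i\pi (2k+1)/n}\) of \(X^{n}+1\) in increasing order of k (the ordering used in the paper may differ); the function names are illustrative only.

```python
import numpy as np

def negacyclic_mul(a, b):
    """Multiply coefficient vectors (lowest degree first) modulo X^n + 1."""
    n = len(a)
    full = np.convolve(a, b)              # plain product, degree up to 2n - 2
    res = full[:n].astype(float)
    res[: n - 1] -= full[n:]              # X^n = -1 folds the high part back with a sign flip
    return res

def canonical_embedding(a):
    """Evaluate a(X) at the n roots of X^n + 1 (assumed ordering, see lead-in)."""
    n = len(a)
    roots = np.exp(1j * np.pi * (2 * np.arange(n) + 1) / n)
    vandermonde = roots[:, None] ** np.arange(n)[None, :]
    return vandermonde @ a

rng = np.random.default_rng(0)
n = 8
a = rng.integers(-3, 4, size=n)
b = rng.integers(-3, 4, size=n)

lhs = canonical_embedding(negacyclic_mul(a, b))
rhs = canonical_embedding(a) * canonical_embedding(b)   # coordinate-wise product
embedding_is_multiplicative = np.allclose(lhs, rhs)
```

With this ordering, coordinate k and coordinate n − 1 − k of the embedding are complex conjugates, which is the pairing that makes H isomorphic to \(\mathbb {R}^{n}\).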
Let \(\mathcal {O}_{K}\) be the set of all the algebraic integers in K. It forms a ring and is called ring of integers of the number field. For the above power-of-two cyclotomics, the ring of integers is \(\mathcal {O}_{K}=\mathbb {Z}[X]/{(}1+X^{n}{)}\) and the canonical embedding maps \(\mathcal {O}_{K}\) to an algebraic lattice in space H and the lattice generator matrix is defined as
Moreover, because of the conjugate pairs of the embeddings, we can rewrite σ as \(\sigma ^{\prime }: K\rightarrow \mathbb {R}^{n}\)
And the corresponding basis \(\tilde {B}\) of the images of the mapping is
Note that both B and \(\tilde {B}\) are orthogonal matrices up to scaling, i.e., their rows are mutually orthogonal and have equal norm. The absolute value of the determinant of B is \(\sqrt {n}^{n}\) while that of \(\tilde {B}\) is \((\sqrt {n/2})^{n}\).
2.2 Ring-LWE public key encryption scheme and the coefficient dependency
For concreteness, we give an example of a public key scheme based on ring-LWE which was first described in [21]. Many ring-LWE schemes and protocols, including NewHope, closely resemble this one. The scheme is parameterized by an integer modulus q, dimension n, and error distribution χ over \(R_{q}\). We will take the example of NewHope and view \(R_{q}\) as \(\mathbb {Z}_{q}[X]/(X^{n}+1)\) and define sampling from χ to be sampling each coefficient of a polynomial from the discrete Gaussian over \(\mathbb {Z}\). The scheme proceeds as follows.
- Alice samples a secret key \(s \leftarrow \chi\) and publishes as a public key a ring-LWE sample \((a,b) = (a,a \cdot s + e) \in R_{q} \times R_{q}\), where a is uniformly random and \(e \leftarrow \chi\).
- Bob encrypts a message \(m \in R_{2}\) as \((c_{1}, c_{2}) = (a \cdot t + e_{1}, b \cdot t + e_{2} + \lfloor \frac {q}{2} \rfloor \cdot m)\), where \(e_{1},e_{2},t\) are sampled independently from χ.
- Alice decrypts using s by \(d := c_{2} - c_{1} \cdot s = \lfloor \frac {q}{2} \rfloor \cdot m +e \cdot t - s\cdot e_{1} + e_{2}\).
Alice then recovers the message m by decoding: if the ith coordinate of d is closer to 0 than ⌊q/2⌋, Alice assumes the ith coordinate of m was 0, otherwise she assumes it was 1.
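The three steps and the threshold decoding above can be sketched in a few lines. This is a toy illustration only, not the NewHope implementation: it assumes a rounded continuous Gaussian as a stand-in for the discrete Gaussian χ, uses schoolbook multiplication modulo \(X^{n}+1\), and all function names are hypothetical.

```python
import numpy as np

def ring_mul(a, b, q):
    """Product of two polynomials in Z_q[X]/(X^n + 1), coefficients lowest degree first."""
    n = len(a)
    full = np.convolve(a, b)
    res = full[:n].copy()
    res[: n - 1] -= full[n:]              # X^n = -1
    return res % q

def sample_error(rng, n, r):
    # Assumption: rounded Gaussian of standard deviation r instead of a true discrete Gaussian.
    return np.rint(rng.normal(0.0, r, size=n)).astype(np.int64)

def keygen(rng, n, q, r):
    a = rng.integers(0, q, size=n, dtype=np.int64)
    s, e = sample_error(rng, n, r), sample_error(rng, n, r)
    return (a, (ring_mul(a, s, q) + e) % q), s            # public key (a, b), secret key s

def encrypt(rng, pk, m, q, r):
    a, b = pk
    t, e1, e2 = (sample_error(rng, len(a), r) for _ in range(3))
    c1 = (ring_mul(a, t, q) + e1) % q
    c2 = (ring_mul(b, t, q) + e2 + (q // 2) * m) % q
    return c1, c2

def decrypt(s, ct, q):
    c1, c2 = ct
    d = (c2 - ring_mul(c1, s, q)) % q                     # floor(q/2) m + e t - s e1 + e2
    dist_to_zero = np.minimum(d, q - d)                   # cyclic distance to 0
    dist_to_half = np.abs(d - q // 2)                     # distance to floor(q/2)
    return (dist_to_half < dist_to_zero).astype(np.int64)

rng = np.random.default_rng(1)
n, q, r = 64, 12289, 2.0
pk, sk = keygen(rng, n, q, r)
m = rng.integers(0, 2, size=n, dtype=np.int64)
recovered = decrypt(sk, encrypt(rng, pk, m, q, r), q)
```

For these small toy parameters the residue noise is far below the decision threshold of roughly q/4, so `recovered` equals `m`; the sketch makes no claim about security or about the DFR of the real scheme.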
We find the dependency between the coefficients of the residue noise term e ⋅ t − s ⋅ e1 + e2 obvious if we rewrite it in vector form using coefficient embedding of Rq, i.e.,
where e(i) is the i-th coefficient of polynomial e and t is the coefficient embedding of polynomial t. The row vectors of the negacyclic matrix generated by e have identical norm and they are multiplied by the same vector t and so do s and e1.
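To make the negacyclic structure concrete, the sketch below builds the matrix generated by e and checks two properties stated above: multiplying it by the coefficient vector of t reproduces the polynomial product e ⋅ t, and all of its rows have the same norm. The index convention (entry (i, j) holds the coefficient of index i − j, negated when the index wraps around) is an assumption made for illustration.

```python
import numpy as np

def negacyclic_matrix(e):
    """Matrix M such that M @ t gives the coefficients of e(X) t(X) mod X^n + 1."""
    n = len(e)
    M = np.zeros((n, n), dtype=np.int64)
    for i in range(n):
        for j in range(n):
            k = i - j
            # A wrapped index picks up a minus sign because X^n = -1.
            M[i, j] = e[k] if k >= 0 else -e[k + n]
    return M

def negacyclic_mul(a, b):
    """Reference product modulo X^n + 1 via convolution and folding."""
    n = len(a)
    full = np.convolve(a, b)
    res = full[:n].copy()
    res[: n - 1] -= full[n:]
    return res

rng = np.random.default_rng(2)
n = 8
e = rng.integers(-4, 5, size=n)
t = rng.integers(-4, 5, size=n)

M = negacyclic_matrix(e)
row_norms = np.linalg.norm(M, axis=1)     # every row is a signed rotation of e
```

Because every row contains the same coefficients of e up to order and sign, the coordinates of M t share the same randomness in t, which is the source of the dependency.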
2.3 Fading channel
In wireless communications, a fading channel arises due to a time-varying attenuation of signal quality caused by either the propagation environment or by movement of the transmitter/receiver. We consider a discrete-time fading channel model W
\[y_{i}=h_{i}x_{i}+z_{i},\quad i=1,\ldots ,N,\]
where \(x_{i}\) is the channel input, \(y_{i}\) is the channel output, \(h_{i}\) is the channel gain, \(z_{i}\) is additive white Gaussian noise (AWGN) and N is the signal length. We highlight two facts about CSI which are relevant to the RLWE channel model we will discuss in Section 3. Firstly, a few consecutive \(h_{i}\) may be correlated, and this period is called the coherence interval of the fading channel W, denoted by \(T_{c}\). In the context of a fading channel with memory, the channel gain \(h_{i}\) is assumed to be constant within one coherence interval and to vary independently from one coherence interval to the next. Secondly, the realization of \(h_{i}\) is called channel state information (CSI) and the distribution of \(h_{i}\) is called channel distribution information (CDI). CSI is sometimes known to the decoder.
When designing a telecommunication system, we prefer i.i.d. fading channels where the \(h_{i}\) are independent. There are a few methods to deal with the correlation. Let \(m = T_{c} > 1\) and N/m = n. Since a fading channel with coherence interval \(T_{c}\) can be seen as m parallel sub-channels, a bit-interleaved coded modulation (BICM) technique can be used to handle the correlation between sub-channels [7, 22]. Another solution is to use multilevel codes [11] to design a coded modulation scheme with signal points in an m-dimensional signal space. In [18], a properly chosen lattice partition chain \({{\varLambda }}_{1}/\cdots /{{\varLambda }}_{l-1}/{{\varLambda }}_{l}\) is employed to design multilevel polar codes to achieve fading channel capacity. In this case, the dimension m of \({{\varLambda }}_{1}\) is properly chosen such that the channel gain \(h_{i}\) is assumed to be constant during the whole transmission of m symbols, i.e. \(T_{c} = m\). A component code \(\mathcal {C}_{i}\) at the i-th level of the partition chain is designed in order to achieve the capacity of a \({{\varLambda }}_{i}/{{\varLambda }}_{i+1}\) fading channel. The component codes are combined by Construction D, giving rise to a lattice. More information about the multilevel construction and the \({{\varLambda }}_{i}/{{\varLambda }}_{i+1}\) channel can be found in [18] and [11]. We give an example of a mod \(\mathbb {Z}\) channel and a \(\mathbb {Z}/2\mathbb {Z}\) channel as follows, and the fading version will be given in Section 3.
Example 1
A mod \(\mathbb {Z}\) channel is an AWGN channel with input restricted to \(a\in \mathcal {V}(\mathbb {Z})\) where \(\mathcal {V}(\mathbb {Z})\) is the fundamental region (see Footnote 4) of \(\mathbb {Z}\). At the receiver’s end, there is a mod \(\mathcal {V}(\mathbb {Z})\) operation giving the equivalent channel output as
\[y=(a+z)~\text {mod}~\mathbb {Z}=(a+z^{\prime })~\text {mod}~\mathbb {Z},\]
where z is an AWGN noise and \(z^{\prime }=z~\text {mod}~\mathbb {Z}\).
Example 2
A \(\mathbb {Z}/2\mathbb {Z}\) channel is an AWGN channel with input restricted to \(r\in (\mathbb {Z}+a)\cap \mathcal {V}(2\mathbb {Z})\) for some offset \(a\in \mathbb {R}\). At the receiver’s end, the equivalent channel output is
\[y=(r+z)~\text {mod}~2\mathbb {Z}=(r+z^{\prime })~\text {mod}~2\mathbb {Z},\]
where z is an AWGN noise and \(z^{\prime }=z~\text {mod}~2\mathbb {Z}\). It can be viewed as a mod \(2\mathbb {Z}\) channel with input restricted to the set of elements of \(\mathbb {Z}+a\) that fall in \(\mathcal {V}(2\mathbb {Z})\).
In the special case of \(T_{c} = 1\), channel W is referred to as an i.i.d. fading channel. The design and performance of error-correcting codes for i.i.d. fading channels with/without CSI is well studied [6, 36]. In [18], Liu et al. proposed a polar coding scheme for i.i.d. fading channels to achieve the ergodic capacity. Unlike the previous work of [6], in which CSI is given to both ends of the communication, in Liu et al.’s scheme CSI is only known to the receiver, which is more feasible in practice.
2.4 Polar codes
Polar codes, introduced by Arıkan in [5], are linear block codes of length \(n = 2^{l}\) for a positive integer l that achieve the capacity of any binary-input discrete memoryless symmetric (BDMS) channel asymptotically (see Footnote 5). We first review some basics of polar codes for a BDMS channel. A binary-input channel W is symmetric if there exists a permutation π of the output alphabet \(\mathcal {Y}\) such that W(y|1) = W(π(y)|0) and \(\pi ^{-1} = \pi\) for \(y\in \mathcal {Y}\). Given a BDMS channel W, there are two commonly used metrics in information theory to measure the quality of W: the mutual information (see Footnote 6) and the reliability.
Definition 1 (Mutual information of BDMS channels)
The mutual information I(W) ∈ [0,1] of a BDMS channel \(W:\mathcal {X}\rightarrow \mathcal {Y}\) is the maximum rate at which information can be successfully transmitted from the transmitter to the receiver. We define I(W) as
\[I(W)=\sum _{y\in \mathcal {Y}}\sum _{x\in \mathcal {X}}\frac {1}{2}W(y|x)\log _{2}\frac {W(y|x)}{\frac {1}{2}W(y|0)+\frac {1}{2}W(y|1)}.\]
Here, we use the definition of symmetric mutual information assuming uniform channel input, which is also the capacity of the BDMS channel. We use the notations I(W) and I(Y ;X) interchangeably to denote the mutual information of W.
Definition 2 (Bhattacharyya parameter of BDMS channels)
The Bhattacharyya parameter Z(W) ∈ [0,1] is a measure of channel reliability for a BDMS channel W defined as
\[Z(W)=\sum _{y\in \mathcal {Y}}\sqrt {W(y|0)W(y|1)},\]
where a small Z(W) indicates a more reliable channel while a large Z(W) implies a channel with more interference.
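As a small illustration of the two metrics, the sketch below evaluates the standard symmetric mutual information and Bhattacharyya parameter from [5] for a binary-input channel given as a table of transition probabilities. The table layout (`W[x][y]` holding W(y|x)) and the function names are assumptions made for this example.

```python
import numpy as np

def mutual_information(W):
    """Symmetric mutual information in bits; W is a 2 x |Y| array with W[x, y] = W(y|x)."""
    W = np.asarray(W, dtype=float)
    py = 0.5 * (W[0] + W[1])                       # output distribution under uniform input
    total = 0.0
    for x in range(2):
        mask = W[x] > 0                            # terms with W(y|x) = 0 contribute nothing
        total += 0.5 * np.sum(W[x, mask] * np.log2(W[x, mask] / py[mask]))
    return float(total)

def bhattacharyya(W):
    """Bhattacharyya parameter: sum over y of sqrt(W(y|0) W(y|1))."""
    W = np.asarray(W, dtype=float)
    return float(np.sum(np.sqrt(W[0] * W[1])))

# Binary erasure channel with erasure probability 0.3; outputs are (0, erasure, 1).
bec = [[0.7, 0.3, 0.0],
       [0.0, 0.3, 0.7]]
# Binary symmetric channel with crossover probability 0.11; outputs are (0, 1).
bsc = [[0.89, 0.11],
       [0.11, 0.89]]

bec_metrics = (mutual_information(bec), bhattacharyya(bec))
bsc_metrics = (mutual_information(bsc), bhattacharyya(bsc))
```

For the erasure channel the two metrics are exactly complementary (I = 1 − ε and Z = ε), while for the symmetric channel Z equals \(2\sqrt {p(1-p)}\).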
The capacity-achieving nature of polar codes arises from the so-called channel polarization phenomenon as a result of recursive applications of Arıkan’s transform to two identical W channels and their synthesized derivatives. The overall recursive transform can be done in a channel combining phase and a channel splitting phase. In the channel combining phase, a linear transformation defined as \(X^{1:n} = U^{1:n}G_{n}\) is performed on a vector \(U^{1:n}\in \mathcal {X}^{1:n}\) over GF(2), where \(G_{n}= B_{n}\left [ \begin {array}{cc} 1 & 0 \\ 1 & 1 \end {array}\right ]^{\otimes l}\). \(B_{n}\) is a permutation matrix: if \(U^{\prime 1:n}=U^{1:n}B_{n}\) and \(l=\log _{2}n\), the \(i^{\prime }=((b_{l},\cdots ,b_{2},b_{1})_{2}+1)\)-th coordinate of \(U^{\prime 1:n}\) is the \(i = ((b_{1},b_{2},\cdots ,b_{l})_{2} + 1)\)-th coordinate of \(U^{1:n}\), where \((\cdots )_{2}\) denotes the integer with the given binary expansion. By taking \(X^{1:n}\) as the raw input of W, one derives a combined channel \(W_{n}:\mathcal {X}^{1:n}\rightarrow \mathcal {Y}^{1:n}\) with a transition probability of
\[W_{n}(y^{1:n}|u^{1:n})=\prod _{i=1}^{n}W\left (y^{(i)}\,\big |\,(u^{1:n}G_{n})_{i}\right ),\]
where \((\cdot )_{i}\) denotes the i-th coordinate. Since \(G_{n}\) induces a one-to-one mapping between \(U^{1:n}\) and \(X^{1:n}\), the mutual information of \(W_{n}\) is
\[I(Y^{1:n};U^{1:n})=I(Y^{1:n};X^{1:n})=nI(W).\]
In the channel splitting phase, \(W_{n}\) is further split back into n synthesized channels \(W_{n}^{(i)}:\mathcal {X}\rightarrow \mathcal {Y}^{n}\times \mathcal {X}^{i-1}\) whose transition probability is defined by
\[W_{n}^{(i)}(y^{1:n},u^{1:i-1}|u^{(i)})=\sum _{u^{i+1:n}\in \mathcal {X}^{n-i}}\frac {1}{2^{n-1}}W_{n}(y^{1:n}|u^{1:n}).\]
It is proved in [5] that Arıkan’s transform preserves the mutual information in the sense that
\[\sum _{i=1}^{n}I(W_{n}^{(i)})=nI(W).\]
More importantly, the quality of the synthesized channels polarizes asymptotically as the recursion proceeds.
Theorem 1 (Channel polarization of mutual information [5])
For any BDMS channel W, the synthesized channels \(W_{n}^{(i)}\) polarize in the sense that, for any fixed δ ∈ (0,1), as n goes to infinity through powers of two, the fraction of indices i ∈{1,⋯ ,n} for which \(I(W_{n}^{(i)})\in (1-\delta ,1]\) goes to I(W) and the fraction for which \(I(W_{n}^{(i)})\in [0,\delta )\) goes to 1 − I(W).
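Polarization is easy to observe on the binary erasure channel (BEC), for which the Bhattacharyya parameters of the synthesized channels obey the exact recursion \(Z^{-}=2Z-Z^{2}\), \(Z^{+}=Z^{2}\) from [5] and satisfy I = 1 − Z. The sketch below is an illustration for the BEC only, not the construction used for the RLWE channel; the function names and the choice δ = 0.01 are assumptions.

```python
def bec_bhattacharyya(eps, levels):
    """Bhattacharyya parameters of the 2**levels synthesized channels of a BEC(eps)."""
    z = [eps]
    for _ in range(levels):
        nxt = []
        for v in z:
            nxt.append(2 * v - v * v)   # degraded ("minus") channel, index 2i - 1
            nxt.append(v * v)           # upgraded ("plus") channel, index 2i
        z = nxt
    return z

def polarized_fractions(z, delta):
    """Fractions of nearly perfect (I > 1 - delta) and nearly useless (I < delta) channels."""
    n = len(z)
    good = sum(1 for v in z if v < delta) / n        # for a BEC, I = 1 - Z
    bad = sum(1 for v in z if v > 1 - delta) / n
    return good, bad

eps = 0.5
fractions_by_level = {l: polarized_fractions(bec_bhattacharyya(eps, l), 0.01)
                      for l in (4, 8, 12)}
mean_capacity = sum(1 - v for v in bec_bhattacharyya(eps, 12)) / 2 ** 12
```

As the number of levels grows, the two fractions approach I(W) = 1 − ε and 1 − I(W) = ε respectively, while `mean_capacity` stays equal to 1 − ε, reflecting the preservation of mutual information.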
The channel polarization theorem can also be stated in the metric of the Bhattacharyya parameter by replacing \(I(W_{n}^{(i)})\) by \(Z(W_{n}^{(i)})\). For any desired transmission rate R < I(W), we can partition {1,⋯ ,n} into a subset \(\mathcal {A}\) and its complement \(\mathcal {A}^{C}\) such that (i) \(|\mathcal {A}|=\lfloor nR \rfloor\) and (ii) for any \(i\in \mathcal {A}\) and \(j\in \mathcal {A}^{C}\), \(Z(W_{n}^{(i)})\leq Z(W_{n}^{(j)})\). Given the “best” ⌊nR⌋ channels indexed by \(\mathcal {A}\), one can construct polar codes following the encoding rule:
\[X^{1:n}=U_{\mathcal {A}}G_{n}(\mathcal {A})\oplus U_{\mathcal {A}^{C}}G_{n}(\mathcal {A}^{C}),\]
where \(G_{n}(\mathcal {A})\) denotes the submatrix of \(G_{n}\) formed by the rows indexed by \(\mathcal {A}\), ⊕ is the XOR operation, \(U_{\mathcal {A}}\) is called the information vector and \(U_{\mathcal {A}^{C}}\) is called the frozen vector, which is known by both encoder and decoder. A typical realization of the frozen vector is \(U_{\mathcal {A}^{C}}=\textbf {0}\) for BDMS channels. In this manner, the useful information is transmitted via the most reliable synthesized channels. A question may arise on how to efficiently calculate \(Z(W_{n}^{(i)})\). A brief review can be found in Sections 2.5 and 5.3, but detailed descriptions of these methods are beyond the scope of this work.
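The encoding step can be sketched as a bit-reversal permutation followed by the usual in-place butterfly, which computes \(U^{1:n}G_{n}\) in \(O(n\log n)\) XOR operations. This is a minimal sketch under the conventions of this section, with 0-based indices and an all-zero frozen vector; the function names are hypothetical and the information set in the usage lines is arbitrary rather than a designed one.

```python
import numpy as np

def bit_reversal_permutation(n):
    """perm[j] is the index whose l-bit binary expansion is that of j reversed."""
    l = n.bit_length() - 1
    return np.array([int(format(j, f"0{l}b")[::-1], 2) for j in range(n)])

def polar_transform(u):
    """Compute x = u B_n F^{(x) l} over GF(2) with F = [[1, 0], [1, 1]]."""
    x = np.array(u, dtype=np.uint8)[bit_reversal_permutation(len(u))]   # x = u B_n
    n = len(x)
    step = 1
    while step < n:
        for start in range(0, n, 2 * step):
            # One Kronecker factor: (a, b) -> (a xor b, b) on blocks of size `step`.
            x[start:start + step] ^= x[start + step:start + 2 * step]
        step *= 2
    return x

def polar_encode(info_bits, info_set, n):
    """Place the information bits on the set A and zeros on the frozen positions."""
    u = np.zeros(n, dtype=np.uint8)
    u[np.asarray(info_set)] = info_bits
    return polar_transform(u)

# Usage with an arbitrary (not optimized) information set of size 4 for n = 8.
codeword = polar_encode([1, 0, 1, 1], [3, 5, 6, 7], 8)
```

The loop has a fixed structure that does not depend on the data being encoded, which is the property behind the isochrony discussed in the paper.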
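The successive cancellation (SC) decoder is the original decoding algorithm for polar codes. Let \(u^{(i)}\) be the i-th coordinate of \(U^{1:n}\). Given a channel output \(y^{1:n}\) of a polar code, the SC decoder yields the recovered \(\bar {u}^{(i)}\) of \(u^{(i)}\) in sequential order of index i according to the decoding rule specified as
\[\bar {u}^{(i)}=\left \{\begin {array}{ll} u^{(i)}, & i\in \mathcal {A}^{C},\\ 0, & i\in \mathcal {A}\text { and }\frac {W_{n}^{(i)}(y^{1:n},\bar {u}^{1:i-1}|0)}{W_{n}^{(i)}(y^{1:n},\bar {u}^{1:i-1}|1)}\geq 1,\\ 1, & \text {otherwise}, \end {array}\right .\]
where \(\bar {u}^{1:i-1}\) is the estimate of \(u^{1:i-1}\) recovered before \(\bar {u}^{(i)}\). Details of the SC decoder can be found in Appendix A.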
Denote by \(P_{e}\) the average probability of frame error. As a result of polar encoding and SC decoding, it is proved in [5] that \(P_{e}\) is upper bounded as follows.
Theorem 2 (Decoding Performance [5])
For any BDMS channel W and any choice of parameters \((n,R,\mathcal {A})\),
\[P_{e}\leq \sum _{i\in \mathcal {A}}Z(W_{n}^{(i)}).\]
2.5 Channel degradation and upgradation
The construction of polar codes can be addressed if all the Bhattacharyya parameters \(Z(W_{n}^{(i)})\) of the synthesized channels can be efficiently calculated. To this end, a quantization method was proposed in [34] to construct a degraded or upgraded approximation of a binary-input memoryless symmetric (BMS) channel. In this way, one can approximate \(Z(W_{n}^{(i)})\) efficiently with tractable and minor distortion. We define the degradation and upgradation relations as follows and discuss them further in the sequel.
Definition 3 (Degraded and Upgraded Channel, [34])
A channel \(\mathcal {Q}:\mathcal {X}\rightarrow \mathcal {Z}\) is (stochastically) degraded with respect to a channel \(\mathcal {W}:\mathcal {X}\rightarrow \mathcal {Y}\) if there exists a channel \(\mathcal {P}:\mathcal {Y}\rightarrow \mathcal {Z}\) such that
\[\mathcal {Q}(z|x)=\sum _{y\in \mathcal {Y}}\mathcal {W}(y|x)\mathcal {P}(z|y)\]
for all \(z\in \mathcal {Z}\) and \(x\in \mathcal {X}\). We denote by \(\mathcal {Q}\preceq \mathcal {W}\) the relation that \(\mathcal {Q}\) is degraded with respect to \(\mathcal {W}\). Conversely, a channel \(\mathcal {Q}^{\prime }:\mathcal {X}\rightarrow \mathcal {Z}^{\prime }\) is upgraded with respect to \(\mathcal {W}\), denoted by \(\mathcal {Q}^{\prime }\succeq \mathcal {W}\), if there exists a channel \(\mathcal {P}:\mathcal {Z}^{\prime }\rightarrow \mathcal {Y}\) such that for all \(y\in \mathcal {Y}\) and \(x\in \mathcal {X}\)
\[\mathcal {W}(y|x)=\sum _{z^{\prime }\in \mathcal {Z}^{\prime }}\mathcal {Q}^{\prime }(z^{\prime }|x)\mathcal {P}(y|z^{\prime }).\]
Moreover, Lemma 1 indicates that the synthesized channels of \(\mathcal {Q},\mathcal {W},\mathcal {Q}^{\prime }\) under Arıkan’s transform also fulfill the channel degradation and upgradation relation. This implies a polar code constructed for \(\mathcal {Q}\) also fits in with \(\mathcal {W}\).
Lemma 1 (restatement of Lemma 4.7 in [17])
Given BMS channels \(\mathcal {W},\mathcal {Q}\), and \(\mathcal {Q}^{\prime }\), we denote by \(\mathcal {W}_{n}^{(i)}\), \(\mathcal {Q}_{n}^{(i)}\) and \({\mathcal {Q}^{\prime }}_{n}^{(i)}\) for i ∈ [1,n] the synthesized channels derived by Arıkan’s transform. If \(\mathcal {Q}^{\prime }\succeq \mathcal {W}\succeq \mathcal {Q}\), then \({\mathcal {Q}^{\prime }}_{n}^{(i)}\succeq \mathcal {W}_{n}^{(i)}\succeq \mathcal {Q}_{n}^{(i)}\) for all i.
If the channel degradation or upgradation relation is set up, their channel capacity, reliability and error probability will be related as follows.
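Lemma 2 ([34])
Let \(\mathcal {W}\) be a BMS channel and suppose there exists another channel \(\mathcal {Q}\) such that \(\mathcal {Q}\preceq \mathcal {W}\). Then
\[I(\mathcal {Q})\leq I(\mathcal {W}),\quad Z(\mathcal {Q})\geq Z(\mathcal {W}),\quad P_{e}(\mathcal {Q})\geq P_{e}(\mathcal {W}),\]
where \(P_{e}(\cdot )\) here denotes the error probability of the channel itself. The inequalities are reversed if we replace “degraded” by “upgraded”.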
3 RLWE channel model
3.1 RLWE channel model in canonical basis
Definition 4
The real multivariate normal distribution has density function
\[f_{X}(x)=\frac {1}{\sqrt {(2\pi )^{n}|{\Sigma }|}}\exp \left (-\frac {1}{2}(x-\mu )^{T}{\Sigma }^{-1}(x-\mu )\right ),\]
where |⋅| denotes the determinant, \(\mu =\mathbb {E}[X]\in \mathbb {R}^{n}\), \({\Sigma }=\mathbb {E}\left [ (X-\mu )(X-\mu )^{T} \right ]\); we write \(X\sim \mathcal {N}(\mu ,{\Sigma })\). A generalization is the complex multivariate normal distribution \(Z\sim \mathcal {N}\mathcal {C}(\mu ,{{\varGamma }})\) with density function
\[f_{Z}(z)=\frac {1}{\pi ^{n}|{{\varGamma }}|}\exp \left (-(z-\mu )^{*}{{\varGamma }}^{-1}(z-\mu )\right ),\]
where \(z^{*}\) denotes the Hermitian transpose of the vector z and \({{\varGamma }} =\mathbb {E}[(Z-\mu )(Z-\mu )^{*}]\).
We already have an RLWE-based PKE instance in Section 2.2. Now we consider the problem of decoding the message m from the polynomial
\[y=\left \lfloor \frac {q}{2}\right \rfloor \cdot m+e\cdot t-s\cdot e_{1}+e_{2}, \tag{3}\]
where e ⋅ t and \(s \cdot e_{1}\) are products of polynomials in \(\mathbb {Z}_{q}[x]/(1+x^{n})\). In the actual scheme, the coefficients of \(e,t,s,e_{1},e_{2}\) are drawn from a discrete Gaussian distribution. We use the continuous normal distribution \(\mathcal {N}(0,r^{2})\) instead to simplify the distribution analysis of the noise term.
Under canonical embedding, formula (3) can be rewritten as (see Footnote 7)
where B is the orthogonal basis defined in Section 2.1 and the multiplications (i.e., σ(e)σ(t) and σ(s)σ(e1)) and additions are both coordinate-wise as explained in Section 2.1. Due to the conjugate pairs, formula (4) can be refined as
where Bj represents the jth row of B, vector y and m are vector forms of polynomials y and m, \(\tilde {B}\) and \(\sigma ^{\prime }\) are introduced in Section 2.1, \(\tilde {B}\textbf {y}=\sigma ^{\prime }({y})\), and \(\tilde {B}\lfloor \frac {q}{2}\rfloor \textbf {m}=\sigma ^{\prime }(\lfloor \frac {q}{2}\rfloor {m})\). To see how the noise term N is distributed, we rewrite formula (5) for all the odd indices i = 1,3,5,⋯ ,n/2 − 1 as
where \(\tilde {B}_{i}(\cdot )=\sigma ^{\prime }_{i}(\cdot )\) and \(\tilde {B}_{i+1}(\cdot )=\sigma ^{\prime }_{i+1}(\cdot )\). Under the embedding \(\sigma :K\rightarrow \mathbb {C}^{n}\), the spherical normal distributed vectors e and t are mapped to complex spherical normal vectors, \(\sigma (e),\sigma (t)\sim \mathcal {N}\mathcal {C}(0,nr^{2}\mathbb {I})\). As for the embedding \(\sigma ^{\prime }:K\rightarrow \mathbb {R}^{n}\), the spherical normal distribution \(\mathcal {N}(0,r^{2}\mathbb {I})\) is transformed to a new spherical normal distribution \(\mathcal {N}(0,\frac {nr^{2}}{2}\mathbb {I})\). Since e and t are coordinate-wise i.i.d., their embeddings σ(e), σ(t), \(\sigma ^{\prime }(e)\), \(\sigma ^{\prime }(t)\) are coordinate-wise independent as well. We observe from formula (6) that every odd-indexed coordinate and the next even-indexed coordinate are correlated because they share the same \(\sigma ^{\prime }_{i}(e),\sigma ^{\prime }_{i+1}(e)\), \(\sigma ^{\prime }_{i}(t),\sigma ^{\prime }_{i+1}(t)\), \(\sigma ^{\prime }_{i}(s),\sigma ^{\prime }_{i+1}(s)\) and \(\sigma ^{\prime }_{i}(e_{1}),\sigma ^{\prime }_{i+1}(e_{1})\), although \(\sigma ^{\prime }_{i}(e_{2}),\sigma ^{\prime }_{i+1}(e_{2})\) are independent.
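The variance scaling quoted above follows from the Gram matrices of the embedding matrices: a linear map M sends \(\mathcal {N}(0,r^{2}\mathbb {I})\) to a normal vector with covariance \(r^{2}MM^{T}\). The sketch below checks this numerically, assuming that \(\tilde {B}\) stacks the real and imaginary parts of one embedding from each conjugate pair (an assumed layout; the ordering and normalization in the paper may differ).

```python
import numpy as np

def complex_embedding_matrix(n):
    """Rows are powers of the n roots of X^n + 1 (assumed ordering)."""
    roots = np.exp(1j * np.pi * (2 * np.arange(n) + 1) / n)
    return roots[:, None] ** np.arange(n)[None, :]

def real_embedding_matrix(n):
    """Assumed layout: rows 2k and 2k + 1 are the real and imaginary parts of embedding k."""
    theta = np.pi * (2 * np.arange(n // 2) + 1) / n      # one root per conjugate pair
    angles = theta[:, None] * np.arange(n)[None, :]
    Bt = np.empty((n, n))
    Bt[0::2] = np.cos(angles)
    Bt[1::2] = np.sin(angles)
    return Bt

n, r = 16, 2.0
B = complex_embedding_matrix(n)
Bt = real_embedding_matrix(n)

complex_covariance = r**2 * (B @ B.conj().T)     # expected: n r^2 I
real_covariance = r**2 * (Bt @ Bt.T)             # expected: (n r^2 / 2) I
abs_det_Bt = abs(np.linalg.det(Bt))              # expected: (n / 2) ** (n / 2)
```

Both covariance matrices are diagonal, which is why independent coefficients of e and t stay coordinate-wise independent after embedding.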
To further refine the RLWE channel model, we can rewrite formula (5) and (6) as
where for \(i = 1,2,\cdots ,n\), \(N_{i} = H_{i} \cdot Z_{i}\), \(Z_{i}\leftarrow \mathcal {N}(0,\frac {nr^{2}}{2})\), and
Because of the correlation between every two coordinates, \(H_{i}\) and \(H_{j}\) are independent for two different indices i,j as long as ⌈i/2⌉≠⌈j/2⌉; otherwise \(H_{i} = H_{j}\). Similarly, \(Z_{i}\) and \(Z_{j}\) are correlated if ⌈i/2⌉ = ⌈j/2⌉; otherwise they are independent.
Unlike in NewHope and other RLWE-based encryption schemes, where the plaintext is encoded and decoded in the polynomial basis, we will carry out encoding and decoding in the canonical basis. Observe that the channel given by formula (7) is a fading channel with coherence interval \(T_{c} = 2\) coordinates, except that the symbols to be transmitted after modulation, i.e., \(\tilde {B}\lfloor \frac {q}{2}\rfloor \mathbf {m}\), are not coordinate-wise independent. In the next subsection, we will adjust the modulation scheme such that a tailored constellation diagram fits the fading channel.
3.2 A tailored constellation diagram
The RLWE channel in formula (3) can be interpreted as n parallel \(\mathbb {Z}/2\mathbb {Z}\) channels where a message \(m \in \{0,1\}^{n}\) is mapped to a symbol on the constellation diagram \(\{0,\lfloor \frac {q}{2} \rfloor \}^{n}\). The mod \(R_{q}\) operation defines a valid constellation space as an n-dimensional cube Λ with vertices \(\{0,q\}^{n}\). To ease the description of how we design a new constellation diagram in the canonical basis, we make a modification to the modulation scheme in formula (3): the message \(m \in \{-1,1\}^{n}\) is mapped onto the constellation diagram \(\{\pm \lfloor \frac {q}{4} \rfloor \}^{n}\) and the valid constellation space is a cube Λ with vertices \(\{\pm \lfloor \frac {q}{2}\rfloor \}^{n}\). This modification preserves the capacity of the \(\mathbb {Z}/2\mathbb {Z}\) channel because the two schemes are statistically equivalent if we ignore the geometrical approximation caused by the round-off operation ⌊⋅⌋.
According to formula (7), after applying the canonical embedding, the constellation diagram turns into \(\tilde {B}\{\pm \lfloor \frac {q}{4}\rfloor \}^{n}\). Similarly, we can obtain the new constellation space \({{\varLambda }}^{\prime }=\tilde {B}{{\varLambda }}\) by rotating Λ and scaling it up by a factor of \(\sqrt {n/2}\).
As discussed in the previous subsection, the coherence interval Tc of the residue noise equals 2 coordinates, while the constellation symbol \(\tilde {B}\lfloor \frac {q}{4}\rfloor \textbf {m}\) has memory across all n coordinates. In a communication system, interleaving can be used to alleviate the correlation of the source by permuting symbols of different code blocks. Unfortunately, interleaving is impractical in the RLWE channel because there is only one code block of length n. At the cost of distance between the constellation symbols, we tailor the constellation space \({{\varLambda }}^{\prime }\) to fit the fading channel.
Essentially, we are looking for a new modulation scheme meeting two conditions: (a) we desire the symbols after modulation (or the modulated message) to be coordinate-wise i.i.d.; in other words, we expect a valid constellation diagram inside the space \({{\varLambda }}^{\prime }\) such that for a coordinate-wise i.i.d. message m, the modulated message is coordinate-wise i.i.d. as well; (b) the new modulation scheme gives us a \(\mathbb {Z}/2\mathbb {Z}\) channel. Conceptually, the maximal n-dimensional cube \({{\varLambda }}^{\prime \prime }\) enclosed in \({{\varLambda }}^{\prime }\) and parallel to Λ is our target constellation space. In this case, the symbols to be transmitted can easily be made binary and i.i.d. if we divide the cube \({{\varLambda }}^{\prime \prime }\) equally into \(2^{n}\) small cubes and select the centers of the small cubes as the constellation diagram. However, finding such a \({{\varLambda }}^{\prime \prime }\) in practice is intractable when the dimension n is large, since it is unclear in what direction and by what degree the cube \({{\varLambda }}^{\prime }\) is rotated with respect to Λ. Instead, we compromise on the constellation size and use the cube \({{\varLambda }}^{\prime \prime }\) which is parallel to Λ and is enclosed in the maximal ball inscribed in \({{\varLambda }}^{\prime }\). In this manner, such a constellation space \({{\varLambda }}^{\prime \prime }\) always exists and its size is straightforward to calculate. Figure 1 illustrates this idea in the 2-dimensional case. If the side length of Λ is q, the side of \({{\varLambda }}^{\prime }\) has length \(q\sqrt {n/2}\), and the side of \({{\varLambda }}^{\prime \prime }\) is \(q/\sqrt {2}\). Observe that \({{\varLambda }}^{\prime }=\sqrt {2}\tilde {B}{{\varLambda }}^{\prime \prime }\).
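The side lengths quoted above can be checked with a few lines of arithmetic. The following is a minimal sketch, assuming that the tailored cube is the largest axis-parallel cube inside the ball inscribed in \({{\varLambda }}^{\prime }\), so that its diagonal equals the side of \({{\varLambda }}^{\prime }\). The function name is illustrative, and the parameters q = 12289, n = 1024 are the NewHope-like values used later in the text.

```python
import math

def constellation_sides(q, n):
    """Return the side lengths of the cubes Lambda, Lambda' and Lambda''."""
    side = float(q)                    # Lambda: cube with side q
    side_p = q * math.sqrt(n / 2)      # Lambda': Lambda rotated and scaled by sqrt(n/2)
    # The ball inscribed in Lambda' has diameter side_p. An n-cube of side s has
    # diagonal s * sqrt(n), so the largest cube inside that ball has side:
    side_pp = side_p / math.sqrt(n)    # equals q / sqrt(2)
    return side, side_p, side_pp

side, side_p, side_pp = constellation_sides(q=12289, n=1024)
print(side, side_p, side_pp)
```

Under this assumption the ratio between the sides of \({{\varLambda }}^{\prime }\) and \({{\varLambda }}^{\prime \prime }\) is \(\sqrt {n}\), which is the price paid in distance between constellation symbols.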
3.3 Tailored RLWE channel model in canonical basis
Given the tailored constellation space \({{\varLambda }}^{\prime \prime }\) and its corresponding constellation diagram, we now have a tailored RLWE channel model in the canonical basis:
where \(\mathbf {m}\in \{0,1\}^{n}\), Ni = Hi ∗ Zi and \(Z_{i}\leftarrow \mathcal {N}(0,nr^{2}/2)\) for 1 ≤ i ≤ n. As discussed in formula (7), Hi and Hj are independent for two different indices i,j as long as ⌈i/2⌉≠⌈j/2⌉; otherwise Hi = Hj. Similarly, Zi and Zj are independent if ⌈i/2⌉≠⌈j/2⌉; otherwise they are correlated.
We observe that the tailored channel model in formula (8) can be seen as a fading channel where Hi is the channel gain and Zi is the additive noise. A family of fading channels (e.g., i.i.d. fading, block fading, compound fading) is well studied in [6, 18, 36], where explicit constructions of error-correcting codes are given. In this work, since Hi and Zi have the same coherence interval of two coordinates, our strategy is to divide the n parallel channels into two groups of i.i.d. channels and to construct two parallel polar codes of equal block length n/2 for the two \(\mathbb {Z}/2\mathbb {Z}\) fading channels. Note that in this work we use parameters similar to NewHope, e.g., q = 12289, n = 1024, r ∈{1,2,6,9}, where the values of r correspond to the “Short” and “Tall” parameters in [8].
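The grouping strategy can be sketched as follows. Since coordinates i and j share a channel gain exactly when ⌈i/2⌉ = ⌈j/2⌉, taking one coordinate from each pair yields a group without shared gains. Splitting by parity is an assumption made for this illustration; the text only states that two groups of i.i.d. channels are formed.

```python
def pair_id(i):
    """Index of the coherence block of coordinate i (1-based), i.e. ceil(i/2)."""
    return (i + 1) // 2

def split_into_two_groups(n):
    """Split coordinates 1..n so that no group holds both members of a pair."""
    odd = [i for i in range(1, n + 1) if i % 2 == 1]
    even = [i for i in range(1, n + 1) if i % 2 == 0]
    return odd, even

odd, even = split_into_two_groups(1024)
# Within each group every coordinate belongs to a different coherence block.
print(len(odd), len(even), len({pair_id(i) for i in odd}))
```

Each group then has block length n/2 and can be served by its own polar code.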
Denote by L and \(L^{\prime }\) two one-dimensional lattices \(\lfloor \frac {q}{2}\rfloor \frac {1}{\sqrt {2}} \mathbb {Z}\) and \(q\frac {1}{\sqrt {2}}\mathbb {Z}\) respectively. The above channel model can also be written as a fading \(L/L^{\prime }\) channel, i.e.,
where mi ∈{0,1} and the channel input X is restricted to the discrete alphabet \(\mathcal {X}=L\cap \mathcal {R}(L^{\prime })=\{0, \lfloor \frac {q}{2}\rfloor \frac {1}{\sqrt {2}}\}\). Since Alice knows exactly what e and s are, she knows both the distribution and realization of the channel gain Hi. At the transmitter’s end, Bob only knows the distribution of Hi. Both of them know the distribution of Zi. How to achieve the ergodic capacity of such an i.i.d. fading channel using polar codes is well studied in [18] and we are about to adapt their strategy to our tailored RLWE channel model. A diagram of a fading \(L/L^{\prime }\) channel with CSI available to the decoder is shown in Fig. 2.
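One use of the fading \(L/L^{\prime }\) channel can be simulated directly from the description above. This is a minimal sketch: the function name is hypothetical, the gain h is passed in as a known value (its distribution is not modelled here), and the reduction modulo \(L^{\prime }\) is taken onto the interval \([0,q/\sqrt {2})\).

```python
import math
import random

def fading_mod_channel(bit, h, q=12289, n=1024, r=2.0, rng=random):
    """Return y = (x + h*z) mod q/sqrt(2) for one message bit."""
    period = q / math.sqrt(2)                  # generator of the lattice L'
    x = bit * (q // 2) / math.sqrt(2)          # input alphabet {0, floor(q/2)/sqrt(2)}
    z = rng.gauss(0.0, math.sqrt(n / 2) * r)   # Z ~ N(0, n r^2 / 2)
    return (x + h * z) % period

rng = random.Random(1)                         # fixed seed for reproducibility
outputs = [fading_mod_channel(b, h=10.0, rng=rng) for b in (0, 1) * 5]
print(outputs)
```

The decoder receives the pair (y, h), which matches the setting where the CSI is available at the receiver only.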
Denote by \(W:X\rightarrow (\tilde {Y},H)\) the fading \(L/L^{\prime }\) channel with CSI available to the decoder. The transition probability of W is
where \(\sigma =\sqrt {\frac {n}{2}}r\). The distribution of H is
The pdf of H in terms of various choices of parameter r is depicted in Fig. 3.
As discussed in [12] and [18], the capacity of the fading \(L/L^{\prime }\) channel is given by
where \(E_{H}\left [\cdot \right ]\) denotes the expectation over the fading coefficient, \(\mathfrak {h}(L,\sigma ^{2})\) and \(\mathfrak {h}(L^{\prime },\sigma ^{2})\) are differential entropies of mod-L and mod-\(L^{\prime }\) channels respectively, and \(|L/L^{\prime }|\) is the order of the partition \(L/L^{\prime }\). Specifically, \(\mathfrak {h}(L,\sigma ^{2})\) is given by
where \(\mathcal {R}\) is a fundamental region of lattice L and \(g_{(h\sigma )^{2}}(\cdot )\) is the density function of \(\mathcal {N}(0,h^{2}\sigma ^{2}\mathbb {I})\). We refer to \(f_{L,(h\sigma )^{2}}\) as an L-periodic Gaussian density function, which is defined by summing copies of a Gaussian density function centered at every lattice point of L. The value of an L-periodic Gaussian variable \(z^{\prime }\) is restricted to a fundamental region of L, so the integral of its density function over \(\mathcal {R}(L)\) is 1. See Fig. 4 for the ergodic capacity of the fading \(L/L^{\prime }\) channel \(W:X\rightarrow (\tilde {Y},H)\) with respect to different choices of r. In a communication system, the signal-to-noise ratio (SNR) is a measure of the reliability of a channel. It is defined as the ratio of the signal strength to the noise strength (see Footnote 8).
Recall the definition of a symmetric channel from Section 2.4. It is observed that \(P_{\tilde {Y},H|X}(\tilde {y},h|x = 0 )=P_{\tilde {Y},H|X}(\pi (\tilde {y},h)|x = \lfloor \frac {q}{2}\rfloor \frac {1}{\sqrt {2}} )\) holds for a permutation \(\pi (\tilde {y},h)=\left ((\lfloor \frac {q}{2}\rfloor \frac {1}{\sqrt {2}}-\tilde {y})~\text {mod}~q\frac {1}{\sqrt {2}} \mathbb {Z},h \right )\) over the outputs \((\tilde {y},h)\). Therefore, the fading \(L/L^{\prime }\) channel W is symmetric and we can achieve its capacity using polar codes.
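Both the normalisation of the periodic Gaussian density and the symmetry under the permutation π can be checked numerically for a fixed gain h. The sketch below is illustrative only: the infinite lattice sum is truncated to a few terms, the helper names are hypothetical, and the values r = 2, h = 10 are taken from the example parameters mentioned for Fig. 5.

```python
import math

def periodic_gaussian(y, center, period, sigma, terms=10):
    """Density of N(center, sigma^2) folded onto the lattice period*Z (truncated sum)."""
    total = 0.0
    for k in range(-terms, terms + 1):
        d = y - center - k * period
        total += math.exp(-d * d / (2.0 * sigma * sigma))
    return total / (sigma * math.sqrt(2.0 * math.pi))

q, n, r, h = 12289, 1024, 2.0, 10.0
period = q / math.sqrt(2)                # fundamental region [0, q/sqrt(2))
x1 = (q // 2) / math.sqrt(2)             # the non-zero channel input
sigma_eff = h * math.sqrt(n / 2) * r     # standard deviation h*sigma

def pi(y):
    """Permutation on the output: (x1 - y) mod q/sqrt(2)."""
    return (x1 - y) % period

grid = 2000
step = period / grid
points = [(k + 0.5) * step for k in range(grid)]

# Integral of the density over one fundamental region (midpoint rule).
mass = sum(periodic_gaussian(y, 0.0, period, sigma_eff) for y in points) * step

# Largest deviation from W(y | 0, h) = W(pi(y) | x1, h) on the grid.
gap = max(abs(periodic_gaussian(y, 0.0, period, sigma_eff)
              - periodic_gaussian(pi(y), x1, period, sigma_eff)) for y in points)
print(mass, gap)
```

The symmetry holds because the density centered at 0 is even and periodic, so it does not depend on the rounding in ⌊q/2⌋.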
4 Description of the encryption scheme
Table 1 gives a high-level description of the RLWE-based PKE scheme using polar codes which are customized for our tailored RLWE channel model in canonical basis. The functions PolarEnc(⋅) and PolarDec(⋅) are encoding and decoding algorithms of polar codes which will be explicitly introduced in the sequel.
Remark 1
Unlike most RLWE encryption schemes where the error distribution χ is defined over \(\mathbb {Z}\) (e.g., central Binomial in NewHope), we use the definition of χ when the ideal learning with errors problem was initially proposed in [33] where χ is defined on \(\mathbb {R}/[0, q)\). Moreover, according to the formal definition of ring-LWE in [21], the error distribution is also continuous over the field tensor product \(K\otimes _{\mathbb {Q}}\mathbb {R}\).
Remark 2
A plaintext m is uniquely mapped to a symbol \(\lfloor \frac {q}{2} \rfloor \frac {1}{\sqrt {2}}\textit {PolarEnc}(\mathbf {m})\) on the constellation diagram in the canonical basis. It is then switched to the polynomial basis and turned into a vector v. Note that \(\mathbf {v}\in (\mathbb {R}/[0,q))^{n}\) and not in Rq. We consider this reasonable since χ is also real and continuous.
One may notice in Table 1 that Alice finally derives a mod-\(\tilde {B}R_{q}\) channel (or equivalently a mod-\({{\varLambda }}^{\prime }\) channel) as in Fig. 1 rather than the mod-\({{\varLambda }}^{\prime \prime }\) in formula (8) (or equivalently the mod-\(L^{\prime }\) channel in (9)). Questions arise whether the tailored RLWE channel model in formula (8) makes sense and how it will behave if we construct a polar code for the mod-\({{\varLambda }}^{\prime \prime }\) channel when we actually have a mod-\({{\varLambda }}^{\prime }\) channel. Lemma 3 illustrates the channel degradation relation between the two channels.
Lemma 3
(Channel Degradation Relation Between RLWE Channel and Its Tailored Variant) Let \({{\varLambda }}^{\prime }\) be the constellation space and let \({{\varLambda }}^{\prime \prime }\) be its tailored variant as in Fig. 1. Given the tailored RLWE channel model as in formula (8) with CSI Hi known to the decoder as in Fig. 2, the fading \(L^{n}/{{\varLambda }}^{\prime \prime }\) channel is degraded with respect to the fading \(L^{n}/{{\varLambda }}^{\prime }\) channel.
Proof
Denote by \(W^{\prime }\) the fading \(L^{n}/{{\varLambda }}^{\prime }\) channel \(y^{\prime }=x+h*z\mod {{{\varLambda }}^{\prime }}\) where \(y^{\prime }\in \mathcal {R}({{\varLambda }}^{\prime })\), \(x\in L^{n}\cap \mathcal {R}({{\varLambda }}^{\prime })\) is the channel input, h is the channel gain and z is the Gaussian noise. In the same fashion, we define the fading \(L^{n}/{{\varLambda }}^{\prime \prime }\) channel \(W^{\prime \prime }\) as \(y^{\prime \prime }=x+h*z\mod {{{\varLambda }}^{\prime \prime }}\) where \(y^{\prime \prime }\in \mathcal {R}({{\varLambda }}^{\prime \prime })\), \(x\in L^{n}\cap \mathcal {R}({{\varLambda }}^{\prime \prime })\).
As formula (10) indicates, the \(L/L^{\prime }\) fading channel with CSI known to the receiver in formula (9) can be viewed as an independent combination of channel gain h and an \(L/L^{\prime }\) Gaussian channel. Therefore, with no loss of generality, we can view the channel gain h as a constant. We can rewrite channel \(W^{\prime }\) as \(W^{\prime }: y^{\prime }=x+z^{\prime }\mod {{{\varLambda }}^{\prime }}\) and rewrite \(W^{\prime \prime }\) as \(W^{\prime \prime }: y^{\prime \prime }=x+z^{\prime }\mod {{{\varLambda }}^{\prime \prime }}\) where \(z^{\prime }\sim \mathcal {N}(0,h^{2}\sigma ^{2}\mathbb {I})\). The channel transition probability of \(W^{\prime }\) is
where \(g_{(h\sigma )^{2}}\) represents the density function of \(\mathcal {N}(0,h^{2}\sigma ^{2}\mathbb {I})\) and \(n^{\prime }=z^{\prime }\mod {{{\varLambda }}^{\prime }}\). The channel transition probability of \(W^{\prime \prime }\) is
where \(n^{\prime \prime }=z^{\prime }\mod {{{\varLambda }}^{\prime \prime }}\) and the equality (a) is due to the relation \({{\varLambda }}^{\prime }=\sqrt {2}\tilde {B}{{\varLambda }}^{\prime \prime }\), \(\lambda ^{\prime }=\sqrt {2}\tilde {B}\lambda ^{\prime \prime }\), and \(n^{\prime }\in \mathcal {R}({{\varLambda }}^{\prime })\), \(n^{\prime \prime }\in \mathcal {R}({{\varLambda }}^{\prime \prime })\). We observe from equation (13) that channel \(W^{\prime \prime }\) is statistically equivalent to \(W^{\prime \prime }: y^{\prime \prime } = x + z^{\prime \prime }\mod {{{\varLambda }}^{\prime }}\) where \(z^{\prime \prime }\sim \mathcal {N}(0,(h\sigma \sqrt {2}\tilde {B})^{2}\mathbb {I})\). Since the transition probabilities in equation (12) and equation (13) are two \({{\varLambda }}^{\prime }\)-periodic Gaussian distributions featured with variances \((h\sigma )^{2} < (h\sigma \sqrt {2}\tilde {B})^{2}\), we can prove \(W^{\prime \prime }\) is degraded with respect to \(W^{\prime }\) by introducing an intermediate \(L^{n}/{{\varLambda }}^{\prime }\) channel \(W^{\prime \prime \prime }\) with additive Gaussian noise \(z^{\prime \prime \prime }\sim \mathcal {N}(0,(h\sigma \sqrt {2}\tilde {B})^{2}\mathbb {I}-(h\sigma )^{2}\mathbb {I})\) such that \(W^{\prime \prime }\) is a concatenation of \(W^{\prime }\) and \(W^{\prime \prime \prime }\), i.e.,
The above concatenation satisfies the definition of channel degradation (Definition 3). □
Given the channel degradation relation between the fading \(L^{n}/{{\varLambda }}^{\prime }\) channel \(W^{\prime }\) and the fading \(L^{n}/{{\varLambda }}^{\prime \prime }\) channel \(W^{\prime \prime }\), Lemma 1 guarantees that the polar codes constructed for \(W^{\prime \prime }\) are also valid for \(W^{\prime }\). The explicit construction of the polar codes is given in the next section.
5 Polar coding for the tailored RLWE channel
As discussed in Section 2.4, we need a BDMS channel before we can adapt the polar coding method, including calculating the Bhattacharyya parameters of the synthesized channels, defining the information set \(\mathcal {A}\) and frozen set \(\mathcal {A}^{c}\), encoding and SC decoding. We have already proved in Section 3.3 that the fading \(L/L^{\prime }\) channel \(W:X\rightarrow (\tilde {Y},H)\) in formula (9) is symmetric. Since we assume the channel gain H and Gaussian noise Z to be continuous, and so is the channel output, we need to discretize the channel output \(H,\tilde {Y}\) before constructing polar codes. An elegant channel quantization scheme was proposed in [18], where the two outputs H and \(\tilde {Y}\) are discretized independently with tractable loss of channel capacity. Basically, the channel gain H is discretized into a series of discrete values with uniform occurrence probability. As for the output \(\tilde {Y}\), we decompose the \(L/L^{\prime }\) channel into multiple BDMS channels such that the overall channel capacity is preserved up to a negligible loss.
5.1 Quantization of the fading coefficient
As discussed in the previous sections, the fading \(L/L^{\prime }\) channel with CSI available to the decoder is statistically equivalent to an independent combination of the fading coefficient H and an \(L/L^{\prime }\) channel with additive Gaussian noise of variance \((h\sigma )^{2}\). Therefore, we first quantize H and then the \(L/L^{\prime }\) channel. Let {αi} be an ascending sequence in the following form
so that for 1 ≤ i ≤ m we have
We take the centroid with respect to the interval (αi,αi+ 1) as the discretized alphabet \({\mathscr{H}}_{q}=\{h_{i}\}\) for i = 1,⋯ ,m where hi is calculated as follows.
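The quantization of H into equiprobable cells with centroids as representatives can be sketched from samples. This is an illustration under stated assumptions: the cell boundaries αi are estimated as empirical quantiles, the centroid is the sample mean within a cell, and the fading samples come from a placeholder distribution, not from the pdf of H derived in this paper.

```python
import math
import random

def quantize_equiprobable(samples, m):
    """Split sorted samples into m equal-count cells; return (edges, centroids)."""
    xs = sorted(samples)
    size = len(xs) // m
    if size == 0:
        raise ValueError("need at least m samples")
    xs = xs[: size * m]                       # drop the remainder so cells are equal
    cells = [xs[i * size:(i + 1) * size] for i in range(m)]
    edges = [c[0] for c in cells] + [xs[-1]]  # ascending boundaries alpha_1..alpha_{m+1}
    centroids = [sum(c) / len(c) for c in cells]
    return edges, centroids

# Placeholder fading samples (an assumption, NOT the paper's pdf of H).
rng = random.Random(0)
samples = [math.hypot(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(20000)]
edges, centroids = quantize_equiprobable(samples, m=20)
print(centroids[:3])
```

Each centroid then plays the role of one element hi of the discretized alphabet, and every cell is hit with the same probability 1/m.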
5.2 Degrading transform quantization
As in Fig. 2 we view the tailored RLWE channel as an i.i.d. fading channel. For such a channel, polar codes are constructed in [18] to achieve the ergodic capacity C(W) as long as the receiver knows the CSI and the transmitter knows the CDI. Given n (\(n=2^{l}, l\in \mathbb {Z}\)) i.i.d. tailored RLWE channels \(W:X\rightarrow (\tilde {Y},H)\), we define the channel input as X1:n = U1:nGn where U1:n ∈{0,1}1:n and Gn is the generator matrix Footnote 9. We obtain n synthesized channels \(W_{n}^{(i)}:U^{(i)}\rightarrow (U^{1:i-1},\tilde {Y}^{1:n},H^{1:n})\) for 1 ≤ i ≤ n by performing channel combining and channel splitting. The Bhattacharyya parameter for W is defined as
To compute \(Z(W_{n}^{(i)})\) efficiently, we employ the degrading transform proposed in [34] to quantize a BMS channel W with continuous output alphabet into a degraded and approximated BDMS channel WQ with finite output alphabet size. Intuitively, the finer the discretized output alphabet is, the better WQ approximates W. Since we have already discretized H as hi for i = 1,⋯ ,m, we can consider hi as a constant and quantize the \(L/L^{\prime }\) channel \(W_{h_{i}}:X,h_{i}\rightarrow \tilde {Y}\) for each hi.
We define the likelihood ratio (LR) of a channel W as
where the transition probability \(W_{\tilde {Y}|X,h_{i}}\) is
Figure 5 depicts \(W_{\tilde {Y}|X,h_{i}}\) and \(\lambda (\tilde {y},h_{i})\) by giving some examples when q = 12289,r = 2 and hi = 10,30.
We can see in Fig. 5 that the channel \(W_{h_{i}}:X, h_{i}\rightarrow \tilde {Y}\) is BMS with \(\tilde {Y}\) continuously distributed over the interval \([0,q/\sqrt {2})\). There exists a permutation function \(\pi (\tilde {y})=(\lfloor \frac {q}{2}\rfloor \frac {1}{\sqrt {2}}-\tilde {y})\mod q/\sqrt {2}\) such that \(W(\tilde {y}|0,h_{i})=W(\pi (\tilde {y})|\lfloor \frac {q}{2}\rfloor \frac {1}{\sqrt {2}},h_{i})\). Intuitively, the BMS channel \(W_{h_{i}}\) can be decomposed into infinitely many binary symmetric channels (BSCs) \(W_{c}:X,h_{i}\rightarrow \tilde {Y}_{c}\) where the output is \(\tilde {Y}_{c}\in \{y_{c},\pi (y_{c})\}\) for continuous \(y_{c}\in [0,q/\sqrt {2})\), \(X\in \{0,\lfloor \frac {q}{2}\rfloor \frac {1}{\sqrt {2}}\}\) and the crossover probability is the corresponding probability density \(W(\tilde {y}_{c}|X,h_{i})\). If we focus on the likelihood ratio \(\lambda (\tilde {y}_{c},h_{i})\geq 1\), the crossover probability of BSC Wc is \(\frac {1}{\lambda (\tilde {y}_{c},h_{i})+1}\). The capacity of this BSC is
where \(\lambda (\tilde {y}_{c},h_{i})\geq 1\). Quantitatively, the continuous decomposition of \(W_{h_{i}}\) preserves the channel capacity in the sense that
where the integral interval is restricted to \(\tilde {y}\) such that \(\lambda (\tilde {y},h_{i})\geq 1\). If we ignore the subtle geometrical error introduced by rounding ⌊⋅⌋, we can observe a symmetry feature in the graphs in Fig. 5 and we find that the valid integral interval is
We divide the interval A into ν segments Aj for j ∈ [ν] such that
where \(\mathfrak {h_{2}}(\cdot )\) is the binary entropy function. Each Aj corresponds to a BSC channel with crossover probability
where
Since lattice \(L^{\prime }\) is infinite, we can numerically approximate \(f_{L^{\prime },0,{h_{i}^{2}}\sigma ^{2}}(\tilde {y})\), \(f_{L^{\prime },\lfloor \frac {q}{2}\rfloor \frac {1}{\sqrt {2}},{h_{i}^{2}}\sigma ^{2}}(\tilde {y})\) then \(\lambda (\tilde {y},h_{i}),A_{j}\) and pj.
If we define zj and its conjugate \(\bar {z}_{j}\) to be the channel output of the BSC associated with Aj, we will obtain the discretized output alphabet of \(W_{h_{i}}\) as
If we denote by WQ the discretized version of the original fading \(L/L^{\prime }\) channel \(W:X\rightarrow \tilde {Y},H\), the output alphabet of WQ is \({\mathscr{H}}_{q}\otimes \mathcal {Z}:=\{h_{i}\}\otimes \{z_{1},\bar {z}_{1},\cdots ,z_{\nu },\bar {z}_{\nu }\}\) for i ∈ [m] and j ∈ [ν] where ⊗ denotes the Cartesian product of two sets.
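The decomposition into BSCs and the merging into ν output pairs can be sketched numerically for one fixed hi. This is a minimal illustration under several assumptions: outputs with likelihood ratio at least 1 are binned by the capacity of their BSC into ν equal-width capacity levels (in the spirit of the degrading transform of [34]), integrals are replaced by a midpoint rule, the lattice sum is truncated, and all function names are hypothetical.

```python
import math

def periodic_gaussian(y, center, period, sigma, terms=10):
    """Density of N(center, sigma^2) folded onto period*Z (truncated sum)."""
    s = sum(math.exp(-(y - center - k * period) ** 2 / (2.0 * sigma * sigma))
            for k in range(-terms, terms + 1))
    return s / (sigma * math.sqrt(2.0 * math.pi))

def h2(p):
    """Binary entropy function in bits."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)

def degrade_quantize(period, x1, sigma, nu, grid=4000):
    """Return ([(weight_j, p_j)], capacity of the unquantized channel)."""
    step = period / grid
    agree = [0.0] * nu    # mass of A_j when the likelier input x = 0 was sent
    cross = [0.0] * nu    # mass of A_j when x = x1 was sent
    capacity = 0.0
    for k in range(grid):
        y = (k + 0.5) * step
        f0 = periodic_gaussian(y, 0.0, period, sigma)
        f1 = periodic_gaussian(y, x1, period, sigma)
        if f0 < f1 or f0 + f1 == 0.0:
            continue                      # covered by the conjugate output pi(y)
        p = f1 / (f0 + f1)                # crossover 1/(lambda+1), lambda = f0/f1
        c = 1.0 - h2(p)                   # capacity of the BSC attached to y
        capacity += (f0 + f1) * step * c
        j = min(int(c * nu), nu - 1)      # capacity level of this output
        agree[j] += f0 * step
        cross[j] += f1 * step
    bscs = [(a + b, b / (a + b)) for a, b in zip(agree, cross) if a + b > 0.0]
    return bscs, capacity

q, n, r, h, nu = 12289, 1024, 2.0, 30.0, 50
bscs, cap = degrade_quantize(q / math.sqrt(2), (q // 2) / math.sqrt(2),
                             h * math.sqrt(n / 2) * r, nu)
cap_q = sum(w * (1.0 - h2(p)) for w, p in bscs)
print(cap, cap_q)
```

In this sketch the quantized capacity never exceeds the original one and the gap stays below 1/ν, because every merged cell only contains BSCs whose capacities differ by at most 1/ν.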
Lemma 4
The channel \(W_{Q}:X\rightarrow Z,H_{q}\) is degraded with respect to W.
Proof
We supply an intermediate channel \(W_{P}:(\tilde {Y},H)\rightarrow (Z,H_{q})\) such that
We observe a channel degradation relation such that
□
Corollary 1
Given that \(W_{Q}:X\rightarrow Z,H_{q}\) is degraded with respect to W, the capacity, Bhattacharyya parameter and frame error rate of the two channels are related as
Proof
As a corollary of Lemmas 2 and 4. □
It is indicated in [34] that the capacity loss introduced by the degrading transform is no greater than 1/ν. If we choose large alphabet sizes m and 2ν, the loss of capacity is negligible, and so is the change in Z(⋅) and Pe(⋅). We also verified our channel quantization scheme with respect to the channel capacity. As shown in Fig. 6, for m = 20, ν = 50 and multiple choices of r, C(WQ) is close to C(W) with only a negligible difference.
To summarize, the degrading transform converts the RLWE channel W with continuous output alphabet into a BDMS channel WQ with finite output, which can be viewed as a combination of m × ν BSC channels. In this way, one can construct polar codes for WQ which are also valid for W.
5.3 Polar encoding and SC decoding
5.3.1 Encoding algorithm PolarEnc(⋅)
Given the BDMS channel WQ derived by channel quantization, we can adapt the polar encoding and decoding method introduced in Section 2.4 to WQ. Recall that the output alphabet size of WQ is m × 2ν. As channel combining and splitting proceed, the alphabet size of the synthesized channels \(W_{Qn}^{(i)}\) increases exponentially. To handle this problem, we employ an approximation method proposed in [27], which reduces the alphabet size of a BDMS channel with negligible and tractable loss of performance by merging some of the output symbols.
After we finish computing the Bhattacharyya parameters of all the \(W_{Qn}^{(i)}\), we can define the information set \(\mathcal {A}\) and frozen set \(\mathcal {A}^{c}\). Recall the encoding algorithm PolarEnc(m) in Table 1. We construct polar codes for plaintext m = u1:n as
where \(u_{\mathcal {A}}\) is the information vector and \(u_{\mathcal {A}^{c}}\) is the frozen vector. The complexity of encoding is \(O(n\log n)\) where n is equal to the degree of the cyclotomic field of RLWE.
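The encoding step can be sketched as follows. This is a minimal illustration, assuming the generator matrix is the Kronecker power of the kernel [[1,0],[1,1]] without bit-reversal permutation (the convention of Footnote 9 may differ) and an all-zero frozen vector; the function names and the example information set are hypothetical.

```python
def polar_transform(u):
    """Compute x = u * F^{(x) l} over GF(2) with F = [[1, 0], [1, 1]]."""
    x = list(u)
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("block length must be a power of two")
    step = 1
    while step < n:                          # log2(n) butterfly stages
        for start in range(0, n, 2 * step):
            for j in range(start, start + step):
                x[j] ^= x[j + step]          # one XOR per butterfly
        step *= 2
    return x

def polar_encode(message, info_set, n):
    """Place message bits on the information set, zeros elsewhere, then transform."""
    if len(message) != len(info_set):
        raise ValueError("message length must equal |A|")
    u = [0] * n                              # frozen positions stay zero
    for bit, pos in zip(message, sorted(info_set)):
        u[pos] = bit
    return polar_transform(u)

codeword = polar_encode([1, 0, 1, 1], info_set=[3, 5, 6, 7], n=8)
print(codeword)
```

Because the kernel is its own inverse over GF(2), applying `polar_transform` to a codeword returns the vector u, which is a convenient sanity check.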
5.3.2 Decoding algorithm PolarDec(⋅)
The decoding algorithm PolarDec(⋅) is exactly the same as the so called successive cancellation (SC) decoding initially proposed in [5]. Upon receiving the signal \(\tilde {y}^{1:n}\) (i.e. \(\tilde {y}^{1:n}=\tilde {B}\mathbf {y}\) in Table 1) and invoking their knowledge of the CSI h1:n, the recipient applies the SC decoding to \(\tilde {y}^{1:n},h^{1:n}\) and gives an estimation \(\bar {u}^{1:n}\) of u1:n as
where the transition probabilities of the synthesized channels \(W_{n}^{(i)}(\cdot |\cdot )\) can be recursively calculated by the SC decoding algorithm with complexity \(O(n\log n)\). Details of SC decoding can be found in Appendix A. A frame error occurs if \(\bar {u}^{1:n}\neq u^{1:n}\); we use frame error probability and DFR interchangeably in this work. Additionally, PolarEnc(⋅) and PolarDec(⋅) require a constant number of operations for fixed choices of \(n,\mathcal {A}\), making isochronous implementations possible. Details about isochrony will be discussed in Section 7.2.
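A compact recursive SC decoder working on transition probabilities (rather than likelihood ratios) can be sketched as below. It is not the algorithm of Appendix A: it assumes the Kronecker-power generator matrix without bit reversal, an all-zero frozen vector, and no rescaling of probabilities, so it is only suitable for short blocks. The inputs `p0[j]` and `p1[j]` stand for W(y_j | 0) and W(y_j | 1); the names are illustrative.

```python
def polar_transform(u):
    """x = u * F^{(x) l} over GF(2), F = [[1, 0], [1, 1]], no bit reversal."""
    x, n, step = list(u), len(u), 1
    while step < n:
        for start in range(0, n, 2 * step):
            for j in range(start, start + step):
                x[j] ^= x[j + step]
        step *= 2
    return x

def sc_decode(p0, p1, frozen):
    """Return (u_hat, x_hat) for per-symbol probabilities p0, p1."""
    n = len(p0)
    if n == 1:
        u = 0 if (frozen[0] or p0[0] >= p1[0]) else 1   # the only comparison
        return [u], [u]
    half = n // 2
    # Channel seen by a_j = x_j XOR x_{j+half}, marginalising over b_j.
    a0 = [0.5 * (p0[j] * p0[j + half] + p1[j] * p1[j + half]) for j in range(half)]
    a1 = [0.5 * (p1[j] * p0[j + half] + p0[j] * p1[j + half]) for j in range(half)]
    u_left, a_hat = sc_decode(a0, a1, frozen[:half])
    # Channel seen by b_j = x_{j+half} once a_hat is known.
    b0 = [(p1[j] if a_hat[j] else p0[j]) * p0[j + half] for j in range(half)]
    b1 = [(p0[j] if a_hat[j] else p1[j]) * p1[j + half] for j in range(half)]
    u_right, b_hat = sc_decode(b0, b1, frozen[half:])
    x_hat = [a_hat[j] ^ b_hat[j] for j in range(half)] + b_hat
    return u_left + u_right, x_hat

frozen = [True, True, True, False, True, False, False, False]
u = [0, 0, 0, 1, 0, 0, 1, 1]
x = polar_transform(u)
p0 = [0.9 if b == 0 else 0.1 for b in x]    # BSC(0.1) likelihoods, no bit flipped
p1 = [0.1 if b == 0 else 0.9 for b in x]
u_hat, x_hat = sc_decode(p0, p1, frozen)
print(u_hat == u)
```

The recursion only multiplies and adds probabilities; comparisons occur only at the leaves, and their outcome is ignored at frozen positions.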
6 Results: Performance analysis and improvement
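According to Theorem 2, the frame error probability \(P_{e}(n,R,\mathcal {A})\) of SC decoding is upper bounded by the sum of \(Z(W_{n}^{(i)})\) over the information set. Since \(W_Q \preceq W\) and \(W_{Qn}^{(i)}\preceq W_{n}^{(i)}\) according to Lemma 1, we derive
\(P_{e}(n,R,\mathcal {A})\leq \sum _{i\in \mathcal {A}}Z(W_{n}^{(i)})\leq \sum _{i\in \mathcal {A}}Z(W_{Qn}^{(i)}).\) (16)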
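The way such a bound on the frame error probability is evaluated can be illustrated on a toy channel. The sketch below uses a binary erasure channel as a stand-in, because its Bhattacharyya parameters follow the exact recursion Z⁻ = 2Z − Z² and Z⁺ = Z²; the tailored RLWE channel instead requires the quantization and merging approximations described earlier. Function names are hypothetical, and the index order assumes a generator matrix without bit reversal.

```python
def bec_bhattacharyya(n, z0):
    """Bhattacharyya parameters of the n synthesized channels of a BEC(z0)."""
    zs = [z0]
    while len(zs) < n:
        nxt = []
        for z in zs:
            nxt.append(2 * z - z * z)   # degraded ("minus") channel
            nxt.append(z * z)           # upgraded ("plus") channel
        zs = nxt
    return zs

def union_bound(zs, k):
    """Pick the k most reliable indices and sum their Bhattacharyya parameters."""
    info_set = sorted(sorted(range(len(zs)), key=lambda i: zs[i])[:k])
    return info_set, sum(zs[i] for i in info_set)

zs = bec_bhattacharyya(8, 0.5)
info_set, bound = union_bound(zs, 4)
print(info_set, bound)
```

For this toy example the bound is far from cryptographic targets; the block length and channel quality used in this work are what make very small values possible.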
Recall from Fig. 6 that the capacity of our tailored RLWE channel deteriorates dramatically because we use a tailored and shrunk constellation diagram. As a result, for most choices of r which are believed to be secure in RLWE-based PKE, we cannot obtain a DFR lower than \(2^{-128}\), which is used as a benchmark in NIST standardization. As explained in Section 3.2, we carefully and conservatively choose a cube \({{\varLambda }}^{\prime \prime }\) which is enclosed in the maximal sphere inscribed in \({{\varLambda }}^{\prime }\). Almost surely there are other valid choices of \({{\varLambda }}^{\prime \prime }\) larger than the one we choose, though it is not easy to find the optimal one. A pragmatic solution is to gradually scale \({{\varLambda }}^{\prime \prime }\) up by a factor t ≥ 1 and run simulations for each t to check whether the numerical results of Pe agree with the upper bound in formula (16). We highlight that if t is not larger than some critical point, the channel degradation relation in Lemma 3 still holds. Therefore, the theoretical upper bound on Pe still applies after we scale the modulation constellation \({{\varLambda }}^{\prime \prime }\). Please refer to Remark 3 for further explanation.
Figure 7 compares the upper bounds of the frame error probability Pe with our simulation results in the setting q = 12289, n = 1024, r = 1. The solid lines indicate the upper bounds of Pe with respect to different code rates R. The solid lines with stars represent our simulation results which, for reasonably small DFR, comply with the upper bound. We aim to achieve Pe = \(2^{-128}\) at code rate R = 0.25. Clearly, this is unachievable when the scale factor t = 1. We gradually increase t and obtain the corresponding estimation of Pe. We can see that the decoding performance improves significantly for a slightly larger t, e.g., Pe is smaller than \(10^{-60}\) (≈ \(2^{-200}\)) at R = 0.25 for t = 2. When t = 2, the experimental result represented by the red star also complies with its corresponding theoretical estimation, i.e., the red solid line. This implies that our estimation of Pe for t = 2 is reliable to some extent. Please note that all these experiments target relatively large Pe, which is feasible to verify.
Figure 8 can be interpreted in the same manner as Fig. 7. The only different parameter used here is r = 2. The solid lines in different colors represent our estimation of Pe and the stars are our simulation results. By making the scale factor t as large as 6, the target R and Pe can be achieved. For the relatively large Pe shown in the graph, we observed that our simulation results comply with our estimation when t = 6,7,9,11,12. However, when t = 14, the simulation results are worse than our estimation, implying that the constellation diagram \({{\varLambda }}^{\prime \prime }\) is too large and goes beyond the valid domain.
In Fig. 9, r = 2.83. We can observe that our estimations are effective for t = 8,12 but fail for t > 12. We can see that none of our simulation results comply with the estimations in Fig. 10. It implies that the scaling method does not apply for r ≥ 3.46.
Remark 3
The error sources for the scaled and tailored RLWE channel model are summarized as follows.
(a) As t increases, the constellation space \({{\varLambda }}^{\prime \prime }\) may go beyond \({{\varLambda }}^{\prime }\) and our model will fail to describe the statistical feature of the real channel.
(b) The SC decoder takes \(\tilde {B}\textbf {y}\) to be the channel output of a fading \(L^{n}/{{\varLambda }}^{\prime \prime }\) channel while it is actually a fading \(L^{n}/ {{\varLambda }}^{\prime }/ {{\varLambda }}^{\prime \prime }\) channel according to Table 1. This is because Alice first performs a mod Rq operation and then calculates \(\tilde {B}\textbf {y}\) upon receiving y from Bob. For small r, the two channels have quite close distributions, but they become less alike as r grows. This explains why our model fails when r ≥ 3.46 in Fig. 10.
(c) It might be misunderstood that for any t > 1 the theoretical estimation in formula (16) would not apply. This is not the case. As stated in Section 3.2, the constellation \({{\varLambda }}^{\prime }\) shrinks by a factor of \(\sqrt {n}\) in side length and becomes the tailored one \({{\varLambda }}^{\prime \prime }\). As a result of (a) and (b), slightly increasing \({{\varLambda }}^{\prime \prime }\) will not affect the soundness of the channel degradation relation and formula (16) if t does not go beyond some critical point. Finding such a point is nontrivial. That is why we run simulations to explore the relation between t, r and DFR. The disadvantage of this pragmatic method is that we cannot verify the small Pe of cryptographic interest.
7 Security analysis
7.1 Security improvement by new DFR
We define the concrete bit-security to be \(\log _{2}\) of the time complexity of certain attacks breaking a scheme with specific parameters of interest. We analyze the concrete bit-security of the proposed RLWE-based PKE by considering the best known generic attacks against ring-LWE and the corresponding cost models. A comprehensive survey of a variety of generic attacks and cost models can be found in [1, 2]. Since the proposed RLWE-based PKE differs from NewHope solely in the way plaintext is encoded and decoded, and the error-correcting code itself does not affect the security reduction, the security estimation of NewHope [4] can be extended to our case.
Following the security estimation in [4, 19], we focus on two generic attacks. Essentially, we consider (a) a primal attack, which consists of constructing a unique shortest vector problem (uSVP) instance from the LWE samples and solving it using the block Korkin–Zolotarev (BKZ) algorithm with classical/quantum sieving; and (b) a dual attack, which searches for the shortest vector in a dual lattice constructed from the LWE samples using BKZ with classical/quantum sieving. We employ the cost model in [4] where the cost of BKZ with classical/quantum sieving is 0.292β/0.265β with β the block dimension of BKZ. In Table 2, we summarize the security estimates of the two attacks where the cost is defined as \(\log _{2}\) of the time complexity of BKZ Footnote 10. Note that a variant of the dual attack is used by the estimator, which makes the cost different from [4].
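The cost model above is a linear function of the block dimension and can be written down directly. This is a small sketch with hypothetical function names; the block dimensions in the example are arbitrary illustrations and are not values taken from Table 2.

```python
import math

CLASSICAL_EXPONENT = 0.292   # log2 cost per unit of block dimension, classical sieving
QUANTUM_EXPONENT = 0.265     # same for quantum sieving

def bkz_cost_bits(beta, quantum=False):
    """log2 of the BKZ cost at block dimension beta under the model 0.292b / 0.265b."""
    return (QUANTUM_EXPONENT if quantum else CLASSICAL_EXPONENT) * beta

def min_block_dimension(target_bits, quantum=False):
    """Smallest block dimension whose modelled cost reaches target_bits."""
    exponent = QUANTUM_EXPONENT if quantum else CLASSICAL_EXPONENT
    return math.ceil(target_bits / exponent)

print(bkz_cost_bits(1000), bkz_cost_bits(1000, quantum=True))
print(min_block_dimension(256), min_block_dimension(256, quantum=True))
```

Under this model a given bit-security target translates into a minimum block dimension that an attacker must be able to run.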
There is a trade-off between the DFR and the bit-security level of RLWE-based PKE. Basically, a larger error term (or a larger binomial parameter k in NewHope) gives better security but a worse DFR. The motivation of this work is to employ polar codes to give a safer DFR margin such that we can improve the bit-security level while achieving the target DFR. In NIST standardization, this target DFR is \(2^{-128}\). A more conservative target \(2^{-140}\) is used in the literature [13, 32].
Table 2 illustrates the DFR and bit-security level of RLWE-based PKE using our polar coding scheme for different choices of the binomial parameter k (\(r=\sqrt {k/2}\)) and scale factor t. As discussed in the previous section, the scale factor of the constellation diagram cannot be larger than 12 for k = 8, otherwise the estimation of DFR is no longer valid. We select a more conservative choice t = 11 and achieve DFR = \(2^{-298}\) for n = 1024, q = 12289, k = 8 using our polar coding scheme, which is smaller than the DFR \(2^{-216}\) of NewHope round 2 in the same setting. As discussed for Fig. 10, our calculation of DFR for k ≥ 24 (r ≥ 3.46) no longer applies.
In conclusion, our polar coding scheme and the selected parameters provide the RLWE-based PKE with a bit-security of at least 256 bits while achieving the target DFR \(2^{-140}\) (and also \(2^{-128}\)). This is a considerable improvement compared with NewHope round 2, which offers a bit-security of 235 bits with the same parameters. The state-of-the-art studies of this kind can be classified into two categories. In [13], LDPC and BCH codes are used to increase the bit-security to 309 bits while achieving a DFR of \(2^{-140}\). However, their DFR estimation relies heavily on an “independence” assumption and their error-correcting algorithms are not isochronous. The other approach was proposed by Song et al. in [32], which gave a tighter bound on the DFR of NewHope and increased the bit-security to 252 bits.
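The relation \(r=\sqrt {k/2}\) between the binomial parameter and the Gaussian parameter follows from matching standard deviations, since a centered binomial distribution with parameter k has variance k/2. A short check, with an illustrative function name, reproduces the values of r quoted in this section:

```python
import math

def binomial_k_to_r(k):
    """Standard deviation of the centered binomial distribution with parameter k."""
    return math.sqrt(k / 2)

for k in (8, 16, 24):
    print(k, round(binomial_k_to_r(k), 2))
```

This gives r = 2 for k = 8, r ≈ 2.83 for k = 16 and r ≈ 3.46 for k = 24.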
7.2 Resilience against timing-based attacks
When error-correcting codes are adapted to RLWE-based PKE, a major concern is the resilience against timing-based attacks. Discussions of this kind can be found in [19, 31]. We employ a semi-formal definition of constant-time algorithms which is called “isochrony” in [15]. We view an algorithm to be isochronous if its execution time is independent of the sensitive part of its input and output. This is a weaker notion than the conventional definition but suffices to argue security against timing attacks. We will justify the isochrony of polar encoding and decoding in this section.
Encoding
As introduced in Section 2.4 as well as Section 5.3, the encoding of polar codes takes plaintext u1:n as input and yields codewords as in equation (1). The block length n is equal to the degree of the cyclotomic field of RLWE. The encoding process comprises exactly \(\frac {n\log n}{2}\) XOR operations regardless of the plaintext u1:n. This can be verified by some trivial examples as in Fig. 11. Note that it is sensible to carry out the calculation of Bhattacharyya parameters for the synthesized channels offline, because they are determined by the distribution of the residue noise term e ⋅ t − s ⋅ e1 + e2 and the calculation can be done once and for all. Therefore, the encoding is isochronous.
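The operation count can be confirmed by instrumenting the butterfly structure of the encoder. The sketch below counts XORs for several random inputs, assuming the Kronecker-power form of the generator matrix without bit reversal; the function name is hypothetical. It only demonstrates that the number and order of operations do not depend on the data, and makes no claim about the constant-time behaviour of the Python interpreter itself.

```python
import math
import random

def encode_counting_xors(u):
    """Apply the polar transform and count the XOR operations performed."""
    x, n, xors, step = list(u), len(u), 0, 1
    while step < n:                          # loop bounds depend on n only
        for start in range(0, n, 2 * step):
            for j in range(start, start + step):
                x[j] ^= x[j + step]          # executed regardless of bit values
                xors += 1
        step *= 2
    return x, xors

n = 1024
rng = random.Random(7)
counts = set()
for _ in range(5):
    u = [rng.randint(0, 1) for _ in range(n)]
    counts.add(encode_counting_xors(u)[1])
print(counts, n * int(math.log2(n)) // 2)
```

Every input leads to the same count, namely (n log n)/2 = 5120 for n = 1024.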
Decoding
As detailed in Appendix A, SC decoding comprises three types of operations: (1) recursive calculation of the transition probabilities \(W_{n}^{(i)}\) as in Algorithm 2; (2) comparisons of two transition probabilities as in line 9 of Algorithm 1; and (3) XOR operations as in Algorithm 3. As in [15], we show that SC decoding is isochronous by showing that its timing is independent of the sensitive information of the protocol. For the decoding of RLWE PKE, the sensitive information includes the input \(\tilde {B}\mathbf {y}\) in Table 1 (for which we use the shorthand notation \(y^{1:n}\)), the output \(\bar {u}^{1:n}\) of SC decoding (i.e. the decoding result of the plaintext \(u^{1:n}\)), and the secret terms \(e, s, t, e_1, e_2\) separately generated by each side of the protocol. Note that the information set \(\mathcal {A}\) and its complement \(\mathcal {A}^{c}\) are determined by the distribution of the secret terms and the block length n, both of which are publicly known. The frozen vector (e.g., an all-zero vector) is also publicly known. Table 3 illustrates which types of operations are isochronous with respect to the sensitive information.
Firstly, the recursive calculations of \(W_{n}^{(i)}\) are isochronous because their timings are independent of any sensitive information. As described in Appendix A, for any fixed n an SC decoder carries out exactly \(n\log n\) transition probability assemblies as in equations (17) and (18). Normally, these assemblies are floating-point operations. We use transition probabilities rather than the more popular likelihood-ratio recursions to avoid floating-point divisions, which are considered difficult to implement isochronously [29, 37].
Secondly, the floating-point comparisons of two transition probabilities in Algorithm 1 constitute the decision-making process that yields the output \(\bar {u}^{1:n}\). Generally speaking, comparing two close floating-point values may take longer, but such a comparison is nonetheless equally likely to return True or False. Therefore, it is reasonable to regard the timing of this type of operation as independent of \(\bar {u}^{1:n}\). In addition, the overall running time of the comparisons depends only on \(\mathcal {A}\) and n, because comparisons only take place for indices in the information set \(\mathcal {A}\). No other sensitive information is involved in the comparison operations.
Thirdly, the XOR logical operations in Algorithm 3 are the same as what happens in encoding. The quantity of XOR operations carried out by Algorithm 3 is uniquely determined by block length n.
We conclude that the encoding and decoding are isochronous with respect to the sensitive information, including the plaintext, the input and output of SC decoding, and the secret terms \(e, s, t, e_1, e_2\).
8 Conclusions
We have presented the first example of a polar coding technique that improves the DFR of RLWE-based PKE by viewing the protocol as a fading channel with CSI known to the decoder. Moreover, switching from the polynomial basis to the canonical basis removes the dependency present in the residue noise term. The constellation space is tailored to derive an i.i.d. fading channel at the cost of decoding performance, and a scaling method is employed to counteract the performance loss. Both numerical and theoretical results are given to verify the DFR estimation. The advantages of our method are as follows.
- We derive an i.i.d. channel model of the residue noise term in H space using the canonical embedding. The fact that part of the noise term is known to the decoder is exploited to improve the decoding performance.
- The bit-security is increased to 256 bits while achieving the target DFR of \(2^{-140}\) in the setting of n = 1024, q = 12289, k = 16 (r = 2.83). This is better than the benchmark of 252 bits in [32]. Although it does not break the record of 309 bits in [13], the results there rely on an “independence” assumption that may not hold.
- Polar codes support isochronous implementations of encoding and decoding, whereas the LDPC and BCH codes employed in [13] do not. We show that the encoding and decoding of polar codes are isochronous with respect to the sensitive information of the protocol.
The disadvantages are also given as follows.
- Switching between the two bases by multiplying by the matrices \(\tilde {B}\) and \(\tilde {B}^{-1}\), as in Table 1, increases the complexity of the protocol.
- To derive an i.i.d. channel model, we designed a tailored modulation diagram with a smaller code distance than the original modulation diagram \(\{0,\lfloor \frac {q}{2} \rfloor \}\). This hurts the decoding performance, but the power of polar codes and the proposed scaling method counteract this effect to some extent.
- The critical points of the scale factor t and the noise parameter r, beyond which the theoretical upper bound on \(P_e\) no longer applies, have not yet been determined.
Notes
The expansion of LAC is LAttice-based Cryptosystems.
Hila5 was merged into another NIST proposal called Round5.
A fundamental region of a lattice Λ is a region that includes only one point from each coset of Λ in \(\mathbb {R}^{n}\). Algebraically, \(\mathcal {V}({{\varLambda }})\) is a set of all the coset representatives of Λ in \(\mathbb {R}^{n}\).
The generalizations of polar codes are extended to a large class of channels, e.g., the binary-input memoryless symmetric (BMS) channel.
The maximum mutual information over all possible channel input distributions is the channel capacity.
For non-expert audiences, another interpretation of the proposed RLWE channel model is given in vector form in Appendix B.
Different values of r induce different SNRs. The calculation of the SNR with respect to the fading \(L/L^{\prime }\) channel is given in Appendix C.
The notation n indicates the block length of a code; it also indicates the degree of the 2n-th cyclotomic field which defines RLWE.
We use the security estimator from https://github.com/estimate-all-the-lwe-ntru-schemes/estimate-all-the-lwe-ntru-schemes.github.io.
References
Albrecht, M.R., Player, R., Scott, S.: On the concrete hardness of learning with errors. J. Math. Cryptol. 9(3), 169–203 (2015)
Albrecht, M.R., Curtis, B.R., Deo, A., Davidson, A., Player, R., Postlethwaite, E.W., Virdia, F., Wunderer, T.: Estimate all the LWE, NTRU schemes! In: Catalano, D., De Prisco, R. (eds.) Security and Cryptography for Networks. Springer International Publishing, Cham (2018)
Alkim, E., Ducas, L., Pöppelmann, T., Schwabe, P.: NewHope without reconciliation. IACR Cryptology ePrint Archive, 1157 (2016)
Alkim, E, Ducas, L, Pöppelmann, T, Schwabe, P: Post-quantum key exchange—a new hope. In: 25th USENIX Security Symposium, pp 327–43 (2016b)
Arikan, E.: Channel polarization: a method for constructing capacity-achieving codes for symmetric binary-input memoryless channels. IEEE Trans. Inf. Theory 55(7), 3051–73 (2009)
Bravo-Santos, A.: Polar codes for the Rayleigh fading channel. IEEE Communications Lett. 17(12), 2352–55 (2013)
Caire, G., Taricco, G., Biglieri, E.: Bit-interleaved coded modulation. IEEE Trans. Inform. Theory 44(3), 927–46 (1998)
Crockett, E., Peikert, C.: Challenges for ring-LWE. IACR Cryptol ePrint Arch, 782 (2016)
D’Anvers, JP, Vercauteren, F, Verbauwhede, I: The impact of error dependencies on ring/mod-LWE/LWR based schemes. In: International Conference on Post-Quantum Cryptography. Springer, pp 103–15 (2019)
Ding, J., Xie, X., Lin, X.: A simple provably secure key exchange scheme based on the learning with errors problem. IACR Cryptology EPrint Archive, 688 (2012)
Forney, G.D.: Coset codes. I. Introduction and geometrical classification. IEEE Trans. Inf. Theory 34(5), 1123–51 (1988)
Forney, G.D., Trott, M.D., Chung, S.Y.: Sphere-bound-achieving coset codes and multilevel coset codes. IEEE Trans. Inf. Theory 46(3), 820–850 (2000)
Fritzmann, T., Pöppelmann, T., Sepúlveda, M.J.: Analysis of error-correcting codes for lattice-based key exchange. IACR Cryptology ePrint Archive, 150 (2018)
Fujisaki, E, Okamoto, T: Secure integration of asymmetric and symmetric encryption schemes. In: Annual International Cryptology Conference. Springer, pp 537–54 (1999)
Howe, J, Prest, T, Ricosset, T, Rossi, M: Isochronous Gaussian Sampling: From Inception to Implementation. In: Ding, J, Tillich, JP (eds.) Post-Quantum Cryptography, pp 53–71. Springer International Publishing, Cham (2020)
Kocer, E.G.: Circulant, negacyclic and semicirculant matrices with the modified Pell, Jacobsthal and Jacobsthal-Lucas numbers. Hacettepe Journal of Mathematics and Statistics 36(2) (2007)
Korada, SB.: Polar Codes for Channel and Source Coding. PhD Thesis Ecole Polytechnique Fédérale De Lausanne. Lausanne, Switzerland (2009)
Ling, C.: Polar codes and polar lattices for independent fading channels. IEEE Trans. Commun. 64(12), 4923–4935 (2016)
Lu, X., Liu, Y., Zhang, Z., Jia, D., Xue, H., He, J., Li, B., Wang, K., Liu, Z., Yang, H.: LAC: Practical ring-LWE based public-key encryption with byte-level modulus. IACR Cryptol ePrint Arch, 1009 (2018)
Lyubashevsky, V, Peikert, C, Regev, O: On ideal lattices and learning with errors over rings. In: Annual international conference on the theory and applications of cryptographic techniques, pp 1–23. Springer, (2010)
Lyubashevsky, V, Peikert, C, Regev, O: A toolkit for ring-LWE cryptography. In: Annual international conference on the theory and applications of cryptographic techniques, pp 35–54. Springer, (2013)
Martinez, A., Guillen i Fabregas, A., Caire, G., Willems, F.M.J.: Bit-interleaved coded modulation revisited: a mismatched decoding perspective. IEEE Trans. Inf. Theory 55(6), 2756–2765 (2009)
Mondelli, M., Hassani, S.H., Urbanke, R.L.: Unified scaling of polar codes: Error exponent, scaling exponent, moderate deviations, and error floors. IEEE Trans. Inf. Theory 62(12), 6698–6712 (2016)
Murphy, S, Player, R: δ-Subgaussian random variables in cryptography. In: Jang-Jaccard, J, Guo, F (eds.) Information Security and Privacy, pp 251–268. Springer International Publishing, Cham (2019)
Murphy, S, Player, R: Discretisation and product distributions in ring-LWE. J. Math. Cryptol. 15(1) (2020)
NIST.: Submission requirements and evaluation criteria for the post-quantum cryptography standardization process. https://csrc.nist.gov/CSRC/media/Projects/Post-Quantum-Cryptography/documents/call-for-proposals-final-dec-2016.pdf. Accessed 1 Aug 2016
Pedarsani, R, Hassani, SH, Tal, I, Telatar, E: On the construction of polar codes. In: 2011 IEEE international symposium on information theory proceedings, pp. 11–15 (2011)
Poppelen, A.V.: Cryptographic decoding of the Leech lattice. IACR Cryptology ePrint Archive, 1050 (2016)
Prest, T., Ricosset, T., Rossi, M.: Simple, fast and constant-time Gaussian sampling over the integers for Falcon. Tech. rep., Second PQC Standardization Conference. https://csrc.nist.gov/Presentations/2019/simple-fast-and-constant-time-gaussian. Accessed 23 Aug 2019
Regev, O.: On lattices, learning with errors, random linear codes and cryptography. In: Proceedings of the thirty-seventh annual ACM symposium on theory of computing. ACM, New York, NY, USA, STOC ’05, pp. 84–93 (2005)
Saarinen, MJO.: HILA5: On reliability, reconciliation, and error correction for ring-LWE encryption. In: International conference on selected areas in cryptography. Springer, pp. 192–212 (2017)
Song, M., Lee, S., Shin, D., Lee, E., Kim, Y., No, J.: Analysis of error dependencies on newHope. IEEE Access 8, 45443–56 (2020)
Stehlé, D, Steinfeld, R, Tanaka, K, Xagawa, K.: Efficient public key encryption based on ideal lattices. In: International conference on the theory and application of cryptology and information security. Springer, pp. 617–635 (2009)
Tal, I., Vardy, A.: How to construct polar codes. IEEE Trans. Inf. Theory 59(10), 6562–6582 (2013)
Tal, I., Vardy, A.: List decoding of polar codes. IEEE Trans. Inf. Theory 61(5), 2213–2226 (2015)
Trifonov, P.: Design of polar codes for Rayleigh fading channel. In: 2015 international symposium on wireless communication systems (ISWCS), pp. 331–335 (2015)
Zhao, R.K., Steinfeld, R., Sakzad, A.: Facct: fast, compact, and constant-time discrete Gaussian sampler over integers. IEEE Trans. Comput. 69(1), 126–137 (2019)
Acknowledgements
The authors would like to thank Charles Grover for providing support on the formulation of RLWE channel.
Funding
This research is supported by the National Research Foundation, Singapore under its Strategic Capability Research Centres Funding Initiative and by the UK Engineering and Physical Sciences Research Council (EPSRC grant EP/S021043/1). Any opinions, findings and conclusions or recommendations expressed in this material are those of the authors and do not reflect the views of National Research Foundation, Singapore.
Ethics declarations
Conflict of Interests
The authors have no conflicts of interest to declare that are relevant to the content of this article.
Appendices
Appendix A Successive cancellation decoding
SC decoding was proposed in [5, Section VIII] and modularized in [35, Section II]. Upon observing the signal \(y^{1:n}\), SC decoding works as in Algorithm 1 and gives an estimate of \(u^{1:n}\). We illustrate SC decoding with the example of n = 8 in Fig. 12, where an SC decoder is described as a circuit consisting of \(n \times (l + 1)\) nodes for \(n = 2^{l}\). These nodes are pairwise connected by “ ” wires. We also define two probability arrays PReg0[n][l + 1] and PReg1[n][l + 1] as well as a bit array UReg[n][l + 1]. Each array consists of \(n \times (l + 1)\) elements. In the circuit, every node is specified by a phase parameter ϕ, a branch parameter ψ and a layer number m, where \(0 \le m \le l\), \(1 \le \phi \le 2^{m}\) and \(0 \le \psi < 2^{l-m}\). On each layer m, every integer \(1 \le i \le n\) corresponds to a unique pair (ϕ,ψ), and we use the shorthand notation i = 〈ϕ,ψ〉 when m is clear. As shown in Fig. 12, nodes on layer m are classified into \(2^{m}\) phases and \(2^{l-m}\) branches. Nodes in the same phase are shown in the same color.
Algorithm 1 SC decoder(⋅)
As a high-level description, an SC decoder takes the signal \(y^{1:n}\) as input. Node \(A_{\langle \phi ,\psi \rangle _{m}}\) then calculates the transition probabilities \(W_{2^m}^{(i)}(\cdot |0)\) and \(W_{2^m}^{(i)}(\cdot |1)\) of an intermediate synthesized channel \(W_{2^m}^{(i)}\) and stores them in PReg0[〈ϕ,ψ〉][m] and PReg1[〈ϕ,ψ〉][m], respectively, as in Algorithm 2. Meanwhile, node \(A_{\langle \phi ,\psi \rangle _{m}}\) also updates the element UReg[〈ϕ,ψ〉][m] of the bit array as in Algorithm 3. Each node in the circuit is activated once to update the two probability arrays and one more time to update the bit array. The output \(\bar {u}^{1:n}\) is stored in the l-th column of the bit array, i.e. UReg[:][l].
Algorithm 2 CalP(⋅)
In CalP(⋅), a node at layer m assembles the outputs of two nodes at layer m − 1 that have the same phase but different branches; it then yields two probability values and stores them in the corresponding positions of the arrays PReg0 and PReg1. This process runs through the circuit from right to left in a recursive manner: the probabilities of the same phase but different branches on the previous layer (the two nodes on the RHS of “ ”) are assembled and turned into probabilities of different phases but the same branch (the two nodes on the LHS of “ ”) on the current layer.
To initialize, the nodes on layer m = 0 take \(y^{1:n}\) as input and store W(y(i)|0) and W(y(i)|1) for i = 1,⋯ ,n in column m = 0 of the two probability arrays, i.e., PReg0[:][m = 0] and PReg1[:][m = 0], respectively. Recall from Section 2.4 that W(Y |X) is the transition probability of the channel W.
Then, node \(A_{\langle \phi ,\psi \rangle _{m}}\) at layer m > 0 updates the two probability arrays as follows. Let \(\kappa =\lceil \frac {\phi }{2} \rceil\). If ϕ is odd, the update for α = 0, 1 is given by equation (17).
If ϕ is even, the update for α = 0, 1 and β = UReg[〈ϕ − 1,ψ〉][m] is given by equation (18).
The above process proceeds recursively from right to left along the “ ” in the circuit until every node on layer m = l finishes its work.
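For orientation, the recursion from [5] that CalP(⋅) evaluates layer by layer can be written in Arıkan's generic notation. Here N plays the role of \(2^{m-1}\), the channel index i plays the role of κ, and the subscripts o and e select the odd- and even-indexed entries of a vector. This is a sketch in the notation of [5] rather than the register-level form in terms of PReg0 and PReg1; associating the first line with an odd phase and the second with an even phase assumes that phases are numbered as in [5].

```latex
\begin{align*}
W_{2N}^{(2i-1)}\bigl(y_1^{2N}, u_1^{2i-2} \,\big|\, u_{2i-1}\bigr)
  &= \sum_{u_{2i}} \tfrac{1}{2}\,
     W_N^{(i)}\bigl(y_1^{N},\, u_{1,o}^{2i-2} \oplus u_{1,e}^{2i-2} \,\big|\, u_{2i-1} \oplus u_{2i}\bigr)\,
     W_N^{(i)}\bigl(y_{N+1}^{2N},\, u_{1,e}^{2i-2} \,\big|\, u_{2i}\bigr), \\
W_{2N}^{(2i)}\bigl(y_1^{2N}, u_1^{2i-1} \,\big|\, u_{2i}\bigr)
  &= \tfrac{1}{2}\,
     W_N^{(i)}\bigl(y_1^{N},\, u_{1,o}^{2i-2} \oplus u_{1,e}^{2i-2} \,\big|\, u_{2i-1} \oplus u_{2i}\bigr)\,
     W_N^{(i)}\bigl(y_{N+1}^{2N},\, u_{1,e}^{2i-2} \,\big|\, u_{2i}\bigr).
\end{align*}
```

Both lines use only multiplications and additions of probabilities, which is consistent with the division-free choice discussed in Section 7.2.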
Algorithm 3 CalBit(⋅)
CalBit(⋅) updates the bit array UReg in a recursive manner similar to CalP(⋅), but in the reverse direction. The two nodes on the RHS of “ ” take the two bits yielded by the two nodes on the LHS of “ ”, derive another two bits from them, and update the bit array UReg. Specifically, for even ϕ and κ = ϕ/2, node \(A_{\langle \kappa ,2\psi \rangle _{m-1}}\) updates UReg[〈κ, 2ψ〉][m − 1].
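The interplay of probability assembly, bit decisions and bit propagation can be sketched with a compact recursive SC decoder in the probability domain. This is a minimal illustration under stated assumptions, not the array-based circuit of Fig. 12: it assumes the natural-order transform with kernel F = [[1, 0], [1, 1]], omits any normalization (so it is only suitable for small n), and the names `polar_transform`, `sc_decode` and the `counter` argument are hypothetical. The check-node step plays the role of CalP(⋅) for an odd phase, the second step the role of CalP(⋅) for an even phase, and the recombination of partial codewords the role of CalBit(⋅).

```python
def polar_transform(u):
    """x = u * F^(kron log2 n) over GF(2), natural order, F = [[1,0],[1,1]]."""
    x = list(u)
    n, step = len(x), 1
    while step < n:
        for start in range(0, n, 2 * step):
            for j in range(start, start + step):
                x[j] ^= x[j + step]
        step *= 2
    return x


def sc_decode(p0, p1, frozen, counter):
    """Successive cancellation decoding with transition probabilities.

    p0[j], p1[j]: W(y_j | 0) and W(y_j | 1) for each received symbol.
    frozen[j]:    the known bit for a frozen index, None for an information index.
    counter[0]:   incremented once per probability assembly (no divisions used).
    Returns (u_hat, x_hat): decoded input bits and the re-encoded partial codeword.
    """
    n = len(p0)
    if n == 1:
        if frozen[0] is not None:
            bit = frozen[0]                      # frozen bits are never compared
        else:
            bit = 0 if p0[0] >= p1[0] else 1     # one comparison per information bit
        return [bit], [bit]
    h = n // 2
    # First half of the inputs: sum over the unknown bit of the second half.
    m0 = [0.5 * (p0[j] * p0[j + h] + p1[j] * p1[j + h]) for j in range(h)]
    m1 = [0.5 * (p1[j] * p0[j + h] + p0[j] * p1[j + h]) for j in range(h)]
    counter[0] += h
    u_a, x_a = sc_decode(m0, m1, frozen[:h], counter)
    # Second half: condition on the bits x_a decided in the first half.
    q0, q1 = [], []
    for j in range(h):
        left = (p0[j], p1[j])
        q0.append(0.5 * left[x_a[j]] * p0[j + h])
        q1.append(0.5 * left[x_a[j] ^ 1] * p1[j + h])
    counter[0] += h
    u_b, x_b = sc_decode(q0, q1, frozen[h:], counter)
    # Bit propagation: one XOR per pair, independent of the data.
    x = [a ^ b for a, b in zip(x_a, x_b)] + x_b
    return u_a + u_b, x


if __name__ == "__main__":
    frozen = [0, 0, 0, None, 0, None, None, None]   # illustrative information set
    u = [0, 0, 0, 1, 0, 0, 1, 1]
    x = polar_transform(u)
    p0 = [0.9 if b == 0 else 0.1 for b in x]         # toy likelihoods, no channel errors
    p1 = [0.1 if b == 0 else 0.9 for b in x]
    counter = [0]
    u_hat, _ = sc_decode(p0, p1, frozen, counter)
    print(u_hat == u, counter[0])
```

In this sketch the number of probability assemblies depends only on the block length, which matches the \(n\log n\) count used in the isochrony argument; the data-dependent indexing and Python's interpreter mean the code itself should not be read as a constant-time implementation.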
Appendix B An interpretation of RLWE channel model in vector representation
To visualize the decoding step of RLWE-based PKE as a channel decoding problem, we rewrite formula (3) in vector form as
\[ \mathbf {y} = \lfloor q/2 \rfloor \cdot \mathbf {m} + E \cdot \mathbf {t} - S \cdot \mathbf {e}_1 + \mathbf {e}_2, \tag{21} \]
where \(\mathbf {m}, \mathbf {t}, \mathbf {e}_1, \mathbf {e}_2\) are vectors, E is the negacyclic matrix whose first column consists of the coefficients of the polynomial e, and S is the negacyclic matrix defined in the same manner from s. Formula (21) can be viewed as a channel model where y is the channel output, m is the codeword to be modulated as ⌊q/2⌋⋅m, and E ⋅t −S ⋅e1 + e2 is the channel noise. In this section, we refer to formula (21) as an RLWE channel.
Theorem 3 (Diagonalization of negacyclic matrix, [16])
Let N(x) be an n × n negacyclic matrix whose first column is x. Then
\[ N(x) = G^{*}\,\mathrm {diag}\bigl (\lambda _1(x),\ldots ,\lambda _n(x)\bigr )\,G, \]
where the element \(G_{p,q}=\frac {1}{\sqrt {n}}w^{(2p-1)(q-1)}\), w is a primitive 2n-th root of unity, G∗ is the conjugate transpose of G and \(\lambda _j(x)={\sum }_{k=1}^{n}x_kw^{(2j-1)(k-1)}\).
According to Theorem 3, the matrices E and S in this RLWE channel can be replaced by their diagonalizations, whose diagonal elements are λ(e) and λ(s), respectively.
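As a numerical sanity check of the diagonalization in Theorem 3, the following sketch builds a small negacyclic matrix and compares it with the product \(G^{*}\,\mathrm{diag}(\lambda_1(x),\ldots,\lambda_n(x))\,G\), which is the form of the decomposition assumed here. It takes N(x) to represent multiplication by x modulo \(X^n + 1\) and uses \(w = e^{i\pi/n}\) as the primitive 2n-th root of unity; the function names are illustrative and not from the paper.

```python
import cmath


def negacyclic_matrix(x):
    """n x n matrix whose column j holds the coefficients of x(X) * X^j mod X^n + 1."""
    n = len(x)
    return [[x[i - j] if i >= j else -x[n + i - j] for j in range(n)]
            for i in range(n)]


def lambdas(x):
    """lambda_j(x) = sum_k x_k w^{(2j-1)(k-1)}, written with 0-based indices."""
    n = len(x)
    w = cmath.exp(1j * cmath.pi / n)
    return [sum(x[k] * w ** ((2 * j + 1) * k) for k in range(n))
            for j in range(n)]


def reconstruct(x):
    """Evaluate G^* diag(lambda(x)) G entry by entry."""
    n = len(x)
    w = cmath.exp(1j * cmath.pi / n)
    G = [[w ** ((2 * p + 1) * q) / n ** 0.5 for q in range(n)] for p in range(n)]
    lam = lambdas(x)
    return [[sum(G[p][a].conjugate() * lam[p] * G[p][b] for p in range(n))
             for b in range(n)]
            for a in range(n)]


def max_abs_diff(A, B):
    """Largest entry-wise deviation between two matrices."""
    return max(abs(a - b) for ra, rb in zip(A, B) for a, b in zip(ra, rb))


if __name__ == "__main__":
    x = [1, 2, 3, 4]
    print(negacyclic_matrix(x))
    print(max_abs_diff(negacyclic_matrix(x), reconstruct(x)))
```

The deviation stays at the level of floating-point rounding, which reflects that the rows of G evaluate polynomials at the roots of \(X^n + 1\), where multiplication modulo \(X^n + 1\) becomes coordinate-wise.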
Appendix C Calculating the SNR of tailored RLWE channel model
SNR is defined as SNR = Psignal/Pnoise, where Psignal and Pnoise denote the signal and noise power, respectively. In the polynomial basis, the channel model of RLWE-based PKE adds the noise term \(e \cdot t - s \cdot e_1 + e_2\) to the modulated message, where \(e, t, s, e_1, e_2\) are polynomials in \(\frac {\mathbb {Z}_q[x]}{x^n+1}\) whose coordinates are independently drawn from a spherical normal distribution \(\mathcal {N}(0,\sigma ^2\mathbb {I})\). The multiplication of two polynomials can be interpreted as the convolution of their coordinates, giving rise to n parallel and correlated channels. If we set the message m to be in \(\lbrace -\lfloor \frac {q}{4}\rfloor , \lfloor \frac {q}{4}\rfloor \rbrace ^n\), the SNR is roughly \(\frac {q^2}{16(2n\sigma ^4 + \sigma ^2)}\).
In the tailored RLWE channel model in formulas (8) and (9), polynomial convolutions are transformed into coordinate-wise multiplications in the canonical basis. Since we shrink the constellation diagram as described in Section 3.2, we set the message m to be in \(\lbrace -\lfloor \frac {q}{4\sqrt {2}}\rfloor ,\lfloor \frac {q}{4\sqrt {2}}\rfloor \rbrace ^n\). The channel gain Hi and the Gaussian noise Zi are independent, and their distributions are discussed in detail in formula (8). The resulting SNR is \(\frac {q^2}{32(n^2\sigma ^4+n\sigma ^2/2)}\).
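The two SNR expressions can be evaluated directly. The sketch below simply codes the formulas stated above, ignoring the floor operations as the text's "roughly" suggests; the function names are illustrative, and the value of σ used in the example is an assumption (it is set to \(\sqrt{8} \approx 2.83\), on the guess that this matches the noise parameter of the n = 1024, q = 12289 setting quoted in the conclusions).

```python
import math


def snr_polynomial_basis(q, n, sigma):
    """SNR ~ q^2 / (16 (2 n sigma^4 + sigma^2)).

    Signal power (q/4)^2; each coefficient of e*t and s*e1 is a sum of n
    products of independent N(0, sigma^2) terms, plus one term from e2.
    """
    return q ** 2 / (16 * (2 * n * sigma ** 4 + sigma ** 2))


def snr_tailored(q, n, sigma):
    """SNR = q^2 / (32 (n^2 sigma^4 + n sigma^2 / 2)) for the tailored model."""
    return q ** 2 / (32 * (n ** 2 * sigma ** 4 + n * sigma ** 2 / 2))


def to_db(snr):
    """Convert a linear power ratio to decibels."""
    return 10 * math.log10(snr)


if __name__ == "__main__":
    q, n = 12289, 1024
    sigma = math.sqrt(8)          # assumed value, see the note above
    for name, f in (("polynomial", snr_polynomial_basis), ("tailored", snr_tailored)):
        value = f(q, n, sigma)
        print(name, value, to_db(value))
```

The two values are not directly comparable as decoding figures of merit, since the tailored model also provides the channel gain to the decoder.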
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Wang, J., Ling, C. Polar coding for Ring-LWE-based public key encryption. Cryptogr. Commun. 15, 397–431 (2023). https://doi.org/10.1007/s12095-022-00607-1
DOI: https://doi.org/10.1007/s12095-022-00607-1