Polar coding for Ring-LWE-based public key encryption

Wang, Jiabo; Ling, Cong

doi:10.1007/s12095-022-00607-1

Open access
Published: 24 September 2022

Volume 15, pages 397–431, (2023)

Cryptography and Communications


Abstract

The ring learning with errors (RLWE) problem can be used to construct efficient post-quantum public key encryption schemes. An error distribution, normally a Gaussian-like distribution, is involved in the RLWE problem. In this work we focus on using polar codes to alleviate a natural trade-off present in RLWE public key encryption schemes; namely, we would like a wider error distribution to increase security, but a wider error distribution comes at the cost of an increased probability of decryption error. The motivation of this work is to improve the bit-security level by using wider error distribution while keeping the target decryption failure rate achievable. The approach we proposed in this work is twofold. Firstly, we formulate RLWE public key encryption as a channel model with some noise terms known by the decoder. This makes our approach distinguished from existing research of this kind in the literature which ignores these known terms. Secondly, we design polar codes for the derived channel model. Theoretically and numerically, we show the proposed modeling and polar coding scheme contributes to a considerable bit-security level improvement compared with NewHope, a submission to National Institute of Standards and Technology (NIST), with almost the same parameters. Moreover, polar encoding and decoding support isochronous implementations in the sense that the timings of associated operations are irrelevant to the sensitive information.


1 Introduction

1.1 Error correction for Ring-LWE-based public key encryption

As the world’s top tech companies and research labs compete in the race to build a quantum computer, real-world public key cryptography, such as digital signatures, public key encryption (PKE), and key exchange protocols, must be made quantum resistant. The ring learning with errors (RLWE) problem was introduced in [20] in 2010, expanding on the classical learning with errors (LWE) problem introduced by Regev in [30]. Since then, cryptography based on the RLWE problem has become one of the most attractive post-quantum candidates. Its security relies on the worst-case approximate shortest independent vector problem (SIVP) on ideal lattices, and the ring structure makes it more efficient than plain LWE. Many of the prominent submissions to the National Institute of Standards and Technology’s (NIST) call for proposals [26], for example NewHope [4] and LAC ^{Footnote 1} [19], are based on RLWE. Though neither of the two advanced to NIST’s third round, academic and industrial study of RLWE cryptography and its applications continues. In this work, we focus on the issue of error correction for RLWE-based public key encryption.

Among the RLWE-based public key exchange protocols, there are essentially two major approaches to sharing a session key that protects the communication payload: the reconciliation approach of [10] and the encryption approach of [3]. In the first approach, both participants agree on a shared value from some pseudorandom signals with the help of a robust extractor. This work focuses on the second approach, which resembles the compact RLWE public key encryption scheme proposed in [21]. Taking NewHope as an example, the binary secret to be shared is encoded using a repetition code, mapped to {0,⌊q/2⌋}ⁿ and then wrapped by an encryption function. A residue noise term remains after decryption. Upon getting a decrypted codeword, the decoder sums up the symbols corresponding to the repeated digits and infers whether the digit is 0 or 1 according to a threshold. In telecommunication terms, this is a hard-decision decoding process, which cannot offer optimal decoding performance. Decreasing the decryption failure rate (DFR) is considered vitally important for RLWE-based PKE. Firstly, if we seek chosen-ciphertext attack (CCA) security of the above cryptosystem using the classical Fujisaki-Okamoto transform [14], a very low DFR is required, and the NIST standardization targets a failure rate lower than $2^{-128}$. Secondly, more capable error correction allows larger error terms in RLWE, increasing the hardness of the underlying lattice problem and therefore the security of the cryptosystem.

To improve the error correction and security of RLWE-based PKE, some researchers have exploited the goodness of multidimensional lattices. For example, Leech lattice encoding and decoding are used in LWE-based PKE [28]. The Leech lattice gives the densest sphere packing of its dimension, which means a good trade-off between information transmission rate and error correction capability. An alternative way to decrease the DFR is to apply error-correcting codes (ECC). In [13], Fritzmann et al. considered how much the RLWE-based PKE protocol NewHope Simple could profit, in terms of DFR, from Bose–Chaudhuri–Hocquenghem (BCH) codes, low-density parity-check (LDPC) codes, and a hybrid of the two. They achieved a DFR of $2^{-140}$ using these codes, but their decoding algorithms were not intrinsically constant-time, although this defect could likely be managed by proper measures (e.g., fixing the number of iterations). In an independent line of work, Saarinen designed a linear block code called XE5 and implemented it in an RLWE-based PKE scheme called Hila5 [31] ^{Footnote 2}. This method is able to share 256 bits of message and an additional 240 bits of redundancy at a DFR below $2^{-128}$. The decoding algorithm runs in constant time, which provides resilience to timing-based side-channel attacks.

How to deal with the dependency in the residue noise term of RLWE-based PKE is closely related to the soundness of DFR estimation [9, 13, 32]. For example, in the ring of integers of a cyclotomic field taken modulo q, $\mathbb{Z}_{q}[X]/(X^{n}+1)$, the multiplication, denoted by ⋅, of two ring elements results in a polynomial with correlated coefficients. In RLWE-based PKE, the residue noise e ⋅ t − s ⋅ e₁ + e₂ (see Section 2.2) has correlated coefficients, so that an incorrect decryption of one bit may increase the probability of decryption error for other bits. As a result, the DFR estimation will be inaccurate if we assume the noise term has independent and identically distributed (i.i.d.) coefficients [9]. Moreover, advanced decoding algorithms (e.g. soft-decision decoding) presume an i.i.d. channel, which is why we seek an i.i.d. noise model. Several “independence” assumptions appear in the literature. Fritzmann et al. gave upper bounds on the DFR of their error-correcting codes assuming that the coefficients of the residue noise can be seen as independent [13]. They improved the bit-security of NewHope to 309 bits for n = 1024, q = 12289, targeting a DFR of $2^{-140}$. D’Anvers et al. assumed that the residue noise has independent coefficients conditional on its norm. This method was used to deal with ternary error terms but it is impractical for true discrete Gaussian errors [9]. Song et al. formulated NewHope as a digital communication system and resolved part of the dependency. They improved the bit-security to 252 bits (n = 1024, q = 12289), also targeting a DFR of $2^{-140}$ [32].

1.2 Originality and contribution

Distinguished from existing error-correcting schemes for RLWE-based PKE in the literature, the originality of the proposed polar coding scheme is threefold.

1. Firstly, we take advantage of the fact that s and e in the residue noise e ⋅ t − s ⋅ e₁ + e₂ are known by Alice at the decryption stage. This is leveraged to improve decoding performance. To this end, this work formalizes the mathematical model of RLWE PKE as a fading channel with channel state information (CSI) available. Existing works (e.g. [32]) also treat RLWE PKE as a telecommunication system, but they do not exploit the knowledge of s and e, which are seen as CSI in this work.
2. Secondly, we resolve the correlation between the coefficients of the residue noise using the canonical embedding, under which polynomial multiplications are turned into coordinate-wise multiplications ^{Footnote 3}, and we derive an i.i.d. channel model in the end. This allows us to carry out soft-decision decoding and accurate DFR estimation.
3. Thirdly, in addition to providing an error-correcting approach, polar codes exhibit some salient features. Compared with BCH and LDPC codes, polar codes provide competitive and well-understood decoding performance limits measured by the Bhattacharyya parameter. Moreover, their decoding is not affected by error floors [23]. Besides, the encoding and decoding of polar codes are isochronous in the sense that the timings of encoding and decoding are independent of the secret and the plaintext.

The contributions of this paper are summarized as follows.

1. We formulate the RLWE-based PKE as an i.i.d. fading channel with CSI available to the receiver without any “independence” assumptions. These are the prerequisites of the proposed polar coding scheme.
   (a) As explained earlier in this section, the coefficient correlation of the residue noise term e ⋅ t − s ⋅ e₁ + e₂ is removed by the canonical embedding, leading to an i.i.d. channel model. We view e and s as CSI, which is known by Alice at the decryption stage, whilst Bob on the other side only knows its distribution.
   (b) In telecommunication terms, mapping a single plaintext bit 0 or 1 to a symbol of the constellation {0,⌊q/2⌋} is called modulation. To make the modulation scheme fit the i.i.d. fading channel in the canonical basis, we propose a new modulation scheme at the cost of error tolerance.
2. We then give the explicit construction of polar codes for the RLWE-based PKE channel model. Experimental results and theoretical estimates of the DFR are also given. Specifically, we derive a new DFR of $2^{-298}$ for q = 12289, n = 1024, r = 2 ($r=\sqrt {k/2}$) and code rate 0.25, while NewHope gives a DFR of $2^{-216}$ in the same setting; we derive a new DFR of $2^{-156}$ for r = 2.83 (k = 16) and code rate 0.25, while NewHope is shown to give a DFR of $2^{-137}$ in almost the same setting [32]. Thanks to the new DFR margin, the proposed RLWE-based PKE achieves a better bit-security level than NewHope while achieving the same target DFR. Besides, the encoding and decoding of polar codes support quasi-linear (i.e., $O(n\log n)$, where n is the degree of the cyclotomic field of RLWE) and isochronous implementations, which will be discussed in detail in Section 7.2.

1.3 Roadmap

This paper is organized as follows. A review of the necessary algebraic number theory, fading channels and polar codes can be found in Section 2. In Section 3 we explain how to formulate a typical RLWE-based PKE scheme as an i.i.d. fading channel. How to handle the dependency in canonical basis is also demonstrated. Section 4 gives a high-level description of RLWE-based PKE with the proposed polar coding scheme. Section 5 gives the explicit construction of polar codes for RLWE. Section 6 analyzes the DFR theoretically and experimentally when polar coding is applied. Section 7 discusses the bit-security improvement derived by the new DFR margin as well as the isochrony of polar codes. Section 8 concludes this paper.

2 Preliminaries

2.1 Algebraic number theory

We review the necessary concepts from algebraic number theory required for our discussion of ring-LWE. In particular, we will relate many of our definitions to power-of-two cyclotomic fields, which are popular in modern cryptography.

A number field $K= \mathbb {Q}(\zeta )$ can be defined by adjoining an element $\zeta \in \mathbb {C}$ to the field $\mathbb {Q}$, where ζ satisfies f(ζ) = 0 for some irreducible polynomial $f(X) \in \mathbb {Q}[X]$. Then, the degree of K over $\mathbb {Q}$ is precisely the degree n of f(X). Because f(ζ) = 0, K can be seen as a vector space over $\mathbb {Q}$ endowed with a basis $\{1,\zeta,\ldots,\zeta^{n-1}\}$ known as the power basis of K. Let ζ_m be a primitive m-th complex root of unity with minimal polynomial

$$\begin{array}{@{}rcl@{}} f(X)=\prod\limits_{i\in\mathbb{Z}^{*}_{m}}(X-{\zeta_{m}^{i}})\in\mathbb{Z}[X], \end{array}$$

where $\mathbb {Z}^{*}_{m}$ is the group of invertible elements in $\mathbb {Z}_{m}$. Then, the m-th cyclotomic number field is defined as $K=\mathbb {Q}(\zeta _{m})$. When m ≥ 2 is a power of two, f(X) = Xⁿ + 1 and n = m/2.

A number field K of degree n permits n distinct ring embeddings $\sigma _{i}: K \rightarrow \mathbb {C}, i =1,...,n,$ which correspond to n automorphisms of K mapping ζ to each root of its minimal polynomial f(X). The n embeddings include s₁ real embeddings and s₂ pairs of complex conjugate embeddings. The concatenation of the n embeddings is called canonical embedding σ(⋅) which is a map from K into the space

$$H = \lbrace (x_{1},...,x_{n}) \in \mathbb{R}^{s_{1}} \times \mathbb{C}^{2 s_{2}} \vert x_{s_{1}+j} = \overline{x_{s_{1}+s_{2} +j}},~\forall 1 \leq j \leq s_{2} \rbrace \subseteq \mathbb{C}^{n}.$$

For power-of-two cyclotomics, s₁ = 0. Because the complex embeddings come in pairs of conjugates, H is isomorphic to $\mathbb {R}^{n}$. We also remark that under the embedding σ multiplication in K maps to coordinate-wise multiplication in H.
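The last remark can be checked numerically on a toy example. The following is a minimal sketch, assuming n = 8 and small integer coefficients; the helper names `negacyclic_mul` and `canonical_embedding` are hypothetical and introduced only for illustration. It multiplies two polynomials modulo Xⁿ + 1, embeds the product, and compares the result with the coordinate-wise product of the two embeddings.

```python
import cmath

def negacyclic_mul(a, b):
    """Multiply two polynomials modulo X^n + 1 (coefficient lists, lowest degree first)."""
    n = len(a)
    c = [0] * n
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            if i + j < n:
                c[i + j] += ai * bj
            else:
                c[i + j - n] -= ai * bj  # X^n = -1 flips the sign of wrapped terms
    return c

def canonical_embedding(a):
    """Evaluate a(X) at the n roots zeta^(2i+1), i = 0..n-1, of X^n + 1."""
    n = len(a)
    zeta = cmath.exp(1j * cmath.pi / n)  # primitive m-th root of unity with m = 2n
    return [sum(c * zeta ** ((2 * i + 1) * j) for j, c in enumerate(a))
            for i in range(n)]

a = [1, 2, 0, -1, 3, 0, 1, -2]
b = [0, 1, -1, 2, 0, 0, 1, 1]

embedded_product = canonical_embedding(negacyclic_mul(a, b))
product_of_embeddings = [x * y for x, y in
                         zip(canonical_embedding(a), canonical_embedding(b))]

# Largest coordinate-wise discrepancy; it should be at floating-point noise level.
max_gap = max(abs(x - y) for x, y in zip(embedded_product, product_of_embeddings))
```

The discrepancy `max_gap` is only rounding error, which reflects that each embedding σ_i is a ring homomorphism.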

Let $\mathcal {O}_{K}$ be the set of all algebraic integers in K. It forms a ring, called the ring of integers of the number field. For the above power-of-two cyclotomics, the ring of integers is $\mathcal {O}_{K}=\mathbb {Z}[X]/(1+X^{n})$, and the canonical embedding maps $\mathcal {O}_{K}$ to an algebraic lattice in the space H whose generator matrix is

$$B=\left(\begin{array}{ccccc} 1 & \zeta_{m} & {\zeta_{m}^{2}} &{\cdots} & \zeta_{m}^{n-1} \\ 1 & {\zeta_{m}^{3}} & {\zeta_{m}^{6}} &{\cdots} & \zeta_{m}^{3(n-1)}\\ 1 & {\zeta_{m}^{5}} & \zeta_{m}^{10} &{\cdots} & \zeta_{m}^{5(n-1)}\\ {\vdots} & {\vdots} & {\vdots} & {\vdots} & {\vdots} \\ 1 & \zeta_{m}^{(2n-1)} & \zeta_{m}^{(2n-1)2} & {\cdots} & \zeta_{m}^{(2n-1)(n-1)} \end{array} \right)_{n\times n}.$$

Moreover, because of the conjugate pairs of the embeddings, we can rewrite σ as $\sigma ^{\prime }: K\rightarrow \mathbb {R}^{n}$

$$\sigma^{\prime}(x) = (\Re[\sigma_{1}(x)],\Im[\sigma_{1}(x)],...,\Re[\sigma_{n/2}(x)],\Im[\sigma_{n/2}(x)]).$$

And the corresponding basis $\tilde {B}$ of the images of the mapping is

$$\tilde{B}=\left(\begin{array}{ccccc} 1 & \Re[\zeta_{m}] & {\Re[\zeta_{m}^{2}}] &{\cdots} & \Re[\zeta_{m}^{n-1}] \\ 0 & \Im[\zeta_{m}] & {\Im[\zeta_{m}^{2}}] &{\cdots} & \Im[\zeta_{m}^{n-1}] \\ 1 & {\Re[\zeta_{m}^{3}}] & {\Re[\zeta_{m}^{6}}] &{\cdots} & \Re[\zeta_{m}^{3(n-1)}]\\ 0 & {\Im[\zeta_{m}^{3}}] & {\Im[\zeta_{m}^{6}}] &{\cdots} & \Im[\zeta_{m}^{3(n-1)}]\\ 1 & {\Re[\zeta_{m}^{5}}] & \Re[\zeta_{m}^{10}] &{\cdots} & \Re[\zeta_{m}^{5(n-1)}]\\ 0 & {\Im[\zeta_{m}^{5}}] & \Im[\zeta_{m}^{10}] &{\cdots} & \Im[\zeta_{m}^{5(n-1)}]\\ {\vdots} & {\vdots} & {\vdots} & {\vdots} & {\vdots} \\ 1 & \Re[\zeta_{m}^{(n-1)}] & \Re[\zeta_{m}^{(n-1)2}] & {\cdots} & \Re[\zeta_{m}^{(n-1)(n-1)}]\\ 0 & \Im[\zeta_{m}^{(n-1)}] & \Im[\zeta_{m}^{(n-1)2}] & {\cdots} & \Im[\zeta_{m}^{(n-1)(n-1)}] \end{array} \right)_{n\times n}.$$

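Note that both B and $\tilde {B}$ are orthogonal up to scaling: $B^{*}B=nI$ and $\tilde{B}^{T}\tilde{B}=\frac{n}{2}I$, so their columns are mutually orthogonal but not of unit norm. Accordingly, the determinant of B has absolute value $\sqrt {n}^{n}$ while that of $\tilde {B}$ has absolute value $(\sqrt {n/2})^{n}$.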
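The scaling of $\tilde{B}$ can be verified directly. The sketch below is an illustration under the assumption n = 8 (the helper names are hypothetical): it builds $\tilde{B}$ row by row from the real and imaginary parts of the powers of ζ_m for the odd exponents 1, 3, ..., n − 1, and checks that its Gram matrix is n/2 times the identity, which is consistent with a determinant of absolute value $(\sqrt{n/2})^{n}$.

```python
import math

def real_embedding_basis(n):
    """Rows alternate Re and Im of zeta_m^(a*j) for odd a = 1, 3, ..., n-1 and j = 0..n-1."""
    rows = []
    for a in range(1, n, 2):
        angle = math.pi * a / n  # zeta_m = exp(i*pi/n) because m = 2n
        rows.append([math.cos(angle * j) for j in range(n)])
        rows.append([math.sin(angle * j) for j in range(n)])
    return rows

def gram(matrix):
    """Return M^T M for a square matrix given as a list of rows."""
    n = len(matrix)
    return [[sum(matrix[r][i] * matrix[r][j] for r in range(n)) for j in range(n)]
            for i in range(n)]

n = 8
G = gram(real_embedding_basis(n))

# Off-diagonal entries should vanish and diagonal entries should equal n/2.
off_diag = max(abs(G[i][j]) for i in range(n) for j in range(n) if i != j)
diag_gap = max(abs(G[i][i] - n / 2) for i in range(n))
```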

2.2 Ring-LWE public key encryption scheme and the coefficient dependency

For concreteness, we give an example of a public key scheme based on ring-LWE which was first described in [21]. Many ring-LWE schemes and protocols including NewHope closely resemble this one. The scheme is parameterized by an integer modulus q, dimension n, and error distribution χ over R_q. We will take the example of NewHope and view R_q as $\mathbb{Z}_{q}[X]/(X^{n}+1)$ and define sampling from χ to be sampling each coefficient of a polynomial from the discrete Gaussian over $\mathbb {Z}$. The scheme proceeds as follows.

Alice samples a secret key $s \leftarrow \chi$ and publishes as a public key a ring-LWE sample (a,b) = (a,a ⋅ s + e) ∈ R_q × R_q, where a is uniformly random and $e \leftarrow \chi$.
Bob encrypts a message m ∈ R₂ as $(c_{1}, c_{2}) = (a \cdot t + e_{1}, b \cdot t + e_{2} + \lfloor \frac {q}{2} \rfloor \cdot m)$, where e₁,e₂,t are sampled independently from χ.
Alice decrypts using s by $d := c_{2} - c_{1} \cdot s = \lfloor \frac {q}{2} \rfloor \cdot m +e \cdot t - s\cdot e_{1} + e_{2}$.

Alice then recovers the message m by decoding: if the i-th coordinate of d is closer to 0 than to ⌊q/2⌋, Alice assumes the i-th coordinate of m was 0; otherwise she assumes it was 1.
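The three steps and the threshold decoding can be sketched in a few lines. This is a toy illustration, not NewHope itself: it assumes a small dimension N = 16, uses a centered binomial sampler as a stand-in for the discrete Gaussian χ, encodes one message bit per coefficient without any error-correcting code, and uses schoolbook multiplication. All function names are hypothetical.

```python
import random

Q = 12289  # NewHope modulus
N = 16     # toy dimension, far below the n = 1024 used in practice

def poly_mul(a, b):
    """Multiply two elements of Z_q[X]/(X^N + 1) given as coefficient lists."""
    c = [0] * N
    for i in range(N):
        for j in range(N):
            if i + j < N:
                c[i + j] = (c[i + j] + a[i] * b[j]) % Q
            else:  # X^N = -1 wraps the term around with a sign flip
                c[i + j - N] = (c[i + j - N] - a[i] * b[j]) % Q
    return c

def poly_add(a, b):
    return [(x + y) % Q for x, y in zip(a, b)]

def poly_sub(a, b):
    return [(x - y) % Q for x, y in zip(a, b)]

def sample_error(rng, k=2):
    """Centered binomial sample in [-k, k]: an assumed stand-in for chi."""
    return [sum(rng.getrandbits(1) - rng.getrandbits(1) for _ in range(k)) % Q
            for _ in range(N)]

def keygen(rng):
    a = [rng.randrange(Q) for _ in range(N)]
    s, e = sample_error(rng), sample_error(rng)
    return (a, poly_add(poly_mul(a, s), e)), s  # public key (a, b), secret key s

def encrypt(pk, m, rng):
    a, b = pk
    t, e1, e2 = sample_error(rng), sample_error(rng), sample_error(rng)
    c1 = poly_add(poly_mul(a, t), e1)
    scaled = [(Q // 2) * bit for bit in m]
    c2 = poly_add(poly_add(poly_mul(b, t), e2), scaled)
    return c1, c2

def decrypt(s, ct):
    c1, c2 = ct
    d = poly_sub(c2, poly_mul(c1, s))  # floor(q/2)*m + e*t - s*e1 + e2
    # Decide 0 when the coefficient is closer to 0 (mod q) than to floor(q/2).
    return [0 if min(x, Q - x) < abs(x - Q // 2) else 1 for x in d]

rng = random.Random(2022)
pk, sk = keygen(rng)
message = [rng.getrandbits(1) for _ in range(N)]
recovered = decrypt(sk, encrypt(pk, message, rng))
```

With these toy parameters every coefficient of the residue noise is bounded by 130 in absolute value, far below q/4, so decryption cannot fail; realistic parameters trade this margin for security, which is the trade-off studied in this paper.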

We find the dependency between the coefficients of the residue noise term e ⋅ t − s ⋅ e₁ + e₂ obvious if we rewrite it in vector form using coefficient embedding of R_q, i.e.,

$$\begin{pmatrix}e^{(0)}&-e^{(n-1)}&\cdots&-e^{(1)}\\e^{(1)}&e^{(0)}&\cdots&-e^{(2)}\\\vdots&\vdots&\ddots&\vdots\\e^{(n-1)}&e^{(n-2)}&\cdots&e^{(0)}\end{pmatrix}\;\mathbf t-\begin{pmatrix}s^{(0)}&-s^{(n-1)}&\cdots&-s^{(1)}\\s^{(1)}&s^{(0)}&\cdots&-s^{(2)}\\\vdots&\vdots&\ddots&\vdots\\s^{(n-1)}&s^{(n-2)}&\cdots&s^{(0)}\end{pmatrix}\;{\mathbf e}_1+{\mathbf e}_2,$$

where e⁽ⁱ⁾ is the i-th coefficient of the polynomial e and $\mathbf t$ is the coefficient embedding of the polynomial t. The row vectors of the negacyclic matrix generated by e have identical norm and are all multiplied by the same vector $\mathbf t$; the same holds for s and e₁.
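The matrix form can be made concrete with a small sketch. It assumes n = 4 and arbitrary small integer coefficients, and the helper names are hypothetical. It builds the negacyclic matrix of e, checks that multiplying it by the coefficient vector of t reproduces the polynomial product modulo Xⁿ + 1, and confirms that all rows share the same norm.

```python
def negacyclic_matrix(e):
    """Matrix whose column j holds the coefficients of X^j * e(X) mod X^n + 1."""
    n = len(e)
    return [[e[i - j] if i >= j else -e[n + i - j] for j in range(n)]
            for i in range(n)]

def mat_vec(m, v):
    return [sum(x * y for x, y in zip(row, v)) for row in m]

def negacyclic_mul(a, b):
    """Schoolbook product of two polynomials modulo X^n + 1."""
    n = len(a)
    c = [0] * n
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            if i + j < n:
                c[i + j] += ai * bj
            else:
                c[i + j - n] -= ai * bj
    return c

e = [1, -2, 0, 3]
t = [2, 1, -1, 0]

M = negacyclic_matrix(e)
product_via_matrix = mat_vec(M, t)
product_via_poly = negacyclic_mul(e, t)

# Every row is a signed permutation of e, so all squared row norms coincide.
row_norms_sq = [sum(x * x for x in row) for row in M]
```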

2.3 Fading channel

In wireless communications, a fading channel arises due to a time-varying attenuation of signal quality caused by either the propagation environment or by movement of the transmitter/receiver. We consider a discrete-time fading channel model W

$$y_{i} = h_{i}x_{i}+z_{i},~~~i=1,\cdots,N,$$

where h_i is the channel gain, z_i is additive white Gaussian noise (AWGN) and N is the signal length. We highlight two facts about CSI which are relevant to the RLWE channel model we will discuss in Section 3. Firstly, a few consecutive h_i may be correlated and this period is called coherence interval of a fading channel W denoted by T_c. In the context of a fading channel with memory, the channel gain h_i is believed to be a constant within one coherence interval and varies independently as the next coherence interval approaches. Secondly, the realization of h_i is called channel state information (CSI) and the distribution of h_i is called channel distribution information (CDI). CSI sometimes is known to the decoder.
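A block-fading instance of this model can be simulated as follows. This is only a sketch: the gain distribution (absolute value of a standard normal), the BPSK input and the function name `block_fading_channel` are assumptions made for illustration and are not taken from the paper.

```python
import random

def block_fading_channel(x, coherence, noise_std, rng):
    """Return (y, h) with y_i = h_i * x_i + z_i; h is redrawn every `coherence` symbols."""
    y, h = [], []
    gain = 0.0
    for i, xi in enumerate(x):
        if i % coherence == 0:
            # Assumed gain distribution, for illustration only.
            gain = abs(rng.gauss(0.0, 1.0))
        h.append(gain)
        y.append(gain * xi + rng.gauss(0.0, noise_std))  # additive Gaussian noise z_i
    return y, h

rng = random.Random(0)
symbols = [1.0 if rng.getrandbits(1) else -1.0 for _ in range(12)]  # BPSK input

# Coherence interval T_c = 4: the gain is constant within each block of 4 symbols.
y_block, h_block = block_fading_channel(symbols, 4, 0.1, rng)

# Coherence interval T_c = 1: every symbol sees an independent gain (i.i.d. fading).
y_iid, h_iid = block_fading_channel(symbols, 1, 0.1, rng)
```

A receiver with CSI would be given `h_block` (or `h_iid`) together with the channel output, whereas a receiver with only CDI would know just the distribution the gains were drawn from.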

When designing a telecommunication system, we prefer i.i.d. fading channels where the h_i are independent. There are a few methods to deal with the correlation. Let m = T_c > 1 and N/m = n. Since a fading channel with coherence interval T_c can be seen as m parallel sub-channels, a bit-interleaved coded modulation (BICM) technique can be used to handle the correlation between sub-channels [7, 22]. Another solution is to use multilevel codes [11] to design a coded modulation scheme with signal points in an m-dimensional signal space. In [18], a properly chosen lattice partition chain $\Lambda_{1}/\cdots/\Lambda_{l-1}/\Lambda_{l}$ is employed to design multilevel polar codes that achieve the fading channel capacity. In this case, the dimension m of Λ₁ is chosen such that the channel gain h_i can be assumed constant throughout the transmission of m symbols, i.e. T_c = m. A component code $\mathcal {C}_{i}$ at the i-th level of the partition chain is designed to achieve the capacity of a $\Lambda_{i}/\Lambda_{i+1}$ fading channel. The component codes are combined by construction D, giving rise to a lattice. More information about the multilevel construction and the $\Lambda_{i}/\Lambda_{i+1}$ channel can be found in [18] and [11]. We give an example of a mod $\mathbb {Z}$ channel and a $\mathbb {Z}/2\mathbb {Z}$ channel as follows; the fading version will be given in Section 3.

Example 1

A mod $\mathbb {Z}$ channel is an AWGN channel with input restricted to $a\in \mathcal {V}(\mathbb {Z})$ where $\mathcal {V}(\mathbb {Z})$ is the fundamental region ^{Footnote 4} of $\mathbb {Z}$. At the receiver’s end, there is a mod $\mathcal {V}(\mathbb {Z})$ operation giving the equivalent channel output as

$$y=a+z~\text{mod}~\mathbb{Z}=(a+z^{\prime})~\text{mod}~\mathbb{Z},$$

where z is an AWGN noise and $z^{\prime }=z~\text {mod}~\mathbb {Z}$.

Example 2

A $\mathbb {Z}/2\mathbb {Z}$ channel is an AWGN channel with input restricted to $r\in (\mathbb {Z}+a)\cap \mathcal {V}(2\mathbb {Z})$ for some offset $a\in \mathbb {R}$. At the receiver’s end, the equivalent channel output is

$$y=r+z~\text{mod}~2\mathbb{Z}= r+z^{\prime}~\text{mod}~2\mathbb{Z},$$

where z is an AWGN noise and $z^{\prime }=z~\text {mod}~2\mathbb {Z}$. It can be viewed as a mod $2\mathbb {Z}$ channel with input restricted to a set of elements of $\mathbb {Z}+a$ that fall in $\mathcal {V}(2\mathbb {Z})$.

In the special case of T_c = 1, channel W is referred to as an i.i.d. fading channel. The design and performance of error-correcting codes for i.i.d. fading channels with/without CSI is well studied [6, 36]. In [18], Liu et al. proposed a polar coding scheme for i.i.d. fading channels to achieve the ergodic capacity. Unlike previous work of [6] in which CSI is given to both ends of communication, in Liu et al.’s scheme CSI is only known to the receiver which is more feasible in practice.

2.4 Polar codes

Polar codes, introduced by Arıkan in [5], are linear block codes of length $n = 2^{l}$ for a positive integer l that achieve the capacity of any binary-input discrete memoryless symmetric (BDMS) channel asymptotically ^{Footnote 5}. We first review some basics of polar codes for a BDMS channel. A binary-input channel W is symmetric if there exists a permutation π of the output alphabet $\mathcal {Y}$ such that W(y|1) = W(π(y)|0) and $\pi^{-1} = \pi$ for $y\in \mathcal {Y}$. Given a BDMS channel W, there are two commonly used metrics in information theory to measure the quality of W: the mutual information^{Footnote 6} and the reliability.

Definition 1 (Mutual information of BDMS channels)

The mutual information I(W) ∈ [0,1] of a BDMS channel $W:\mathcal {X}\rightarrow \mathcal {Y}$ is the maximum rate at which information can be successfully transmitted from the transmitter to the receiver. We define I(W) as

$$I(W)\triangleq\sum\limits_{y\in\mathcal{Y}}\sum\limits_{x\in\mathcal{X}}\frac{1}{2}W(y|x)\log\frac{W(y|x)}{\frac{1}{2}W(y|0)+\frac{1}{2}W(y|1)}.$$

Here, we use the definition of symmetric mutual information assuming uniform channel input, which is also the capacity of the BDMS channel. We use the notations I(W) and I(Y ;X) interchangeably to denote the mutual information of W.

Definition 2 (Bhattacharyya parameter of BDMS channels)

The Bhattacharyya parameter Z(W) ∈ [0,1] is a measure of channel reliability for a BDMS channel W defined as

$$Z(W)\triangleq\sum\limits_{y\in\mathcal{Y}}\sqrt{W(y|0)W(y|1)},$$

where a small Z(W) indicates a more reliable channel while a large Z(W) implies a channel with more interference.
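Both metrics are easy to evaluate for channels with a finite output alphabet. The sketch below is illustrative: it represents a channel as a nested dictionary `W[x][y]` holding W(y|x) (a representation chosen here, not prescribed by the paper), takes the logarithm in base 2 so that I(W) lies in [0,1], and evaluates Definitions 1 and 2 for a binary symmetric channel (BSC) and a binary erasure channel (BEC).

```python
import math

def mutual_information(W):
    """Symmetric mutual information in bits; W[x][y] is the transition probability W(y|x)."""
    total = 0.0
    for y in W[0]:
        avg = 0.5 * W[0][y] + 0.5 * W[1][y]
        for x in (0, 1):
            p = W[x][y]
            if p > 0:  # terms with W(y|x) = 0 contribute nothing
                total += 0.5 * p * math.log2(p / avg)
    return total

def bhattacharyya(W):
    """Bhattacharyya parameter Z(W) = sum_y sqrt(W(y|0) W(y|1))."""
    return sum(math.sqrt(W[0][y] * W[1][y]) for y in W[0])

def bsc(p):
    """Binary symmetric channel with crossover probability p."""
    return {0: {0: 1 - p, 1: p}, 1: {0: p, 1: 1 - p}}

def bec(eps):
    """Binary erasure channel with erasure probability eps; '?' marks an erasure."""
    return {0: {0: 1 - eps, "?": eps, 1: 0.0},
            1: {0: 0.0, "?": eps, 1: 1 - eps}}

i_bec, z_bec = mutual_information(bec(0.3)), bhattacharyya(bec(0.3))
i_bsc, z_bsc = mutual_information(bsc(0.11)), bhattacharyya(bsc(0.11))
```

For the BEC these evaluate to I(W) = 1 − ε and Z(W) = ε, and for the BSC to I(W) = 1 − h₂(p) and Z(W) = 2√(p(1 − p)), where h₂ is the binary entropy function.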

The capacity-achieving nature of polar codes arises from the so-called channel polarization phenomenon as a result of recursive applications of Arıkan’s transform to two identical W channels and their synthesized derivatives. The overall recursive transform can be done in a channel combining phase and a channel splitting phase. In the channel combining phase, a linear transformation defined as $X^{1:n} = U^{1:n}G_{n}$ is performed on a vector $U^{1:n}\in \mathcal {X}^{1:n}$ over GF(2), where $G_{n}= B_{n}\left [ \begin {array}{cc} 1 & 0 \\ 1 & 1 \end {array}\right ]^{\otimes l}$. B_n is a permutation matrix: if $U^{\prime 1:n}=U^{1:n}B_{n}$ and $l=\log _{2}n$, the $i^{\prime }=((b_{l},\cdots ,b_{2},b_{1})_{2}+1)$-th coordinate of $U^{\prime 1:n}$ is the i = ((b₁,b₂,⋯ ,b_l)₂ + 1)-th coordinate of $U^{1:n}$, where (⋯ )₂ denotes the integer with the given binary expansion. By taking $X^{1:n}$ as the raw input of W, one derives a combined channel $W_{n}:\mathcal {X}^{1:n}\rightarrow \mathcal {Y}^{1:n}$ with a transition probability of

$$W_{n}(y^{1:n}|u^{1:n})=\prod\limits_{i\in\{1,\cdots,n\}}W(y^{(i)}|x^{(i)}=(u^{1:n}G_{n})_{i}),$$

where (⋅)_i denotes the i-th coordinate. Since G_n induces a one-to-one mapping between $U^{1:n}$ and $X^{1:n}$, the mutual information of W_n is

$$I(W_{n}) = I((Y^{1:n};U^{1:n})) = n I(W).$$
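The channel combining map $X^{1:n}=U^{1:n}G_{n}$ can be sketched as a bit-reversal permutation followed by an in-place butterfly over GF(2). The function names below are hypothetical and the code is an illustration rather than an optimized or constant-time encoder; it uses n/2 XORs in each of the log₂ n stages.

```python
def bit_reverse(i, l):
    """Reverse the l-bit binary expansion of i."""
    r = 0
    for _ in range(l):
        r = (r << 1) | (i & 1)
        i >>= 1
    return r

def polar_encode(u):
    """Return x = u * G_n over GF(2), with G_n = B_n * [[1,0],[1,1]]^{(tensor) l}."""
    n = len(u)
    l = n.bit_length() - 1
    assert 1 << l == n, "length must be a power of two"
    # Bit-reversal permutation B_n.
    x = [u[bit_reverse(i, l)] for i in range(n)]
    # Kronecker power of the 2x2 kernel: (a, b) -> (a xor b, b) at each stage.
    step = 1
    while step < n:
        for start in range(0, n, 2 * step):
            for k in range(start, start + step):
                x[k] ^= x[k + step]
        step *= 2
    return x

# Encoding the unit vectors recovers the rows of G_4.
rows_of_G4 = [polar_encode([1 if j == i else 0 for j in range(4)]) for i in range(4)]

# G_n is its own inverse over GF(2), so encoding twice returns the input.
u = [1, 0, 1, 1, 0, 0, 1, 0]
round_trip = polar_encode(polar_encode(u))
```

The round trip illustrates the one-to-one mapping between $U^{1:n}$ and $X^{1:n}$ used in the mutual information identity above.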

In the channel splitting phase, W_n is further split back into n synthesized channels $W_{n}^{(i)}:\mathcal {X}\rightarrow \mathcal {Y}^{n}\times \mathcal {X}^{i-1}$ whose transition probability is defined by

$$W_n^{(i)}(y^{1:n},u^{1:i-1}\vert u^{(i)})=\sum\limits_{u^{i+1:n}\in\mathcal X^{n-i}}\frac{1}{2^{n-1}}W_n(y^{1:n}\vert u^{1:n}).$$

It is proved in [5] that Arıkan’s transform preserves the mutual information in the sense that

$$I(W_{n})=nI(W)=\sum\limits_{i\in\{1,\cdots,n\}}I(W_{n}^{(i)}).$$

More importantly, the quality of the synthesized channels polarizes asymptotically as the recursion proceeds.

Theorem 1 (Channel polarization of mutual information [5])

For any BDMS channel W, the synthesized channels $W_{n}^{(i)}$ polarize in the sense that, for any fixed δ ∈ (0,1), as n goes to infinity through powers of two, the fraction of indices i ∈{1,⋯ ,n} for which $I(W_{n}^{(i)})\in (1-\delta ,1]$ goes to I(W) and the fraction for which $I(W_{n}^{(i)})\in [0,\delta )$ goes to 1 − I(W).

The channel polarization theorem can also be stated in terms of the Bhattacharyya parameter by replacing $I(W_{n}^{(i)})$ by $Z(W_{n}^{(i)})$ and exchanging the two limits: the fraction of indices with $Z(W_{n}^{(i)})\in [0,\delta )$ goes to I(W) and the fraction with $Z(W_{n}^{(i)})\in (1-\delta ,1]$ goes to 1 − I(W). For any desired transmission rate R < I(W), we can partition {1,⋯ ,n} into a subset $\mathcal {A}$ and its complement $\mathcal {A}^{C}$ such that (i) $|\mathcal {A}|=\lfloor nR \rfloor$ and (ii) for any $i\in \mathcal {A}$ and $j\in \mathcal {A}^{C}$, $Z(W_{n}^{(i)})\leq Z(W_{n}^{(j)})$. Given the “best” ⌊nR⌋ channels indexed by $\mathcal {A}$, one can construct polar codes following the encoding rule:

$$X^{1:n}=U_{\mathcal{A}}G_{n}(\mathcal{A})\oplus U_{\mathcal{A}^{C}}G_{n}(\mathcal{A}^{C}), \tag{1}$$

where ⊕ is XOR operation, $U_{\mathcal {A}}$ is called the information vector and $U_{\mathcal {A}^{C}}$ is called the frozen vector known by both encoder and decoder. Typical realization of the frozen vector is $U_{\mathcal {A}^{C}}=\textbf {0}$ for BDMS channels. In this manner, the useful information is transmitted via the most reliable synthesized channels. A question may arise on how to efficiently calculate $Z(W_{n}^{(i)})$. A brief review can be found in Sections 2.5 and 5.3 but detailed descriptions of these methods are beyond the scope of this work.
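For one special channel the Bhattacharyya parameters are available in closed form, which makes the selection of $\mathcal{A}$ easy to illustrate. The sketch below assumes a binary erasure channel (BEC), for which the recursion $Z(W_{2n}^{(2i-1)})=2Z(W_{n}^{(i)})-Z(W_{n}^{(i)})^{2}$, $Z(W_{2n}^{(2i)})=Z(W_{n}^{(i)})^{2}$ from [5] is exact. The BEC is not the channel model of this paper, and the function names are hypothetical.

```python
def bec_bhattacharyya(eps, l):
    """Z(W_n^{(i)}) for a BEC with erasure probability eps, n = 2**l, in index order."""
    z = [eps]
    for _ in range(l):
        nxt = []
        for v in z:
            nxt.append(2 * v - v * v)  # the degraded synthesized channel
            nxt.append(v * v)          # the upgraded synthesized channel
        z = nxt
    return z

def information_set(z, rate):
    """Indices (0-based) of the floor(n*rate) synthesized channels with smallest Z."""
    k = int(len(z) * rate)
    order = sorted(range(len(z)), key=lambda i: z[i])
    return sorted(order[:k])

z8 = bec_bhattacharyya(0.5, 3)
A = information_set(z8, 0.5)  # 0-based indices; add 1 for the paper's indexing
```

For ε = 1/2 and n = 8 this selects the 0-based indices [3, 5, 6, 7], that is $\mathcal{A}=\{4,6,7,8\}$ in the indexing used above, and the remaining positions are frozen.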

The successive cancellation (SC) decoder is the original decoding algorithm for polar codes. Let u⁽ⁱ⁾ be the i-th coordinate of $U^{1:n}$. Given a channel output $y^{1:n}$ of a polar code, the SC decoder yields the recovered $\bar {u}^{(i)}$ of u⁽ⁱ⁾ in sequential order of index i according to the decoding rule specified as

$$\bar{u}^{(i)}=\begin{cases} u^{(i)}, & i\in\mathcal{A}^{C}\\ \text{arg~max}_{u^{(i)}\in\{0,1\}} W_{n}^{(i)}(y^{1:n},\bar{u}^{1:i-1}|u^{(i)}), & i\in \mathcal{A} \end{cases} \tag{2}$$

where $\bar {u}^{1:i-1}$ is the estimate of $u^{1:i-1}$ recovered before $\bar {u}^{(i)}$. Details of the SC decoder can be found in Appendix A.
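The decoding rule can be illustrated with a compact log-likelihood-ratio (LLR) implementation of Arıkan's recursion from [5]. This is a didactic sketch, not the decoder of Appendix A: it recomputes each bit's LLR from scratch (roughly O(n² log n) overall instead of O(n log n)), it is not written to be isochronous, and the binary symmetric channel, the information set and all function names are assumptions chosen for the example. An LLR is positive when 0 is the more likely value.

```python
import math

def xor_llr(a, b):
    """Exact LLR of the XOR of two bits with LLRs a and b, in a numerically stable form."""
    sign = 1.0 if (a >= 0) == (b >= 0) else -1.0
    return (sign * min(abs(a), abs(b))
            + math.log1p(math.exp(-abs(a + b)))
            - math.log1p(math.exp(-abs(a - b))))

def bit_llr(llrs, i, decided):
    """LLR of bit i (0-based) given channel LLRs and the earlier decisions decided[0:i]."""
    n = len(llrs)
    if n == 1:
        return llrs[0]
    pairs = i // 2
    odd = decided[0:2 * pairs:2]    # decisions at odd positions in the paper's 1-based indexing
    even = decided[1:2 * pairs:2]   # decisions at even positions in the paper's 1-based indexing
    left = bit_llr(llrs[:n // 2], pairs, [a ^ b for a, b in zip(odd, even)])
    right = bit_llr(llrs[n // 2:], pairs, even)
    if i % 2 == 0:
        return xor_llr(left, right)
    return right + (1 - 2 * decided[i - 1]) * left

def sc_decode(llrs, frozen):
    """frozen maps index -> known bit; every other bit is decided by the sign of its LLR."""
    decided = []
    for i in range(len(llrs)):
        if i in frozen:
            decided.append(frozen[i])
        else:
            decided.append(0 if bit_llr(llrs, i, decided) >= 0 else 1)
    return decided

def polar_encode(u):
    """x = u * G_n over GF(2): bit-reversal permutation followed by the butterfly."""
    n, l = len(u), len(u).bit_length() - 1
    x = [u[int(format(i, "0%db" % l)[::-1], 2)] for i in range(n)] if l else list(u)
    step = 1
    while step < n:
        for start in range(0, n, 2 * step):
            for k in range(start, start + step):
                x[k] ^= x[k + step]
        step *= 2
    return x

info_set = [3, 5, 6, 7]              # assumed information set (0-based), for illustration
u = [0] * 8
for idx, bit in zip(info_set, [1, 0, 1, 1]):
    u[idx] = bit
y = polar_encode(u)
y[2] ^= 1                            # a single channel error
p = 0.1                              # assumed BSC crossover probability
llr0 = math.log((1 - p) / p)
llrs = [llr0 if bit == 0 else -llr0 for bit in y]
frozen = {i: 0 for i in range(8) if i not in info_set}
decoded = sc_decode(llrs, frozen)
```

In this example the single flipped bit is corrected and `decoded` equals `u`. Feeding the decoder real-valued LLRs instead of thresholded bits is what makes it a soft-decision decoder.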

Denote by P_e the averaged probability of frame errors. As a result of polar encoding and SC decoding, it is proved in [5] that P_e is upper bounded as follows.

Theorem 2 (Decoding Performance [5])

For any BDMS channel W and any choices of parameter $(n,R,\mathcal {A})$,

$$\begin{array}{@{}rcl@{}} P_{e}\leq\sum\limits_{i\in\mathcal{A}}Z(W_{n}^{(i)}). \end{array}$$
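As a worked illustration of this bound, consider a binary erasure channel (BEC) with erasure probability 1/2. The BEC is an assumption made here only because its Bhattacharyya parameters obey the exact recursion $Z(W_{2n}^{(2i-1)})=2Z(W_{n}^{(i)})-Z(W_{n}^{(i)})^{2}$ and $Z(W_{2n}^{(2i)})=Z(W_{n}^{(i)})^{2}$ from [5]; it is not the channel model derived later in this paper. Starting from Z(W) = 1/2, one step gives $Z(W_{2}^{(1)})=3/4$ and $Z(W_{2}^{(2)})=1/4$, and a second step gives, for n = 4,

$$Z(W_{4}^{(1)})=\frac{15}{16},\quad Z(W_{4}^{(2)})=\frac{9}{16},\quad Z(W_{4}^{(3)})=\frac{7}{16},\quad Z(W_{4}^{(4)})=\frac{1}{16}.$$

Choosing $\mathcal{A}=\{4\}$ (R = 1/4) yields $P_{e}\leq 1/16$, whereas $\mathcal{A}=\{3,4\}$ (R = 1/2) only yields $P_{e}\leq 7/16+1/16=1/2$. At such a short length the channels are barely polarized, so the bound becomes useful only for larger n or lower rates.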

2.5 Channel degradation and upgradation

The construction of polar codes can be addressed if all the Bhattacharyya parameters $Z(W_{n}^{(i)})$ of synthesized channels can be efficiently calculated. To this end, a quantization method was proposed in [34] to construct a degraded or upgraded approximation of a binary-input memoryless symmetric (BMS) channel. In this way, one can approximate $Z(W_{n}^{(i)})$ efficiently with tractable and minor distortion. We define the degradation and upgradation relations as follows; they are discussed further in the sequel.

Definition 3 (Degraded and Upgraded Channel, [34])

A channel $\mathcal {Q}:\mathcal {X}\rightarrow \mathcal {Z}$ is (stochastically) degraded with respect to a channel $\mathcal {W}:\mathcal {X}\rightarrow \mathcal {Y}$ if there exists a channel $\mathcal {P}:\mathcal {Y}\rightarrow \mathcal {Z}$ such that

$$\mathcal{Q}(z|x)=\sum\limits_{y\in\mathcal{Y}}\mathcal{W}(y|x)\mathcal{P}(z|y)$$

for all $z\in \mathcal {Z}$ and $x\in \mathcal {X}$. We denote by $\mathcal {Q}\preceq \mathcal {W}$ the relation that $\mathcal {Q}$ is degraded with respect to $\mathcal {W}$. Conversely, we denote by $\mathcal {Q}^{\prime }\succeq \mathcal {W}$ the relation that $\mathcal {Q}^{\prime }$ is upgraded with respect to $\mathcal {W}$ if there exists a channel $\mathcal {Q}^{\prime }:\mathcal {X}\rightarrow \mathcal {Z}^{\prime }$ and a channel $\mathcal {P}:\mathcal {Z}^{\prime }\rightarrow \mathcal {Y}$ such that for $y\in \mathcal {Y}$ and $x\in \mathcal {X}$

$$\mathcal{W}(y|x)=\sum\limits_{z^{\prime}\in\mathcal{Z}^{\prime}}\mathcal{Q}^{\prime}(z^{\prime}|x)\mathcal{P}(y|z^{\prime}).$$

Moreover, Lemma 1 indicates that the synthesized channels of $\mathcal {Q},\mathcal {W},\mathcal {Q}^{\prime }$ under Arıkan’s transform also fulfill the channel degradation and upgradation relation. This implies a polar code constructed for $\mathcal {Q}$ also fits in with $\mathcal {W}$.

Lemma 1 (restatement of Lemma 4.7 in [17])

Given BMS channels $\mathcal {W},\mathcal {Q}$, and $\mathcal {Q}^{\prime }$, we denote by $\mathcal {W}_{n}^{(i)}$, $\mathcal {Q}_{n}^{(i)}$ and ${\mathcal {Q}^{\prime }}_{n}^{(i)}$ for i ∈ [1,n] the synthesized channels derived by Arıkan’s transform. If $\mathcal {Q}^{\prime }\succeq \mathcal {W}\succeq \mathcal {Q}$, then ${\mathcal {Q}^{\prime }}_{n}^{(i)}\succeq \mathcal {W}_{n}^{(i)}\succeq \mathcal {Q}_{n}^{(i)}$ for all i.

If a degradation or upgradation relation holds between two channels, their capacity, reliability and error probability are related as follows.

Lemma 2 ([34])

Let $\mathcal {W}$ be a BMS channel and suppose there exists another channel $\mathcal {Q}$ such that $\mathcal {Q}\preceq \mathcal {W}$. Then

$$C(\mathcal{Q}) \leq C(\mathcal{W}),~Z(\mathcal{Q})\geq Z(\mathcal{W}),~P_{e}(\mathcal{Q})\geq P_{e}(\mathcal{W}).$$

The inequalities reverse if we replace “degraded” by “upgraded”.
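A small numerical check makes the degradation relation tangible. The sketch assumes two binary symmetric channels: cascading BSC(0.05) with BSC(0.1) produces, by Definition 3, a channel degraded with respect to BSC(0.05). Here $P_{e}$ is taken to be the error probability of maximum-likelihood decision on a single uncoded bit with uniform input, which is the convention assumed in this example, and all function names are hypothetical.

```python
import math

def bsc(p):
    """Binary symmetric channel as a nested dict: W[x][y] = W(y|x)."""
    return {0: {0: 1 - p, 1: p}, 1: {0: p, 1: 1 - p}}

def compose(W, P):
    """Q(z|x) = sum_y W(y|x) P(z|y): the channel W followed by the channel P."""
    outputs = list(P[next(iter(P))])
    return {x: {z: sum(W[x][y] * P[y][z] for y in W[x]) for z in outputs} for x in W}

def capacity(W):
    """Symmetric capacity in bits (uniform input)."""
    total = 0.0
    for y in W[0]:
        avg = 0.5 * (W[0][y] + W[1][y])
        for x in (0, 1):
            if W[x][y] > 0:
                total += 0.5 * W[x][y] * math.log2(W[x][y] / avg)
    return total

def bhattacharyya(W):
    return sum(math.sqrt(W[0][y] * W[1][y]) for y in W[0])

def ml_error(W):
    """Error probability of ML decision on one uncoded bit with uniform input."""
    return 0.5 * sum(min(W[0][y], W[1][y]) for y in W[0])

W = bsc(0.05)
Q = compose(W, bsc(0.1))  # degraded with respect to W; equals BSC(0.14)

comparison = {
    "capacity": (capacity(Q), capacity(W)),
    "Z": (bhattacharyya(Q), bhattacharyya(W)),
    "P_e": (ml_error(Q), ml_error(W)),
}
```

The cascade has crossover probability 0.05·0.9 + 0.95·0.1 = 0.14, and the three pairs in `comparison` are ordered as stated in Lemma 2.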

3 RLWE channel model

3.1 RLWE channel model in canonical basis

Definition 4

The real multivariate normal distribution has density function

$$g_{\mu,{\Sigma}}(x)=\frac{e^{-\frac{1}{2}(x-\mu)^{T}{\Sigma}^{-1}(x-\mu)}}{\sqrt{|2\pi {\Sigma}|}}~~x\in\mathbb{R}^{n}$$

where |⋅| denotes the determinant, $\mu =\mathbb {E}[X]\in \mathbb {R}^{n}$, ${\Sigma }=\mathbb {E}\left [ (X-\mu )(X-\mu )^{T} \right ]$; we write $X\sim \mathcal {N}(\mu ,{\Sigma })$. A generalization would be the complex multivariate normal distribution $Z\sim \mathcal {N}\mathcal {C}(\mu ,{{\varGamma }})$ with density function

$$g_{\mu,{{\varGamma}}}(z)=\frac{e^{-(z-\mu)^{*}{{\varGamma}}^{-1}(z-\mu)}}{|\pi {{\varGamma}}|}~~z\in\mathbb{C}^{n},$$

where z^∗ denotes the Hermitian transpose of the vector z and ${{\varGamma }} =\mathbb {E}[(Z-\mu )(Z-\mu )^{*}]$.

We already have an RLWE-based PKE instance in Section 2.2. Now we consider the problem of decoding the message m from the polynomial

$$y = \lfloor \frac{q}{2} \rfloor \cdot m + e \cdot t - s \cdot e_{1} + e_{2} \mod R_{q},$$

(3)

where e ⋅ t and s ⋅ e₁ are products of polynomials in $\mathbb {Z}_{q}[x]/(1+x^{n})$. The coefficients of e,t,s,e₁,e₂ should be drawn from a discrete Gaussian distribution. We use the continuous normal distribution $\mathcal {N}(0,r^{2})$ instead to simplify the analysis of the noise term.

Under canonical embedding, formula (3) can be rewritten as ^{Footnote 7}

$$\sigma(y)=\sigma(\lfloor\frac{q}{2}\rfloor m)+\sigma(e)\sigma(t)-\sigma(s)\sigma(e_{1})+\sigma(e_{2})~~\mod{BR_{q}},$$

(4)

where B is the orthogonal basis defined in Section 2.1 and the multiplications (i.e., σ(e)σ(t) and σ(s)σ(e₁)) and additions are both coordinate-wise as explained in Section 2.1. Due to the conjugate pairs, formula (4) can be refined as

$$\underbrace{\left(\begin{array}{c} \Re[B_{1}\textbf{y}]\\ \Im[B_{1}\textbf{y}]\\ \Re[B_{2}\textbf{y}]\\ \Im[B_{2}\textbf{y}]\\ \vdots\\ \Re[B_{\frac{n}{2}}\textbf{y}]\\ \Im[B_{\frac{n}{2}}\textbf{y}] \end{array}\right)}_{\tilde{B}\textbf{y}} = \underbrace{\left(\begin{array}{c} \Re[B_{1}\lfloor\frac{q}{2}\rfloor \textbf{m}]\\ \Im[B_{1}\lfloor\frac{q}{2}\rfloor \textbf{m}]\\ \Re[B_{2}\lfloor\frac{q}{2}\rfloor \textbf{m}]\\ \Im[B_{2}\lfloor\frac{q}{2}\rfloor \textbf{m}]\\ \vdots\\ \Re[B_{\frac{n}{2}}\lfloor\frac{q}{2}\rfloor \textbf{m}]\\ \Im[B_{\frac{n}{2}}\lfloor\frac{q}{2}\rfloor \textbf{m}] \end{array}\right)}_{\tilde{B}\lfloor\frac{q}{2}\rfloor \textbf{m}} + \underbrace{\left(\begin{array}{c} \Re[\sigma_{1}({e})\sigma_{1}({t})-\sigma_{1}({s})\sigma_{1}({e}_{1})+\sigma_{1}({e}_{2})]\\ \Im[\sigma_{1}({e})\sigma_{1}({t})-\sigma_{1}({s})\sigma_{1}({e}_{1})+\sigma_{1}({e}_{2})]\\ \Re[\sigma_{2}({e})\sigma_{2}({t})-\sigma_{2}({s})\sigma_{2}({e}_{1})+\sigma_{2}({e}_{2})]\\ \Im[\sigma_{2}({e})\sigma_{2}({t})-\sigma_{2}({s})\sigma_{2}({e}_{1})+\sigma_{2}({e}_{2})]\\ \vdots\\ \Re[\sigma_{\frac{n}{2}}({e})\sigma_{\frac{n}{2}}({t})-\sigma_{\frac{n}{2}}({s})\sigma_{\frac{n}{2}}({e}_{1})+\sigma_{\frac{n}{2}}({e}_{2})]\\ \Im[\sigma_{\frac{n}{2}}({e})\sigma_{\frac{n}{2}}({t})-\sigma_{\frac{n}{2}}({s})\sigma_{\frac{n}{2}}({e}_{1})+\sigma_{\frac{n}{2}}({e}_{2})] \end{array}\right)}_{\textbf{N}}\text{mod}{\tilde{B}R_{q}},$$

(5)

where B_j represents the j^th row of B, the vectors y and m are the vector forms of the polynomials y and m, $\tilde {B}$ and $\sigma ^{\prime }$ are introduced in Section 2.1, $\tilde {B}\textbf {y}=\sigma ^{\prime }({y})$, and $\tilde {B}\lfloor \frac {q}{2}\rfloor \textbf {m}=\sigma ^{\prime }(\lfloor \frac {q}{2}\rfloor {m})$. To see how the noise term N is distributed, we rewrite formula (5) for all the odd indices i = 1,3,5,⋯ ,n − 1 as

$$\begin{array}{@{}rcl@{}}\left[ \begin{array}{c} \sigma^{\prime}_{i}(y)\\ \sigma^{\prime}_{i+1}(y) \end{array}\right] &= &\left[\begin{array}{c} \sigma^{\prime}_{i}(\lfloor \frac{q}{2} \rfloor m)\\ \sigma^{\prime}_{i+1}(\lfloor \frac{q}{2} \rfloor m) \end{array}\right] + \left[\begin{array}{cc} \sigma^{\prime}_{i}(e) & -\sigma^{\prime}_{i+1}(e)\\ \sigma^{\prime}_{i+1}(e) & \sigma^{\prime}_{i}(e) \end{array}\right] \left[\begin{array}{c} \sigma^{\prime}_{i}(t)\\ \sigma^{\prime}_{i+1}(t) \end{array}\right]\\ &&- \left[\begin{array}{cc} \sigma^{\prime}_{i}(s) & -\sigma^{\prime}_{i+1}(s)\\ \sigma^{\prime}_{i+1}(s) & \sigma^{\prime}_{i}(s) \end{array}\right] \left[\begin{array}{c} \sigma^{\prime}_{i}(e_{1})\\ \sigma^{\prime}_{i+1}(e_{1}) \end{array}\right] + \left[\begin{array}{c} \sigma^{\prime}_{i}(e_{2})\\ \sigma^{\prime}_{i+1}(e_{2}) \end{array}\right], \end{array}$$

(6)

where $\tilde {B}_{i}(\cdot )=\sigma ^{\prime }_{i}(\cdot )$ and $\tilde {B}_{i+1}(\cdot )=\sigma ^{\prime }_{i+1}(\cdot )$. Under the embedding $\sigma :K\rightarrow \mathbb {C}^{n}$, the spherically normal distributed vectors e and t are mapped to complex spherical normal vectors, $\sigma (e),\sigma (t)\sim \mathcal {N}\mathcal {C}(0,nr^{2}\mathbb {I})$. Under the embedding $\sigma ^{\prime }:K\rightarrow \mathbb {R}^{n}$, the spherical normal distribution $\mathcal {N}(0,r^{2}\mathbb {I})$ is transformed into a new spherical normal distribution $\mathcal {N}(0,(nr^{2}/2)\mathbb {I})$. Since e,t are coordinate-wise i.i.d., their embeddings σ(e), σ(t), $\sigma ^{\prime }(e)$, $\sigma ^{\prime }(t)$ are coordinate-wise independent as well. We observe from formula (6) that every odd-indexed coordinate and the next even-indexed coordinate are correlated because they share the same $\sigma ^{\prime }_{i}(e),\sigma ^{\prime }_{i+1}(e)$, $\sigma ^{\prime }_{i}(t),\sigma ^{\prime }_{i+1}(t)$, $\sigma ^{\prime }_{i}(s),\sigma ^{\prime }_{i+1}(s)$ and $\sigma ^{\prime }_{i}(e_{1}),\sigma ^{\prime }_{i+1}(e_{1})$, although $\sigma ^{\prime }_{i}(e_{2}),\sigma ^{\prime }_{i+1}(e_{2})$ are independent.

To further refine the RLWE channel model, we can rewrite formulas (5) and (6) as

$$\tilde{B}\textbf{y}=\tilde{B}\lfloor\frac{q}{2}\rfloor\textbf{m}+\textbf{N},\mod{\tilde{B}R_{q}}$$

(7)

where for i = 1,2,⋯ ,n, N_i = H_i ∗ Z_i, $Z_{i}\leftarrow \mathcal {N}(0,\frac {nr^{2}}{2})$, and

$$H_{i} =\sqrt{\sigma^{*}_{\lceil i/2\rceil}({e})\sigma_{\lceil i/2\rceil}({e})+\sigma^{*}_{\lceil i/2\rceil}({s})\sigma_{\lceil i/2\rceil}({s})+1}.$$

Because of the correlation between every two coordinates, H_i and H_j are independent for two different indices i,j as long as ⌈i/2⌉≠⌈j/2⌉; otherwise H_i = H_j. Similarly, Z_i and Z_j are correlated if ⌈i/2⌉ = ⌈j/2⌉; otherwise they are independent.
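The factorization N_i = H_i ∗ Z_i can be checked empirically for a single conjugate pair. The following is only a Monte Carlo sketch under the continuous-Gaussian simplification used above: the embeddings of e and s are drawn once and held fixed (as the decoder would know them), the embeddings of t, e₁ and e₂ are resampled, and the sample variance of the real part of the noise is compared with $H_{i}^{2}\cdot nr^{2}/2$. The function names, the seed and the trial count are illustrative assumptions.

```python
import math
import random

def complex_normal(rng, sigma):
    """Complex sample whose real and imaginary parts are independent N(0, sigma^2)."""
    return complex(rng.gauss(0.0, sigma), rng.gauss(0.0, sigma))

def noise_variance_ratio(n=1024, r=2.0, trials=40000, seed=1):
    """Ratio of the empirical noise variance to (H * sigma)^2 for one coordinate."""
    rng = random.Random(seed)
    sigma = math.sqrt(n / 2) * r
    # One fixed realization of sigma_k(e) and sigma_k(s).
    emb_e = complex_normal(rng, sigma)
    emb_s = complex_normal(rng, sigma)
    h = math.sqrt(abs(emb_e) ** 2 + abs(emb_s) ** 2 + 1.0)
    samples = []
    for _ in range(trials):
        emb_t = complex_normal(rng, sigma)
        emb_e1 = complex_normal(rng, sigma)
        emb_e2 = complex_normal(rng, sigma)
        samples.append((emb_e * emb_t - emb_s * emb_e1 + emb_e2).real)
    mean = sum(samples) / trials
    var = sum((x - mean) ** 2 for x in samples) / trials
    return var / (h * sigma) ** 2

ratio = noise_variance_ratio()
print(f"empirical variance / (H*sigma)^2 = {ratio:.3f}")
```

A ratio close to 1 is what the model in formula (7) predicts; the imaginary part can be checked the same way by replacing `.real` with `.imag`.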

Unlike in NewHope and other RLWE-based encryption schemes where the plaintext is encoded and decoded in the polynomial basis, we carry out encoding and decoding in the canonical basis. Observe that the channel given by formula (7) is a fading channel with coherence interval T_c = 2 coordinates, except that the symbols to be transmitted after modulation, i.e., $\tilde {B}\lfloor \frac {q}{2}\rfloor \mathbf {m}$, are not coordinate-wise independent. In the next subsection, we adjust the modulation scheme so that a tailored constellation diagram fits the fading channel.

3.2 A tailored constellation diagram

The RLWE channel in formula (3) can be interpreted as n parallel $\mathbb {Z}/2\mathbb {Z}$ channels where a message m ∈{0,1}ⁿ is mapped to a symbol on the constellation diagram $\{0,\lfloor \frac {q}{2} \rfloor \}^{n}$. The mod R_q operation defines a valid constellation space as an n-dimensional cube Λ with vertices {0,q}ⁿ. To ease the description of how we design a new constellation diagram in canonical basis, we make a modification to the modulation scheme in formula (3): the message m ∈{− 1,1}ⁿ is mapped onto the constellation diagram $\{\pm \lfloor \frac {q}{4} \rfloor \}^{n}$ and the valid constellation space is a cube Λ with vertices $\{\pm \lfloor \frac {q}{2}\rfloor \}^{n}$. This modification will preserve the capacity of the $\mathbb {Z}/2\mathbb {Z}$ channel because they are statistically equivalent if we ignore geometrical approximation caused by the round-off operation ⌊⋅⌋.

According to formula (7), after applying the canonical embedding, the constellation diagram turns into $\tilde {B}\{\pm \lfloor \frac {q}{4}\rfloor \}^{n}$. Similarly, we can obtain the new constellation space ${{\varLambda }}^{\prime }=\tilde {B}{{\varLambda }}$ by rotating Λ and scaling it up by a factor of $\sqrt {n/2}$.

As discussed in the previous subsection, the coherence interval T_c of the residue noise equals 2 coordinates, while the constellation symbol $\tilde {B}\lfloor \frac {q}{4}\rfloor \textbf {m}$ has memory throughout all n coordinates. In a communication system, interleaving can be used to alleviate the correlation of the source by permuting symbols of different code blocks. Unfortunately, interleaving is impractical in the RLWE channel because there is only one code block of length n. Instead, at the cost of a reduced distance between the constellation symbols, we tailor the constellation space ${{\varLambda }}^{\prime }$ to fit the fading channel.

Essentially, we are looking for a new modulation scheme meeting two conditions: (a) we desire the symbols after modulation (or the modulated message) to be coordinate-wise i.i.d.; in other words, we expect a valid constellation diagram inside the space ${{\varLambda }}^{\prime }$ such that for coordinate-wise i.i.d. message m, the modulated message is coordinate-wise i.i.d. as well; (b) the new modulation scheme gives us a $\mathbb {Z}/2\mathbb {Z}$ channel. Conceptually, the maximal n-dimensional cube ${{\varLambda }}^{\prime \prime }$ enclosed in ${{\varLambda }}^{\prime }$ and parallel to Λ is our target constellation space. In this case, the symbols to be transmitted can be easily made to be binary and i.i.d. if we divide the cube ${{\varLambda }}^{\prime \prime }$ equally into 2ⁿ small cubes and select all the centers of the small cubes to be the constellation diagram. However, looking for such a ${{\varLambda }}^{\prime \prime }$ in practice is intractable when the dimension n is large and we are unclear about in what direction and by what degree the cube ${{\varLambda }}^{\prime }$ is rotated with respect to Λ. Instead, we compromise on the constellation size and use the cube ${{\varLambda }}^{\prime \prime }$ which is parallel to Λ and is enclosed in the maximal ball inscribed in ${{\varLambda }}^{\prime }$. In this manner, we can make sure there always exists such a constellation space ${{\varLambda }}^{\prime \prime }$ and it is straightforward to calculate its size. Figure 1 illustrates this idea in 2-dimensional case. If the side length of Λ is q, the side of ${{\varLambda }}^{\prime }$ turns out to have length $q\sqrt {n/2}$, and the side of ${{\varLambda }}^{\prime \prime }$ will be $q/\sqrt {2}$. Observe that ${{\varLambda }}^{\prime }=\sqrt {2}\tilde {B}{{\varLambda }}^{\prime \prime }$.
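The side lengths quoted above follow from elementary geometry, which the short sketch below reproduces: the embedding scales the cube by $\sqrt{n/2}$, the inscribed ball of ${{\varLambda }}^{\prime }$ has radius equal to half its side, and the largest axis-parallel cube inside that ball has its half-diagonal equal to the radius. The function name is a hypothetical helper introduced for illustration.

```python
import math

def tailored_side_lengths(q, n):
    """Side lengths of the cubes Lambda, Lambda' and Lambda'' described in the text."""
    side_lambda = q
    side_lambda_prime = q * math.sqrt(n / 2)            # rotation plus scaling by sqrt(n/2)
    inradius = side_lambda_prime / 2                    # ball inscribed in Lambda'
    side_lambda_dprime = 2 * inradius / math.sqrt(n)    # cube with half-diagonal = inradius
    return side_lambda, side_lambda_prime, side_lambda_dprime

side, side_p, side_pp = tailored_side_lengths(12289, 1024)
print(side, side_p, side_pp)
```

The third value equals $q/\sqrt{2}$ regardless of n, which is why the tailored modulation in the next subsection carries the factor $1/\sqrt{2}$.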

3.3 Tailored RLWE channel model in canonical basis

Given the tailored constellation space ${{\varLambda }}^{\prime \prime }$ and its corresponding constellation diagram, we now have a tailored RLWE channel model in the canonical basis:

$$\textbf{y}=\lfloor\frac{q}{2}\rfloor \frac{1}{\sqrt{2}}\textbf{m}+\mathbf{N},\mod{{{\varLambda}}^{\prime\prime}},$$

(8)

where m ∈{0,1}ⁿ, N_i = H_i ∗ Z_i and $Z_{i}\leftarrow \mathcal {N}(0,nr^{2}/2)$ for 1 ≤ i ≤ n. As discussed in formula (7), H_i and H_j are independent for two different indices i,j as long as ⌈i/2⌉≠⌈j/2⌉; otherwise H_i = H_j. Similarly, Z_i and Z_j are independent if ⌈i/2⌉≠⌈j/2⌉ otherwise they are correlated.

We observe that the tailored channel model in formula (8) can be seen as a fading channel where H_i is the channel gain and Z_i is the additive noise. A family of fading channels (e.g., i.i.d. fading, block fading, compound fading) are well studied in existing work of [6, 18, 36] and explicit constructions of error-correcting codes are given. In this work, since H_i and Z_i have the same coherence interval of two coordinates, our strategy is to divide the n parallel channels into two groups of i.i.d. channels and we construct two parallel polar codes of equal block length n/2 for the two $\mathbb {Z}/2\mathbb {Z}$ fading channels. Note that in this work we use parameters similar to NewHope, e.g., q = 12289,n = 1024, r ∈{1,2,6,9} where the values of r correspond to the “Short” and “Tall” parameters in [8].

Denote by L and $L^{\prime }$ two one-dimensional lattices $\lfloor \frac {q}{2}\rfloor \frac {1}{\sqrt {2}} \mathbb {Z}$ and $q\frac {1}{\sqrt {2}}\mathbb {Z}$ respectively. The above channel model can also be written as a fading $L/L^{\prime }$ channel, i.e.,

$$Y_{i}=\lfloor\frac{q}{2}\rfloor\frac{1}{\sqrt{2}} m_{i}+H_{i}*Z_{i},\mod{q\frac{1}{\sqrt{2}}\mathbb{Z}},~i=1,{\cdots} n,$$

(9)

where m_i ∈{0,1} and the channel input X is restricted to the discrete alphabet $\mathcal {X}=L\cap \mathcal {R}(L^{\prime })=\{0, \lfloor \frac {q}{2}\rfloor \frac {1}{\sqrt {2}}\}$. Since Alice knows exactly what e and s are, she knows both the distribution and realization of the channel gain H_i. At the transmitter’s end, Bob only knows the distribution of H_i. Both of them know the distribution of Z_i. How to achieve the ergodic capacity of such an i.i.d. fading channel using polar codes is well studied in [18] and we are about to adapt their strategy to our tailored RLWE channel model. A diagram of a fading $L/L^{\prime }$ channel with CSI available to the decoder is shown in Fig. 2.

Denote by $W:X\rightarrow (\tilde {Y},H)$ the fading $L/L^{\prime }$ channel with CSI available to the decoder. The transition probability of W is

$$\begin{array}{@{}rcl@{}} P_{\tilde{Y},H|X}(\tilde{y},h|x) & =& P_{Y,H|X}(y=\tilde{y}+L^{\prime},h|x)\frac{d\tilde{y}}{dy} \\ & =& P_{H}(h)P_{Y|H,X}(y=\tilde{y}+L^{\prime}|h,x) \\ & =& P_{H}(h)\sum\limits_{\lambda\in L^{\prime}}\frac{1}{\sqrt{2\pi}h\sigma}\exp\left\lbrace-\frac{(\tilde{y}+\lambda-x)^{2}}{2\sigma^{2} h^{2}}\right\rbrace, \end{array}$$

(10)

where $\sigma =\sqrt {\frac {n}{2}}r$. The distribution of H is

$$P_{H}(h) = \frac{1}{2\sigma^{4}}h(h^{2}-1)\exp\left\lbrace -\frac{h^{2}-1}{2\sigma^{2}} \right\rbrace =\frac{2h(h^{2}-1)}{n^{2}r^{4}}e^{-\frac{(h^{2}-1)}{nr^{2}}},~h>1.$$

The pdf of H in terms of various choices of parameter r is depicted in Fig. 3.
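As a sanity check on the density $P_{H}$, the sketch below evaluates it numerically and compares a trapezoidal integral with a closed-form CDF. The CDF expression $1-e^{-x}(1+x)$ with $x=(h^{2}-1)/(2\sigma ^{2})$ is obtained here by integrating $P_{H}$ directly; it is not stated in the source. The parameters n = 1024 and r = 2 and the integration limit are illustrative choices.

```python
import math

def pdf_h(h, sigma):
    """Density of the channel gain H, supported on h > 1."""
    if h <= 1.0:
        return 0.0
    u = h * h - 1.0
    return h * u / (2.0 * sigma ** 4) * math.exp(-u / (2.0 * sigma ** 2))

def cdf_h(h, sigma):
    """Closed-form integral of pdf_h from 1 to h (derived for this sketch)."""
    if h <= 1.0:
        return 0.0
    x = (h * h - 1.0) / (2.0 * sigma ** 2)
    return 1.0 - math.exp(-x) * (1.0 + x)

def integrate_pdf(upper, sigma, steps=20000):
    """Trapezoidal rule for the integral of pdf_h over [1, upper]."""
    width = (upper - 1.0) / steps
    total = 0.5 * (pdf_h(1.0, sigma) + pdf_h(upper, sigma))
    total += sum(pdf_h(1.0 + k * width, sigma) for k in range(1, steps))
    return total * width

sigma = math.sqrt(1024 / 2) * 2.0     # sigma = sqrt(n/2) * r
mass = integrate_pdf(400.0, sigma)
print(f"integral of P_H over [1, 400] = {mass:.8f}")
```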

As discussed in [12] and [18], the capacity of the fading $L/L^{\prime }$ channel is given by

$$\begin{array}{@{}rcl@{}} C(L/L^{\prime},\sigma^{2}) &=& E_{H}\left[C(L^{\prime},(h\sigma)^{2})\right]-E_{H}\left[C(L,(h\sigma)^{2})\right] \\ &=& E_{H}\left[\mathfrak{h}(L,(h\sigma)^{2})\right]-E_{H}\left[\mathfrak{h}(L^{\prime},(h\sigma)^{2})\right]+\log |L/L^{\prime}|, \end{array}$$

where $E_{H}\left [\cdot \right ]$ denotes the expectation over the fading coefficient, $\mathfrak {h}(L,\sigma ^{2})$ and $\mathfrak {h}(L^{\prime },\sigma ^{2})$ are differential entropies of mod-L and mod-$L^{\prime }$ channels respectively, and $|L/L^{\prime }|$ is the order of the partition $L/L^{\prime }$. Specifically, $\mathfrak {h}(L,\sigma ^{2})$ is given by

$$\begin{array}{@{}rcl@{}} \mathfrak{h}(L,(h\sigma)^{2}) &=&-{\int}_{\mathcal{R}(L)}\;f_{L,(h\sigma)^{2}}(z^{\prime})\log f_{L,(h\sigma)^{2}}(z^{\prime})dz^{\prime}, \\ f_{L,(h\sigma)^{2}}(z^{\prime}) &=& \sum\limits_{\lambda\in L}g_{(h\sigma)^{2}}(z^{\prime}+\lambda),~z^{\prime}\in \mathcal{R}(L), \end{array}$$

(11)

where $\mathcal {R}(L)$ is a fundamental region of lattice L and $g_{(h\sigma )^{2}}(\cdot )$ is the density function of $\mathcal {N}(0,h^{2}\sigma ^{2}\mathbb {I})$. We refer to $f_{L,(h\sigma )^{2}}$ as an L-periodic Gaussian density function, which is obtained by summing copies of a Gaussian density function centered at every lattice point of L. The value of an L-periodic Gaussian variable $z^{\prime }$ is restricted to a fundamental region of L, so that the integral of its density function over $\mathcal {R}(L)$ is 1. See Fig. 4 for the ergodic capacity of the fading $L/L^{\prime }$ channel $W:X\rightarrow (\tilde {Y},H)$ with respect to different choices of r. In a communication system, the signal-to-noise ratio (SNR) is a measure of the reliability of a channel. It is defined as the ratio of the signal strength over the noise strength^{Footnote 8}.

Recall the definition of a symmetric channel in Section 2.4. We observe that $P_{\tilde {Y},H|X}(\tilde {y},h|x = 0 )=P_{\tilde {Y},H|X}(\pi (\tilde {y},h)|x = \lfloor \frac {q}{2}\rfloor \frac {1}{\sqrt {2}} )$ holds for the permutation $\pi (\tilde {y},h)=\left ((\lfloor \frac {q}{2}\rfloor \frac {1}{\sqrt {2}}-\tilde {y})~\text {mod}~q\frac {1}{\sqrt {2}} \mathbb {Z},h \right )$ over the outputs $(\tilde {y},h)$. Therefore, the fading $L/L^{\prime }$ channel W is symmetric and we can achieve its capacity using polar codes.
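The symmetry condition can be verified numerically from the transition density in formula (10). The sketch below omits the common factor $P_{H}(h)$, which appears on both sides, and truncates the sum over $L^{\prime }$ to a few terms; the truncation depth, the value h = 30 and the test points are illustrative assumptions.

```python
import math

Q_MOD = 12289
X1 = (Q_MOD // 2) / math.sqrt(2)     # nonzero channel input floor(q/2)/sqrt(2)
PERIOD = Q_MOD / math.sqrt(2)        # generator of the lattice L'

def transition_density(y, x, h, sigma, terms=4):
    """Conditional density of y given x and h, with the lattice sum truncated to |k| <= terms."""
    std = h * sigma
    total = sum(math.exp(-((y + k * PERIOD - x) ** 2) / (2.0 * std * std))
                for k in range(-terms, terms + 1))
    return total / (math.sqrt(2.0 * math.pi) * std)

def permute(y):
    """The permutation pi acting on the y-component of the output."""
    return (X1 - y) % PERIOD

sigma = math.sqrt(1024 / 2) * 2.0
checks = [(y,
           transition_density(y, 0.0, 30.0, sigma),
           transition_density(permute(y), X1, 30.0, sigma))
          for y in (0.0, 500.0, 2500.0, 6000.0, 8000.0)]
for y, lhs, rhs in checks:
    print(f"y={y:7.1f}  W(y|0)={lhs:.6e}  W(pi(y)|x1)={rhs:.6e}")
```

The two columns should agree up to floating-point error, since the truncated terms are many standard deviations away from the evaluation points.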

4 Description of the encryption scheme

Table 1 gives a high-level description of the RLWE-based PKE scheme using polar codes which are customized for our tailored RLWE channel model in canonical basis. The functions PolarEnc(⋅) and PolarDec(⋅) are encoding and decoding algorithms of polar codes which will be explicitly introduced in the sequel.

Remark 1

Unlike most RLWE encryption schemes where the error distribution χ is defined over $\mathbb {Z}$ (e.g., central Binomial in NewHope), we use the definition of χ when the ideal learning with errors problem was initially proposed in [33] where χ is defined on $\mathbb {R}/[0, q)$. Moreover, according to the formal definition of ring-LWE in [21], the error distribution is also continuous over the field tensor product $K\otimes _{\mathbb {Q}}\mathbb {R}$.

Remark 2

A plaintext m is uniquely mapped to a symbol $\lfloor \frac {q}{2} \rfloor \frac {1}{\sqrt {2}}\textit {PolarEnc}(\mathbf {m})$ on the constellation diagram in canonical basis. Then it is switched to polynomial basis and turned into vector v. Note that $\mathbf {v}\in (\mathbb {R}/[0,q))^{n}$ but not in R_q. We see it reasonable since χ is also real and continuous.

Table 1 A RLWE public key encryption scheme with polar encoding and decoding


One may notice in Table 1 that Alice finally derives a mod-$\tilde {B}R_{q}$ channel (or equivalently a mod-${{\varLambda }}^{\prime }$ channel) as in Fig. 1 rather than the mod-${{\varLambda }}^{\prime \prime }$ in formula (8) (or equivalently the mod-$L^{\prime }$ channel in (9)). Questions arise whether the tailored RLWE channel model in formula (8) makes sense and how it will behave if we construct a polar code for the mod-${{\varLambda }}^{\prime \prime }$ channel when we actually have a mod-${{\varLambda }}^{\prime }$ channel. Lemma 3 illustrates the channel degradation relation between the two channels.

Lemma 3

(Channel Degradation Relation Between RLWE Channel and Its Tailored Variant) Let ${{\varLambda }}^{\prime }$ be the constellation space and let ${{\varLambda }}^{\prime \prime }$ be its tailored variant as in Fig. 1. Given the tailored RLWE channel model as in formula (8) with CSI H_i known to the decoder as in Fig. 2, the fading $L^{n}/{{\varLambda }}^{\prime \prime }$ channel is degraded with respect to the fading $L^{n}/{{\varLambda }}^{\prime }$ channel.

Proof

Denote by $W^{\prime }$ the fading $L^{n}/{{\varLambda }}^{\prime }$ channel $y^{\prime }=x+h*z\mod {{{\varLambda }}^{\prime }}$ where $y^{\prime }\in \mathcal {R}({{\varLambda }}^{\prime })$, $x\in L^{n}\cap \mathcal {R}({{\varLambda }}^{\prime })$ is the channel input, h is the channel gain and z is the Gaussian noise. In the same fashion, we define the fading $L^{n}/{{\varLambda }}^{\prime \prime }$ channel $W^{\prime \prime }$ as $y^{\prime \prime }=x+h*z\mod {{{\varLambda }}^{\prime \prime }}$ where $y^{\prime \prime }\in \mathcal {R}({{\varLambda }}^{\prime \prime })$, $x\in L^{n}\cap \mathcal {R}({{\varLambda }}^{\prime \prime })$.

As formula (10) indicates, the $L/L^{\prime }$ fading channel with CSI known to the receiver in formula (9) can be viewed as an independent combination of channel gain h and an $L/L^{\prime }$ Gaussian channel. Therefore, with no loss of generality, we can view the channel gain h as a constant. We can rewrite channel $W^{\prime }$ as $W^{\prime }: y^{\prime }=x+z^{\prime }\mod {{{\varLambda }}^{\prime }}$ and rewrite $W^{\prime \prime }$ as $W^{\prime \prime }: y^{\prime \prime }=x+z^{\prime }\mod {{{\varLambda }}^{\prime \prime }}$ where $z^{\prime }\sim \mathcal {N}(0,h^{2}\sigma ^{2}\mathbb {I})$. The channel transition probability of $W^{\prime }$ is

$$\begin{array}{@{}rcl@{}} W^{\prime}(y^{\prime}|x) &=& \sum\limits_{\lambda^{\prime}\in{{\varLambda}}^{\prime}}g_{(h\sigma)^{2}}(y^{\prime}-x+\lambda^{\prime}), y^{\prime}\in \mathcal{R}({{\varLambda}}^{\prime}) \\ &=& \sum\limits_{\lambda^{\prime}\in{{\varLambda}}^{\prime}}g_{(h\sigma)^{2}}(n^{\prime}+\lambda^{\prime}), n^{\prime}\in \mathcal{R}({{\varLambda}}^{\prime}) \end{array}$$

(12)

where $g_{(h\sigma )^{2}}$ represents the density function of $\mathcal {N}(0,h^{2}\sigma ^{2}\mathbb {I})$ and $n^{\prime }=z^{\prime }\mod {{{\varLambda }}^{\prime }}$. The channel transition probability of $W^{\prime \prime }$ is

$$\begin{array}{@{}rcl@{}} W^{''}(y^{''}|x) &&= \sum\limits_{\lambda^{''}\in{{\varLambda}}^{''}}g_{(h\sigma)^{2}}(y^{''}-x+\lambda^{''}),\;\;\;\;\;\;\;\;\;\;y^{''}\in \mathcal{R}({{\varLambda}}^{''}) \\ &&= \sum\limits_{\lambda^{''}\in{{\varLambda}}^{''}}g_{(h\sigma)^{2}}(n^{''}+\lambda^{''}),\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;n^{''}\in \mathcal{R}({{\varLambda}}^{''}) \\ &&\overset{(a)}{=}\sum\limits_{\lambda^{\prime}\in{{\varLambda}}^{\prime}}g_{(h\sigma)^{2}}(n^{'}\frac{\tilde{B}^{-1}}{\sqrt{2}}+\lambda^{'}\frac{\tilde{B}^{-1}}{\sqrt{2}}),\;\;\;\;\;\;\;n^{'}\in\mathcal{R}({{\varLambda}}^{'}) \\ &&=\sum\limits_{\lambda^{'}\in{{\varLambda}}^{'}}g_{(h\sigma)^{2}}\left(\frac{\tilde{B}^{-1}}{\sqrt{2}}(n^{'}+\lambda^{'})\right),\;\;\;\;\;\;\;n^{'}\in\mathcal{R}({{\varLambda}}^{'}) \\ &&=\sum\limits_{\lambda^{'}\in{{\varLambda}}^{'}}g_{(h\sigma\sqrt{2}\tilde{B})^{2}}(n^{'}+\lambda^{'}),\;\;\;\;\;\;\;\;\;\;\;\;\;n^{'}\in\mathcal{R}({{\varLambda}}^{'}) \end{array}$$

(13)

where $n^{\prime \prime }=z^{\prime }\mod {{{\varLambda }}^{\prime \prime }}$ and the equality (a) is due to the relation ${{\varLambda }}^{\prime }=\sqrt {2}\tilde {B}{{\varLambda }}^{\prime \prime }$, $\lambda ^{\prime }=\sqrt {2}\tilde {B}\lambda ^{\prime \prime }$, and $n^{\prime }\in \mathcal {R}({{\varLambda }}^{\prime })$, $n^{\prime \prime }\in \mathcal {R}({{\varLambda }}^{\prime \prime })$. We observe from equation (13) that channel $W^{\prime \prime }$ is statistically equivalent to $W^{\prime \prime }: y^{\prime \prime } = x + z^{\prime \prime }\mod {{{\varLambda }}^{\prime }}$ where $z^{\prime \prime }\sim \mathcal {N}(0,(h\sigma \sqrt {2}\tilde {B})^{2}\mathbb {I})$. Since the transition probabilities in equation (12) and equation (13) are two ${{\varLambda }}^{\prime }$-periodic Gaussian distributions featured with variances $(h\sigma )^{2} < (h\sigma \sqrt {2}\tilde {B})^{2}$, we can prove $W^{\prime \prime }$ is degraded with respect to $W^{\prime }$ by introducing an intermediate $L^{n}/{{\varLambda }}^{\prime }$ channel $W^{\prime \prime \prime }$ with additive Gaussian noise $z^{\prime \prime \prime }\sim \mathcal {N}(0,(h\sigma \sqrt {2}\tilde {B})^{2}\mathbb {I}-(h\sigma )^{2}\mathbb {I})$ such that $W^{\prime \prime }$ is a concatenation of $W^{\prime }$ and $W^{\prime \prime \prime }$, i.e.,

$$\begin{array}{@{}rcl@{}} (x+z^{\prime\prime})\mod{{{\varLambda}}^{\prime}} &=& (x+z^{\prime}+z^{\prime\prime\prime}) \mod{{{\varLambda}}^{\prime}} \\ &=& \left((x+z^{\prime})\mod{{{\varLambda}}^{\prime}}\right) + z^{\prime\prime\prime} \mod{{{\varLambda}}^{\prime}}. \end{array}$$

The above concatenation satisfies the definition of channel degradation (Definition 3). □

Given the channel degradation relation between the fading $L^{n}/{{\varLambda }}^{\prime }$ channel $W^{\prime }$ and the fading $L^{n}/{{\varLambda }}^{\prime \prime }$ channel $W^{\prime \prime }$, it is guaranteed by Lemma 1 that the polar codes constructed for $W^{\prime \prime }$ can also be used over $W^{\prime }$. How to explicitly construct polar codes is shown in the next section.

5 Polar coding for the tailored RLWE channel

As discussed in Section 2.4, we need a BDMS channel before we can adapt the polar coding method, which includes calculating the Bhattacharyya parameters of the synthesized channels, defining the information set $\mathcal {A}$ and frozen set $\mathcal {A}^{c}$, encoding and SC decoding. We have already shown in Section 3.3 that the fading $L/L^{\prime }$ channel $W:X\rightarrow (\tilde {Y},H)$ in formula (9) is symmetric. Since the channel gain H and the Gaussian noise Z are assumed to be continuous, so is the channel output, and we need to discretize the channel output $H,\tilde {Y}$ before constructing polar codes. An elegant channel quantization scheme was proposed in [18], where the two outputs H and $\tilde {Y}$ are discretized independently with tractable loss of channel capacity. Basically, the channel gain H is discretized into a set of discrete values that occur with equal probability. As for the output $\tilde {Y}$, we decompose the $L/L^{\prime }$ channel into multiple BDMS channels such that the overall channel capacity is preserved up to a negligible loss.

5.1 Quantization of the fading coefficient

As discussed in the previous sections, the fading $L/L^{\prime }$ channel with CSI available to the decoder is statistically equivalent to an independent combination of the fading coefficient H and an $L/L^{\prime }$ channel with additive Gaussian noise of variance (hσ)². Therefore, we first quantize H and then the $L/L^{\prime }$ channel. Let {α_i} be an ascending sequence of the following form

$$\alpha_{1}=1,\alpha_{2},\cdots,\alpha_{m},\alpha_{m+1}=+\infty,$$

so that for 1 ≤ i ≤ m we have

$${\int}_{\alpha_{i}}^{\alpha_{i+1}}P_{H}(h)dh=\frac{1}{m}.$$

We take the centroids of the intervals $(\alpha _{i},\alpha _{i+1})$ as the discretized alphabet ${\mathscr{H}}_{q}=\{h_{i}\}$ for i = 1,⋯ ,m where h_i is calculated as follows.

$$h_{i} = {\int}_{\alpha_{i}}^{\alpha_{i+1}}mhP_{H}(h)dh.$$
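The equal-probability quantization of H can be sketched numerically as follows. The thresholds α_i are found by bisection on a closed-form CDF of H (obtained by integrating $P_{H}$; this expression is not given in the source), and the centroids are computed with a midpoint rule. The infinite upper limit of the last interval is replaced by a very high quantile, which is a numerical assumption, as are the helper names and m = 8.

```python
import math

def pdf_h(h, sigma):
    """Density of H on h > 1."""
    if h <= 1.0:
        return 0.0
    u = h * h - 1.0
    return h * u / (2.0 * sigma ** 4) * math.exp(-u / (2.0 * sigma ** 2))

def cdf_h(h, sigma):
    """Closed-form CDF of H (derived for this sketch)."""
    x = max(h * h - 1.0, 0.0) / (2.0 * sigma ** 2)
    return 1.0 - math.exp(-x) * (1.0 + x)

def quantile_h(p, sigma):
    """Invert cdf_h by bisection on [1, 1 + 50*sigma]."""
    lo, hi = 1.0, 1.0 + 50.0 * sigma
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if cdf_h(mid, sigma) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def quantize_fading(sigma, m, steps=2000):
    """Return thresholds alpha_1..alpha_{m+1} and centroids h_1..h_m."""
    alphas = [1.0] + [quantile_h(i / m, sigma) for i in range(1, m)] + [math.inf]
    cut = quantile_h(1.0 - 1e-12, sigma)    # numerical stand-in for +infinity
    centroids = []
    for i in range(m):
        a, b = alphas[i], min(alphas[i + 1], cut)
        w = (b - a) / steps
        mids = (a + (k + 0.5) * w for k in range(steps))
        centroids.append(m * w * sum(h * pdf_h(h, sigma) for h in mids))
    return alphas, centroids

sigma = math.sqrt(1024 / 2) * 2.0
alphas, centroids = quantize_fading(sigma, m=8)
print([round(h, 2) for h in centroids])
```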

5.2 Degrading transform quantization

As in Fig. 2 we view the tailored RLWE channel as an i.i.d. fading channel. For such a channel, polar codes are constructed in [18] to achieve the ergodic capacity C(W) as long as the receiver knows the CSI and the transmitter knows the CDI. Given n ($n=2^{l}, l\in \mathbb {Z}$) i.i.d. tailored RLWE channels $W:X\rightarrow (\tilde {Y},H)$, we define the channel input as $X^{1:n}=U^{1:n}G_{n}$ where $U^{1:n}\in \{0,1\}^{n}$ and G_n is the generator matrix ^{Footnote 9}. We obtain n synthesized channels $W_{n}^{(i)}:U^{(i)}\rightarrow (U^{1:i-1},\tilde {Y}^{1:n},H^{1:n})$ for 1 ≤ i ≤ n by performing channel combining and channel splitting. The Bhattacharyya parameter for W is defined as

$$Z(W)\overset{\triangle}{=}\sum\limits_{\tilde{y},h}\sqrt{P_{\tilde{Y},H|X}(\tilde{y},h|0)P_{\tilde{Y},H|X}(\tilde{y},h|1)}.$$
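The two ingredients introduced above, the encoding map $X^{1:n}=U^{1:n}G_{n}$ and the Bhattacharyya parameter, can be sketched in a few lines. The sketch assumes that G_n is the l-fold Kronecker power of the kernel [[1,0],[1,1]] without a bit-reversal permutation; the source defines G_n in a footnote that may use a different ordering. The Bhattacharyya function is written for a channel with a finite output alphabet, with a BSC(0.1) as an arbitrary example.

```python
import math

def polar_transform(u):
    """Compute x = u * F^{(x) l} over GF(2), F = [[1, 0], [1, 1]], without bit reversal."""
    n = len(u)
    assert n > 0 and n & (n - 1) == 0, "block length must be a power of two"
    x = list(u)
    step = 1
    while step < n:
        for start in range(0, n, 2 * step):
            for k in range(start, start + step):
                x[k] ^= x[k + step]      # butterfly: upper branch absorbs lower branch
        step *= 2
    return x

def bhattacharyya(p_given_0, p_given_1):
    """Z(W) for a finite output alphabet, given the two conditional pmfs."""
    return sum(math.sqrt(a * b) for a, b in zip(p_given_0, p_given_1))

codeword = polar_transform([1, 0, 1, 1])
z_bsc = bhattacharyya([0.9, 0.1], [0.1, 0.9])
print(codeword, z_bsc)
```

Under this assumption on G_n the transform is its own inverse over GF(2), so applying `polar_transform` twice returns the original vector.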

To compute $Z(W_{n}^{(i)})$ efficiently, we employ the degrading transform proposed in [34] to quantize a BMS channel W with continuous output alphabet into a degraded and approximated BDMS channel W_Q with finite output alphabet size. Intuitively, the finer the discretized output alphabet is, the better W_Q approximates W. Since we have already discretized H as h_i for i = 1,⋯ ,m, we can consider h_i as a constant and quantize the $L/L^{\prime }$ channel $W_{h_{i}}:X,h_{i}\rightarrow \tilde {Y}$ for each h_i.

We define the likelihood ratio (LR) of a channel W as

$$\lambda(\tilde{y},h_{i}){:=}\frac{W_{\tilde{Y}|X,h_{i}}(\tilde{y}|0,h_{i})}{W_{\tilde{Y}|X,h_{i}}(\tilde{y}|\lfloor\frac{q}{2}\rfloor\frac{1}{\sqrt{2}},h_{i})},$$

(14)

where the transition probability $W_{\tilde {Y}|X,h_{i}}$ is

$$\begin{array}{@{}rcl@{}} && W_{\tilde{Y}|h_{i},X}(\tilde{y}|0,h_{i}) = f_{L^{\prime},0,{h_{i}^{2}}\sigma^{2}}(\tilde{y}) = \sum\limits_{\lambda\in L^{\prime}}g_{0,(h_{i}\sigma)^{2}}(\tilde{y}+\lambda),\\ && W_{\tilde{Y}|h_{i},X}(\tilde{y}|\lfloor\frac{q}{2}\rfloor\frac{1}{\sqrt{2}},h_{i}) = f_{L^{\prime},\lfloor\frac{q}{2}\rfloor\frac{1}{\sqrt{2}},{h_{i}^{2}}\sigma^{2}}(\tilde{y}) = \sum\limits_{\lambda\in L^{\prime}}g_{\lfloor\frac{q}{2}\rfloor\frac{1}{\sqrt{2}},(h_{i}\sigma)^{2}}(\tilde{y}+\lambda). \end{array}$$

Figure 5 depicts $W_{\tilde {Y}|X,h_{i}}$ and $\lambda (\tilde {y},h_{i})$ by giving some examples when q = 12289,r = 2 and h_i = 10,30.

We can see in Fig. 5 that the channel $W_{h_{i}}:X, h_{i}\rightarrow \tilde {Y}$ is BMS with $\tilde {Y}$ continuously located over the interval $[0,q/\sqrt {2})$. There exists a permutation function $\pi (\tilde {y})=(\lfloor \frac {q}{2}\rfloor \frac {1}{\sqrt {2}}-\tilde {y})\mod q/\sqrt {2}$ such that $W(\tilde {y}|0,h_{i})=W(\pi (\tilde {y})|\lfloor \frac {q}{2}\rfloor \frac {1}{\sqrt {2}},h_{i})$. Intuitively, the BMS channel $W_{h_{i}}$ can be decomposed into infinitely many binary symmetric channels (BSCs) $W_{c}:X,h_{i}\rightarrow \tilde {Y}_{c}$ where the output is $\tilde {Y}_{c}\in \{y_{c},\pi (y_{c})\}$ for continuous $y_{c}\in [0,q/\sqrt {2})$, $X\in \{0,\lfloor \frac {q}{2}\rfloor \frac {1}{\sqrt {2}}\}$ and the crossover probability is determined by the corresponding probability densities $W(\tilde {y}_{c}|X,h_{i})$. If we focus on the likelihood ratio $\lambda (\tilde {y}_{c},h_{i})\geq 1$, the crossover probability of the BSC W_c is $\frac {1}{\lambda (\tilde {y}_{c},h_{i})+1}$. The capacity of this BSC is

$$C[\lambda(\tilde{y}_{c},h_{i})]= 1-\frac{\lambda(\tilde{y}_{c},h_{i})}{\lambda(\tilde{y}_{c},h_{i})+1}\log{\frac{\lambda(\tilde{y}_{c},h_{i})+1}{\lambda(\tilde{y}_{c},h_{i})}}- \frac{1}{\lambda(\tilde{y}_{c},h_{i})+1} \log{(\lambda(\tilde{y}_{c},h_{i})+1)},$$

where $\lambda (\tilde {y}_{c},h_{i})\geq 1$. Quantitatively, the continuous decomposition of $W_{h_{i}}$ preserves the channel capacity in the sense that

$$C(W_{h_{i}})={\int}_{\lambda(\tilde{y},h_{i})\geq1} (W_{\tilde{Y}|X,h_{i}}(\tilde{y}|0,h_{i})+W_{\tilde{Y}|X,h_{i}}(\tilde{y}|\lfloor\frac{q}{2}\rfloor\frac{1}{\sqrt{2}},h_{i})) C[\lambda(\tilde{y},h_{i})] d\tilde{y},$$

where the integral interval is restricted to $\tilde {y}$ such that $\lambda (\tilde {y},h_{i})\geq 1$. If we ignore the subtle geometrical error introduced by rounding ⌊⋅⌋, we can observe a symmetry feature in the graphs in Fig. 5 and we find that the valid integral interval is

$$A:=[0,\lfloor\frac{q}{2}\rfloor\frac{1}{2\sqrt{2}}] \cup [\lfloor\frac{q}{2}\rfloor\frac{3}{2\sqrt{2}}, q\frac{1}{\sqrt{2}}].$$
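The capacity integral over A can be evaluated numerically as a rough check. The sketch below implements C[λ] as given above (with logarithms taken to base 2), approximates the $L^{\prime }$-periodic densities by truncated sums, and applies a midpoint rule on the two pieces of A. The truncation rule, step count and the sample gains h ∈ {10, 30, 100} are illustrative assumptions, not values prescribed by the source.

```python
import math

def capacity_from_lr(lam):
    """C[lam]: capacity in bits of the BSC with crossover probability 1/(lam+1)."""
    if lam == 0.0 or math.isinf(lam):
        return 1.0
    return (1.0 - lam / (lam + 1.0) * math.log2((lam + 1.0) / lam)
            - math.log2(lam + 1.0) / (lam + 1.0))

def periodic_gaussian(y, mean, std, period):
    """Truncated L'-periodic Gaussian density evaluated at y."""
    terms = int(8.0 * std / period) + 2
    total = sum(math.exp(-((y + k * period - mean) ** 2) / (2.0 * std * std))
                for k in range(-terms, terms + 1))
    return total / (math.sqrt(2.0 * math.pi) * std)

def capacity_given_h(h, q=12289, n=1024, r=2.0, steps=4000):
    """Midpoint-rule approximation of C(W_h) over the interval A."""
    x1 = (q // 2) / math.sqrt(2)
    period = q / math.sqrt(2)
    std = h * math.sqrt(n / 2) * r
    total = 0.0
    for a, b in ((0.0, x1 / 2), (1.5 * x1, period)):
        dy = (b - a) / steps
        for k in range(steps):
            y = a + (k + 0.5) * dy
            w0 = periodic_gaussian(y, 0.0, std, period)
            w1 = periodic_gaussian(y, x1, std, period)
            if w0 == 0.0 and w1 == 0.0:
                continue                  # both densities underflow: no contribution
            lam = w0 / w1 if w1 > 0.0 else math.inf
            total += (w0 + w1) * capacity_from_lr(lam) * dy
    return total

caps = {h: capacity_given_h(h) for h in (10.0, 30.0, 100.0)}
print(caps)
```

As expected for a fading channel, the capacity conditioned on the gain decreases as h grows, because the effective noise standard deviation hσ grows with it.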

We divide the interval A into ν segments A_j for j ∈ [ν] such that

$$\begin{array}{@{}rcl@{}} A_{j} &=& \left \{ \tilde{y}\in A:\frac{j-1}{\nu} \le C[\lambda(\tilde{y},h_{i})] < \frac{j}{\nu} \right \} \\ &=& \left \{\tilde{y}\in A: { \frac{1}{\mathfrak{h_{2}^{-1}}\left(\frac{\nu-j+1}{\nu}\right)}-1} \le \lambda(\tilde{y},h_{i}) < {\frac{1}{\mathfrak{h_{2}^{-1}}\left(\frac{\nu-j}{\nu}\right)}-1} \right \}, \end{array}$$

where $\mathfrak {h_{2}}(\cdot )$ is the binary entropy function. Each A_j corresponds to a BSC with crossover probability

$$p_{j}=\frac{{\int}_{A_{j}}W_{\tilde{Y}|X,h_{i}}(\tilde{y}|\lfloor\frac{q}{2}\rfloor\frac{1}{\sqrt{2}},h_{i})d\tilde{y}}{{\int}_{A_{j}}W_{\tilde{Y}|X,h_{i}}(\tilde{y}|\lfloor\frac{q}{2}\rfloor\frac{1}{\sqrt{2}},h_{i})d\tilde{y}+{\int}_{A_{j}}W_{\tilde{Y}|X,h_{i}}(\tilde{y}|0,h_{i})d\tilde{y}},$$

(15)

where

$$\begin{array}{@{}rcl@{}} {\int}_{A_{j}}W_{\tilde{Y}|X,h_{i}}(\tilde{y}|0,h_{i})d\tilde{y}&=&{\int}_{A_{j}}\sum\limits_{\lambda\in L^{\prime}}g_{0,(h_{i}\sigma)^{2}}(\tilde{y}+\lambda)d\tilde{y}\\ {\int}_{A_{j}}W_{\tilde{Y}|X,h_{i}}(\tilde{y}|\lfloor\frac{q}{2}\rfloor\frac{1}{\sqrt{2}},h_{i})d\tilde{y} &=&{\int}_{A_{j}}\sum\limits_{\lambda\in L^{\prime}}g_{\lfloor\frac{q}{2}\rfloor\frac{1}{\sqrt{2}},(h_{i}\sigma)^{2}}(\tilde{y}+\lambda)d\tilde{y}. \end{array}$$

Since the lattice $L^{\prime }$ is infinite, we numerically approximate $f_{L^{\prime },0,{h_{i}^{2}}\sigma ^{2}}(\tilde {y})$ and $f_{L^{\prime },\lfloor \frac {q}{2}\rfloor \frac {1}{\sqrt {2}},{h_{i}^{2}}\sigma ^{2}}(\tilde {y})$, e.g., by truncating the sums over $L^{\prime }$, and then obtain $\lambda (\tilde {y},h_{i})$, $A_{j}$ and p_j.
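The binning into segments A_j and the crossover probabilities p_j of formula (15) can be sketched on a grid over A. This is only an illustrative approximation under stated assumptions: the periodic densities are truncated sums, the integrals are midpoint sums, points are assigned to A_j through the capacity form of the definition, and the values h_i = 30 and ν = 50 are example choices. The function also returns the capacity of the grid-sampled channel and of its quantized version for comparison.

```python
import math

def periodic_gaussian(y, mean, std, period):
    """Truncated L'-periodic Gaussian density evaluated at y."""
    terms = int(8.0 * std / period) + 2
    total = sum(math.exp(-((y + k * period - mean) ** 2) / (2.0 * std * std))
                for k in range(-terms, terms + 1))
    return total / (math.sqrt(2.0 * math.pi) * std)

def h2(p):
    """Binary entropy function in bits."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def degrade(q, n, r, h, nu, steps=4000):
    """Return {j: p_j}, the capacity before quantization and the capacity after."""
    x1 = (q // 2) / math.sqrt(2)
    period = q / math.sqrt(2)
    std = h * math.sqrt(n / 2) * r
    bins = {}
    capacity_w = 0.0
    for a, b in ((0.0, x1 / 2), (1.5 * x1, period)):
        dy = (b - a) / steps
        for k in range(steps):
            y = a + (k + 0.5) * dy
            w0 = periodic_gaussian(y, 0.0, std, period) * dy
            w1 = periodic_gaussian(y, x1, std, period) * dy
            if w0 + w1 == 0.0:
                continue                      # both densities underflow
            c = 1.0 - h2(w1 / (w0 + w1))      # equals C[lambda] at this point
            j = min(int(c * nu), nu - 1)      # 0-based index of the segment A_j
            acc = bins.setdefault(j, [0.0, 0.0])
            acc[0] += w0
            acc[1] += w1
            capacity_w += (w0 + w1) * c
    crossover = {j + 1: w1 / (w0 + w1) for j, (w0, w1) in sorted(bins.items())}
    capacity_q = sum((w0 + w1) * (1.0 - h2(w1 / (w0 + w1))) for w0, w1 in bins.values())
    return crossover, capacity_w, capacity_q

crossover, capacity_w, capacity_q = degrade(12289, 1024, 2.0, 30.0, 50)
print(f"non-empty segments: {len(crossover)}")
print(f"C before: {capacity_w:.5f}, C after: {capacity_q:.5f}")
```

On the same grid the quantized capacity cannot exceed the unquantized one, because merging outputs within a segment averages their crossover probabilities and the BSC capacity is convex in the crossover probability.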

If we define z_j and its conjugate $\bar {z}_{j}$ to be the channel output of the BSC associated with A_j, we will obtain the discretized output alphabet of $W_{h_{i}}$ as

$$\mathcal{Z}:=\{z_{1},\bar{z}_{1},z_{2},\bar{z}_{2},\cdots,z_{\nu},\bar{z}_{\nu}\}.$$

If we denote by W_Q the discretized version of the original fading $L/L^{\prime }$ channel $W:X\rightarrow \tilde {Y},H$, the output alphabet of W_Q is ${\mathscr{H}}_{q}\otimes \mathcal {Z}:=\{h_{i}\}\otimes \{z_{1},\bar {z}_{1},\cdots ,z_{\nu },\bar {z}_{\nu }\}$ for i ∈ [m] and j ∈ [ν] where ⊗ denotes the Cartesian product of two sets.

Lemma 4

The channel $W_{Q}:X\rightarrow Z,H_{q}$ is degraded with respect to W.

Proof

We supply an intermediate channel $W_{P}:(\tilde {Y},H)\rightarrow (Z,H_{q})$ such that

$$\begin{array}{@{}rcl@{}} W_{P}(z,h_{q}|\tilde{y},h)=\begin{cases} \begin{array}{c} 1, \\ 1,\\ 0, \end{array}&\begin{array}{c} \text{if}~z=z_{j}, \tilde{y}\in A_{j},~\text{and}~h_{q}=h_{i},h\in[\alpha_{i},\alpha_{i+1}),\\ \text{if}~z=\bar{z}_{j}, \pi(\tilde{y})\in A_{j},~\text{and}~h_{q}=h_{i},h\in[\alpha_{i},\alpha_{i+1}),\\ \text{otherwise}. \end{array} \end{cases} \end{array}$$

We observe a channel degradation relation such that

$$W_{Q}(z,h_{q}|x)=\int W_{\tilde{Y},H|X}(\tilde{y},h|x)W_{P}(z,h_{q}|\tilde{y},h)d\tilde{y}\;dh.$$

□

Corollary 1

Given that $W_{Q}:X\rightarrow Z,H_{q}$ is degraded with respect to W, the capacity, Bhattacharyya parameter and frame error rate of the two channels are related as

$$\begin{array}{@{}rcl@{}} C(W_{Q}) &\leq C(W),~Z(W_{Q})\geq Z(W),~P_{e}(W_{Q})\geq P_{e}(W). \end{array}$$

Proof

As a corollary of Lemmas 2 and 4. □

It is shown in [34] that the capacity loss introduced by the degrading transform is no greater than 1/ν. If we choose large alphabet sizes m and 2ν, the loss of capacity is negligible, and so are the changes in Z(⋅) and P_e(⋅). We also verified our channel quantization scheme numerically with respect to the channel capacity. As shown in Fig. 6, for m = 20, ν = 50 and multiple choices of r, C(W_Q) is close to C(W), with only a negligible difference.

To summarize, the degrading transform converts the RLWE channel W with continuous output alphabet into a BDMS channel W_Q with finite output alphabet, which can be viewed as a combination of m × ν BSCs. In this way, one can construct polar codes for W_Q that also apply to W.
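Viewing W_Q as a combination of BSCs makes its capacity and Bhattacharyya parameter easy to evaluate: each is the weighted average of the corresponding quantity of the component BSCs. The sketch below assumes the component weights (the probability that the output falls in the pair $z_{j},\bar{z}_{j}$ for a given h_i) and crossover probabilities have already been computed; the function name `bsc_mixture_stats` is hypothetical.

```python
import math

def h2(p):
    """Binary entropy function in bits."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)

def bsc_mixture_stats(components):
    """Capacity and Bhattacharyya parameter of a mixture of BSCs.

    components: list of (weight, crossover) pairs whose weights sum to 1.
    """
    capacity = sum(w * (1.0 - h2(p)) for w, p in components)
    bhattacharyya = sum(w * 2.0 * math.sqrt(p * (1.0 - p)) for w, p in components)
    return capacity, bhattacharyya
```

For instance, an equal mixture of a noiseless BSC (p = 0) and a useless BSC (p = 1/2) has capacity 1/2 and Bhattacharyya parameter 1/2.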

5.3 Polar encoding and SC decoding

5.3.1 Encoding algorithm PolarEnc(⋅)

Given the BDMS channel W_Q derived by channel quantization, we can adapt the polar encoding and decoding method introduced in Section 2.4 to W_Q. Recall that the output alphabet of W_Q has size m × 2ν. As the channel combining and splitting recursion proceeds, the alphabet size of the synthesized channels $W_{Qn}^{(i)}$ increases exponentially. To handle this problem, we employ an approximation method proposed in [27], which reduces the alphabet size of a BDMS channel with negligible and tractable loss of performance by merging some of the output symbols.

After we finish computing the Bhattacharyya parameters of all the $W_{Qn}^{(i)}$, we can define the information set $\mathcal {A}$ and frozen set $\mathcal {A}^{c}$. Recall the encoding algorithm PolarEnc(m) in Table 1. We construct polar codes for plaintext m = u^1:n as

$$\begin{array}{@{}rcl@{}} x^{1:n} &=& \textit{PolarEnc}(\textbf{m}) \\ &=& u_{\mathcal{A}}G_{N}(\mathcal{A})\oplus u_{\mathcal{A}^{c}}G_{N}(\mathcal{A}^{c}), \end{array}$$

where $u_{\mathcal {A}}$ is the information vector and $u_{\mathcal {A}^{c}}$ is the frozen vector. The complexity of encoding is $O(n\log n)$ where n is equal to the degree of the cyclotomic field of RLWE.
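A minimal sketch of such an encoder is given below. It assumes 0-based bit positions, an all-zero frozen vector, and the natural-order transform (the Kronecker power of the 2 × 2 kernel without a bit-reversal permutation); these conventions and the names `polar_transform` and `polar_enc` are illustrative assumptions rather than the exact convention of Table 1. The XOR counter is included only to make the operation count visible.

```python
def polar_transform(u):
    """Apply the n x n polar transform to the bit list u (n a power of two).

    Returns the codeword and the number of XOR operations performed.
    """
    x = list(u)
    n = len(x)
    assert n > 0 and n & (n - 1) == 0, "block length must be a power of two"
    xor_count = 0
    step = 1
    while step < n:
        # One butterfly stage: n/2 XORs, independent of the bit values.
        for start in range(0, n, 2 * step):
            for i in range(start, start + step):
                x[i] ^= x[i + step]
                xor_count += 1
        step *= 2
    return x, xor_count

def polar_enc(message_bits, info_set, n):
    """Place message bits on the information set, zeros elsewhere, then transform."""
    info = sorted(info_set)
    assert len(message_bits) == len(info)
    u = [0] * n
    for pos, bit in zip(info, message_bits):
        u[pos] = bit
    return polar_transform(u)
```

The loop structure depends only on n, so the number of XORs is (n log₂ n)/2 for every input.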

5.3.2 Decoding algorithm PolarDec(⋅)

The decoding algorithm PolarDec(⋅) is exactly the same as the so-called successive cancellation (SC) decoding initially proposed in [5]. Upon receiving the signal $\tilde {y}^{1:n}$ (i.e. $\tilde {y}^{1:n}=\tilde {B}\mathbf {y}$ in Table 1) and invoking their knowledge of the CSI h^1:n, the recipient applies SC decoding to $\tilde {y}^{1:n},h^{1:n}$ and gives an estimate $\bar {u}^{1:n}$ of u^1:n as

$$\begin{array}{@{}rcl@{}} \bar{u}^{(i)}=\begin{cases} \begin{array}{l} \text{arg~max}_{u^{(i)}\in\{0,1\}}W_{n}^{(i)}(\tilde{y}^{1:n},h^{1:n},\bar{u}^{1:i-1}|u^{(i)}),\\ 0, \end{array} & \begin{array}{c} \text{if}~i\in\mathcal{A}\\ \text{otherwise} \end{array}\end{cases}, \end{array}$$

where the transition probabilities of synthesized channels $W_{n}^{(i)}(\cdot |\cdot )$ can be recursively calculated by SC decoding algorithm with complexity $O(n\log n)$. Details of SC decoding can be found in Appendix A. A frame error occurs if $\bar {u}^{1:n}\neq u^{1:n}$; we may interchangeably use frame error probability and DFR in this work. Additionally, PolarEnc(⋅) and PolarDec(⋅) require constant steps of operations for fixed choices of $n,\mathcal {A}$, making isochronous implementations possible. Details about isochrony will be discussed in Section 7.2.

6 Results: Performance analysis and improvement

According to Theorem 2, the frame error probability $P_{e}(n,R,\mathcal {A})$ of SC decoding is upper bounded by the sum of $Z(W_{n}^{(i)})$. Since $W_Q \preceq W$ and $W_{Qn}^{(i)}\preceq W_{n}^{(i)}$ according to Lemma 1, we derive

$$P_{e}(n,R,\mathcal{A})\leq \sum\limits_{i\in\mathcal{A}}Z(W_{Qn}^{(i)}).$$

(16)
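To illustrate how a bound of the form (16) is evaluated, the sketch below replaces the exact values $Z(W_{Qn}^{(i)})$ with the standard recursive bounds from [5], namely Z(W⁻) ≤ 2Z − Z² and Z(W⁺) = Z². This is an assumption made for brevity and is not the construction used in this work, which computes the Bhattacharyya parameters through the merging method of [27]; the resulting numbers are therefore looser. The function names are hypothetical, and the information set is simply taken to be the indices with the smallest bounds.

```python
def bhattacharyya_upper_bounds(z0, levels):
    """Upper bounds on Z of the 2**levels synthesized channels, starting from Z(W) = z0."""
    zs = [z0]
    for _ in range(levels):
        nxt = []
        for z in zs:
            nxt.append(2.0 * z - z * z)  # bound for the "minus" (worse) channel
            nxt.append(z * z)            # exact value for the "plus" (better) channel
        zs = nxt
    return zs

def union_bound(z0, levels, num_info):
    """Sum of the num_info smallest bounds: an upper bound on the frame error rate."""
    zs = sorted(bhattacharyya_upper_bounds(z0, levels))
    return sum(zs[:num_info])
```

With n = 1024 one would call `union_bound(z0, 10, 256)` for rate R = 0.25, where `z0` stands for the Bhattacharyya parameter of W_Q.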

Recall from Fig. 6 that the capacity of our tailored RLWE channel deteriorates dramatically because we use a tailored and shrunk constellation diagram. As a result, for most choices of r that are believed to be secure in RLWE-based PKE, we cannot obtain a DFR lower than 2^−128, which is used as a benchmark in NIST standardization. As explained in Section 3.2, we carefully and conservatively choose a cube ${{\varLambda }}^{\prime \prime }$ which is enclosed in the maximal sphere inscribed in ${{\varLambda }}^{\prime }$. Almost surely there are other valid choices of ${{\varLambda }}^{\prime \prime }$ larger than the one we choose, though it is not easy to determine the optimal one. A pragmatic solution is to gradually scale ${{\varLambda }}^{\prime \prime }$ up by a factor t ≥ 1 and run simulations for each t to check whether the numerical results for P_e agree with the upper bound in formula (16). We highlight that if t is not larger than some critical point, the channel degradation relation in Lemma 3 still holds. Therefore, the theoretical upper bound on P_e still applies after we scale the modulation constellation ${{\varLambda }}^{\prime \prime }$. Please refer to Remark 3 for further explanation.

Figure 7 compares the upper bounds on the frame error probability P_e with our simulation results in the setting q = 12289, n = 1024, r = 1. The solid lines indicate the upper bounds on P_e for different code rates R. The solid lines with stars represent our simulation results which, for reasonably small DFR, comply with the upper bound. We aim to achieve P_e = 2^−128 at code rate R = 0.25. This is clearly unachievable when the scale factor is t = 1. We gradually increase t and obtain the corresponding estimate of P_e. The decoding performance improves significantly with a slightly larger t, e.g., P_e is smaller than 10^−60 (≈ 2^−200) at R = 0.25 for t = 2. When t = 2, the experimental result represented by the red star also complies with its corresponding theoretical estimate, i.e., the red solid line. This implies that our estimate of P_e for t = 2 is reliable to some extent. Please note that all these experiments target relatively large P_e, which is feasible to verify.

Figure 8 can be interpreted in the same manner as Fig. 7. The only parameter that differs is r = 2. The solid lines in different colors represent our estimates of P_e and the stars are our simulation results. By making the scale factor t as large as 6, the target R and P_e can be achieved. For the relatively large P_e shown in the graph, our simulation results comply with our estimates when t = 6,7,9,11,12. However, when t = 14, the simulation results are worse than our estimates, implying that the constellation diagram ${{\varLambda }}^{\prime \prime }$ is too large and goes beyond the valid domain.

In Fig. 9, r = 2.83. Our estimates are effective for t = 8,12 but fail for t > 12. In Fig. 10, none of our simulation results comply with the estimates, which implies that the scaling method does not apply for r ≥ 3.46.

Remark 3

The error sources for the scaled and tailored RLWE channel model are summarized as follows.

(a) As t increases, the constellation space ${{\varLambda }}^{\prime \prime }$ may go beyond ${{\varLambda }}^{\prime }$ and our model will fail to describe the statistical features of the real channel.
(b) The SC decoder takes $\tilde {B}\textbf {y}$ to be the channel output of a fading $L^{n}/{{\varLambda }}^{\prime \prime }$ channel while it is actually a fading $L^{n}/ {{\varLambda }}^{\prime }/ {{\varLambda }}^{\prime \prime }$ channel according to Table 1. This is because Alice first performs a mod R_q operation and then calculates $\tilde {B}\textbf {y}$ upon receiving y from Bob. For small r, the two channels have quite close distributions, but they become less alike as r grows. This explains why our model fails when r ≥ 3.46 in Fig. 10.
(c) It might be misunderstood that for any t > 1 the theoretical estimate in formula (16) would not apply. This is not the case. As stated in Section 3.2, the constellation ${{\varLambda }}^{\prime }$ shrinks by a factor of $\sqrt {n}$ in length and becomes the tailored one ${{\varLambda }}^{\prime \prime }$. In view of (a) and (b), slightly increasing ${{\varLambda }}^{\prime \prime }$ will not affect the soundness of the channel degradation relation and formula (16) if t does not go beyond some critical point. Finding such a point is nontrivial. That is why we run simulations to explore the relation between t, r and DFR. The disadvantage of this pragmatic method is that we cannot verify the small P_e of cryptographic interest.

7 Security analysis

7.1 Security improvement by new DFR

We define the concrete bit-security to be $\log _{2}$ of the time complexity of certain attacks breaking a scheme with specific parameters of interest. We analyze the concrete bit-security of the proposed RLWE-based PKE by considering the best known generic attacks against ring-LWE and the corresponding cost models. A comprehensive survey of a variety of generic attacks and cost models can be found in [1, 2]. Since the proposed RLWE-based PKE differs from NewHope solely in the way the plaintext is encoded and decoded, and the error-correcting code itself does not affect the security reduction, the security estimation of NewHope [4] can be extended to our case.

Following the security estimation in [4, 19], we focus on two generic attacks: (a) a primal attack, which consists of constructing a unique shortest vector problem (uSVP) instance from the given LWE samples and solving it using the block Korkin–Zolotarev (BKZ) algorithm with classical/quantum sieving; and (b) a dual attack, which searches for the shortest vector in a dual lattice constructed from LWE samples using BKZ with classical/quantum sieving. We employ the cost model in [4], where the cost of BKZ with classical/quantum sieving is 0.292β/0.265β, respectively, with β the block dimension of BKZ. In Table 2, we summarize the security estimates of the two attacks, where the cost is defined as $\log _{2}$ of the time complexity of BKZ (see footnote 10). Note that a variant of the dual attack is used by the estimator, which makes the cost different from [4].

There is a trade-off between the DFR and the bit-security level of RLWE-based PKE. Basically, a larger error term (or a larger binomial parameter k in NewHope) gives better security but a worse DFR. The motivation of this work is to employ polar codes to give a safer DFR margin such that we can improve the bit-security level while achieving the target DFR. In NIST standardization, this target DFR is 2^−128. A more conservative target of 2^−140 is used in the literature [13, 32].

Table 2 illustrates the DFR and bit-security level of RLWE-based PKE using our polar coding scheme for different choices of binomial parameter k ($r=\sqrt {k/2}$) and scale factor t. As discussed in the previous section, the scale factor of the constellation diagram cannot be larger than 12 for k = 8, otherwise the estimate of the DFR is no longer valid. We select a more conservative choice t = 11 and achieve DFR = 2^−298 for n = 1024, q = 12289, k = 8 using our polar coding scheme, which is smaller than the DFR 2^−216 of NewHope round 2 in the same setting. As discussed for Fig. 10, our calculation of the DFR for k ≥ 24 (r ≥ 3.46) no longer applies.
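The correspondence between the binomial parameter k and the Gaussian parameter r used throughout Section 6 can be checked with a few lines. The sketch assumes only the relation $r=\sqrt {k/2}$ stated above (the centered binomial distribution with parameter k has variance k/2); the function name is hypothetical.

```python
import math

def gaussian_parameter(k):
    """Gaussian parameter r matched to the binomial parameter k via r = sqrt(k/2)."""
    return math.sqrt(k / 2.0)

for k in (8, 16, 24):
    # k = 8, 16, 24 correspond to r = 2, 2.83, 3.46 as used in Figs. 8, 9 and 10
    print(k, round(gaussian_parameter(k), 2))
```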

In conclusion, our polar coding scheme and the selected parameters provide the RLWE-based PKE with a bit-security of at least 256 bits while achieving the target DFR 2^−140 (and also 2^−128). This is a considerable improvement compared with NewHope round 2, which offers a bit-security of 235 bits with the same parameters. State-of-the-art studies of this kind fall into two categories. In [13], LDPC and BCH codes are used to increase the bit-security to 309 bits while achieving a DFR of 2^−140. However, their DFR estimation relies heavily on an “independence” assumption and their error-correcting algorithms are not isochronous. The other approach, proposed by Song et al. in [32], gives a tighter bound on the DFR of NewHope and increases the bit-security to 252 bits.

Table 2 Improved bit-security of RLWE PKE using polar codes for n = 1024,q = 12289

7.2 Resilience against timing-based attacks

When error-correcting codes are adapted to RLWE-based PKE, a major concern is the resilience against timing-based attacks. Discussions of this kind can be found in [19, 31]. We employ a semi-formal definition of constant-time algorithms which is called “isochrony” in [15]. We view an algorithm to be isochronous if its execution time is independent of the sensitive part of its input and output. This is a weaker notion than the conventional definition but suffices to argue security against timing attacks. We will justify the isochrony of polar encoding and decoding in this section.

Encoding

As introduced in Section 2.4 as well as Section 5.3, the encoding of polar codes takes the plaintext u^1:n as input and yields codewords as in equation (1). The block length n is equal to the degree of the cyclotomic field of RLWE. The encoding process comprises exactly $\frac {n\log n}{2}$ XOR operations no matter what the plaintext u^1:n is. This can be verified by some trivial examples as in Fig. 11. Note that it is sensible to carry out the calculation of Bhattacharyya parameters for the synthesized channels offline, because they are determined by the distribution of the residue noise term e ⋅ t − s ⋅ e₁ + e₂ and can be computed once and for all. Therefore, the encoding is isochronous.

Decoding

As detailed in Appendix A, SC decoding comprises three types of operations: (1) recursive calculation of the transition probabilities $W_{n}^{(i)}$ as in Algorithm 2; (2) comparisons of two transition probabilities as in line 9 of Algorithm 1; and (3) XOR logical operations as in Algorithm 3. As in [15], we prove SC decoding to be isochronous by showing that its timing is irrelevant to the sensitive information of the protocol. Regarding the decoding of RLWE PKE, the sensitive information includes the input $\tilde {B}\mathbf {y}$ in Table 1 (we use the shorthand notation y^1:n) and the output $\bar {u}^{1:n}$ (i.e. the decoding result of plaintext u^1:n) of SC decoding, as well as the secret terms e,s,t,e₁,e₂ separately generated by each side of the protocol. Note that the information set $\mathcal {A}$ and its complement $\mathcal {A}^{c}$ are determined by the distribution of the secret terms and the block length n, which are publicly known. The frozen vector (e.g., an all-zero vector) is also publicly known. Table 3 illustrates what types of operations are isochronous with respect to the sensitive information.

Firstly, the recursive calculations of $W_{n}^{(i)}$ are isochronous because their timings are irrelevant to any sensitive information. As described in Appendix A, for any fixed n an SC decoder carries out exactly $n\log n$ transition probability assemblies as in equations (17) and (18). Normally, these assemblies are floating-point operations. We use transition probabilities rather than the more popular likelihood ratio recursions to avoid floating-point divisions, which are considered difficult for isochronous implementations [29, 37].

Secondly, the floating-point comparisons of two transition probabilities in Algorithm 1 are the decision-making process which yields the output $\bar {u}^{1:n}$. Generally speaking, comparing two close floating-point values would take longer, but it is equally likely to return True and False nonetheless. Therefore, it makes sense to consider the timings of this type of operations irrelevant to $\bar {u}^{1:n}$. In addition, the overall running time taken by comparisons is relevant to $\mathcal {A}$ and n because comparisons only take place for information set $\mathcal {A}$. Other sensitive information is not related to comparison operations.

Thirdly, the XOR logical operations in Algorithm 3 are the same as what happens in encoding. The quantity of XOR operations carried out by Algorithm 3 is uniquely determined by block length n.

We conclude that the encoding and decoding are isochronous with respect to sensitive information including the plaintext, the input and output of SC decoding and the secret terms e,s,t,e₁,e₂.

Table 3 Relevance between timings of operations and information: relevant: $\checkmark$, irrelevant: ×

8 Conclusions

We have presented the first example of a polar coding technique to improve the DFR of RLWE-based PKE which takes advantage of viewing the protocol as a fading channel with CSI known to the decoder. Moreover, switching from the polynomial basis to the canonical basis removes the dependency existing in the residue noise term. The constellation space is tailored to derive an i.i.d. fading channel at the cost of decoding performance, and a scaling method is employed to counteract the performance loss. Both numerical and theoretical results are given to verify the DFR estimation. The advantages of our method are as follows.

We derive an i.i.d. channel model of the residue noise term in H space using canonical embedding. The advantage that some knowledge of noise term is known by the decoder is taken to improve the decoding performance.
The bit-security is increased to 256 bits while achieving the target DFR of 2^−140 in the setting of n = 1024, q = 12289, k = 16 (r = 2.83). This improvement is better than the benchmark of 252 bits in [32]. Though it does not break the record of 309 bits in [13], their results rely on an “independence” assumption that may not hold.
Polar codes support isochronous implementations of encoding and decoding while LDPC and BCH codes employed in [13] do not. We show the encoding and decoding of polar codes to be isochronous with respect to sensitive information of the protocol.

The disadvantages are also given as follows.

Switching between the two bases by multiplying by the matrices $\tilde {B}$ and $\tilde {B}^{-1}$ as in Table 1 increases the complexity of the protocol.
To derive an i.i.d. channel model, we designed a tailored modulation diagram which gives a smaller code distance than the original modulation diagram $\{0,\lfloor \frac {q}{2} \rfloor \}$. It hurts the decoding performance, but the power of polar codes and the proposed scaling method counteract this effect to some extent.
However, the critical points of the scale factor t and the noise parameter r beyond which the theoretical upper bound on P_e no longer applies are currently missing.

Notes

1. The expansion of LAC is LAttice-based Cryptosystems.
2. Hila5 was merged into another NIST proposal called Round5.
3. A concurrent work also used canonical embedding to analyze the statistical framework of RLWE-based cryptography [24, 25]; the novel contribution of our work is to derive an i.i.d. channel model under canonical embedding and to use the CSI for error-correction.
4. A fundamental region of a lattice Λ is a region that includes only one point from each coset of Λ in $\mathbb {R}^{n}$. Algebraically, $\mathcal {V}({{\varLambda }})$ is a set of all the coset representatives of Λ in $\mathbb {R}^{n}$.
5. The generalizations of polar codes are extended to a large class of channels, e.g., the binary-input memoryless symmetric (BMS) channel.
6. The maximum mutual information over all possible channel input distributions is the channel capacity.
7. For non-expert audiences, another interpretation of the proposed RLWE channel model is given in vector form in Appendix B.
8. Different r induces different SNR. The calculation of SNR with respect to the fading $L/L^{\prime }$ channel is given in Appendix C.
9. The notation n indicates the block length of a code; it also indicates the degree of the 2n-th cyclotomic field which defines RLWE.
10. We use the security estimator from https://github.com/estimate-all-the-lwe-ntru-schemes/estimate-all-the-lwe-ntru-schemes.github.io.

References

1. Albrecht, M.R., Player, R., Scott, S.: On the concrete hardness of learning with errors. J. Math. Cryptol. 9(3), 169–203 (2015)
2. Albrecht, MR, Curtis, BR, Deo, A, Davidson, A, Player, R, Postlethwaite, EW, Virdia, F, Wunderer, T Catalano, D, De Prisco, R (eds.): Estimate All the LWE, NTRU schemes!. Springer International Publishing, Cham (2018)
3. Alkim, E., Ducas, L., Pöppelmann, T., Schwabe, P.: NewHope without reconciliation. IACR Cryptology ePrint Archive, 1157 (2016)
4. Alkim, E, Ducas, L, Pöppelmann, T, Schwabe, P: Post-quantum key exchange—a new hope. In: 25th USENIX Security Symposium, pp 327–43 (2016b)
5. Arikan, E.: Channel polarization: a method for constructing capacity-achieving codes for symmetric binary-input memoryless channels. IEEE Trans. Inf. Theory 55(7), 3051–73 (2009)
6. Bravo-Santos, A.: Polar codes for the Rayleigh fading channel. IEEE Communications Lett. 17(12), 2352–55 (2013)
7. Caire, G., Taricco, G., Biglieri, E.: Bit-interleaved coded modulation. IEEE Trans. Inform. Theory 44(3), 927–46 (1998)
8. Crockett, E., Peikert, C.: Challenges for ring-LWE. IACR Cryptol ePrint Arch, 782 (2016)
9. D’Anvers, JP, Vercauteren, F, Verbauwhede, I: The impact of error dependencies on ring/mod-LWE/LWR based schemes. In: International Conference on Post-Quantum Cryptography. Springer, pp 103–15 (2019)
10. Ding, J., Xie, X., Lin, X.: A simple provably secure key exchange scheme based on the learning with errors problem. IACR Cryptology EPrint Archive, 688 (2012)
11. Forney, G.D.: Coset codes. I. Introduction and geometrical classification. IEEE Trans. Inf. Theory 34(5), 1123–51 (1988)
12. Forney, G.D., Trott, M.D., Chung, S.Y.: Sphere-bound-achieving coset codes and multilevel coset codes. IEEE Trans. Inf. Theory 46(3), 820–850 (2000)
13. Fritzmann, T., Pöppelmann, T., Sepúlveda, M.J.: Analysis of error-correcting codes for lattice-based key exchange. IACR Cryptology ePrint Archive, 150 (2018)
14. Fujisaki, E, Okamoto, T: Secure integration of asymmetric and symmetric encryption schemes. In: Annual International Cryptology Conference. Springer, pp 537–54 (1999)
15. Howe, J, Prest, T, Ricosset, T, Rossi, M: Isochronous Gaussian Sampling: From Inception to Implementation. In: Ding, J, Tillich, JP (eds.) Post-Quantum Cryptography, pp 53–71. Springer International Publishing, Cham (2020)
16. Kocer, EG.: Circulant, negacyclic and semicirculant matrices with the modified Pell, Jacobsthal and jacobsthal-Lucas numbers. Hacettepe Journal of Mathematics and Statistics 36(2) (2007)
17. Korada, SB.: Polar Codes for Channel and Source Coding. PhD Thesis Ecole Polytechnique Fédérale De Lausanne. Lausanne, Switzerland (2009)
18. Ling, C.: Polar codes and polar lattices for independent fading channels. IEEE Trans. Commun. 64(12), 4923–4935 (2016)
19. Lu, X., Liu, Y., Zhang, Z., Jia, D., Xue, H., He, J., Li, B., Wang, K., Liu, Z., Yang, H.: LAC: Practical ring-LWE based public-key encryption with byte-level modulus. IACR Cryptol ePrint Arch, 1009 (2018)
20. Lyubashevsky, V, Peikert, C, Regev, O: On ideal lattices and learning with errors over rings. In: Annual international conference on the theory and applications of cryptographic techniques, pp 1–23. Springer, (2010)
21. Lyubashevsky, V, Peikert, C, Regev, O: A toolkit for ring-LWE cryptography. In: Annual international conference on the theory and applications of cryptographic techniques, pp 35–54. Springer, (2013)
22. Martinez, A., Guillen i Fabregas, A., Caire, G., Willems, F.M.J.: Bit-interleaved coded modulation revisited: a mismatched decoding perspective. IEEE Trans. Inf. Theory 55(6), 2756–2765 (2009)
23. Mondelli, M., Hassani, S.H., Urbanke, R.L.: Unified scaling of polar codes: Error exponent, scaling exponent, moderate deviations, and error floors. IEEE Trans. Inf. Theory 62(12), 6698–6712 (2016)
24. Murphy, S, Player, R: δ-Subgaussian random variables in cryptography. In: Jang-Jaccard, J, Guo, F (eds.) Information Security and Privacy, pp 251–268. Springer International Publishing, Cham (2019)
25. Murphy, S, Player, R: Discretisation and product distributions in ring-LWE. J. Math. Cryptol. 15(1) (2020)
26. NIST.: Submission requirements and evaluation criteria for the post-quantum cryptography standardization process. https://csrc.nist.gov/CSRC/media/Projects/Post-Quantum-Cryptography/documents/call-for-proposals-final-dec-2016.pdf. Accessed 1 Aug 2016
27. Pedarsani, R, Hassani, SH, Tal, I, Telatar, E: On the construction of polar codes. In: 2011 IEEE international symposium on information theory proceedings, pp. 11–15 (2011)
28. Poppelen, A.V.: Cryptographic decoding of the Leech lattice. IACR Cryptology ePrint Archive, 1050 (2016)
29. Prest, T., Ricosset, T., Rossi, M.: Simple, fast and constant-time Gaussian sampling over the integers for Falcon. Tech. rep., Second PQC Standardization Conference. https://csrc.nist.gov/Presentations/2019/simple-fast-and-constant-time-gaussian. Accessed 23 Aug 2019
30. Regev, O.: On lattices, learning with errors, random linear codes and cryptography. In: Proceedings of the thirty-seventh annual ACM symposium on theory of computing. ACM, New York, NY, USA, STOC ’05, pp. 84–93 (2005)
31. Saarinen, MJO.: HILA5: On reliability, reconciliation, and error correction for ring-LWE encryption. In: International conference on selected areas in cryptography. Springer, pp. 192–212 (2017)
32. Song, M., Lee, S., Shin, D., Lee, E., Kim, Y., No, J.: Analysis of error dependencies on newHope. IEEE Access 8, 45443–56 (2020)
33. Stehlé, D, Steinfeld, R, Tanaka, K, Xagawa, K.: Efficient public key encryption based on ideal lattices. In: International conference on the theory and application of cryptology and information security. Springer, pp. 617–635 (2009)
34. Tal, I., Vardy, A.: How to construct polar codes. IEEE Trans. Inf. Theory 59(10), 6562–6582 (2013)
35. Tal, I., Vardy, A.: List decoding of polar codes. IEEE Trans. Inf. Theory 61(5), 2213–2226 (2015)
36. Trifonov, P.: Design of polar codes for Rayleigh fading channel. In: 2015 international symposium on wireless communication systems (ISWCS), pp. 331–335 (2015)
37. Zhao, R.K., Steinfeld, R., Sakzad, A.: Facct: fast, compact, and constant-time discrete Gaussian sampler over integers. IEEE Trans. Comput. 69(1), 126–137 (2019)

Acknowledgements

The authors would like to thank Charles Grover for providing support on the formulation of RLWE channel.

Funding

This research is supported by the National Research Foundation, Singapore under its Strategic Capability Research Centres Funding Initiative and by the UK Engineering and Physical Sciences Research Council (EPSRC grant EP/S021043/1). Any opinions, findings and conclusions or recommendations expressed in this material are those of the authors and do not reflect the views of National Research Foundation, Singapore.

Author information

Authors and Affiliations

SCRIPTS, Nanyang Technological University, Singapore, 637457, Singapore
Jiabo Wang
Department of Electrical and Electronic Engineering, Imperial College London, London, SW7 2AZ, UK
Cong Ling

Corresponding author

Correspondence to Cong Ling.

Ethics declarations

Conflict of Interests

The authors have no conflicts of interest to declare that are relevant to the content of this article.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

Appendix A Successive cancellation decoding

SC decoding is proposed in [5, Section VIII] and modularized in [35, Section II]. Upon observing the signal y^1:n, SC decoding works as in Algorithm 1 and gives an estimate of u^1:n. We now illustrate SC decoding with the example n = 8 in Fig. 12. In Fig. 12, an SC decoder is described as a circuit consisting of n × (l + 1) nodes for n = 2^l. These nodes are pairwise connected by “ ” wires. We also define two probability arrays PReg₀[n][l + 1] and PReg₁[n][l + 1] as well as a bit array UReg[n][l + 1]. Each array consists of n × (l + 1) elements. In the circuit, every node can be specified by a phase parameter ϕ, a branch parameter ψ and a layer number m where 0 ≤ m ≤ l, 1 ≤ ϕ ≤ 2^m, 0 ≤ ψ < 2^{l−m}. For a fixed layer m, any integer 1 ≤ i ≤ n has a unique representation as

$$i= \langle \phi,\psi\rangle_{m}=\phi+2^{m}\cdot\psi,$$

where we use the shorthand notation i = 〈ϕ,ψ〉 when m is clear. As shown in Fig. 12, nodes on layer m are classified into 2^m phases and 2^{l−m} branches. Nodes in the same phase are shown in the same color.
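The index map i = 〈ϕ,ψ〉_m and its inverse can be written directly from the formula above. The sketch keeps the 1-based convention of the text for i and ϕ and the 0-based convention for ψ; the function names are hypothetical.

```python
def node_index(phi, psi, m):
    """Return i = phi + 2**m * psi for phase phi (1-based) and branch psi (0-based)."""
    return phi + (1 << m) * psi

def node_coords(i, m):
    """Inverse map: recover (phi, psi) on layer m from the 1-based index i."""
    psi, rem = divmod(i - 1, 1 << m)
    return rem + 1, psi
```

For n = 8 (l = 3) and any layer m, letting ϕ range over 1..2^m and ψ over 0..2^{l−m} − 1 enumerates each index 1..8 exactly once.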

Algorithm 1 SC decoder(⋅)

As a high-level description, an SC decoder takes the signal y^1:n as input. Then node A〈ϕ,ψ〉_m calculates the transition probabilities $W_{2^m}^{(i)}(\cdot |0)$ and $W_{2^m}^{(i)}(\cdot |1)$ of an intermediate synthesized channel $W_{2^m}^{(i)}$ and stores them in PReg₀[〈ϕ,ψ〉][m] and PReg₁[〈ϕ,ψ〉][m] of the same index as in Algorithm 2. Meanwhile, node A〈ϕ,ψ〉_m also updates the unique element UReg[〈ϕ,ψ〉][m] of the bit array as in Algorithm 3. Each node in the circuit is activated once to update the two probability arrays and one more time to update the bit array. The output $\bar {u}^{1:n}$ is stored in the l-th column of bit array, i.e. UReg[:][l].

Algorithm 2 CalP(⋅)

In CalP(⋅), a node at layer m assembles the outputs of two nodes at layer m − 1 of the same phase but different branches, then it yields two probability values and stores them in the corresponding position in array PReg₀ and PReg₁. This process takes place in the circuit from rightmost to leftmost in a recursive manner in the sense that the probabilities of same phase but different branches on previous layer (two nodes on RHS of “ ”) are assembled and turned into probabilities of different phases but the same branch (two nodes on LHS of “ ”) on current layer.

To initialize, the nodes on layer m = 0 take y^1:n as input and store W(y⁽ⁱ⁾|0) and W(y⁽ⁱ⁾|1) for i = 1,⋯ ,n in column m = 0 of the two probability arrays, i.e., PReg₀[:][m = 0] and PReg₁[:][m = 0], respectively. Recall from Section 2.4 that W(Y |X) is the transition probability of channel W.

Then, node A〈ϕ,ψ〉_m at layer m > 0 updates the two probability arrays as follows. Let $\kappa =\lceil \frac {\phi }{2} \rceil$. If ϕ is odd, then for α = 0, 1

$$\textsf{PReg}_{\alpha}[\langle \phi,\psi \rangle][m] = \sum\limits_{\beta=0,1}\frac{1}{2}\textsf{PReg}_{\alpha\oplus\beta}[\langle \kappa,2\psi\rangle][m-1]\cdot\textsf{PReg}_{\beta}[\langle \kappa,2\psi+1\rangle][m-1].$$

(17)

If ϕ is even, then for α = 0, 1 and β = UReg[〈ϕ − 1,ψ〉][m]

$$\textsf{PReg}_{\alpha}[\langle \phi,\psi \rangle][m] = \frac{1}{2}\textsf{PReg}_{\alpha\oplus\beta} [\langle \kappa,2\psi\rangle][m-1]\cdot\textsf{PReg}_{\alpha}[\langle \kappa,2\psi+1\rangle][m-1].$$

(18)

The above process proceeds recursively from right to left along the “ ” in the circuit until every node on layer m = l finishes its work.
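The two update rules (17) and (18) can be sketched for a single node as follows. This is only an illustrative sketch: the function names `calp_odd` and `calp_even` and the representation of a node's stored values as a pair `(P0, P1)` are assumptions made here, not part of the paper's pseudocode.

```python
def calp_odd(left, right):
    """Equation (17): phase phi is odd, so no bit of the current pair is known yet.

    left  = (P0, P1) stored at node <kappa, 2*psi>     on layer m-1
    right = (P0, P1) stored at node <kappa, 2*psi + 1> on layer m-1
    Returns (P0, P1) for node <phi, psi> on layer m.
    """
    return tuple(
        0.5 * sum(left[alpha ^ beta] * right[beta] for beta in (0, 1))
        for alpha in (0, 1)
    )


def calp_even(left, right, beta):
    """Equation (18): phase phi is even; beta = UReg[<phi-1, psi>][m] is already decided."""
    return tuple(0.5 * left[alpha ^ beta] * right[alpha] for alpha in (0, 1))
```

A quick consistency check is that, for either value of β, the two outputs of `calp_even` add up to entry β of `calp_odd`, because (17) marginalizes over the bit that (18) later resolves.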

Algorithm 3 CalBit(⋅)

CalBit(⋅) also updates the bit array UReg recursively, like CalP(⋅) but in the reverse direction. The two nodes on the RHS of “ ” take the two bits produced by the two nodes on the LHS of “ ”, derive two new bits from them, and write these into the bit array UReg. Specifically, for even ϕ and κ = ϕ/2, nodes A〈κ, 2ψ〉_{m−1} and A〈κ, 2ψ + 1〉_{m−1} update UReg[〈κ, 2ψ〉][m − 1] and UReg[〈κ, 2ψ + 1〉][m − 1] as follows.

$$\begin{array}{@{}rcl@{}} \textsf{UReg}[\langle \kappa,2\psi \rangle][m-1] &=& \textsf{UReg}[\langle \phi-1, \psi\rangle][m]\oplus \textsf{UReg}[\langle \phi,\psi\rangle][m] \end{array}$$

(19)

$$\begin{array}{@{}rcl@{}} \textsf{UReg}[\langle \kappa,2\psi+1 \rangle][m-1] &=& \textsf{UReg}[\langle \phi,\psi\rangle][m] \end{array}$$

(20)
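Putting Algorithms 1 to 3 together, a minimal probability-domain SC decoder might look like the sketch below. Several choices are assumptions made for illustration rather than details from the paper: phases are 0-based (so the paper's odd ϕ corresponds to an even `phi` here), node 〈ϕ,ψ〉 on layer m is stored at position `phi + psi * 2**m`, frozen bits are fixed to 0, and no normalization is applied, so the values underflow for large n. The helper `polar_encode` is likewise hypothetical; it simply applies (19) and (20) to every phase so that the decoder can be checked against a matching encoder.

```python
import numpy as np


def polar_encode(u):
    """Push the bits u from layer l down to layer 0 with rules (19) and (20)."""
    n = len(u)
    l = n.bit_length() - 1
    b = list(u)
    for m in range(l, 0, -1):
        new = [0] * n
        for kappa in range(1 << (m - 1)):
            for psi in range(n >> m):
                u1 = b[2 * kappa + (psi << m)]
                u2 = b[2 * kappa + 1 + (psi << m)]
                new[kappa + ((2 * psi) << (m - 1))] = u1 ^ u2      # eq. (19)
                new[kappa + ((2 * psi + 1) << (m - 1))] = u2       # eq. (20)
        b = new
    return b


def sc_decode(w0, w1, frozen):
    """w0[i] = W(y_i|0), w1[i] = W(y_i|1); frozen[i] is True if u_i is fixed to 0."""
    n = len(w0)
    l = n.bit_length() - 1
    assert 1 << l == n, "block length must be a power of two"
    P = np.zeros((l + 1, n, 2))        # P[m, pos, alpha] plays the role of PReg_alpha
    U = np.zeros((l + 1, n), dtype=int)
    P[0, :, 0], P[0, :, 1] = w0, w1    # layer 0 holds the channel likelihoods

    def idx(phi, psi, m):              # storage position of node <phi, psi> on layer m
        return phi + (psi << m)

    def calp(m, phi):
        if m == 0:
            return
        kappa = phi // 2
        if phi % 2 == 0:               # first phase of a pair: refresh the layer below
            calp(m - 1, kappa)
        for psi in range(n >> m):
            a = P[m - 1, idx(kappa, 2 * psi, m - 1)]
            b = P[m - 1, idx(kappa, 2 * psi + 1, m - 1)]
            if phi % 2 == 0:           # eq. (17)
                out = [0.5 * (a[al] * b[0] + a[al ^ 1] * b[1]) for al in (0, 1)]
            else:                      # eq. (18), using the bit decided one phase earlier
                be = U[m, idx(phi - 1, psi, m)]
                out = [0.5 * a[al ^ be] * b[al] for al in (0, 1)]
            P[m, idx(phi, psi, m)] = out

    def calbit(m, phi):                # called only for the second phase of a pair
        kappa = phi // 2
        for psi in range(n >> m):
            u1 = U[m, idx(phi - 1, psi, m)]
            u2 = U[m, idx(phi, psi, m)]
            U[m - 1, idx(kappa, 2 * psi, m - 1)] = u1 ^ u2         # eq. (19)
            U[m - 1, idx(kappa, 2 * psi + 1, m - 1)] = u2          # eq. (20)
        if kappa % 2 == 1:
            calbit(m - 1, kappa)

    for phi in range(n):
        calp(l, phi)
        p0, p1 = P[l, phi]
        U[l, phi] = 0 if frozen[phi] else int(p1 > p0)
        if phi % 2 == 1:
            calbit(l, phi)
    return [int(bit) for bit in U[l]]
```

Every loop bound in this sketch depends only on n, not on the received values or the decoded bits; the branch on `frozen` depends on the public code design.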

Appendix B An interpretation of RLWE channel model in vector representation

To visualize the decoding step of RLWE-based PKE as a channel decoding problem, we rewrite formula (3) in vector form as

$$\mathbf{y} = \lfloor \frac{q}{2} \rfloor \cdot \mathbf{m} + \mathbf{E} \cdot \mathbf{t} - \mathbf{S}\cdot \mathbf{e}_{1} + \mathbf{e}_{2}\mod{R_{q}},~~\textbf{e},\textbf{s},\textbf{t},\textbf{e}_{1},\textbf{e}_{2}\leftarrow \mathcal{N}(0,r^{2}\mathbb{I}),$$

(21)

where m,t,e₁,e₂ are vectors, E is the negacyclic matrix whose first column consists of the coefficients of the polynomial e, and S is the negacyclic matrix defined in the same manner from the polynomial s. Formula (21) can be viewed as a channel model in which y is the channel output, m is the codeword, modulated as ⌊q/2⌋⋅m, and E ⋅t −S ⋅e₁ + e₂ is the channel noise. In this section, we refer to formula (21) as an RLWE channel.

Theorem 3 (Diagonalization of negacyclic matrix,16)

Let N(x) be an n × n negacyclic matrix whose first column is x. Then

$$N(x)=G^{*}\text{diag}(\lambda_{1}(x),\lambda_{2}(x),\cdots,\lambda_{n}(x))G,$$

where the element $G_{p,q}=\frac {1}{\sqrt {n}}w^{(2p-1)(q-1)}$, w is a primitive 2n-th root of unity, G^∗ is the conjugate transpose of G and $\lambda _j(x)={\sum }_{k=1}^{n}x_kw^{(2j-1)(k-1)}$.
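Theorem 3 is easy to check numerically for small n. The sketch below builds a negacyclic matrix from its first column and the factors G and λ from the formulas in the theorem, taking w = exp(iπ/n) as the primitive 2n-th root of unity. The function names are hypothetical and the code works over the complex numbers only, without any reduction modulo q.

```python
import numpy as np


def negacyclic_matrix(x):
    """n x n negacyclic matrix whose first column is x (multiplication by x mod X^n + 1)."""
    x = np.asarray(x)
    n = len(x)
    N = np.empty((n, n), dtype=x.dtype)
    for i in range(n):
        for j in range(n):
            # entries above the diagonal wrap around with a sign flip
            N[i, j] = x[i - j] if i >= j else -x[n + i - j]
    return N


def theorem3_factors(x):
    """Return (G, lam) with G[p, q] = w^((2p-1)(q-1)) / sqrt(n) and lam[j] = lambda_j(x)."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    w = np.exp(1j * np.pi / n)                 # primitive 2n-th root of unity
    p = np.arange(1, n + 1)
    powers = w ** np.outer(2 * p - 1, p - 1)   # entry (p, q) is w^((2p-1)(q-1)), 1-based
    G = powers / np.sqrt(n)
    lam = powers @ x                           # sum_k x_k w^((2j-1)(k-1))
    return G, lam
```

For any real vector x, `G.conj().T @ np.diag(lam) @ G` should then agree with `negacyclic_matrix(x)` up to floating-point error, and `G.conj().T @ G` should be the identity, which is why G^∗ can play the role of the inverse of G.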

According to Theorem 3 this RLWE channel can be formulated as

$$B\textbf{y}=B\lfloor\frac{q}{2}\rfloor \textbf{m}+BB^{-1}\text{diag}(\lambda(\textbf{e}))B\textbf{t}-BB^{-1}\text{diag}(\lambda(\textbf{s}))B\textbf{e}_{1}+B\textbf{e}_{2},\mod{BR_{q}} ,$$

(22)

where λ(e) and λ(s) refer to the diagonal elements in Theorem 3.

Appendix C Calculating the SNR of tailored RLWE channel model

SNR is defined as SNR = P_signal/P_noise where P_signal and P_noise denote the signal and noise power, respectively. The channel model of RLWE-based PKE in polynomial basis is

$$\begin{array}{@{}rcl@{}} \lfloor \frac{q}{2} \rfloor \cdot m+ e\cdot t - s \cdot e_{1} + e_{2}, \end{array}$$

where e,t,s,e₁,e₂ are polynomials in $\frac {\mathbb {Z}_q[x]}{x^n+1}$ whose coordinates are independently drawn from a spherical normal distribution $\mathcal {N}(0,\sigma ^2\mathbb {I})$. The multiplication of two polynomials can be interpreted as the convolution of their coordinates, giving rise to n parallel and correlated channels. If we set the message m to lie in $\lbrace -\lfloor \frac {q}{4}\rfloor , \lfloor \frac {q}{4}\rfloor \rbrace ^n$, the signal power is about q²/16 and each noise coordinate has variance about 2nσ⁴ + σ² (n products of variance σ⁴ from each of e ⋅ t and s ⋅ e₁, plus σ² from e₂), so the SNR is roughly $\frac {q^2}{16(2n\sigma ^4 + \sigma ^2)}$.
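The noise power 2nσ⁴ + σ² behind this estimate can be checked with a small Monte Carlo experiment. The sketch below is a simplification under stated assumptions: it samples continuous Gaussians, ignores rounding and reduction modulo q, and uses hypothetical function names; it is not the procedure used to produce the results in this paper.

```python
import numpy as np


def negacyclic_mul(a, b):
    """Product of two length-n coefficient vectors modulo x^n + 1."""
    n = len(a)
    full = np.convolve(a, b)          # ordinary product, length 2n - 1
    res = full[:n].copy()
    res[: n - 1] -= full[n:]          # x^n = -1 folds the upper half back with a sign flip
    return res


def empirical_noise_power(n, sigma, trials, rng):
    """Average squared coordinate of e*t - s*e1 + e2 over random draws."""
    total = 0.0
    for _ in range(trials):
        e, t, s, e1, e2 = rng.normal(0.0, sigma, size=(5, n))
        noise = negacyclic_mul(e, t) - negacyclic_mul(s, e1) + e2
        total += np.mean(noise ** 2)
    return total / trials


def snr_polynomial_basis(n, q, sigma):
    """SNR estimate q^2 / (16 (2 n sigma^4 + sigma^2)) for the polynomial-basis model."""
    return q ** 2 / (16 * (2 * n * sigma ** 4 + sigma ** 2))
```

With a seeded generator such as `np.random.default_rng(0)`, the value returned by `empirical_noise_power` for moderate n should land within a few percent of `2 * n * sigma**4 + sigma**2`.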

In the tailored RLWE channel model in formulas (8) and (9), polynomial convolutions are transformed into coordinate-wise multiplications in the canonical basis. Since we shrink the constellation diagram as described in Section 3.2, we set the message m to lie in $\lbrace -\lfloor \frac {q}{4\sqrt {2}}\rfloor ,\lfloor \frac {q}{4\sqrt {2}}\rfloor \rbrace ^n$. The channel gain H_i and Gaussian noise Z_i are independent, and their distributions are discussed in detail in formula (8). The resulting SNR is $\frac {q^2}{32(n^2\sigma ^4+n\sigma ^2/2)}$.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Wang, J., Ling, C. Polar coding for Ring-LWE-based public key encryption. Cryptogr. Commun. 15, 397–431 (2023). https://doi.org/10.1007/s12095-022-00607-1

Received: 09 November 2021
Accepted: 19 August 2022
Published: 24 September 2022
Issue Date: March 2023
DOI: https://doi.org/10.1007/s12095-022-00607-1
