
Master of Science in

Advanced Mathematics and


Mathematical Engineering

Title: Low Density Parity Check codes

Author: Tomàs Ortega Sanchez-Colomer

Advisor: Simeon Michael Ball

Department: Applied Mathematics IV

Academic year: 2020-2021


Universitat Politècnica de Catalunya
Facultat de Matemàtiques i Estadística

Master in Advanced Mathematics and Mathematical Engineering


Master’s thesis

Low Density Parity Check codes


Tomàs Ortega

Supervised by Simeon Michael Ball


July, 2021
First of all, I would like to thank Prof. Simeon Ball, my supervisor, for his help and guidance during
the development of this thesis.
I would also like to acknowledge Prof. Juanjo Rué, for encouraging me to pursue this Master’s and for
introducing me to expander graphs.
Finally, I want to thank my family and friends for their support during this unconventional year, and
for their patience while hearing me talk about error-correcting codes.
Abstract
Low Density Parity Check codes, LDPCs for short, are a family of codes which have shown near optimal
error-correcting capabilities. They were proposed in 1963 by Robert Gallager in his PhD thesis. While he
proved that probabilistic constructions of random LDPCs gave asymptotically good linear codes, they were
largely abandoned due to the lack of computing power to make them practically feasible. They enjoyed
a rebirth during the coding revolution of the 1980s, and thanks to the development of expander graph
theory, it was proven that they can be encoded and decoded in linear time. This thesis will review the
main results of this journey. Nowadays, LDPCs appear in a plethora of commercial applications.
The codes used in practice and the techniques that were employed to construct them will also be explored
in this work. Finally, a new family of LDPCs will be proposed, which will be constructed from incidence
structures called generalized quadrangles, and which performs markedly better than random codes.

Keywords

Error correcting codes, LDPC, graph theory, expander graphs

Contents

1 Introduction 5
1.1 Background and motivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
1.1.1 Relationship of linear codes and bipartite graphs . . . . . . . . . . . . . . . . . . . 6
1.2 Goals and contributions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
1.3 Organization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9

2 LDPC codes 10
2.1 Decoding . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
2.1.1 Bit-flipping algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
2.1.2 Multiple bit-flipping algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
2.1.3 Sum-Product algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
2.2 Encoding . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
2.3 Expander codes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
2.3.1 Existence of desired expanders . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
2.3.2 Asymptotically good codes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
2.4 Practical considerations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
2.5 Used implementations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
2.5.1 Digital video broadcasting . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
2.5.2 5G NR . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
2.5.3 NASA’s deep space and proximity links . . . . . . . . . . . . . . . . . . . . . . . . 26
2.5.4 Wi-Fi . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
2.6 Generalized quadrangle codes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28

3 Experimental results 31
3.1 Random codes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
3.2 GQ codes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
3.2.1 Lifted GQ codes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
3.2.2 Line decoding . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38

4 Summary and future work 40

References 41

A Simulator and auxiliary functions 43


A.1 Random matrix generators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
A.2 BSC simulator . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44
A.3 BI-AWGN capacity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46

A.4 GAP code . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46


List of Figures
1 Example check matrix with corresponding bipartite (or Tanner) graph. . . . . . . . . . . . 7
2 BI-AWGN and BSC capacity as a function of Eb /N0 . . . . . . . . . . . . . . . . . . . . . 8
3 Error regions of a typical LDPC code for the AWGN channel. The code used to obtain
this plot is a rate 0.82 code used in practice for digital video broadcasting, taken from the
DVB-S2 standard discussed in subsubsection 2.5.1. . . . . . . . . . . . . . . . . . . . . . 11
4 Example of indexing for a parity-check tree representation. . . . . . . . . . . . . . . . . . 14
5 Parity-check tree representation. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
6 Bit-flipping and sum-product performance comparison, with 10^5 Monte-Carlo runs. . . . 16
7 Construction of a code by lifting a protograph, lifting size k = 3. . . . . . . . . . . . . . . 22
8 DVB-S2 rate 1/2 parity check matrix, with a zoom-in to see the circulants. . . . . . . . . 24
9 DVB-S2 rate 1/2 parity check matrix performance, with 30 AMS iterations. . . . . . . . . 25
10 NASA’s AR4JA rate 1/2 parity check matrix. . . . . . . . . . . . . . . . . . . . . . . . . 26
11 802.11n rate 1/2, n = 1944 parity-check matrix. . . . . . . . . . . . . . . . . . . . . . . . 27
12 Generalized quadrangle GQ(2, 2), the doily. . . . . . . . . . . . . . . . . . . . . . . . . . 29
13 Performance of a commercially used Wi-Fi code, and a random left 3-regular code. . . . . 32
14 Performance of a random (3, 6)-regular code of block length 20000, against the rate 1/2,
n = 1944 Wi-Fi code from Fig. 13. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
15 Performance of the Q(4, 3) code compared with a random code in a BSC. . . . . . . . . . 33
16 Performance of the Q^-(5, 3) code compared with a random code in a BSC, decoded with
the bit-flipping algorithm. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
17 Performance of the Q^-(5, 3) code compared with a random code in a BI-AWGN channel
with sum-product decoding. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
18 Performance of a full GQ code and a reduced GQ code in a BSC. . . . . . . . . . . . . . 35
19 Performance of three lifts of the Q^-(5, 2) code, of rate approximately 0.4. . . . . . . . . 36
20 Performance of a 2-lifted GQ code, a random left 3-regular code, and the best-known code
of these dimensions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
21 Performance of bit-flipping decoding and line decoding using the Q(4, 3) code on a BSC. . . 39

List of Tables
1 802.11n rate 1/2 table for block size 1944, with lifting size 81. . . . . . . . . . . . . . . . 27
2 The polar spaces of rank 2, where q = p^h and p is prime. . . . . . . . . . . . . . . . . 29
3 Points and lines of generalized quadrangles of interest. . . . . . . . . . . . . . . . . . . . 30

1. Introduction
To send information from a transmitter to a receiver through a noisy channel, the data is sent with
some redundancy, so that errors that might occur can be corrected. We will assume in this work that the data
to be sent is a sequence of bits. The transmitter sends blocks of n bits, k of which are data bits, and the
remaining n − k are redundancy bits. The way we map vectors of k data bits to vectors of n transmitted
bits is called a (binary) code. The vectors of n transmitted bits of a given code are called its codewords.
The fraction of data bits per codeword, k/n, is called the code’s rate.
A receiver, upon receiving a vector of n bits (also known as a word), has to decide which vector of
k data bits encoded this word. This is called decoding a word. To make the receiver’s job easier, codes
are designed in such a way that different codewords do not resemble each other. Thus, if some errors
occur in the channel, the receiver can still distinguish which codeword was sent and perform the decoding
successfully. Given two vectors, the Hamming distance between them is the number of coordinates where
the vectors differ. Thus, if the minimum Hamming distance between the codewords of a given code is large
enough, when we decode the received word we can correct some errors that were introduced in the noisy
channel.
Given a family of codes, we say that they are asymptotically good if their rate is bounded below by a
positive constant, and their minimum Hamming distance grows linearly with block length (n). In 1948,
Shannon used the probabilistic method to show that asymptotically good codes exist [Sha48]. However,
his method did not give explicit examples of how to obtain them. Moreover, these codes might not be
easily encodable or decodable.
Error correcting codes need to be practical, which means that encoding and decoding must be cheap
both in computation and storage. The most common solution is to use linear codes, which are characterized
by the property that any linear combination of codewords is also a codeword. After Shannon’s landmark
paper, the race to find asymptotically good linear codes began. During the sixties and seventies, algebraic
constructions proved that such codes exist, but they were not encodable and decodable in linear time in
block length.
In 1963, Gallager discovered Low Density Parity Check (LDPC) codes [Gal63], which he found exper-
imentally to have good performance and had a link to graph theory through random graphs. However,
Gallager lacked the tools to give explicit arguments of all the good properties of these codes, namely the
concept of expander graphs (which will be detailed later in this thesis). These codes were somewhat for-
gotten, since the thought at the time was that they were not practical due to the computing power they
required.
In the seventies, the concept of expander graphs was introduced, which allowed Tanner, Sipser and
Spielman [Tan81; SS96] to produce stronger results than the ones Gallager had obtained with random
graphs. These led to Spielman’s discovery in 1996 of the first family of asymptotically good, explicit codes,
with encoding and decoding time linear in block length [Spi96], a family which he named expander codes.
Nowadays, LDPC codes are extensively used. Most notably, they appear in Digital Video Broadcasting,
Wi-Fi and 5G standards [CF07; Bae+19]. They are also widely employed for various storage system
applications. While Spielman’s decoding algorithm gives stronger analytical results, variants of Gallager’s
probabilistic decoding method are used in the aforementioned practical applications, as experiments have
shown that the latter has better performance.

The main source for this introduction was [HLW06], an excellent survey on expander graphs and their applications.


1.1 Background and motivation


Definition 1.1 ([n, k, d] code). An [n, k, d] code is a linear mapping from vectors of k data bits to
codewords of n bits, with minimum Hamming distance d between codewords.

Remark 1.2. With an [n, k, d] code the receiver can correctly decode words that contain fewer than d/2 errors.
Since a linear code is a linear map from F_2^k to F_2^n, we only need to specify the images of a basis of
F_2^k to completely determine the code. The matrix G ∈ F_2^{k×n} corresponding to the linear map is called the
generator matrix, where the linear code is the row space of the matrix. To determine the codeword c ∈ F_2^n
corresponding to a data vector v ∈ F_2^k we compute c = v · G (c and v are row vectors).
The decoder can also use the linearity of the code to its advantage. A matrix that generates the
orthogonal space to G is called a parity-check (or check) matrix, and it is usually denoted by H. It follows
that H · c^T = 0 ⇐⇒ c is a codeword. To measure the error of a received vector x, one can calculate its
syndrome.

Definition 1.3 (Weight). The weight of a bit vector x is denoted wt(x) and is the number of non-zero
bits in x.

Definition 1.4 (Syndrome). Given a received row vector of n bits x, and a parity-check matrix H, the
syndrome of x is calculated as s(x) = H · x^T.

It follows that x is a codeword ⇐⇒ wt(s(x)) = 0. Another interesting property is that s(x1 + x2) =
H · (x1 + x2)^T = H · x1^T + H · x2^T = s(x1) + s(x2). Therefore, if the received word is x = c + e, where c is
a codeword and e is some error, then s(x) = s(c + e) = s(c) + s(e) = s(e).
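These properties are easy to verify numerically. A minimal Matlab check, using the check matrix of Fig. 1 below (the specific codeword and error are illustrative choices):

H = [1 1 1 0;
     1 1 0 1];
s = @(w) mod(H * w', 2);   % syndrome of a row vector w
c = [1 1 0 0];             % a codeword: s(c) = 0
e = [0 0 1 0];             % a weight-one error
x = mod(c + e, 2);         % received word
isequal(s(x), s(e))        % true: s(c + e) = s(c) + s(e) = s(e)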
Linearity also allows us to characterize the minimum Hamming distance of a code.

Proposition 1.5. The minimum Hamming distance of a code is the minimum weight of a non-zero code-
word.

Proof. Let d be the minimum Hamming distance. If c1 and c2 are two distinct codewords with d(c1 , c2 ) =
d, then by linearity c1 − c2 is a codeword too, it is not zero and it has weight d.

1.1.1 Relationship of linear codes and bipartite graphs

To introduce the relationship between bipartite graphs and linear codes we first need some definitions.

Definition 1.6 (Graph). A graph G is a pair G = (V , E ), where V is a set whose elements are called
vertices, and E is a set of paired vertices whose elements are called edges.

We will assume from now on that all our graphs are simple graphs, that is, without loops (edges that
join a vertex to itself), without directed edges (an edge between vertices x and y is the same as an edge
between y and x), and without double edges (there can only be one edge joining vertices x and y ). Since
we will work with simple graphs, two vertices that are connected by an edge will be called adjacent (or
neighbors).

Definition 1.7 (Independent set). Given a graph G = (V , E ), an independent set A is a subset of V such
that any two vertices x, y ∈ A are not adjacent to each other.

Definition 1.8 (Bipartite graph). A graph is bipartite when its vertices can be divided into two disjoint
independent sets A, B, and all edges of the graph join vertices in A with vertices in B. We will call A and
B the left and right vertex sets. If all vertices in A have the same number of neighbors c, then the graph
is left c-regular.

A cycle is defined as a sequence of adjacent vertices where the only vertex that appears twice is the
first, which must also be the last. One can observe that bipartite graphs are graphs which have no cycles
of odd length.

Definition 1.9 (Girth and Local Girth). The girth of a graph is the length of the shortest cycle in the graph.
The local girth of a vertex is the length of the shortest cycle that passes through that vertex.

An [n, k, d] code with a parity-check matrix H can be associated with a bipartite graph (in coding
theory, these are often called Tanner graphs). The parity-check matrix H has |Vp| = m rows and |Vv| = n
columns, where m ≥ n − k = rank(H). The disjoint sets of vertices Vv and Vp are called variable and
parity-check vertices, respectively, and they correspond to the left and right vertex sets of the bipartite
graph. If a variable vertex vi is connected to a check vertex cj, that means that the j-th parity-check
equation has a one in position i (see Fig. 1 for an example of the correspondence between H and the
bipartite graph).

        v1 v2 v3 v4
  c1  (  1  1  1  0  )
  c2  (  1  1  0  1  )

Figure 1: Example check matrix with corresponding bipartite (or Tanner) graph.

Using this correspondence, we will hop back and forth between graph theory and coding theory throughout
this work. We will see that finding desirable codes translates to finding bipartite graphs with a certain
regularity and large girth.
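As a concrete illustration of this correspondence (a minimal sketch, using the check matrix of Fig. 1): each one of H is an edge joining a check vertex (row) and a variable vertex (column).

H = [1 1 1 0;
     1 1 0 1];
[chk, var] = find(H);   % nonzero positions of H
edges = [chk, var]      % edge list of the Tanner graph
% c1 is adjacent to v1, v2, v3; c2 is adjacent to v1, v2, v4.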
Finally, it is necessary to introduce the concept of capacity. In his landmark paper [Sha48] Shannon
proved that for block codes of length tending to infinity, it is possible to achieve an arbitrarily low probability
of error when decoding, as long as the transmission rate is lower than a quantity that depends on the channel.
This quantity, which establishes the maximum information (per channel use) that can be sent through this
channel, is called the capacity.
For example, in a Binary Input Additive White Gaussian Noise (BI-AWGN) channel, the capacity depends
on the Eb /N0 ratio, that is, energy per bit to noise power spectral density ratio (which gives a measure of
the signal to noise ratio per bit). In Fig. 2 we have the BI-AWGN’s capacity plot, compared to the capacity
of a receiver that performs a hard decision (that is, a Binary Symmetric Channel). As one might expect,
the capacity is strictly increasing with the Eb /N0 , and the BSC’s capacity is lower since some information
is lost during the hard decision. For the computation of the BI-AWGN capacity see subsection A.3.
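As a sketch of how the BSC curve of Fig. 2 can be produced (our reconstruction, not necessarily the script behind the figure): a hard-decision BPSK receiver sees a BSC with crossover probability p = Q(sqrt(2 Es/N0)), whose capacity is C = 1 − h2(p); transmitting at capacity, Eb/N0 = (Es/N0)/C.

h2   = @(p) -p.*log2(p) - (1 - p).*log2(1 - p);  % binary entropy function
Qfun = @(x) 0.5 * erfc(x / sqrt(2));             % Gaussian tail probability
EsN0 = logspace(-3, 2, 500);                     % sweep of symbol SNRs
p    = Qfun(sqrt(2 * EsN0));                     % crossover after hard decision
C    = 1 - h2(p);                                % BSC capacity, bits/channel use
EbN0_dB = 10 * log10(EsN0 ./ C);                 % energy per information bit
plot(EbN0_dB, C), xlabel('E_b/N_0 (dB)'), ylabel('Capacity (bits/channel use)')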


[Plot: channel capacity (bits/channel use) versus Eb/N0 (dB) for the BI-AWGN channel and the BSC.]

Figure 2: BI-AWGN and BSC capacity as a function of Eb/N0.

One of the main reasons LDPC codes are ubiquitous in present-day communications systems is that they
have been found to have extremely low error rates at signal to noise ratios very near to the theoretical limit
set by Shannon’s capacity [Chu+01]. LDPC codes, together with turbo codes (which will not be discussed
in this thesis), are the two best known families of codes that have good near-capacity performance. A
curious historical note is that when turbo codes were presented in 1993 they were received with skepticism
from the coding community, as their analysis was thought to be mistaken by a factor of 2 [CF07]. The
disbelief turned to astonishment when the claims were confirmed by other laboratories, which prompted a
boom in coding theory and helped with the rebirth of LDPC codes.

1.2 Goals and contributions

The goal of this thesis is to expose the relationship between expander graphs and LDPC codes, and explore
several techniques that are used in practical implementations of LDPC codes. Also, we want to see how
well expander codes perform against other known LDPCs. A new family of LDPC codes is constructed
from generalized quadrangles, which was not present in the previous literature. These finite geometries
possess some properties which are interesting when producing LDPC codes.
Codes constructed from generalized quadrangles, which we will call GQ codes, have yielded good results
when decoded using the bit-flipping algorithm (an algorithm that is interesting for proofs, but not used
in practice, see subsubsection 2.1.1) and will be explored in subsection 2.6. They also perform well when
decoded using the sum-product algorithm (the standard way to decode LDPCs, see subsubsection 2.1.3),
as their error-correcting capabilities are significantly greater than randomly constructed codes. A series of
simulators have been programmed to calculate the performance of these codes, and they have been posted
online (see https://github.com/TomasOrtega/LDPC) for public use.

1.3 Organization
The remainder of this work is structured as follows. First, the decoding and encoding of LDPC codes
will be discussed in subsection 2.1 and subsection 2.2, respectively. Second, in subsection 2.3 expander
graphs will be defined, as well as their relationship with LDPC codes. Some practical considerations when
constructing LDPC codes will be introduced in subsection 2.4, which will give way to talking about used
implementations of these codes, and their different characteristics in subsection 2.5. After seeing these
practical codes, in subsection 2.6 we will introduce generalized quadrangles. These incidence structures will
allow us to propose a new family of LDPC codes which we will name GQ codes. Some experimental results
will follow in section 3, where the performance of the different ensembles of codes we have introduced will
be compared via a series of Monte-Carlo experiments. Finally, the results of this work will be summarized
in section 4, and some possible future work will be detailed. Appendix A contains the main functions of
code that were developed for this work.


2. LDPC codes
In this section we will discuss the definitions and theory behind LDPCs. First, we will see some basic
definitions and notation, which have been inherited from [Tom+17]. Afterwards, we will see how these
codes are decoded, and we will prove the existence of asymptotically good LDPC codes using expander
graphs, which will be defined later. Finally, we will see some practical considerations when designing these
codes.

Definition 2.1 (LDPC code). LDPC codes are linear block codes whose parity-check matrix, as the name
implies, is sparse.

We will maintain the definition of an LDPC code purposefully vague, as we will consider different
ensembles of sparse matrices later on to create particular families of codes. These codes can be iteratively
decoded using what is known as the sum-product algorithm (also known as the belief propagation soft
decision decoder). This will be described in subsection 2.1. It has been shown, for example in [Chu+01]
for a binary AWGN channel, that for long block lengths, the performance of LDPC codes using an iterative
decoding algorithm is very close to channel capacity.
The performance of an LDPC in a binary AWGN channel scenario using an iterative decoding algorithm
may be partitioned into three regions: erroneous, waterfall and error floor regions (see Fig. 3). The erroneous
region occurs at low Eb /N0 values, and ends somewhere after the Shannon limit of the given channel. In it,
the decoder is practically unable to correct any errors in the transmitted vectors (or frames). Afterwards,
as the signal power increases, the frame error rate (FER) decreases rapidly, resembling a waterfall. The
Eb /N0 value at which the waterfall region starts is commonly known as the convergence threshold in the
literature. At higher Eb /N0 values, the error rate starts to flatten, introducing an error floor in the FER
curve.
There are two main families of LDPC codes: random and algebraic. In the former, the parity-check
matrix is chosen at random, sometimes with restrictions. In the latter, the parity-check matrix is constructed
from some algebraic structure.
LDPC codes can also be classified depending on the regularity of their associated bipartite graph,
namely regular and irregular codes. In his doctoral thesis [Gal63], Gallager introduced the (n, λ, ρ) LDPC
codes, where n represents the block length, and λ and ρ are the number of non-zeros per column and row,
respectively. In other words, these codes' bipartite graphs are left λ-regular and right ρ-regular. The short
notation (λ, ρ) is also commonly used to represent these codes. In this work we will use the letters (c, r),
which stand for the number of ones per column and row in the check matrix, respectively.
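For illustration, a random left c-regular construction in the spirit of the thesis's Listing 1 (Appendix A.1); this stand-in only enforces the column regularity, so rows have r = cn/m ones on average rather than exactly:

n = 96; m = 48; c = 3;        % block length, number of checks, ones per column
H = zeros(m, n);
for j = 1:n
    H(randperm(m, c), j) = 1; % place c ones in distinct rows of column j
end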

2.1 Decoding
Given a received vector x, a maximum likelihood decoder outputs the codeword y that maximizes

Pr[x received | y sent].

Applying maximum likelihood decoding for LDPCs is not a good idea, since the size of the codeword
set grows exponentially with block length, and the decoder would have to iterate through all the possible
codewords to find the one with minimum probability of error. This is why the techniques that are employed
to decode LDPCs are iterative decoding algorithms, which need fewer resources in memory and computation
time. In many cases, they are also designed to allow computations to be done in parallel.

[Plot: frame error rate versus Eb/N0 (dB) for the N = 16200, K = 13320 DVB-S2 code, showing the error region, the waterfall region and the error floor region, together with Shannon's capacity converse bound.]

Figure 3: Error regions of a typical LDPC code for the AWGN channel. The code used to obtain this plot is
a rate 0.82 code used in practice for digital video broadcasting, taken from the DVB-S2 standard discussed
in subsubsection 2.5.1.

Iterative decoding schemes bring down the decoding time complexity from exponential to polynomial (and in some
cases linear) in block length. These characteristics allow the practical implementations of LDPC codes,
despite moderately increasing the probability of error. If a lower probability of error is required, this is
usually solved by increasing the block length of the code.
We will describe three iterative decoding algorithms for LDPCs. They are introduced in order of
simplicity. The first, albeit simple, will allow us to prove the existence of asymptotically good LDPC codes
in subsubsection 2.3.2. However, for practical implementations this algorithm is too slow. The second
and third algorithms will be successive variations of the decoding algorithm that lower complexity at some
performance cost. We will observe that if there are cycles in the parity check matrix then these algorithms
might never converge, unless a maximum number of iterations is specified.

2.1.1 Bit-flipping algorithm

This method is also known as belief propagation for codes in a BSC, for a reference see [Bal20]. Let the
decoder receive a vector v , and perform the following routine:

1. The receiver calculates the syndrome of v , that is, H · v T . If the syndrome has weight 0, v is a
codeword and we are done. If it is not 0, we continue to step 2.

2. For every e of weight one, we calculate the weight of the syndrome of v + e. If some e decreases the
weight of the syndrome, we choose the e that minimizes it; if no e does, the decoding breaks down.

3. We replace v by v + e, and start again on step 1.

We can see that the weight of the syndrome of the received vector decreases at each iteration of the
routine, so we will eventually find a codeword.
Observation 2.2. Calculating the syndrome of v takes (n − k)r operations, where r is the number of
ones per row. Afterward, the calculation of each syndrome s(v + e) takes only c operations, where c is the
number of ones per column, since s(v + e) = s(v ) + s(e), and the syndrome of v was already calculated.
In other words, obtaining the minimum syndrome for all errors of weight one uses c · n operations.
Effectively, at each iteration we flip the bit that will ensure that the maximum number of check
conditions are satisfied, until all check conditions are satisfied and we arrive to a codeword. However, each
iteration consists of one v syndrome calculation, and n syndrome calculations of v + e, which take O(n)
in total, exploiting the sparsity of the check matrix. Since we have at most n bits to correct, the algorithm
converges in at most n iterations, so the total complexity is O(n2 ). This is just to prove that this method
is indeed polynomial; in practical scenarios, this algorithm behaves linearly.
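The routine above translates almost line by line into Matlab. A minimal sketch (the function and variable names are ours), using the incremental syndrome update of Observation 2.2:

function v = bitflip_decode(H, v)
% Bit-flipping decoder: v is a received row vector, H the check matrix.
    s = mod(H * v', 2);                    % step 1: syndrome of v
    while any(s)
        % weight decrease obtained by flipping each bit (step 2)
        drop = sum(s) - sum(mod(s + H, 2), 1);
        [best, i] = max(drop);
        if best <= 0                       % no flip reduces the syndrome
            error('decoding breaks down');
        end
        v(i) = 1 - v(i);                   % step 3: flip the chosen bit
        s = mod(s + H(:, i), 2);           % update the syndrome incrementally
    end
end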
We will use this method to decode LDPCs in our proof of existence of asymptotically good LDPC codes
in subsubsection 2.3.2, but practical schemes need faster decoding algorithms, which will be introduced
shortly.

2.1.2 Multiple bit-flipping algorithm

In this version of bit-flipping belief propagation, in each iteration the decoder computes all parity-checks and
flips the bits that are contained in more than some fixed number δ of non-satisfied parity-check equations.
That is, instead of flipping only the bit that is contained in the largest number of unsatisfied parity check
equations like in the last algorithm, this one flips several bits at once per iteration. The idea is that this

will reduce the number of iterations until the algorithm converges. Finally, if no bits are contained in δ or
more unsatisfied parity-check equations, the previous algorithm is applied.
Observation 2.3. This algorithm has no guarantee to converge, as it is not assured that the syndrome weight
of the decoded word will decrease at each iteration. A maximum number of iterations is needed, at which point
the algorithm will output a failure in decoding. We can observe that each iteration is still O(n), taking
into account the sparsity of the check matrix.
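A single iteration of this scheme can be sketched as follows (delta is the design threshold; the variable names are ours):

s = mod(H * v', 2);                 % a 1 marks an unsatisfied parity-check
unsat = s' * H;                     % unsatisfied checks containing each bit
v = mod(v + (unsat > delta), 2);    % flip all bits above the threshold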

This decoding scheme is very simple, but it is only applicable for the BSC at rates far below capacity.
If we approach capacity, a large number of errors will appear, which will break our heuristic decoding. The
following decoding scheme decodes directly from the a posteriori probabilities at the channel output.

2.1.3 Sum-Product algorithm

The probabilistic decoding scheme described in this section is due to [Gal63], and it is known as the sum-
product algorithm (among many other names, like iterative decoding, message-passing, belief propagation,
probabilistic decoding, ...). Although it involves more expensive operations at each iteration than the
previous schemes, its complexity is still linear in block length for regular LDPCs; its derivation is more
systematic and the decoding tends to converge faster for large block lengths. It is also applicable to the
AWGN channel. Nowadays, practical applications of LDPCs (see subsection 2.5) can have block lengths of
tens of thousands of bits, and therefore use variations of this method for decoding.
During the following derivations, we will assume that our LDPC code is (c, r )-regular, that is, the
parity-check matrix H has c ones per column and r ones per row. This will ease the notation and simplify
the calculations, but the reasoning is completely equivalent for irregular codes.
To start, assume that the transmitted bits are a priori equally likely to be 0 or 1. We transmit a codeword x
and receive a vector y . The probabilities of the received vector are determined by the channel transition
probabilities. We want to find the probability that the transmitted bit in position d is a 1 conditional on
y , and on the event S that x satisfies the c parity-check equations on bit d, which we write as

Pr[xd = 1 | y , S]. (1)

Observe that S always happens when x is a codeword; however, we need to add this event in order to
assume independence of bits in Lemma 2.4, which will aid us in future proofs.
Let Pd be the probability that xd = 1, conditional on the received yd . Let us use i to iterate through
the parity checks containing bit d. Observe that i will go from 1 to c, since H is (c, r )-regular. Let us use
l to iterate through the variables involved in the i-th check, so l will go from 1 to r , and we will assume
without loss of generality that xd is always in position r . Let f be the function that sends each pair (i, l)
to the index of the coordinate of the l-th bit involved in the i-th parity check equation that contains d.
For an example with a visual representation, see the first tier of Fig. 4, where we would use i ∈ {1, 2, 3}
to iterate through the parity-checks containing bit d. At each one of these checks, l ∈ {1, 2, 3, 4} would
iterate through the tier 1 variables, where the fourth variable is always bit d, i.e., f (i, 4) = d for all i.
Let Pil be the probability that xf(i,l) = 1 conditional on the received yf(i,l), i.e., the analogue of Pd
for the l-th bit in the i-th parity check equation containing d. In fact, Pir = Pd for all i.
Before we present the theorem that will give us an iterative method to obtain approximations of the
likelihood ratios of the received bits, we need the following lemma:


[Diagram: the root d at the bottom; above it, the three checks containing d (i = 1, 2, 3); above each check, its other involved variables, labelled (i, 1), (i, 2), (i, 3).]

Figure 4: Example of indexing for a parity-check tree representation.

Lemma 2.4. Consider a sequence of m independent bits in which the l-th bit is a 1 with probability Pl.
Then the probability that an even number of bits are 1 is

$$\frac{1 + \prod_{l=1}^{m}(1 - 2P_l)}{2}.$$

Proof. For m = 1, the lemma stands. For m > 1, we suppose it true for m − 1, so the probability that an
even number of bits are 1 is

$$(1 - P_m)\frac{1 + \prod_{l=1}^{m-1}(1 - 2P_l)}{2} + P_m\left(1 - \frac{1 + \prod_{l=1}^{m-1}(1 - 2P_l)}{2}\right) = \frac{1 + \prod_{l=1}^{m}(1 - 2P_l)}{2}.$$
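The closed form is easy to sanity-check by simulation, for instance (with arbitrary illustrative probabilities):

P = [0.1 0.3 0.25];                           % Pr[bit l = 1]
closed_form = (1 + prod(1 - 2*P)) / 2
bits = rand(1e6, numel(P)) < P;               % independent bit samples
empirical = mean(mod(sum(bits, 2), 2) == 0)   % Pr[even number of ones]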

Knowing this lemma, and using the previously presented notation, let us prove the following
theorem:
Theorem 2.5. If every bit is statistically independent of each other, and S is the event where the trans-
mitted bits satisfy the c parity-check constraints on bit d, then

$$\frac{\Pr[x_d = 0 \mid y, S]}{\Pr[x_d = 1 \mid y, S]} = \frac{1 - P_d}{P_d}\prod_{i=1}^{c}\frac{1 + \prod_{l=1}^{r-1}(1 - 2P_{il})}{1 - \prod_{l=1}^{r-1}(1 - 2P_{il})}. \quad (2)$$

Proof. By Bayes' rule,

$$\Pr[x_d = 1 \mid y, S] = \frac{\Pr[S \mid x_d = 1, y] \cdot \Pr[x_d = 1 \mid y]}{\Pr[S \mid y]} = P_d\,\frac{\Pr[S \mid x_d = 1, y]}{\Pr[S \mid y]},$$

and computing the analogous expression for Pr[xd = 0 | y, S], it follows that

$$\frac{\Pr[x_d = 0 \mid y, S]}{\Pr[x_d = 1 \mid y, S]} = \frac{1 - P_d}{P_d} \cdot \frac{\Pr[S \mid x_d = 0, y]}{\Pr[S \mid x_d = 1, y]}. \quad (3)$$

Now consider the probability Pr[S | xd = 0, y]. Since all bits are statistically independent from each other,
and xd = 0, for the c parity-check equations involving xd to be satisfied, the r − 1 remaining bits in each
equation must have an even number of ones. We can now use Lemma 2.4 to obtain

$$\Pr[S \mid x_d = 0, y] = \prod_{i=1}^{c}\frac{1 + \prod_{l=1}^{r-1}(1 - 2P_{il})}{2}.$$

Calculating analogously for Pr[S | xd = 1, y] and substituting in (3) gives the theorem's statement.

[Diagram: the parity-check tree rooted at d, with tier 1 checks above the root, then tier 1 variables, tier 2 checks, and tier 2 variables.]

Figure 5: Parity-check tree representation.

As we can see from the expression in Theorem 2.5, if the Pil are known, we have obtained the maximum
likelihood estimation of xd. However, we have not exploited the full properties of the code. Let us
consider the graph associated with our code. Gallager noticed that if the component that is connected to
xd can be represented by a tree (see Fig. 5), then we can apply Theorem 2.5 iteratively to obtain the Pil.
Observing the two-tier case in Fig. 5, we can extend Theorem 2.5 using (2) once for each of the variables
in tier 1, and find the probability that each is a 1 conditioned on the received bits in tier 2. When that is
calculated, we can apply the theorem to the root, and obtain the maximum likelihood estimation of xd.
By induction, we can apply the previous argument iteratively for multiple tiers as long as the Tanner
graph is a tree. This appears to be good news: if the Tanner graph is cycle-free, sum-product decoding is
optimal! The bad news: such graphs produce weak linear codes. Nevertheless, practical experiments show
that sum-product decoding using LDPC graphs with large girth behaves well [RU08]. This behavior is
still not well understood (except for erasure channels, where progress has been made, see [Di+02; Orl+02]),
but the intuition is that a large girth means that locally, the Tanner graph resembles a tree. Thus, the
parity-check tree representation can be used for many tiers before collapsing on an already seen variable,
at which point we hope the probabilities have converged to stable values. It is also reasonable to think
that distant dependencies have a relatively minor effect.
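For reference, a compact Matlab sketch of the decoder in one common matrix formulation (ours, not the thesis's code), written in the log-likelihood ratio domain: with L = log(Pr[x = 0]/Pr[x = 1]) we have tanh(L/2) = 1 − 2P, so the products of Lemma 2.4 become the so-called tanh rule.

function v = sumproduct_decode(H, Lch, max_iter)
% H: m-by-n check matrix; Lch: 1-by-n channel LLRs log(Pr[x=0]/Pr[x=1]).
    M = H .* Lch;                            % variable-to-check messages
    for it = 1:max_iter
        T = tanh(M / 2);                     % tanh(L/2) = 1 - 2P per message
        T(H == 0) = 1;                       % neutral element for the product
        E = 2 * atanh(prod(T, 2) ./ T);      % extrinsic check-to-variable messages
        E(H == 0) = 0;
        L = Lch + sum(E, 1);                 % a posteriori LLR of each bit
        v = double(L < 0);                   % hard decision
        if ~any(mod(H * v', 2)), return; end % stop when v is a codeword
        M = (L - E) .* H;                    % exclude each edge's own message
    end
end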
The performance of the first and third decoding algorithms is compared in Fig. 6. In this example, a random
left 3-regular code of rate 1/2 was used in a BSC. The random code in question is MacKay’s 96.33.964
code [Mac05], employed for reproducibility, but codes that were randomly generated using Listing 1 yield
the same results. The probabilistic algorithm clearly outperforms the bit-flipping one.


[Plot: BER versus channel probability of error p, for N = 96, K = 48, 3 ones per column: random LDPC with bit-flipping, random LDPC with sum-product, and uncoded transmission.]

Figure 6: Bit-flipping and sum-product performance comparison, with 10^5 Monte-Carlo runs.

2.2 Encoding
In order to encode LDPCs, the standard technique used to encode linear codes is often employed. This consists
in obtaining G, the generator matrix, which is orthogonal to H, the parity-check matrix, and multiplying
the length-k data vectors with G to obtain the codewords. Moreover, to avoid multiplying by the full
matrix, G is often stored in standard form.

Definition 2.6 (Standard form). A generator matrix of an [n, k] linear code is said to be in standard form
when G = [Ik | P], where P is a k × (n − k) matrix that defines the redundancy bits. This defines a standard
form for the parity-check matrix H = [−P^T | In−k].

We can observe that since we are dealing with binary codes, −P^T = P^T. Also, we can verify that
GH^T = P − P = 0, so G and H are orthogonal. There are three main advantages of operating with G in
standard form (a worked example follows the list):

• G occupies less memory in standard form, as we only have to store the matrix P and the dimensions
of the code to obtain G.

• Calculating the codeword corresponding to an information vector is less expensive, as the first k bits
of the codeword will correspond to the data bits.

• The decoder can easily obtain the information bits (the first k bits of the codeword) once it has
corrected the errors of the received vector. In other words, the inverse of the map from data vectors
to codewords is trivial.
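A minimal worked example of these three points (the block P below is an arbitrary illustrative choice):

P = [1 0 1; 0 1 1; 1 1 0; 1 1 1];      % k = 4 data bits, n - k = 3 redundancy bits
k = size(P, 1); n = k + size(P, 2);
G = [eye(k), P];                        % G = [I_k | P]
H = [P', eye(n - k)];                   % H = [P^T | I_{n-k}] (binary: -P^T = P^T)
assert(all(all(mod(G * H', 2) == 0)))   % G and H are orthogonal
v = [1 0 1 1];                          % data vector
c = mod(v * G, 2)                       % codeword; its first k bits equal v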

The techniques employed to encode LDPCs in some practical scenarios (e.g., NASA’s deep-space
communications [Alc+08]) are small variants of this standard technique, optimized for quasi-cyclic codes,
which will be described later. A further modification has been performed for 5G applications, where a
variant of the Richardson and Urbanke generator matrix transformation has been optimized for quasi-cyclic
codes (see [RU01; Li+06]). We will not delve into these techniques, as they are not the main focus of this
work.

2.3 Expander codes


In this section, we will prove that there exist asymptotically good LDPC codes, and that we can decode
them fast. To do that, we will use our regular bipartite graphs with which we picture LDPC codes, but we
will ask them to satisfy the expander property (which will be defined later). The set of LDPC codes that
we will obtain from expander graphs is often referred to in the literature as expander codes. These graphs
will allow us to assure that all sets of variable vertices have a large enough proportion of check vertices
as neighbors. Then, we will see that the derived expander codes have a minimum Hamming distance growing
linearly with block length. Finally, we will see that the decoding algorithm described in subsubsection 2.1.1
corrects words that are within half the minimum Hamming distance of the sent codeword. This will allow
us to conclude that these codes with the presented decoding algorithm are asymptotically good, and can
be decoded efficiently. For the source of this proof, see [Bal20].

Definition 2.7 (Expander property). Given a left c-regular bipartite graph, with stable sets Vv and Vp ,
where |Vv | = n, we say that it has the expander property with respect to δ if for all subsets S of Vv of size
less than δn, the set of S's neighbors N(S) satisfies |N(S)| > (3/4)c|S|.


In other texts, the definition of graphs with the expander property might differ slightly from the given
one, but this is a simplified definition for bipartite graphs, and the (3/4)c proportion is chosen so that the
subsequent proofs can be derived.

2.3.1 Existence of desired expanders

Lemma 2.8 (Existence of good expander graphs). Given c > 4 and R ∈ (0, 1), there is a constant
δ ∈ (0, 1), dependent on c and R, for which a left c-regular bipartite graph with the expander property
with respect to δ exists, for all n large enough, with left and right stable set sizes n and m = ⌊(1 − R)n⌋,
respectively.

Consider the family Θ of bipartite left c-regular graphs with left and right stable set sizes n and m,
respectively. The following proof will use a probabilistic argument to see that our desired graph exists
within Θ, but will give us no indication as to how to construct it. However, we will see that for n large
enough there always is a graph in Θ that satisfies our desired properties, which will allow us to randomly
construct good LDPC codes.

Proof. For a graph in Θ, let us define the random variable XS,T for each pair of subsets S ⊆ Vv and
T ⊆ Vp of sizes |S| = s < δn and |T| = ⌊(3/4)cs⌋. XS,T takes value one when all the neighbors of S are in
T, otherwise it takes value zero. Observe that if

$$\Pr\left[\sum_{S,T} X_{S,T} = 0\right] > 0 \quad (4)$$

then there exists at least one graph in Θ for which $\sum_{S,T} X_{S,T} = 0$, which means that XS,T = 0 for all S, T
of the described sizes. Therefore, no T contains all the neighbors of a subset S, so each S has more than
|T| = ⌊(3/4)cs⌋ neighbors. In short, this graph has the expander property with respect to δ. Thus, our goal
will be to prove that when n is large enough (4) is true.
To start, given subsets S and T, we can observe that the probability that all c edges that start from a
randomly chosen vertex in S end in T is $\binom{|T|}{c} / \binom{m}{c}$. Computing the probability that all the cs edges that
originate in S end in T gives us

$$\Pr[X_{S,T} = 1] = \left(\frac{\binom{|T|}{c}}{\binom{m}{c}}\right)^{s} = \left(\prod_{i=0}^{c-1}\frac{|T| - i}{m - i}\right)^{s} \le \left(\frac{\lfloor\frac{3}{4}cs\rfloor}{m}\right)^{cs},$$

with a strict inequality when |T| < m. By summing over all the subsets S and T of the allowed sizes we
arrive to the following expression:

$$\sum_{S,T}\Pr[X_{S,T} = 1] < \sum_{s=1}^{\lfloor\delta n\rfloor}\binom{n}{s}\binom{m}{\lfloor\frac{3}{4}cs\rfloor}\left(\frac{\lfloor\frac{3}{4}cs\rfloor}{m}\right)^{cs}. \quad (5)$$

Now we need the well known bound

$$\binom{n}{k} = \frac{n}{k}\cdot\frac{n-1}{k-1}\cdots\frac{n-(k-1)}{1} \le \frac{n^k}{k!} < \left(\frac{ne}{k}\right)^{k}, \quad (6)$$

where the last inequality follows from the fact that

$$e^k = \sum_{j=0}^{\infty}\frac{k^j}{j!} > \frac{k^k}{k!}.$$

Applying (6) and some algebra to (5) gives us

$$\sum_{S,T}\Pr[X_{S,T} = 1] < \sum_{s=1}^{\lfloor\delta n\rfloor}\left(\frac{ne}{s}\right)^{s}\left(\frac{\frac{3}{4}cs\,e^{3}}{m}\right)^{\frac{cs}{4}} = \sum_{s=1}^{\lfloor\delta n\rfloor}\left(ne\,s^{\frac{c}{4}-1}\left(\frac{\frac{3}{4}c\,e^{3}}{m}\right)^{\frac{c}{4}}\right)^{s},$$

and since s ≤ δn and m = ⌊(1 − R)n⌋,

$$\sum_{S,T}\Pr[X_{S,T} = 1] < \sum_{s=1}^{\lfloor\delta n\rfloor}\left(e\,\delta^{\frac{c}{4}-1}\left(\frac{\frac{3}{4}c\,e^{3}n}{\lfloor(1-R)n\rfloor}\right)^{\frac{c}{4}}\right)^{s}. \quad (7)$$

Observe that for n large enough we can bound

$$e\left(\frac{\frac{3}{4}c\,e^{3}n}{\lfloor(1-R)n\rfloor}\right)^{\frac{c}{4}}$$

with a quantity K that depends on R and c, but not on n. Finally,

$$\sum_{S,T}\Pr[X_{S,T} = 1] < \sum_{s=1}^{\infty}\left(K\,\delta^{\frac{c}{4}-1}\right)^{s} < 1,$$

provided that $K\,\delta^{\frac{c}{4}-1} < \frac{1}{2}$. Since c > 4, we can always choose a δ > 0 such that this is satisfied. This
brings us to

$$\Pr\left[\sum_{S,T}X_{S,T} \neq 0\right] = \Pr\left[\bigcup_{S,T}(X_{S,T} = 1)\right] \le \sum_{S,T}\Pr[X_{S,T} = 1] < 1, \quad (8)$$

so (4) holds.

Corollary 2.9. Given R ∈ (0, 1) and δ ∈ (0, 1), there is a constant c > 4 dependent on δ and R, for which
a left c-regular bipartite graph with the expander property with respect to δ exists, for all n large enough,
with left and right stable set sizes n and m = ⌊(1 − R)n⌋, respectively.

Proof. Following the same proof as in Lemma 2.8 we can observe that the terms in the sum of (7) decrease
as a function of c, since δ < 1. Thus, for n large enough, we can always find a c such that the sum adds
up to less than 1 and conclude the proof in the same fashion.

2.3.2 Asymptotically good codes

Now that we have proven the existence of good expander graphs, we will see that these generate asymp-
totically good codes. Fixing R and δ as in Lemma 2.8 we will obtain a series of check matrices that define
codes with rate at least R and minimum Hamming distance of at least δn (so they are asymptotically
good). Furthermore, these codes will be decodable in polynomial time using the algorithm described in
subsubsection 2.1.1, which we will prove later on.


Lemma 2.10. Given a left c-regular, bipartite graph with left and right independent set sizes n and
m = ⌊(1 − R)n⌋ and the expander property with respect to δ, where R, δ ∈ (0, 1), we can obtain a binary
linear code with rate at least R and minimum Hamming distance at least δn.

Proof. Let us define our code by constructing its parity-check matrix H from the bipartite graph as in
previous sections, where the n vertices of the left independent set represent the variables and the m
vertices on the right independent set represent the parity-check equations (see Fig. 1). Observe that since
our graph is left c-regular, H has c ones in each column. To check the rate of the code we can use that
rank(H) ≤ m, so

n − m = n − ⌊(1 − R)n⌋ ≥ Rn,

which makes the rate of the code defined by H at least R.
Now, to prove the minimum Hamming distance of δn we will assume there is a non-zero codeword u
of weight lower than δn and arrive to a contradiction (by Proposition 1.5, this will prove that the minimum
Hamming distance is at least δn). Let S ⊆ Vv be the subset of the variable nodes that correspond to the
non-zero bits of u. Thus, |S| < δn, and every vertex in S has c edges going to N(S), due to our graph
being left c-regular. Due to the expander property of our graph, |N(S)| > (3/4)c|S|. Since every parity-
check equation is satisfied, every vertex in N(S) has an even number of edges arriving to it from S, so at
least two. Therefore, counting edges gives us |S|c ≥ 2|N(S)|. Joining the last two inequalities gives us
|S|c ≥ 2|N(S)| > (3/2)c|S|, which is a contradiction, and we are done.

To finish this section, we will now prove the result that we initially wanted, which is that these codes
can be decoded efficiently using the algorithm from subsubsection 2.1.1. To do that, we will first need the
following lemma:

Lemma 2.11. Consider a received word x such that d(x, u) < δn for some codeword u of a binary linear
code obtained from a bipartite, left c-regular graph which has the expander property with respect to δ.
Then, there is an i ∈ {1, ... , n} such that wt(s(x + ei )) < wt(s(x)).

Proof. Let S be the set of vertices corresponding to the positions where x and u differ. Since |S| < δn,
by the expander property |N(S)| > (3/4)c|S|. Recall that N(S) are the parity-check equations where bits in
S are involved. Let us divide N(S) into disjoint sets U and T, where T are the vertices that correspond
to the parity-checks satisfied by x, and U are the unsatisfied ones. Thus, |T| + |U| = |N(S)| > (3/4)c|S|.
Since the satisfied parity-checks need an even number of adjacent vertices from S, |U| + 2|T| ≤ c|S|, and
together with the previous inequality we obtain

$$|T| + |U| - (|U| + 2|T|) > \frac{3}{4}c|S| - c|S| \implies |T| < \frac{1}{4}c|S| \implies |U| > \frac{1}{2}c|S|. \quad (9)$$

This implies that there are more than (1/2)c|S| unsatisfied parity-checks, so wt(s(x)) > (1/2)c|S|. Given that u
is a codeword, it has syndrome 0, so

$$s(x) = s(x + u) = \sum_{i \in S} s(e_i),$$

where ei is the all-zero word except a one in the i-th position. Since wt(s(x)) > (1/2)c|S|, and wt(s(ei)) = c,
there is at least one i where s(x) and s(ei) share a one in more than (1/2)c positions. Thus, adding
ei to x increases the syndrome weight in fewer than (1/2)c positions while decreasing it in more than (1/2)c, so
wt(s(x + ei)) < wt(s(x)).

Observe that here is where we have used that our definition of the expander property requires |N(S)| >
(3/4)c|S|, since this is the minimum number of vertices that we need to be able to apply the pigeonhole
principle and conclude the proof. Finally, we are now in a position to prove the following:

Theorem 2.12 (Expander codes can be decoded efficiently). Consider a binary linear code of length n
obtained from a left c-regular, bipartite graph with the expander property with respect to δ. We can use
the decoding algorithm from subsubsection 2.1.1, which ends in a number of steps polynomial in n and
corrects up to (1/2)δn error bits.

Proof. Consider a received word x. By assumption, there is a codeword u such that d(x, u) < (1/2)δn.
Following the notation of Lemma 2.11, let S be the set of vertices corresponding to the positions where x and
u differ. Observe that |S| < (1/2)δn, so

$$|U| \le |N(S)| \le c|S| < \frac{1}{2}c\delta n$$

since the code is obtained from a left c-regular graph. By Lemma 2.11, there exists an ei such that
x′ = x + ei has a syndrome weight lower than that of x; thus, the number of unsatisfied parity-checks decreases.
Joining the previous expression with what we know from (9), we obtain

$$\frac{1}{2}c|S| < |U| < \frac{1}{2}c\delta n \implies |S| < \delta n,$$

so the new x′ will always be at a distance from u smaller than δn, which allows us to iteratively apply
Lemma 2.11, renaming x′ as x. In conclusion, the number of unsatisfied parity-checks decreases with each
iteration, and the set of bits that differ between x and u is always smaller than the minimum Hamming
distance, so x will become u in at most (1/2)cδn steps. Since each iteration consists in n syndrome weight
calculations, this algorithm has a time complexity polynomial in n.

Corollary 2.13. If our bipartite graph is (c, r)-regular (left c-regular, and right r-regular) the bit-flipping
algorithm runs in linear time.

Proof. Let us start with a set S of variables that we need to check to see if flipping them reduces the
syndrome. At first, S is all the variables. We need one initial syndrome calculation, which takes O(n).
At every iteration of the bit-flipping algorithm we take out a variable of S, and check if flipping its
value reduces the syndrome. This takes c operations per variable. If it does not reduce the syndrome, we
discard it; otherwise we flip it and update the syndrome. Now we need to update the set of variables which
may reduce the syndrome: we have changed c equations, each with r − 1 other variables involved, so we
need to add at most c(r − 1) variables to S.
Using the previous theorem, we know that the bit-flipping algorithm ends in at most (1/2)cδn flips. There-
fore, during the whole running time of the algorithm we can only add (1/2)c^2(r − 1)δn variables to S, so the
bit-flipping algorithm runs in linear time.

Observation 2.14. In practice, even if H is only left c-regular, this algorithm runs in linear time. Tracing
back the previous argument, we cannot ensure that we add a constant number of variables to S at each
iteration. However, if H has dimensions m × n, then H has cn ones, which means that the average number
of involved variables per equation is cn/m ≤ c/(1 − R) (where R is the rate of the code). It is reasonable to
assume that we add around c^2/(1 − R) variables to S at each iteration, so the bit-flipping algorithm will
still behave linearly.


2.4 Practical considerations


In practical implementations, memory, encoding and decoding speed, and hardware complexity are im-
portant factors. They are sometimes prioritized, which can come at a cost in the code’s error correcting
performance. For example, irregular LDPC codes are known to have lower error rates than their regular
counterparts [Lub+01], but the required hardware to implement these codes is more complex. This has
made irregular LDPC codes unattractive from an industry point of view [Tom+17].
For practical encoding and decoding, there is a general mantra that computing and memory cost should
be at most linear in block length. In order to achieve this, codes used in industry standards are quasi-cyclic,
and constructed via a protograph.
Definition 2.15 (Protograph (or base graph)). A graph of small dimensions which is used as a template to
produce larger graphs.
The process of making a large graph from a protograph is called lifting. Lifting the base graph by a
factor k means making k copies of said graph, and permuting the edges that came from the same pair
of vertices. An example of this procedure can be seen in Fig. 7, where the edges generated from lifting the
edge (v1, c1) have had a cyclic shift, as can be seen in the corresponding check matrix.

Base check matrix (columns v1, ..., v4; rows c1, c2) and its lift:

( 1 1 1 0 )       ( 0 1 0  1 0 0  1 0 0  0 0 0 )
( 0 1 0 1 )  →    ( 0 0 1  0 1 0  0 1 0  0 0 0 )
                  ( 1 0 0  0 0 1  0 0 1  0 0 0 )
                  ( 0 0 0  1 0 0  0 0 0  1 0 0 )
                  ( 0 0 0  0 1 0  0 0 0  0 1 0 )
                  ( 0 0 0  0 0 1  0 0 0  0 0 1 )

Figure 7: Construction of a code by lifting a protograph, lifting size k = 3.

Observe that every non-zero element in Fig. 7 has been transformed into a 3×3 identity, or a permutation
of its columns. When the permutations are column shifts, the resulting graph is quasi-cyclic. In other words,
we just have to store the number of columns we shift each element of the protograph and the size of the lift
to obtain the lifted graph. The Wi-Fi standard described in subsubsection 2.5.4 shows a practical example
where this is used.
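A sketch of the lifting step for the quasi-cyclic case: every one of the base matrix becomes a cyclically column-shifted k × k identity, so only the table of shifts has to be stored. The shift values below are chosen to reproduce Fig. 7, where only the edge (v1, c1) is shifted:

B = [1 1 1 0;                % base (protograph) matrix of Fig. 7
     0 1 0 1];
S = [1 0 0 0;                % column shift assigned to each edge
     0 0 0 0];
k = 3;                       % lifting size
[mb, nb] = size(B);
H = zeros(mb*k, nb*k);
for i = 1:mb
    for j = 1:nb
        if B(i, j)           % shifted identity block for each edge
            H((i-1)*k+(1:k), (j-1)*k+(1:k)) = circshift(eye(k), [0, S(i, j)]);
        end
    end
end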
Protograph-constructed LDPCs have the advantage of being easily encodable and decodable (the en-
coder and decoder hardware is less complex than other types of LDPC codes), and their computations can
be done in parallel, as well as being memory efficient. As we can see from Fig. 7, multiplying by the lifted
matrix can be eased by a simple bit shift for every permuted identity, and carried out in parallel. The price
to pay for this structure is that their performance is slightly weaker than some irregular codes of the same
length, but this is often preferable, as the matrix can be lifted by a larger factor to compensate for this
deficiency [Tom+17].
It is trivial to see that lifting a graph by the identity maintains the base graph’s expansion properties.
It is also not hard to see the following proposition on the girth of a lifted graph.

Proposition 2.16. Lifting a graph can only increase its girth.

Proof. Let us consider a base graph G and lift it to G′. We can label every vertex v′ ∈ G′ by the name of
the vertex of G it is a copy of. Any cycle v′1, v′2, ..., v′k in G′ can be traced in G by following the labels
of each vertex v′i, and thus any cycle in G′ has length larger than or equal to that of a cycle in G.

There exists a good amount of literature on constructing good expanders by chaining two-lifts of
graphs, as first proposed by Bilu and Linial [BL06]. However, successive two-lifts are not an optimal way
to construct expanders for practical implementations, because there is no significant gain in storage or
hardware implementation.
As a final note on practical considerations for LDPC codes, there exist many good open LDPC databases
online. See the following links for the databases that have been used to obtain the majority of the practical
codes that are discussed in this work: Creonic, TU Kaiserslautern, David MacKay, and the AFF3CT project.

2.5 Used implementations


In this section, we will explore different LDPCs that are used in real-world applications and highlight their
key characteristics. Plots have been added in order to illustrate these attributes. The codes that we will
present are for different cases of digital communications. LDPCs for mass storage systems generally operate
at a higher signal to noise ratio, and thus the experiments that are needed to evaluate their performance are
more computationally expensive. Most papers in this area employ Field-Programmable Gate Array (FPGA)
hardware implementations for their experiments. Since we will use software to simulate our experiments,
we will restrict ourselves to LDPCs for digital communications.
First, we will see the earliest use case of LDPCs, for digital video broadcasting. We will detail the simple
parity-check matrix structure that was proposed and simulate its performance. Next, we will see the code
for 5G NR, which we will use to introduce a common technique to obtain codes of different rates from the
same check matrix called puncturing. The following family of codes will be NASA’s AR4JA, which will
be useful to explain protograph-based codes. This is a method that helps create large, simple codes from
a small base matrix. Finally, we will use a Wi-Fi 802.11n code to illustrate how LDPCs are specified in
industry standards.

2.5.1 Digital video broadcasting

The first major practical use of LDPCs was in satellite television broadcasting. In 2005, the Digital Video
Broadcasting - Satellite - Second Generation (DVB-S2) standard introduced exceptionally long LDPC codes
(up to 64800 bits of block length) to provide a powerful coding scheme, which greatly outperformed its
predecessors and boasted of being 0.6 to 0.8 dB away from the Shannon capacity limit [ESL04]. Today it
remains popular and is the de-facto standard for satellite broadcast.
Several characteristics of this communications scenario made it propitious for LDPCs to be quickly adopted. First, the available power for satellite transmission is limited: since satellites are expensive to deploy, each dB of power that can be gained by a powerful coding scheme is vital. Second, latency is not of the utmost concern. This allows the long block lengths of LDPCs to be employed, which give excellent performance at the cost of high hardware complexity.
At such block lengths, super-linear encoding complexity would be prohibitive. In order to allow linear-time encoding, the parity-check matrices of DVB-S2 are highly structured, as can be seen


in Fig. 8. These large code matrices are constructed from circulant matrices (each row is a one-column
shift of the previous one), so only the first row of the matrix needs to be stored. We can also observe that
the right part of the matrix is almost in standard form, which is a great advantage for fast encoding.

Figure 8: DVB-S2 rate 1/2 parity check matrix, with a zoom-in to see the circulants.

While the right part of the parity-check matrix, which we will call $H_p$ for historical reasons, may seem like an identity, it is not. In fact, it is a double diagonal of ones, i.e., the following matrix:

\[
H_p = \begin{pmatrix}
1 &   &        &   & 0 \\
1 & 1 &        &   &   \\
  & \ddots & \ddots &  &  \\
0 &   &        & 1 & 1
\end{pmatrix}.
\]

This was proposed in a short letter by Ping et al. [PLP99], who observed empirically that this structure provided better performance than an identity matrix. Although they did not explain it in their paper, the intuition behind this structure is rather straightforward: if $H_p$ were an identity matrix and a data bit was flipped during transmission, only one parity-check equation would be affected by this flip. In contrast, by using the double-diagonal structure, all bits except the last data bit affect two equations if they are flipped, thus augmenting their protection.
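The double diagonal also makes encoding linear time: writing H = [H_d | H_p], the parity bits follow from the data bits by back-substitution through the double diagonal, i.e., by a running XOR. A minimal Matlab sketch of this accumulator encoding (ours, not the standard's reference implementation):

function c = encode_double_diagonal(Hd, d)
% ENCODE_DOUBLE_DIAGONAL Encodes data bits d so that [Hd Hp] * c = 0 (mod 2),
% where Hp is the double-diagonal matrix above.
s = mod(Hd * double(d(:)), 2); % contribution of the data bits to each check
p = mod(cumsum(s), 2);         % back-substitution: p(i) = s(i) + p(i-1)
c = [double(d(:)); p];         % systematic codeword: data bits then parity bits
end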
The naive idea that follows is: if a double diagonal augments protection for most data bits, why not try a triple-diagonal structure? Upon closer examination, one finds that adding more than two diagonals of ones results in many four-cycles in $H_p$, which is highly undesirable.
Due to the extraordinary length of this code, the simulations to obtain its performance have not been carried out in Matlab. Instead, the AFF3CT C++ library has been employed, which is an excellent open-source project [Cas+19]. With it, we have used a decoder called the AMS (average min-sum). This is a standard variation of the sum-product decoder we have seen, where the likelihood quotient is passed to the logarithm domain for numerical stability, and the minimum operation is used instead of the tanh⁻¹ that appears in the derivation for the BI-AWGN channel (see the sketch below). This decoder is significantly faster than the original sum-product, although it yields a slightly higher error rate. A maximum of 30 iterations of the AMS was used for decoding in Fig. 9. As we can see, this code performs at less than 1 dB from the Shannon limit at BER 10⁻⁶, and there is no detectable error floor at BER 10⁻⁸.
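For reference, here is a minimal sketch of the min-sum check-node update mentioned above (our own illustration, not AFF3CT code): the outgoing message on each edge combines the signs of all other incoming messages and the minimum of their magnitudes.

function Lout = minsum_check_update(L)
% MINSUM_CHECK_UPDATE Min-sum update at one check node.
% L holds the incoming LLRs of the bits taking part in this check; the
% outgoing message on edge i uses all incoming messages except the i-th.
n = numel(L);
Lout = zeros(size(L));
for i = 1:n
    others = L([1:i-1, i+1:n]);
    Lout(i) = prod(sign(others)) * min(abs(others));
end
end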

[Plot: frame and bit error rates vs. E_b/N_0 (dB) for the N = 64800, K = 32400 DVB-S2 code, together with Shannon's capacity converse bound.]

Figure 9: DVB-S2 rate 1/2 parity check matrix performance, with 30 AMS iterations.

25
LDPC codes

2.5.2 5G NR

5G NR uses quasi-cyclic LDPCs with a double diagonal structure as described in subsubsection 2.5.1
[DPS20]. This gives linear-time encoding and decoding, and allows for a simple encoding procedure
without having to store an encoding matrix. They are also protograph-based codes, and two base graphs
are used, named BG1 and BG2, depending on the needs of the transmission. A wide range of data rates
is allowed: from 0.2 to 0.95. Three techniques are applied to accommodate all these rates:

• Applying a set of 51 different lifting sizes and shift coefficients to the base matrices, depending on the rate. This will be exemplified in subsubsection 2.5.4.

• Adding filler data bits: if we want to transmit a block that is not one of the defined sizes, known filler bits can be appended to the data bits before encoding, and they will be removed before transmission. The receiver knows the filler bits and can assume them at decoding.

• Puncturing the code: a bit mask is applied to the encoded word, which leads to a shorter word. The receiver can then decode the word by first assuming a BEC for the missing bits, filling them out, and then performing standard decoding with the full word (see the sketch below).
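As a minimal sketch of the receiver side of puncturing (the function name and conventions are ours, not 5G NR's), the missing positions are treated as erasures by giving them a log-likelihood ratio of zero before standard decoding:

function llr_full = depuncture(channel_llrs, punctured_positions, N)
% DEPUNCTURE Rebuilds a full-length LLR vector from a punctured word.
% Punctured bits carry no channel information, so their LLR is 0,
% which is exactly the erasure value of a BEC-like treatment.
mask = true(1, N);
mask(punctured_positions) = false;  % positions removed before transmission
llr_full = zeros(1, N);             % LLR 0 = "no information"
llr_full(mask) = channel_llrs;      % fill in the received positions
end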

2.5.3 NASA’s deep space and proximity links

NASA's Accumulate-Repeat-4-Jagged-Accumulate (AR4JA) family of codes is a near-Shannon-capacity family for deep space and proximity links. Its name comes from the type of hardware that inspired the original protograph of the code. In Fig. 10 we can see the structure of its parity-check matrix for rate 1/2. A keen-eyed reader may notice that the dimensions of the matrix are unusual for a rate 1/2 code; this is because the check matrix is punctured to accommodate several transmission rates.
[Sparsity plot: NASA's AR4JA, rate 1/2, parity-check matrix of approximately 6000 × 10000.]

Figure 10: NASA’s AR4JA rate 1/2 parity check matrix.

As explained in reference [And+07], this family of codes was designed using density evolution techniques, which try to provide the best degree distribution of nodes for base graphs in order to optimize iterative decoding. When the protograph is lifted, a Progressive Edge Growth (PEG) algorithm is used: a greedy algorithm that selects the column shift maximizing a certain criterion, which can be the graph girth or a function of the density evolution techniques mentioned previously.
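To illustrate the greedy flavour of such shift selection, here is a minimal Matlab sketch (ours; the 4-cycle count is a simple stand-in for the actual girth or density-evolution criterion) that chooses the shift for a single circulant block of a lifted matrix:

function best_shift = pick_shift(H, row, col, k)
% PICK_SHIFT Tries every cyclic shift for the k x k block at (row, col)
% of the lifted matrix H and keeps the one creating the fewest 4-cycles.
best_shift = 0;
best_count = inf;
rows = (row - 1) * k + (1:k);
cols = (col - 1) * k + (1:k);
for s = 0:k-1
    H(rows, cols) = circshift(eye(k), -s, 2); % candidate circulant
    c = count_4cycles(H);
    if c < best_count
        best_count = c;
        best_shift = s;
    end
end
end

function c = count_4cycles(H)
% Two rows sharing more than one column create a 4-cycle in the Tanner
% graph; count the offending row pairs via the overlap matrix H * H'.
A = H * H';
A(1:size(A, 1) + 1:end) = 0; % ignore the diagonal
c = sum(A(:) > 1) / 2;
end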

2.5.4 Wi-Fi

LDPCs were introduced to Wi-Fi standards in the 2009 IEEE 802.11n standard. As is usual in industry
applications, the proposed parity-check matrices are quasi-cyclic, and constructed by lifting a base graph
with identity shifts. The standard describes matrices as in Table 1, a 12 × 24 matrix for a rate 1/2 code. Each entry represents the number of column shifts to apply to an 81 × 81 identity matrix to obtain the final 972 × 1944 parity-check matrix. Empty entries represent zero matrices.

57 - - - 50 - 11 - 50 - 79 - 1 0 - - - - - - - - - -
3 - 28 - 0 - - - 55 7 - - - 0 0 - - - - - - - - -
30 - - - 24 37 - - 56 14 - - - - 0 0 - - - - - - - -
62 53 - - 53 - - 3 35 - - - - - - 0 0 - - - - - - -
40 - - 20 66 - - 22 28 - - - - - - - 0 0 - - - - - -
0 - - - 8 - 42 - 50 - - 8 - - - - - 0 0 - - - - -
69 79 79 - - - 56 - 52 - - - 0 - - - - - 0 0 - - - -
65 - - - 38 57 - - 72 - 27 - - - - - - - - 0 0 - - -
64 - - - 14 52 - - 30 - - 32 - - - - - - - - 0 0 - -
- 45 - 70 0 - - - 77 9 - - - - - - - - - - - 0 0 -
2 56 - 57 35 - - - - - 12 - - - - - - - - - - - 0 0
24 - 61 - 60 - - 27 51 - - 16 1 - - - - - - - - - - 0

Table 1: 802.11n rate 1/2 table for block size 1944, with lifting size 81.

Lifting the base graph described in Table 1 gives the check matrix seen in Fig. 11.
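As a concrete usage sketch (relying on the hypothetical lift_shifts helper from subsection 2.4, and on a matrix B that stores Table 1 with empty entries as −1):

% B is the 12 x 24 shift matrix of Table 1, with '-' entries stored as -1
H = lift_shifts(B, 81);   % the 972 x 1944 parity-check matrix
assert(isequal(size(H), [972 1944]))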

[Sparsity plot: WiFi (802.11n), rate 1/2, 972 × 1944 parity-check matrix.]

Figure 11: 802.11n rate 1/2, n = 1944 parity-check matrix.


2.6 Generalized quadrangle codes


As has been explained in previous sections, graphs with large girth tend to produce good LDPC codes. In an
effort to produce good bipartite graphs with large girth, we have taken a look at generalized quadrangles,
which will be defined shortly. In this section, we will present the results we have found when exploring codes based on generalized quadrangles. First, we will need the following definition.
Definition 2.17 (Incidence structure). An incidence structure is a triple (P, L, I), where P is a set of points, L is a set of lines, and I is a set of incidences, which are pairs of a point and a line. The elements of I are called flags, and incidence is symmetric, that is, a point p being incident to a line l is the same as l being incident to p.

Generalized quadrangles are incidence structures whose main feature is the lack of any triangles (al-
though they contain many quadrangles). Their lack of triangles will be exploited when creating graphs
with large girth. The formal definition is as follows.
Definition 2.18 (Generalized quadrangle). A generalized quadrangle is an incidence structure (P, L, I ) that
satisfies

1. There is at most one common point on two distinct lines.


2. There is at most one line through two distinct points.
3. There is an s (s ≥ 1) such that every line contains exactly s + 1 points.
4. There is a t (t ≥ 1) such that every point is in exactly t + 1 lines.
5. For every point p not on a line l, there is a unique line m and a unique point q, such that p is on m,
and q on l and m.

The parameters of the generalized quadrangle are (s, t), and they are allowed to be infinite. Since we are
only interested in finite cases, we will focus our attention on these. If either s or t is one, the generalized
quadrangle is called trivial. A generalized quadrangle with parameters (s, t) is often denoted by GQ(s, t).

This definition is taken from reference [PT09], and it captures many important aspects, like the similar roles of points and lines in generalized quadrangles. However, it can be proved that some of the properties are redundant; for a more concise definition, see [BW11].
Corollary 2.19. An immediate corollary of the definition is that generalized quadrangles do not contain triangles. Given a line l and two points a, b ∈ l, there can be no point outside of l that forms a line with a and also with b, since that would violate the last condition in our definition.

The dual of a generalized quadrangle is the image of a map that sends points to lines and lines to
points, and also preserves incidence. It follows from the definition of a generalized quadrangle that its dual
is also a generalized quadrangle. From now on, any result that is proven for points is also proven for lines
on the dual generalized quadrangle.
Proposition 2.20. A GQ(s, t) has exactly (st + 1)(s + 1) points and (st + 1)(t + 1) lines.

Proof. Let l be a line of our GQ(s, t). We know by definition that it contains s + 1 points. Since each of
these points is in t + 1 lines, there are (s + 1)t lines incident to l. We will use this to count the points not
on l. By definition, each point outside of l belongs exactly to one of the (s + 1)t lines incident to l. Since
each line contains s + 1 points, in total, there are (s + 1)ts points outside of l and s + 1 points in it, and
the result for points follows. By duality, we achieve the result for lines.

A trivial example of a generalized quadrangle is a complete bipartite graph. The smallest non-trivial
generalized quadrangle is GQ(2, 2), known as the doily, which is plotted in Fig. 12. We can check that Proposition 2.20 holds, as the doily has 15 points and 15 lines.

Figure 12: Generalized quadrangle GQ(2, 2), the doily.

Definition 2.21 (Point-line incidence matrix). The point-line incidence matrix of a generalized quadrangle
G is a matrix with columns that represent the lines of G , and rows that represent the points of G , with a
one when a line and a point are incident, and zero otherwise.

The nomenclature for generalized quadrangles comes from the field of their associated polar spaces.
These fall outside of the scope of this thesis, but Table 2 includes the polar spaces of rank 2, which give
name to the known finite generalized quadrangles we will need in this work. This table is taken from
reference [BW11], where a more in-depth discussion about polar spaces can be found.

Name                     Polar space             (s, t)      Notes

Symplectic               W(3, q)                 (q, q)      dual of Q(4, q)
Unitary (or Hermitian)   U(3, q²) or H(3, q²)    (q², q)     dual of Q−(5, q)
Unitary (or Hermitian)   U(4, q²) or H(4, q²)    (q², q³)
Hyperbolic               Q⁺(3, q)                (q, 1)      a grid
Parabolic                Q(4, q)                 (q, q)      dual of W(3, q)
Elliptic                 Q−(5, q)                (q, q²)     dual of H(3, q²)

Table 2: The polar spaces of rank 2, where q = p^h and p is prime.

From an LDPC perspective, the point-line incidence matrix H of a generalized quadrangle has some very
nice properties. First, the absence of triangles means that any H will have a girth of at least 8. Secondly,
the fixed number of points per line and vice versa means that H will be (s + 1, t + 1)-regular. The known
structure of generalized quadrangles allows for reduced storage (knowing which generalized quadrangle we
are using will suffice to generate H) and may enable novel encoding and decoding procedures.
The results we have obtained (see subsection 3.2) point to GQ codes being superior to random column-regular codes using both the bit-flipping algorithm and sum-product decoding. In an effort to estimate the rate of GQ codes, we have used Proposition 2.20 and Table 2 to produce Table 3, which will be of use further on. Observe that for H(4, q²), the number of points and lines are

Number of points = (st + 1)(s + 1) = (q⁵ + 1)(q² + 1) ≈ q⁷,
Number of lines = (st + 1)(t + 1) = (q⁵ + 1)(q³ + 1) ≈ q⁸,

and for Q−(5, q) these are

Number of points = (st + 1)(s + 1) = (q³ + 1)(q + 1) ≈ q⁴,
Number of lines = (st + 1)(t + 1) = (q³ + 1)(q² + 1) ≈ q⁵.

This means that in both cases, the resulting GQ codes will have "long" parity-check matrices, which is desirable for constructing high-rate LDPC codes.

GQ          (s, t)     Points    Lines    1 − Points/Lines ≈ Rate

H(4, 4)     (4, 8)     165       297      0.44
H(4, 9)     (9, 27)    2440      6832     0.64
Q−(5, 2)    (2, 4)     27        45       0.4
Q−(5, 3)    (3, 9)     112       280      0.6
Q−(5, 4)    (4, 16)    325       1105     0.71
Q−(5, 5)    (5, 25)    756       3276     0.77

Table 3: Points and lines of generalized quadrangles of interest.

3. Experimental results
This section will contain results obtained with different ensembles of codes, with different benchmarks for comparison. Often, the benchmark will be a random code, by which we will mean a randomly constructed left c-regular code. The value of c will be specified in each case. These random codes have been generated with the function in Listing 1, which ensures the generated parity-check matrix will be full rank.
Some plots are cut off at a certain error rate to ensure that the displayed data is representative (a minimum of 100 erroneous frames has been used for all displayed error rates). For example, the plot in Fig. 16 cuts off at BER = 10⁻⁴ because achieving reliable results below this error rate would have needed more Monte-Carlo experiments. Other experiments may contain lower error rates.
We will see a curious result: not only is it easy to construct a code that outperforms commercially
used LDPC codes, but a random regular code will do! The trick that allows this to work is constructing
a random code with a much bigger block length than the commercial code. The random code also has a
much higher complexity than the commercial code, because the latter is quasi-cyclic.
Finally, the performance of GQ codes will be analyzed. We will see that their error-correcting capabilities are superior to those of random codes, but inferior to the best-known codes, constructed via the PEG algorithm. We will show empirical evidence that lifting GQ codes produces good codes, as expected.

3.1 Random codes


The first question that arises after our theoretical discussion is: how good are random codes, really? In this
section we will illustrate their performance with some examples. The first we will see is Fig. 13, where the
performance of a commercially used Wi-Fi code of the standard 802.11n is compared with the performance
of a random left 3-regular code of the same dimensions. We can see that the commercial code outperforms
the random one, and that both have great error-correcting capabilities at 1.5 dB and 2.3 dB above capacity,
respectively.
While the Wi-Fi code performs better than the random code, it is easy to construct a random code that outperforms its error-correcting capabilities; see the random (3, 6)-regular code proposed in Fig. 14, constructed with the function in Listing 2. The trick here is making the code longer, which is undesirable in many practical applications. For E_b/N_0 = 1.5 dB, the proposed regular random code has had no uncorrected frame errors in 10⁴ experiments. This is just 1.3 dB above capacity, outperforming the previous Wi-Fi code by at least 0.2 dB for all BER ≤ 10⁻⁴. It is known that random d-regular graphs are good expanders [BL06], which helps explain why random codes perform so well.


[Plot: error rate vs. E_b/N_0 (dB) for the N = 1944, K = 972 WiFi 802.11n code and a random left 3-regular code, with Shannon's capacity converse bound and the uncoded curve.]

Figure 13: Performance of a commercially used Wi-Fi code, and a random left 3-regular code.
[Plot: BER vs. E_b/N_0 (dB) for the N = 20000, K = 10000 (3, 6)-regular random code and the Wi-Fi code, with Shannon's capacity converse bound and the uncoded curve.]

Figure 14: Performance of a random (3, 6)-regular code of block length 20000, against the rate 1/2,
n = 1944 Wi-Fi code from Fig. 13.

3.2 GQ codes
In order to manipulate incidence structures, we have employed a software package aptly named GAP
(Groups, Algorithms and Programming). In particular, we have used the package FinInG – Finite Incidence
Geometry [Bam+18] to obtain the point-line incidence matrices of generalized quadrangles. The script can
be found in Listing 6.
These codes present better performance with the bit-flipping algorithm than randomly constructed codes; see, for example, Fig. 15 for the rate 3/8 code from Q(4, 3), and Fig. 16 for the rate 27/40 code corresponding to Q−(5, 3). We can observe that both perform markedly better than their random counterparts. The number of ones per column of the random matrix was chosen to be as close as possible to the number of ones per column of the GQ code. However, when that number was even, the random matrix was not full rank, and the next or previous number was chosen, whichever maximized performance.
[Plot: BER vs. crossover probability p for the Q(4, 3) code and a random H with 3 ones per column, both with bit-flipping decoding, and the uncoded curve.]

Figure 15: Performance of the Q(4, 3) code compared with a random code in a BSC.

The large girth of generalized quadrangles gives an intuition for the superiority of GQ codes over random codes under the bit-flipping algorithm, but these codes are superior with sum-product decoding as well. See, for example, Fig. 17, where the error-correcting performance of both a random and a Q−(5, 3) code is plotted. This seems to indicate that the advantage of GQ codes is independent of the chosen decoding algorithm.
In Fig. 17 we also observe that the Q−(5, 3) code has no predictable error floor, but the random code of the same block size and rate seems to end its waterfall region around 5 dB of E_b/N_0.
A necessary remark on GQ codes is that, although the first instinct when designing LDPC codes is to force the parity-check matrix to be full rank, these codes perform better when H is not full rank. Given a GQ whose point-line incidence matrix H has n columns and rank H = n − k, forcing the original H to be full rank would imply deleting random rows until we obtain a full-rank matrix H′, which we will call the reduced matrix.


[Plot: BER vs. crossover probability p for the Q−(5, 3) code and a random H with 5 ones per column, both with bit-flipping decoding, and the uncoded curve.]

Figure 16: Performance of the Q − (5, 3) code compared with a random code in a BSC, decoded with the
bit-flipping algorithm.
[Plot: BER vs. E_b/N_0 (dB) for the Q−(5, 3) code and a random code with 5 ones per column, both with sum-product decoding, and the uncoded curve.]

Figure 17: Performance of the Q − (5, 3) code compared with a random code in a BI-AWGN channel with
sum-product decoding.

However, we have empirically found that this reduction does not produce the best results, as Fig. 18 shows. In this case, Q(4, 3) produces a 40 × 40 matrix with rank H = 25, so the reduced matrix H′ is 25 × 40. Despite this being a toy example (n = 40, which is far too small to achieve any good performance), it illustrates that reducing a GQ code gives worse performance than not reducing it.

[Plot: BER vs. crossover probability p for the N = 40, K = 15 Q(4, 3) code, its reduced version, and the uncoded curve.]

Figure 18: Performance of a full GQ code and a reduced GQ code in a BSC.

Another observation is that the reduced code still performs better than the random code of Fig. 15. This can be explained by the high girth of a GQ code, which with high probability is larger than the girth of a random code.

3.2.1 Lifted GQ codes

In an effort to obtain near-Shannon GQ codes, we will study generalized quadrangles that produce longer frames. For each prime power q, we will look at the generalized quadrangles H(4, q²) and Q−(5, q), because they have more lines than points. Observe that Table 2 shows that studying H(3, q²) is equivalent to studying Q−(5, q), by just transposing the point-line incidence matrix. We will not study Q(4, q) because its parameters (s, t) = (q, q) mean that it will produce codes of rate approximately 1/2, and we want to obtain codes of higher rates.
In Table 3 we calculated the number of points and lines of our generalized quadrangles of interest. We can see that H(4, q²) has an s value that is quadratic in q, so the resulting GQ code would be non-sparse for all but very small values of q. For Q−(5, q), the value of s grows linearly, which is still undesirable, and the approximate rate also tends to 1 as q grows. For both of these reasons, in order to construct larger GQ codes, we will use GQ codes with small parameters, which have large girth, as base matrices, and lift them as explained in subsection 2.4 to obtain codes with larger block sizes.

[Plot: BER vs. E_b/N_0 (dB) for the Q−(5, 2) code and its 5-lifted and 20-lifted versions.]

Figure 19: Performance of three lifts of the Q − (5, 2) code, of rate approximately 0.4.

In Fig. 19 we have plotted the performance of three different lifts of the Q−(5, 2) code. Note that Q−(5, 2) has 45 lines and 27 points, which yields a code of rate approximately 0.4 (the rate may be higher if the check matrix is not full rank). We can observe the performance improvement as the lift size increases. This is remarkable, since the 20-lifted matrix is highly structured, yet has good performance. The gain in performance with lift size, while essentially maintaining the complexity of the base Q−(5, 2) code, helps explain why these types of lifts are so popular in practice, as we have seen in subsection 2.5.
Fig. 20 shows the performance of an n = 504, rate 1/2 code called PEGirReg252x504, which according to MacKay's LDPC encyclopedia [Mac05] is the best-known LDPC code for these specifications. This code is irregular and was constructed via the PEG algorithm, hence its name. It will serve as a benchmark for our 2-lifted H(4, 4) code, which has block size n = 594 and rate 0.52. Despite its slightly larger block length, it has a slightly higher rate, so their performances should be comparable. A left 3-regular random code is also compared. We can see that the lifted GQ code outperforms the random code in the range of practical operation of the codes (BER below 10⁻⁴), but falls short of the best-known code by 0.5 dB. No error floor is observable for any of the three codes. The GQ code has the advantage of being storable just by knowing the generalized quadrangle that produces it, and a pre-determined lifting function could generate different sizes of code. This makes lifted GQ codes attractive for low-storage-capacity applications, where the parity-check matrix can be computed via a known generalized quadrangle and lifting function.

[Plot: BER vs. E_b/N_0 (dB) for PEGirReg252x504, the 2-lifted H(4, 4) code, and a random left 3-regular code, with Shannon's converse bound for R = 1/2.]

Figure 20: Performance of a 2-lifted GQ code, a random left 3-regular code, and the best-known code of
these dimensions.


3.2.2 Line decoding

In this section, we will explore a novel decoding scheme that attempts to exploit the geometric nature of
GQ codes. This decoding scheme does not need floating point precision, unlike the sum-product algorithm,
so we will compare it to the standard bit-flipping algorithm.
We can think of adjacent lines in a generalized quadrangle as bits that share a parity-check equation. Observe that they can share only one equation, because in a generalized quadrangle there are no triangles. Every bit (or line) could be flipped in an iteration; the idea is to flip only the lines that we are confident will help us decode the received word. If flipping a line makes the syndrome decrease, it is an unreliable line. If most of the lines adjacent to this line are also unreliable, we may want to avoid flipping it until we have flipped lines that we are more certain produce errors. In this spirit, we propose the following decoding scheme:
consider a received word x
while the number of iterations < the maximum number of iterations:
    compute the syndrome of x
    if the syndrome is 0, output x
    flag the lines that would decrease the syndrome when flipped as to-flip
    for every bit (line) in x:
        if the majority of lines adjacent to this bit are flagged to-flip,
            flag the bit as unreliable
    if there is a reliable bit flagged to-flip, flip it
    else, flip an unreliable bit
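A minimal Matlab sketch of this scheme (our own; it assumes x is the received hard-decision word and H is the point-line incidence matrix of the generalized quadrangle):

function x = line_decode(H, x, max_iters)
% LINE_DECODE Sketch of the proposed line-decoding scheme for GQ codes.
% Bits correspond to lines of the generalized quadrangle; two lines are
% adjacent when they share a point, i.e., a parity-check equation.
adj = double(H' * H > 0);            % line adjacency via shared points
adj(1:size(adj, 1) + 1:end) = 0;     % a line is not adjacent to itself
degree = sum(adj, 1);
for iter = 1:max_iters
    s = mod(H * double(x(:)), 2);
    if ~any(s)
        return                       % zero syndrome: x is a codeword
    end
    % Flipping bit i decreases the syndrome weight exactly when the
    % majority of the checks on bit i are unsatisfied.
    unsat = sum(H(s > 0, :), 1);     % unsatisfied checks per bit
    sat = sum(H, 1) - unsat;         % satisfied checks per bit
    to_flip = unsat > sat;           % lines flagged to-flip
    % A bit is unreliable when the majority of its adjacent lines are
    % themselves flagged to-flip.
    unreliable = (double(to_flip) * adj) > degree / 2;
    candidate = find(to_flip & ~unreliable, 1); % prefer a reliable flip
    if isempty(candidate)
        candidate = find(to_flip, 1);           % else flip an unreliable bit
    end
    if isempty(candidate)
        return                       % stuck: no flip decreases the syndrome
    end
    x(candidate) = ~x(candidate);
end
end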

This idea of propagating which lines are reliable and unreliable, inherited from the sum-product algo-
rithm, seems promising to investigate. However, a majority vote on the reliability of adjacent lines may
not be the best strategy, as it has produced equivalent performance to standard bit-flipping decoding (see
Fig. 21).

[Plot: BER vs. crossover probability p for the Q(4, 3) code under bit-flipping decoding and line decoding, with the uncoded curve.]

Figure 21: Performance bit-flipping decoding and line decoding using the Q(4, 3) code on a BSC.


4. Summary and future work


In this thesis, we have detailed the main properties of LDPCs and their relationship to expander graphs. We have seen that there exist asymptotically good LDPCs which can be decoded in linear time. Multiple decoding techniques have been analyzed, of both theoretical and practical interest. Several industry applications of LDPCs have been outlined, and their performance has been empirically evaluated. We have also performed experiments showing that a sufficiently long random LDPC code can outperform a good LDPC code of shorter block length. We have proposed a new family of LDPCs based on generalized quadrangles, which outperform random codes and have the property of being fully characterized by the GQ that generates them. We have also seen that lifts of these GQ codes perform better than random codes, and retain the property of needing very little storage capacity. The structure of these codes could allow some of the following ideas to be fruitful:

Future work
• Investigate line decoding for generalized quadrangles further: the majority vote on unreliable adjacent lines may not be the best decoding strategy.

• In order to construct higher-rate LDPC codes, random columns could be added to GQ codes via the PEG algorithm.

• Use GQ and lifted GQ codes in storage-limited scenarios. The parity-check matrix of a GQ code can be obtained from just the name of the GQ, which avoids having to store a large check matrix.

• Investigate the lifting of GQ codes via a known low-storage lift function, to aid in low-storage applications.

• Create an encyclopedia of GQ codes, analogous to MacKay's [Mac05], containing their parity-check matrices and their performance in the BSC and BI-AWGN channel.

• Exploit the generalized quadrangle properties for efficient encoding. Encoding is a problem that has not been treated much in this work, but structured codes may be exploited for efficient encoding.

• Try to find the error floors of GQ codes with sum-product decoding analytically. While this is an open problem for general LDPC codes, the known structure of GQ codes may prove useful in this sub-problem.

References
[Alc+08] George Alcorn et al. Low Density Parity Check Code for Rate 7/8. Technical Standard 9100. Greenbelt, MD 20771, USA: NASA Goddard Space Flight Center, Mar. 2008. url: https://standards.nasa.gov/file/2593.
[And+07] Kenneth S. Andrews et al. "The Development of Turbo and LDPC Codes for Deep-Space Applications". In: Proceedings of the IEEE 95.11 (2007), pp. 2142–2156. doi: 10.1109/JPROC.2007.905132.
[Bae+19] Jung Hyun Bae et al. "An overview of channel coding for 5G NR cellular communications". In: APSIPA Transactions on Signal and Information Processing 8 (2019), pp. 1–14. doi: 10.1017/ATSIP.2019.10.
[Bal20] Simeon Ball. A Course in Algebraic Error-Correcting Codes. Compact Textbooks in Mathematics. Springer International Publishing, 2020. isbn: 9783030411534. doi: 10.1007/978-3-030-41153-4.
[Bam+18] John Bamberg et al. FinInG – Finite Incidence Geometry, Version 1.4.1. 2018. url: www.fining.org.
[BL06] Yonatan Bilu and Nathan Linial. "Lifts, discrepancy and nearly optimal spectral gap". In: Combinatorica 26.5 (2006), pp. 495–519. doi: 10.1007/s00493-006-0029-7.
[BW11] Simeon Ball and Zsuzsa Weiner. An introduction to finite geometry. 2011. url: https://web.mat.upc.edu/simeon.michael.ball/IFG.pdf.
[Cas+19] Adrien Cassagne et al. "AFF3CT: A fast forward error correction toolbox!" In: SoftwareX 10 (2019), p. 100345. doi: 10.1016/j.softx.2019.100345.
[CF07] Daniel J. Costello and G. David Forney. "Channel coding: The road to channel capacity". In: Proceedings of the IEEE 95.6 (2007), pp. 1150–1177. doi: 10.1109/JPROC.2007.895188. url: https://arxiv.org/pdf/cs/0611112.pdf.
[Chu+01] Sae-Young Chung et al. "On the design of low-density parity-check codes within 0.0045 dB of the Shannon limit". In: IEEE Communications Letters 5.2 (Feb. 2001), pp. 58–60. issn: 1558-2558. doi: 10.1109/4234.905935.
[Di+02] Changyan Di et al. "Finite-length analysis of low-density parity-check codes on the binary erasure channel". In: IEEE Transactions on Information Theory 48.6 (2002), pp. 1570–1579. doi: 10.1109/TIT.2002.1003839.
[DPS20] Erik Dahlman, Stefan Parkvall, and Johan Skold. 5G NR: The Next Generation Wireless Access Technology. Academic Press, 2020.
[ESL04] Mustafa Eroz, Feng-Wen Sun, and Lin-Nan Lee. "DVB-S2 low density parity check codes with near Shannon limit performance". In: International Journal of Satellite Communications and Networking 22.3 (2004), pp. 269–279. doi: 10.1002/sat.787.
[Gal63] Robert G. Gallager. Low Density Parity Check Codes. Monograph, 1963. doi: 10.7551/mitpress/4347.001.0001.
[HLW06] Shlomo Hoory, Nathan Linial, and Avi Wigderson. "Expander graphs and their applications". In: Bulletin of the American Mathematical Society 43.4 (2006), pp. 439–561. doi: 10.1090/S0273-0979-06-01126-8.
[Li+06] Zongwang Li et al. "Efficient encoding of quasi-cyclic low-density parity-check codes". In: IEEE Transactions on Communications 54.1 (2006), pp. 71–81. doi: 10.1109/TCOMM.2005.861667.
[Lub+01] M.G. Luby et al. "Improved low-density parity-check codes using irregular graphs". In: IEEE Transactions on Information Theory 47.2 (Feb. 2001), pp. 585–598. issn: 1557-9654. doi: 10.1109/18.910576.
[Mac05] David J.C. MacKay. Encyclopedia of sparse graph codes. 2005. url: http://www.inference.org.uk/mackay/codes/data.html.
[Orl+02] A. Orlitsky et al. "Stopping sets and the girth of Tanner graphs". In: Proceedings IEEE International Symposium on Information Theory. 2002, p. 2. doi: 10.1109/ISIT.2002.1023274.
[PLP99] Li Ping, W.K. Leung, and Nam Phamdo. "Low density parity check codes with semi-random parity check matrix". In: Electronics Letters 35.1 (Jan. 1999), pp. 38–39. issn: 0013-5194. doi: 10.1049/el:19990065.
[PT09] Stanley E. Payne and Joseph Adolf Thas. Finite generalized quadrangles. Vol. 9. European Mathematical Society, 2009. doi: 10.4171/066.
[RU01] Thomas J. Richardson and Rüdiger L. Urbanke. "Efficient encoding of low-density parity-check codes". In: IEEE Transactions on Information Theory 47.2 (2001), pp. 638–656. doi: 10.1109/18.910579.
[RU08] Tom Richardson and Ruediger Urbanke. Modern Coding Theory. Cambridge University Press, 2008. doi: 10.1017/CBO9780511791338.
[Sha48] C. E. Shannon. "A Mathematical Theory of Communication". In: Bell System Technical Journal 27.3 (1948), pp. 379–423. doi: 10.1002/j.1538-7305.1948.tb01338.x.
[Spi96] D.A. Spielman. "Linear-time encodable and decodable error-correcting codes". In: IEEE Transactions on Information Theory 42.6 (Nov. 1996), pp. 1723–1731. issn: 1557-9654. doi: 10.1109/18.556668.
[SS96] M. Sipser and D.A. Spielman. "Expander codes". In: IEEE Transactions on Information Theory 42.6 (Nov. 1996), pp. 1710–1722. issn: 1557-9654. doi: 10.1109/18.556667.
[Tan81] R. Tanner. "A recursive approach to low complexity codes". In: IEEE Transactions on Information Theory 27.5 (Sept. 1981), pp. 533–547. issn: 1557-9654. doi: 10.1109/TIT.1981.1056404.
[Tom+17] Martin Tomlinson et al. "LDPC Codes". In: Error-Correction Coding and Decoding: Bounds, Codes, Decoders, Analysis and Applications. Cham: Springer International Publishing, 2017, pp. 315–354. isbn: 978-3-319-51103-0. doi: 10.1007/978-3-319-51103-0_12.

A. Simulator and auxiliary functions
This annex contains the main code that was developed for the LDPC simulator and other auxiliary functions.
For the full body of code see https://github.com/TomasOrtega/LDPC.

A.1 Random matrix generators


The function that is used to generate random codes is the following:

Listing 1: Matlab function generate_random_H

function H = generate_random_H(N, K, ones_per_column)
% GENERATE_RANDOM_H Generates a left-regular H
% with a fixed number of ones per column, and full rank.
% If ones_per_column is even, this method will not finish: an even
% column weight makes the rows of H sum to zero modulo 2, so H can
% never be full rank.
assert(mod(ones_per_column, 2) == 1, 'ones_per_column is not odd!')

H = zeros(N - K, N);

while (gfrank(H, 2) < N - K)
    H = zeros(N - K, N);

    for i = 1:N
        H(randperm(N - K, ones_per_column), i) = 1;
    end

end

end

The function that is used to generate random (c, r )-regular codes is the following:

Listing 2: Matlab function generate_regular_H

function H = generate_regular_H(N, K, ones_per_column, ones_per_row)
% GENERATE_REGULAR_H Generates a regular H
% with a fixed number of ones per column and row.
% The edge counts must match: N * ones_per_column == (N - K) * ones_per_row.
assert(N * floor(ones_per_column) == (N - K) * floor(ones_per_row))
H = zeros(N - K, N);
ones_at_rows = zeros(N - K, 1);

for i = 1:N
    rows = 1:N - K;
    rows = rows(ones_at_rows < ones_per_row);
    % shuffle rows
    rows = rows(randperm(length(rows)));

    if (length(rows) < ones_per_column)
        % the greedy filling got stuck: restart from scratch
        H = generate_regular_H(N, K, ones_per_column, ones_per_row);
        return;
    end

    H(rows(1:ones_per_column), i) = 1;
    ones_at_rows(rows(1:ones_per_column)) = ones_at_rows(rows(1:ones_per_column)) + 1;
end

end

A.2 BSC simulator


The Matlab function that was used to evaluate the performance of LDPCs in a BSC is the following:

Listing 3: Matlab function BSC_error_rate

function [FER, BER] = BSC_error_rate(ps, K, H, G, permutation_vector, numFrames)
% BSC_ERROR_RATE Runs Monte-Carlo experiments to calculate the error
% rate of an LDPC code in a BSC channel.
% G must be a full-rank matrix in systematic form.
% H must be a parity-check matrix.

BER = zeros(1, length(ps));
FER = zeros(1, length(ps));
[~, N] = size(H);

for i = 1:length(ps)
    p = ps(i);
    ttlErrB = 0;
    ttlErrF = 0;

    for ii = 1:numFrames
        % Generate a random data word
        data = logical(randi([0 1], 1, K));
        % Encode the word to obtain the sent vector
        encData = mod(data * G, 2);
        % Generate error vector
        e = rand(1, N) < p;
        % Received vector
        r = mod(encData + e, 2);
        % Decoded received vector, v
        v = belief_propagation_mex(H, r, N);
        reorderedBits = logical(v(permutation_vector));
        numErr = biterr(data, reorderedBits(N - K + 1:N)); % get the data bits
        ttlErrB = ttlErrB + numErr;
        ttlErrF = ttlErrF + (numErr > 0);
    end

    BER(i) = ttlErrB / (numFrames * K);
    FER(i) = ttlErrF / numFrames;
end

end

The Matlab function that was used to perform the bit-flipping belief propagation algorithm is found in Listing 4. Note that this function is recursive, which surprisingly performed better than its iterative counterpart in Matlab. This function was also compiled into a MEX file to speed up the simulations.

Listing 4: Matlab function belief_propagation

function v = belief_propagation(H, v, N)
% BELIEF_PROPAGATION Performs belief propagation bit-flipping of v
% with the parity-check matrix H.
% Outputs a corrected vector v.
w_v = syndrome(H, v);

if w_v < 1
    return
end

% Compute syndrome weights of v + errors
ws = zeros(1, N);

for i = 1:N
    ve = v;
    ve(i) = ~v(i);
    ws(i) = syndrome(H, ve);
end

% If there is a better syndrome, propagate belief
[min_w, i] = min(ws);

if (min_w < w_v)
    v(i) = ~v(i);
    v = belief_propagation(H, v, N);
end

end
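The syndrome helper called above is not reproduced in this annex (see the linked repository for the full code); a minimal implementation consistent with its use here, returning the weight of the syndrome, would be:

function w = syndrome(H, v)
% SYNDROME Returns the Hamming weight of the syndrome of v.
w = sum(mod(H * double(v(:)), 2));
end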


A.3 BI-AWGN capacity


The derivation of the BI-AWGN channel capacity is rather long and will not be discussed here. Instead, we refer the reader to [RU08]. Unfortunately, it cannot be expressed in an elementary form, but it can be computed numerically as

\[
C_{BI\text{-}AWGN}(\sigma) = -\frac{1}{2}\log_2\left(2\pi e\sigma^2\right) - \int \phi(x)\log_2\phi(x)\,dx,
\quad\text{where}\quad
\phi(x) = \frac{1}{\sigma\sqrt{8\pi}}\left(e^{-\frac{(x+1)^2}{2\sigma^2}} + e^{-\frac{(x-1)^2}{2\sigma^2}}\right).
\]

The integral in the expression above produces numerical errors if it is evaluated with overly wide integration limits. The following Matlab function to calculate the BI-AWGN channel's capacity has adjusted limits to maintain stability and accuracy.
Listing 5: Matlab function BIAWGN_Capacity

function cap = BIAWGN_Capacity(sigma)
% BIAWGN_CAPACITY Returns the capacity of a BI-AWGN channel given sigma
% (noise standard deviation). Vector input admitted.
if (length(sigma) > 1)
    cap = zeros(1, length(sigma));

    for ii = 1:length(sigma)
        cap(ii) = BIAWGN_Capacity(sigma(ii));
    end

else
    phi = @(x) (1 ./ (sigma * sqrt(8 * pi))) .* (exp(-((x + 1).^2) ./ (2 * sigma.^2)) + exp(-((x - 1).^2) ./ (2 * sigma.^2)));
    f = @(x) phi(x) .* log2(phi(x));
    cap = -0.5 * log2(2 * pi * exp(1) * sigma^2) - integral(f, -20 * sigma, 20 * sigma);
end

end

A.4 GAP code


The code that was used to obtain the point-line incidence matrices of generalized quadrangles is the
following:
Listing 6: GAP script incidence_matrix_GQ.gap

LoadPackage("fining");
# Choose the GQ
# Reasonable bounds are: Q-(5, q), q <= 3
#                        H(4, q^2), q <= 3

q := 3;
GQ := HyperbolicQuadric(3, q);
# GQ := ParabolicQuadric(4, q);
# GQ := EllipticQuadric(5, q);
# GQ := HermitianPolarSpace(4, q * q);

lines_GQ := Lines(GQ);
points_GQ := Points(GQ);
incidences := [];
for line in lines_GQ do
    row := [];
    incident_to_line := Set(ShadowOfElement(GQ, line, 1));
    for point in points_GQ do
        if point in incident_to_line then
            Add(row, 1);
        else
            Add(row, 0);
        fi;
    od;
    Add(incidences, row);
od;

# Output incidence matrix in Matlab-ready format
output := OutputTextFile("out.txt", false);
SetPrintFormattingStatus(output, false);
PrintTo(output, "");
for row in incidences do
    for cell in row do
        AppendTo(output, cell);
        AppendTo(output, " ");
    od;
    AppendTo(output, "\n");
od;

