Turbo and LDPC Codes
Outline:
Turbo codes
LDPC codes
Binary Convolutional Codes
A convolutional encoder comprises:
k input streams
We assume k=1 throughout this tutorial.
n output streams
m delay elements arranged in a shift register.
Combinational logic (modulo-2 adders, i.e., XOR gates).
Each of the n outputs depends on some modulo-2 combination of the k current
inputs and the m previous inputs in storage
The constraint length is the maximum number of past and present
input bits that each output bit can depend on.
K = m + 1
[Figure: example rate-1/2 convolutional encoder with two delay elements (D), constraint length K = 3]
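To make the structure concrete, here is a minimal Python sketch of a rate-1/2, K = 3 feedforward convolutional encoder. The generator polynomials (7, 5) in octal are an assumption for illustration; the figure does not specify them.

```python
# Sketch of a rate-1/2, K = 3 feedforward convolutional encoder.
# Generators g0 = 7 (111) and g1 = 5 (101) octal are assumed for
# illustration; they are not specified by the figure.
def conv_encode(bits, gens=(0b111, 0b101), m=2):
    state = 0                                        # m-stage shift register
    out = []
    for b in bits:
        reg = (b << m) | state                       # current input + m past inputs
        for g in gens:                               # one output stream per generator
            out.append(bin(reg & g).count("1") % 2)  # modulo-2 sum of the taps
        state = (reg >> 1) & ((1 << m) - 1)          # advance the shift register
    return out

print(conv_encode([1, 0, 1, 1]))                     # [1, 1, 1, 0, 0, 0, 0, 1]
```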
Recursive Systematic
Convolutional (RSC) Codes
An RSC encoder is constructed from a standard convolutional encoder
by feeding back one of the outputs.
An RSC code is systematic.
The input bits appear directly in the output.
[Figure: RSC encoder with two delay elements; the input x_i appears directly as the systematic output, and r_i is the parity output]
Parallel Concatenated Codes
with Nonuniform Interleaving
A stronger code can be created by encoding in parallel.
A nonuniform interleaver scrambles the ordering of bits at the input
of the second encoder.
Uses a pseudo-random interleaving pattern.
It is very unlikely that both encoders produce low-weight codewords.
MUX increases code rate from 1/3 to 1/2.
Higher code rates Rc are obtained by transmitting only some of the parity bits (puncturing); see the sketch after the figure below.
[Figure: parallel concatenated encoder — input x_i feeds RSC #1 directly and RSC #2 through the nonuniform interleaver; a MUX combines the systematic output with the two parity outputs]
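The sketch below illustrates this structure. Both `rsc` (a constituent RSC parity function) and `perm` (the pseudo-random interleaving pattern) are hypothetical placeholders standing in for the blocks in the figure.

```python
# Sketch of parallel concatenation with puncturing (rate 1/3 -> 1/2).
# 'rsc' stands in for a constituent RSC parity function and 'perm' for
# the pseudo-random interleaving pattern; both are placeholders here.
def parallel_concat(x, rsc, perm):
    z1 = rsc(x)                        # parity from RSC #1
    z2 = rsc([x[i] for i in perm])     # parity from RSC #2 (interleaved input)
    out = []
    for k in range(len(x)):
        # MUX with puncturing: every systematic bit, alternating parity bits
        out += [x[k], z1[k] if k % 2 == 0 else z2[k]]
    return out                         # 2 output bits per input bit -> rate 1/2
```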
Random Coding Interpretation
of Turbo Codes
Random codes achieve the best performance.
Shannon showed that as n → ∞, random codes achieve channel capacity.
However, random codes are not feasible.
The code must contain enough structure so that decoding can be
realized with actual hardware.
Coding dilemma:
All codes are good, except those that we can think of.
With turbo codes:
The nonuniform interleaver adds apparent randomness to the
code.
Yet, they contain enough structure so that decoding is feasible.
Characteristics of Turbo Codes
Turbo codes have extraordinary performance at low SNR.
Very close to the Shannon limit.
Due to a low multiplicity of low weight code words.
However, turbo codes exhibit a BER floor.
This is due to their low minimum distance.
Performance improves for larger block sizes.
Larger block sizes mean more latency (delay).
However, larger block sizes are not more complex to decode.
The BER is lower for larger frame/interleaver sizes.
The complexity of a constraint length K_TC turbo code is roughly the same as that of a K = K_CC convolutional code, where:
K_CC ≈ 2 + K_TC + log2(number of decoder iterations)
Turbo Encoder
Data is segmented into blocks of L bits, where 40 ≤ L ≤ 5114.
[Figure: UMTS turbo encoder — input X_k feeds the upper RSC encoder directly and, through the interleaver (interleaved input X'_k), the lower RSC encoder; the output consists of the systematic bits X_k, the uninterleaved parity Z_k, and the interleaved parity Z'_k]
Interleaver:
Inserting Data into Matrix
Data is fed row-wise into an R by C matrix.
R = 5, 10, or 20.
8 ≤ C ≤ 256.
If L < RC, the matrix is padded with dummy bits.
X_1  X_2  X_3  X_4  X_5  X_6  X_7  X_8
X_9  X_10 X_11 X_12 X_13 X_14 X_15 X_16
X_17 X_18 X_19 X_20 X_21 X_22 X_23 X_24
X_25 X_26 X_27 X_28 X_29 X_30 X_31 X_32
X_33 X_34 X_35 X_36 X_37 X_38 X_39 X_40
Interleaver:
Intra-Row Permutations
Data is permuted within each row.
Permutation rules are rather complicated.
See spec for details.
X_2  X_6  X_5  X_7  X_3  X_4  X_1  X_8
X_10 X_12 X_11 X_15 X_13 X_14 X_9  X_16
X_18 X_22 X_21 X_23 X_19 X_20 X_17 X_24
X_26 X_28 X_27 X_31 X_29 X_30 X_25 X_32
X_40 X_36 X_35 X_39 X_37 X_38 X_33 X_34
Interleaver:
Inter-Row Permutations
Rows are permuted.
If R = 5 or 10, the matrix is reflected about the middle row.
For R=20 the rule is more complicated and depends on L.
See spec for R=20 case.
X_40 X_36 X_35 X_39 X_37 X_38 X_33 X_34
X_26 X_28 X_27 X_31 X_29 X_30 X_25 X_32
X_18 X_22 X_21 X_23 X_19 X_20 X_17 X_24
X_10 X_12 X_11 X_15 X_13 X_14 X_9  X_16
X_2  X_6  X_5  X_7  X_3  X_4  X_1  X_8
Interleaver:
Reading Data From Matrix
Data is read from matrix column-wise.
Thus:
X'_1 = X_40
X'_2 = X_26
X'_3 = X_18
…
X'_38 = X_24
X'_39 = X_16
X'_40 = X_8
X_40 X_36 X_35 X_39 X_37 X_38 X_33 X_34
X_26 X_28 X_27 X_31 X_29 X_30 X_25 X_32
X_18 X_22 X_21 X_23 X_19 X_20 X_17 X_24
X_10 X_12 X_11 X_15 X_13 X_14 X_9  X_16
X_2  X_6  X_5  X_7  X_3  X_4  X_1  X_8
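The whole L = 40 example above can be reproduced in a few lines of Python. The per-row patterns below are transcribed from the example matrices; the general intra-row rules come from the spec and depend on L.

```python
# Sketch of the UMTS-style block interleaver for the L = 40 example above.
import numpy as np

L, R, C = 40, 5, 8
X = np.arange(1, L + 1).reshape(R, C)         # fill row-wise: X_1 ... X_40

intra = [                                     # per-row permutation patterns,
    [1, 5, 4, 6, 2, 3, 0, 7],                 # transcribed from the example
    [1, 3, 2, 6, 4, 5, 0, 7],
    [1, 5, 4, 6, 2, 3, 0, 7],
    [1, 3, 2, 6, 4, 5, 0, 7],
    [7, 3, 2, 6, 4, 5, 0, 1],
]
X = np.array([row[p] for row, p in zip(X, intra)])  # intra-row permutations
X = X[::-1]                                   # inter-row: reflect rows (R = 5)
Xp = X.T.reshape(-1)                          # read out column-wise

print(Xp[0], Xp[1], Xp[2])                    # 40 26 18: X'_1=X_40, X'_2=X_26, X'_3=X_18
```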
UMTS Constituent RSC Encoder
Upper and lower encoders are identical:
Feedforward generator is 15 in octal.
Feedback generator is 13 in octal.
[Figure: constituent RSC encoder with three delay elements (D); parity output (both encoders) and systematic output (upper encoder only)]
Encoding Termination
After the L-th input bit, a 3-bit tail is calculated.
The tail bit equals the fed back bit.
This guarantees that the registers get filled with zeros.
Each encoder has its own tail.
The tail bits and their parity bits are transmitted at the end.
[Figure: tail generation — a switch feeds the feedback bit back to the encoder input, producing tail bits X_{L+1}, X_{L+2}, X_{L+3} and their parity bits Z_{L+1}, Z_{L+2}, Z_{L+3}]
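A minimal sketch of the constituent encoder with trellis termination, following the generators given above (feedback 13 octal = 1 + D^2 + D^3, feedforward 15 octal = 1 + D + D^3). Because each tail bit equals the fed-back bit, the register input is zero at every tail step, so three shifts flush the registers.

```python
# Sketch of the UMTS constituent RSC encoder with trellis termination.
def rsc_encode(x):
    s1 = s2 = s3 = 0                  # the three delay elements
    xs, zs = [], []                   # systematic and parity streams
    for b in x:
        fb = b ^ s2 ^ s3              # feedback bit (g0 taps at D^2, D^3)
        xs.append(b)
        zs.append(fb ^ s1 ^ s3)       # parity (g1 taps at 1, D, D^3)
        s1, s2, s3 = fb, s1, s2       # shift
    for _ in range(3):                # termination: tail bit = fed-back bit,
        tail = s2 ^ s3                # so the register input becomes zero
        xs.append(tail)
        zs.append(0 ^ s1 ^ s3)        # fb = tail ^ s2 ^ s3 = 0
        s1, s2, s3 = 0, s1, s2
    assert (s1, s2, s3) == (0, 0, 0)  # registers flushed with zeros
    return xs, zs
```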
Output Stream Format
The format of the output stream is:
X_1 Z_1 Z'_1 X_2 Z_2 Z'_2 … X_L Z_L Z'_L X_{L+1} Z_{L+1} X_{L+2} Z_{L+2} X_{L+3} Z_{L+3} X'_{L+1} Z'_{L+1} X'_{L+2} Z'_{L+2} X'_{L+3} Z'_{L+3}
First: the L data bits and their associated 2L parity bits (a total of 3L bits).
Next: the 3 tail bits of the upper encoder and their 3 parity bits.
Last: the 3 tail bits of the lower encoder and their 3 parity bits.
Total number of coded bits = 3L + 12
Code rate: r = L / (3L + 12) ≈ 1/3
Turbo Decoding Architecture
[Figure: turbo decoder — the received systematic bits r(X_k) and uninterleaved parity r(Z_k) feed the upper MAP decoder, the interleaved parity r(Z'_k) feeds the lower MAP decoder; extrinsic information is exchanged between the two decoders through an interleaver and a deinterleaver]
Initialization and timing:
Upper decoder executes first, then lower decoder.
Preface to LDPC Codes:
Review of Linear Block Codes
V_n = n-dimensional vector space over {0,1}
An (n, k) linear block code with dataword length k and codeword length n is a k-dimensional vector subspace of V_n
A codeword c is generated by the matrix multiplication c = uG, where
u is the k-bit long message and G is a k by n generator matrix
The parity check matrix H is an (n−k) by n matrix of ones and zeros, such that if c is a valid codeword then cH^T = 0
Each row of H specifies a parity check equation. The code bits in
positions where the row is one must sum (modulo-2) to zero
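A small worked example of these definitions, using a (7,4) Hamming code in systematic form (this particular G is chosen for illustration):

```python
# Sketch: encoding and parity checking for a small linear block code,
# a (7,4) Hamming code in systematic form G = [I | P], H = [P^T | I].
import numpy as np

G = np.array([[1, 0, 0, 0, 1, 0, 1],
              [0, 1, 0, 0, 1, 1, 0],
              [0, 0, 1, 0, 1, 1, 1],
              [0, 0, 0, 1, 0, 1, 1]])
P = G[:, 4:]
H = np.hstack([P.T, np.eye(3, dtype=int)])   # H = [P^T | I]

u = np.array([1, 0, 1, 1])                   # 4-bit message
c = u @ G % 2                                # codeword c = uG
print(c, c @ H.T % 2)                        # syndrome cH^T = [0 0 0]
```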
Low-Density Parity-Check Codes
Low-Density Parity-Check (LDPC) codes are a class of linear block
codes characterized by sparse parity check matrices H
H has a low-density of 1s
LDPC codes were originally invented by Robert Gallager in the early
1960s but were largely ignored until they were rediscovered in the
mid-1990s by MacKay
Sparseness of H can yield a large minimum distance d_min and reduces decoding complexity
Can perform within 0.0045 dB of Shannon limit
Decoding LDPC codes
Like Turbo codes, LDPC can be decoded iteratively
Instead of a trellis, the decoding takes place on a Tanner graph
Messages are exchanged between the v-nodes and c-nodes
Edges of the graph act as information pathways
Hard decision decoding
Bit-flipping algorithm
Soft decision decoding
Sum-product algorithm
Also known as the message-passing / belief-propagation algorithm
Min-sum algorithm
Reduced complexity approximation to the sum-product algorithm
In general, the per-iteration complexity of LDPC codes is less than it is
for turbo codes
However, many more iterations may be required (max ≈ 100, avg ≈ 30)
Thus, overall complexity can be higher than for turbo codes
Tanner Graphs
A Tanner graph is a bipartite graph that describes the parity check
matrix H
There are two classes of nodes:
Variable-nodes: Correspond to bits of the codeword or equivalently, to
columns of the parity check matrix
There are n v-nodes
Check-nodes: Correspond to parity check equations or equivalently, to
rows of the parity check matrix
There are m=n-k c-nodes
Bipartite means that nodes of the same type cannot be connected (e.g. a
c-node cannot be connected to another c-node)
The i-th check node is connected to the j-th variable node iff the (i,j)-th element of the parity check matrix is one, i.e., if h_ij = 1
All of the v-nodes connected to a particular c-node must sum (modulo-2)
to zero
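Building the graph from H is direct. The sketch below derives the adjacency lists, using the (7,4) Hamming H from the example that follows:

```python
# Sketch: derive Tanner-graph adjacency lists from a parity check matrix.
H = [[1, 0, 0, 1, 1, 0, 1],
     [0, 1, 0, 1, 0, 1, 1],
     [0, 0, 1, 0, 1, 1, 1]]

# c-node f_i connects to v-node v_j iff h_ij = 1
c_adj = [[j for j, h in enumerate(row) if h] for row in H]
v_adj = [[i for i in range(len(H)) if H[i][j]] for j in range(len(H[0]))]

print(c_adj)  # [[0, 3, 4, 6], [1, 3, 5, 6], [2, 4, 5, 6]]
print(v_adj)  # [[0], [1], [2], [0, 1], [0, 2], [1, 2], [0, 1, 2]]
```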
More on Tanner Graphs
A cycle of length l in a Tanner graph is a path of l distinct edges
which closes on itself
The girth of a Tanner graph is the minimum cycle length of the
graph.
The shortest possible cycle in a Tanner graph has length 4
[Figure: example Tanner graph with c-nodes f_0, f_1, f_2 and v-nodes v_0 through v_6]
Example: Tanner Graph
for (7,4) Hamming Code
H =
[ 1 0 0 1 1 0 1 ]
[ 0 1 0 1 0 1 1 ]
[ 0 0 1 0 1 1 1 ]

[Figure: Tanner graph — c-nodes f_0, f_1, f_2 connected to v-nodes v_0 through v_6 according to H]
Regular vs. Irregular LDPC codes
An LDPC code is regular if the rows and columns of H have uniform weight, i.e., all rows have the same number of ones (d_c) and all columns have the same number of ones (d_v)
The codes of Gallager were regular (or as close as possible)
Although regular codes had impressive performance, they are
still about 1 dB from capacity and generally perform worse than
turbo codes
An LDPC code is irregular if the rows and columns have
non-uniform weight
Irregular LDPC codes tend to outperform turbo codes for block lengths of about n > 10^5
Constructing Regular LDPC Codes:
MacKay, 1996
The idea is to randomly generate an M × N matrix H with weight-d_v columns and weight-d_c rows, subject to some constraints
Construction 1: The overlap between any two columns is no greater than 1
Construction 2: M/2 columns have d_v = 2, with no overlap between any pair of columns; the remaining columns have d_v = 3. As with Construction 1, the overlap between any two columns is no greater than 1
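As a rough illustration, the sketch below generates a regular H in the spirit of Gallager's original construction (random column permutations of a banded base submatrix). It does not enforce MacKay's overlap constraints; rejection or repair steps would be needed for that.

```python
# Sketch of a Gallager-style regular LDPC parity check matrix.
import numpy as np
rng = np.random.default_rng(0)

def gallager_H(N, dv, dc):
    """Stack dv submatrices, each a random column permutation of a banded
    base submatrix. Requires dc | N. Does not enforce overlap constraints."""
    M1 = N // dc                        # rows per submatrix
    base = np.zeros((M1, N), dtype=int)
    for i in range(M1):
        base[i, i*dc:(i+1)*dc] = 1      # dc consecutive ones per row
    subs = [base] + [base[:, rng.permutation(N)] for _ in range(dv - 1)]
    return np.vstack(subs)              # column weight dv, row weight dc

H = gallager_H(20, 3, 4)                # Gallager's classic (20, 3, 4) example
print(H.shape, H.sum(axis=0).min(), H.sum(axis=1).min())  # (15, 20) 3 4
```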
Constructing Irregular LDPC Codes:
Luby et al., 1998
Luby et al. developed LDPC codes based on irregular Tanner graphs
Message and check nodes have conflicting requirements
Message nodes benefit from having a large degree
LDPC codes perform better with check nodes having low degrees
Irregular LDPC codes help balance these competing requirements
High degree message nodes converge to the correct value quickly
This increases the quality of information passed to the check nodes,
which in turn helps the lower degree message nodes to converge
Check node degree kept as uniform as possible and variable node
degree is non-uniform
[Figure: example low-density parity check matrix H for n = 20, j = 4, k = 3]
Encoding LDPC Codes
A linear block code is encoded by performing the matrix multiplication c = uG
A common method for finding G from H is to first make the code systematic by adding rows and exchanging columns to get the H matrix in the form H = [P^T I]
Then G = [I P]
However, the result of the row reduction is a non-sparse P matrix
The multiplication c = [u uP] is therefore very complex
As an example, for a (10000, 5000) code, P is 5000 by 5000
Assuming the density of 1s in P is 0.5, then 0.5 · 5000^2 ≈ 1.25 × 10^7 additions are required per codeword
This is especially problematic since we are interested in large n (> 10^5)
An often-used approach is to use the all-zero codeword in simulations
Encoding LDPC Codes
Richardson and Urbanke show that even for large n, the encoding complexity can be an (almost) linear function of n
("Efficient encoding of low-density parity-check codes," IEEE Trans. Inf. Theory, Feb. 2001)
Using only row and column permutations, H is converted to an
approximately lower triangular matrix
Since only permutations are used, H is still sparse
The resulting encoding complexity is almost linear as a function of n
An alternative involving a sparse-matrix multiply followed by
differential encoding has been proposed by Ryan, Yang, & Li.
("Lowering the error-rate floors of moderate-length high-rate irregular LDPC codes," ISIT, 2003)
Encoding LDPC Codes
Let H = [H_1 H_2], where H_1 is sparse and H_2 is the dual-diagonal matrix shown below
Then a systematic code can be generated with G = [I  H_1^T H_2^{-T}]
It turns out that H_2^{-T} is the generator matrix for an accumulate code (differential encoder), and thus the encoder is simply a sparse multiplication by H_1^T followed by an accumulator (see the figure below)
Similar to Jin & McEliece's Irregular Repeat-Accumulate (IRA) codes
Thus termed Extended IRA Codes
H_2 =
[ 1         ]
[ 1 1       ]
[   1 1     ]
[     ⋱ ⋱   ]
[       1 1 ]

and

H_2^{-T} =
[ 1 1 1 ⋯ 1 ]
[   1 1 ⋯ 1 ]
[     1 ⋯ 1 ]
[       ⋱ ⋮ ]
[         1 ]
[Figure: Extended IRA encoder — u is multiplied by H_1^T and the result passes through an accumulator (differential encoder with one delay element D); the transmitted codeword is u followed by the parity u H_1^T H_2^{-T}]
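A sketch of this encoder, with a small arbitrary H_1 chosen for illustration: the multiplication by H_2^{-T} reduces to a running XOR (the accumulator).

```python
# Sketch of Extended IRA encoding: parity p = u H1^T H2^{-T}, where
# multiplication by H2^{-T} is an accumulator (running XOR).
import numpy as np

H1 = np.array([[1, 0, 1, 0],      # (n-k) x k sparse part of H = [H1 H2]
               [0, 1, 0, 1],      # (small arbitrary example matrix)
               [1, 1, 0, 0]])

u = np.array([1, 0, 1, 1])
s = H1 @ u % 2                    # sparse multiply: s = u H1^T
p = np.bitwise_xor.accumulate(s)  # accumulator: p_j = p_{j-1} ^ s_j
c = np.concatenate([u, p])        # systematic codeword [u | p]

H2 = np.eye(3, dtype=int) + np.eye(3, k=-1, dtype=int)  # dual-diagonal
H = np.hstack([H1, H2])
print(c, H @ c % 2)               # cH^T = [0 0 0]: c is a valid codeword
```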
For example, the (7,4) Hamming code is defined by the following parity check equations (the rows of the H matrix given earlier):
f_0: c_0 ⊕ c_3 ⊕ c_4 ⊕ c_6 = 0
f_1: c_1 ⊕ c_3 ⊕ c_5 ⊕ c_6 = 0
f_2: c_2 ⊕ c_4 ⊕ c_5 ⊕ c_6 = 0
Decoding: Bit-Flipping Algorithm
Example 1: (7,4) Hamming Code
[Figure: transmitted codeword c = (1, 0, 1, 1, 0, 0, 1); received word y = (1, 1, 1, 1, 0, 0, 1); check values f_0 = 1, f_1 = 1, f_2 = 0]
Bit-Flipping Algorithm: (7,4) Hamming Code
[Figure: the failed checks identify y_1 as the bit to flip]
Bit-Flipping Algorithm: (7,4) Hamming Code
[Figure: after flipping y_1 (now y_1 = 0), all parity checks are satisfied: f_0 = f_1 = f_2 = 0]
Decoding: Generalized Bit-Flipping Algorithm
Step 1: Compute the parity checks
If all checks are zero, stop decoding
Step 2: Flip any digit contained in T or more failed check equations
Step 3: Repeat steps 1 and 2 until all parity checks are zero or a maximum number of iterations is reached
The parameter T can be varied for faster convergence
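A minimal sketch of this decoder: with T = None it flips the bit(s) involved in the most failed checks (as in Example 1 above); with a fixed T it behaves as just described.

```python
# Sketch of hard-decision (generalized) bit-flipping decoding.
import numpy as np

def bit_flip_decode(H, y, max_iter=50, T=None):
    """If T is None, flip the bit(s) in the most failed checks;
    otherwise flip every bit in T or more failed check equations."""
    y = y.copy()
    for _ in range(max_iter):
        syndrome = H @ y % 2                   # one parity check per row of H
        if not syndrome.any():                 # all checks satisfied: done
            return y, True
        failed = H[syndrome == 1].sum(axis=0)  # failed-check count per bit
        thresh = failed.max() if T is None else T
        y[failed >= thresh] ^= 1               # flip the offending bits
    return y, False

# (7,4) Hamming H from the earlier example
H = np.array([[1, 0, 0, 1, 1, 0, 1],
              [0, 1, 0, 1, 0, 1, 1],
              [0, 0, 1, 0, 1, 1, 1]])
y = np.zeros(7, dtype=int); y[6] = 1           # single error on bit 6
print(bit_flip_decode(H, y))                   # recovers the all-zero codeword
```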
Example 2: Double error
Received word:              1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1
Iteration #1
Failed-check count per bit: 2 0 1 1 0 1 1 1 0 2 1 1 0 1 1 2
Flipping the bits in T = 2 or more failed checks gives:
                            0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0
Iteration #2
Failed-check count per bit: 1 0 0 0 0 1 1 0 0 2 0 1 0 1 0 1
Flipping the bits in T = 2 or more failed checks gives:
                            0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
All parity checks are now satisfied. Done
Generalized Bit Flipping: (15,7) BCH Code
[Figure: transmitted codeword c = all zeros; received word y has two errors, y_4 = 1 and y_14 = 1; failed checks f_0 = 1, f_4 = 1, f_7 = 1, all others zero]
Generalized Bit Flipping: (15,7) BCH Code
[Figure: after the first iteration the error at y_4 has been corrected; only y_14 = 1 remains and only f_7 = 1 fails]
Generalized Bit Flipping: (15,7) BCH Code
[Figure: after the second iteration all bits are zero and all parity checks are satisfied]
Halting Criteria
After each iteration, halt if ĉH^T = 0
This is effective, because the probability of an undetectable decoding error is negligible
Otherwise, halt once the maximum number of iterations is reached
If the Tanner graph contains no cycles, then Q_i converges to the true APP value as the number of iterations tends to infinity
Applications
In 2003, an LDPC code beat six turbo codes to become the error correcting code in the new DVB-S2 standard for the satellite transmission of digital television. DVB-S2 uses a concatenation of an outer BCH code and an inner LDPC code
In 2008, LDPC beat convolutional turbo codes as the FEC scheme for
the ITU-T G.hn standard.
G.hn chose LDPC over turbo codes because of its lower decoding
complexity (especially when operating at data rates close to 1 Gbit/s)
and because the proposed turbo codes exhibited a significant error
rate at the desired range of operation.
LDPC is also used for 10GBase-T Ethernet, which sends data at 10
gigabits per second over twisted-pair cables. As of 2009, LDPC codes
are also part of the Wi-Fi 802.11 standard as an optional part
of 802.11n, in the High Throughput (HT) PHY specification