Viterbi Algorithm in Continuous-Phase Frequency Shift Keying
Master of Technology
In
VLSI Design and Embedded System
By
L. Mallesh
Prof. G. Panda
CERTIFICATE
To the best of my knowledge, the matter embodied in the thesis has not been submitted to any other University/Institute for the award of any Degree or Diploma.
Prof. G. Panda
Date Dept. of Electronics and Communication Engg.
National Institute of Technology
Rourkela-769008
ACKNOWLEDGEMENTS
This project is by far the most significant accomplishment in my life and it would be
impossible without people who supported me and believed in me.
I want to thank all my teachers Prof. G.S. Rath, Prof. K. K. Mahapatra, Prof. S.K.
Patra and Prof. S.K. Meher for providing a solid background for my studies and research
thereafter. They have been great sources of inspiration to me and I thank them from the
bottom of my heart.
I would like to thank all my friends and especially my classmates for all the
thoughtful and mind stimulating discussions we had, which prompted us to think beyond the
obvious. I’ve enjoyed their companionship so much during my stay at NIT, Rourkela.
I would like to thank all those who made my stay in Rourkela an unforgettable and
rewarding experience.
Last but not least I would like to thank my parents, who taught me the value of hard
work by their own example. They rendered me enormous support during the whole tenure of
my stay in NIT Rourkela.
L. Mallesh
CONTENTS
Abstract iii
List of Figures iv
List of Tables vi
1 Introduction 1
2 Error control coding 3
2.1 Preliminaries 3
2.2 Advantages of coding 5
2.3 Principle of control coding 6
2.4 Coding techniques 9
3 Convolutional coding 14
3.1 Introduction 14
3.2 Encoder structure 15
3.3 Encoder representation 15
3.3.1 General representation 16
3.3.2 Tree diagram representation 16
3.3.3 State diagram representation 17
3.3.4 Trellis diagram representation 18
3.4 Hard decision and soft decision decoding 19
3.5 Hard decision Viterbi algorithm 19
3.6 Soft decision Viterbi algorithm 22
3.7 Performance Analysis of convolutional codes 23
3.7.1 Transfer function of convolutional code 23
3.7.2 Degree of Quantization 24
3.7.3 Decoding complexity for convolutional codes 24
4 Viterbi algorithm 25
4.1 Introduction 25
4.2 MAP and MLSE 26
4.3 The Viterbi Algorithm 27
4.4 Examples 29
4.5 Algorithm Extensions 34
4.6 Applications 35
4.6.1 Communication 35
4.6.2 Target Tracking 36
4.6.3 Recognition 39
4.7 Viterbi decoder 40
4.7.1 Implementation of Viterbi decoder 40
5 Pass Band Modulation 47
5.1 Introduction 47
5.2 FSK 49
5.3 PSK 52
5.4 DPSK 53
6 Viterbi Algorithm in CPFSK 55
6.1 Introduction 55
6.2 CPFSK 57
6.3 Performance Analysis 59
6.4 Implementation issues 60
7 Simulation Results 62
7.1 Convolutional Encoding and Viterbi decoding 62
7.2 Performance of Viterbi decoder for Convolutional codes 63
7.3 Performance of FSK 66
7.4 Convolutionally modulated and demodulated FSK 67
7.5 Modulated signal of CPFSK 68
7.6 Performance of Viterbi Algorithm in CPFSK 69
8 Conclusion 70
References 71
ABSTRACT
The Viterbi algorithm, an application of dynamic programming, is widely used for estimation
and detection problems in digital communications and signal processing. It is used to detect
signals in communication channels with memory, and to decode sequential error-control
codes that are used to enhance the performance of digital communication systems. The
Viterbi algorithm is also used in speech and character recognition tasks where the speech
signals or characters are modeled by hidden Markov models. This project explains the basics
of the Viterbi algorithm as applied to digital communication systems and to speech and
character recognition. It also examines the operations and the practical memory
requirements of implementing the Viterbi algorithm in real time.
A forward error correction technique known as convolutional coding with Viterbi decoding
was explored. In this project, a behavioral model of the basic Viterbi decoder was built and
simulated. The convolutional encoder, BPSK modulator, and AWGN channel were implemented in
MATLAB. The BER was measured to evaluate the decoding performance.
A further aim of this thesis is to implement an RTL-level model of the Viterbi decoder. The
RTL Viterbi decoder model comprises the branch metric block, the add-compare-select
block, the trace-back block, the decoding block, and the next-state block. Building this
model provides a deeper understanding of the Viterbi decoding algorithm.
LIST OF FIGURES
Fig 5.5 Non-coherent FSK detector and Demodulator 51
Fig 5.6 Independent FSK amplitude spectrum 51
Fig 5.7 Illustrating of the modulation of a bi-polar message signal to yield a PRK signal 52
Fig 5.8 Amplitude spectrum of PSK 53
Fig 5.9 coherent demodulator of PSK signal 53
Fig 6.1 HART signal path 55
Fig 6.2 HART character structure 56
Fig 6.3 Illustration of continuous phase FSK 57
LIST OF TABLES
1. INTRODUCTION
1.1 INTRODUCTION:
Convolutional coding has been used in communication systems including deep space
communications and wireless communications. It offers an alternative to block codes for
transmission over a noisy channel. An advantage of convolutional coding is that it can be
applied to a continuous data stream as well as to blocks of data. IS-95, a wireless digital
cellular standard for CDMA (code division multiple access), employs convolutional coding.
A third generation wireless cellular standard, under preparation, plans to adopt turbo coding,
which stems from convolutional coding.
1.3 THESIS CONTRIBUTION:
This section outlines the contributions of the study presented in this thesis. In this
project, a behavioral model of the basic Viterbi decoder was built and simulated. The
convolutional encoder, BPSK modulator, and AWGN channel were implemented in MATLAB. The BER
was measured to evaluate the decoding performance. The application of the Viterbi algorithm
to Continuous-Phase Frequency Shift Keying (CPFSK) is presented. Its performance is
analyzed and compared with that of the conventional coherent estimator, along with the
complexity of implementing the Viterbi decoder in hardware.
A further aim of this thesis is to implement an RTL-level model of the Viterbi decoder. The
RTL Viterbi decoder model comprises the branch metric block, the add-compare-select
block, the trace-back block, the decoding block, and the next-state block. Building this
model provides a deeper understanding of the Viterbi decoding algorithm.
2. ERROR CONTROL CODING
2.1 PRELIMINARIES:
In this section we define the terms that we will need to handle the later topics. Definitions are
tailored to suit this paper and may differ from those presented in the literature.
Digital communications systems are often partitioned as shown in Fig.1.1. The following
paragraphs describe the elements of Fig.1.1 and define other terms common to error-control
coding.
Fig 1.1 Block diagram of a digital communication system: encoder, modulator, channel (with additive noise), demodulator, decoder.
Encoder and Decoder - The encoder adds redundant bits to the sender's bit stream to create
a codeword. The decoder uses the redundant bits to detect and/or correct as many bit errors
as the particular error-control code will allow.
Modulator and Demodulator - The modulator transforms the output of the encoder, which is
digital, into a format suitable for the channel, which is usually analog (e.g., a telephone
channel). The demodulator attempts to recover the correct channel symbol in the presence of
noise. When the wrong symbol is selected, the decoder tries to correct any errors that result.
Communications Channel - The part of the communication system that introduces errors. The
channel can be radio, twisted wire pair, coaxial cable, fiber optic cable, magnetic tape,
Optical discs or any other noisy medium.
Bit-Error-Rate (BER) - The probability of bit error. This is often the figure of merit
for an error-control code. We want to keep this number small, typically less than 10^-4.
Bit-error rate is a useful indicator of system performance on an independent-error channel,
but it has little meaning on bursty or dependent-error channels.
Random Errors - Errors that occur independently. This type of error occurs on
channels that are impaired solely by thermal (Gaussian) noise. Independent-error
channels are also called memoryless channels because knowledge of previous channel
symbols adds nothing to our knowledge of the current channel symbol.
Burst Errors - Errors that are not independent. For example, channels with deep
fades experience errors that occur in bursts. Because the fades make consecutive bits more
likely to be in error, the errors are usually considered dependent rather than independent. In
contrast to independent-error channels, burst-error channels have memory.
Energy per Bit- The amount of energy contained in one information bit. This is not
a parameter that can be measured by a meter, but it can be derived from other known
parameters. Energy per bit (Eb) is important because almost all channel impairments can be
overcome by increasing the energy per bit. Energy per bit (in joules) is related to
transmitter power Pt (in watts) and bit rate R (in bits per second) in the following way:
Eb = Pt / R
If transmit power is fixed, the energy per bit can be increased by lowering the bit rate.
This is why lower bit rates are considered more robust. The required energy per bit to
maintain reliable communications can be decreased through error-control coding, as we shall
see in the next section.
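The relation Eb = Pt / R can be illustrated with a short calculation; the power and rate values below are illustrative, not taken from the thesis:

```python
def energy_per_bit(pt_watts, bit_rate):
    """Energy per bit (joules) from transmitter power and bit rate: Eb = Pt / R."""
    return pt_watts / bit_rate

# Illustrative values: a 10 W transmitter at 1 Mbit/s yields 10 microjoules per bit.
print(energy_per_bit(10.0, 1e6))  # 1e-05

# Halving the bit rate doubles the energy per bit, which is why
# lower bit rates are considered more robust.
print(energy_per_bit(10.0, 5e5))  # 2e-05
```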
Coding Gain - The difference (in dB) in the required signal-to-noise ratio to
maintain reliable communications after coding is employed. Signal-to-noise ratio is
usually given by Eb/N0, where N0 is the noise power spectral density measured in
watts/Hertz (joules). For example, if a communications system requires an Eb/N0 of 12 dB
to maintain a BER of 10^-5, but after coding it requires only 9 dB to maintain the same BER,
then the coding gain is 12 dB - 9 dB = 3 dB. (Recall that dB (decibels) = 10 log10 X,
where X is a ratio of powers or energies.)
Code Rate - Consider an encoder that takes k information bits and adds r redundant
bits (also called parity bits) for a total of n = k + r bits per codeword. The code rate is
the fraction k/n and the code is called an (n, k) error-control code. The added parity bits
are a burden (i.e. overhead) to the communications system, so the system designer often
chooses a code for its ability to achieve high coding gain with few parity bits.
Reduce the occurrence of undetected errors. This was one of the first uses of error-
control coding. Today's error detection codes are so effective that the occurrence of
undetected errors is, for all practical purposes, eliminated.
example, coding can achieve coding gains of over 35 dB.
Eliminate Interference. As the electromagnetic spectrum becomes more crowded
with man-made signals, error-control coding will mitigate the effects of unintentional
interference.
more than 7.8 dB, coding can't do it. We must resort to other measures, like increasing
transmitter power. In practice, the situation is worse because we have no practical code that
achieves Shannon's lower bound. A more realistic coding gain for this example is 3 dB rather
than 7.8 dB.
The block encoder takes a block of k bits and replaces it with an n-bit codeword (n
is bigger than k). For a binary code, there are 2^k possible codewords in the codebook.
The channel introduces errors and the received word can be any one of 2^n n-bit words, of
which only 2^k are valid codewords. The job of the decoder is to find the codeword that
is closest to the received n-bit word. How a practical decoder does this is beyond the scope
of this paper, but our examples will use a brute-force look-up table method.
Case 1: No errors. Let's assume that the encoder sends codeword C and the
channel introduces no errors. Then codeword C will also be received, the decoder will find it
in the look-up table, and decoding will be successful.
Case 2: Detectable error pattern. This time we send codeword C and the channel
introduces errors such that the n-bit word Y is received. Because Y is not a valid
codeword, the decoder will not find it in the table and will therefore flag the received n-
bit word as an errored codeword. The decoder does not necessarily know the number or
location of the errors, but that's acceptable because we only asked the decoder to detect
errors. Since the decoder properly detected an errored codeword, decoding is successful.
Case 3: Undetectable error pattern. We send codeword C for the third time and
this time the channel introduces the unlikely (but certainly possible) error pattern that
converts codeword C into codeword D. The decoder can't know that codeword C was sent
and must assume that codeword D was sent instead. Because codeword D is a valid
codeword, the decoder declares the received n-bit word error-free and passes the
corresponding information bits on to the user. This is an example of decoder failure.
Naturally, we want the decoder to fail rarely, so we choose codes that have a small
probability of undetected error. One of the most popular error detection codes is the
shortened Hamming code, also known as the cyclic redundancy check (CRC). Despite its
widespread use since the 1960s, the precise performance of CRCs was not known until
Fujiwara et al. [7] published their results in 1985.
There are three possible outcomes for an error correction decoder: correct decoding, decoding failure, and error detection without correction.
Case 1: Correct decoding. Assume that codeword C is sent and the n-bit word Y
is received. Because Y is inside C's sphere, the decoder will correct all errors and
error correction decoding will be successful.
Case 2: Decoding failure. This time we send codeword C and the channel gives us
n- bit word Z. The decoder has no way of knowing that codeword C was sent and must
decode to D since Z is in D's sphere. This is an example of error correction decoder failure.
Case 3: Error detection without correction. This case shows one way that an
error correction code can be used to also detect errors. We send codeword C and receive
n-bit word X. Since X is not inside any sphere, we won't try to correct it. We do,
however, recognize that it is an errored codeword and report this information to the user.
In the last example, we could try to correct n-bit word X to the nearest valid
codeword, even though X was not inside any codeword's sphere. A decoder that
attempts to correct all received n-bit words whether they are in a decoding sphere or
not is called a complete decoder. On the other hand, a decoder that attempts to correct only
n-bit words that lie inside a decoding sphere is called an incomplete or bounded distance
decoder. Bounded distance decoders are much more common than complete decoders. Now
that we understand the basics of encoding and decoding, let's investigate a simple error
correction code.
2.3.3 The Repetition Code
Consider a (5, 1) repetition code that repeats each bit four times. The encoder looks like
this:
0 → 00000
1 → 11111
The decoder takes 5 bits at a time and counts the number of 1's. If there are three or
more, the decoder selects 1 for the decoded bit. Otherwise, the decoder selects 0. The
minimum distance of this code is 5, so it can correct all patterns of two errors. To compute the
error performance of this code, consider a random error channel with probability of bit error,
p. After decoding, the probability of bit error is simply the probability of three or more
bit errors in a 5-bit codeword. This probability is computed for several values of p, with
the results listed in Table 2.1.
p         Post-decoding BER
10^-3     1.0 x 10^-8
10^-4     1.0 x 10^-11
10^-5     1.0 x 10^-14
Table 2.1 Post-decoding probability of bit error for the (5, 1) repetition code
The values listed in Table 2.1 show that even this simple code offers dramatic
improvements in error performance, but at the price of a 400% overhead burden.
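The entries of Table 2.1 can be reproduced with a short script; the majority-vote decoder fails exactly when three or more of the five transmitted bits are in error:

```python
from math import comb

def post_decoding_ber(p, n=5, t=2):
    """Probability of more than t bit errors in an n-bit repetition codeword,
    i.e. the probability that majority-vote decoding picks the wrong bit."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(t + 1, n + 1))

for p in (1e-3, 1e-4, 1e-5):
    print(f"p = {p:g} -> post-decoding BER ~ {post_decoding_ber(p):.2g}")
```

The dominant term is C(5,3) p^3 = 10 p^3, which is why each decade of improvement in p buys three decades of post-decoding BER.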
technique is called automatic repeat request, or ARQ. In terms of error performance, ARQ
outperforms forward error correction because codewords are always delivered error-free
(provided the error detection code doesn't fail). This advantage does not come free – we pay
for it with decreased throughput. The chief advantage of ARQ is that error detection requires
much simpler decoding equipment than error correction. ARQ is also adaptive since it only
re-transmits information when errors occur. On the other hand, ARQ schemes require a
feedback path which may not be available. They are also prone to duping by the enemy. A
pulse jammer can optimize its duty cycle to increase its chances of causing one or more errors
in each codeword. Ideally (from the jammer's point of view), the jammer forces the
communicator to retransmit the same codeword over and over, rendering the channel useless.
There are two types of ARQ: stop and wait ARQ and continuous ARQ.
Block Codes: The operation of binary block codes was described earlier in this paper.
All we need to add here is that not all block codes are binary. In fact, one of the
most popular block codes is the Reed-Solomon code which operates on m-bit symbols,
not bits. Because Reed-Solomon codes correct symbol errors rather than bit errors, they
are very effective at correcting burst errors. For example, a 2-symbol error correcting
Reed-Solomon code with 8-bit symbols can correct all bursts of length 16 bits or less.
Reed-Solomon codes are used in JTIDS, deep-space standards, and compact disc (CD) players.
Convolutional Codes: With convolutional codes, the incoming bit stream is applied
to a K-bit long shift register. For each shift of the shift register, b new bits are inserted
and n code bits are delivered, so the code rate is b/n. The power of a convolutional
code is a function of its constraint length, K. Large constraint length codes tend to be
more powerful. Unfortunately, with large constraint length comes greater decoder
complexity. There are several effective decoding algorithms for convolutional codes, but
the most popular is the Viterbi algorithm, discovered by Andrew Viterbi in 1967. Viterbi
decoders are now available on single integrated circuits (VLSI) from several
manufacturers. Viterbi decoders are impractical for long constraint length codes because
decoding complexity increases rapidly with constraint length. For long constraint length
codes (K > 9), a second decoding algorithm called sequential decoding is often used. A
third decoding technique, feedback decoding, is effective on burst-error channels, but is
inferior on random error channels. In general, convolutional codes provide higher
coding gain than block codes for the same level of encoder/decoder complexity.
One drawback of the codes we have looked at so far is that they all require
bandwidth expansion to accommodate the added parity bits if the user wishes to maintain
the original unencoded information rate. In 1976, Gottfried Ungerboeck discovered a class
of codes that integrates the encoding and modulation functions and does not require
bandwidth expansion. These codes are called Ungerboeck codes or trellis coded modulation
(TCM). Virtually every telephone-line modem on the market today operating above 9.6 kbits/s
uses TCM.
make more efficient use of the channel. At the receiver, the decoder first attempts to
correct any errors present in the received codeword. If it cannot correct all the errors, it
requests retransmission using one of the three ARQ techniques described above. Type I
hybrid ARQ sends all the necessary parity bits for error detection and error correction
with each codeword. Type II hybrid ARQ, on the other hand, sends only the error
detection parity bits and keeps the error correction parity bits in reserve. If the decoder
detects errors, the receiver requests the error correction parity bits and attempts to correct the
errors with these parity bits before requesting retransmission of the entire codeword.
Type II ARQ is very efficient on a channel characterized by a "good" state that
prevails most of the time and a "bad" state that occurs infrequently.
2.4.4. Interleaving
One of the most popular ways to correct burst errors is to take a code that works
well on random errors and interleave the bursts to "spread out" the errors so that they
appear random to the decoder. There are two types of interleavers commonly in
use today, block interleavers and convolutional interleavers.
The block interleaver is loaded row by row with L codewords, each of length n bits.
These L codewords are then transmitted column by column until the interleaver is emptied.
Then the interleaver is loaded again and the cycle repeats. At the receiver, the
codewords are deinterleaved before they are decoded. A burst of length L bits or less
will cause no more than 1 bit error in any one codeword. The random error decoder is much
more likely to correct this single error than the entire burst.
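The row-by-row fill and column-by-column read described above can be sketched as follows; the depth-3, length-4 sizes are illustrative:

```python
def interleave(bits, n, L):
    """Load L codewords of n bits row by row, transmit column by column."""
    assert len(bits) == n * L
    rows = [bits[i * n:(i + 1) * n] for i in range(L)]
    return [rows[r][c] for c in range(n) for r in range(L)]

def deinterleave(bits, n, L):
    """Inverse operation performed at the receiver before decoding."""
    assert len(bits) == n * L
    cols = [bits[c * L:(c + 1) * L] for c in range(n)]
    return [cols[c][r] for r in range(L) for c in range(n)]

data = list(range(12))            # 3 codewords of length 4, labeled 0..11
tx = interleave(data, n=4, L=3)
assert deinterleave(tx, n=4, L=3) == data

# A burst hitting the first 3 transmitted bits lands in 3 different codewords:
hit = sorted(tx[i] // 4 for i in (0, 1, 2))   # codeword index of each hit bit
print(hit)  # [0, 1, 2] -- one error per codeword
```

Because consecutive transmitted bits come from different rows, a burst of up to L channel errors is spread as at most one error per codeword, which the random-error decoder can then correct.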
The parameter L is called the interleaver degree, or interleaver depth. The interleaver
depth is chosen based on worst case channel conditions. It must be large enough so
that the interleaved code can handle the longest error bursts expected on the channel.
The main drawback of block interleavers is the delay introduced with each row-by-
row fill of the interleaver. Convolutional interleavers eliminate the problem except for the
delay associated with the initial fill. Convolutional interleavers also reduce memory
requirements over block interleavers by about one-half. The big disadvantage of either
type of interleaver is the interleaver delay introduced by this initial fill. The delay is a
function of the interleaver depth and the data rate and for some channels it can be several
seconds long. This long delay may be unacceptable for some applications. On voice circuits,
for example, interleaver delays confuse the unfamiliar listener by introducing long pauses
between speaker transitions. Even short delays of less than one second are sufficient to
disrupt normal conversation. Another disadvantage of interleavers is that a smart jammer
can choose the appropriate time to jam to cause maximum damage. This problem is
overcome by randomizing the order in which the interleaver is emptied.
3. CONVOLUTIONAL CODING
3.1 INTRODUCTION:
Over the years, there has been a tremendous growth in digital communications
especially in the fields of cellular/PCS, satellite, and computer communication. In these
communication systems, the information is represented as a sequence of binary bits. The
binary bits are then mapped (modulated) to analog signal waveforms and transmitted over a
communication channel. The communication channel introduces noise and interference to
corrupt the transmitted signal. At the receiver, the channel corrupted transmitted signal is
mapped back to binary bits. The received binary information is an estimate of the transmitted
binary information. Bit errors may result due to the transmission and the number of bit errors
depends on the amount of noise and interference in the communication channel.
Channel coding is often used in digital communication systems to protect the digital
information from noise and interference and reduce the number of bit errors. Channel coding
is mostly accomplished by selectively introducing redundant bits into the transmitted
information stream. These additional bits will allow detection and correction of bit errors in
the received data stream and provide more reliable information transmission. The cost of
using channel coding to protect the information is a reduction in data rate or an expansion in
bandwidth.
Convolutional codes are one of the most widely used channel codes in practical
communication systems. These codes were developed separately from the strong mathematical
structure of block codes and are primarily used for real-time error correction. Convolutional codes convert
the entire data stream into one single codeword. The encoded bits depend not only on the
current k input bits but also on past input bits. The main decoding strategy for convolutional
codes is based on the widely used Viterbi algorithm.
r = k / n (2.1)
where k is the number of parallel input information bits and n is the number of parallel output
encoded bits at one time interval. The constraint length K for a convolutional code is defined
as
K = m +1 (2.2)
where m is the maximum number of stages (memory size) in any shift register. The shift
registers store the state information of the convolutional encoder and the constraint length
relates the number of bits upon which the output depends. For the convolutional encoder
shown in Figure 3.1, the code rate r = 2/3, the maximum memory size m = 3, and the constraint
length K = 4.
A convolutional code can become very complicated with various code rates and
constraint lengths. As a result, a simple convolutional code will be used to describe the
code properties, as shown in Figure 3.2.
Fig 3.2 Convolution Encoder with k=1, n=2, r=1/2, m=2, K=3.
3.3 ENCODER REPRESENTATIONS:
The encoder can be represented in several different but equivalent ways. They are
1. Generator Representation
2. Tree Diagram Representation
3. State Diagram Representation
4. Trellis Diagram Representation
In the tree diagram, a solid line represents input information bit 0 and a dashed line
represents input information bit 1. The corresponding output encoded bits are shown on the
branches of the tree. An input information sequence defines a specific path through the tree
diagram from left to right. For example, the input information sequence x = {1011} produces
the output encoded sequence c={11, 10, 00, 01}. Each input information bit corresponds to
branching either upward (for input information bit 0) or downward (for input information bit 1)
at a tree node.
Fig3.3 Tree Diagram representation of encoder in Fig 3.2 for four input bit intervals
In the state diagram, the state information of the encoder is shown in the circles. Each
new input information bit causes a transition from one state to another. The path information
between the states is denoted as x/c, where x represents the input information bit and c
represents the corresponding output encoded bits.
Fig 3.5 The state transitions (path) for input information sequence{1011}.
For example, the input information sequence x={1011} leads to the state transition sequence
s={10, 01, 10, 11} and produces the output encoded sequence c={11,10,00,01}. Figure 3.5
shows the path taken through the state diagram for the given example
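The encoder of Fig 3.2 can be sketched in a few lines. This sketch assumes the standard rate-1/2, K=3 encoder with generator polynomials (7, 5) in octal, which reproduces the example mapping x = {1011} → c = {11, 10, 00, 01}:

```python
def conv_encode(bits, g1=(1, 1, 1), g2=(1, 0, 1)):
    """Rate-1/2, K=3 convolutional encoder.
    g1, g2 are taps on (current input, s1, s2): octal generators 7 and 5."""
    s1 = s2 = 0            # shift register contents, initial all-zero state
    out = []
    for u in bits:
        window = (u, s1, s2)
        out.append((sum(a * b for a, b in zip(g1, window)) % 2,
                    sum(a * b for a, b in zip(g2, window)) % 2))
        s1, s2 = u, s1     # shift the register
    return out

print(conv_encode([1, 0, 1, 1]))  # [(1, 1), (1, 0), (0, 0), (0, 1)]
```

Each output pair corresponds to one branch of the tree/state diagram, and the register pair (s1, s2) is exactly the state shown in the circles of the state diagram.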
3.3.4 Trellis Diagram Representation
The trellis diagram is basically a redrawing of the state diagram. It shows all possible
state transitions at each time step. Frequently, a legend accompanies the trellis diagram to
show the state transitions and the corresponding input and output bit mappings (x/c). This
compact representation is very helpful for decoding convolutional codes as discussed later.
Fig 3.6 Trellis diagram representation of encoder in Fig 3.2 for four input bit intervals
3.4 HARD-DECISION AND SOFT-DECISION DECODING:
Hard-decision and soft-decision decoding refer to the type of quantization used on the
received bits. Hard-decision decoding uses 1-bit quantization on the received channel values.
Soft-decision decoding uses multi-bit quantization on the received channel values.
For the ideal soft-decision decoding (infinite-bit quantization), the received channel
values are directly used in the channel decoder. Figure 3.7 shows hard- and soft- decision
decoding.
Fig 3.7 Hard-decision and soft-decision decoding. The BPSK-modulated signal passes through the noisy channel; the soft-decision path feeds the received value rin directly to the convolutional decoder, while the hard-decision path first quantizes it: rin ≤ 0 → rout = 0, rin > 0 → rout = 1.
[Fig 3.8 diagram: convolutional encoder (x → c), noisy channel (c → r), Viterbi decoder (r → y)]
Fig 3.8 Convolutional code system.
For a rate r convolutional code, the encoder inputs k bits in parallel and outputs n bits in
parallel at each time step. The input sequence is denoted as
x = (x0(1), x0(2), ..., x0(k), x1(1), ..., x1(k), ..., xL+m-1(1), ..., xL+m-1(k)) (2.3)
and the coded sequence is denoted as
c = (c0(1), c0(2), ..., c0(n), c1(1), ..., c1(n), ..., cL+m-1(1), ..., cL+m-1(n)) (2.4)
where L denotes the length of input information sequence and m denotes the maximum length
of the shift registers.
Additional m zero bits are required at the tail of the information sequence to take the
convolutional encoder back to the all-zero state. It is required that the encoder start and end at
the all-zero state. The subscript denotes the time index while the superscript denotes the bit
within a particular input k-bit or output n- bit block. The received and estimated sequences r
and y can be described similarly as
r = (r0(1), r0(2), ..., r0(n), r1(1), ..., r1(n), ..., rL+m-1(1), ..., rL+m-1(n)) (2.5)
and
y = (y0(1), y0(2), ..., y0(n), y1(1), ..., y1(n), ..., yL+m-1(1), ..., yL+m-1(n)). (2.6)
For ML decoding, the Viterbi algorithm selects y to maximize p(r|y). The channel is assumed
to be memoryless, and thus the noise process affecting a received bit is independent from the
noise process affecting all of the other received bits.
The Viterbi algorithm utilizes the trellis diagram to compute the path metrics. Each
state (node) in the trellis diagram is assigned a value, the partial path metric. The partial path
metric is accumulated from state s = 0 at time t = 0 to a particular state s = k at time t ≥ 0. At
each state, the “best” partial path metric is chosen from among the paths terminating at that
state. The “best” partial path metric may be either the larger or the smaller metric,
depending on how the branch metrics are defined.
The selected metric represents the survivor path and the remaining metrics represent
the nonsurvivor paths. The survivor paths are stored while the nonsurvivor paths are
discarded in the trellis diagram. The Viterbi algorithm selects the single survivor path left at
the end of the process as the ML path. Trace-back of the ML path on the trellis diagram
would then provide the ML decoded sequence. The hard-decision Viterbi algorithm (HDVA)
can be implemented as follows :
Sk,t is the state in the trellis diagram that corresponds to state Sk at time t. Every state in the
trellis is assigned a value denoted V(Sk,t).
1. (a) Initialize time t = 0.
(b) Initialize V(S0,0) = 0 and all other V(Sk,t) = +∞.
2. (a) Set time t = t+1.
(b) Compute the partial path metrics for all paths going to state Sk at time t.
3. (a) Set V(Sk,t) to the “best” partial path metric going to state Sk at time t. Conventionally,
the “best” partial path metric is the partial path metric with the smallest value.
(b) If there is a tie for the “best” partial path metric, then any one of the tied partial path
metrics may be chosen.
4. Store the “best” partial path metric and its associated survivor bit and state paths.
5. If t < L+m-1, return to Step 2.
The result of the Viterbi algorithm is a unique trellis path that corresponds to the ML
codeword.
Fig 3.9 The state Transition diagram (trellis legend) of the example convolutional encoder.
A simple HDVA decoding example is shown below. The convolutional encoder
used is shown in Figure 3.2. The input sequence is x={1010100}, where the last two bits
are used to return the encoder to the all-zero state. The coded sequence is c={11, 10, 00, 10,
00, 10, 11}. However, the received sequence r={10, 10, 00, 10, 00, 10, 11} has a bit error
(underlined). Figure 3.9 shows the state transition diagram (trellis legend) of the example
convolutional encoder.
From the decoding trellis diagram, the estimated code sequence is y={11, 10, 00,
10, 00, 10, 11}, which is the code sequence c. Utilizing the state transition diagram in
Figure 3.9, the estimated information sequence is x’={1010100}.
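The HDVA steps can be exercised end to end on this example. The sketch below assumes the encoder is the standard (7, 5)-octal, K=3 code; it corrects the single bit error in r and recovers x:

```python
def encode(bits):
    """Rate-1/2, K=3 encoder, generators (7, 5) octal (assumed for Fig 3.2)."""
    s1 = s2 = 0
    out = []
    for u in bits:
        out.append(((u + s1 + s2) % 2, (u + s2) % 2))
        s1, s2 = u, s1
    return out

def viterbi_hard(r):
    """Hard-decision Viterbi decoding over the 4-state trellis.
    States are (s1, s2); the branch metric is Hamming distance."""
    INF = float("inf")
    V = {(0, 0): 0, (0, 1): INF, (1, 0): INF, (1, 1): INF}   # step 1
    paths = {s: [] for s in V}
    for c1, c2 in r:                                          # steps 2-5
        newV = {s: INF for s in V}
        newpaths = {}
        for (s1, s2), metric in V.items():
            if metric == INF:
                continue
            for u in (0, 1):
                o1, o2 = (u + s1 + s2) % 2, (u + s2) % 2      # branch output
                nxt = (u, s1)
                m = metric + (o1 != c1) + (o2 != c2)          # partial path metric
                if m < newV[nxt]:                             # survivor selection
                    newV[nxt] = m
                    newpaths[nxt] = paths[(s1, s2)] + [u]
        V, paths = newV, newpaths
    return paths[(0, 0)]   # the encoder terminates in the all-zero state

x = [1, 0, 1, 0, 1, 0, 0]          # last two bits flush the encoder
c = encode(x)
r = list(c)
r[0] = (1, 0)                      # introduce one bit error: 11 -> 10
assert viterbi_hard(r) == x        # the single error is corrected
print("decoded:", viterbi_hard(r))
```

Since the free distance of this code is 5, any single channel error leaves the correct path at Hamming distance 1 while every competing path is at distance 4 or more, so the survivor at the all-zero state is the ML codeword.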
The soft-decision Viterbi algorithm (SDVA1) can be implemented as follows: Sk,t is the
state in the trellis diagram that corresponds to state Sk at time t. Every state in the trellis is
assigned a value denoted V(Sk,t).
1. (a) Initialize time t = 0.
(b) Initialize V(S0,0) = 0 and all other V(Sk,t) = +∞.
2. (a) Set time t = t+1.
(b) Compute the partial path metrics for all paths going to state Sk at time t.
3. (a) Set V(Sk,t) to the “best” partial path metric going to state Sk at time t.
Conventionally, the “best” partial path metric is the partial path metric with the smallest
value.
(b) If there is a tie for the “best” partial path metric, then any one of the tied
partial path metrics may be chosen.
4. Store the “best” partial path metric and its associated survivor bit and state paths.
5. If t < L+m-1, return to Step 2.
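A minimal soft-decision sketch, under illustrative assumptions (rate-1/2 generators 111 and 101 in binary, BPSK mapping bit 0 → +1.0 and bit 1 → −1.0): only the branch metric changes, from a Hamming distance to a squared Euclidean distance over the real-valued received samples, and the smallest accumulated metric still wins.

```python
def sdva(received, K=3, gens=(0b111, 0b101)):
    """Soft-decision Viterbi sketch: 'received' holds real-valued samples
    (assumed BPSK: bit 0 -> +1.0, bit 1 -> -1.0, plus channel noise)."""
    n = 1 << (K - 1)
    INF = float("inf")
    metric = [0.0] + [INF] * (n - 1)
    paths = [[] for _ in range(n)]
    for sym in received:
        new_metric, new_paths = [INF] * n, [None] * n
        for s in range(n):
            if metric[s] == INF:
                continue
            for b in (0, 1):
                reg = (b << (K - 1)) | s
                # expected BPSK samples for this branch
                exp = [1.0 - 2.0 * (bin(reg & g).count("1") % 2) for g in gens]
                bm = sum((e - r) ** 2 for e, r in zip(exp, sym))
                ns = reg >> 1
                if metric[s] + bm < new_metric[ns]:
                    new_metric[ns] = metric[s] + bm
                    new_paths[ns] = paths[s] + [b]
        metric, paths = new_metric, new_paths
    return paths[0]

# noisy BPSK observations of the codeword {11, 10, 00, 10, 00, 10, 11}
rx = [(-0.9, -1.1), (-1.2, -0.2), (0.8, 1.1), (-0.7, 0.9),
      (1.1, 0.2), (-0.4, 1.3), (-1.0, -0.8)]
print(sdva(rx))   # -> [1, 0, 1, 0, 1, 0, 0]
```

Note that the second sample of the second symbol has the wrong sign; the soft metric weighs it by its small magnitude, so the decoder is not misled.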
3.7 PERFORMANCE ANALYSIS OF CONVOLUTIONAL CODE:
The performance of convolutional codes can be quantified analytically or by
computer simulation. The analytical approach is based on the transfer
function of the convolutional code, which is obtained from the state diagram. The
process of obtaining the transfer function and other related performance measures is
described below.
Sb = NJD^2 Sa + NJ Sc
Sc = JD Sb + JD Sd
Sd = NJD Sb + NJD Sd
Se = JD^2 Sc
The transfer function is defined to be
T(D, N, J) = Se(D, N, J) / Sa(D, N, J)
By substituting and rearranging,
T(D, N, J) = NJ^3 D^5 / (1 − (NJ + NJ^2) D)
           = NJ^3 D^5 + (N^2 J^4 + N^2 J^5) D^6 + (N^3 J^5 + 2N^3 J^6 + N^3 J^7) D^7 + ........
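The series expansion of T(D, N, J) can be checked mechanically. The sketch below stores polynomials in D, N and J as dictionaries mapping exponent triples to integer coefficients, and expands 1/(1 − (NJ + NJ²)D) as a truncated geometric series:

```python
from collections import defaultdict

def pmul(a, b):
    """Multiply two polynomials stored as {(deg_D, deg_N, deg_J): coeff}."""
    out = defaultdict(int)
    for ea, ca in a.items():
        for eb, cb in b.items():
            out[tuple(x + y for x, y in zip(ea, eb))] += ca * cb
    return dict(out)

num = {(5, 1, 3): 1}                    # N J^3 D^5
x = {(1, 1, 1): 1, (1, 1, 2): 1}        # (NJ + NJ^2) D
series = {(0, 0, 0): 1}                 # 1/(1-x) ~ 1 + x + x^2 + x^3
term = {(0, 0, 0): 1}
for _ in range(3):
    term = pmul(term, x)
    for e, c in term.items():
        series[e] = series.get(e, 0) + c
T = pmul(num, series)
low_order = {e: c for e, c in sorted(T.items()) if e[0] <= 7}
print(low_order)   # terms up to D^7, matching the expansion above
```

The lowest power of D in the result, D^5, is the free distance of the code; the N and J exponents of each term count the information-bit errors and the branch lengths of the corresponding error events.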
4. VITERBI DECODING ALGORITHM
4.1 INTRODUCTION:
The Viterbi Algorithm (VA) was first proposed as a solution to the decoding of convolutional
codes by Andrew J. Viterbi in 1967, and the idea was further developed by the same
author. It was quickly shown by Omura that the VA could be interpreted as a dynamic
programming algorithm, and both Omura and Forney showed that the VA is a maximum
likelihood decoder. The VA is often looked upon as minimizing the error probability by
comparing the likelihoods of the set of possible state transitions that can occur, and deciding
which of these has the highest probability of occurrence. A similar algorithm, known as the
Stack Sequential Decoding Algorithm (SSDA), was described by Forney as an
alternative to the VA, requiring less hardware to implement. The SSDA has
been proposed as an alternative to the VA in such applications as target tracking and high-rate
convolutional decoding. It can be shown, though, that this algorithm is suboptimal compared with
the VA, in that it discards some of the paths that are kept by the VA.
Since its conception the VA has found a wide range of applications, in which it is usually an
optimum method, outperforming previous techniques. Its uses cover not just communications,
for which it was originally developed, but areas as diverse as handwritten word recognition
and nonlinear dynamic system state estimation.
This chapter is in effect a review of the VA. It describes the VA and how it works, with an
appropriate example of decoding corrupted convolutional codes. Extensions to the basic
algorithm are also described, together with some of the applications that the VA can be put to,
including uses in communications, recognition problems and target tracking. The area of
dynamic signature verification is identified as one requiring further research.
In this section the Viterbi Algorithm (VA) is defined and, with the help of an example, its use
is examined. Some extensions to the basic algorithm are also considered. The VA
can be viewed as a solution to the problem of estimating a finite sequence from a Markov process
observed through a memoryless noise channel, as illustrated in Figure 1:
Figure 1: A finite sequence M produced by a Markov process passes through a memoryless noise channel and is estimated as M’ by the Viterbi Algorithm.
Sequence detection with Viterbi decoding has been widely considered for the detection of
signals with memory. The algorithm was originally invented to decode convolutional codes, so
this introduction to the Viterbi algorithm is based mainly on the decoding process for
convolutional coding.
Several statistical criteria can be used in VA estimation, such as the Maximum A posteriori
Probability (MAP) and Maximum Likelihood Sequence Estimation (MLSE).
In MLSE, let
P0 (C10 − C00) = P1 (C01 − C11).
It yields the decision rule: choose H1 if P(y | H1) > P(y | H0), and choose H0 otherwise.
Figure 4.2: Showing a) trellis diagram spread over time and b) the corresponding state
diagram of the FSM.
For each of the possible transitions within a given FSM there is a corresponding output
symbol produced by the FSM. This data symbol does not have to be a binary digit; it could
instead represent a letter of the alphabet. The outputs of the FSM are viewed by the VA as a
set of observation symbols, with some of the original data symbols corrupted by some form of
noise. This noise is usually inherent to the observation channel along which the data symbols
from the FSM have been transmitted.
The trellis that the VA uses may correspond to the FSM exactly, i.e. the structure of the FSM is
available, as is the case in its use for convolutional code decoding. Another type of FSM is
the Hidden Markov Model (HMM). As the name suggests, the actual FSM is hidden from the
VA and has to be viewed through the observations produced by the HMM. In this case the
trellis's states and transitions are estimates of the underlying HMM. This type of model is
useful in applications such as target tracking and character recognition, where only estimates
of the true state of the system can be produced. In either type of model, MM or HMM, the
VA uses a set of metrics associated with the observation symbols and the transitions within
the FSM. These metrics are used to cost the various paths through the trellis, and the VA uses
them to decide which path is the most likely to have been followed, given the set of
observation symbols.
Before defining the VA, the following set of symbols has to be defined:
t - The discrete time index.
N - Total number of states in the FSM.
xn - The nth state of the FSM.
ot - The observation symbol at time t, which can be one of M different symbols.
spnt - The survivor path which terminates at time t, in the nth state of the FSM.
It consists of an ordered list of the xn's visited by this path from time t = 0 to time t.
T - Truncation length of the VA, i.e. the time when a decision has to be made by the VA as to
which spnt is the most likely.
πn - Initial state metric for the nth state at t = 0. Defined as the probability that the nth state is
the most likely starting state, i.e. Prob(xn at t = 0).
anm - The transition metric for the transition from state xm at time t - 1 to the state xn at time t.
Defined as the probability that, given that state xm occurs at time t - 1, the state xn will occur at
time t, i.e. Prob(xn at t | xm at t - 1).
bn - The observation metric at time t, for state xn. Defined as the probability that the
observation symbol ot would occur at time t, given that we are in the state xn at time t, i.e.
Prob(ot | xn at t).
γnt - The survivor path metric of spnt. This is defined as the product of the metrics (πn, anm and
bn) for each transition in the nth survivor path, from time t = 0 to time t.
The equations for the model metrics πn, anm and bn can be derived mathematically when
their properties result from a known application. If the metric properties are not known,
re-estimation algorithms, such as the Baum-Welch re-estimation algorithm, can be used to
obtain optimum probabilities for the model. It is also usual to take the natural logarithm of
the metrics, so that arithmetic underflow is prevented in the VA during calculations.
The VA can now be defined:
In English, the VA looks at each state at time t and, for all the transitions that lead into that
state, decides which of them was the most likely to occur, i.e. the transition with the
greatest metric. If two or more transitions are found to be maximum, i.e. their metrics are the
same, then one of the transitions is chosen randomly as the most likely transition. This
greatest metric is then assigned to the state's survivor path metric, γnt. The VA then discards
the other transitions into that state, and appends this state to the survivor path of the state at
t - 1, from which the transition originated. This then becomes the survivor path of the state
being examined at time t. The same operation is carried out on all the states at time t, at
which point the VA moves on to the states at t + 1 and carries out the same operations on the
states there. When time t = T (the truncation length) is reached, the VA determines the
survivor paths as before and also has to decide which of these survivor paths is
the most likely one. This is carried out by determining the survivor with the greatest metric;
again, if more than one survivor has the greatest metric, the most likely path is chosen
randomly. The VA then outputs this survivor path, spT, along with its survivor metric, γT.
4.4 EXAMPLE:
Now that the VA has been defined, the way in which it works can be examined using an
example communications application. The example chosen is the VA's use in
convolutional code decoding over a memoryless Binary Symmetric Channel (BSC). A
picture of the communications system that this example assumes is shown below in
Figure 2. It consists of encoding the input sequence, transmitting the sequence
over a transmission line (with possible noise), and optimally decoding the sequence by use
of the VA.
The input sequence, which we shall call I, is a sequence of binary digits to be
transmitted along the communications channel. The convolutional encoder consists of a shift
register, which shifts in a number of bits from I at a time, and then produces a set of
output bits based on logical operations carried out on the parts of I in the register memory. This
process is referred to as convolutional encoding. The encoder introduces redundancy
into the output code, producing more output bits than input bits shifted into its memory. As a
bit is shifted along the register it becomes part of other output symbols sent. Thus the present
output bit observed by the VA carries information about previous bits in I, so that if one
of these symbols becomes corrupted, the VA can still decode the original bits in I by
using information from the previous and subsequent observation symbols. A diagram of the
convolutional encoder used in this example is shown in Figure 3. It is assumed here that the
shift register shifts in only one bit at a time and outputs two bits, though other combinations
of input to output bits are possible.
Figure 3: The convolutional encoder used in the example: the input I is shifted through register stages S1, S2, S3, and the outputs O1 and O2 form the encoded output sequence.
This encoder can be represented by the FSM shown in Figure 4a. The boxes in this diagram
represent the shift register, and their contents are the state that the FSM is in. This state
corresponds to the actual contents of the shift register locations S2 followed by S1; i.e. if we
are in state 01, then the digit in S1 is 1 and the digit in S2 is 0. The lines with arrows
represent the possible transitions between the states. These transitions are labeled as x/y,
where x is a two-digit binary number representing the output symbol sent to the
communications channel for that particular transition, and y represents the binary digit from I
that, when shifted into the encoder, causes that particular transition to occur in the state
machine. The encoded sequence produced at the output of the encoder is transmitted along the
channel, where noise inherent to the channel can corrupt some of the bits, so that what was
transmitted as a 0 could be interpreted by the receiver as a 1, and vice versa.
These observed noisy symbols are then used, along with a trellis diagram of the known FSM,
to reconstruct the original data sequence sent. In our example the trellis diagram used by the
VA is shown in Figure 4b. This shows the states as nodes which are fixed as time
progresses. The possible transitions are shown as grey lines if they were caused by a 1
entering the encoder, and as black lines if they were caused by a 0 entering the encoder.
The corresponding outputs that should have been produced by the encoder are shown by the
two-bit binary digits next to the transition that caused them. As can be seen in Figure 4b, the
possible transitions and states remain fixed between differing time intervals. The trellis
diagram of Figure 4b can be simplified to show the recursive nature of the trellis, as shown
in Figure 4c.
It was shown by Viterbi that the log-likelihood function used to determine survivor
metrics can be reduced to a minimum distance measure, known as the Hamming distance.
The Hamming distance can be defined as the number of bits that differ between
the symbol that the VA observes and the symbol that the convolutional encoder
should have produced had it followed a particular input sequence. This measure defines the
combined measure of anm and bn for each transition in the trellis. The πn's are usually set
before decoding begins such that the normal start state of the encoder has πn = 0 and the
other states in the trellis have a πn whose value is as large as possible, preferably ∞. In this
example the start state of the encoder is always assumed to be state 00, so π0 = 0, and the
other πn's are set to 100.
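The reduction from log-likelihood to Hamming distance can be made concrete for a BSC with crossover probability p: log P(r | c) = d·log p + (L − d)·log(1 − p), which for p < 0.5 is strictly decreasing in the Hamming distance d, so maximising the likelihood is the same as minimising d. A small sketch (the bit sequences below are arbitrary examples):

```python
from math import log

def bsc_log_likelihood(received, codeword, p=0.1):
    """log P(r | c) over a BSC: each of the d differing bits contributes
    log p, each of the remaining bits contributes log(1 - p)."""
    d = sum(r != c for r, c in zip(received, codeword))
    return d * log(p) + (len(received) - d) * log(1.0 - p)

r = [0, 1, 1, 1, 0, 1]          # received bits
near = [0, 1, 1, 0, 0, 1]       # Hamming distance 1 from r
far = [1, 0, 1, 0, 0, 1]        # Hamming distance 3 from r
print(bsc_log_likelihood(r, near) > bsc_log_likelihood(r, far))   # True
```

This is why the decoder can work entirely with integer Hamming-distance metrics instead of probabilities.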
Figure 4.4. Showing the various aspects of the FSM in the example. a) shows the FSM
diagram, b) the trellis diagram for this FSM spread over time and c) shows the recursive
structure of this trellis diagram.
As an example, if it is assumed that an input sequence I of 0 1 1 0 0 0 is to be transmitted
across the BSC using the convolutional encoder described above, then the output obtained
from the encoder will be 00 11 01 01 11 00, as shown in Table 1. The output is termed the
Encoder Output Sequence (EOS). Table 1 also shows the corresponding contents of each
memory element of the shift register, where each element is assumed to be initialized to
zeros at the start of encoding. As the EOS is constructed by the encoder, the part of the EOS
already formed is transmitted across the channel. At the receiving end of the channel the
following noisy sequence of bits may be received: 01 11 01 00 11 00. As can be seen, there
are two bit errors in this sequence: the 00 at the beginning has changed to 01, and similarly
the fourth symbol has changed from 01 to 00. It is the job of the Viterbi Algorithm to find the
most likely set of states visited by the original FSM and thus determine the original input
sequence.

Table 1: Encoder register contents and outputs per input bit.
Input  S1  S2  S3  O1  O2
  0     0   0   0   0   0
  1     1   0   0   1   1
  1     1   1   0   0   1
  0     0   1   1   0   1
  0     0   0   1   1   1
  0     0   0   0   0   0
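Table 1 can be reproduced by stepping a three-stage shift register. The output taps below (O1 = S1 ⊕ S2 ⊕ S3, O2 = S1 ⊕ S3) are inferred from the table rows rather than taken from the original figure, so they should be read as an assumption, albeit one that matches every entry:

```python
def encode_steps(bits):
    """Step the 3-stage shift register S1, S2, S3 (input enters S1) and
    record (input, S1, S2, S3, O1, O2) per step.
    Output taps assumed: O1 = S1^S2^S3, O2 = S1^S3 (matches Table 1)."""
    s1 = s2 = s3 = 0
    rows = []
    for b in bits:
        s1, s2, s3 = b, s1, s2            # shift the new bit in
        o1, o2 = s1 ^ s2 ^ s3, s1 ^ s3
        rows.append((b, s1, s2, s3, o1, o2))
    return rows

for row in encode_steps([0, 1, 1, 0, 0, 0]):
    print(row)                            # each row: (input, S1, S2, S3, O1, O2)
```

The O1, O2 columns concatenated give the EOS 00 11 01 01 11 00 quoted in the text.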
The decoding and survivor stages are shown in Figures 5c and 5d for t = 2, and the same stages
for t = 6.
When the VA reaches time t = T, it has to decide which of the survivor paths is the most
likely one, i.e. the path with the smallest Hamming distance. In this example T is assumed to
be 6, so the path terminating at state a in Figure 5f, which has the minimum Hamming distance
of 2, is the most likely path taken. Next, the VA outputs the estimated
sequence of binary digits that it thinks were part of the original input sequence. The
estimated sequence is found to be 0 1 1 0 0 0, which corresponds directly to the original input
sequence. Thus the input sequence has been recovered despite the errors introduced during
transmission.
which compensates for insertions and deletions in the observation sequence. For each of the
insertion, deletion and change operations a metric is assigned, which also depends upon the
application to which the VA is being applied. The VA produces the most likely path through the
states of the FSM, estimates of the original data sequence, and the best sequence of
operations performed on the data to obtain the incorrect observation sequence. One
application of this form of the VA is in correcting programming code during compilation,
since many of the errors produced while writing code tend to be missed or inserted
characters. Another extension to the VA is a parallel version; possible
solutions have been suggested by Fettweis and Meyr, and by Lin et al. Parallel
versions of the VA have arisen from the need for fast hardware implementations of the VA.
4.6 APPLICATIONS:
This section looks at the applications to which the VA has been applied. The application of the
VA in communications is examined first, since this is what the algorithm was originally
developed for. The use of the VA in target tracking is then considered, and finally recognition
problems. In all the applications to which the VA has been applied since its conception, the
VA has been used to determine the most likely path through a trellis, as discussed
earlier. What makes the VA an interesting research area is that the metrics (πn, anm and bn) used
to determine the most likely sequence of states are application specific. It should be noted
that the VA is not limited to the areas of application mentioned in this section, though only
one other use is known to this author: the application of the VA in digital magnetic
recording systems. Though this section sums up the uses of the VA, there are probably a
number of other application areas that could be identified.
4.6.1 Communications.
A number of the uses of the VA in communications have already been covered in section 2.
These include the first proposed use of the VA in decoding convolutional codes. It was
shown by Forney that the VA could also be used to combat Intersymbol Interference (ISI). ISI
usually occurs in modulation systems where consecutive signals disperse and run into each
other, causing the filter at the demodulator to either miss a signal or wrongly detect one.
ISI can also be introduced by imperfections in the filter used. Forney suggested that the VA
could be used to estimate the most likely sequence of 0's and 1's that entered the modulator
at the transmitter end of the channel, given the sequence of ISI-affected observation symbols.
If a convolutional code was used for error control, then the output from this VA would be
passed through another VA to obtain the original input sequence into the transmitter. The VA
used in this situation is more commonly referred to as a Viterbi Equalizer, and has been used
in conjunction with Trellis-Coded Modulation (TCM) and on different channel types.
The advantage of using a VA in the equalizer is that the SOVA, described in section 2, can
be used to pass on more information to the decoder.
Another application of the VA in communications is the decoding of TCM codes. This
method of encoding was presented by Ungerboeck for applications where the redundancy
introduced by convolutional encoding could not be tolerated because of the reduction in data
transmission rate or limited bandwidth. The method combines the error-correcting abilities
of a trellis code with the redundancy which can be introduced into the modulation signal
itself via multiple amplitude levels or multiple phases. Instead of transmitting the extra
redundant bits produced by the convolutional encoder for error control, these bits are mapped
onto different amplitudes or phases in the modulation set, which ensures that the bandwidth of
the signal does not have to be increased. The VA is then used to decode the TCM codes and
produce the most likely set of bits, as in convolutional decoding. It should also be noted that
the decisions coming from the demodulator are in fact soft decisions, so a soft-decision VA
has to be used in the decoding. Much of the research carried out on the use of the VA in
communications has been directed at finding better-performing TCM codes and at
applying the VA as an equalizer in different communication environments. It was
therefore decided to look for another application to which the VA could be applied.
4.6.2 Target Tracking.
The use of the VA in the field of target tracking is investigated in this section. Work in this
application area has to date been carried out using Kalman filters for the tracking of targets.
The basic aim is to take observations from radar, sonar or some other form of detector and to
estimate the actual position of the target in relation to the detector. In an ideal world this
would be a simple task, since the detector should give us the true position of the target, but in
reality various problems arise which affect the readings from the detector. These usually
stem from noise, be it background noise, signal deterioration or imperfections in the detector.
There is also the matter of manoeuvring by the target, which typically results in the
modelling of a non-linear system. Another type of noise that can be introduced into the
signals used by the detector is random interference.
These problems were tackled by Demirbas, who used the VA to produce estimates
of a target's state, be it speed, position or acceleration. The method uses an FSM
to construct the trellis diagram, both the next states and the transitions between these states,
using a motion model. The motion model used is an approximation of non-linear motion.
This recursive model is used to produce the next set of possible states that the object could
enter, along with the transitions into these states. A model of the tracking system proposed by
Demirbas is shown in Figure 6.
Each state of the trellis represents an n-dimensional vector, for example a specific range,
bearing and elevation of a plane. Unlike the Kalman filter, this method does not use
a linear motion model, but a non-linear one. It is also notable that the number of states
produced at the next stage in the trellis can increase or decrease, unlike most other
applications of the VA, where the number of states is fixed throughout the trellis. The VA can
only consider a small number of the possible states that a target can move into, since in theory
there can be a very large or even infinite number of possible states. The anm and πn
metrics can be worked out while the trellis diagram is being constructed, and the bn metrics
depend on whether the target is being tracked in the presence of interference or in normal
background noise. The VA is then used to estimate the most likely path taken through this
trellis using the observation sequence produced by the detector. It was found that this method
is far superior to the Kalman filter at estimating non-linear motion, and comparable in
performance to the Kalman filter when estimating linear motion. Demirbas also considered
using the Stack Sequential Decoding Algorithm instead of the VA to
produce fast estimates, though this method is suboptimal to the one using the VA.
This model can be applied to any non-linear dynamic system, such as population growth or
economics, but Demirbas has applied this tracking method to tracking manoeuvrable targets
using a radar system. In these papers Demirbas adapts the above system so that, instead of a
state in a trellis consisting of a position estimate in all dimensions (range, bearing and
elevation in this case), the dimensions are split into separate components and each
component is estimated separately. So each of the dimensions has its own trellis for
estimation. The motion models for each of the trellises needed in this system are also
presented in these papers, and it is noted that the VA has to produce an estimate at each point
so that the trellis diagrams can be extended.
The model can be taken one step further by accounting for missing observations,
e.g. when a plane goes out of radar range momentarily. This was dealt with by Demirbas
using interpolating functions to determine the missing observations from the
observations received. This set of observations was then given to the VA to estimate the true
positions. Another adaptation uses the VA to estimate the position of a plane when
any single observation received depends upon a number of previous observations, as in ISI.
Another tracking method has been developed by Streit and Barrett, where the VA is used to
estimate the frequency of a signal using the Fast Fourier Transform of the received signal.
Unlike Demirbas, Streit uses an HMM tracker in which the trellis is fixed and constructed from
an observation model, not a motion model. The states of the HMM represent particular
frequencies. This method of tracking can be used to track signals of changing frequency, such
as those used by some radars to detect where a target is.
Throughout Demirbas's work and Streit's work, several improvements come to mind.
One is the adaptation of the VA so that it can deal with more than one target's position being
estimated at the same time, i.e. multiple target tracking. Another improvement, particularly in
Demirbas's work, is the use of an adaptive manoeuvring model. Demirbas assumes in all his
work that the manoeuvre is known to the tracker, but in reality this parameter would not be
known at all, so it would have to be estimated, as is done for the Kalman filter. This
manoeuvring parameter could be represented by a random variable, or it could be estimated by
a scheme similar to that used for the target's position.
Another problem not considered by Demirbas is adapting the noise models along with the
finite state model, though the tracking model given in Streit's work could be trained.
38
The problem of tracking multiple targets using the VA has been solved for the
tracking of time-varying frequency signals by Xie and Evans. They describe a multitrack VA
which is used to track two crossing targets. This system is an extension of the frequency line
tracking method mentioned above. Though this is not the same type of tracking as in
Demirbas's work, it should be a simple matter of altering the parameters of the model to fit
that type of tracking.
4.6.3 Recognition.
Another area where the VA can be applied is character and word recognition for
printed and handwritten words. This has many applications, such as postcode and address
recognition, document analysis, car licence plate recognition and even direct input into a
computer using a pen. Indeed, the idea of using the VA for optical character recognition (OCR)
was suggested by Forney. Kundu et al. used the VA to select the most likely sequence
of letters that form a handwritten English word. An advantage of this model is that if
a word produced by the VA is not in the system dictionary, the VA can produce the other,
less likely sequences of letters along with their metrics, so that a higher syntactic/semantic
model can determine the word produced. It can easily be seen that a similar method
would apply to determining the sequence of letters of a printed word, as in OCR. In
fact the VA can be used to recognize the individual characters or letters that make up a word;
this has been dealt with for Chinese character recognition, though a similar method could be
used to recognize English letters. The VA could be used at an even lower level of processing
in the recognition phase, where it could be applied to character segmentation, i.e.
determining which area of the paper is background and which contains part of a letter.
It has been noted that the use of the VA and HMMs in character recognition has
not been widely investigated, making this an interesting area for further research.
The VA has also been used in other areas of pattern recognition, where it has been used to
detect edges and carry out region segmentation of an image. Another recognition problem is
speech recognition, in which, unlike character and word recognition, the VA along with
HMMs has been widely used. The VA has also been used with neural networks to
recognize continuous speech.
Viterbi decoding involves a search through the trellis for the most likely sequence, i.e.
selecting the path with the maximum probability P(z|x), where z is the received sequence
and x is the signal to be estimated.
Figure: Viterbi decoding system model. A message M is convolutionally coded into X, passed through a memoryless noise or interference channel to give Z, and Viterbi decoding produces the estimate M’.
The Viterbi decoder receives successive code symbols, in which the boundaries of the symbols
and the frames have been identified.
The branch metric computation block compares the received code symbol with the expected
code symbol and counts the number of differing bits.
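For hard decisions this comparison is simply a Hamming distance over one code symbol, e.g.:

```python
def branch_metric(received_sym, expected_sym):
    """Count the differing bits between the received and expected symbols."""
    return sum(r != e for r, e in zip(received_sym, expected_sym))

print(branch_metric([1, 0], [1, 1]))   # -> 1
```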
Figure 4.7 A butterfly structure for a convolutional encoder with rate 1/n
Figure 4.9: The flow in a general Viterbi decoder.
The state metric update block selects the survivor path and updates the state metric. The
trellis diagram for a rate-1/n convolutional encoder consists of butterfly structures. This
structure contains a pair of origin and destination states and four interconnecting branches.
In Figure 4.9 the upper (lower) branch from each state i or j is taken when
the corresponding source input bit is ‘1’ (‘0’). If the source input bit is ‘1’ (‘0’), the next state
for both i and j is state p (q). The following relations in the figure are established for a (n,1,m)
convolutional encoder.
Notation
BMxk: branch metric from a state x under the source input k, where k ∈ {‘0’,’1’}
Figure 4.10 The relationships of the states and branch metrics in a butterfly
It is important to note that state p is even and state q is odd. This implies that an odd (even)
state is reached only if the source input bit is ‘0’ (‘1’). This property is utilized for the
traceback, which is explained later. Another important point is that it is possible
to trace back from a state at stage t to its previous state at stage t-1, provided it is known
whether the survivor branch of the state is the upper path or the lower path. If the survivor
branch of an odd state p at stage t is the upper (lower) path, the previous state at stage t-1 is
state i (j). Note that i is obtained as (p-1)/2 and j as 2^(m-1) + i = 2^(m-1) + (p-1)/2.
Similar results apply to an even state.
In summary, if we record whether the survivor path is the upper path or the lower path, we
can trace back from the final state to the initial state.
and the decoded output sequence is 101. This approach eliminates the need to trace back,
since the register of the final state contains the decoded output sequence. Hence the
approach may offer high-speed operation, but it is not power efficient due to the need to
copy all the registers in a stage to the next stage. We have investigated the power
efficiency of this approach.
Figure 4.11 ACS (Add-Compare-Select) module
The other approach, called traceback, records the survivor branch of each state. As explained
earlier, it is possible to trace back the survivor path provided the survivor branch of each
state is known. While following the survivor path, the decoded output bit is ‘0’ (‘1’)
whenever an even (odd) state is encountered. A flip-flop is assigned to each state to store the
survivor branch, and the flip-flop records ‘1’ (‘0’) if the survivor branch is the upper (lower)
path. Concatenation of the decoded output bits in reverse order of time forms the decoded
output sequence.
Figure: Register contents in the register-exchange approach over successive stages (S0: 0, 00, 000, 1100; S1: 1, 01, 101, 1101; S2: 10, 110, 1010; S3: 11, 111, 1011).
It is possible to form registers by collecting the flip-flops in the vertical direction or in the
horizontal direction, as shown in Figure 2.12. When a register is formed in the vertical
direction, the scheme is referred to as “selective update” in this thesis. When a register is
formed in the horizontal direction, it is referred to as “shift update”.
In selective update, the survivor path information is filled from the left register to the right
register as the time progresses. In contrast, survivor path information is applied to the least
significant bits of all the registers in “shift update”. Then all the registers perform a shift left
operation. Hence, each register in the shift update method fills in survivor path information
from the least significant bit toward the most significant bit. Figure 2.13 shows a selective
update in the traceback approach.
The shift update is more complicated than the selective update. The shift update is described
in the literature, and the selective update is proposed by us as an improvement on it. In
chapter 4, we show that the selective update is more efficient in power dissipation and
requires less area than the shift update. Because of the need to trace back, the traceback
approach is slower than the register-exchange approach.
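The two register organisations can be sketched side by side (a sketch only; the state count and window depth are illustrative, not from the thesis):

```python
def selective_update(mem, t, new_bits):
    """Selective update sketch: this step's survivor bits (one per state)
    are written into register t, so registers fill left to right as time
    progresses and earlier registers are never touched again."""
    mem[t] = list(new_bits)

def shift_update(regs, new_bits):
    """Shift update sketch: every state register shifts left one place each
    step and the new survivor bit enters at the least significant bit."""
    for s, bit in enumerate(new_bits):
        regs[s] = (regs[s] << 1) | bit
```

The power argument in the text follows from this difference: selective update writes one register per step, while shift update toggles every register every step.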
[Figure 2.13: Selective update in the traceback approach; registers R0, R1, …, Ri are filled with new survivor path information as time progresses.]
[Figure: survivor-bit contents for states S0–S3 at steps t = 0 to t = 4.]
5. PASS BAND MODULATION
In baseband signaling most of the power is localised near zero frequency and there is
negligible power at higher frequencies. PAM is an example of baseband signaling. These
signals are transmitted directly over a low-pass channel. Some pulse shaping is necessary to
minimize Inter Symbol Interference (ISI).
For passband signaling the signal power is concentrated in a band centred on a carrier
frequency, i.e. |f − fc| ≤ B. Only negligible signal power exists outside this range. The
carrier is sinusoidal. Systems are generally designed to minimize the probability of symbol
error in the presence of noise.
We will make the following assumptions about the transmission channel:
1. The channel is linear with bandwidth wide enough to accommodate the transmission of the
modulated signal with almost zero distortion.
2. The transmitted signal is perturbed by additive, zero-mean, stationary white noise.
3. The receiver is time synchronised with the transmitter i.e. the receiver knows the times
that the modulation changes state.
Sometimes the receiver is also phase-locked with the transmitter. This is called
coherent detection, and the receiver is a coherent receiver. If the phase of the incoming signal
is not known, detection is non-coherent and the receiver is called a non-coherent receiver.
The transmitted signal Si(t) is of finite energy:
E = ∫₀ᵀ Si²(t) dt
One such signal, e.g. S1(t) or S2(t), is transmitted every T seconds. The signal transmitted
depends upon the message.
[Figure 5.1: Digital communication system; a message source m feeds an encoder and modulator driven by the carrier wave, Si(t) passes through the communication channel, and the received x(t) is demodulated, detected, and decoded to give the estimate of m.]
The carrier wave is C(t) = A sin(2πft + φ).
Three quantities may be varied to transmit the message: the amplitude A, the frequency f and
the phase φ. Modulation methods based on varying these quantities to transmit digital data
are known as Amplitude Shift Keying (ASK), Frequency Shift Keying (FSK) and Phase Shift
Keying (PSK).
Figure 5.2: Basic digital modulation schemes.
for −T/2 ≤ t ≤ T/2 and f1, f2 << 1/T.
∆f is known as the frequency deviation and, for practical systems, varies in the range
rb/4 ≤ ∆f ≤ rb/2, where rb is the bit rate.
For arbitrary ∆f the FSK signal will have a step phase change at the transition between bits.
However, ∆f can be chosen so that the FSK signal has continuous phase across bit boundaries.
Continuous Phase Modulation (CPM) has the same bandwidth, but the power falls off faster in
adjacent channels, so reducing inter-channel interference. The CPM FSK signal is easier to
demodulate if the two coding signals S1 and S2 are orthogonal, i.e.
∫ from −T/2 to T/2 of S1(t) S2(t) dt = 0.
Continuous phase and orthogonality are imposed if f1 = m·rb/2, f2 = n·rb/2 and
∆f = (m − n)·rb/2.
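The orthogonality condition can be verified numerically (a sketch under assumptions: the bit period and the integers m, n below are illustrative, and the inner product is taken over one bit interval of length T):

```python
import math

# Illustrative parameters: rb = 1/T, f1 = m*rb/2, f2 = n*rb/2, m != n.
T = 0.001
rb = 1.0 / T
m, n = 2, 1
f1, f2 = m * rb / 2, n * rb / 2

# Riemann-sum approximation of the inner product of the two tones.
N = 200_000
dt = T / N
ip = sum(math.cos(2 * math.pi * f1 * k * dt) *
         math.cos(2 * math.pi * f2 * k * dt) for k in range(N)) * dt
# ip is numerically zero: the two tones are orthogonal over the bit.
```

Any pair of distinct integer multiples of rb/2 gives an inner product of zero over the bit interval, which is exactly why the text imposes f1 = m·rb/2 and f2 = n·rb/2.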
There are a number of ways of generating binary FSK signals:
1. Use two independent oscillators and switch between them.
2. Use one oscillator and multiply up to the two frequencies, switching between two
modulating oscillators.
3. Use one voltage-controlled oscillator (VCO) in a phase-locked loop (PLL) and switch
between frequencies.
Methods 2 and 3 produce continuous-phase signals, while method 1 produces sudden phase shifts.
A large number of methods exist for the demodulation of FSK signals. Figures 5.4 and 5.5
illustrate coherent and non-coherent detectors. The coherent demodulator is essentially two
parallel AM demodulators, where the output of the upper or lower path is only non-zero
when signal is present around the multiplier input signal frequency. The system relies upon
the orthogonality of the trigonometric basis functions. The comparator produces a full-scale
positive output when a signal is detected around f1 and a full-scale negative output when a
signal is detected around f2.
[Diagram: the FSK signal is multiplied by cos(2πf1t) and cos(2πf2t), each product is low-pass filtered, and the two branches are compared to recover the binary data.]
Figure 5.4: Coherent FSK detector and demodulator.
Figure 5.5: Non-coherent FSK detector. [Diagram: band-pass filters centred at f1 and f2 feed envelope detectors whose outputs are compared to recover the binary data.]
The non-coherent detector breaks the input FSK signal into two signals, each containing a
frequency band corresponding to f1 and f2. This is achieved by two band-pass filters (BPF)
with the pass bands centred on f1 and f2. The envelope detectors provide the non-coherent
detection.
Figure 5.6: Independent FSK amplitude spectrum for A) ∆f >> rb and B) ∆f < rb
If the FSK signal is phase continuous then the two ASK signals are not independent.
This leads to relations between the phases of the two spectra, and so the power spectra cannot
be directly added. When the frequency deviation is small the FSK spectrum can look
quite unlike two ASK spectra.
For wide-band modulation, i.e. ∆f >> rb, the bandwidth is approximately 2∆f. For
narrow-band modulation, i.e. ∆f < rb, the bandwidth is 2rb (the same as for OOK). For real
systems the aim is to minimise the bandwidth used. The minimum possible frequency
deviation for the two basis signals to be orthogonal and yield continuous phase is ∆f = rb/4.
This is known as Orthogonal FSK or Minimum Shift Keying. Orthogonal FSK is commonly
used in practice.
S(t) = Ac cos(2πfct + Dp m(t))
     = Ac cos(2πfct) cos(Dp m(t)) − Ac sin(2πfct) sin(Dp m(t))
where m(t) = ±1 is a bi-polar, base-band message signal made up of rectangular pulses.
Since m(t) is either +1 or −1, and sin and cos are odd and even respectively,
S(t) = Ac cos(2πfct) cos(Dp) − Ac sin(2πfct) m(t) sin(Dp)
The first term is a scaled carrier wave where the scaling depends upon the phase
deviation Dp. The second term contains the data (message) and is also scaled by Dp. In order
to maximise the signalling efficiency, i.e. minimise the probability of error, the data term
needs to be maximised. The amplitude of the data term is maximised when sin(Dp) = 1, i.e.
Dp = π/2. In this case cos(Dp) = 0 and the carrier term has zero amplitude. This is a special
case of BPSK and is known as Phase Reversal Keying (PRK).
PRK generation is very similar to ASK (OOK) generation. The only difference is that the
message signal is now bi-polar.
Figure 5.7: Illustration of the modulation of a bi-polar message signal m(t) by a carrier A cos(2πft) to yield a PRK signal S(t).
The spectrum of the PRK signal is also the same as for ASK (OOK) except that the
discrete carrier delta function is no longer present (as the mean signal is now zero).
DPSK is a partially coherent technique to detect PRK. In DPSK the information is
encoded in the difference between consecutive bits.
6. VITERBI ALGORITHM IN CP-FSK
6.1 INTRODUCTION:
The HART signal path from the processor in a sending device to the processor in a receiving
device is shown in figure 6.1. Amplifiers, filters, etc. have been omitted for simplicity. At
this level the diagram is the same, regardless of whether a Master or a Slave is transmitting.
Notice that, if the signal starts out as a current, the "Network" converts it to a voltage; but if
it starts out as a voltage, it stays a voltage.
[Figure 6.1: HART signal path; the sending processor's byte stream passes through its UART as a serial 11-bit character stream to the sending modem's FSK modulator, and the receiving modem's demodulator and UART reverse the process.]
The transmitting device begins by turning ON its carrier and loading the first byte to
be transmitted into its UART. It waits for the byte to be transmitted and then loads the next
one. This is repeated until all the bytes of the message are exhausted. The transmitter then
waits for the last byte to be serialized and finally turns off its carrier. With minor exceptions,
the transmitting device does not allow a gap to occur in the serial stream.
The UART converts each transmitted byte into an 11-bit serial character, as in figure
6.2. The original byte becomes the part labeled "Data Byte (8 bits)". The start and stop bits
are used for synchronization. The parity bit is part of the HART error detection. These 3
added bits contribute to "overhead" in HART communication.
Figure 6.2 HART Character Structure
The serial character stream is applied to the Modulator of the sending modem. The
Modulator operates such that a logic 1 applied to the input produces a 1200 Hz periodic
signal at the Modulator output. Logic 0 produces 2200 Hz. The type of modulation used is
called Continuous Phase Frequency Shift Keying (CPFSK). "Continuous Phase" means that
there is no discontinuity in the Modulator output when the frequency changes. A magnified
view of what happens is illustrated in figure 6.3 for the stop-bit to start-bit transition. When
the UART output (modulator input) switches from logic 1 to logic 0, the frequency changes
from 1200 Hz to 2200 Hz with just a change in slope of the transmitted waveform. A
moment's thought reveals that the phase doesn't jump through this transition. Given the
chosen shift frequencies and the bit rate, a transition can occur at any phase.
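This phase-continuous behaviour can be reproduced with a running phase accumulator (a sketch with HART-style mark/space frequencies; the sample rate is an assumption, not from the text):

```python
import math

def cpfsk(bits, fs=44_100, rb=1200, f_mark=1200, f_space=2200):
    """CPFSK sketch: logic 1 -> 1200 Hz, logic 0 -> 2200 Hz.  A running
    phase accumulator carries the phase across bit boundaries, so a
    frequency switch changes only the slope of the waveform, never the
    phase itself."""
    phase, out = 0.0, []
    spb = fs // rb                          # samples per bit
    for b in bits:
        f = f_mark if b == 1 else f_space
        for _ in range(spb):
            out.append(math.sin(phase))
            phase += 2 * math.pi * f / fs   # continuous phase advance
    return out

w = cpfsk([1, 0, 1])
```

Neighbouring samples never differ by more than the largest per-sample phase step, even across the 1200 Hz to 2200 Hz transitions, which is the "no discontinuity" property the text describes.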
At the receiving end, the demodulator section of a modem converts FSK back into a
serial bit stream at 1200 bps. Each 11-bit character is converted back into an 8-bit byte and
parity is checked. The receiving processor reads the incoming UART bytes and checks parity
for each one until there are no more or until parsing of the data stream indicates that this is
the last byte of the message. The receiving processor accepts the incoming message only if
its amplitude is high enough to cause carrier detect to be asserted. In some cases the
receiving processor will have to test an I/O line to make this determination. In others the
carrier detect signal gates the receive data so that nothing (no transitions) reaches the
receiving UART unless carrier detect is asserted.
6.2 CONTINUOUS PHASE FSK:
V = V0 sin[θ0 + θ(t)]
where
V = signal voltage,
t = time,
V0 = amplitude,
and θ(t) is given by
θ(t) = ∫₀ᵗ [2π(1700 Hz) + 2π(500 Hz) Σn Bn(τ − nT)] dτ
where Bn(t) is a pulse that exists for 0 < t < T and has a value of +1 or −1, according to
whether the nth bit is a 0 or a 1. T is one bit time. If phase is plotted versus time, it is a
steadily increasing value with two possible slopes. Based on the setting in the above
section, the VA can be implemented in Binary CPFSK (BCPFSK). MAP will be used to
select the shortest path in the trellis diagram.
It is known that the probability of bit error (Pe) in this model is estimated as
Pe = Q(2/(2σ)). Assuming that the a priori probabilities of the binary signal are the same
(i.e. P(1) = P(0) = 0.5), it is easy to show that the probability of shifting from one state to
another is the same.
Two survivor-path metrics at the kth step are denoted Γk1 and Γk2. The Viterbi
recursion can be written as the following three stages:
1. Initialize: k = 0, Γk1 = Γk2 = 0, and assign the memory space for the state sequence X(k),
where X(k) is a 2 × Ke array and Ke is the length of the sequence.
2. For step k:
if (y0k, y1k) = (1, 0):
Γk1 = min{[Γk−1,1 − ln(1 − P) − ln(1 − Pe)], [Γk−1,2 − ln(Q) − ln(Pe)]}
Γk2 = min{[Γk−1,1 − ln(P) − ln(Pe)], [Γk−1,2 − ln(1 − Q) − ln(Pe)]}
if (y0k, y1k) = (0, 1):
Γk1 = min{[Γk−1,1 − ln(1 − P) − ln(Pe)], [Γk−1,2 − ln(Q) − ln(Pe)]}
Γk2 = min{[Γk−1,1 − ln(P) − ln(1 − Pe)], [Γk−1,2 − ln(1 − Q) − ln(Pe)]}
if (y0k, y1k) = (0, −1):
Γk1 = min{[Γk−1,1 − ln(1 − P) − ln(Pe)], [Γk−1,2 − ln(Q) − ln(1 − Pe)]}
Γk2 = min{[Γk−1,1 − ln(P) − ln(Pe)], [Γk−1,2 − ln(1 − Q) − ln(Pe)]}
if (y0k, y1k) = (−1, 0):
Γk1 = min{[Γk−1,1 − ln(1 − P) − ln(Pe)], [Γk−1,2 − ln(Q) − ln(Pe)]}
Γk2 = min{[Γk−1,1 − ln(P) − ln(Pe)], [Γk−1,2 − ln(1 − Q) − ln(1 − Pe)]}
3. The metrics Γk1, Γk2 and the state sequence X(k) are stored, and stage 2 is executed for
step k+1. With the finite state sequences X(k), the shortest complete path is obtained and the
corresponding signal sequence can be estimated.
6.3 PERFORMANCE ANALYSIS:
It is known that the probability of any error event starting at time k may be upper-bounded
or lower-bounded. In the model of CPFSK, the bound can be calculated as follows. The
two modulated signals are assumed to be:
s0(t) = √(2Eb) cos(w(0)t + θ1)
s1(t) = √(2Eb) cos(w(1)t + θ2)
From step k, error events can happen at step k, k+1, k+2, k+3, … These can be seen in
Figure 7 (e.g. the correct path should be 0-0-0-0, but due to error events in the middle,
the path could be 0- - -0).
The probability can be obtained from the Euclidean distance between the incorrect
sequence and the correct sequence. For the sequence of step k+1, the Euclidean distance
is √(2Eb), so the probability for this particular error event is Q(√Eb/σ). It is the lower bound
of Pe. The probability is accumulated over all the error events in subsequent steps, so the
range of Pe can be obtained as
Q(√Eb/σ) < Pe < Q(√Eb/σ) + Q(√(3Eb)/σ) + … + Q(√((2k+1)Eb)/σ)
The right part of this formula is bounded because Q(x) decreases rapidly with x, so Pe
can be estimated as Q(√Eb/σ). Since the one-sided white noise power spectral density
(denoted n0) satisfies n0/2 = σ², Pe can be simplified as
Pe = Q(√(2Eb/n0))
However, for coherent detection of binary FSK,
Pe = Q(√(Eb/n0))
So Pe (the probability of bit error) is reduced from Q(√(Eb/n0)) to Q(√(2Eb/n0)) when the
VA is used.
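The size of this improvement is easy to check numerically (a sketch; the Eb/n0 value is an illustrative operating point, not from the thesis):

```python
import math

def Q(x):
    """Gaussian tail probability, Q(x) = 0.5 * erfc(x / sqrt(2))."""
    return 0.5 * math.erfc(x / math.sqrt(2))

snr = 5.0                        # illustrative Eb/n0 (linear, not dB)
pe_fsk = Q(math.sqrt(snr))       # coherent binary FSK
pe_va = Q(math.sqrt(2 * snr))    # VA detection of CPFSK
# Doubling Eb/n0 inside the square root is the claimed 3 dB gain, so
# pe_va is much smaller than pe_fsk at the same Eb/n0.
```

The factor of two inside the square root is a fixed 3 dB shift of the whole BER curve, matching the comparison made in the text.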
6.4 MEMORY COST
It can be seen that, due to its recursive structure, the VA does not need much storage space
in the estimation process of MAP or MLSE. Most of the memory is allocated for the storage
of the state sequence X(k). The storage size of the sequence is m × L, where m is the number
of states and L is the truncated length of the sequence.
In BCPFSK, the decoder performs the add-compare-select operation twice for each additional
step. Similarly, the add-compare-select operation is performed m times per step in an m-state
trellis decoding process. Hence, the total cost for a sequence of length L with m states is
O(mL). On high-speed digital signal processors such as the TMS320C series and
ADSP21000 series, with processing speeds of thousands of MIPS (million instructions per
second), it is not difficult to implement the VA in hardware.
There are several issues that need to be addressed before the VA can be applied to more
Digital Signal Processing (DSP) applications.
In this model, the decision on the estimate is not made until the entire signal is received.
Hence the VA will not work if the state sequence is infinite. It is necessary to truncate the
survivors to some manageable length L under such circumstances. L should be chosen large
enough that at time K = nL (n a positive integer) all the survivors belong to the same state
sequence; hence the length L is clearly related to the power of the noise in the channel. On
the other hand, it should not be too long, otherwise it introduces unnecessary delay and
lowers the efficiency of decoding. A trade-off has to be made between accuracy and
efficiency.
The state sequence X(k) is estimated only when the survivor length k+1 is calculated,
because the longer the sequence is, the more memory is consumed; on the other hand, the
probability of decision error decreases accordingly. At the last step (denoted Klast), there
are still m states left for the next step, so additional steps Klast+1, Klast+2, … are needed to
make these sequences converge at the same node. If the decision is made only according to
ΓKlast, the probability of error in the last state will be much greater than that in previous
states. A dummy sequence is therefore used to drive the state sequence onward at the end of
the truncated sequences. If the length of the dummy sequence is Le, then Le should be
chosen large enough that at step Klast + Le all the m sequences converge at the same node.
Under such circumstances, the decision can be made with the state sequence stored in X(k),
because at step Klast + Le all the state sequences from step 1 to step Klast + Le are the same.
The length Le depends on the number of states m and the power of the noise.
The algorithm is required to start from a known initial state X(0), but this might not be
satisfied in practice. If Ke is the state into which all the possible sequences in CPFSK
converge, then the VA can take Ke as the initial state, which has been proved to be finite.
Figure 9 depicts the problems in sections 5.2 and 5.3: it shows the output state sequence of
Viterbi decoding in convolutional coding. All the sequences converge together for the first
12 steps but diverge after step 12, so the initial state can be decided as state 1. In fact, the
states in the first 12 steps can be precisely decided in this case, so, for the problem in
section 5.3, the initial state can be the state in any step between step 0 and step 12. For the
problem in section 5.2, the sequences diverge from step 12 to step 16; extra steps are
required after step 16 to make the sequences converge again so that all the decisions on
path selection can be made precisely.
7.1 SIMULATION RESULTS
7.2 PERFORMANCE OF VITERBI DECODER FOR
CONVOLUTIONAL CODES
2.1 Convolutional code, rate 1/2, K = 3, generator polynomials G0 = 7, G1 = 5
[Plot: BER versus Eb/No (dB)]
2.2 Convolutional code, rate 1/2, K = 5, generator polynomials G0 = 35, G1 = 23
[Plot: BER versus Eb/No (dB)]
2.3 Convolutional code, rate 1/2, K = 7, generator polynomials G0 = 171, G1 = 133
[Plot: BER versus Eb/No (dB)]
2.4 Convolutional code, rate 1/2, K = 9, generator polynomials G0 = 753, G1 = 561
[Plot: BER versus Eb/No (dB)]
2.5 Comparison of convolutionally coded signals for different generator polynomials
[Plot: BER versus Eb/No (dB) for g0 = 7, g1 = 5; g0 = 35, g1 = 23; g0 = 171, g1 = 133; g0 = 753, g1 = 561]
7.3 PERFORMANCE OF FREQUENCY SHIFT KEYING
[Plot: BER versus Eb/No (dB)]
3.2 Performance of coherently detected FSK signal
[Plot: BER versus Eb/No (dB), coherent detection]
7.4 CONVOLUTIONALLY MODULATED AND
DEMODULATED FSK SIGNAL
[Plots: transmitted x(t) and estimate xhat(t) versus time; modulated signal s(t) and received signal r(t) versus time]
7.5 MODULATED SIGNAL OF CONTINUOUS-PHASE
FREQUENCY SHIFT KEYING
[Plot: CPFSK modulated waveform versus time]
7.6 PERFORMANCE OF VITERBI ALGORITHM IN
CONTINUOUS-PHASE FREQUENCY SHIFT KEYING
6.1 PERFORMANCE OF VA FOR COHERENT AND NON-COHERENT DETECTION
[Plot: performance of the VA detection of CPFSK; BER versus Eb/No (dB) for coherent detection and VA detection of CPFSK]
8. CONCLUSION AND FUTURE WORK
Convolutional coding is a coding scheme often employed in deep-space communications and,
more recently, in digital wireless communications. Viterbi decoders are used to decode
convolutional codes. Viterbi decoders employed in digital wireless communications are
complex to implement and dissipate considerable power. We investigated a Viterbi decoder
design.
By building the convolutional encoder and Viterbi decoder as behavioural models, the
MATLAB simulation results shed light on their performance and on how to implement them
in an RTL system. From the results, we find that the Viterbi decoding algorithm is a mature
error-correcting scheme, giving a BER of 8.6E-007 at 5 dB on an AWGN channel with BPSK
modulation. With puncturing, rate 2/3 costs around 2 dB and rate 3/4 around 3 dB during
transmission.