Unit 2
Source Coding:
Definition: A conversion of the output of a discrete memoryless source
(DMS) into a sequence of binary symbols, i.e. a binary code word, is called
Source Coding.
Source coding is defined as the process of encoding the information
source message into a binary code word, together with lossless and faithful
decoding from the binary code word back to the original information source message.
The device that performs this conversion is called the Source
Encoder.
Objective of Source Coding: An objective of source coding is to minimize
the average bit rate required to represent the source by reducing the
redundancy of the information source.
A Source code is identified as a non-singular code if all the code words are
distinct.
A code is called a block code of length 𝑛 if its code words are all of fixed length 𝑛.
All practical codes must be uniquely decodable.
A code is said to be uniquely decodable if, and only if, the 𝑛th extension
of the code is non-singular for every finite value of 𝑛.
A block code of length 𝑛 which is non-singular is uniquely decodable
Codes
Uniquely decodable code: A code is uniquely decodable if every encoded sequence
corresponds to only one possible sequence of source symbols. A code in which no
codeword is a proper prefix of any other codeword has the prefix property and is
called a prefix code (or prefix-free code); every prefix code is uniquely decodable.
Example: Not Uniquely Decodable Code (0, 10, 010, 101)
Example: Uniquely Decodable Code: (00, 10, 110, 11)
Instantaneous Code: A uniquely decodable code is said to be instantaneous if it
is possible to decode each codeword in a sequence without reference to
succeeding codewords. A necessary and sufficient condition for a code to be
instantaneous is that no codeword is a prefix of some other codeword.
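A short Python sketch of this prefix test is given below (illustrative only, not part of the original notes); the first two codes are the examples listed above, and the third is an additional prefix-free code assumed here purely for comparison.

```python
def is_prefix_free(codewords):
    """Return True if no codeword is a prefix of any other codeword."""
    for a in codewords:
        for b in codewords:
            if a != b and b.startswith(a):
                return False
    return True

print(is_prefix_free(["0", "10", "010", "101"]))   # False: "0" is a prefix of "010"
print(is_prefix_free(["00", "10", "110", "11"]))   # False: "11" is a prefix of "110"
print(is_prefix_free(["00", "01", "10", "110"]))   # True: a prefix-free (instantaneous) code
```

Note that the second code is uniquely decodable, as stated above, but it is not prefix-free and hence not instantaneous.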
Examples
Construction of instantaneous codes: a code word of length 𝑙𝑖 is assigned to each source symbol 𝑥𝑖.
The parameter 𝐿 = ∑ 𝑃(𝑥𝑖) 𝑙𝑖 represents the average number of bits per source symbol used in the
source coding process.
Contd…
1. Code Efficiency:
The code efficiency η is defined as
𝜂 = 𝐿𝑚𝑖𝑛 / 𝐿
where 𝐿𝑚𝑖𝑛 is the minimum possible value of the average code word length 𝐿.
2. Code Redundancy:
The code redundancy γ is defined as
𝛾 = 1 − 𝜂
The Source Coding Theorem
The source coding theorem states that for a DMS X, with entropy H (X), the
average code word length 𝐿 per symbol is bounded as
𝐿 ≥ 𝐻 (𝑋)
and further, 𝐿 can be made as close to H (X) as desired for some suitably chosen code.
Thus, with 𝐿𝑚𝑖𝑛 = 𝐻 (𝑋), the code efficiency can be rewritten as
𝜂 = 𝐻 (𝑋) / 𝐿
Contd…
1. Fixed – Length Codes:
A fixed – length code is one whose code word length is fixed. Code 1 and Code 2 of the
above table are fixed – length codes with length 2.
2. Variable – Length Codes:
A variable – length code is one whose code word length is not fixed. All codes of the
above table except Code 1 and Code 2 are variable – length codes.
3. Distinct Codes:
A code is distinct if each code word is distinguishable from every other code word. All codes of
the above table except Code 1 are distinct codes.
Contd…
4. Prefix – Free Codes:
A code in which no code word can be formed by adding code symbols to another code word is
called a prefix – free code. In a prefix – free code, no code word is a prefix of another. Codes 2, 4
and 6 of the above table are prefix – free codes.
5. Uniquely Decodable Codes:
A distinct code is uniquely decodable if the original source sequence can be reconstructed
perfectly from the encoded binary sequence. A sufficient condition to ensure that a code is
uniquely decodable is that no code word is a prefix of another. Thus the prefix – free codes 2, 4 and
6 are uniquely decodable codes. The prefix – free condition is not a necessary condition for unique
decodability. Code 5 does not satisfy the prefix – free condition, yet it is a uniquely
decodable code since the bit 0 indicates the beginning of each code word.
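Unique decodability of an arbitrary code can be checked mechanically with the Sardinas–Patterson test. The Python sketch below is an illustrative implementation (the code words of Code 5 are not reproduced in these notes, so it is applied to the two example codes listed earlier).

```python
def dangling_suffixes(A, B):
    """Suffixes s such that some word a in A equals some word b in B followed by s."""
    return {a[len(b):] for a in A for b in B if a != b and a.startswith(b)}

def is_uniquely_decodable(codewords):
    """Sardinas-Patterson test for unique decodability."""
    C = set(codewords)
    S = dangling_suffixes(C, C)        # initial set of dangling suffixes
    seen = set()
    while S and not (S & C):           # a codeword appearing as a dangling suffix means ambiguity
        key = frozenset(S)
        if key in seen:                # the suffix sets repeat, so no ambiguity can ever arise
            return True
        seen.add(key)
        S = dangling_suffixes(C, S) | dangling_suffixes(S, C)
    return not (S & C)

print(is_uniquely_decodable(["0", "10", "010", "101"]))   # False
print(is_uniquely_decodable(["00", "10", "110", "11"]))   # True
```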
Contd…
6. Instantaneous Codes:
A uniquely decodable code is called an instantaneous code if the end of any code word is
recognizable without examining subsequent code symbols. The instantaneous codes have the
property previously mentioned that no code word is a prefix of another code word. Prefix –
free codes are sometimes known as instantaneous codes.
7. Optimal Codes:
A code is said to be optimal if it is instantaneous and has the minimum average code word
length 𝐿 for a given source with a given probability assignment for the source symbols.
Kraft Inequality
A necessary and sufficient condition for the existence of an instantaneous code with alphabet
size 𝑟 and 𝑞 code words with individual code word lengths 𝑙1, 𝑙2, …, 𝑙𝑞 is that the
following inequality be satisfied:
Let X be a DMS with alphabet {𝑥𝑖}(𝑖= 1,2, …, q). Assume that the length of the
assigned binary code word corresponding to xi is li.
A necessary and sufficient condition for the existence of an instantaneous binary code is
𝐾 = ∑ 𝑟^(−𝑙𝑖) ≤ 1, where the sum runs over 𝑖 = 1, 2, …, 𝑞 (for a binary code, 𝑟 = 2).
This is known as the Kraft Inequality
It may be noted that Kraft inequality assures us of the existence of an instantaneously
decodable code with code word lengths that satisfy the inequality.
But it does not show us how to obtain those code words, nor does it say that any code satisfying the
inequality is automatically uniquely decodable.
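As an illustrative check (a sketch, not part of the original notes), the Kraft sum can be computed for any proposed set of code word lengths; the lengths used below are those of the code words 00, 01, 10, 110, 1110, 1111 from the entropy-coding example that follows.

```python
def kraft_sum(lengths, r=2):
    """Kraft sum K = sum of r**(-l) over all code word lengths l."""
    return sum(r ** (-l) for l in lengths)

# Lengths of the code words 00, 01, 10, 110, 1110, 1111:
lengths = [2, 2, 2, 3, 4, 4]
K = kraft_sum(lengths)
print(K, K <= 1)   # 1.0 True: an instantaneous binary code with these lengths exists
```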
McMillan’s Theorem
Since the class of uniquely decodable codes is larger than the class of instantaneous codes, one
would expect greater efficiencies to be achieved considering the class of all uniquely decodable
codes rather than the more restrictive class of instantaneous codes.
McMillan’s Theorem assures us that we do not lose out if we only consider the class of
instantaneous codes
The code word lengths of any uniquely decodable code must satisfy the Kraft Inequality: ∑ 𝑟^(−𝑙𝑖) ≤ 1.
Conversely, given a set of code word lengths that satisfy this inequality, then there exists a uniquely
decodable code with these code word lengths
Consider the following two binary codes for the
same source. Which code is better?
Entropy Coding
The design of a variable – length code such that its average code word length
approaches the entropy of the DMS is often referred to as Entropy Coding.
Example: For the code words assigned to the symbols with the probabilities given below, find the coding efficiency.

Symbol    P(xi)    Code word
𝒙𝟏        0.30     00
𝒙𝟐        0.25     01
𝒙𝟑        0.20     10
𝒙𝟒        0.12     110
𝒙𝟓        0.08     1110
𝒙𝟔        0.05     1111

H (X) = 2.36 bits/symbol
𝐿 = 2.38 bits/symbol
η = H (X)/𝐿 = 0.99
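The figures above can be reproduced with a short calculation. The Python sketch below (illustrative only) computes the entropy, the average code word length and the efficiency directly from the table.

```python
from math import log2

probs   = [0.30, 0.25, 0.20, 0.12, 0.08, 0.05]
lengths = [2, 2, 2, 3, 4, 4]    # lengths of the code words 00, 01, 10, 110, 1110, 1111

H   = -sum(p * log2(p) for p in probs)             # source entropy H(X)
L   = sum(p * l for p, l in zip(probs, lengths))   # average code word length
eta = H / L                                        # code efficiency

print(f"H(X) = {H:.2f} bits/symbol")   # 2.36
print(f"L    = {L:.2f} bits/symbol")   # 2.38
print(f"eta  = {eta:.2f}")             # 0.99
```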
Huffman Coding:
Huffman coding results in an optimal code: among all instantaneous codes for a given source, it has
the highest efficiency (minimum average code word length).
The Huffman coding procedure is as follows:
1) List the source symbols in order of decreasing probability.
2) Combine the probabilities of the two symbols having the lowest probabilities and reorder
the resultant probabilities; this step is called reduction 1. The same procedure is repeated
until there are two ordered probabilities remaining.
Contd…
3) Start encoding with the last reduction, which consists of exactly two ordered
probabilities. Assign 0 as the first digit in the code word for all the source
symbols associated with the first probability; assign 1 to the second probability.
4) Now go back and assign 0 and 1 to the second digit for the two probabilities that
were combined in the previous reduction step, retaining all the code-digit assignments
made in step 3.
5) Keep going back in this way until the first column is reached.
6) The code word for each symbol is obtained by tracing back from right to left.
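The procedure above can also be carried out programmatically. The Python sketch below is an illustrative heap-based implementation of the same Huffman construction; it repeatedly merges the two lowest probabilities instead of using the tabular reduction described in the steps, and it reuses the symbol probabilities of the earlier entropy-coding example.

```python
import heapq

def huffman_code(probabilities):
    """Build a binary Huffman code for {symbol: probability}; returns {symbol: codeword}."""
    # Each heap entry is (probability, tie_breaker, {symbol: partial codeword so far}).
    heap = [(p, i, {sym: ""}) for i, (sym, p) in enumerate(probabilities.items())]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        p1, _, group1 = heapq.heappop(heap)   # the two lowest probabilities
        p2, _, group2 = heapq.heappop(heap)
        # Prepend 0 to every codeword in one group and 1 in the other, then merge.
        merged = {s: "0" + c for s, c in group1.items()}
        merged.update({s: "1" + c for s, c in group2.items()})
        heapq.heappush(heap, (p1 + p2, counter, merged))
        counter += 1
    return heap[0][2]

# Symbol probabilities reused from the earlier entropy-coding example:
probs = {"x1": 0.30, "x2": 0.25, "x3": 0.20, "x4": 0.12, "x5": 0.08, "x6": 0.05}
codes = huffman_code(probs)
print(codes)
print(sum(probs[s] * len(codes[s]) for s in probs))   # average code word length L
```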
Example: Huffman Coding Algorithm
For an alphabet with given probabilities for its symbols, find the Huffman codes and also find the efficiency and variance of the code.
Redundancy:
Redundancy in information theory refers to the reduction in information content of
a message from its maximum value
For example, consider the English language with its 26 letters. Assuming all letters are equally
likely to occur, P (xi) = 1/26. The information content per letter is therefore
log2 26 = 4.7 bits/letter
The assumption that each letter occurs with equal probability is not correct; since
some letters are more likely to occur than others, the actual information
content of English is reduced from its maximum value of 4.7 bits/letter.
We define the relative entropy as the ratio of H (Y/X) to H (X), which gives the maximum
compression value, and the Redundancy is then expressed as
Redundancy = 1 − H (Y/X) / H (X)
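As a rough illustration, the redundancy can be computed as follows; the numerical value assumed below for the conditional entropy H(Y/X) of English text is an illustrative figure and does not come from these notes.

```python
from math import log2

H_X    = log2(26)   # entropy with all 26 letters equally likely, about 4.7 bits/letter
H_cond = 3.3        # assumed conditional entropy H(Y/X) of English text (illustrative value only)

relative_entropy = H_cond / H_X
redundancy = 1 - relative_entropy

print(f"H(X) = {H_X:.2f} bits/letter")
print(f"Redundancy = {redundancy:.2f}")   # about 0.30 with the assumed value
```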
In arithmetic coding, the data encoded so far is represented by a range specified by two integers.
The relatively new asymmetric numeral systems (ANS) family of entropy coders enables faster
implementations because it works directly on a single natural number that represents the current
information.