Entropy Coding

This document provides an introduction to entropy coding. It defines key terms like alphabet, symbols, coding, and codewords. It gives examples of fixed length binary codes for English letters using ASCII. It discusses calculating the average code length and introduces uniquely decodable codes. Prefix codes are defined as codes where no codeword is a prefix of another. Prefix codes can be represented using binary trees for decoding. Theorems are presented showing that prefix codes can achieve the optimal compression efficiency.

Uploaded by

Rakesh Inani

Lecture 5: Introduction to Entropy Coding

Thinh Nguyen
Oregon State University

Codes

Definitions:

Alphabet: a collection of symbols.

Letter (symbol): an element of an alphabet.

Coding: the assignment of binary sequences to elements of an alphabet.

Code: a set of binary sequences.

Codewords: the individual members of the set of binary sequences.

Examples of Binary Codes

English alphabet:

26 uppercase and 26 lowercase letters and punctuation marks.

ASCII code for the letter a is 1100001
ASCII code for the letter A is 1000001
ASCII code for the character , is 0101100

Note: all the letters (symbols) in this case use the same number of bits (7). These are called fixed length codes.

The average number of bits per symbol (letter) is called the rate of the code.
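The fixed length property can be checked with a short sketch. It assumes Python's built-in `ord` and `format`; the helper name `ascii_codeword` is made up for this illustration and is not part of the lecture.

```python
def ascii_codeword(ch):
    """Return the 7-bit ASCII codeword of a single character as a bit string."""
    value = ord(ch)
    if value > 127:
        raise ValueError("not a 7-bit ASCII character")
    # '07b' pads the binary representation with leading zeros to 7 bits.
    return format(value, "07b")


for ch in ["a", "A", ","]:
    print(repr(ch), ascii_codeword(ch))

# Every codeword has the same length, so the rate is 7 bits per symbol
# regardless of the symbol probabilities.
lengths = {len(ascii_codeword(ch)) for ch in "aA,"}
print("distinct codeword lengths:", lengths)
```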

Code Rate

Average length of the code is important in compression.

Suppose our source alphabet consists of four letters a1, a2, a3, and a4 with probabilities P(a1) = 0.5, P(a2) = 0.25, and P(a3) = P(a4) = 0.125.

The average length of the code is given by

$$\bar{l} = \sum_{i=1}^{4} P(a_i)\, n(a_i)$$

where n(ai) is the number of bits in the codeword for letter ai.

Uniquely Decodable Codes


Letters          Probability   Code 1   Code 2   Code 3   Code 4
a1               0.5           0        0        0        0
a2               0.25          0        1        10       01
a3               0.125         1        00       110      011
a4               0.125         10       11       111      0111
Average length                 1.125    1.25     1.75     1.875

Code 1: not unique: a1 and a2 have the same codeword.

Code 2: not uniquely decodable: 100 could mean a2a3 or a2a1a1.

Codes 3 and 4: uniquely decodable. What are the rules?

Code 3 is called an instantaneous code, since the decoder knows the codeword the moment that codeword is complete.
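As a quick check of the average lengths listed for the four codes, the formula for the average length can be evaluated directly. This is a minimal sketch; the function name `average_length` and the dictionary layout are choices made for this illustration.

```python
# Symbol probabilities P(a1)..P(a4) and the four codes from the table.
probs = [0.5, 0.25, 0.125, 0.125]
codes = {
    "Code 1": ["0", "0", "1", "10"],
    "Code 2": ["0", "1", "00", "11"],
    "Code 3": ["0", "10", "110", "111"],
    "Code 4": ["0", "01", "011", "0111"],
}


def average_length(probs, codewords):
    """Sum of P(a_i) * n(a_i), where n(a_i) is the codeword length in bits."""
    return sum(p * len(word) for p, word in zip(probs, codewords))


for name, codewords in codes.items():
    print(name, average_length(probs, codewords))
```

The probabilities here are exact binary fractions, so the floating-point sums come out exact.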

How do we know a code is uniquely decodable?

Consider two codewords: 011 and 011101

Prefix: 011
Dangling suffix: 101

Algorithm:

1. Construct a list of all the codewords.

2. Examine all pairs of codewords to see if any codeword is a prefix of another codeword. If there exists such a pair, add the dangling suffix to the list unless it is already there.

3. Continue this procedure using the larger list until either:

   1. A dangling suffix is a codeword -> not uniquely decodable.

   2. There are no more unique dangling suffixes -> uniquely decodable.
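The procedure above can be sketched in Python. Note one assumption: this sketch follows the standard Sardinas-Patterson formulation, in which each new dangling suffix is compared only against the original codewords (in both directions) rather than against every item in the enlarged list. The function names are illustrative.

```python
def dangling_suffixes(prefixes, words):
    """Non-empty suffixes s such that p + s == w for some p in prefixes, w in words."""
    found = set()
    for p in prefixes:
        for w in words:
            if len(w) > len(p) and w.startswith(p):
                found.add(w[len(p):])
    return found


def is_uniquely_decodable(codewords):
    code = set(codewords)
    if len(code) != len(codewords):
        return False  # two letters share a codeword
    # First round: dangling suffixes between pairs of codewords.
    current = dangling_suffixes(code, code)
    seen = set()
    while current:
        if current & code:
            return False  # a dangling suffix is itself a codeword
        seen |= current
        # Next round: compare the new suffixes with the codewords, both ways.
        nxt = dangling_suffixes(code, current) | dangling_suffixes(current, code)
        current = nxt - seen  # keep only suffixes not examined before
    return True  # no new dangling suffixes remain


print(is_uniquely_decodable(["0", "01", "11"]))
print(is_uniquely_decodable(["0", "01", "10"]))
```

The loop terminates because every dangling suffix is a suffix of some codeword, so only finitely many can ever be added.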

Examples of Unique Decodability

Consider {0, 01, 11}

Dangling suffix is 1 (from 0 and 01).

New list: {0, 01, 11, 1}

Dangling suffix is 1 (from 0 and 01, and also from 1 and 11), and it was already included in the previous iteration.

Since the dangling suffix is not a codeword, {0, 01, 11} is uniquely decodable.

Examples of Unique Decodability

Consider {0, 01, 10}

Dangling suffix is 1 (from 0 and 01).

New list: {0, 01, 10, 1}

The new dangling suffix is 0 (from 1 and 10).

Since the dangling suffix 0 is a codeword, {0, 01, 10} is not uniquely decodable.

Prefix Codes

Prefix code: a code in which no codeword is a prefix of another codeword.

A prefix code can be defined by a binary tree.

Example: (binary tree figure not reproduced here)

Decoding a Prefix Codeword
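Decoding with a binary tree can be sketched as follows. The lecture's own tree example is not available here, so this sketch assumes Code 3 from the earlier table (a1 = 0, a2 = 10, a3 = 110, a4 = 111); the nested-dictionary tree and the function names are illustrative choices.

```python
def build_tree(code):
    """Build a binary tree (nested dicts) with one leaf per codeword."""
    root = {}
    for symbol, word in code.items():
        node = root
        for bit in word:
            node = node.setdefault(bit, {})
        node["symbol"] = symbol  # mark the leaf
    return root


def decode(bits, root):
    """Walk the tree bit by bit; emit a symbol and restart at each leaf."""
    decoded = []
    node = root
    for bit in bits:
        if bit not in node:
            raise ValueError("bit sequence does not follow any codeword")
        node = node[bit]
        if "symbol" in node:
            decoded.append(node["symbol"])
            node = root
    if node is not root:
        raise ValueError("bit stream ended in the middle of a codeword")
    return decoded


code3 = {"a1": "0", "a2": "10", "a3": "110", "a4": "111"}
tree = build_tree(code3)
print(decode("0101100111", tree))
```

Because no codeword is a prefix of another, every codeword ends at a leaf, and the decoder never needs to look ahead.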

How good is the code?


Suppose a, b, and c occur with probabilities
1/8, 1/4, and 5/8, respectively.

Are we losing any efficiency by using a prefix code?

The answer is NO!

Theorem 1: Let C be a code with N codewords with lengths $l_1, l_2, \ldots, l_N$. If C is uniquely decodable, then

$$K(C) = \sum_{i=1}^{N} 2^{-l_i} \le 1$$

Theorem 2: Given a set of integers $l_1, l_2, \ldots, l_N$ that satisfy the inequality

$$\sum_{i=1}^{N} 2^{-l_i} \le 1$$

we can always find a prefix code with codeword lengths $l_1, l_2, \ldots, l_N$.
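The sum K(C) in Theorem 1 is easy to evaluate for the four codes in the earlier table. This is a minimal sketch; `kraft_sum` is an illustrative name, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction


def kraft_sum(codewords):
    """K(C) = sum over codewords of 2^(-length), as an exact fraction."""
    return sum(Fraction(1, 2 ** len(word)) for word in codewords)


codes = {
    "Code 1": ["0", "0", "1", "10"],
    "Code 2": ["0", "1", "00", "11"],
    "Code 3": ["0", "10", "110", "111"],
    "Code 4": ["0", "01", "011", "0111"],
}

for name, codewords in codes.items():
    k = kraft_sum(codewords)
    print(name, k, "<= 1" if k <= 1 else "> 1, so not uniquely decodable")
```

The inequality is a necessary condition only: {0, 01, 10} has K(C) = 1 but is not uniquely decodable. Theorem 2 says that some prefix code with those lengths exists, not that every code with those lengths works.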

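Proof of Theorem 1

Let $K(C) = \sum_{i=1}^{N} 2^{-l_i}$. For any integer $n \ge 1$,

$$[K(C)]^n = \left[\sum_{i=1}^{N} 2^{-l_i}\right]^n = \sum_{i_1=1}^{N} 2^{-l_{i_1}} \sum_{i_2=1}^{N} 2^{-l_{i_2}} \cdots \sum_{i_n=1}^{N} 2^{-l_{i_n}} = \sum_{i_1=1}^{N} \sum_{i_2=1}^{N} \cdots \sum_{i_n=1}^{N} 2^{-(l_{i_1} + l_{i_2} + \cdots + l_{i_n})}$$

The exponent $k = l_{i_1} + l_{i_2} + \cdots + l_{i_n}$ is simply the combined length of n codewords.

The smallest value of k is n and the largest value is $nl$, where $l = \max\{l_1, \ldots, l_N\}$. So,

$$[K(C)]^n = \sum_{k=n}^{nl} A_k 2^{-k}$$

where $A_k$ is the number of combinations of n codewords that have a combined length of k.

$A_k \le 2^k$, since for a uniquely decodable code each binary sequence of length k can represent one and only one sequence of codewords. This implies

$$[K(C)]^n = \sum_{k=n}^{nl} A_k 2^{-k} \le \sum_{k=n}^{nl} 2^{k} 2^{-k} = nl - n + 1$$

The right-hand side grows only linearly in n. If K(C) were greater than 1, the left-hand side would grow exponentially in n and would eventually exceed the bound.

Thus,

$$K(C) \le 1$$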

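Proof of Theorem 2: If $\sum_{i=1}^{N} 2^{-l_i} \le 1$, we can always find a prefix code with the lengths $l_1, l_2, \ldots, l_N$.

Assume: $l_1 \le l_2 \le \cdots \le l_N$

Define:

$$w_1 = 0, \qquad w_j = \sum_{i=1}^{j-1} 2^{\,l_j - l_i} \quad (j > 1)$$

Fact 1: The binary representation of $w_j$ takes up $\lceil \log_2 (w_j + 1) \rceil$ bits.

Fact 2: The number of bits in the binary representation of $w_j$ is less than or equal to $l_j$:

$$\log_2 (w_j + 1) = \log_2\left[\sum_{i=1}^{j-1} 2^{\,l_j - l_i} + 1\right] = \log_2\left[2^{l_j} \sum_{i=1}^{j-1} 2^{-l_i} + 2^{l_j} 2^{-l_j}\right] = l_j + \log_2\left[\sum_{i=1}^{j} 2^{-l_i}\right] \le l_j$$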

Proof of Theorem 2 (continued)

Now, using the binary representation of $w_j$, we define the codewords as follows:

If $\lceil \log_2 (w_j + 1) \rceil = l_j$, then the jth codeword $c_j$ is the binary representation of $w_j$.

If $\lceil \log_2 (w_j + 1) \rceil < l_j$, then the jth codeword $c_j$ is the binary representation of $w_j$ with $l_j - \lceil \log_2 (w_j + 1) \rceil$ zeros added on the left, so that $c_j$ has exactly $l_j$ bits.

This is clearly a decodable code: the $w_j$ are all different, since $\sum_{i=1}^{j-1} 2^{\,l_j - l_i}$ is an increasing function of j, and each codeword $c_j$ has length $l_j$.

Proof of Theorem 2 (continued)

It remains to show that this code is a prefix code. Suppose the claim is not true; then for some j < k, $c_j$ is a prefix of $c_k$.

This means the $l_j$ most significant bits of $w_k$ form the binary representation of $w_j$:

$$w_j = \left\lfloor \frac{w_k}{2^{\,l_k - l_j}} \right\rfloor$$

However,

$$w_k = \sum_{i=1}^{k-1} 2^{\,l_k - l_i}$$

Therefore,

$$\frac{w_k}{2^{\,l_k - l_j}} = \sum_{i=1}^{k-1} 2^{\,l_j - l_i} = w_j + \sum_{i=j}^{k-1} 2^{\,l_j - l_i} = w_j + 1 + \sum_{i=j+1}^{k-1} 2^{\,l_j - l_i} \ge w_j + 1$$

That is, the smallest value for $w_k / 2^{\,l_k - l_j}$ is $w_j + 1$, so its integer part cannot equal $w_j$.

Hence, a contradiction!
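The construction used in the proof of Theorem 2 can be tried out numerically. The sketch below assumes positive integer lengths and writes each $w_j$ as an $l_j$-bit binary number with leading zeros; the function names are illustrative.

```python
from fractions import Fraction


def prefix_code_from_lengths(lengths):
    """Build codewords c_j from w_j = sum_{i<j} 2^(l_j - l_i), lengths sorted ascending."""
    ls = sorted(lengths)
    if sum(Fraction(1, 2 ** l) for l in ls) > 1:
        raise ValueError("lengths violate the inequality of Theorem 2")
    codewords = []
    for j, lj in enumerate(ls):
        wj = sum(2 ** (lj - li) for li in ls[:j])  # empty sum gives w_1 = 0
        # Binary representation of w_j, zero-padded on the left to l_j bits.
        codewords.append(format(wj, "0{}b".format(lj)))
    return codewords


def is_prefix_code(codewords):
    """True if no codeword is a prefix of a different codeword."""
    return not any(
        a != b and b.startswith(a) for a in codewords for b in codewords
    )


for lengths in ([1, 2, 3, 3], [2, 2, 2], [1, 3, 3, 4]):
    code = prefix_code_from_lengths(lengths)
    print(lengths, code, is_prefix_code(code))
```

For the lengths 1, 2, 3, 3 the construction reproduces Code 3 from the earlier table.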
