Entropy Coding

This document provides an introduction to entropy coding. It defines key terms like alphabet, symbols, coding, and codewords. It gives examples of fixed length binary codes for English letters using ASCII. It discusses calculating the average code length and introduces uniquely decodable codes. Prefix codes are defined as codes where no codeword is a prefix of another. Prefix codes can be represented using binary trees for decoding. Theorems are presented showing that prefix codes can achieve the optimal compression efficiency.

Uploaded by

Rakesh Inani


Lecture 5: Introduction to Entropy Coding

Thinh Nguyen
Oregon State University

Codes

Definitions:

Alphabet: a collection of symbols.

Letter (symbol): an element of an alphabet.

Coding: the assignment of binary sequences to elements of an alphabet.

Code: the set of binary sequences.

Codewords: individual members of the set of binary sequences.

Examples of Binary Codes

English alphabet:

26 uppercase and 26 lowercase letters and punctuation marks.

ASCII code for the letter a is 1100001

ASCII code for the letter A is 1000001
ASCII code for the character , is 0101100

Note: all the letters (symbols) in this case use the same number of bits (7). These are called fixed-length codes.
The average number of bits per symbol (letter) is called the rate of the code.

Code Rate

Average length of the code is important in compression.

Suppose our source alphabet consists of four letters a1, a2, a3, and a4 with probabilities P(a1) = 0.5, P(a2) = 0.25, and P(a3) = P(a4) = 0.125.

The average length of the code is given by

$$l = \sum_{i=1}^{4} P(a_i)\, n(a_i)$$

where n(ai) is the number of bits in the codeword for letter ai.
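The average-length formula can be checked numerically with a short sketch. The function name `average_length` and the two codeword assignments below are illustrative assumptions chosen for this example; only the probabilities come from the text.

```python
def average_length(probabilities, codewords):
    """Average number of bits per symbol: sum of P(a_i) * n(a_i)."""
    return sum(p * len(word) for p, word in zip(probabilities, codewords))


# Probabilities of a1..a4 as given in the text.
probs = [0.5, 0.25, 0.125, 0.125]

# Assumed example assignments: a 2-bit fixed-length code and a variable-length code.
fixed_length = ["00", "01", "10", "11"]
variable_length = ["0", "10", "110", "111"]

print(average_length(probs, fixed_length))     # 2.0
print(average_length(probs, variable_length))  # 1.75
```

Giving shorter codewords to more probable letters lowers the average from 2 bits to 1.75 bits per symbol in this example.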

Uniquely Decodable Codes

Letters          Probability   Code 1   Code 2   Code 3   Code 4
a1               0.5           0        0        0        0
a2               0.25          0        1        10       01
a3               0.125         1        00       110      011
a4               0.125         10       11       111      0111
Average length                 1.125    1.25     1.75     1.875

Code 1: not uniquely decodable: a1 and a2 have the same codeword.

Code 2: not uniquely decodable: 100 could mean a2a3 or a2a1a1.
Codes 3 and 4: uniquely decodable. What are the rules?
Code 3 is called an instantaneous code, since the decoder knows the codeword the moment a codeword is complete.
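The ambiguity of Code 2 can be made concrete by listing every way a bit string splits into codewords. This is a minimal brute-force sketch; the helper name `all_parses` is an assumption made for illustration, and it assumes no codeword is empty.

```python
def all_parses(bits, code):
    """Return every way to split `bits` into codewords.

    `code` maps each letter to its codeword (no empty codewords).
    """
    if not bits:
        return [[]]
    parses = []
    for letter, word in code.items():
        if bits.startswith(word):
            for rest in all_parses(bits[len(word):], code):
                parses.append([letter] + rest)
    return parses


code2 = {"a1": "0", "a2": "1", "a3": "00", "a4": "11"}
code3 = {"a1": "0", "a2": "10", "a3": "110", "a4": "111"}

print(all_parses("100", code2))  # [['a2', 'a1', 'a1'], ['a2', 'a3']]
print(all_parses("100", code3))  # [['a2', 'a1']]
```

Under Code 2 the string 100 has two valid parses, while under Code 3 it has exactly one.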

How do we know whether a code is uniquely decodable?

Consider two codewords: 011 and 011101

Prefix: 011
Dangling suffix: 101

Algorithm:
1. Construct a list of all the codewords.
2. Examine all pairs of codewords to see if any codeword is a prefix of another codeword. If there exists such a pair, add the dangling suffix to the list unless it is already there.
3. Continue this procedure using the larger list until:
   Either a dangling suffix is a codeword -> not uniquely decodable.
   Or there are no more unique dangling suffixes -> uniquely decodable.

Examples of Unique Decodability

Consider {0, 01, 11}

Dangling suffix is 1 (from 0 and 01).

New list: {0, 01, 11, 1}

Dangling suffix is 1 (from 0 and 01, and also from 1 and 11), which was already included in the previous iteration.

Since the dangling suffix is not a codeword, {0, 01, 11} is uniquely decodable.

Examples of Unique Decodability

Consider {0, 01, 10}

Dangling suffix is 1 (from 0 and 01).

New list: {0, 01, 10, 1}

The new dangling suffix is 0 (from 1 and 10).

Since the dangling suffix 0 is a codeword, {0, 01, 10} is not uniquely decodable.
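The dangling-suffix test can be sketched in a few lines of Python. This sketch follows the Sardinas-Patterson formulation, in which each dangling suffix is compared against the original codewords only; that reading of the procedure, and the function names, are assumptions made for illustration.

```python
def dangling_suffixes(prefixes, words):
    """Non-empty suffixes s such that p + s == w for some p in prefixes, w in words."""
    return {w[len(p):] for p in prefixes for w in words
            if len(w) > len(p) and w.startswith(p)}


def is_uniquely_decodable(codewords):
    code = set(codewords)
    if len(code) != len(codewords) or "" in code:
        return False  # repeated or empty codewords can never be decoded uniquely
    current = dangling_suffixes(code, code)
    seen = set()
    while current:
        if current & code:
            return False  # a dangling suffix is itself a codeword
        seen |= current
        # Compare the new suffixes against the codewords, in both directions.
        found = dangling_suffixes(code, current) | dangling_suffixes(current, code)
        current = found - seen
    return True  # no new dangling suffixes remain


print(is_uniquely_decodable(["0", "01", "11"]))  # True
print(is_uniquely_decodable(["0", "01", "10"]))  # False
```

The loop terminates because every dangling suffix is a suffix of some codeword, so only finitely many can ever be added.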

Prefix Codes

Prefix code: a code in which no codeword is a prefix of another codeword.

A prefix code can be defined by a binary tree

Example:

Decoding a Prefix Codeword
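Decoding by walking a binary tree can be sketched as follows. The three-letter code used here is a hypothetical example, not necessarily the one on the slide, and the nested-dictionary tree and function names are likewise assumptions made for illustration.

```python
def build_tree(code):
    """Build a binary tree (nested dicts) whose leaves hold the letters."""
    root = {}
    for letter, word in code.items():
        node = root
        for bit in word:
            node = node.setdefault(bit, {})
        node["letter"] = letter
    return root


def decode(bits, root):
    """Follow one branch per bit; emit a letter and restart at every leaf."""
    letters, node = [], root
    for bit in bits:
        node = node[bit]
        if "letter" in node:
            letters.append(node["letter"])
            node = root
    if node is not root:
        raise ValueError("input ends in the middle of a codeword")
    return letters


# Hypothetical prefix code: no codeword is a prefix of another.
example_code = {"a": "0", "b": "10", "c": "11"}
tree = build_tree(example_code)
print(decode("10011", tree))  # ['b', 'a', 'c']
```

Because no codeword is a prefix of another, every codeword ends at a leaf, so the decoder never needs to look ahead.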

How good is the code?

Suppose a, b, and c occur with probabilities
1/8, 1/4, and 5/8, respectively.

Are we losing any efficiency by using prefix codes?

The answer is NO!

Theorem 1: Let C be a code with N codewords with lengths $l_1, l_2, \ldots, l_N$. If C is uniquely decodable, then

$$K(C) = \sum_{i=1}^{N} 2^{-l_i} \le 1$$

Theorem 2: Given a set of integers $l_1, l_2, \ldots, l_N$ that satisfy the inequality

$$\sum_{i=1}^{N} 2^{-l_i} \le 1$$

we can always find a prefix code with codeword lengths $l_1, l_2, \ldots, l_N$.
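The sum K(C) is easy to evaluate exactly. The sketch below applies it to Codes 2, 3 and 4 from the earlier table; the function name `kraft_sum` is an assumption made for illustration.

```python
from fractions import Fraction


def kraft_sum(codewords):
    """K(C): the sum of 2^(-l_i) over all codeword lengths, as an exact fraction."""
    return sum(Fraction(1, 2 ** len(word)) for word in codewords)


print(kraft_sum(["0", "1", "00", "11"]))      # 3/2
print(kraft_sum(["0", "10", "110", "111"]))   # 1
print(kraft_sum(["0", "01", "011", "0111"]))  # 15/16
```

Code 2 gives K(C) > 1, so by Theorem 1 it cannot be uniquely decodable. The converse does not hold: K(C) ≤ 1 only guarantees that some prefix code with those lengths exists, not that the given code is uniquely decodable.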

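Proof of Theorem 1

$$[K(C)]^n = \left[ \sum_{i=1}^{N} 2^{-l_i} \right]^n = \left[ \sum_{i_1=1}^{N} 2^{-l_{i_1}} \right] \left[ \sum_{i_2=1}^{N} 2^{-l_{i_2}} \right] \cdots \left[ \sum_{i_n=1}^{N} 2^{-l_{i_n}} \right] = \sum_{i_1=1}^{N} \sum_{i_2=1}^{N} \cdots \sum_{i_n=1}^{N} 2^{-(l_{i_1} + l_{i_2} + \cdots + l_{i_n})}$$

The quantity $k = l_{i_1} + l_{i_2} + \cdots + l_{i_n}$ in the exponent is simply the combined length of n codewords.

The smallest value of k is n and the largest value is nl, where l is the length of the longest codeword. So,

$$[K(C)]^n = \sum_{k=n}^{nl} A_k 2^{-k}$$

$A_k$ is the number of combinations of n codewords that have a combined length of k.

For a uniquely decodable code, each bit sequence can represent one and only one sequence of codewords. There are only $2^k$ binary sequences of length k, so $A_k \le 2^k$. This implies

$$[K(C)]^n = \sum_{k=n}^{nl} A_k 2^{-k} \le \sum_{k=n}^{nl} 2^{k} 2^{-k} = nl - n + 1$$

The right-hand side grows only linearly in n. If K(C) were greater than 1, the left-hand side would grow exponentially in n and would eventually exceed it.

Thus,

$$K(C) \le 1$$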
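The counting step of the proof can be checked by brute force for small n. This sketch enumerates all ordered n-tuples of codewords and tallies them by combined length; the function name `combined_length_counts` is an assumption made for illustration.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def combined_length_counts(codewords, n):
    """A_k: the number of ordered n-tuples of codewords with combined length k."""
    return Counter(sum(len(word) for word in combo)
                   for combo in product(codewords, repeat=n))


code3 = ["0", "10", "110", "111"]  # uniquely decodable, K(C) = 1
code2 = ["0", "1", "00", "11"]     # not uniquely decodable, K(C) = 3/2

counts3 = combined_length_counts(code3, 2)
# Summing A_k * 2^(-k) reproduces [K(C)]^n.
power3 = sum(Fraction(a, 2 ** k) for k, a in counts3.items())
bound_holds3 = all(a <= 2 ** k for k, a in counts3.items())

counts2 = combined_length_counts(code2, 4)
bound_holds2 = all(a <= 2 ** k for k, a in counts2.items())

print(power3, bound_holds3, bound_holds2)  # 1 True False
```

For Code 2 with n = 4 there are 64 codeword sequences of combined length 5 but only 32 bit strings of that length, so some bit string must have more than one parse.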

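Proof of Theorem 2: If $\sum_{i=1}^{N} 2^{-l_i} \le 1$, we can always find a prefix code with lengths $l_1, l_2, \ldots, l_N$

Assume: $l_1 \le l_2 \le \cdots \le l_N$

Define:

$$w_1 = 0, \qquad w_j = \sum_{i=1}^{j-1} 2^{l_j - l_i}, \quad j > 1$$

Fact 1: The binary representation of $w_j$ would take up $\lceil \log_2 (w_j + 1) \rceil$ bits.

Fact 2: The number of bits in the binary representation of $w_j$ is less than or equal to $l_j$:

$$\log_2 (w_j + 1) = \log_2 \left[ \sum_{i=1}^{j-1} 2^{l_j - l_i} + 1 \right] = \log_2 \left[ 2^{l_j} \sum_{i=1}^{j-1} 2^{-l_i} + 2^{l_j} 2^{-l_j} \right] = l_j + \log_2 \left[ \sum_{i=1}^{j} 2^{-l_i} \right] \le l_j$$

The last step holds because the sum inside the logarithm is at most 1 by assumption.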

Proof of Theorem 2: If $\sum_{i=1}^{N} 2^{-l_i} \le 1$, we can always find a prefix code with lengths $l_1, l_2, \ldots, l_N$

Now, using the binary representation of $w_j$, we define the codeword as follows:

If $\lceil \log_2 (w_j + 1) \rceil = l_j$, then the jth codeword $c_j$ is the binary representation of $w_j$.

If $\lceil \log_2 (w_j + 1) \rceil < l_j$, then the jth codeword $c_j$ is the binary representation of $w_j$ with $l_j - \lceil \log_2 (w_j + 1) \rceil$ zeros added on the left, so that $c_j$ has exactly $l_j$ bits.

This is clearly a decodable code: the $w_j$ are all different, since $w_j = \sum_{i=1}^{j-1} 2^{l_j - l_i}$ is an increasing function of j, and each codeword $c_j$ has length $l_j$.
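The construction in the proof translates directly into code. The sketch below pads each $w_j$ with leading zeros to exactly $l_j$ bits, which is what the rest of the proof relies on; the function names are assumptions made for illustration.

```python
from fractions import Fraction


def prefix_code_from_lengths(lengths):
    """Build codewords c_j as the l_j-bit binary representation of w_j."""
    lengths = sorted(lengths)  # the proof assumes l_1 <= l_2 <= ... <= l_N
    if sum(Fraction(1, 2 ** l) for l in lengths) > 1:
        raise ValueError("lengths do not satisfy the Kraft inequality")
    codewords = []
    for j, lj in enumerate(lengths):
        wj = sum(2 ** (lj - li) for li in lengths[:j])  # w_1 = 0 (empty sum)
        codewords.append(format(wj, "0{}b".format(lj)))  # left-pad to l_j bits
    return codewords


def is_prefix_free(codewords):
    """True if no codeword is a prefix of a different codeword."""
    return not any(a != b and b.startswith(a)
                   for a in codewords for b in codewords)


print(prefix_code_from_lengths([1, 2, 3, 3]))  # ['0', '10', '110', '111']
print(prefix_code_from_lengths([2, 2, 2]))     # ['00', '01', '10']
```

For the lengths 1, 2, 3, 3 the construction reproduces Code 3 from the earlier table.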

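Proof of Theorem 2: If $\sum_{i=1}^{N} 2^{-l_i} \le 1$, we can always find a prefix code with lengths $l_1, l_2, \ldots, l_N$

Suppose the claim is not true. Then for some j < k, $c_j$ is a prefix of $c_k$.

This means that the $l_j$ most significant bits of $w_k$ form the binary representation of $w_j$:

$$w_j = \left\lfloor \frac{w_k}{2^{l_k - l_j}} \right\rfloor$$

However,

$$w_k = \sum_{i=1}^{k-1} 2^{l_k - l_i}$$

Therefore,

$$\frac{w_k}{2^{l_k - l_j}} = \sum_{i=1}^{k-1} 2^{l_j - l_i} = w_j + \sum_{i=j}^{k-1} 2^{l_j - l_i} = w_j + 1 + \sum_{i=j+1}^{k-1} 2^{l_j - l_i} \ge w_j + 1$$

That is, the smallest value for $\frac{w_k}{2^{l_k - l_j}}$ is $w_j + 1$, so its integer part cannot equal $w_j$.

Hence, a contradiction!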
