Lecture 35-37: Source Coding
1. Source symbols are encoded in binary.
2. The average codelength must be reduced.
3. Removing redundancy reduces the bit-rate.

Consider a discrete memoryless source with alphabet S = {s_0, s_1, ..., s_{K-1}}. Let the corresponding probabilities be {p_0, p_1, ..., p_{K-1}} and the codeword lengths be {l_0, l_1, ..., l_{K-1}}.
Then, the average codelength (average number of bits per symbol) of the source is defined as

\bar{L} = \sum_{k=0}^{K-1} p_k l_k
If L_min is the minimum possible value of \bar{L}, then the coding efficiency of the source is given by

\eta = \frac{L_{min}}{\bar{L}}

For an efficient code, \eta approaches unity.

The question: What is the smallest average codelength that is possible?
The answer: Shannon's source coding theorem. Given a discrete memoryless source of entropy H(S), the average codeword length \bar{L} of any lossless source coding scheme is bounded as

\bar{L} \geq H(S)

Since H(S) is the fundamental limit on the average number of bits per symbol, we can say L_min = H(S), and hence

\eta = \frac{H(S)}{\bar{L}}

Data compaction:
1. Removal of redundant information prior to transmission.
2. Lossless data compaction: no information is lost.
3. A source code which represents the output of a discrete memoryless source should be uniquely decodable.
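As a quick numerical illustration of these definitions (an added sketch, not part of the original notes), the following Python snippet computes H(S), \bar{L} and \eta for the four-symbol source of Table 1 below, assuming it is encoded with Code II, whose codeword lengths are 1, 2, 3, 3:

from math import log2

# Source of Table 1 (probabilities), encoded with Code II (assumed lengths 1, 2, 3, 3)
p = [0.5, 0.25, 0.125, 0.125]   # symbol probabilities p_k
l = [1, 2, 3, 3]                # codeword lengths l_k for Code II

H = sum(pk * log2(1 / pk) for pk in p)    # entropy H(S) in bits/symbol
L = sum(pk * lk for pk, lk in zip(p, l))  # average codeword length L-bar
eta = H / L                               # coding efficiency

print(f"H(S) = {H:.3f}, L = {L:.3f}, efficiency = {eta:.3f}")
# H(S) = 1.750, L = 1.750, efficiency = 1.000

Because the probabilities are exact powers of 1/2 and Code II matches the lengths -log2 p_k, this source attains the entropy bound exactly.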
Table 1: Illustrating the definition of a prefix code

Symbol   Prob. of occurrence   Code I   Code II   Code III
s_0      0.5                   0        0         0
s_1      0.25                  1        10        01
s_2      0.125                 00       110       011
s_3      0.125                 11       111       0111
(The table is reproduced from S. Haykin's book on Communication Systems.) From Table 1 we see that Code I is not a prefix code, Code II is a prefix code, and Code III is also uniquely decodable but not a prefix code. Prefix codes also satisfy the Kraft-McMillan inequality, which is given by

\sum_{k=0}^{K-1} 2^{-l_k} \leq 1
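As a small check (an added sketch, not from the original notes), the following Python snippet verifies the prefix property and the Kraft-McMillan sum for the three codes of Table 1:

codes = {
    "Code I":   ["0", "1", "00", "11"],
    "Code II":  ["0", "10", "110", "111"],
    "Code III": ["0", "01", "011", "0111"],
}

def is_prefix_code(words):
    # True if no codeword is a prefix of another codeword
    return not any(a != b and b.startswith(a) for a in words for b in words)

for name, words in codes.items():
    kraft = sum(2.0 ** -len(w) for w in words)
    print(f"{name}: prefix = {is_prefix_code(words)}, Kraft sum = {kraft:.3f}")

# Code I:   prefix = False, Kraft sum = 1.500  (violates the inequality)
# Code II:  prefix = True,  Kraft sum = 1.000
# Code III: prefix = False, Kraft sum = 0.938  (uniquely decodable but not prefix-free)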
Figure 1: Decision tree for Code II.

Given a discrete memoryless source of entropy H(S), a prefix code can be constructed with an average codeword length \bar{L} which is bounded as follows:

H(S) \leq \bar{L} < H(S) + 1     (1)

On the left-hand side of the above equation, equality is satisfied when every symbol s_k is emitted with probability

p_k = 2^{-l_k}     (2)
where l_k is the length of the codeword assigned to the symbol s_k. Hence, from Eq. 2, we have

\sum_{k=0}^{K-1} 2^{-l_k} = \sum_{k=0}^{K-1} p_k = 1     (3)
Under this condition, the Kraft-McMillan inequality tells us that a prefix code can be constructed such that the length of the codeword assigned to source symbol s_k is l_k = -\log_2 p_k. Therefore, the average codeword length is given by
\bar{L} = \sum_{k=0}^{K-1} l_k 2^{-l_k}     (4)

and the source entropy is

H(S) = \sum_{k=0}^{K-1} 2^{-l_k} \log_2\!\left(\frac{1}{2^{-l_k}}\right) = \sum_{k=0}^{K-1} l_k 2^{-l_k}     (5)
Hence, from Eqs. 4 and 5, the equality condition on the left side of Eq. 1, \bar{L} = H(S), is satisfied. To prove the inequality on the right side, we proceed as follows. Let \bar{L}_n denote the average codeword length of the extended prefix code, i.e. the prefix code constructed for the n-th extension of the source. For a uniquely decodable code, applying the bound of Eq. 1 to the extended source, whose entropy is nH(S), gives

nH(S) \leq \bar{L}_n < nH(S) + 1

so that H(S) \leq \bar{L}_n / n < H(S) + 1/n. As n grows, the average number of bits per original source symbol, \bar{L}_n / n, can therefore be made arbitrarily close to the entropy H(S).
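To illustrate the bound of Eq. 1 for a source whose probabilities are not powers of 1/2, the following sketch (an added example; the distribution 0.4, 0.3, 0.2, 0.1 is assumed, not taken from the notes) uses codeword lengths l_k = ceil(-log2 p_k), which satisfy the Kraft-McMillan inequality and give H(S) <= L-bar < H(S) + 1:

from math import ceil, log2

def shannon_lengths(p):
    # codeword lengths l_k = ceil(-log2 p_k)
    return [ceil(-log2(pk)) for pk in p]

p = [0.4, 0.3, 0.2, 0.1]                  # assumed example distribution
l = shannon_lengths(p)                    # [2, 2, 3, 4]

H = sum(pk * log2(1 / pk) for pk in p)    # entropy H(S)
L = sum(pk * lk for pk, lk in zip(p, l))  # average codeword length L-bar
kraft = sum(2 ** -lk for lk in l)         # Kraft-McMillan sum

print(f"Kraft sum = {kraft:.3f}")         # 0.688 <= 1
print(f"H(S) = {H:.3f} <= L = {L:.3f} < H(S)+1 = {H + 1:.3f}")
# H(S) = 1.846 <= L = 2.400 < H(S)+1 = 2.846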
Huffman Coding
1. The Huffman code is a prefix code.
2. The length of the codeword for each symbol is roughly equal to the amount of information it conveys.
3. The code need not be unique (see Figure 3).

A Huffman tree is constructed as shown in Figure 3; (a) and (b) represent two forms of Huffman trees. We see that both schemes have the same average length but different variances. The variance is a measure of the variability in the codeword lengths of a source code, and is defined as follows:

\sigma^2 = \sum_{k=0}^{K-1} p_k (l_k - \bar{L})^2     (6)
where p_k is the probability of the k-th symbol, l_k is the codeword length of the k-th symbol, and \bar{L} is the average codeword length. It is reasonable to choose the Huffman tree which gives the smaller variance; a small code sketch follows Figure 3 below.
Figure 3: Two forms of Huffman trees, (a) and (b). For tree (a): average length \bar{L} = 2.2, variance = 0.160, with codewords s_0 -> 10, s_1 -> 00, s_2 -> 01, s_3 -> 110, s_4 -> 111. Tree (b) assigns the same symbols a different set of codewords with the same average length.
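The following Python sketch (an added illustration, not the exact construction of Figure 3) builds a Huffman code with a priority queue and computes the average length and variance. The source probabilities 0.4, 0.2, 0.2, 0.1, 0.1 are assumed, chosen to be consistent with the average length 2.2 and variance 0.160 quoted for tree (a):

import heapq

def huffman_code(probs):
    # Build a Huffman code: repeatedly merge the two least probable nodes.
    # Each heap entry is (probability, tie-break index, {symbol: partial codeword}).
    heap = [(p, i, {s: ""}) for i, (s, p) in enumerate(probs.items())]
    heapq.heapify(heap)
    count = len(heap)
    while len(heap) > 1:
        p1, _, c1 = heapq.heappop(heap)
        p2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + w for s, w in c1.items()}   # left branch gets a 0
        merged.update({s: "1" + w for s, w in c2.items()})  # right branch gets a 1
        heapq.heappush(heap, (p1 + p2, count, merged))
        count += 1
    return heap[0][2]

probs = {"s0": 0.4, "s1": 0.2, "s2": 0.2, "s3": 0.1, "s4": 0.1}  # assumed source
code = huffman_code(probs)
L = sum(p * len(code[s]) for s, p in probs.items())               # average length
var = sum(p * (len(code[s]) - L) ** 2 for s, p in probs.items())  # variance
print(code)
print(f"L = {L:.2f}, variance = {var:.3f}")   # L = 2.20, variance = 0.160 here;
                                              # the exact bit patterns depend on tie-breaking

Different but equally valid Huffman trees arise from how ties between equal probabilities are broken during the merging step, which is exactly why the average length is fixed while the variance can differ.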
Drawbacks:
1. Requires proper source statistics.
2. Cannot exploit relationships between words, phrases, etc.
3. Does not consider the redundancy of the language.
Lempel-Ziv Coding
1. Overcomes the drawbacks of Huffman coding.
2. It is an adaptive and simple encoding scheme.
3. When applied to English text it achieves a compaction of approximately 55%, in contrast to Huffman coding, which achieves only about 43%.
4. Encodes patterns in the text.

The algorithm parses the source data stream into segments that are the shortest subsequences not encountered previously. (The example below is reproduced from S. Haykin's book on Communication Systems; a code sketch follows the codebook table.)
Let the input sequence be 000101110010100101... We assume that 0 and 1 are known and already stored in the codebook.

Subsequences stored: 0, 1
Data to be parsed: 000101110010100101...

The shortest subsequence of the data stream encountered for the first time and not seen before is 00.

Subsequences stored: 0, 1, 00
Data to be parsed: 0101110010100101...

The second shortest subsequence not seen before is 01; accordingly, we go on to write

Subsequences stored: 0, 1, 00, 01
Data to be parsed: 01110010100101...

We continue in this manner until the given data stream has been completely parsed. The codebook is shown below:
Numerical position   Subsequence   Numerical representation   Binary encoded block
1                    0
2                    1
3                    00            11                          0010
4                    01            12                          0011
5                    011           42                          1001
6                    10            21                          0100
7                    010           41                          1000
8                    100           61                          1100
9                    101           62                          1101
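The parsing step can be sketched in a few lines of Python (an added example, not the full encoder; it omits the numerical representations and binary blocks and only reproduces the subsequence column of the codebook above):

def lz_parse(stream, codebook=("0", "1")):
    # Parse the stream into the shortest subsequences not encountered before,
    # starting from a codebook that already contains 0 and 1.
    book = list(codebook)
    phrase = ""
    for bit in stream:
        phrase += bit
        if phrase not in book:
            book.append(phrase)   # store the new shortest unseen subsequence
            phrase = ""
    return book

print(lz_parse("000101110010100101"))
# ['0', '1', '00', '01', '011', '10', '010', '100', '101']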