Huffman Coding Assignment

The document describes implementing Huffman encoding and decoding in MATLAB. It provides background on Huffman coding and how it works. The key steps are: 1) Compute character probabilities, 2) Build a Huffman tree assigning shorter codes to more frequent characters, 3) Use the tree to encode data as binary strings. The MATLAB code takes in character probabilities, builds a Huffman dictionary, encodes a string, and decodes a bitstream using the dictionary. It also calculates the entropy and efficiency of the encoding.

Uploaded by

Mavine


MULTIMEDIA UNIVERSITY OF KENYA

DEPARTMENT OF ELECTRICAL AND COMMUNICATION

ENGINEERING

ECE 2323: INFORMATION THEORY AND CODING

MATLAB ASSIGNMENT 1

Title: Huffman Encoding and Decoding in MATLAB

Course Instructor: Mr. J. K. Makiche

Objective: To implement the Huffman encoding and decoding algorithm in Matlab.

Software requirement: Matlab

Background

Encoding the information before transmission is necessary to ensure data security and efficient
delivery of the information. The MATLAB program presented here encodes and decodes the
information and also outputs the values of entropy, efficiency and frequency probabilities of
characters present in the data stream.

The Huffman algorithm is a popular encoding method used in electronic communication systems. It
is widely used in mainstream compression formats that you might encounter, from GZIP,
PKZIP (WinZip, etc.) and BZIP2 to image formats such as JPEG and PNG. Some programs use
just the Huffman coding method, while others use it as one step in a multistep compression
process.

The Huffman coding and decoding algorithm compresses data using variable-length codes.
The shortest codes are assigned to the most frequent characters and the longest codes are
assigned to infrequent characters.

Huffman coding is an entropy encoding algorithm used for lossless data compression. Entropy is
a measure of the unpredictability of an information stream. Maximum entropy occurs when a
stream of data has totally unpredictable bits. A perfectly consistent stream of bits (all zeroes or
all ones) is totally predictable (has no entropy).
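
As a small illustration of the two extremes described above, the entropy of a two-symbol (binary) source can be evaluated for a few probabilities. This is only a sketch in base MATLAB; the helper name binaryEntropy and the chosen probabilities are assumptions made here and are not part of the assignment code.

```matlab
% Hypothetical helper: entropy (bits/symbol) of a binary source that
% emits '1' with probability q and '0' with probability 1-q.
% Zero-probability terms are dropped, since they contribute nothing.
binaryEntropy = @(q) sum(nonzeros([q, 1-q]) .* log2(1 ./ nonzeros([q, 1-q])));

H_fair   = binaryEntropy(0.5);   % totally unpredictable bits
H_biased = binaryEntropy(0.9);   % mostly ones
H_const  = binaryEntropy(1.0);   % all ones, totally predictable

fprintf('q = 0.5: %.4f bits/symbol\n', H_fair);
fprintf('q = 0.9: %.4f bits/symbol\n', H_biased);
fprintf('q = 1.0: %.4f bits/symbol\n', H_const);
```

The fair source reaches the maximum of 1 bit per symbol, the constant source has zero entropy, and the biased source falls in between.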

The Huffman coding method is somewhat similar to the Shannon–Fano method. The main
difference between the two methods is that Shannon–Fano constructs its codes from top to
bottom (and the bits of each codeword are constructed from left to right), while Huffman
constructs a code tree from the bottom up and the bits of each codeword are constructed from
right to left.


Figure 1: Huffman tree

The model for Huffman tree is shown here in Figure 1. It is generated from the sentence

“this is an example of a huffman tree”

using the Huffman algorithm. Here 36 is the root of the tree; it equals the number of characters
in the sentence, spaces included. Below the root node you can see its two child nodes, 16 and 20.
Adding 16 and 20 gives 36. Adding 8 and 8 gives 16, while 4+4=8. On the
left-hand side, ‘e’ is attached to 4. Similarly, ‘a’, ‘n’, ‘t’, etc. have been attached to form the
complete Huffman tree.

For details on Huffman tree formation, please refer to the ‘Data Compression and Decompression’
software project published in EFY April 2005. The simplest tree construction algorithm uses a
priority queue or table where the node with the lowest probability or frequency is given the
highest priority.

First, create a leaf node for each symbol or character and add it to the priority table. If there is
more than one node in the table, remove two nodes of the highest priority (lowest frequency)
from the table. Create a new node with these two nodes as sub-nodes and with probability equal
to the sum of the two nodes’ probabilities. Continue in this way until you reach the last single
node. The last node is the root, so the tree is now complete.
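
The merging procedure just described can be sketched in base MATLAB as follows. This is a minimal illustration, not the assignment's program: a plain vector that is re-sorted on every pass stands in for the priority table, the variable names are assumptions, and the probabilities are borrowed from the example run in the Testing section. Instead of storing the tree explicitly, the sketch records how deep each leaf ends up, which is its codeword length.

```matlab
% Sketch: codeword lengths obtained by repeatedly merging the two
% lowest-probability nodes (a re-sorted vector acts as the priority table).
p = [0.3 0.25 0.2 0.12 0.08 0.05];   % example probabilities
n = numel(p);

prob    = p;                 % probability of each node still in the table
members = num2cell(1:n);     % leaf symbols found under each node
len     = zeros(1, n);       % depth of each leaf = its codeword length

while numel(prob) > 1
    [~, order] = sort(prob);             % lowest probability first
    a = order(1);
    b = order(2);                        % the two highest-priority nodes
    merged = [members{a}, members{b}];   % leaves under the new parent node
    len(merged) = len(merged) + 1;       % each of them moves one level deeper
    keep    = setdiff(1:numel(prob), [a b]);
    prob    = [prob(keep), prob(a) + prob(b)];
    members = [members(keep), {merged}];
end

disp(len);
fprintf('Average length: %.2f bits/symbol\n', sum(p .* len));
```

For these probabilities the lengths come out as 2, 2, 2, 3, 4 and 4 bits, giving an average of 2.38 bits per symbol.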

Steps to Encode Data using Huffman coding

Step 1. Compute the probability of each character in a set of data.

Step 2. Sort the characters in ascending order of probability.


Step 3. Create a new node where the left sub-node is the lowest frequency in the sorted list and
the right sub-node is the second lowest in the sorted list.

Step 4. Remove these two elements from the sorted list as they are now part of one node and add
the probabilities. The result is the probability for the new node.

Step 5. Insert the new node into the list so that the list stays sorted (insertion sort).

Step 6. Repeat steps 3, 4 and 5 until you have only one node left.

Now that there is one node remaining, simply draw the tree.

With the above tree, place a ‘0’ on each path going to the left and a ‘1’ on each path going to the
right. Now assign the binary code to each symbol or character by reading off the 0’s and 1’s
along the path from the root to that symbol’s leaf.
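
Steps 1 and 2 can be tried on the sentence used for Figure 1. The sketch below uses only base MATLAB functions; the variable names are illustrative and it is not taken from the assignment's program.

```matlab
% Sketch of Steps 1 and 2 for the sentence used in Figure 1.
msg = 'this is an example of a huffman tree';

chars  = unique(msg);                          % distinct characters, space included
counts = arrayfun(@(c) sum(msg == c), chars);  % occurrences of each character
probs  = counts / numel(msg);                  % Step 1: relative frequencies

[probs, idx] = sort(probs);                    % Step 2: ascending order
chars  = chars(idx);
counts = counts(idx);

for k = 1:numel(chars)
    fprintf('''%s''  count = %d  p = %.4f\n', chars(k), counts(k), probs(k));
end
fprintf('Total characters: %d\n', numel(msg));
```

The total of 36 characters matches the value at the root of the tree in Figure 1, and ‘e’ appears 4 times, as shown on the left-hand side of the figure.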

Since efficient priority-queue data structures require O(log n) time per insertion, and a tree with
‘n’ leaves has 2n−1 nodes, this algorithm operates in O(n log n) time, where ‘n’ is the number of
symbols.

From the above, it is clear that the encoding method should give rise to a uniquely
decodable code so that the original message can be recovered uniquely and without
errors. The message with the highest probability is generated more often
than the other messages. In such a case, if you use a variable-length code instead of a
fixed-length code, you improve the efficiency by assigning fewer bits to the
higher-probability messages than to the lower-probability messages.
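
To put a number on this improvement, the sketch below compares a fixed-length code with the variable-length code for the six-symbol source used in the Testing section. The codeword lengths are read off the dictionary printed there; the variable names are assumptions made for this illustration.

```matlab
p   = [0.3 0.25 0.2 0.12 0.08 0.05];   % probabilities from the example run
len = [2 2 2 3 4 4];                   % codeword lengths in the example dictionary

fixedBits    = ceil(log2(numel(p)));   % bits per symbol with a fixed-length code
variableBits = sum(p .* len);          % average bits per symbol with the Huffman code

fprintf('Fixed-length code:    %d bits/symbol\n', fixedBits);
fprintf('Variable-length code: %.2f bits/symbol\n', variableBits);
fprintf('Saving:               %.1f%%\n', 100 * (1 - variableBits / fixedBits));
```

Six symbols need 3 bits each with a fixed-length code, whereas the variable-length code averages 2.38 bits per symbol.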

Matlab Code

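clc;
% Read the source probabilities, e.g. [0.3 0.25 0.2 0.12 0.08 0.05]
p=input('Enter the probabilities:');
n=length(p);
symbols=1:n;
% Build the Huffman dictionary (needs the Communications Toolbox)
[dict,avglen]=huffmandict(symbols,p);
temp=dict;
t=dict(:,2);
% Convert each codeword to text so the dictionary displays neatly
for i=1:size(temp,1)
    temp{i,2}=num2str(temp{i,2});
end
disp('The huffman code dict:');
disp(temp)
% Encode the symbols typed by the user
fprintf('Enter the symbols between 1 to %d in[]',n);
sym=input(':')
encod=huffmanenco(sym,dict);
disp('The encoded output:');
disp(encod);
% Decode a bit stream typed by the user
bits=input('Enter the bit stream in[];');
decod=huffmandeco(bits,dict);
disp('The symbols are:');
disp(decod);

% Entropy of the source in bits per symbol
H=0;
for k=1:n
    H=H+(p(k)*log2(1/p(k)));
end
fprintf(1,'Entropy is %f bits',H);
% Efficiency = entropy / average codeword length
N=H/avglen;
fprintf('\n Efficiency is:%f',N);
% Longest (m) and shortest (s) codeword lengths in the dictionary
l=zeros(1,n);
for r=1:n
    l(r)=length(t{r});
end
m=max(l)
s=min(l)
% The program reports the difference m-s under the label "variance"
v=m-s;
fprintf('the variance is:%d',v);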

How MATLAB Program Works

1. List the source probabilities in decreasing order.
2. Combine the probabilities of the two symbols having the lowest probabilities, and record
the resultant probability; this step is called reduction. Repeat this procedure until
only two probabilities remain.
3. Start encoding with the last reduction, which consists of exactly two probabilities.
Assign ‘0’ as the first digit in the codewords for all the source symbols associated with
the first probability; assign ‘1’ to those associated with the second probability.
4. Now go back and assign ‘0’ and ‘1’ to the second digit for the two probabilities that were
combined in the previous reduction step, retaining all assignments made in Step 3.
5. Keep regressing in this way until the first column is reached.
6. Calculate the entropy. The entropy of the source is the minimum average number of bits
per symbol needed to encode it without loss.
7. Calculate the efficiency, which measures how close the average codeword length of the
generated code comes to the entropy.
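
The reduction and digit-assignment steps above can be sketched in base MATLAB. This is an illustration under stated assumptions, not the internals of huffmandict: rather than regressing through the reductions afterwards, it prepends a digit to every symbol of a group at the moment the group is combined, which yields the same kind of code. The variable names are hypothetical, and the codewords it produces may differ from those huffmandict returns, although the codeword lengths agree for this example.

```matlab
p = [0.3 0.25 0.2 0.12 0.08 0.05];   % probabilities from the example run
n = numel(p);

prob    = p;                   % current column of probabilities
members = num2cell(1:n);       % source symbols behind each probability
code    = repmat({''}, 1, n);  % codewords, built up as character strings

while numel(prob) > 1
    [~, order] = sort(prob, 'descend');   % list in decreasing order
    a = order(end - 1);
    b = order(end);                       % the two lowest probabilities
    for s = members{a}
        code{s} = ['0' code{s}];          % digit for the first group
    end
    for s = members{b}
        code{s} = ['1' code{s}];          % digit for the second group
    end
    keep    = order(1:end-2);
    prob    = [prob(keep), prob(a) + prob(b)];            % reduction
    members = [members(keep), {[members{a}, members{b}]}];
end

for k = 1:n
    fprintf('symbol %d  p = %.2f  code = %s\n', k, p(k), code{k});
end
```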

The entropy for a source with statistically independent symbols:

$$H = -\sum_{k=1}^{M} p(k) \log_2 p(k) \quad \text{bits/symbol}$$


Code efficiency ηcode is defined as

$$\eta_{code} = \frac{H}{H_{max}} \times 100\%$$

For a set of symbols represented by binary codewords with lengths $l_k$ (binary) digits, an
overall code length, $L$, can be defined as the average codeword length, i.e.:

$$L = \sum_{k=1}^{M} p(k)\, l_k$$

The code efficiency ηcode can then be found using

$$\eta_{code} = \frac{H}{L} \times 100\%$$
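
As a check on these formulas, the sketch below evaluates them for the six-symbol example from the Testing section. The codeword lengths are taken from the dictionary printed there, and H_max is assumed here to be log2 of the number of symbols, i.e. the entropy of an equiprobable source; the variable names are illustrative.

```matlab
p  = [0.3 0.25 0.2 0.12 0.08 0.05];   % source probabilities p(k)
lk = [2 2 2 3 4 4];                   % codeword lengths l_k

H    = -sum(p .* log2(p));            % entropy, bits/symbol
Hmax = log2(numel(p));                % assumed H_max: equiprobable symbols
L    = sum(p .* lk);                  % average codeword length

fprintf('H        = %.6f bits/symbol\n', H);
fprintf('L        = %.2f bits/symbol\n', L);
fprintf('H/H_max  = %.2f%%\n', 100 * H / Hmax);
fprintf('H/L      = %.4f%%\n', 100 * H / L);
```

The values H = 2.360147 and H/L = 99.1659% agree with the entropy and efficiency printed by the program in the example run.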

Key MATLAB functions

Huffmanenco. This function is used in Huffman coding. The syntax is:

comp = huffmanenco(sig,dict)

This line encodes the signal ‘sig’ described by the ‘dict’ dictionary. The argument ‘sig’ can have
the form of a numeric vector, numeric cell array or alphanumeric cell array. If ‘sig’ is a cell
array, it must be either a row or a column. The ‘dict’ is an Nx2 cell array, where ‘N’ is the
number of distinct possible symbols to be encoded. The first column of ‘dict’ represents the
distinct symbols and the second column represents the corresponding codewords. Each codeword
is represented as a numeric row vector, and no codeword in ‘dict’ can be the prefix of any other
codeword in ‘dict’. You can generate ‘dict’ using the huffmandict function.
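
The prefix condition mentioned above can be checked directly. The sketch below does so in base MATLAB for the six codewords of the example dictionary shown in the Testing section; the variable names are assumptions, and the loop is a plain pairwise comparison rather than anything huffmanenco itself does.

```matlab
% Codewords of the example dictionary, as numeric row vectors.
code = {[0 0], [0 1], [1 1], [1 0 1], [1 0 0 0], [1 0 0 1]};

isPrefixFree = true;
for i = 1:numel(code)
    for j = 1:numel(code)
        ni = numel(code{i});
        % code{i} is a prefix of code{j} if it matches its first ni digits
        if i ~= j && ni <= numel(code{j}) && isequal(code{i}, code{j}(1:ni))
            isPrefixFree = false;
        end
    end
end

lens     = cellfun(@numel, code);
kraftSum = sum(2 .^ (-lens));    % Kraft sum; at most 1 for any prefix code

fprintf('Prefix-free: %d\n', isPrefixFree);
fprintf('Kraft sum:   %.4f\n', kraftSum);
```

A Kraft sum of exactly 1 indicates that the code tree is complete, with no unused branches.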

Huffmandeco. This function is used in Huffman decoding. The syntax is:

dsig = huffmandeco(comp,dict)

This line decodes the numeric Huffman code vector comp using the code dictionary ‘dict’. The
argument ‘dict’ is an Nx2 cell array, where ‘N’ is the number of distinct possible symbols in the
original signal that was encoded as ‘comp’. The first column of ‘dict’ represents the distinct
symbols and the second column represents the corresponding codewords. Each codeword is
represented as a numeric row vector, and no codeword in ‘dict’ is allowed to be the prefix of any
other codeword in ‘dict’. You can generate ‘dict’ using the huffmandict function and ‘comp’
using the huffmanenco function. If all signal values in ‘dict’ are numeric, ‘dsig’ is a vector; if
any signal value in ‘dict’ is alphabetical, ‘dsig’ is a one-dimensional cell array.
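
A short round trip through the three functions, using text symbols instead of numbers, might look like the sketch below. It needs the Communications Toolbox; the alphabet, probabilities and message are made up for illustration and do not come from the assignment.

```matlab
% Requires the Communications Toolbox (huffmandict, huffmanenco, huffmandeco).
symbols = {'a', 'b', 'c', 'd'};          % illustrative alphabet (assumption)
p       = [0.5 0.25 0.125 0.125];        % illustrative probabilities (assumption)

dict = huffmandict(symbols, p);          % N-by-2 cell array: symbol, codeword
sig  = {'a', 'c', 'a', 'b', 'd', 'a'};   % message as a one-dimensional cell array
comp = huffmanenco(sig, dict);           % numeric vector of 0s and 1s
dsig = huffmandeco(comp, dict);          % cell array, because the symbols are text

fprintf('Encoded length: %d bits\n', numel(comp));
fprintf('Round trip OK:  %d\n', isequal(sig(:), dsig(:)));
```

With these probabilities the codeword lengths are 1, 2, 3 and 3 bits, so the six-symbol message encodes to 11 bits.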


Testing

1. Launch the MATLAB program. The program first generates the dictionary of messages.
These messages are simply codes or bitstreams, from 00 to 1001 in this example. You
can extend this range by changing the source code. The MATLAB program output for
the example is given below:

Enter the probabilities:[0.3 0.25 0.2 0.12 0.08 0.05]
The huffman code dict:
    [1]    '0  0'
    [2]    '0  1'
    [3]    '1  1'
    [4]    '1  0  1'
    [5]    '1  0  0  0'
    [6]    '1  0  0  1'
Enter the symbols between 1 to 6 in[]:[3]
sym =
     3
The encoded output:
     1
     1
Enter the bit stream in[];[1 1]
The symbols are:
     3
Entropy is 2.360147 bits
Efficiency is:0.991659
m =
     4
s =
     2
The variance is:2>>

2. The program then prompts you to enter a symbol number between ‘1’ and ‘6’. When you enter
‘3’, the code ‘1 1’ appears on the screen. This is the codeword assigned to symbol ‘3’ in the
dictionary, so the encoding is successful.
3. For decoding, enter the bitstream ‘1 1’. The output generated is ‘3’.
4. Instead of ‘3’, you can try various combinations of symbols from ‘1’ to ‘6’.

The program outputs the values of maximum length (m) and minimum length (s) generated in
the dictionary. The longest codewords generated are ‘1000’ and ‘1001’, i.e., m=4. The shortest
codewords, such as ‘00’, are two bits long and therefore s=2.

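The program reports the difference between the maximum and minimum codeword lengths as the
‘variance’, which is ‘2’ in this example. Strictly speaking, this quantity is the range of the
codeword lengths. The variance of the codeword length is the probability-weighted mean squared
deviation of the lengths from the average length L, and minimum-variance Huffman coding is a
variant of the algorithm that, among codes with the same average length, prefers the one with
the smallest variance.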
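
For comparison, the sketch below computes both the maximum-minus-minimum figure printed by the program and the probability-weighted variance of the codeword length for the example dictionary. It uses base MATLAB only, and the variable names are assumptions made for this illustration.

```matlab
p  = [0.3 0.25 0.2 0.12 0.08 0.05];   % probabilities from the example run
lk = [2 2 2 3 4 4];                   % codeword lengths in the example dictionary

L         = sum(p .* lk);             % average codeword length
lenRange  = max(lk) - min(lk);        % the figure the program prints
lenVarian = sum(p .* (lk - L).^2);    % variance of the codeword length

fprintf('Range of lengths:    %d\n', lenRange);
fprintf('Variance of lengths: %.4f\n', lenVarian);
```

Here the range is 2 bits, while the variance of the codeword length is about 0.4956.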


Exercise

An analysis of the letters occurring in the words listed in the main entries of the Concise Oxford
Dictionary (9th edition, 1995) was carried out with the aim of determining the frequency of
letters in the English text. The table below was obtained. The third column in the table represents
proportions, taking the least common letter (q) as equal to 1. The letter E is over 56 times more
common than Q in forming individual English words. Using the Matlab code presented in the
preceding descriptions as a guide, develop the Huffman code table for the letters in the English
alphabet. Also determine the coding efficiency for code using your computer program.

Required:
You will be required to submit a fully typed report for this assignment by Friday 21st February
2020 5.00 pm. The report should contain a brief overview of your work, the results obtained in
terms of the source codes you have developed and the results generated from those source codes
for this problem, discussions of those results and conclusion. Plagiarized work will attract
heavy penalties. Note that if two reports are similar, it will be difficult to determine who copied
from who and therefore both will be penalized equally.
