(IJCST-V2I4P27) Author: Ritu, Puneet Sharma

International Journal of Computer Science Trends and Technology (IJCST) Volume 2 Issue 4, Jul-Aug 2014
ISSN: 2347-8578 www.ijcstjournal.org Page 148

A Review of Data Compression Technique
Ritu¹, Puneet Sharma²

Department of Computer Science and Engineering,
Hindu College of Engineering, Sonipat
Deenbandhu Chhotu Ram University, Murthal
Haryana-India

ABSTRACT
With the growth of technology and the arrival of the Digital Age, the world has found itself amid a vast amount of
information. Dealing with such an enormous amount of information often presents difficulties. Digital information must be
stored and retrieved efficiently for it to be put to practical use, and compression is one way to address this problem.
Images require substantial storage and transmission resources because of the large amount of data associated with them,
so image compression is a key technology for the storage and transmission of digital images. This paper reviews various
image compression techniques, which fall into two types: lossless and lossy.
Keywords:- Image Compression, JPEG, Discrete Wavelet Transform.

I. INTRODUCTION
Image compression is important for many applications
that involve the storage, transmission, and retrieval of large
amounts of data, such as multimedia, documents,
videoconferencing, and medical imaging. Uncompressed images
require considerable storage capacity and transmission bandwidth.
The objective of image compression is to reduce the
redundancy of the image data so that the data can be stored or
transmitted in an efficient form. This reduces the file size
and allows more images to be stored in a given amount of disk
or memory space.
II. IMAGE COMPRESSION
Image compression addresses the problem of reducing the
amount of data required to represent a digital image. It is a
process intended to yield a compact representation of an
image, thereby reducing the image storage/ transmission
requirements. Compression is achieved by the removal of
one or more of the three basic data redundancies:
1) Coding redundancy
2) Interpixel redundancy
3) Psychovisual redundancy
Coding redundancy is present when less than optimal code
words are used. Interpixel redundancy results from
correlations between the pixels of an image. Psychovisual
redundancy is due to data that is ignored by the human
visual system (i.e. visually non-essential information). Image
compression techniques reduce the number of bits required
to represent an image by taking advantage of these
redundancies. An inverse process called decompression
(decoding) is applied to the compressed data to get the
reconstructed image. The objective of compression is to
reduce the number of bits as much as possible, while
keeping the resolution and the visual quality of the
reconstructed image as close to the original image as
possible.
Image compression may be lossy or lossless. Lossless
compression is preferred for archival purposes and often for
medical imaging, technical drawings, clip art, or comics.
Lossy compression methods, especially when used at low bit
rates, introduce compression artifacts. Lossy methods are
especially suitable for natural images such as photographs in
applications where minor (sometimes imperceptible) loss of
fidelity is acceptable to achieve a substantial reduction in bit
rate. The lossy compression that produces imperceptible
differences may be called visually lossless.
A. Lossless Compression
Lossless compression algorithms reduce file size with no
loss in image quality. The file is compressed when it is saved,
and the original data is recovered exactly when it is
decompressed (opened). Redundant data is removed only
temporarily, so that the file can be stored or transferred
more compactly.
This type of compression can be applied not just to graphics
but to any kind of computer data, such as spreadsheets, text
documents, and software applications. If you need to send
a file as an email attachment, it is often best to
compress it first. Lossless image compression is
particularly useful in image archiving, as in the storage of
legal or medical records. Methods for lossless image
compression include entropy coding, Huffman coding,
bit-plane coding, run-length coding, and LZW (Lempel-Ziv-Welch)
coding.
The advantage of lossless compression is that it maintains
quality; the main disadvantage is that it does not reduce the
file size as much as lossy compression.
B. Lossy Compression
Lossy compression also looks for 'redundant' pixel
information, but it permanently discards it. This
means that when the file is decompressed, the original data
is not retrieved: the original signal cannot be exactly
reconstructed from the compressed data. This is acceptable
because much of the detail in an image can be
discarded without greatly changing the appearance of the
image. Lossy compression is not used for data such as
text-based documents and software, since they need to keep all
their information. Lossy image compression is useful in
applications such as broadcast television,
videoconferencing, and facsimile transmission, in which a
certain amount of error is an acceptable trade-off for
increased compression performance. Methods for lossy
compression include fractal compression, transform
coding, Fourier-related transforms, the DCT (Discrete Cosine
Transform), and the wavelet transform.
C. Benefits of Compression
1) It provides potential cost savings when sending less data
over the switched telephone network, where the cost of a call
is usually based on its duration.
2) It reduces not only storage requirements but also
overall execution time.
3) It reduces the probability of transmission errors,
since fewer bits are transferred.
4) It provides a level of security against illicit
monitoring.
[Figure: block diagram of the compression model: Mapper, Quantizer, Symbol Coder, Symbol Decoder, Inverse Mapper]
III. IMAGE COMPRESSION TECHNIQUES
A. Lossless Compression Techniques
1) Run length encoding
2) Entropy encoding
3) Huffman encoding
4) Arithmetic coding
5) LZW coding
1) Run Length Encoding: Run-length encoding (RLE) is a very
simple lossless compression method for sequential data, and it
is most useful when the data is repetitive. The technique
replaces sequences of identical symbols (pixels), called runs,
with shorter codes: a run, in which the same data value occurs
in many consecutive data elements, is stored as a single data
value and a count rather than as the original run. The
run-length code for a grayscale image is a sequence of pairs
{Vi, Ri}, where Vi is the intensity of a pixel and Ri is the
number of consecutive pixels with intensity Vi, as shown in the
figure. If both Vi and Ri are represented by one byte, this
span of 12 pixels is coded using eight bytes (four runs),
yielding a compression ratio of 12:8, or 1.5:1.
RLE may also refer to an early graphics file
format. RLE does not work well on continuous-tone
images such as photographs, although JPEG uses it quite
effectively on the coefficients that remain after transforming
and quantizing image blocks. The technique relies on the
repetition of colors in an image. If you read the image from
the top left, one row of pixels at a time, you often get
runs of pixels of the same color. If a run contains three or
more pixels, storing the length of the run and the color is
more efficient than storing each pixel.
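The {Vi, Ri} representation described above can be sketched in a few lines of Python. This is a minimal illustration, not the format of any particular file type: the function names and the 12-pixel sample row are hypothetical, and run lengths are capped at 255 on the assumption that each count must fit in one byte.

```python
def rle_encode(pixels):
    """Encode pixel intensities as (value, run_length) pairs."""
    runs = []
    for value in pixels:
        # Extend the current run if the value repeats and the count still fits in a byte.
        if runs and runs[-1][0] == value and runs[-1][1] < 255:
            runs[-1][1] += 1
        else:
            runs.append([value, 1])
    return [(value, length) for value, length in runs]


def rle_decode(runs):
    """Expand (value, run_length) pairs back into the original pixel row."""
    pixels = []
    for value, length in runs:
        pixels.extend([value] * length)
    return pixels


# Hypothetical 12-pixel row made of four runs.
row = [82] * 4 + [83] * 3 + [91] * 3 + [95] * 2
runs = rle_encode(row)
encoded_bytes = 2 * len(runs)  # one byte for Vi plus one byte for Ri
print(runs)
print("compression ratio:", len(row) / encoded_bytes)
```

With four runs the 12 input bytes become 8 output bytes, which matches the 1.5:1 ratio discussed above. A row with no repeated neighbors would instead double in size, which is why RLE suits images with large uniform areas.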

2) Entropy encoding: An entropy encoding is a coding
scheme that assigns codes to symbols so as to
match code lengths with the probabilities of the symbols.
Typically, entropy encoders compress data by
replacing symbols represented by equal-length codes with
codes whose lengths are proportional to the negative
logarithm of the symbol's probability. Therefore, the most
common symbols use the shortest codes.
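The relationship between probability and code length can be made concrete with a small calculation. The sketch below estimates each symbol's probability from its frequency in a sample and prints the ideal code length, the negative base-2 logarithm of that probability; the function names and the sample string are illustrative assumptions.

```python
import math
from collections import Counter


def ideal_code_lengths(symbols):
    """Ideal code length in bits for each symbol: -log2(probability)."""
    counts = Counter(symbols)
    total = len(symbols)
    return {s: -math.log2(c / total) for s, c in counts.items()}


def entropy_bits_per_symbol(symbols):
    """Average ideal code length, i.e. the Shannon entropy of the sample."""
    counts = Counter(symbols)
    total = len(symbols)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


sample = "aaaabbcd"  # 'a' has probability 1/2, 'b' 1/4, 'c' and 'd' 1/8 each
print(ideal_code_lengths(sample))
print(entropy_bits_per_symbol(sample))  # 1.75 bits, versus 2 bits for a fixed-length code
```

The most frequent symbol gets a 1-bit code and the rarest ones get 3 bits, so the average drops below the 2 bits per symbol that a fixed-length code would need for four symbols.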

3) Huffman coding: This is a general technique for
coding symbols based on their statistical occurrence
frequencies (probabilities). Huffman's algorithm
generates minimum-redundancy prefix codes. Huffman coding
has been used effectively in text, image, and video
compression and in conferencing systems, for example in
JPEG, MPEG-2, MPEG-4, and H.263. The
pixels in the image are treated as symbols.
The Huffman coding technique collects the unique symbols
from the source image, calculates the probability
of each symbol, and sorts the symbols by
probability. Then, starting from the lowest-probability
symbols, the two least probable entries are combined at each
step into a new node, until a single binary tree remains.
The left branch of every node is labeled zero and the right
branch one, starting from the root of the tree. The Huffman
code for a particular symbol is the sequence of zeros and ones
collected along the path from the root to that symbol's leaf.
Most image coding standards use lossy techniques in the
earlier stages of compression and use Huffman coding as the
final step.
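The tree construction described above can be sketched with Python's standard `heapq` module. This is an illustrative implementation rather than the table format of any specific standard; the function name, the tuple-based tree representation, and the sample pixel values are assumptions made for the example.

```python
import heapq
from collections import Counter


def huffman_codes(symbols):
    """Build a Huffman code table {symbol: bit string} from a symbol sequence."""
    counts = Counter(symbols)
    if len(counts) == 1:
        # A single distinct symbol still needs a one-bit code.
        return {next(iter(counts)): "0"}
    # Heap entries are (weight, tie_breaker, tree); the tie breaker keeps
    # comparisons away from the tree itself.
    heap = [(w, i, ("leaf", s)) for i, (s, w) in enumerate(counts.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        # Combine the two least probable entries into one node.
        w1, _, t1 = heapq.heappop(heap)
        w2, _, t2 = heapq.heappop(heap)
        heapq.heappush(heap, (w1 + w2, next_id, ("node", t1, t2)))
        next_id += 1
    codes = {}

    def walk(tree, prefix):
        if tree[0] == "leaf":
            codes[tree[1]] = prefix
        else:
            walk(tree[1], prefix + "0")  # left branch is labeled 0
            walk(tree[2], prefix + "1")  # right branch is labeled 1

    walk(heap[0][2], "")
    return codes


# Hypothetical pixel intensities treated as symbols.
pixels = [10, 10, 10, 10, 20, 20, 30, 40]
table = huffman_codes(pixels)
encoded = "".join(table[p] for p in pixels)
print(table)
print(len(encoded), "bits instead of", 2 * len(pixels), "with a 2-bit fixed code")
```

For this sample the most frequent intensity receives a 1-bit code and the two rarest receive 3-bit codes, so the eight pixels need 14 bits in total.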
4) Arithmetic Encoding: Arithmetic coding (AC) is a powerful
technique for statistical lossless encoding that has attracted
much attention in recent years. It provides more
flexibility and better efficiency than Huffman
coding. The aim of AC is to define a method that
provides code words with an ideal length. Like every
other entropy coder, it needs to know the probability
of occurrence of the individual symbols.
AC is among the most efficient methods for coding symbols
according to the probability of their occurrence. The average
code length is very close to the possible minimum given by
information theory.
AC assigns to each symbol an interval whose size
reflects the probability of occurrence of that symbol.
A message is coded by nesting these intervals: each successive
symbol narrows the current interval in proportion to its
probability. The code word is an arbitrary rational number
belonging to the resulting interval.
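The interval assignment can be illustrated with exact rational arithmetic. The following sketch uses Python's `fractions.Fraction` to avoid rounding issues; it is a teaching example under the assumption that the symbol probabilities and the message length are known to both sides, not a practical bit-level coder. The function names and the probability table are hypothetical.

```python
from fractions import Fraction


def build_intervals(probabilities):
    """Give each symbol a slice [low, high) of [0, 1) sized by its probability."""
    intervals = {}
    low = Fraction(0)
    for symbol, p in probabilities.items():
        intervals[symbol] = (low, low + p)
        low += p
    return intervals


def arithmetic_encode(message, probabilities):
    """Return the final interval [low, high); any number inside it codes the message."""
    intervals = build_intervals(probabilities)
    low, width = Fraction(0), Fraction(1)
    for symbol in message:
        s_low, s_high = intervals[symbol]
        # Narrow the current interval to the sub-slice belonging to this symbol.
        low, width = low + width * s_low, width * (s_high - s_low)
    return low, low + width


def arithmetic_decode(value, length, probabilities):
    """Recover `length` symbols from a number inside the coded interval."""
    intervals = build_intervals(probabilities)
    decoded = []
    for _ in range(length):
        for symbol, (s_low, s_high) in intervals.items():
            if s_low <= value < s_high:
                decoded.append(symbol)
                # Rescale so the next symbol is again read from [0, 1).
                value = (value - s_low) / (s_high - s_low)
                break
    return decoded


probs = {"a": Fraction(1, 2), "b": Fraction(1, 4), "c": Fraction(1, 4)}
low, high = arithmetic_encode("abac", probs)
print(low, high)  # 19/64 5/16
print(arithmetic_decode(low, 4, probs))
```

The width of the final interval is the product of the symbol probabilities (here 1/64), so a likely message ends in a wide interval that can be identified with few bits.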
5) LZW Coding: LZW (Lempel-Ziv-Welch) is a
dictionary-based coding technique. The LZW algorithm exploits
the repeated occurrence of character sequences in the
string to be encoded. Its principle consists in substituting
patterns with an index code, by progressively building a
dictionary. The dictionary is initialized with the 256 possible
single-byte values. The file to be compressed is split into
strings of bytes (so for monochrome images coded on 1 bit per
pixel this compression is not very effective); each of these
strings is compared with the dictionary and is added if it is
not found there.
In the encoding process, the algorithm goes over the stream
of information, extending the current string for as long as
it is found in the dictionary; when the extended string is no
longer in the dictionary, the code of the longest matching
string is transmitted and the extended string is added to the
dictionary.
In the decoding process, the algorithm rebuilds the same
dictionary from the received codes; the dictionary thus does
not need to be stored or transmitted.
Dictionary-based coding can be static or dynamic. In static
dictionary coding, the dictionary is fixed during the encoding
and decoding processes. In dynamic dictionary coding, the
dictionary is updated on the fly. LZW is widely used in the
computer industry and is implemented as the compress
command on UNIX.
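The encoding and decoding loops can be sketched as follows. This is a minimal illustration that emits a list of integer codes and lets the dictionary grow without limit; real implementations such as the UNIX compress command also pack the codes into variable-width bit fields and bound the dictionary size, which is omitted here. The function names are assumptions for the example.

```python
def lzw_encode(data: bytes) -> list:
    """Encode bytes as a list of dictionary indices."""
    dictionary = {bytes([i]): i for i in range(256)}  # the 256 single-byte values
    current = b""
    codes = []
    for byte in data:
        candidate = current + bytes([byte])
        if candidate in dictionary:
            current = candidate  # keep extending the match
        else:
            codes.append(dictionary[current])       # emit the longest match
            dictionary[candidate] = len(dictionary)  # learn the new string
            current = bytes([byte])
    if current:
        codes.append(dictionary[current])
    return codes


def lzw_decode(codes: list) -> bytes:
    """Rebuild the original bytes, reconstructing the dictionary on the way."""
    if not codes:
        return b""
    dictionary = {i: bytes([i]) for i in range(256)}
    previous = dictionary[codes[0]]
    output = [previous]
    for code in codes[1:]:
        if code in dictionary:
            entry = dictionary[code]
        elif code == len(dictionary):
            # The code refers to the string the encoder had only just added.
            entry = previous + previous[:1]
        else:
            raise ValueError("invalid LZW code")
        output.append(entry)
        dictionary[len(dictionary)] = previous + entry[:1]
        previous = entry
    return b"".join(output)


data = b"ABABABABAB"
codes = lzw_encode(data)
print(codes)  # [65, 66, 256, 258, 257, 66]
print(lzw_decode(codes) == data)
```

Ten input bytes become six codes, and the decoder recovers the input without ever receiving the dictionary, as the text describes.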
B. Lossy Coding Techniques
1) Transform coding
2) DCT
3) Fractal Compression
4) DWT

1) Transform Coding: Transform coding algorithms
usually start by partitioning the original image into
subimages (blocks) of small size (usually 8 x 8). For each block
the transform coefficients are calculated, effectively
converting the original 8 x 8 array of pixel values into an
array of coefficients, in which the coefficients closer to the
top-left corner usually contain most of the information needed
to quantize and encode the image with little perceptual
distortion. The resulting coefficients are then quantized, and
the output of the quantizer is used by one or more symbol
encoding techniques to produce the output bit stream
representing the encoded image.
At the decoder's side, the reverse process takes place, with
the obvious difference that the dequantization stage will
only generate an approximated version of the original
coefficient values; in other words, whatever loss is
introduced by the quantizer in the encoder stage is not
reversible.
2) DCT: The DCT is used in many image processing and digital
signal processing (DSP) applications, such as video
conferencing. In data compression, the DCT serves as the
transformation step. The DCT is an orthogonal transform with
a fixed set of basis functions. It maps an image from the
spatial domain into the frequency domain.
The DCT is applied to blocks of 8 x 8 or 16 x 16
pixels, converting each block into a series of coefficients
that define the spectral composition of the block. The
transformer converts the input data into a format that reduces
the interpixel redundancies in the input image.
Transform coding techniques use a reversible, linear
mathematical transform to map the pixel values onto a set of
coefficients, which are then quantized and encoded. The key
factor behind the success of transform-based coding
schemes is that many of the resulting coefficients for most
natural images have small magnitudes and can be quantized
without causing significant distortion in the decoded image.
The DCT attempts to decorrelate the image data; after
decorrelation, each transform coefficient can be encoded
independently without losing compression efficiency.
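The transform, quantize, and reconstruct sequence for one 8 x 8 block can be sketched in plain Python. This is a direct, unoptimized form of the orthonormal 2D DCT-II written for readability; the uniform quantization step of 16 and the smooth gradient test block are assumptions chosen for illustration and are not taken from any standard's quantization table.

```python
import math

N = 8  # block size used in the text (8 x 8)


def _alpha(k):
    # Scaling factors that make the transform orthonormal.
    return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)


def _basis(n, k):
    # Cosine basis function k sampled at position n.
    return math.cos((2 * n + 1) * k * math.pi / (2 * N))


def dct_2d(block):
    """Forward 2D DCT-II of an N x N block."""
    return [[_alpha(u) * _alpha(v) * sum(block[x][y] * _basis(x, u) * _basis(y, v)
                                          for x in range(N) for y in range(N))
             for v in range(N)] for u in range(N)]


def idct_2d(coeffs):
    """Inverse transform: rebuild the N x N block from its coefficients."""
    return [[sum(_alpha(u) * _alpha(v) * coeffs[u][v] * _basis(x, u) * _basis(y, v)
                 for u in range(N) for v in range(N))
             for y in range(N)] for x in range(N)]


def quantize(coeffs, step):
    """Uniform quantization: this is the step where information is lost."""
    return [[round(c / step) for c in row] for row in coeffs]


def dequantize(levels, step):
    return [[q * step for q in row] for row in levels]


# Hypothetical smooth block, similar to a gentle gradient in a natural image.
block = [[100 + 2 * x + 3 * y for y in range(N)] for x in range(N)]
levels = quantize(dct_2d(block), step=16)
nonzero = sum(1 for row in levels for q in row if q != 0)
restored = idct_2d(dequantize(levels, step=16))
max_error = max(abs(block[x][y] - restored[x][y]) for x in range(N) for y in range(N))
print("nonzero quantized coefficients:", nonzero, "of", N * N)
print("largest pixel error after reconstruction:", round(max_error, 2))
```

For a smooth block like this one, only a handful of coefficients near the top-left corner survive quantization, and the remaining zeros are what the later entropy coding stage compresses well.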
3) Fractal Compression: The fractal compression
technique relies on the fact that in certain images, parts of
the image resemble other parts of the same image. Fractal
algorithms convert these parts, or more precisely, geometric
shapes into mathematical data called "fractal codes" which
are used to recreate the encoded image. Once an image has
been converted into fractal code its relationship to a specific
resolution has been lost; it becomes resolution independent.
The image can be recreated to fill any screen size without
the introduction of image artifacts or loss of sharpness that
occurs in pixel-based compression schemes.
4) DWT: The DWT represents an image as a sum of
wavelet functions, known as wavelets, with different
locations and scales.
IV. DISCRETE WAVELET TRANSFORM (DWT)
The DWT decomposes the image data into a set of high-pass
(detail) and low-pass (approximation) coefficients.
The image is first divided into blocks of 32 x 32 pixels. Each
block is then passed through the two filters: the first-level
decomposition splits the input data
into approximation and detail coefficients. After
obtaining the transformed matrix, the detail and approximation
coefficients are separated as LL, HL, LH, and HH
coefficients. All the coefficients are discarded except the LL
coefficients, which are transformed again at the second level. The
coefficients are then scaled by a constant factor
to achieve the desired compression ratio.
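A one-level decomposition into LL, HL, LH, and HH subbands can be sketched with the Haar wavelet, the simplest wavelet filter pair. The choice of Haar, the averaging normalization (dividing by 2 rather than by the square root of 2), and the mapping of subband names to row and column filtering order are all assumptions made for this illustration; the text does not specify which wavelet is used.

```python
def haar_1d(values):
    """One Haar step: pairwise averages (low pass) and differences (high pass)."""
    low = [(values[i] + values[i + 1]) / 2 for i in range(0, len(values), 2)]
    high = [(values[i] - values[i + 1]) / 2 for i in range(0, len(values), 2)]
    return low, high


def transpose(matrix):
    return [list(column) for column in zip(*matrix)]


def haar_2d(block):
    """One-level 2D Haar DWT of a block with even side lengths."""
    row_low, row_high = [], []
    for row in block:  # filter along each row first
        low, high = haar_1d(row)
        row_low.append(low)
        row_high.append(high)

    def filter_columns(half):
        lows, highs = [], []
        for column in transpose(half):
            low, high = haar_1d(column)
            lows.append(low)
            highs.append(high)
        return transpose(lows), transpose(highs)

    ll, lh = filter_columns(row_low)   # low-pass rows, then low/high columns
    hl, hh = filter_columns(row_high)  # high-pass rows, then low/high columns
    return ll, hl, lh, hh


# Hypothetical 4 x 4 block made of four flat 2 x 2 regions.
block = [
    [10, 10, 20, 20],
    [10, 10, 20, 20],
    [30, 30, 40, 40],
    [30, 30, 40, 40],
]
ll, hl, lh, hh = haar_2d(block)
print("LL:", ll)  # a half-resolution approximation of the block
print("HH:", hh)  # all zeros here, because each 2 x 2 region is flat

# Second level: only the LL subband is transformed again, as described above.
ll2, _, _, _ = haar_2d(ll)
print("LL (level 2):", ll2)
```

Because the detail subbands of smooth regions are zero or close to zero, keeping only LL (or thresholding the detail coefficients) discards little visible information.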

Fig. 1 Example of wavelet decomposition
Compression Steps:
The steps needed to compress an image are as follows:
1) Digitize the source image into a signal s, which is a
string of numbers.
2) Decompose the signal into a sequence of wavelet
coefficients w.
3) Use thresholding to modify the wavelet coefficients
from w to another sequence w'.
4) Use quantization to convert w' to a sequence q.
5) Apply entropy coding to compress q into a
sequence e.

Fig. 2 Example of Image Compression and decompression

A. Digitization
The first step in the wavelet compression process is to
digitize the image. The digitized image can be characterized
by its intensity levels, or shades of gray, which range from 0
(black) to 255 (white), and by its resolution, that is, how many
pixels it has per square inch. Each of the bits used to represent
an image costs both time and money to store and transmit, so a
tradeoff must be made.
B. Thresholding
In certain signals, many of the wavelet coefficients are close
or equal to zero. Through a method called thresholding,
these coefficients may be modified so that the sequence of
wavelet coefficients contains long strings of zeros. Through
a type of compression known as entropy coding, these long
strings may be stored and sent electronically in much less
space.
There are different types of thresholding. In hard
thresholding, a tolerance is selected. Any wavelet coefficient
whose absolute value falls below the tolerance is set to zero,
with the goal of introducing many zeros without losing a great
amount of detail. There is no straightforward way to
choose the threshold, but the larger the threshold,
the more error is introduced into the process.
Another type of thresholding is soft thresholding. Once
again a tolerance, h, is selected. If the absolute value of an
entry is less than the tolerance, then that entry is set to zero.
All other entries, d, are replaced with sign(d)(|d| - h).
Soft thresholding can be thought of as a translation of the
signal toward zero by the amount h. A third type of
thresholding is quantile thresholding. In this method a
percentage p of entries to be eliminated is selected. The
smallest (in absolute value) p percent of entries are set to
zero.
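The three thresholding rules can be written as short functions, shown here as a sketch. The function names and the sample coefficient sequence are hypothetical, and the quantile version rounds the number of eliminated entries down, which is an assumption the text does not address.

```python
import math


def hard_threshold(coeffs, h):
    """Set every coefficient whose absolute value is below h to zero."""
    return [0.0 if abs(d) < h else d for d in coeffs]


def soft_threshold(coeffs, h):
    """Zero small coefficients and shrink the rest toward zero by h."""
    return [0.0 if abs(d) < h else math.copysign(abs(d) - h, d) for d in coeffs]


def quantile_threshold(coeffs, p):
    """Set the smallest (in absolute value) p percent of entries to zero."""
    k = int(len(coeffs) * p / 100)  # number of entries to eliminate, rounded down
    order = sorted(range(len(coeffs)), key=lambda i: abs(coeffs[i]))
    drop = set(order[:k])
    return [0.0 if i in drop else d for i, d in enumerate(coeffs)]


# Hypothetical wavelet coefficients w.
w = [5.0, -0.2, 0.1, -3.0, 0.4, 2.0, -0.05, 0.3]
print(hard_threshold(w, 0.5))       # [5.0, 0.0, 0.0, -3.0, 0.0, 2.0, 0.0, 0.0]
print(soft_threshold(w, 0.5))       # [4.5, 0.0, 0.0, -2.5, 0.0, 1.5, 0.0, 0.0]
print(quantile_threshold(w, 50))    # [5.0, 0.0, 0.0, -3.0, 0.4, 2.0, 0.0, 0.0]
```

Hard thresholding leaves the surviving coefficients untouched, soft thresholding also shrinks them by h, and quantile thresholding fixes the number of zeros in advance instead of fixing the tolerance.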
C. Entropy Coding
Wavelets and thresholding help process the signal, but up
to this point no compression has occurred. One
method to compress the data is Huffman entropy coding.
With this method, an integer sequence, q, is changed into a
shorter sequence, e, with the numbers in e being 8-bit
integers.
The conversion is made by an entropy coding table.
Strings of zeros are coded by the numbers 1 through 100,
105, and 106, while the non-zero integers in q are coded by
101 through 104 and 107 through 254. In Huffman entropy
coding, the idea is to use two or three numbers for coding,
with the first being a signal that a large number or long zero
sequence is coming. Entropy coding is designed so that the
numbers that are expected to appear most often in q
need the least amount of space in e.
D. Quantization
The fourth step of the process, known as quantization,
converts a sequence of floating-point numbers w' to a sequence of
integers q. The simplest form is to round to the nearest
integer. Another option is to multiply each number in w' by
a constant k, and then round to the nearest integer.
Quantization is called lossy because it introduces error
into the process, since the conversion of w' to q is not a
one-to-one function.
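Both quantization options, and the reason the mapping is not one-to-one, can be seen in a short sketch. The function names and the sample values are hypothetical. Note that Python's built-in `round` resolves exact .5 ties to the nearest even integer, so the sample values deliberately avoid ties.

```python
def quantize(w_prime, k=1.0):
    """Multiply each value by k and round to the nearest integer."""
    return [round(k * x) for x in w_prime]


def dequantize(q, k=1.0):
    """Approximate inverse: the rounding error cannot be undone."""
    return [v / k for v in q]


w_prime = [4.2, 3.9, -2.6, 0.3]
print(quantize(w_prime))         # [4, 4, -3, 0]
print(quantize(w_prime, k=10))   # [42, 39, -26, 3]
print(dequantize(quantize(w_prime)))
```

With k = 1, the distinct inputs 4.2 and 3.9 both map to 4, so the original values cannot be recovered. A larger k keeps more precision at the cost of larger integers for the entropy coder to handle.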
V. CONCLUSION AND FUTURE SCOPE

This paper presents various types of image compression
techniques. There are basically two types of compression
techniques: lossless compression and lossy
compression. Comparing the performance of
compression techniques is difficult unless identical data sets
and performance measures are used. Some of these
techniques work well for certain applications, such as
security technologies. Some techniques perform well for
certain classes of data and poorly for others. From the
techniques studied and discussed here, we find that lossy
compression techniques provide a higher compression
ratio than lossless compression schemes.
Lossy compression is used when a higher compression
ratio is needed, and lossless compression is used when the
original image and the reconstructed image must be identical.
Wavelets can be used effectively for image compression. Our
future work involves improving image quality by increasing
the PSNR value and lowering the MSE value.
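For reference, the two quality measures mentioned above are commonly computed as in the sketch below. It uses the standard definitions, MSE as the mean of squared pixel differences and PSNR = 10 log10(MAX^2 / MSE), with MAX = 255 assumed for 8-bit grayscale images; the function names and the sample pixel values are hypothetical.

```python
import math


def mse(original, reconstructed):
    """Mean squared error between two equally sized pixel sequences."""
    return sum((a - b) ** 2 for a, b in zip(original, reconstructed)) / len(original)


def psnr(original, reconstructed, max_value=255):
    """Peak signal-to-noise ratio in decibels; infinite for identical images."""
    error = mse(original, reconstructed)
    if error == 0:
        return math.inf
    return 10 * math.log10(max_value ** 2 / error)


original = [100, 100, 100, 100]
reconstructed = [101, 99, 100, 100]
print(mse(original, reconstructed))   # 0.5
print(psnr(original, reconstructed))
```

A lower MSE always corresponds to a higher PSNR, so the two goals stated above describe the same improvement from two directions.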

