Image Compression Using DCT
In general, the transform of an M×N image can be written as:

$$T(u,v) = \sum_{x=0}^{M-1} \sum_{y=0}^{N-1} f(x,y)\, r(x,y,u,v) \tag{1}$$

Here, f(x,y) is the image in the spatial domain and r(x,y,u,v) is called the forward transformation kernel.
The original image can be obtained from the transformed image using the relation:
$$f(x,y) = \sum_{u=0}^{M-1} \sum_{v=0}^{N-1} T(u,v)\, s(x,y,u,v) \tag{2}$$

where s(x,y,u,v) is called the inverse transformation kernel.
When the transformation kernel is separable and symmetric, the transform can also be expressed in matrix form:

$$T = R_1\, f\, R_2 \tag{3}$$

where f (M×N) is the image matrix and R1 (M×M) and R2 (N×N) together represent the transformation kernel.
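To make the matrix form concrete, here is a minimal NumPy sketch (my own illustration, not code from this tutorial; the helper name dct_matrix is mine) that builds the orthonormal DCT matrix and applies equation (3). For the DCT, R1 is the DCT matrix and R2 is its transpose:

```python
import numpy as np

def dct_matrix(M):
    """Build the M x M orthonormal DCT-II transformation matrix."""
    C = np.zeros((M, M))
    for u in range(M):
        alpha = np.sqrt(1.0 / M) if u == 0 else np.sqrt(2.0 / M)
        for x in range(M):
            C[u, x] = alpha * np.cos((2 * x + 1) * u * np.pi / (2 * M))
    return C

# Equation (3): T = R1 @ f @ R2, with R1 the DCT matrix
# and R2 the transpose of the (here equal-sized) DCT matrix.
f = np.random.rand(8, 8)      # stand-in for an image block
R1 = dct_matrix(8)
R2 = dct_matrix(8).T
T = R1 @ f @ R2               # forward transform
f_rec = R1.T @ T @ R2.T       # inverse: the DCT matrix is orthogonal
assert np.allclose(f, f_rec)
```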
As an example, the two-dimensional DFT is obtained with the kernel r(x,y,u,v) = e^{-j2π(ux/M + vy/N)}:

$$F(u,v) = \sum_{x=0}^{M-1} \sum_{y=0}^{N-1} f(x,y)\, e^{-j2\pi\left(\frac{ux}{M} + \frac{vy}{N}\right)} \tag{4}$$
Figure 1(a): Passing an image through an LPF (more information in the low-frequency region)
Figure 1(b): Passing an image through an HPF (less information in the high-frequency region)
From figures 1(a) and 1(b) it can be seen that the low-frequency components of the signal carry more information than the high-frequency components. The LPF output is intelligible, whereas the HPF output contains only some fine details of the image.
Here I have shown only the example of the 2D DFT, but in fact any transformation exhibits some energy compaction in the transformed domain. We shall discuss other transforms and their performance in terms of energy compaction and other affecting factors in a later part of this tutorial. Now we will try to understand what image compression actually is and why it is required.
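As a quick illustration of energy compaction (my own sketch, not part of the original tutorial; a smooth synthetic image stands in for a natural photograph), the following snippet measures how much of an image's spectral energy sits in a small low-frequency block of its 2D DFT:

```python
import numpy as np

# Natural images are dominated by low frequencies; a smooth synthetic
# image serves as a stand-in here.
x = np.linspace(0, 1, 256)
img = np.outer(np.sin(2 * np.pi * x), np.cos(2 * np.pi * x)) + x[None, :]

F = np.fft.fftshift(np.fft.fft2(img))   # 2D DFT, DC moved to the center
energy = np.abs(F) ** 2

# Fraction of total energy inside the central 32x32 low-frequency block
c = 128
low = energy[c - 16:c + 16, c - 16:c + 16].sum()
print(f"Low-frequency energy fraction: {low / energy.sum():.4f}")
```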
In practice, however, the sizes of the pictures taken by such cameras are in the approximate range of 1.5 MB to 4 MB. This is possible only because of image compression.
Types of Compression
Compression algorithms are characterized by how well they preserve information. There are three types of compression:
1. Lossless or Information-Preserving: No loss of information (text, legal, or medical applications). However, lossless compression provides only a modest amount of compression.
2. Lossy Compression: Sacrifices some information for better compression (web images). This tutorial mainly deals with this type.
3. Near-Lossless: No (or very little) perceptible loss of information (increasingly accepted for legal and medical applications).
(5a)
(5b)
Transformer: It transforms the input data into a format that reduces interpixel redundancies in the input image. Transform coding techniques use a reversible, linear mathematical transform to map the pixel values onto a set of coefficients, which are then quantized and encoded. The key factor behind the success of transform-based coding schemes is that, for most natural images, many of the resulting coefficients have small magnitudes and can be quantized without causing significant distortion in the decoded image. For compression purposes, the better a transform packs information into fewer coefficients, the better the transform; for that reason, the Discrete Cosine Transform (DCT) and the Discrete Wavelet Transform (DWT) have become the most widely used transform coding techniques.
Transform coding algorithms usually start by partitioning the original image into subimages (blocks) of small size (usually 8×8). For each block the transform coefficients are calculated, effectively converting the original 8×8 array of pixel values into an array of coefficients in which the coefficients closer to the top-left corner usually contain most of the information needed to quantize and encode (and eventually perform the reverse process at the decoder's side) the image with little perceptual distortion. The resulting coefficients are then quantized, and the output of the quantizer is used by symbol encoding techniques to produce the output bitstream representing the encoded image. In the image decompression model at the decoder's side, the reverse process takes place, with the obvious difference that the dequantization stage can only generate an approximated version of the original coefficient values: whatever loss was introduced by the quantizer cannot be undone.
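A minimal sketch of this block-based stage, assuming SciPy is available (this is my illustration, not the tutorial's own code): it splits an image into 8×8 blocks and takes the 2D DCT of each, after which the large-magnitude coefficients cluster near the top-left corner of every block.

```python
import numpy as np
from scipy.fft import dctn

def blockwise_dct(img, bs=8):
    """Return the 2D DCT of each bs x bs block (image size must divide by bs)."""
    h, w = img.shape
    out = np.empty_like(img, dtype=float)
    for i in range(0, h, bs):
        for j in range(0, w, bs):
            out[i:i + bs, j:j + bs] = dctn(img[i:i + bs, j:j + bs], norm='ortho')
    return out

img = np.random.rand(64, 64)   # stand-in for an image
coeffs = blockwise_dct(img)
# The [0, 0] entry of each block (the DC term) is proportional to the
# block mean; higher-frequency entries are typically much smaller.
```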
Quantizer: It reduces the accuracy of the transformer's output in accordance with some pre-established fidelity criterion, thereby reducing the psychovisual (perceptual) redundancies of the input image. This operation is not reversible and must be omitted if lossless compression is desired.
The quantization stage is at the core of any lossy image encoding algorithm. Quantization at the encoder side means partitioning the input data range into a smaller set of values. There are two main types of quantizers: scalar quantizers and vector quantizers. A scalar quantizer partitions the domain of input values into a smaller number of intervals. If the output intervals are equally spaced, which is the simplest approach, the process is called uniform scalar quantization; otherwise, for reasons usually related to minimization of total distortion, it is called non-uniform scalar quantization. One of the most popular non-uniform quantizers is the Lloyd-Max quantizer. Vector quantization (VQ) techniques extend the basic principles of scalar quantization to multiple dimensions.
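For concreteness, here is a minimal uniform scalar quantizer sketch (my own illustration; the step size q is an arbitrary assumption): quantization maps each value to the index of its interval, and dequantization maps the index back to the interval's representative value.

```python
import numpy as np

def quantize(values, q):
    """Uniform scalar quantization: map each value to its interval index."""
    return np.round(values / q).astype(int)

def dequantize(indices, q):
    """Map each index back to its interval's representative value."""
    return indices * q

coeffs = np.array([0.3, 12.7, -5.2, 140.9])
q = 4.0                              # larger step: coarser values, smaller files
rec = dequantize(quantize(coeffs, q), q)
print(rec)                           # [ 0. 12. -4. 140.]; error is bounded by q/2
```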
Symbol (entropy) encoder: It creates a fixed- or variable-length code to represent the quantizer's output and maps the output in accordance with the code. In most cases, a variable-length code is used. An entropy encoder losslessly compresses the values produced by the quantizer to achieve more efficient compression. The most important types of entropy encoders used in lossy image compression techniques are the arithmetic encoder, the Huffman encoder, and the run-length encoder.
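Of the encoders named above, run-length encoding is the simplest to show. Here is a minimal sketch (my own illustration, not the tutorial's code) that replaces runs of repeated symbols, such as the long zero runs produced by quantization, with (value, count) pairs:

```python
def rle_encode(symbols):
    """Run-length encode a sequence into [value, run-length] pairs."""
    out = []
    for s in symbols:
        if out and out[-1][0] == s:
            out[-1][1] += 1
        else:
            out.append([s, 1])
    return out

def rle_decode(pairs):
    """Expand [value, run-length] pairs back into the original sequence."""
    return [v for v, n in pairs for _ in range(n)]

data = [5, 0, 0, 0, 0, 0, 3, 3, 0, 0]   # quantized coefficients: mostly zeros
enc = rle_encode(data)                   # [[5, 1], [0, 5], [3, 2], [0, 2]]
assert rle_decode(enc) == data
```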
Decompression is the reverse operation of all these steps to recover the image in the spatial domain.
The two-dimensional Discrete Cosine Transform of an M×N image f(x,y) is given by:

$$T(u,v) = \alpha(u)\,\alpha(v) \sum_{x=0}^{M-1} \sum_{y=0}^{N-1} f(x,y) \cos\left[\frac{(2x+1)u\pi}{2M}\right] \cos\left[\frac{(2y+1)v\pi}{2N}\right] \tag{6}$$

and the inverse DCT by:

$$f(x,y) = \sum_{u=0}^{M-1} \sum_{v=0}^{N-1} \alpha(u)\,\alpha(v)\, T(u,v) \cos\left[\frac{(2x+1)u\pi}{2M}\right] \cos\left[\frac{(2y+1)v\pi}{2N}\right] \tag{7}$$

Where,

$$\alpha(u) = \begin{cases} \sqrt{1/M} & \text{if } u = 0 \\ \sqrt{2/M} & \text{if } u = 1, 2, 3, \ldots, M-1 \end{cases}$$

and α(v) is defined analogously with N in place of M.
Result:
Figure 3: Output of Example 1: Comparison of the original and the compressed image
MSE obtained = 0.0156
It can be seen that the compressed image is similar to the original image and the MSE is also small. But we converted a 256×256 image to just 150×150 in the transformed (DCT) domain. The encoding step is then simply a matter of representing the 150×150 quantized transform coefficients with some efficient coding scheme; usually an efficient variable-length code such as Huffman coding is used.
The encoding part is important because, in the transformed domain, there are more than 10^4 levels in the image and we need approximately 16 bits to represent each pixel. Thus, although the size of the image is smaller, the number of bits required per pixel will be double that of a standard 8-bit image. If we reduce the number of quantization levels, quality suffers.
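The experiment can be reproduced along these lines. This is my own minimal sketch, assuming SciPy and an arbitrary 256×256 test image (the original Example 1 listing and its image are not shown here):

```python
import numpy as np
from scipy.fft import dctn, idctn

img = np.random.rand(256, 256)       # stand-in for the 256x256 test image

T = dctn(img, norm='ortho')          # full-image 2D DCT, as in equation (6)
T_kept = T[:150, :150]               # keep only the 150x150 low-frequency corner

# Decoder: zero-pad the kept coefficients back to 256x256 and invert
T_full = np.zeros((256, 256))
T_full[:150, :150] = T_kept
rec = idctn(T_full, norm='ortho')    # inverse 2D DCT, as in equation (7)

mse = np.mean((img - rec) ** 2)
print(f"MSE = {mse:.4f}")            # small for natural images; random noise fares worse
```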
Here I have used built-in functions to perform the DCT and IDCT. Students are encouraged to write their own programs that implement these using equations (6) and (7) respectively.
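As a starting point for that exercise, here is a direct (unoptimized) translation of equation (6) into Python. This is my sketch, not the tutorial's solution; equation (7) follows the same pattern with the roles of f and T exchanged.

```python
import numpy as np

def dct2_naive(f):
    """2D DCT computed directly from equation (6). Slow: for study only."""
    M, N = f.shape
    x = np.arange(M)
    y = np.arange(N)
    T = np.zeros((M, N))
    for u in range(M):
        au = np.sqrt(1.0 / M) if u == 0 else np.sqrt(2.0 / M)
        cu = np.cos((2 * x + 1) * u * np.pi / (2 * M))      # length-M cosine vector
        for v in range(N):
            av = np.sqrt(1.0 / N) if v == 0 else np.sqrt(2.0 / N)
            cv = np.cos((2 * y + 1) * v * np.pi / (2 * N))  # length-N cosine vector
            T[u, v] = au * av * np.sum(f * np.outer(cu, cv))
    return T

# Sanity check against a library implementation
from scipy.fft import dctn
f = np.random.rand(8, 8)
assert np.allclose(dct2_naive(f), dctn(f, norm='ortho'))
```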
JPEG Compression
7. Finally, the entropy encoder uses Huffman coding to optimize the entropy.
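To connect the JPEG steps to the earlier discussion, here is a minimal sketch of the block DCT plus quantization stage (my own illustration, not bit-exact JPEG; the table below is the widely published JPEG luminance quantization table, and a real encoder would follow this with zigzag scanning, run-length coding, and Huffman coding):

```python
import numpy as np
from scipy.fft import dctn

# Standard JPEG luminance quantization table (ISO/IEC 10918-1, Annex K)
Q = np.array([
    [16, 11, 10, 16,  24,  40,  51,  61],
    [12, 12, 14, 19,  26,  58,  60,  55],
    [14, 13, 16, 24,  40,  57,  69,  56],
    [14, 17, 22, 29,  51,  87,  80,  62],
    [18, 22, 37, 56,  68, 109, 103,  77],
    [24, 35, 55, 64,  81, 104, 113,  92],
    [49, 64, 78, 87, 103, 121, 120, 101],
    [72, 92, 95, 98, 112, 100, 103,  99],
])

block = np.random.randint(0, 256, (8, 8)).astype(float)  # stand-in 8x8 block
T = dctn(block - 128, norm='ortho')   # level-shift, then 2D DCT of the block
q = np.round(T / Q).astype(int)       # coarser steps for high frequencies
# Most high-frequency entries of q are zero: ideal input for the
# run-length and Huffman coding performed in the following steps.
```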
Part 2 will cover image compression using the Discrete Wavelet Transform and JPEG 2000.