
Chapter 3


3. Multimedia Data Compression


Multimedia data compression involves reducing the size of multimedia files (which can include
text, audio, images, and video) without excessively compromising their quality. This process is
essential because multimedia files tend to be large and can consume significant storage space and
bandwidth. Compression can be achieved through various techniques and algorithms, aiming to
make the storage, transmission, and processing of multimedia data more efficient.

There are two main types of compression:

1. Lossless Compression: In this method, data is compressed in such a way that it can be
perfectly reconstructed from the compressed version. This technique is vital for applications
where preserving the original data is crucial, such as text documents and some image formats
(e.g., PNG). Lossless compression algorithms include Run-Length Encoding, Huffman
Coding, and Lempel-Ziv-Welch (LZW).
2. Lossy Compression: This approach achieves higher compression rates by eliminating some
data deemed less important, resulting in a file that is not an exact replica of the original but
retains essential qualities. This is often acceptable and widely used for audio, images, and
videos where a slight loss in quality is not noticeable to the user but significantly reduces file
size. Common lossy compression techniques include JPEG for images, MPEG and H.264 for
video, and MP3 and AAC for audio.

The effectiveness of multimedia data compression is measured by its compression ratio, the
balance between quality and size reduction, and the computational resources required for
compression and decompression. The choice between lossless and lossy compression depends on
the specific requirements of the application, including the importance of data fidelity, the
acceptable level of quality loss, and the need for compression speed or efficiency.

Measuring data compression involves assessing the efficiency of a compression algorithm in
reducing the size of data. The primary metric used for this purpose is the compression ratio,
which compares the size of the compressed data to the original size of the data. This ratio
provides a quantitative measure of the effectiveness of the compression.
The formula for the compression ratio (CR) is:

CR = Original Data Size / Compressed Data Size

The compression ratio can be expressed as a ratio (e.g., 2:1) or as a numerical value (e.g., 2). A
higher compression ratio indicates more efficient compression, meaning the compressed file is
significantly smaller than the original file.

Another related measure is the space savings, which tells you the percentage of space saved by
compression. It is calculated as:

Space Savings = 1 − (Compressed Data Size / Original Data Size)

This formula gives you the percentage reduction in size from the original data to the compressed
data, providing another perspective on the efficiency of the compression process.
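
To make these two measures concrete, here is a minimal Python sketch using made-up file sizes
(a 10 MB file compressed down to 4 MB):

def compression_ratio(original_size: int, compressed_size: int) -> float:
    """Compression ratio: original size divided by compressed size."""
    return original_size / compressed_size

def space_savings(original_size: int, compressed_size: int) -> float:
    """Fraction of space saved, usually reported as a percentage."""
    return 1 - compressed_size / original_size

original, compressed = 10_000_000, 4_000_000          # hypothetical sizes in bytes
print(f"CR = {compression_ratio(original, compressed):.1f}:1")        # 2.5:1
print(f"Space savings = {space_savings(original, compressed):.0%}")   # 60%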

Both the compression ratio and space savings are useful for evaluating and comparing
compression algorithms or settings, especially when optimizing for storage space or transmission
speed while maintaining acceptable data quality.

3.1. Data Redundancy


Data redundancy in multimedia data compression refers to the presence of repetitive or
unnecessary information within multimedia files, including images, audio, and video. Reducing
this redundancy is the key to effective compression, allowing the file size to be minimized
without significantly impacting the perceived quality of the content.

There are several types of data redundancy commonly targeted in multimedia data compression:

I. Spatial Redundancy: This occurs when there are patterns, textures, or areas of uniform
color within an image or video frame. Compression algorithms can exploit this
redundancy by encoding these areas more efficiently, storing information about the
patterns or colors once, rather than repeatedly.
II. Temporal Redundancy: In video data, temporal redundancy happens when consecutive
frames are similar or identical, such as in a static scene with minimal movement. Video
compression techniques can take advantage of this by only storing the differences
between frames instead of the entire frame each time.
III. Spectral Redundancy: This is relevant in color images and video, where there is often
correlation between the different color channels (e.g., RGB). Compression algorithms can
transform the color space to a format where the redundancy can be more easily reduced.
IV. Psycho-visual Redundancy: This involves information that may be present in the media
but is imperceptible or less significant to human perception. For example, certain audio
frequencies may be inaudible to humans, or small color changes in an image may not be
noticeable. Lossy compression techniques often reduce this type of redundancy by
prioritizing the retention of perceptually significant information.

Reducing data redundancy effectively requires a careful balance to maintain the quality and
integrity of the multimedia content while achieving significant compression. The specific
strategies and algorithms used can vary widely depending on the type of media and the intended
use of the compressed file.

3.1.1. Lossy and Lossless Compression

Compression techniques, whether for data in general or multimedia content specifically, can be
broadly classified into two categories based on how they handle the original data: lossless
compression and lossy compression. Both approaches aim to reduce file sizes, but they do so in
different ways and are suitable for different types of data and applications.

Lossless Compression

Lossless compression reduces file size without any loss of quality or data. When a file is
decompressed after lossless compression, it will be identical to its original state before
compression. This method is essential for applications where the exact preservation of data is
critical, such as text documents, executable files, and certain image formats (like PNG).

 Advantages: No loss of information, allowing for perfect reconstruction of the original data.
 Disadvantages: Generally achieves lower compression ratios than lossy compression,
especially for media files like images and videos, where perceptual nuances can be exploited
for further size reduction.

Common Lossless Compression Algorithms:


 Huffman coding
 Lempel-Ziv-Welch (LZW)
 DEFLATE (used in ZIP files and PNG images)

Lossy Compression

Lossy compression significantly reduces file sizes by permanently eliminating certain parts of
the data deemed less important, based on the intended use and perceptual characteristics of the
human senses. This approach is often used for multimedia data (audio, images, videos) where a
reduction in quality is acceptable if it leads to substantial decreases in file size.

Advantages: Achieves much higher compression ratios than lossless compression, making it
ideal for reducing the storage and bandwidth requirements of multimedia content.

Disadvantages: The loss of data is irreversible, which might result in a noticeable decrease in
quality if compressed too aggressively or if the compressed file is intended for use in a high-
quality setting.

Common Lossy Compression Techniques:

JPEG (for images)

MPEG and H.264/H.265 (for videos)

MP3 and AAC (for audio)

Choosing Between Lossless and Lossy Compression

The choice between lossless and lossy compression depends on the specific requirements of the
application and the nature of the data being compressed. For archival purposes or for data where
precision is paramount, lossless compression is necessary. On the other hand, for streaming
media or where storage and bandwidth are limited, the more substantial size reductions offered
by lossy compression might be preferable, as long as the quality remains good enough for the
intended audience.
3.1.2. Entropy Coding
Entropy coding is a fundamental method used in multimedia data compression to reduce the size
of digital files, such as images, audio, and video. It is a form of lossless compression that works
by encoding the more frequent elements in the data with shorter codes and the less frequent
elements with longer codes. The principle behind entropy coding is to take advantage of the
statistical properties of the data, particularly the varying occurrence frequencies of different
elements (e.g., pixels in an image, audio samples, symbols in a text file).

The concept of "entropy" in this context comes from information theory, where it is a measure of
the randomness or unpredictability of information content. In practical terms, entropy coding
aims to minimize the average length of the codes assigned to each element, thereby reducing the
total amount of data required to represent the original content without loss.
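
As a concrete illustration, the Shannon entropy of the symbol probabilities, H = -Σ p·log2(p),
gives the theoretical lower bound on the average code length in bits per symbol. A small Python
sketch, applied to the string used later in the Huffman example of Section 3.2:

import math
from collections import Counter

def entropy_bits_per_symbol(data: str) -> float:
    """Shannon entropy H = -sum(p * log2(p)) over the observed symbol frequencies."""
    counts = Counter(data)
    total = len(data)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

# "beekeeper": about 1.88 bits/symbol, close to the 17/9 ≈ 1.89 bits/symbol
# that the Huffman code built in Section 3.2 actually achieves.
print(round(entropy_bits_per_symbol("beekeeper"), 2))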

Two widely used entropy coding techniques in multimedia compression are:

I. Huffman Coding: This is a variable-length code method where the most common
symbols are represented with the shortest codes and the least common symbols with the
longest codes. Huffman coding is optimal in the sense that it produces the lowest possible
average code length for a given set of symbol frequencies. It is used in many compression
standards, including JPEG for images.
II. Arithmetic Coding: Arithmetic coding is more complex than Huffman coding but can
achieve slightly better compression ratios. Instead of assigning separate codes to
individual symbols, it encodes the entire message into a single number, a fraction in the
range [0,1). It does this by successively narrowing down the range of possible values
based on the sequence of symbols in the input. Arithmetic coding can effectively handle
symbols whose probabilities are not powers of 1/2 (e.g., 0.3 rather than 0.25), making it more
efficient in some
cases than Huffman coding.

Both Huffman and arithmetic coding are examples of entropy coding methods that rely on the
statistical properties of the data to achieve compression. These techniques are particularly
effective for data with significant redundancy and predictable patterns, making them key
components of lossless compression algorithms and used in the final stages of many lossy
compression schemes to efficiently encode residual information.
3.1.3. Predictive Coding
Predictive coding is a technique used in multimedia data compression, especially for images and
videos, which reduces file size by exploiting the redundancy inherent in the data. Instead of
encoding the actual content directly, predictive coding encodes the differences (or residuals)
between the actual values of pixels (in images) or frames (in videos) and their predicted values.
These predicted values are calculated using algorithms based on the properties of neighboring
pixels or preceding frames.

The basic steps involved in predictive coding are:

I. Prediction: For each element (e.g., a pixel in an image or a frame in a video), a
prediction is made based on neighboring elements or previous data. This prediction aims
to closely approximate the actual value.
II. Difference Calculation: The difference between the actual value of the element and its
predicted value is calculated. This difference is often smaller than the original data
values, leading to higher compression ratios when encoded.
III. Encoding the Difference: The differences (also known as residuals) are then encoded
using entropy coding techniques, such as Huffman coding or arithmetic coding. Since
these differences tend to be smaller and have less variability than the original
uncompressed data, they can be encoded more efficiently.
IV. Decoding (Reconstruction): To reconstruct the original data from the compressed
version, the decoder uses the same prediction model as the encoder to generate predicted
values, then adds the decoded differences to these predictions to obtain the original
values.

Predictive coding is particularly effective for data types with a high degree of correlation
between neighboring elements, such as natural images and video sequences, where adjacent
pixels or consecutive frames often have similar values. By encoding only the differences,
predictive coding significantly reduces the amount of data needed to represent the original
content.
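
A minimal Python sketch of this encode/decode symmetry, assuming the simplest possible
predictor (each value is predicted to equal its predecessor); real codecs use more elaborate
predictors, but the structure is the same:

def encode_residuals(values):
    """Predict each value as the previous one and keep only the differences."""
    residuals = [values[0]]                  # the first value has no predecessor
    for prev, cur in zip(values, values[1:]):
        residuals.append(cur - prev)
    return residuals

def decode_residuals(residuals):
    """Rebuild the original values by accumulating the differences."""
    values = [residuals[0]]
    for r in residuals[1:]:
        values.append(values[-1] + r)
    return values

pixels = [100, 102, 103, 103, 101, 99]       # hypothetical row of pixel values
print(encode_residuals(pixels))              # [100, 2, 1, 0, -2, -2]
assert decode_residuals(encode_residuals(pixels)) == pixels   # lossless round trip

Note how the residuals cluster near zero; that concentration is exactly what makes the
subsequent entropy-coding stage effective.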

Two common applications of predictive coding are:


 Differential Pulse Code Modulation (DPCM) for audio and simple image compression,
where each sample is predicted from previous samples.
 Motion Estimation and Compensation in video compression (e.g., in MPEG and H.264
codecs), where the movement of objects between frames is estimated, and only the
changes (motion vectors and residuals) are encoded.

Predictive coding can be used as part of both lossy and lossless compression schemes. In lossless
compression, the exact differences are encoded, allowing perfect reconstruction of the original
data. In lossy compression, the differences may be further quantized to achieve higher
compression ratios at the cost of some loss of fidelity.

3.2. Huffman Coding


Huffman coding is a widely used method of entropy coding for lossless data compression.
Developed by David A. Huffman in the 1950s, it is a variable-length code algorithm that assigns
shorter codes to more frequent symbols and longer codes to less frequent symbols based on their
occurrence probabilities. The result is an optimal way to reduce the average code length and,
thereby, the overall size of the data.

How Huffman Coding Works:

1. Frequency Table: The first step in Huffman coding is to create a frequency table that
counts the occurrences of each symbol (e.g., characters in a text file, pixel values in an
image) in the data to be compressed.
2. Build a Binary Tree:
 Each symbol is initially considered a leaf node, with the frequency of the symbol
as its weight.
 Nodes are then combined from lowest to highest frequency to build a binary tree.
In each step, the two nodes with the lowest frequencies are removed from the pool
of nodes, combined into a new node (where the new node's frequency is the sum
of the two nodes' frequencies), and this new node is added back to the pool.
 This process repeats until there is only one node left, which becomes the root of
the Huffman tree.
3. Assign Codes to Symbols:
 Once the Huffman tree is constructed, each leaf node is assigned a unique binary
code. To determine a symbol's code, start from the root of the tree and trace the
path to the leaf node representing that symbol. Assign a binary '0' for every left
turn and a '1' for every right turn along the path.
 This method ensures that no code is a prefix of another, making the code a prefix
code, which is crucial for the unique decodability of the compressed data.
4. Encoding the Data:
 The original data is then encoded using the binary codes assigned to each symbol.
This transforms the data into a compressed binary sequence.
5. Decoding the Data:
 To decode the compressed data, you start from the root of the Huffman tree and
follow the binary codes through the tree to find the corresponding symbols until
the entire data sequence has been decoded.

Advantages of Huffman Coding:

 Efficiency: Huffman coding is optimal in the sense that it minimizes the average
code length, making it very efficient for data compression.
 Flexibility: It can be applied to any type of data (text, images, audio, etc.).
 Simplicity: The algorithm is straightforward to implement.

Disadvantages:

 Variable Length: Huffman codes are variable in length, which can lead to inefficiencies if
the symbol distribution is not well-suited to this type of coding.
 Static Model: The standard Huffman coding algorithm does not adapt to changing symbol
frequencies within a data stream. Adaptive Huffman coding variants address this issue but
are more complex.

Huffman coding remains a cornerstone of many compression algorithms and is often used in
combination with other compression techniques to achieve higher compression ratios.
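
The construction above can be sketched in a few lines of Python with a priority queue. This is a
minimal illustration rather than a production encoder; the exact 0/1 assignments it produces may
differ from the tree used in the worked example below, but the code lengths, and hence the
17-bit total for "beekeeper", come out the same:

import heapq
from collections import Counter

def huffman_codes(text: str) -> dict[str, str]:
    """Build a Huffman tree for `text` and return the symbol -> bit-string table."""
    freq = Counter(text)
    # Heap entries are (frequency, tie-breaker, tree); a tree is a symbol or a (left, right) pair.
    heap = [(f, i, sym) for i, (sym, f) in enumerate(freq.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        f1, _, left = heapq.heappop(heap)     # the two lowest-frequency nodes...
        f2, _, right = heapq.heappop(heap)
        heapq.heappush(heap, (f1 + f2, next_id, (left, right)))   # ...merged into one
        next_id += 1
    codes = {}
    def walk(node, prefix=""):
        if isinstance(node, tuple):
            walk(node[0], prefix + "0")       # '0' for a left turn
            walk(node[1], prefix + "1")       # '1' for a right turn
        else:
            codes[node] = prefix or "0"       # degenerate single-symbol case
    walk(heap[0][2])
    return codes

codes = huffman_codes("beekeeper")
encoded = "".join(codes[ch] for ch in "beekeeper")
print(codes, len(encoded), "bits")            # 17 bits in total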

Examples of Huffman Coding


To illustrate Huffman coding with an example, let's compress a simple string: "beekeeper". The
goal is to encode this string in a way that uses less space than the original ASCII encoding,
which typically uses 8 bits per character.

Step 1: Frequency Table

First, we count the frequency of each character in the string:

 b: 1
 e: 5
 k: 1
 p: 1
 r: 1

Step 2: Build the Huffman Tree

Next, we build the Huffman tree by combining the characters with the lowest frequencies until
there is only one node left:

I. Combine 'b' and 'k' (both have the lowest frequency, 1) into a new node with a total
frequency of 2.
II. Combine 'p' and 'r' (each with a frequency of 1) into a new node with a total frequency of
2.
III. Combine the 'bk' node and the 'pr' node (each with a frequency of 2) into a new node with
a total frequency of 4.
IV. Finally, combine the 'e' (with a frequency of 5) and the 'bkpr' node (with a frequency of 4)
to form the root of the tree, with a total frequency of 9, which is the total number of
characters.

The resulting Huffman tree (figure omitted) has 'e' attached directly to the root on one branch
and the combined 'bkpr' node on the other, which in turn splits into the 'bk' and 'pr' pairs.


Step 3: Assign Codes to Symbols

From the Huffman tree, we derive the binary codes for each character:

 e: 0
 b: 100
 k: 101
 p: 110
 r: 111

Step 4: Encoding the Data

Using these codes, we encode the original string "beekeeper":

 b: 100
 e: 0
 e: 0
 k: 101
 e: 0
 e: 0
 p: 110
 e: 0
 r: 111

So, "beekeeper" is encoded as: 1000001010001100111


Comparison with ASCII Encoding

In ASCII encoding, assuming 8 bits per character, "beekeeper" would require 9 characters * 8
bits = 72 bits.

With Huffman coding, our encoded string uses only 17 bits (1 bit for each of the five 'e's and
3 bits for each of 'b', 'k', 'p', and 'r'), demonstrating significant compression.

Step 5: Decoding the Data

To decode the binary sequence "10000101001100111" back into "beekeeper", you simply
follow the binary codes from the root of the Huffman tree to the leaves, translating each code
back into its corresponding character.

This example illustrates how Huffman coding efficiently compresses data by assigning shorter
codes to more frequent characters and longer codes to less frequent ones.

Examples of Decoding in Huffman Coding

To demonstrate decoding with Huffman coding, let's decode the binary string
10000101001100111 we previously encoded from the word "beekeeper" using the Huffman
tree we constructed. Recall, our character codes were:

 e: 0
 b: 100
 k: 101
 p: 110
 r: 111

Recall the shape of our Huffman tree: 'e' hangs directly off the root on the '0' branch, while
'b', 'k', 'p', and 'r' are the leaves under the '1' branch.


Step 1: Start with the Entire Encoded String

Start with the binary sequence: ‘10000101001100111’.

Step 2: Decode from Left to Right

1) 100: The first '1' takes you down the right branch, and the two '0's that follow take you
left twice, landing on the leaf node 'b'. So, the first decoded character is 'b'.
2) 0: The next '0' takes you down the left branch directly to 'e'. So, the second character is
'e'.
3) 0: The next '0' also decodes to 'e'.
4) 101: The sequence '101' takes you right, then left, then right, to the leaf node 'k'. The
next character is 'k'.
5) 0: The next '0' decodes to 'e'.
6) 0: Another '0', also decoding to 'e'.
7) 110: The sequence '110' takes you right, then right, then left, to the leaf node 'p'. The
next character is 'p'.
8) 0: The next '0' decodes to 'e'.
9) 111: Finally, the sequence '111' takes you right three times, to the leaf node 'r'. The last
character is 'r'.

The Decoded Word

Putting it all together, the binary sequence ‘10000101001100111’ decodes to "beekeeper",
which matches our original input word.

Decoding Process Explained

Decoding involves starting at the root of the Huffman tree and following the path indicated by
the binary sequence. A '0' directs you to move left, and a '1' directs you to move right. Each time
you reach a leaf node, you've decoded a character, and you return to the root to start decoding the
next sequence of bits. This process repeats until the entire binary string has been decoded.
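
A small Python sketch of this decoding loop, assuming the code table listed above; because the
codes form a prefix code, matching greedily bit by bit is unambiguous:

def huffman_decode(bits: str, codes: dict[str, str]) -> str:
    """Decode a bit string using a prefix-code table (symbol -> bit string)."""
    inverse = {code: sym for sym, code in codes.items()}
    out, buffer = [], ""
    for bit in bits:
        buffer += bit
        if buffer in inverse:      # a complete code has been matched (a leaf was reached)
            out.append(inverse[buffer])
            buffer = ""            # return to the "root" for the next symbol
    return "".join(out)

codes = {"e": "0", "b": "100", "k": "101", "p": "110", "r": "111"}
print(huffman_decode("10000101001100111", codes))   # beekeeper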

This example illustrates how Huffman coding enables efficient data compression and
decompression, allowing the original data to be perfectly reconstructed from the compressed
form.

3.3. Adaptive Coding


Adaptive coding, in the context of data compression, refers to algorithms that adjust their
encoding strategy based on the data being processed, in real-time. Unlike static coding schemes,
where the encoding process is based on a fixed set of rules or patterns determined before the
compression starts, adaptive coding algorithms modify their rules and parameters dynamically as
they process the data. This allows them to efficiently handle diverse or changing data
characteristics, often resulting in better compression ratios and more flexibility.

How Adaptive Coding Works

Adaptive coding algorithms start with an initial model of the data, which is continually updated
as more data is processed. These updates refine the algorithm's understanding of the data's
characteristics, such as symbol frequencies in the case of text data or pixel patterns for images.
The algorithm uses this evolving model to make more accurate predictions about the data, which
in turn leads to more efficient encoding.

Types of Adaptive Coding

1. Adaptive Huffman Coding: This variant of Huffman coding updates its frequency table
and, consequently, the Huffman tree, as new data is processed. It starts with no prior
knowledge of the data frequencies and adjusts as data comes in, allowing it to effectively
compress data with varying or unknown symbol distributions.
2. Adaptive Arithmetic Coding: Like adaptive Huffman coding, adaptive arithmetic
coding adjusts its probability model for symbols as it encodes or decodes data. This
allows it to maintain high efficiency even when the data characteristics change over time.
3. Adaptive Run-Length Encoding (RLE): In cases where data contains sequences of
repeated symbols or patterns, an adaptive RLE algorithm can adjust its encoding strategy
based on the length and frequency of these runs, optimizing its output for the current
data.

Advantages of Adaptive Coding

 Flexibility: Adaptive coding can handle a wide range of data types and distributions
without needing prior knowledge of the data characteristics.
 Efficiency: By continuously updating its model of the data, adaptive coding can achieve
higher compression ratios, especially for data with changing patterns or distributions.
 Simplicity: For some applications, adaptive coding algorithms can be simpler to
implement than their static counterparts because they don't require a separate step to build
a model of the data before compression.

Disadvantages of Adaptive Coding

 Computational Overhead: Continuously updating the encoding model requires
additional computations, which can increase the processing time and resource
consumption, especially for large datasets.
 Initial Compression Efficiency: At the start of the compression process, when less data
has been seen, the efficiency of adaptive coding may be lower than static methods pre-
tuned to the data characteristics.

Adaptive coding is particularly useful in scenarios where the data characteristics are not fully
known in advance or when those characteristics may change over time. It's widely used in
streaming data compression, real-time communication systems, and applications where the data
is generated or collected incrementally.

Examples of Adaptive Coding


To illustrate adaptive coding with a practical example, let's consider a scenario where adaptive
Huffman coding is used to compress a text stream that changes in character frequency over time.
For simplicity, let's assume we're compressing a sequence of characters that initially consists
mostly of the letter 'a', but gradually shifts to include more of the letter 'b' as the sequence
progresses.

Initial Data:

 Text stream: "aaaaaaaabbbbbbbbbbbb"

In a static Huffman coding scenario, the tree is built based on the initial character frequencies,
and it doesn't change. However, in adaptive Huffman coding, the tree is updated as each
character is processed.

Adaptive Huffman Coding Process:

1. Start with an Empty Tree: Initially, the Huffman tree is empty because no characters have
been processed, and therefore no frequencies are known.
2. Processing the First 'a' Character:
 Since the tree is empty and doesn't contain 'a', it's added to the tree, and its frequency
is set to 1.
 The tree is adjusted to reflect this new information.
3. Processing Subsequent 'a' Characters:
 Each time another 'a' is processed, its frequency is incremented in the tree.
 The tree structure is updated to maintain efficient encoding for the most frequent
characters. At this stage, 'a' is the most frequent character.
4. Introduction of 'b' Characters:
 When the first 'b' is encountered, it is added to the tree, and its frequency is set to 1.
 As more 'b' characters are processed, their frequency increases, and the tree is
continuously updated to reflect the changing character distribution. Eventually, 'b'
becomes more frequent than 'a'.
5. Continuous Update:
 The Huffman tree is updated in real-time as each character is processed. This ensures
that the encoding always reflects the current frequency distribution of characters in
the stream.

Example Compression Output:

In the beginning, 'a' might be encoded with a shorter code due to its higher frequency. As 'b'
becomes more frequent, the adaptive Huffman tree adjusts, and 'b' may then be given a shorter
code relative to 'a'.

 Initial encoding might use a shorter code for 'a' (e.g., 0) and a longer one for 'b' (e.g.,
10).
 As 'b' becomes more frequent, the tree adapts, potentially switching to a shorter code
for 'b' (e.g., 0) as its frequency surpasses that of 'a'.

Decoding with Adaptive Huffman Coding:

Decoding requires the same adaptive process. The decoder starts with an empty tree and updates
it as characters are decoded, ensuring the encoder and decoder trees are synchronized.

Advantages Demonstrated:

 Efficiency: The adaptive approach ensures the coding remains efficient even as the
distribution of characters changes.
 Flexibility: No prior knowledge of the character distribution is needed; the method
adapts to the data in real-time.

This simplified example demonstrates the fundamental principles behind adaptive coding: the
ability to adjust to changes in data characteristics on-the-fly, ensuring efficient and effective
compression throughout the data stream.
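
A minimal Python sketch of just the adaptive-model half of this process (the bit-level coding
step is omitted): both encoder and decoder run exactly this update loop on the example stream
above, which is how they stay synchronized as 'b' overtakes 'a':

from collections import Counter

def model_snapshots(stream: str, checkpoints=(4, 8, 12, 20)):
    """Update a symbol-frequency model one symbol at a time and record its estimates."""
    counts = Counter()
    snapshots = []
    for seen, ch in enumerate(stream, start=1):
        counts[ch] += 1
        if seen in checkpoints:
            snapshots.append((seen, {s: round(c / seen, 2) for s, c in counts.items()}))
    return snapshots

for seen, probs in model_snapshots("aaaaaaaabbbbbbbbbbbb"):
    print(f"after {seen:2d} symbols: {probs}")
# after  4 symbols: {'a': 1.0}
# after  8 symbols: {'a': 1.0}
# after 12 symbols: {'a': 0.67, 'b': 0.33}
# after 20 symbols: {'a': 0.4, 'b': 0.6}

An adaptive Huffman or adaptive arithmetic coder rebuilds (or incrementally adjusts) its code
table from these evolving counts after every symbol.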

3.4. Dictionary-Based Coding: Lempel-Ziv-Welch (LZW)


Lempel-Ziv-Welch (LZW) is a dictionary-based coding algorithm for lossless data compression.
It was developed by Abraham Lempel, Jacob Ziv, and Terry Welch in the early 1980s. LZW is
notable for its simplicity and efficiency, especially in scenarios where the data contains repeated
sequences or patterns. It's widely used in various applications, including file compression (as in
the Unix compress utility), GIF images, and the TIFF image format.

How LZW Works:

LZW compression builds a dictionary of substrings encountered in the input data stream and
encodes the data as a series of codes (indices into the dictionary), which are typically shorter
than the original sequences. The algorithm starts with a dictionary containing all possible single-
character strings (given the alphabet used in the data), and then dynamically adds new entries for
longer sequences encountered in the input.

1. Initialization: Begin with a dictionary that includes all possible single characters in the
data's alphabet, each assigned a unique code.
2. Processing:
 Read input sequences and look for the longest sequence already present in the
dictionary.
 When a sequence is found that is not in the dictionary, add it with a new code.
Output the code for the longest sequence found just before the new sequence.
 Continue this process, effectively growing the dictionary with more complex
sequences.
3. Output: The result is a stream of codes that reference the dictionary entries. These codes
are what get stored or transmitted, leading to compression.

Decoding with LZW:

Decoding reverses the process, using the codes to reconstruct the original data. The decoder
starts with the same initial dictionary as the encoder and rebuilds the dictionary using the
received codes, ensuring that the encoder and decoder dictionaries stay synchronized.

1. Read a code from the compressed file/stream.
2. Translate the code through the dictionary to find the corresponding sequence.
3. Output the sequence and read the next code.
4. As new sequences are encountered, update the dictionary just as the encoder did,
ensuring that the sequence corresponding to each code is correctly reconstructed.
Example of LZW Compression:

Consider a simple example with the input sequence "ABABABA".

1. The initial dictionary (for an alphabet {A, B}) might be: A=1, B=2.
2. Starting to read the input, "A" is found in the dictionary (code 1). Next, "AB" is not in the
dictionary, so add "AB"=3 to the dictionary and output the code for "A" (1).
3. Continue this process:
 Read "B", found in the dictionary (code 2). "BA" is not in the dictionary, so add
"BA"=4 and output the code for "B" (2).
 Read "AB", which was added earlier (code 3). "ABA" is not in the dictionary, so
add "ABA"=5 and output the code for "AB" (3).
 This continues until the end of the input.

The output compressed sequence is: 1, 2, 3, 5, where the final code 5 refers to the "ABA" entry
added during encoding; for longer inputs the savings become significant.

LZW's efficiency comes from its ability to adapt its dictionary dynamically to the data, making it
very effective for files with repeated patterns.
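
A compact Python sketch of the encoder just described: it starts from a dictionary of single
characters and grows it as longer matches appear. Its outputs match the two worked examples in
this section:

def lzw_compress(data: str, alphabet: str = "AB") -> list[int]:
    """LZW encoder: output dictionary indices while growing the dictionary."""
    dictionary = {ch: i + 1 for i, ch in enumerate(alphabet)}   # A=1, B=2 for this toy alphabet
    next_code = len(dictionary) + 1
    w, output = "", []
    for ch in data:
        if w + ch in dictionary:
            w += ch                        # extend the current match
        else:
            output.append(dictionary[w])   # emit the code for the longest known match
            dictionary[w + ch] = next_code # add the new, longer sequence
            next_code += 1
            w = ch                         # the new match starts at this symbol
    if w:
        output.append(dictionary[w])       # flush the final match
    return output

print(lzw_compress("ABABABA"))       # [1, 2, 3, 5]
print(lzw_compress("AABABAABAAA"))   # [1, 1, 2, 4, 3, 5, 3]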

Examples of Dictionary-Based Coding: Lempel-Ziv-Welch (LZW)

To provide a more detailed example of LZW (Lempel-Ziv-Welch) compression in action, let's
compress a short string. We'll use the string "AABABAABAAA" as our input data. This example
will walk through how LZW dynamically builds its dictionary and compresses the input data into
a sequence of codes.

Initial Setup:

Assuming an alphabet of {A, B}, the initial dictionary is preloaded with the alphabet's symbols:

 A: 1
 B: 2

Step-by-Step Compression Process:


1. Step 1: The first symbol is 'A'. 'A' is in the dictionary with code 1. The next symbol is also
'A', and "AA" is not in the dictionary, so add "AA" with the next code (3) and output the code
for 'A' (1). The current match restarts at the second 'A'.
Updated Dictionary:
 A: 1
 B: 2
 AA: 3
2. Step 2: The current match is the second 'A'. The next symbol is 'B', and "AB" is not in the
dictionary, so add "AB" with the next code (4) and output the code for 'A' (1). The current
match restarts at 'B'.
Updated Dictionary:
 A: 1
 B: 2
 AA: 3
 AB: 4
3. Step 3: The current match is 'B'. The next symbol is 'A', and "BA" is not in the dictionary,
so add "BA" with the next code (5) and output the code for 'B' (2). The current match restarts
at this 'A'.
Updated Dictionary:
 A: 1
 B: 2
 AA: 3
 AB: 4
 BA: 5
4. Step 4: The current match is 'A'. The next symbol is 'B', and "AB" is already in the
dictionary (code 4), so the match grows to "AB". The next symbol is 'A', and "ABA" is not in
the dictionary, so add "ABA" with the next code (6) and output the code for "AB" (4).
Updated Dictionary:
 A: 1
 B: 2
 AA: 3
 AB: 4
 BA: 5
 ABA: 6
5. Step 5: The current match is 'A'. The next symbol is 'A', and "AA" is in the dictionary
(code 3), so the match grows to "AA". The next symbol is 'B', and "AAB" is not in the
dictionary, so add "AAB" with the next code (7) and output the code for "AA" (3).
Updated Dictionary:
 A: 1
 B: 2
 AA: 3
 AB: 4
 BA: 5
 ABA: 6
 AAB: 7
6. Step 6: The current match is 'B'. The next symbol is 'A', and "BA" is in the dictionary
(code 5), so the match grows to "BA". The next symbol is 'A', and "BAA" is not in the
dictionary, so add "BAA" with the next code (8) and output the code for "BA" (5).
Updated Dictionary:
 A: 1
 B: 2
 AA: 3
 AB: 4
 BA: 5
 ABA: 6
 AAB: 7
 BAA: 8
7. Step 7: The current match is 'A'. The final symbol is also 'A', and "AA" is in the dictionary
(code 3), so the match grows to "AA". The input ends, so output the code for "AA" (3).
Final Dictionary:
 A: 1
 B: 2
 AA: 3
 AB: 4
 BA: 5
 ABA: 6
 AAB: 7
 BAA: 8

Output:

The compressed output sequence is: 1, 1, 2, 4, 3, 5, 3

Decompression:

To decompress, the process is reversed using the output codes to rebuild the string based on the
dictionary entries added during compression. The decompressor constructs the same dictionary
as the compressor by reading the codes and adding new entries based on the sequences it
decodes.

This example simplifies the compression process but illustrates the core principle of LZW
compression: as the data is processed, new sequences are added to the dictionary, and existing
sequences are encoded with shorter, fixed-length codes, leading to compression.
3.5. Arithmetic Coding
Arithmetic coding is a form of entropy coding used in lossless data compression. Unlike other
techniques that work by replacing symbols with fixed-length codes or variable-length codes (like
Huffman coding), arithmetic coding encodes the entire message into a single number, a fraction
in the range between 0 and 1. This method can achieve higher compression ratios, especially for
sources with a high degree of redundancy, and is particularly effective when dealing with
symbols that have very low probabilities of occurrence.

How Arithmetic Coding Works:

1. Initialization: Start with the interval [0, 1). This interval will be progressively subdivided
based on the probabilities of the symbols that need to be encoded.
2. Subdivision: The range [0, 1) is subdivided into subintervals, each corresponding to one of
the possible symbols in the source alphabet. The size of each subinterval is proportional to
the probability of the corresponding symbol. For a very simple example, if you have an
alphabet with just two symbols A and B, with probabilities 0.7 and 0.3 respectively, the
interval [0, 1) would be divided into [0, 0.7) for A and [0.7, 1) for B.
3. Encoding Process:
 For each symbol of the message to encode, the current interval is narrowed down to the
subinterval that corresponds to that symbol.
 This process is repeated for each symbol in the message, with the interval becoming
progressively smaller and more precise.
 After all symbols are processed, any number within the final interval can be used to
represent the entire message.
4. Decoding Process:
 The decoder starts with the same initial interval and the same subdividing process based
on symbol probabilities.
 Using the encoded number, the decoder can determine which subinterval (and therefore
which symbol) the number falls into.
 It then refines the interval to that subinterval and repeats the process until the entire
message is reconstructed.
Advantages of Arithmetic Coding:

 Efficiency: Can achieve higher compression ratios than other methods, especially for
symbols with very low probabilities.
 Flexibility: Can be adapted to different symbol probabilities dynamically, making it
suitable for adaptive encoding.

Challenges with Arithmetic Coding:

 Computational Complexity: The encoding and decoding processes involve multiple steps
of multiplication and division, making arithmetic coding more computationally intensive
than some other methods.
 Precision Requirements: Because the method narrows down an interval to represent the
entire message, it requires high precision arithmetic to handle very small interval sizes,
especially for long messages.

Despite these challenges, arithmetic coding is a powerful tool in the data compression arsenal,
particularly valuable when working with sources where symbol probabilities vary widely or are
known with high accuracy. It's used in various applications, including file compression and
multimedia encoding, often in conjunction with other compression techniques.

Examples of Arithmetic Coding

Let's go through a simplified example of arithmetic coding to compress the string "AABA".
Assume we have a source that produces only two symbols, A and B, with probabilities P(A) =
0.8 and P(B) = 0.2, respectively.

Step 1: Initialize the Interval

Initially, the entire interval is [0, 1).

Step 2: Subdivide Based on Probabilities

For each symbol, the current interval is subdivided based on the symbol's probability. The initial
subdivision for our alphabet (A, B) would be:

 Interval for A: [0, 0.8)
 Interval for B: [0.8, 1)

Step 3: Encoding "AABA"

1. First Symbol: "A"
 Since the first symbol is "A", we select the interval [0, 0.8).
 New interval: [0, 0.8)
2. Second Symbol: "A"
 We further subdivide [0, 0.8) based on P(A) = 0.8 and P(B) = 0.2:
 Interval for A within [0, 0.8): [0, 0.64) (because 0.8 * 0.8 = 0.64)
 Interval for B within [0, 0.8): [0.64, 0.8)
 Since the second symbol is "A", we select the interval [0, 0.64).
 New interval: [0, 0.64)
3. Third Symbol: "B"
 We now subdivide [0, 0.64) for "A" and "B":
 Interval for A within [0, 0.64): [0, 0.512) (because 0.64 * 0.8 = 0.512)
 Interval for B within [0, 0.64): [0.512, 0.64) (because 0.64 * 0.2 = 0.128 and
0.512 + 0.128 = 0.64)
 Since the third symbol is "B", we select the interval [0.512, 0.64).
 New interval: [0.512, 0.64)
4. Fourth Symbol: "A"
 Subdividing [0.512, 0.64) for "A" and "B":
 Interval for A within [0.512, 0.64): [0.512, 0.6144) (because 0.512 + (0.64 -
0.512) * 0.8 = 0.6144)
 Interval for B within [0.512, 0.64): [0.6144, 0.64)
 Since the fourth symbol is "A", we select the interval [0.512, 0.6144).
 Final interval: [0.512, 0.6144)

Step 4: Choose a Representative Value

Any value within the final interval [0.512, 0.6144) can represent the entire sequence "AABA".
For simplicity, we could choose the midpoint, approximately 0.5632, as the encoded value for
"AABA".

Decoding
The decoding process reverses the encoding steps. Starting with the encoded value (e.g., 0.5632)
and the initial interval [0, 1), the decoder uses the same subdivisions based on symbol
probabilities to reconstruct the original string "AABA".
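
The interval narrowing from Step 3 can be reproduced with a short Python sketch. Floating point
is adequate for this toy example; practical arithmetic coders use fixed-point arithmetic with
renormalization so that precision never runs out:

def arithmetic_interval(message: str, probs: dict[str, float]):
    """Narrow the interval [0, 1) symbol by symbol and return the final (low, high)."""
    symbols = list(probs)                   # fixed symbol order, e.g. A then B
    low, high = 0.0, 1.0
    for s in message:
        width = high - low
        start = sum(probs[t] for t in symbols[:symbols.index(s)])   # cumulative probability
        low, high = low + width * start, low + width * (start + probs[s])
    return low, high

low, high = arithmetic_interval("AABA", {"A": 0.8, "B": 0.2})
print(low, high, (low + high) / 2)   # 0.512  0.6144  0.5632 (up to floating-point rounding)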

Summary

This example simplifies the arithmetic coding process to illustrate the basic principles. Real
implementations require managing precision and efficiently handling the encoding and decoding
calculations, especially for longer sequences and larger alphabets. Despite its complexity,
arithmetic coding's ability to represent sequences as fractions within an interval makes it a
powerful tool for achieving high compression ratios, particularly when symbol probabilities vary
greatly.

3.6. Audio Compression


Audio compression is a technique used to reduce the size of audio files or streams without
significantly impacting the quality of the audio experience for listeners. It can be applied through
either lossless or lossy methods, depending on the requirements for audio quality, file size, and
application context.

Lossless Audio Compression

Lossless compression algorithms reduce audio file size without any loss of quality, ensuring that
the decompressed audio is bit-for-bit identical to the original. This type of compression is ideal
when audio fidelity is paramount, such as in professional audio editing and archival. Common
lossless audio compression formats include FLAC (Free Lossless Audio Codec) and ALAC
(Apple Lossless Audio Codec); plain WAV files, by contrast, typically hold uncompressed PCM audio.

 Advantages: Perfect audio quality preservation and reversibility (the original audio can
be perfectly reconstructed from the compressed file).
 Disadvantages: Larger file sizes compared to lossy compression, limiting its usefulness
for streaming or storing large music libraries on devices with limited storage.
3.6.1. DPCM Audio Compression
Differential Pulse Code Modulation (DPCM) is a method of digitally representing analog signals
that is widely used in audio compression. The core principle behind DPCM is to exploit the
predictability of the successive samples in an audio signal to reduce the bitrate needed for digital
representation. Instead of encoding the absolute value of each sample, DPCM encodes the
difference between an actual audio sample and its predicted value, based on previous samples.
This difference is generally smaller than the original signal values, which leads to a more
efficient, compressed digital representation.

How DPCM Works:

1. Prediction: At the heart of DPCM is a predictor that estimates the next sample's value based
on a linear combination of previous samples. The predictor can be as simple as assuming the
next sample will be the same as the last one (zero-order prediction), or it can involve more
complex algorithms that consider several past samples to make a prediction.
2. Difference Calculation: For each sample, the difference between the actual sample value
and the predicted value is calculated. This difference is often smaller than the original audio
signal amplitude, especially in signals with low frequency content where successive samples
do not vary much.
3. Quantization: The difference signal is then quantized, which involves mapping the
continuous range of difference values to a discrete set of values. This step introduces
quantization noise but significantly reduces the amount of data needed to represent each
sample.
4. Encoding: Finally, the quantized differences are encoded for transmission or storage. Since
these differences tend to be smaller, they can often be represented using fewer bits than the
original samples.
5. Decoding and Reconstruction: At the decoder side, the process is reversed. The quantized
differences are used to reconstruct the original signal by successively adding them to the
predicted values.

Advantages of DPCM:
 Bitrate Reduction: By encoding differences that are typically smaller than the absolute
sample values, DPCM can achieve a lower bitrate compared to PCM (Pulse Code
Modulation), which encodes the absolute values directly.
 Adaptability: The predictor in DPCM can be adapted to the characteristics of the signal,
allowing for better performance in terms of signal-to-noise ratio (SNR) for specific types
of audio content.

Limitations:

 Quantization Noise: The quantization step introduces noise, which can become
perceptible if the quantization step size is too large compared to the signal variation.
 Predictor Complexity: The effectiveness of DPCM greatly depends on the predictor's
accuracy. Designing a predictor that performs well across diverse audio signals can be
challenging.

DPCM is especially effective for signals where consecutive samples do not vary greatly, such as
in speech signals or in audio signals with limited high-frequency content. It serves as a basis for
more advanced audio compression techniques, including Adaptive DPCM (ADPCM), which
dynamically adjusts the quantization step size and the predictor coefficients based on the signal
characteristics to further improve compression efficiency and signal quality.

Examples of DPCM Audio Compression

Differential Pulse Code Modulation (DPCM) is an audio compression technique that encodes the
difference between successive audio samples rather than the absolute sample values. This
method is particularly efficient for audio signals with minimal changes between consecutive
samples, as it typically results in smaller difference values that require fewer bits to encode. Here
are two examples illustrating the application of DPCM in audio compression:

Example 1: Voice Recording Compression

In a simple voice recording, where the speaker's voice maintains a relatively steady pitch and
volume, the audio signal's consecutive samples will not vary dramatically. DPCM can effectively
compress this type of audio by predicting each subsequent sample based on previous ones and
encoding the small differences. For instance, if the original PCM samples are 5000, 5020, 4990,
5010, 5025, DPCM might encode the initial value 5000 directly and then only the differences
+20, -30, +20, +15, which can be represented with fewer bits than the original PCM samples.
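
In code, this example looks like the following minimal Python sketch, using a zero-order
predictor (each sample is predicted to equal the previous one) and no quantization, so the round
trip is exactly lossless:

def dpcm_encode(samples):
    """Store the first sample, then only the sample-to-sample differences."""
    diffs = [samples[0]]
    for prev, cur in zip(samples, samples[1:]):
        diffs.append(cur - prev)
    return diffs

def dpcm_decode(diffs):
    """Rebuild the samples by accumulating the differences."""
    samples = [diffs[0]]
    for d in diffs[1:]:
        samples.append(samples[-1] + d)
    return samples

pcm = [5000, 5020, 4990, 5010, 5025]          # the PCM samples from the example above
print(dpcm_encode(pcm))                       # [5000, 20, -30, 20, 15]
assert dpcm_decode(dpcm_encode(pcm)) == pcm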

Example 2: Compressing a Simple Melody

Consider a simple melody played on a piano where the notes change slowly over time. The
digital representation of this melody, when sampled, would show gradual changes from one
sample to the next as the notes progress. Using DPCM, the system would predict the value of
each sample based on the values of one or more preceding samples. Since the melody evolves
slowly, the differences between the predicted and actual samples would be small. By encoding
these small differences instead of the full sample values, DPCM reduces the data needed to
represent the melody, effectively compressing the audio data.

In both examples, DPCM's efficiency comes from its ability to exploit the predictability of the
audio signal, encoding only the deviation from a prediction rather than the full sample value.
This results in significant data reduction, especially for audio signals with low variability
between samples. However, the effectiveness of DPCM depends on the characteristics of the
audio signal and the accuracy of the prediction model. For audio with rapid changes or high
complexity, the differences between successive samples might be larger, reducing the
compression efficiency.

3.6.2. ADPCM Audio Compression


Adaptive Differential Pulse Code Modulation (ADPCM) is an audio compression technique that
extends the principles of Differential Pulse Code Modulation (DPCM) by adapting the
quantization step size dynamically based on the signal's characteristics. ADPCM aims to achieve
higher compression ratios than standard PCM and DPCM while maintaining good audio quality.
It does so by more efficiently encoding the range of differences between predicted and actual
audio samples, adjusting to the signal's variability over time.

How ADPCM Works:

1. Prediction: Like DPCM, ADPCM predicts each audio sample based on previous samples.
However, ADPCM frequently adjusts its prediction algorithm to better fit the signal's
changing characteristics.
2. Difference Calculation: The difference between the actual sample and its predicted value is
calculated, just like in DPCM.
3. Adaptive Quantization: This is where ADPCM differs significantly from DPCM. The
quantization step size used to encode the difference between the actual and predicted samples
is dynamically adjusted based on the signal's characteristics. If the signal varies widely, a
larger step size might be used to efficiently encode larger differences. For more stable signal
parts, a smaller step size is used to capture finer details.
4. Encoding: The quantized differences, along with any necessary information about the
quantization step size adjustments, are encoded into the compressed audio stream.
5. Decoding: The decoding process reverses the encoding steps, using the encoded differences
and step size information to reconstruct the audio signal.
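
The sketch below is a deliberately simplified, hypothetical ADPCM-style coder, not any
standardized variant such as IMA ADPCM or G.726: it quantizes each difference to a handful of
levels and grows or shrinks the step size depending on how coarse the last quantized difference
was. Encoder and decoder apply the same adaptation rule, so they stay in step:

def adapt(step: int, q: int) -> int:
    """Grow the step after large quantized differences, shrink it after small ones."""
    if abs(q) >= 3:
        return step * 2
    if abs(q) <= 1:
        return max(1, step // 2)
    return step

def adpcm_encode(samples, step=16):
    """Toy ADPCM: quantize each difference to the range [-4, 3] with an adaptive step size."""
    codes, predicted = [samples[0]], samples[0]   # first sample is sent verbatim
    for s in samples[1:]:
        q = max(-4, min(3, round((s - predicted) / step)))
        codes.append(q)
        predicted += q * step                     # track what the decoder will reconstruct
        step = adapt(step, q)
    return codes

def adpcm_decode(codes, step=16):
    out, predicted = [codes[0]], codes[0]
    for q in codes[1:]:
        predicted += q * step
        out.append(predicted)
        step = adapt(step, q)
    return out

pcm = [5000, 5020, 4990, 5010, 5035, 5060]        # made-up samples
codes = adpcm_encode(pcm)
print(codes)                  # [5000, 1, -3, 1, 3, 2]
print(adpcm_decode(codes))    # close to, but not exactly, the original samples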

Advantages of ADPCM:

 Efficiency: By adjusting the quantization step size, ADPCM can more efficiently
compress a wider range of audio signals compared to PCM and DPCM, offering a good
compromise between file size and quality.
 Flexibility: The adaptive nature of the quantization allows ADPCM to handle diverse
audio content, from human speech to complex music, more effectively than non-adaptive
methods.
 Low Complexity: Despite its adaptiveness, ADPCM remains relatively simple and low
in computational complexity, making it suitable for real-time applications and devices
with limited processing power.

Applications:

ADPCM is widely used in scenarios where bandwidth or storage is limited but where a
reasonable quality of audio is still required. This includes:

 Telephony: Particularly in VoIP (Voice over IP) services where efficient data
transmission is crucial.
 Gaming: For in-game audio and voice chat, where keeping data rates low without
significantly compromising audio quality is important.
 Portable Media Players: Allowing for more audio files to be stored in memory-
constrained devices.

ADPCM strikes a balance between the simplicity of DPCM and the need for efficient audio
compression, making it a popular choice for a wide range of applications.

Examples of ADPCM Audio Compression

Adaptive Differential Pulse Code Modulation (ADPCM) is a method used extensively in digital
audio applications to compress audio files effectively. It's particularly well-suited for
environments where bandwidth or storage is limited but where better audio quality than basic
PCM or DPCM is desired. Below are practical examples illustrating where ADPCM compression
might be used:

Example 1: VoIP Communications

In Voice over Internet Protocol (VoIP) services, ADPCM is utilized to compress the audio signal
before transmission. This compression allows for clear voice communication over internet
connections with limited bandwidth. By dynamically adjusting the quantization step size based
on the audio signal's variability, ADPCM ensures that voice clarity is maintained even with the
compression, making conversations more intelligible compared to using non-adaptive methods.

Example 2: Digital Audio Players

Portable digital audio players often support ADPCM to maximize the number of songs that can
be stored on the device without significantly compromising audio quality. For instance, a player
might use ADPCM to compress audio files so that users can enjoy longer playback times and
carry a larger music library within the same storage capacity.

Example 3: Video Game Sound Effects

Video games, especially those on consoles or devices with limited storage capacity, use ADPCM
to compress sound effects and voice clips. This allows developers to include a wide range of
high-quality audio assets in the game without exhausting the available storage space. ADPCM's
efficient compression helps maintain immersive sound experiences while keeping game file sizes
manageable.
Example 4: Audio Books

Audio book platforms may use ADPCM to compress spoken-word content. This enables them to
offer extensive libraries of audio books that are easily downloadable, even over connections with
limited data rates. Listeners can enjoy clear, understandable narration with reduced file sizes,
facilitating the storage of multiple books on their devices.

Example 5: Wireless Audio Devices

Wireless audio devices, such as Bluetooth headsets and speakers, may employ ADPCM
compression to transmit audio data efficiently over wireless connections. This ensures that users
experience minimal latency and decent audio quality even when the available bandwidth is
constrained.

In these examples, ADPCM provides a practical solution for delivering audio content where the
balance between compression efficiency and audio quality is crucial. Its adaptability to the
signal's characteristics allows for significant data size reduction while minimizing perceptible
loss in sound quality, making it an invaluable tool in digital audio processing and transmission.

3.6.3. LPC Audio Compression


Linear Predictive Coding (LPC) is an advanced audio compression technique primarily used for
speech signals. LPC operates on the principle that a speech signal can be approximated as a
linear combination of past signal samples. This method is particularly effective for compressing
voice audio because it models the human vocal tract's behavior, which produces speech sounds
through a series of filters.

How LPC Works:

1. Signal Modeling: LPC analyzes a segment of the audio signal to determine a set of
coefficients that effectively model the signal's spectral envelope. These coefficients represent
the filter that simulates the vocal tract during the production of speech sounds.
2. Prediction: The LPC model predicts each sample of the speech signal based on a linear
combination of previous samples. The difference between the actual sample and its predicted
value (prediction error) is then encoded.
3. Encoding: Only the LPC coefficients, the prediction error, and some additional parameters
like pitch and gain (for voiced sounds) need to be encoded and transmitted or stored,
significantly reducing the data size compared to the original audio.
4. Decoding and Reconstruction: At the decoder, the encoded data (LPC coefficients and
additional parameters) is used to reconstruct the audio signal. A synthetic excitation signal,
generated based on the pitch information and filtered through the LPC filter defined by the
coefficients, produces the final speech output.

Advantages of LPC:

 High Compression Ratio: LPC can achieve high compression ratios, especially for
speech audio, by efficiently modeling the vocal tract with a small number of parameters.
 Low Bitrate: It is suitable for applications requiring low-bitrate transmission, such as
telecommunications and voice assistants, due to its efficient representation of speech.
 Quality: While highly compressed, LPC-generated speech retains intelligibility and can
be of sufficient quality for many applications involving voice communication.

Applications:

 Telephony: LPC's efficient compression makes it well-suited for digital telephony
systems, where bandwidth is limited.
 Speech Recognition: LPC coefficients can be used as features in speech recognition
systems due to their effective representation of speech characteristics.
 Text-to-Speech Systems: LPC can generate speech from text by synthesizing speech
signals based on LPC coefficients, offering a compact way to produce understandable
voice output.

LPC's focus on modeling the vocal tract and its effectiveness in compressing speech audio make
it distinct from other general audio compression techniques that might prioritize music or
broader ranges of sound. Its use in voice-related technologies highlights its importance in
facilitating efficient digital communication.
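
The analysis half of LPC can be sketched with NumPy: estimate the autocorrelation of a frame,
solve the Yule-Walker (normal) equations for the predictor coefficients, and check how small the
prediction residual is. The frame below is a made-up test signal rather than real speech, and the
sketch omits quantization, pitch modeling, and synthesis:

import numpy as np

def lpc_coefficients(frame: np.ndarray, order: int) -> np.ndarray:
    """Solve the Yule-Walker (normal) equations for the LPC predictor coefficients."""
    # Autocorrelation estimates r[0..order]
    r = np.array([np.dot(frame[:len(frame) - k], frame[k:]) for k in range(order + 1)])
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    return np.linalg.solve(R, r[1:order + 1])    # x[n] ~ a[0]x[n-1] + ... + a[p-1]x[n-p]

rng = np.random.default_rng(0)
n = np.arange(240)
frame = np.sin(2 * np.pi * 0.05 * n) + 0.05 * rng.standard_normal(n.size)  # stand-in "voiced" frame

order = 4
a = lpc_coefficients(frame, order)
pred = sum(a[k] * frame[order - 1 - k : len(frame) - 1 - k] for k in range(order))
residual = frame[order:] - pred
print("LPC coefficients:", a.round(3))
print("signal power:", round(float(np.mean(frame[order:] ** 2)), 4),
      "| residual power:", round(float(np.mean(residual ** 2)), 4))

The residual power comes out far below the signal power, which is why transmitting the
coefficients plus the small residual is so much cheaper than transmitting the raw samples.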

Examples of LPC Audio Compression


Linear Predictive Coding (LPC) is a widely utilized audio compression technique, especially in
the realm of speech compression and synthesis. It excels in applications where the primary audio
content is human speech due to its efficient modeling of the vocal tract. Here are practical
examples illustrating the use of LPC in various applications:

Example 1: Telecommunications

In telecommunication systems, especially in earlier mobile phones and VoIP services, LPC
compression is employed to transmit voice data efficiently over limited bandwidth. For instance,
the LPC-10e algorithm, transmitting at 2.4 kbps, has been used in military communications and
satellite links to ensure clear voice transmission with minimal data usage.

Example 2: Speech Recognition Systems

LPC coefficients provide a compact representation of speech signal characteristics, making them
valuable features for speech recognition algorithms. By analyzing the patterns in LPC
coefficients, speech recognition systems can differentiate between phonemes, words, or phrases,
improving their accuracy in understanding spoken commands.

Example 3: Text-to-Speech (TTS) Synthesis

Text-to-speech systems often use LPC to generate artificial speech sounds. Given a text input,
these systems convert the text into LPC parameters that model the sounds of spoken language. A
synthetic excitation signal is then filtered through the LPC model to produce speech. This
approach allows for highly intelligible speech output with relatively low computational
requirements.

Example 4: Voice Storage and Retrieval Systems

Voice mail and other voice storage systems benefit from LPC's high compression efficiency,
allowing them to store lengthy voice messages in minimal space. By compressing the voice data
with LPC, these systems can optimize storage capacity and facilitate quick retrieval and playback
of messages.

Example 5: Hearing Aids and Assistive Listening Devices


LPC is used in some hearing aids and assistive listening devices to enhance speech intelligibility
for the hearing impaired. By modeling and emphasizing the critical features of speech, these
devices can make it easier for users to understand spoken words, especially in noisy
environments.

These examples underscore the versatility of LPC in handling speech signals, offering a balance
between compression efficiency and the preservation of speech intelligibility and quality. LPC's
ability to model human vocal characteristics with a small set of parameters makes it an
indispensable tool in audio processing applications focused on human speech.

3.6.4. CELP Audio Compression


Code-Excited Linear Prediction (CELP) is an advanced audio compression technique
predominantly used for speech compression. CELP combines the principles of Linear Predictive
Coding (LPC) with codebook-driven excitation, making it highly efficient for compressing
speech signals with good quality at low bit rates. It's especially favored in telecommunications,
such as mobile telephony and Voice over IP (VoIP), due to its balance between compression
efficiency and audio quality.

How CELP Works:

1. Linear Prediction: Like LPC, CELP uses linear predictive coding to model the vocal
tract's transfer function, generating a set of LPC coefficients that describe the spectral
envelope of the speech signal.
2. Codebook Excitation: The core of CELP is its use of a stochastic codebook (or
excitation codebook) containing random signal samples, which, when filtered through the
LPC filter, can approximate a wide range of speech sounds. Additionally, CELP may use
an adaptive codebook to exploit the pitch periodicity of speech for voiced sounds.
3. Search and Optimization: For each segment of speech, the CELP encoder searches the
codebook to find the best match that, when passed through the LPC filter, closely
approximates the original speech signal. The selection is based on minimizing the error
between the original and synthesized speech.
4. Encoding: The chosen codebook index, the LPC coefficients, and any additional
parameters (like gain factors) are encoded and transmitted or stored.
5. Decoding: The decoder uses the received LPC coefficients and codebook indices to
synthesize the speech signal, reconstructing the speech from the compressed data.
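
The codebook search in steps 2-4 can be pictured as a brute-force analysis-by-synthesis loop: try
every excitation vector, filter it through 1/A(z), and keep the one whose output is closest to the
input frame. The sketch below assumes NumPy and SciPy, a tiny random stochastic codebook, and
known LPC coefficients; real CELP codecs add an adaptive (pitch) codebook, perceptual weighting,
and much faster structured searches.

import numpy as np
from scipy.signal import lfilter

def celp_search(target_frame, lpc_a, codebook):
    """Pick the codebook entry and gain whose synthesized output best
    matches the target speech frame (minimum squared error)."""
    best = (None, 0.0, np.inf)                       # (index, gain, error)
    for idx, codeword in enumerate(codebook):
        synth = lfilter([1.0], lpc_a, codeword)      # excitation through 1/A(z)
        # Least-squares gain for this codeword
        gain = np.dot(synth, target_frame) / (np.dot(synth, synth) + 1e-12)
        err = np.sum((target_frame - gain * synth) ** 2)
        if err < best[2]:
            best = (idx, gain, err)
    return best[0], best[1]                          # only the index and gain are transmitted

# Toy setup: 40-sample subframe, 64-entry stochastic codebook, illustrative LPC filter
rng = np.random.default_rng(0)
codebook = rng.standard_normal((64, 40))
lpc_a = np.array([1.0, -1.3, 0.8, -0.4, 0.2])
target = lfilter([1.0], lpc_a, 0.7 * codebook[17])  # pretend this is the input speech
index, gain = celp_search(target, lpc_a, codebook)
print(index, round(gain, 2))                         # recovers entry 17 with gain close to 0.7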

Advantages of CELP:

 High Quality at Low Bit Rates: CELP achieves high-quality speech reproduction at low
bit rates (ranging from 4.8 kbps to 16 kbps), making it ideal for bandwidth-constrained
applications.
 Flexibility: The technique is adaptable to various speech characteristics, ensuring
consistent quality across different speakers and speech content.
 Widespread Use: CELP and its variants (such as Enhanced Variable Rate Codec, EVRC,
and Adaptive Multi-Rate, AMR) are standardized and used widely in mobile and digital
telephony systems.

Applications:

1. Mobile Telephony: CELP codecs are used in GSM and other mobile phone standards to
compress voice data, allowing efficient transmission over wireless networks.
2. VoIP Services: Voice over IP services utilize CELP to deliver clear voice communication
over internet connections, optimizing bandwidth usage.
3. Audio Streaming: For applications where low bit rate audio streaming is necessary,
CELP provides a means to stream voice content with minimal data use.
4. Voice Messaging and Voicemail Systems: CELP compression allows these systems to
store voice messages compactly without significantly compromising playback quality.

CELP's efficient handling of speech signals, balancing compression and quality, has made it a
cornerstone in digital voice communication, enabling clear and intelligible speech transmission
even under limited bandwidth conditions.

Examples of CELP Audio Compression

Code-Excited Linear Prediction (CELP) is a sophisticated audio compression technique primarily
utilized for speech compression. It's designed to effectively encode speech at low bit rates
while preserving intelligibility and quality. Here are some practical examples and applications
where CELP compression technology is employed:

Example 1: Mobile Telephony

In mobile telephony, particularly in GSM (Global System for Mobile communications), CELP
codecs are standard for compressing voice data to ensure efficient use of the limited bandwidth
available for mobile networks. For example, the Half Rate (HR) and Enhanced Full Rate (EFR)
codecs used in GSM are based on CELP-family techniques (VSELP and ACELP, respectively), allowing
for clear voice calls at low bit rates of roughly 5.6 kbps to 12.2 kbps.

Example 2: Voice Over Internet Protocol (VoIP)

VoIP services, such as Skype and other internet telephony applications, use CELP-based codecs
to compress voice data for transmission over the internet. This compression is crucial for
reducing bandwidth usage and ensuring high-quality voice communication over IP networks. The
Adaptive Multi-Rate (AMR) codec, which adapts the bit rate dynamically based on network
conditions, is an example of a CELP codec used in VoIP.

Example 3: Digital Dictation and Voice Recording Devices

Digital dictation devices and some voice recorders use CELP compression to maximize storage
efficiency, allowing users to record hours of voice notes or dictations without requiring large
amounts of storage space. CELP's efficient speech compression algorithms enable these devices
to offer extended recording times while maintaining intelligible audio quality.

Example 4: Satellite Communications

In satellite communications, especially in systems where bandwidth is extremely precious and
costly, CELP codecs are utilized for voice transmission. Their ability to deliver understandable
speech at low bit rates makes them ideal for communication in remote areas, disaster response
scenarios, and military applications where satellite bandwidth may be limited.

Example 5: Audio Books and Language Learning Applications

Some audio book platforms and language learning apps employ CELP compression to reduce file
sizes of spoken word content. This makes it easier for users to download audio books or
language lessons, even with limited internet bandwidth, without significantly sacrificing the
audio clarity that keeps the listening experience enjoyable.

These examples highlight CELP's significance in various domains, particularly where efficient
speech compression is vital. By enabling clear, intelligible speech transmission at low bit rates,
CELP codecs facilitate a wide range of applications in telecommunications, digital recording,
and online learning, among others.

3.7. Image Compression


Image compression is a process used to reduce the size of digital images, thereby decreasing
storage requirements and speeding up file transfers without significantly impacting the visual
quality for the intended use. This process is crucial for web development, digital photography,
and online content sharing, where bandwidth and storage efficiency are important. Image
compression can be either lossless or lossy, depending on whether the process allows for perfect
reconstruction of the original image.

Lossless Compression

 Description: Lossless compression algorithms reduce the file size without losing any
image data or quality. The original image can be perfectly reconstructed from the
compressed version.
 Techniques: Common lossless approaches include Run-Length Encoding (RLE), Huffman
Coding, and the PNG (Portable Network Graphics) format, which pairs prediction filters with
DEFLATE compression.
 Use Cases: Essential for applications where image integrity is critical, such as medical
imaging, technical drawings, and archival purposes.

Lossy Compression

 Description: Lossy compression algorithms achieve higher compression ratios by selectively
discarding less important information, leading to a reduction in image quality
that might be imperceptible to the human eye.
 Techniques: Widely used lossy compression methods include JPEG (Joint Photographic
Experts Group) and WebP formats, which are particularly effective for compressing
photographic images.
 Use Cases: Commonly used for web images, social media, digital photography, and any
application where slight decreases in image quality are acceptable in exchange for
significant reductions in file size.

Key Concepts in Image Compression

 Bit Depth: Refers to the number of bits used to represent each pixel's color in an image.
Reducing bit depth can decrease file size but also reduces color fidelity.
 Resolution: Lowering the resolution (number of pixels) of an image reduces its file size
but also affects its clarity and detail, especially when viewed at large sizes or on high-
resolution displays.
 Chroma Subsampling: A technique used in lossy compression for color images, based
on the principle that the human eye is less sensitive to color detail than to luminance. It
reduces the resolution of the chroma (color) information more than the luma (brightness),
effectively compressing the image with minimal perceived loss in quality.
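
As a rough illustration of 4:2:0 chroma subsampling, the sketch below (assuming NumPy and an image
already split into Y, Cb, and Cr planes with even dimensions) keeps the luma plane at full
resolution while averaging each 2x2 block of the chroma planes into a single sample. Averaging is
just one simple choice; codecs differ in exactly how chroma samples are sited and filtered.

import numpy as np

def subsample_420(y, cb, cr):
    """Keep luma (Y) at full resolution; reduce each chroma plane (Cb, Cr)
    by averaging non-overlapping 2x2 blocks (4:2:0 subsampling)."""
    def halve(plane):
        h, w = plane.shape
        return plane.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))
    return y, halve(cb), halve(cr)

# Toy 4x4 planes: the chroma planes shrink to 2x2, so the three planes together
# hold half as many samples as three full-resolution planes would.
y  = np.arange(16, dtype=float).reshape(4, 4)
cb = np.full((4, 4), 128.0)
cr = np.full((4, 4), 128.0)
y2, cb2, cr2 = subsample_420(y, cb, cr)
print(y2.shape, cb2.shape, cr2.shape)   # (4, 4) (2, 2) (2, 2)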

Image Compression Process

1. Preprocessing: Involves converting the image to a suitable format and possibly modifying
bit depth or resolution.
2. Transformation: Many compression algorithms transform the image data from the spatial
domain to a frequency domain (e.g., using the Discrete Cosine Transform in JPEG), where
it's easier to identify and discard less important information.
3. Quantization (for lossy compression): Reduces the precision of the transformed image
data, effectively compressing the image by removing details that have little impact on visual
quality.
4. Encoding: Applies further compression through techniques like Huffman Coding or
arithmetic coding, efficiently encoding the image data into a compact form.
5. Decoding and Reconstruction: For viewing or editing, the compressed image is decoded. In
lossless compression, this step perfectly reconstructs the original image. In lossy
compression, the image closely approximates the original.

Image compression is a balancing act between reducing file size and maintaining adequate image
quality. The optimal compression method and settings depend on the specific requirements of
each application, such as the importance of image fidelity versus the need for efficient storage
and transmission.

3.7.1. JPEG
JPEG (Joint Photographic Experts Group) image compression is one of the most widely used
methods for compressing photographic images. The JPEG standard utilizes a lossy compression
technique, which means that some original image information is lost during the compression
process, but in such a way that the loss is generally imperceptible to the human eye. This allows
JPEG files to be significantly smaller than their uncompressed counterparts, making them ideal
for use on the web, in digital photography, and for storing large numbers of images where disk
space is at a premium.

How JPEG Compression Works:

1. Color Space Conversion: JPEG typically starts by converting the image from the RGB
color model (used by most digital cameras and computer displays) to the YCbCr color model.
This model separates the image into luminance (brightness) information (Y) and two
chrominance (color) components (Cb and Cr). This step is based on the fact that the human
eye is more sensitive to variations in brightness than in color.
2. Subsampling: The chrominance components are often subsampled because the human eye is
less sensitive to fine color details. This reduces the resolution of the color information while
maintaining the full resolution of the luminance data, contributing to overall compression.
3. Discrete Cosine Transform (DCT): The image is divided into blocks (usually 8x8 pixels),
and each block is transformed from the spatial domain to the frequency domain using the
Discrete Cosine Transform. This step concentrates most of the signal energy into a few low-
frequency components of the DCT coefficients.
4. Quantization: The DCT coefficients are quantized, which means reducing the precision of
the coefficients according to a quantization matrix, which more heavily quantizes high-
frequency components than low-frequency components. This step introduces the lossiness of
JPEG compression but significantly reduces the amount of data.
5. Entropy Coding: The quantized coefficients are then encoded using entropy coding
techniques like Huffman coding or arithmetic coding, further compressing the data by
exploiting statistical redundancy.
6. File Structure: The compressed data is packaged into a JPEG file format, along with headers
and other necessary metadata, such as the quantization and Huffman tables used for
decoding.
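
Steps 3 and 4 are where most of the compression happens, and they can be sketched in a few lines.
The example below assumes NumPy and SciPy and works on a single 8x8 luminance block; the
quantization table is the example luminance table from Annex K of the JPEG standard, and a real
encoder would additionally apply quality scaling to it, reorder the coefficients in zig-zag order,
and entropy-code the result.

import numpy as np
from scipy.fft import dctn, idctn

# Example luminance quantization table (JPEG Annex K); quality scaling omitted
Q = np.array([
    [16, 11, 10, 16,  24,  40,  51,  61],
    [12, 12, 14, 19,  26,  58,  60,  55],
    [14, 13, 16, 24,  40,  57,  69,  56],
    [14, 17, 22, 29,  51,  87,  80,  62],
    [18, 22, 37, 56,  68, 109, 103,  77],
    [24, 35, 55, 64,  81, 104, 113,  92],
    [49, 64, 78, 87, 103, 121, 120, 101],
    [72, 92, 95, 98, 112, 100, 103,  99],
], dtype=float)

def encode_block(block):
    """Level-shift an 8x8 block of 0-255 samples, apply the 2-D DCT,
    and quantize the coefficients with the table Q (the lossy step)."""
    coeffs = dctn(block.astype(float) - 128.0, norm="ortho")
    return np.round(coeffs / Q).astype(int)

def decode_block(quantized):
    """Dequantize and invert the DCT to approximate the original block."""
    return idctn(quantized * Q, norm="ortho") + 128.0

# A smooth 8x8 gradient quantizes to only a handful of nonzero coefficients,
# which is exactly the sparsity the entropy-coding stage exploits.
block = np.tile(np.linspace(50, 200, 8), (8, 1))
q = encode_block(block)
print(np.count_nonzero(q), "nonzero coefficients out of 64")
print(round(float(np.abs(decode_block(q) - block).max()), 1), "max reconstruction error")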

Advantages of JPEG Compression:

 Efficiency: JPEG can achieve high compression ratios, significantly reducing file sizes
without noticeable loss in image quality for most applications.
 Flexibility: The compression level can be adjusted, allowing a trade-off between image
quality and file size according to the user's needs.
 Popularity and Compatibility: JPEG is supported by virtually all web browsers, image
editing software, and digital cameras, making it a universal format for digital images.

Considerations:

 Lossy Compression: The lossy nature of JPEG means that some original image detail is
permanently lost, which can become noticeable at high compression levels or after
repeatedly saving the image.
 Artifacts: High levels of compression can lead to visible artifacts such as blockiness,
blurring, and "ringing" around edges.

JPEG's balance between compression efficiency and image quality, along with its widespread
support, has made it the format of choice for a vast array of digital imaging applications.

Examples of JPEG Image Compression

JPEG (Joint Photographic Experts Group) image compression is a ubiquitous technology in digital
imaging, used across a variety of applications to reduce file sizes while maintaining
acceptable image quality. Here are practical examples where JPEG compression is commonly
employed:

1. Digital Photography

Nearly all digital cameras and smartphones use JPEG compression to store high-quality
photographs efficiently. This allows users to save thousands of images on a single device,
balancing quality and storage space. Photographers often choose JPEG for quick sharing and
when shooting in rapid succession, where storage efficiency is crucial.

2. Web Images

Many images on the internet are stored and transmitted in JPEG format. Website developers
prefer JPEG for photographic content due to its high compression ratio, which significantly
reduces webpage load times and conserves bandwidth without compromising visual appeal.

3. Social Media

Social media platforms automatically compress uploaded images to JPEG to optimize server
storage and improve content delivery speeds. Despite the lossy compression, the retained image
quality is typically sufficient for viewing on screens, making JPEG ideal for sharing moments
and visual content online.

4. Email Attachments

When sending images as email attachments, JPEG is the go-to format because it reduces the file
size, facilitating quicker sending and receiving, especially when internet connectivity is limited.
This efficiency makes JPEG preferred for both personal and professional correspondence.

5. Real Estate and Online Marketplaces

JPEG images are extensively used in online real estate listings and e-commerce platforms. High-
quality photographs of properties, products, and services are compressed as JPEGs to provide
clear and attractive visuals to potential buyers, optimizing the browsing experience by enabling
fast image loading.

6. Graphic Design and Digital Marketing

In graphic design and digital marketing materials, JPEG is used for final drafts of visual assets
intended for digital viewing, such as banners, newsletters, and online ads. Designers often work
in lossless formats but convert to JPEG for distribution to balance quality with file size.
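
A minimal sketch of that final conversion step, assuming the Pillow library; the file names and
the quality value are illustrative.

from PIL import Image
import os

# Convert a lossless working file to JPEG for distribution; the quality
# parameter trades file size against fidelity (values around 75-90 are common).
src = "banner_master.png"               # hypothetical lossless source
img = Image.open(src).convert("RGB")    # JPEG does not store an alpha channel
img.save("banner_web.jpg", format="JPEG", quality=85, optimize=True)

print(os.path.getsize(src), "bytes ->", os.path.getsize("banner_web.jpg"), "bytes")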

7. Photo Printing Services

Online photo printing services often accept JPEG files for printing. While these services may
support higher-quality formats, JPEG remains popular among consumers for its convenience.
The format provides a good balance of quality and file size, making it suitable for producing
physical prints from digital photos.

These examples demonstrate the versatility and efficiency of JPEG image compression across
different domains, highlighting its role as a fundamental technology in digital imaging. Its
widespread adoption is a testament to its ability to provide a practical compromise between
image quality and storage or bandwidth requirements.
