Understanding MPEG-2: Digital Video Compression
By Mark Long
( All graphics presented below are from The New Age of Digital Video Compression,
Produced by Shelburne Films © 1994. All Rights Reserved. )
Compressed Digital Video (CDV) is the driving force behind the world-wide revolution in
satellite delivered Direct to Home (DTH) TV program distribution. CDV television signals are
transmitted in an abbreviated format that dramatically reduces the amount of frequency
bandwidth required without substantially degrading the quality of the received pictures and
sound. The introduction of CDV technology is causing a dramatic decline in the operational
costs for TV service providers. The result has been a global explosion in the number of new
satellite delivered DTH TV services, including news, sports, movies, Pay Per View (PPV)
special events, educational programming, and narrowcast offerings that can target the needs
of small segments within any potential viewing audience.
Personal computers have long used digital compression techniques to reduce the amount of
data storage capacity needed to save large computer files. For the past decade, the global
telephone industry also has been using compression techniques to reduce the bandwidth, and
consequently the cost, required for establishing narrow-band telephone circuits. During the
early 1990s, communications engineers began developing high-capacity very large scale
integrated circuits (VLSICs) and sophisticated software routines that could compress broad-
band telecommunications signals such as video.
A conventional PAL or SECAM video signal contains 625 lines per individual image or frame.
Flashed at a rate of 25 frames per second, each frame has two interlaced fields, each
consisting of 312 and 1/2 lines, with field '1' displaying the even numbered lines and field '2'
displaying the odd numbered lines. The interlaced scanning of the two fields is so rapid that
the eye perceives each complete image or frame rather than the two separate fields within
any one frame.
A single frame of conventional analogue PAL or SECAM video is composed of more than
174,000 picture elements or pixels. Since video operates at a rate of 25 frames per second,
more than 4 million pixels are being sent to the TV screen each second. Digital TV
transmission systems convert the visual and audio information into streams of binary digits or
bits: strings of zeros and ones that correspond to the 'off' and 'on' logic states of computer
circuitry. A maximum of 32 bits per pixel, 8 bits per luminance element and 24 bits per
chrominance element, is needed to digitize a single color video signal. Four million pixels times
32 bits per pixel equals a transmission rate of 128 Megabits per second.
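The arithmetic above can be checked with a few lines of Python. This is only a back-of-envelope sketch using the article's own figures; the constant and function names are illustrative. Note that the 128 Mb/s figure comes from rounding the pixel rate down to 4 million per second; the unrounded product of the quoted numbers is somewhat higher.

```python
# Back-of-envelope raw (uncompressed) bit rate, using the article's figures.
PIXELS_PER_FRAME = 174_000   # "more than 174,000" pixels per frame
FRAMES_PER_SECOND = 25       # PAL / SECAM frame rate
BITS_PER_PIXEL = 32          # maximum bits per pixel quoted in the article


def raw_bit_rate(pixels_per_frame, frames_per_second, bits_per_pixel):
    """Return the uncompressed bit rate in bits per second."""
    return pixels_per_frame * frames_per_second * bits_per_pixel


pixel_rate = PIXELS_PER_FRAME * FRAMES_PER_SECOND
exact = raw_bit_rate(PIXELS_PER_FRAME, FRAMES_PER_SECOND, BITS_PER_PIXEL)
rounded = 4_000_000 * BITS_PER_PIXEL  # the article rounds to 4 million pixels/s

print(f"pixels per second: {pixel_rate:,}")
print(f"unrounded rate:    {exact / 1e6:.1f} Mb/s")
print(f"article's figure:  {rounded / 1e6:.0f} Mb/s")
```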
A 128 Mb/s digital TV signal would fill an entire satellite transponder to capacity! To allow
multiple digital TV signals to occupy the same transponder, some form of signal compression
is needed to dramatically reduce the number of bits that each TV service requires.
In 1991, the MPEG 1 standard was introduced to handle the compressed digital representation
of non video sources of multimedia. However, some manufacturers soon discovered that MPEG
1 could be adapted for the transmission of video signals as long as the video material was first
converted from the original interlaced mode to a progressively scanned format. A few TV
programmers elected to use MPEG 1 to transmit via satellite while the MPEG committee
developed a standard specifically for interlaced scanning applications known as MPEG 2. The
MPEG committee selected its criteria for MPEG 2 in 1994.
The compressed digital video encoder scans subsections within each frame, called macro
blocks, and identifies which ones will not change position from one frame to the next. The
encoder also identifies predictor macro blocks while noting their position and direction of
motion. Only the relatively small difference, called the motion compensated residual, between
each predictor block and the affected current block is transmitted to the receiver. The
receiver/decoder stores the information that does not change from frame to frame in its buffer
memory and uses it periodically to fill in the blanks, so to speak.
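The idea of sending only a motion vector plus a residual can be sketched as follows. This is a minimal illustration, not an MPEG 2 encoder: it assumes an exhaustive search using the sum of absolute differences as the matching cost, tiny 8 x 8 test frames, and a 4 x 4 block (MPEG 2 macro blocks are 16 x 16). All function names are hypothetical.

```python
def get_block(frame, top, left, size):
    """Cut a size x size block out of a frame stored as a list of rows."""
    return [row[left:left + size] for row in frame[top:top + size]]


def sad(a, b):
    """Sum of absolute differences between two equally sized blocks."""
    return sum(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def best_match(prev, cur_block, top, left, size, search):
    """Exhaustively search prev within +/- search pixels for the best predictor."""
    height, width = len(prev), len(prev[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            t, l = top + dy, left + dx
            if 0 <= t <= height - size and 0 <= l <= width - size:
                cost = sad(get_block(prev, t, l, size), cur_block)
                if best is None or cost < best[0]:
                    best = (cost, dy, dx)
    return best


def motion_compensated_residual(prev, cur, top, left, size, search):
    """Return the motion vector and the residual for one block of cur."""
    cur_block = get_block(cur, top, left, size)
    _, dy, dx = best_match(prev, cur_block, top, left, size, search)
    predictor = get_block(prev, top + dy, left + dx, size)
    residual = [[c - p for c, p in zip(rc, rp)]
                for rc, rp in zip(cur_block, predictor)]
    return (dy, dx), residual


def frame_with_square(left):
    """8 x 8 black frame with a bright 2 x 2 square at rows 2-3."""
    frame = [[0] * 8 for _ in range(8)]
    for y in (2, 3):
        for x in (left, left + 1):
            frame[y][x] = 200
    return frame


prev = frame_with_square(2)   # square at columns 2-3
cur = frame_with_square(3)    # same square, moved one pixel to the right

mv, res = motion_compensated_residual(prev, cur, top=2, left=2, size=4, search=2)
print("motion vector (dy, dx):", mv)
print("residual is all zero:", all(v == 0 for row in res for v in row))
```

In this toy case the block is found intact one pixel to the left in the previous frame, so the residual carries no information and only the motion vector would need to be sent.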
A mathematical algorithm called the Discrete Cosine Transform (DCT) reorganizes the residual
difference between frames from a spatial domain into an equivalent series of coefficient
numbers in a frequency domain that can be more quickly transmitted. Quantization coding
converts these sets of coefficient numbers into even more compact representative numbers.
The encoder refers to an internal index or code book of possible representative numbers from
which it selects the code word that best matches each set of coefficients. Quantization coding
also rounds off all coefficient values, within a certain range of limits, to the same value.
Although this results in an approximation of the original signal, it is close enough to the
original to be acceptable for most viewing applications.
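The transform and rounding steps can be illustrated with a small sketch. It assumes an orthonormal 8 x 8 DCT and a single hypothetical quantization step size applied to every coefficient; real MPEG 2 encoders use quantizer matrices and entropy coding that are not modelled here.

```python
import math

N = 8  # block size used by the DCT in this sketch


def dct_2d(block):
    """Orthonormal 2-D DCT-II of an N x N block (list of rows)."""
    def scale(k):
        return math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)

    out = [[0.0] * N for _ in range(N)]
    for u in range(N):          # vertical frequency index
        for v in range(N):      # horizontal frequency index
            total = 0.0
            for y in range(N):
                for x in range(N):
                    total += (block[y][x]
                              * math.cos((2 * y + 1) * u * math.pi / (2 * N))
                              * math.cos((2 * x + 1) * v * math.pi / (2 * N)))
            out[u][v] = scale(u) * scale(v) * total
    return out


def quantize(coeffs, step):
    """Round every coefficient to the nearest multiple of step (as a level)."""
    return [[round(c / step) for c in row] for row in coeffs]


def dequantize(levels, step):
    """Approximate the original coefficients from the transmitted levels."""
    return [[level * step for level in row] for row in levels]


# A smooth horizontal ramp: every row is 0, 4, 8, ..., 28.
ramp = [[4 * x for x in range(N)] for _ in range(N)]

coeffs = dct_2d(ramp)
levels = quantize(coeffs, step=16)          # step=16 is an arbitrary choice
nonzero = sum(1 for row in levels for level in row if level != 0)

print("DC coefficient:", round(coeffs[0][0], 3))
print("non-zero levels after quantization:", nonzero, "of", N * N)
```

Because smooth picture content concentrates its energy in a few low-frequency coefficients, most quantized levels come out as zero, which is what makes the block cheap to transmit.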
I, P, and B Frames
MPEG 2 provides for up to three types of frames called the I, P and B frames. The intra-frame,
or 'I' frame, serves as a reference for predicting subsequent frames. 'I' frames, which occur on
an average of one out of every ten to fifteen frames, contain only information presented
within itself. 'P' Frames are predicted from information presented in the nearest preceding 'I'
or 'P' frame. The bi-directional 'B' frames are coded using prediction data from the nearest
preceding 'I' or 'P' frame AND the nearest following 'I' or 'P' frame.
Not all MPEG 2 systems use 'B' frames. Although a more efficient level of compression is
achieved by 'B' frames, compatible receiver/decoders must have an additional memory buffer,
which increases the cost of the decoder.
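The prediction rules for I, P and B frames, and the reason 'B' frames call for extra decoder memory, can be sketched as follows. The frame pattern "IBBPBBP" is only an assumed example, and the function names are hypothetical. The sketch lists which frames each frame is predicted from, then derives an order in which frames could be sent so that every reference arrives before the frames that depend on it.

```python
def references(pattern):
    """Map each frame index (display order) to the frames it is predicted from."""
    anchors = [i for i, kind in enumerate(pattern) if kind in "IP"]
    refs = {}
    for i, kind in enumerate(pattern):
        before = [a for a in anchors if a < i]
        after = [a for a in anchors if a > i]
        if kind == "I":
            refs[i] = []                         # self-contained
        elif kind == "P":
            refs[i] = before[-1:]                # nearest preceding I or P
        else:
            refs[i] = before[-1:] + after[:1]    # nearest I/P on both sides
    return refs


def transmission_order(pattern):
    """Return an order in which each frame follows all of its references."""
    refs = references(pattern)
    sent, order = set(), []
    pending = list(range(len(pattern)))
    while pending:
        for i in pending:
            if all(r in sent for r in refs[i]):
                sent.add(i)
                order.append(i)
                pending.remove(i)
                break
        else:
            raise ValueError("pattern has an unsatisfiable reference")
    return order


pattern = "IBBPBBP"  # assumed display-order pattern, for illustration only
order = transmission_order(pattern)

print("references:", references(pattern))
print("send order:", [f"{pattern[i]}{i}" for i in order])
```

In this sketch the 'P' frame at position 3 has to be sent before the two 'B' frames that are displayed ahead of it, so the decoder must hold it in memory until its display time arrives.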
MPEG 2 Satellite Transmission Rates
The transmission speed required for any MPEG 2 broadcast varies according to the nature of
the video material. The MPEG 2 encoder located at the satellite uplink has a finite time to
make encoding decisions. Pre-recorded movies and other taped material do not push the time
constraints of the encoder to the limit; the encoder can select at its leisure the most efficient
method for encoding at the lowest possible data rate. Live sports and other live action
materials require a higher data rate because the encoder is forced to make immediate coding
decisions and must also transmit complex, rapid motion changes without introducing high
levels of distortion.
Tuning into an MPEG 2 compressed digital video signal, however, is totally different. Weak
signals are virtually indistinguishable from random noise; the receiver will not display any
picture at all until sufficient signal is reaching the antenna. Once the digital threshold of the
receiver/decoder has been exceeded, a perfect picture will appear on the TV screen. MPEG 2
reception is like a light switch: it's either on or off. Furthermore, if the installer moves past the
antenna's peak performance position, the receiver/decoder will 'freeze frame' on the last
picture in its buffer memory and will not receive any further video until the antenna is brought
back to its optimum position.
Rain fades, which occur whenever there is moisture in the atmosphere between the satellite
and the receiving system on the ground, can degrade the incoming signal level. DTH installers
are advised to use antenna tuning meters or portable spectrum analyzers to peak antenna
performance, so that the DTH systems they install will have some signal margin above
receiver/decoder threshold to compensate for any Ku-band signal degradation caused by rain.
* Bit Rate : This is the amount of data transmitted in one second.
The total bit stream passing through a single satellite transponder consists of as many as eight
TV services with associated audio, auxiliary audio services, conditional access data, and
auxiliary data services such as teletext. The informational bit rate for this transmission may be
as high as 49 megabits (million bits) per second (Mb/s) over a 36-MHz-wide satellite transponder.
A single video signal within this bit stream will have a lower bit rate. For example, a VHS
quality movie can be transmitted at a bit rate of 1.152 Mb/s; a news or general entertainment
TV program at 3.456 Mb/s; live sports at 4.608 Mb/s or studio-quality broadcasts at a rate of
more than 8 Mb/s.
* Bit Error Rate (BER) : Expressed in exponential notation, the Bit Error Rate (BER) is the
fraction of received bits that are in error, and it indicates the performance level of the digital
receiver. For example, a lower BER of 5.0 x 10^-5 is superior to a higher BER of 9.0 x 10^-4.
The lower the BER, the better the receiver/decoder performs during marginal reception
conditions, such as when there is heavy rainfall, snow or wind gusts.
* Forward Error Correction (FEC) : The digital bit stream contains special codes that the
receiver/decoder can use to verify that all bits of information sent have been received
correctly. If bit errors due to thermal noise, for example, are detected, the receiver/decoder
can use the convolutional encoding information that has been forwarded to it by the
transmitting station to either correct each detected error or conceal it. Burst noise
interference, which can be caused by nearby automobile ignition noise or microwave ovens,
can be detected and corrected by the Reed-Solomon block decoder.
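The bit rate and BER figures quoted in the definitions above can be combined in a short sketch. It assumes a hypothetical channel lineup, counts only the video rates (audio, conditional access and auxiliary data would use part of the remaining capacity), and treats the BER as a simple average fraction of bits in error. The names are illustrative.

```python
TRANSPONDER_BPS = 49_000_000  # informational bit rate quoted for a 36 MHz transponder

# Per-service video bit rates quoted in the article, in bits per second.
VIDEO_BPS = {
    "vhs_movie": 1_152_000,
    "news": 3_456_000,
    "live_sports": 4_608_000,
}


def multiplex_load(lineup):
    """Total video bit rate for a lineup given as {service type: count}."""
    return sum(VIDEO_BPS[kind] * count for kind, count in lineup.items())


def expected_errors_per_second(ber, bit_rate_bps):
    """Average number of bit errors per second at a given BER and bit rate."""
    return ber * bit_rate_bps


# Hypothetical lineup of eight TV services sharing one transponder.
lineup = {"vhs_movie": 2, "news": 4, "live_sports": 2}
used = multiplex_load(lineup)
spare = TRANSPONDER_BPS - used

print(f"video load:     {used / 1e6:.3f} Mb/s")
print(f"spare capacity: {spare / 1e6:.3f} Mb/s")

# Compare the two BER values from the definition above at the full stream rate.
for ber in (5.0e-5, 9.0e-4):
    errors = expected_errors_per_second(ber, TRANSPONDER_BPS)
    print(f"BER {ber:.1e}: about {errors:,.0f} bit errors per second")
```

The comparison shows why the smaller BER is the better one: at the same bit rate it corresponds to far fewer errored bits for the error correction stages to deal with.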