MPEG - Motion Picture Expert Group
MPEG Standards
MPEG is an acronym for Moving Picture Experts Group, a committee formed by the
ISO (International Organisation for Standardisation) to develop this standard. MPEG
was formed in 1988 to establish an international standard for the coded representation
of moving pictures and associated audio on digital storage media.
o MPEG-1
Medium Bandwidth (up to 1.5Mbits/sec)
1.25Mbits/sec video 352 x 240 x 30Hz
250Kbits/sec audio (two channels)
Non-interlaced video
Optimised for CD-ROM
o MPEG-2
Higher Bandwidth (up to 40Mbits/sec)
Up to 5 audio channels (i.e. surround sound)
Wider range of frame sizes (including HDTV)
Can deal with interlaced video
o MPEG-3
MPEG-3 was intended for HDTV applications with dimensions up to
1920 x 1080 x 30Hz. However, it was discovered that the MPEG-1
and MPEG-2 syntax worked very well for HDTV-rate video, so
HDTV is now part of the MPEG-2 High-1440 Level and High Level
toolkit.
o MPEG-4
Very Low Bandwidth (64Kbits/sec)
176 x 144 x 10Hz
Optimised for videophones
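As a rough illustration of the MPEG-1 figures listed above, the following sketch adds up the bit budget and compares the coded video rate with the uncompressed rate. It assumes 8 bits per sample and chrominance matrices at half the luminance resolution in each direction; both are assumptions made for this example, and the variable names are hypothetical.

```python
# MPEG-1 bit budget, using the figures from the list above.
VIDEO_BITS_PER_SEC = 1_250_000   # 1.25 Mbits/sec video
AUDIO_BITS_PER_SEC = 250_000     # 250 Kbits/sec audio (two channels)
total_bits_per_sec = VIDEO_BITS_PER_SEC + AUDIO_BITS_PER_SEC

# Uncompressed rate for 352 x 240 x 30Hz video.
# Assumption: 8 bits per sample, Cb and Cr at half resolution in each direction.
width, height, fps = 352, 240, 30
luma_samples = width * height
chroma_samples = 2 * (width // 2) * (height // 2)
raw_bits_per_sec = (luma_samples + chroma_samples) * 8 * fps

compression_ratio = raw_bits_per_sec / VIDEO_BITS_PER_SEC

print("total stream rate:", total_bits_per_sec, "bits/sec")
print("raw video rate:   ", raw_bits_per_sec, "bits/sec")
print("compression ratio: about %.1f to 1" % compression_ratio)
```

Under these assumptions the video must be compressed by a factor of roughly 24 to fit the 1.25Mbits/sec budget.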
MPEG-2 Video
Video Stream Data Hierarchy
In the diagram above, we can see that the video bitstream consists of 5 layers:
GOP,
Pictures,
Slice,
Macroblock, and
Block.
Video Sequence
Begins with a sequence header, includes one or more groups of pictures, and ends
with an end-of-sequence code.
Group of Pictures (GOP)
A header and a series of one or more pictures intended to allow random access into
the sequence.
Picture
The primary coding unit of a video sequence. A picture consists of three rectangular
matrices representing luminance (Y) and two chrominance (Cb and Cr) values. The Y
matrix has an even number of rows and columns. The Cb and Cr matrices are one-half
the size of the Y matrix in each direction (horizontal and vertical).
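The relationship between the three matrices can be sketched as follows. The function name plane_sizes is hypothetical, and the 352 x 240 picture size is borrowed from the MPEG-1 example purely for illustration.

```python
def plane_sizes(width, height):
    """Return (rows, cols) of the Y, Cb and Cr matrices of one picture."""
    if width % 2 or height % 2:
        raise ValueError("the Y matrix must have an even number of rows and columns")
    luma = (height, width)
    chroma = (height // 2, width // 2)   # one-half the size in each direction
    return {"Y": luma, "Cb": chroma, "Cr": chroma}

sizes = plane_sizes(352, 240)
total_samples = sum(rows * cols for rows, cols in sizes.values())
print(sizes)
print("samples per picture:", total_samples)
```

Because each chrominance matrix holds a quarter as many samples as the Y matrix, the two together add only 50% to the luminance sample count.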
Slice
One or more contiguous macroblocks, ordered from left to right and top to bottom
within the slice.
Macroblock
The basic coding unit in the MPEG algorithm. It is a 16 * 16 pixel segment in a frame.
Since each chrominance component has one-half the vertical and horizontal resolution
of the luminance component, a macroblock consists of four Y, one Cr, and one Cb
block.
Block
Block is the smallest coding unit in the MPEG algorithm. It consists of 8x8 pixels and
can be one of three types: luminance(Y), red chrominance(Cr), or blue
chrominance(Cb). The block is the basic unit in intra frame coding.
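The macroblock and block layout described above can be sketched in a few lines. The helper names are hypothetical, and the sketch assumes that a picture whose dimensions are not multiples of 16 is padded up to whole macroblocks.

```python
def split_macroblock_luma(mb):
    """Split a 16 x 16 luminance area (a list of 16 rows) into four 8x8 Y blocks."""
    blocks = []
    for top in (0, 8):
        for left in (0, 8):
            blocks.append([row[left:left + 8] for row in mb[top:top + 8]])
    return blocks   # order: top-left, top-right, bottom-left, bottom-right

def macroblock_count(width, height):
    """Number of 16 x 16 macroblocks covering a picture (assumes edge padding)."""
    return ((width + 15) // 16) * ((height + 15) // 16)

BLOCKS_PER_MACROBLOCK = 4 + 1 + 1   # four Y, one Cr, one Cb

# Hypothetical test data: each sample holds its own raster index.
mb = [[r * 16 + c for c in range(16)] for r in range(16)]
y_blocks = split_macroblock_luma(mb)

mbs = macroblock_count(352, 240)
print(len(y_blocks), "Y blocks per macroblock")
print(mbs, "macroblocks,", mbs * BLOCKS_PER_MACROBLOCK, "blocks per 352 x 240 picture")
```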
Picture Types
The MPEG standard specifically defines three types of pictures:
Intra Pictures
Intra pictures, or I-pictures, are coded using only information present in the picture
itself, so they can be decoded without reference to any other picture.
Predicted Pictures
Information from the previous I-frame and/or previous P-frames is required for
encoding and decoding.
P-frame coding exploits the fact that areas of successive images often do not
change at all, or are merely shifted as a whole (temporal redundancy).
To exploit temporal redundancy, the encoder must find the area of the last P- or
I-frame that is most similar to the block under consideration.
This search, called motion estimation, is performed at the encoder.
Predicted pictures, or P-pictures, are coded with respect to the nearest previous I- or
P-pictures. This technique is called forward prediction and is illustrated in the above
figure. Like I-pictures, P-pictures also can serve as a prediction reference for B-
pictures and future P-pictures. Moreover, P-pictures use motion compensation to
provide more compression than is possible with I-pictures.
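The search for the most similar area in the reference picture can be sketched as an exhaustive block-matching search. This is only one possible approach: the notes do not say how an encoder should search, so the sum of absolute differences (SAD) criterion, the search range, and all names below are assumptions made for illustration.

```python
import random

def sad(ref, cur, ry, rx, cy, cx, n):
    """Sum of absolute differences between an n x n area of cur and of ref."""
    return sum(abs(cur[cy + j][cx + i] - ref[ry + j][rx + i])
               for j in range(n) for i in range(n))

def find_motion_vector(ref, cur, cy, cx, n=16, search=4):
    """Full search: try every offset within +/- search pixels, keep the lowest SAD."""
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            ry, rx = cy + dy, cx + dx
            if ry < 0 or rx < 0 or ry + n > height or rx + n > width:
                continue   # candidate area falls outside the reference picture
            cost = sad(ref, cur, ry, rx, cy, cx, n)
            if best is None or cost < best[0]:
                best = (cost, dy, dx)
    return best[1], best[2], best[0]

# Hypothetical test data: the current picture is the reference picture shifted
# so that cur[y][x] == ref[y + 2][x - 3].
rng = random.Random(0)
H = W = 32
ref = [[rng.randint(0, 255) for _ in range(W)] for _ in range(H)]
cur = [[ref[(y + 2) % H][(x - 3) % W] for x in range(W)] for y in range(H)]

dy, dx, cost = find_motion_vector(ref, cur, cy=8, cx=8)
print("motion vector (dy, dx):", (dy, dx), "SAD:", cost)
```

The decoder never repeats this search. It only applies the transmitted motion vector to the reference picture, which is why motion estimation is an encoder-side cost.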
Bidirectional Pictures
Bidirectional pictures, or B-pictures, are pictures that use both a past and a future
picture as a reference. This technique is called bidirectional prediction. B-pictures
provide the most compression because they can draw on both the past and the future
reference picture, but they also require the most computation time.
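One consequence of using a future picture as a reference is that this picture has to be coded and decoded before the B-pictures that depend on it. The following sketch reorders pictures from display order into such a coding order. The function name and the example picture pattern are hypothetical.

```python
def coding_order(display_types):
    """Return display-order indices rearranged so references precede B-pictures.

    display_types is a string such as "IBBPBBP" giving the picture types
    in display order.
    """
    order = []
    pending_b = []
    for index, kind in enumerate(display_types):
        if kind == "B":
            pending_b.append(index)      # must wait for its future reference
        else:                            # I- or P-picture: a reference picture
            order.append(index)
            order.extend(pending_b)      # B-pictures can now follow
            pending_b = []
    order.extend(pending_b)              # trailing B-pictures, if any
    return order

pattern = "IBBPBBP"
order = coding_order(pattern)
print("display order:", list(range(len(pattern))))
print("coding order: ", order)
print("coded types:  ", "".join(pattern[i] for i in order))
```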
Both image blocks and prediction-error blocks have high spatial redundancy. To
reduce this redundancy, the MPEG algorithm transforms 8x8 blocks of pixels or 8x8
blocks of error terms from the spatial domain to the frequency domain with the
Discrete Cosine Transform (DCT).
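A direct, unoptimised form of the 8x8 transform is sketched below. It uses the orthonormal DCT-II definition; real encoders use fast algorithms, and the function name dct_2d is hypothetical.

```python
import math

N = 8

def dct_2d(block):
    """Orthonormal 2-D DCT-II of an 8x8 block given as a list of 8 rows."""
    def scale(k):
        return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)

    out = [[0.0] * N for _ in range(N)]
    for u in range(N):          # vertical frequency
        for v in range(N):      # horizontal frequency
            total = 0.0
            for y in range(N):
                for x in range(N):
                    total += (block[y][x]
                              * math.cos((2 * y + 1) * u * math.pi / (2 * N))
                              * math.cos((2 * x + 1) * v * math.pi / (2 * N)))
            out[u][v] = scale(u) * scale(v) * total
    return out

# A perfectly flat block: all of its energy ends up in the single DC coefficient.
flat = [[100] * N for _ in range(N)]
coeffs = dct_2d(flat)
print("DC coefficient:", round(coeffs[0][0], 6))
print("largest AC magnitude:",
      max(abs(coeffs[u][v]) for u in range(N) for v in range(N) if (u, v) != (0, 0)))
```

The flat block shows why the transform reduces spatial redundancy: 64 identical samples become one significant coefficient, with the remaining 63 essentially zero.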
Some blocks of pixels need to be coded more accurately than others. For example,
blocks with smooth intensity gradients need accurate coding to avoid visible block
boundaries. To deal with this inequality between blocks, the MPEG algorithm allows
the amount of quantization to be modified for each macroblock of pixels. This
mechanism can also be used to provide smooth adaptation to a particular bit rate.
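The effect of changing the amount of quantization can be illustrated with a simplified uniform quantizer. This is not the exact arithmetic of the standard; the single scale parameter and the function names are assumptions made for this sketch.

```python
def quantize(coeffs, scale):
    """Divide each coefficient by the scale and round to the nearest integer."""
    return [[int(round(c / scale)) for c in row] for row in coeffs]

def dequantize(levels, scale):
    """Approximate reconstruction performed by the decoder."""
    return [[level * scale for level in row] for row in levels]

# Hypothetical coefficients from a small corner of a transformed block.
coeffs = [[800.0, 13.0],
          [-7.0, 3.0]]

fine = quantize(coeffs, 4)      # small scale: accurate, more bits
coarse = quantize(coeffs, 16)   # large scale: less accurate, fewer bits

print("fine:  ", fine, "->", dequantize(fine, 4))
print("coarse:", coarse, "->", dequantize(coarse, 16))
```

A larger scale turns more coefficients into zeros, which cost very few bits to code. An encoder can therefore raise the scale for a macroblock when it is over its bit budget and lower it where accuracy matters.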
Predicted Pictures
Bidirectional Pictures
Example:
In the above pictures, there is some information which is not in the reference frame.
Therefore B pictures are coded like P-pictures except that the motion vectors can refer
either to the previous reference picture, the next picture, or both. The following is the
mechanism of B-picture coding.
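The three choices open to a B-picture block can be sketched as follows. The motion-compensated blocks from the previous and next reference pictures are assumed to be already available, the rounded average used for the "both" case is an assumption, and selecting the mode by sum of absolute differences is one possible encoder strategy rather than something the notes prescribe.

```python
MODES = ("forward", "backward", "interpolated")

def predict_block(past_block, future_block, mode):
    """Build the prediction for one block of a B-picture."""
    if mode == "forward":        # previous reference picture only
        return [row[:] for row in past_block]
    if mode == "backward":       # next reference picture only
        return [row[:] for row in future_block]
    if mode == "interpolated":   # both: rounded average of the two
        return [[(p + f + 1) // 2 for p, f in zip(prow, frow)]
                for prow, frow in zip(past_block, future_block)]
    raise ValueError("unknown mode: " + mode)

def block_sad(a, b):
    """Sum of absolute differences between two equally sized blocks."""
    return sum(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))

def best_mode(current, past_block, future_block):
    """Pick the mode whose prediction leaves the smallest error."""
    return min(MODES, key=lambda m: block_sad(
        current, predict_block(past_block, future_block, m)))

# Hypothetical 2 x 2 blocks, kept tiny for readability.
past = [[10, 10], [10, 10]]
future = [[20, 20], [20, 20]]
fading = [[15, 15], [15, 15]]     # halfway between the two references

print(best_mode(fading, past, future))
print(best_mode(future, past, future))
```

The backward mode is what lets a B-picture code information, such as newly uncovered background, that is absent from the previous reference picture but present in the next one.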
Profiles and Levels
MPEG-2 is designed to support a wide range of applications and services of varying
bit rate, resolution, and quality. The MPEG-2 standard defines 4 profiles and 4 levels
for ensuring inter-operability of these applications.
A video decoder implements one or more of these profile and level combinations,
chosen according to the bitstreams it needs to handle.
MPEG-2 Levels
Level      Max. sampling dimensions (fps)  Pixels/sec  Max. bitrate  Significance
Low        352 x 240 x 30                  3.05 M      4 Mb/s        CIF, consumer tape equiv.
Main       720 x 480 x 30                  10.40 M     15 Mb/s       CCIR 601, studio TV
High 1440  1440 x 1152 x 30                47.00 M     60 Mb/s       4x 601, consumer HDTV
High       1920 x 1080 x 30                62.70 M     80 Mb/s       production SMPTE 240 std
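The limits in the levels table can be turned into a simple lookup that finds the lowest level able to carry a given video format. The sketch below checks only the maximum dimensions, frame rate and bit rate from the table. The function name and the example formats are hypothetical, and a real conformance check involves further constraints not listed in these notes.

```python
# (max width, max height, max frames/sec, max bits/sec), taken from the levels table.
LEVELS = {
    "Low":       (352, 240, 30, 4_000_000),
    "Main":      (720, 480, 30, 15_000_000),
    "High 1440": (1440, 1152, 30, 60_000_000),
    "High":      (1920, 1080, 30, 80_000_000),
}

def lowest_level(width, height, fps, bitrate):
    """Return the first level, from lowest to highest, whose limits are not exceeded."""
    for name, (max_w, max_h, max_fps, max_rate) in LEVELS.items():
        if (width <= max_w and height <= max_h
                and fps <= max_fps and bitrate <= max_rate):
            return name
    return None   # exceeds every level in the table

print(lowest_level(352, 240, 30, 1_500_000))
print(lowest_level(720, 480, 30, 6_000_000))
print(lowest_level(1920, 1080, 30, 20_000_000))
```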
MPEG-2 Profiles
Profile  Comments
Simple   Same as Main, only without B-pictures. Intended for software applications,
         perhaps CATV.
Main     Most decoder chips, CATV satellite. 95% of users.
Main+    Main with Spatial and SNR scalability.
Next     Main+ with 4:2:2 macroblocks.
Level \ Profile  Simple   Main          Main+                      Next
High             illegal  /             illegal                    4:2:2 chroma
High-1440        illegal  /             With spatial Scalability   4:2:2 chroma
Main             /        90% of users  Main with SNR scalability  4:2:2 chroma
Low              illegal  /             Main with SNR scalability  illegal
An interlaced video sequence uses one of two picture structures: frame structure and
field structure. In the frame structure, lines of two fields alternate and the two fields
are coded together as a frame. One picture header is used for two fields. In the field
structure, the two fields of a frame may be coded independently of each other, and the
odd field is followed by the even field. Each of the two fields has its own picture
header.
An interlaced video sequence can switch between frame structure and field structure
on a picture-by-picture basis. On the other hand, each picture in a progressive video
sequence is a frame picture.
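The relationship between a frame and its two fields can be sketched as follows. A picture is represented as a list of rows; the function names are hypothetical, and calling the field that holds the first row the "first field" is a naming choice made for this example.

```python
def split_fields(frame):
    """Separate an interlaced frame into its two fields (alternate lines)."""
    first_field = frame[0::2]    # lines 1, 3, 5, ... counting from one
    second_field = frame[1::2]   # lines 2, 4, 6, ...
    return first_field, second_field

def merge_fields(first_field, second_field):
    """Interleave two fields back into a frame, as in the frame structure."""
    frame = []
    for first_row, second_row in zip(first_field, second_field):
        frame.append(first_row)
        frame.append(second_row)
    return frame

# Hypothetical 6-line frame in which every sample holds its line index.
frame = [[line] * 4 for line in range(6)]
first, second = split_fields(frame)

print("first field lines: ", [row[0] for row in first])
print("second field lines:", [row[0] for row in second])
print("round trip ok:", merge_fields(first, second) == frame)
```

In the field structure each of the two lists would be coded as its own picture with its own header, while in the frame structure the interleaved rows are coded together.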
MPEG-2 Audio
MPEG-2 provides low bit rate coding for multichannel audio. There are five full
bandwidth channels (left, right, center, and two surround channels), plus an additional
low frequency enhancement channel, and/or up to seven commentary/multilingual
channels. The MPEG-2 Audio Standard also extends the stereo and mono coding
of the MPEG-1 Audio Standard (ISO/IEC IS 11172-3) to half sampling rates (16 kHz,
22.05 kHz and 24 kHz), improving quality at bit rates at or below 64 kbits/s per
channel.