DivX User Guide 5.2.1
Contents
Introduction
  Welcome to the guide
  The origins of DivX® video
  What is DivX?
  Why use DivX?
Quick Start Guide
  Introducing VirtualDub
  Your first DivX
  Your first Multipass
  Notes on audio
Forward
  General concepts
    Frames
    Macroblocks and motion
    Intra-frames and Predicted-frames
    Quantizers
  A key to the guide
Bitrate Mode
  What is the bitrate?
    Bitrate calculation
  The bitrate calculator
  1-Pass
  Multipass
    Multipass, 1st pass
    Multipass log file
    Multipass, nth pass
    Bitrate modulation
  1-Pass, Quality-based
Performance/Quality
  Fastest
  Fast
  Standard
  The Rate-Distortion Algorithm
  Slow
Psychovisual Enhancement
  What is psychovisual enhancement?
    The DCT, iDCT and Human Visual System
    The psychovisual enhancement system
  Experiencing PVE
  Fast psychovisual enhancement
  Slow psychovisual enhancement
Source Preprocessing
  What is pre-processing?
  When to use pre-processing
  Spatial processing
  Temporal filtering
  Source pre-processing
Crop and Resize
  Cropping and resizing via the encoder
  Why crop or resize?
  Crop
  Resize
MPEG-4 Tools
  Bi-directional encoding
    What is bi-directional encoding?
    Adaptive bi-directional encoding
    Multiple consecutive adaptive B-frames
    The ordering of frames
    Bi-directional encoding
  Quarter Pixel
    What is quarter-pixel?
    Why use quarter-pixel?
    Quarter-pixel
  Global motion compensation
    What is global motion compensation?
    Global motion compensation
Advanced
  Scene-change threshold
  Maximum key-frame interval
  Quantization type
Interlacing
  What is interlacing?
  Why de-interlace?
  Maintaining interlacing
  Source interlace
Profiles
  What is a profile?
  Why use profiles?
  Profiles
Electrokompressiongraph™
  What is the EKG?
  How the EKG works
  Using the EKG
DivX® Decoder
  About the decoder
  Post-processing
    De-blocking
    De-ringing
    Automatic post-processing
    Post-processing levels
  Film effect
  Quality settings
    Smooth playback
    YUV Extended
    Overlay Extended
    Double buffering
    Disable logo
    Support generic MPEG4
Acknowledgements
  Thanks and credits
  Legal
Introduction
Welcome to the guide
With this guide we will demystify the black art of DivX® encoding, leading you
step by step through everything from creating your very first DivX video to
exploiting the many powerful advanced features that DivX has to offer.
This guide covers all of the features available in both the DivX and DivX Pro
encoders in detail. You will learn how to optimize video quality via the encoder
and where to use the EKG for even further enhancement.
We will discuss the DivX Certified™ Program and how you can create videos
compatible with the growing range of DivX Certified DVD players and other
portable devices that are set to take the consumer market by storm in 2004. We
will also show you how to configure the DivX decoder to improve the playback of
DivX videos on your personal computer.
The origins of DivX video
The “MS MPEG4” codec from Microsoft, a video compression technology then
new to market, was limited to storing video data inside Microsoft's proprietary ASF
(advanced systems format) container file. Gej wanted to use a high-quality video
codec in combination with standard AVI (audio video interleave) containers,
which would enable nearly all video editing applications to benefit from the new
compression.
By September 1999 Gej had achieved his goal of using a high performance
codec in combination with the AVI container, and DivX ;-) (the emoticon a jibe at
Circuit City’s now-defunct protected DVD rental system) was unleashed upon
the Internet. Boasting compression ratios over five times superior to the
technologies used in conventional DVDs, DivX ;-) would allow entire movies to be
transferred to single compact discs or distributed over high speed connections in
a matter of hours rather than days.
DivX ;-) gained an overnight cult following amongst the Internet underground.
Within days of its release, it became possible to download entire feature-length,
DVD-quality videos for the first time, and DivX ;-) was well on its way to
becoming the MP3 of video.
But DivX ;-) was just the beginning. Intrigued by the possibilities of video
compression and not ready to retire and rest on his laurels, Gej quickly dropped
the winky face and began the long process of creating a new codec entirely from
scratch.
In May 2000 Gej co-founded DivXNetworks with CEO Jordan Greenhall, Director
of Product Management Joe Bezdek, R&D director Darrius Thompson and Vice
President of Operations Tay Nguyen.
By August 2001, DivX 4 was born. A proprietary codec based on the international
MPEG-4 standard, DivX 4 was the first in a long line of DivXNetworks releases,
culminating in the commercial release of DivX 5.0 in March 2002.
Since that time DivX has gone from strength to strength: a patent-pending
technology that is constantly evolving in terms of performance, quality, and
features. DivX 5.2 is the culmination of this work.
What is DivX?
Firstly, let’s address what DivX is not. DivX is not a stand-alone encoding
application; that is, when you install DivX you are not getting a program that will
convert your files to DivX format automatically. For such an application, look to
Dr. DivX.
In short, this means DivX enables the export of high quality video from virtually
any software of your choosing. With the right software it becomes possible to
take any video source, be it live capture, DVD, DV, MPG, MP2, AVI or other,
and export it to DivX video.
Because it is based upon the MPEG4 Video Standard, DivX naturally supersedes
both MPEG1 (as used for Video CD) and MPEG2 (the standard for DVD video
and Super Video CD) for low-bitrate encoding. To put this into perspective, DivX
5 achieves DVD-quality video at one tenth the size of a DVD, and because DivX
5 is fully standards compliant you can watch your DivX videos on a growing
range of DivX Certified hardware players*.
* Different MPEG4 hardware players tend to support the MPEG4 Video Standard to varying degrees. For
guaranteed playback capability and performance only DivX Certified players are recommended.
Why use DivX?
There are many codecs available these days, so why choose DivX to export your
video?
►Quality
DivX is a professional grade video codec. The Pro version available to
consumers for less than $20 is exactly the same program licensed for
use in commercial production environments. While other MPEG-4
encoders exist, none include the extensive performance optimizations
and patent-pending algorithms that enable DivX to deliver the highest
possible video quality while maintaining full standards compliance.
►Certified video
DivX produces video guaranteed to play back flawlessly on a growing
range of DivX Certified™ devices. The DivX encoder includes profile
presets that let you encode for players with differing capabilities; from
the Handheld profile for palm-top computers to High-Definition profile
for players that can decode for HDTV. Simply match the logo shown in
the DivX encoder to the logo on your player. The DivX Certified
Program already includes over 30 companies, and DivXNetworks
rigorously tests every single certified device to guarantee its
performance.
►Home movies
With DivX you can export your home movies in unprecedented quality
then burn them to CD and share them with friends at next to no cost.
Not only that, but because DivX movies take up so little file space you
no longer have to worry about cutting down the running length and
missing out on all those golden moments. Use DivX now to create clips
you can be proud to show your relatives for years to come.
►Backup DVDs*
DVDs are expensive, and one little scratch is sometimes all it takes to
turn your favorite disc into a worthless coaster. By converting your
DVDs into DivX format you can make inexpensive CD backups that will
save your original discs from wear and tear. Let DivX help you to enjoy
your collection while protecting it from damage.
* It is legal in many countries to make a single backup for personal use only. Local laws may apply.
DivX is not intended as a tool for unlicensed copying.
►Internet distribution
Video transferred over the Internet used to be small, jumpy, and blocky
with poor color definition. DivX removes the barriers, letting you quickly
and easily transfer high-quality video via the Internet. Using a regular
broadband connection it is possible to transfer a feature-length movie in
hours, and shorter clips in minutes.
Quick Start Guide
Introducing VirtualDub
VirtualDub is a free dubbing application that lets you cut, splice, dub, convert
between video formats, and apply filtering effects. Although it lacks some
functionality typically associated with commercial video editing applications (easy
DVD conversion, programmable digital VCR, video transition effects, etc.), it is an
essential tool in anyone's arsenal: powerful, yet remarkably easy to use.
This Quick Start Guide will teach the basics of VirtualDub and walk you through
creating your very first DivX videos. For now, resist the urge to explore and
simply follow the instructions.
VirtualDub can be downloaded from:
http://virtualdub.sourceforge.net
Extract the .zip file to a new folder on your hard drive. No formal installation
procedure is required. This guide uses VirtualDub version 1.5.4, but the interface
of later versions should remain similar.
You will also need a pre-existing sample .avi or .mpg file to work with in this
guide. We recommend a short file taken from a reliable source (such as CD-
ROM) that is unlikely to be corrupt or otherwise damaged.
Your first DivX
3. From the Audio menu set No Audio. This instructs
VirtualDub that even if the source file contains audio
data we do not want to process it or copy it into our
output file.
For the present time leave the video mode set to Full Processing Mode.
6. Congratulations!
8. The VirtualDub Status window should
pop up and allow you to monitor the
encoding progress.
The Feedback Window, first introduced in DivX 5.1, lets you monitor and
manipulate the inner workings of the DivX encoder in real time.
You might like to explore some of the display options shown here while
your video encodes, but for now avoid altering the Bitrate, pvLumaFlat, and
pvLumaTexture options in the bar at the very bottom of the window.
With regard to performance, the feedback window may reduce the rate at
which your video is encoded. To reduce overhead, increase the update
interval so that the feedback display is not updated every single frame.
The feedback and status windows will close automatically after encoding.
You should then be able to play your first DivX movie using the DivX Player
(bundled free with DivX 5.2), or any other media player.
If you would like the DivX Player to open your file by default as opposed to
any other player, you can rename the AVI file after encoding so that the file
extension reads “.divx” instead of “.avi”. The DivX Player automatically
registers the “.divx” file extension when it is installed.
Your first Multipass
Because we selected Update log file, the encoder will record an analysis of
this nth pass with respect to the decisions it makes based on the previous
analysis from the log file. This lets you feed the analysis log from one nth
pass into successive nth passes and refine the encoding process toward
the optimal quality level.
5. Without changing any settings try saving the AVI file one more time, again
choosing to overwrite the previous file. The encoder will use the log file we
updated in the last nth pass to create an even higher quality video.
Although you might be tempted to continue running nth pass after nth pass,
there is of course a limit on the quality attainable at any given bitrate.
Typically 98-99% of the optimal quality will be realized after 3 passes (1st,
nth, nth) or fewer.
Notes on audio
1. There are many audio formats and codecs that can be used in combination
with an AVI container, each with different properties and features. DivX AVI
files generally use Constant Bitrate MPEG1-Layer 3 Audio, or CBR MP3 for
short. CBR MP3 is the recommended audio format for DivX video and has
a number of benefits:
The MP3 codec distributed with Windows is the Fraunhofer IIS MPEG
Layer-3 Codec (advanced). This codec is capable of decoding high-bitrate
MP3 format audio but is limited to low-bitrate encoding.
VirtualDub works only with audio codecs designed for Microsoft’s Audio
Compression Manager, or ACM, and thus you will need to install an ACM
MP3 encoder.
2. Audio contained in an AVI file must be correctly interleaved for stable
playback performance, particularly when content is to be stored on optical
media (such as CD-R or DVD-R), or other media with high seek times.
Consider this AVI container with one video and one audio stream:
AVI Container
Audio Video
Now consider the corresponding correctly interleaved file:
AVI Container
Audio Video Audio Video Audio Video Audio Video Audio Video
Here audio chunks are interleaved within the file so they fall closely
alongside the related sequence of video. As playback progresses, seeking
within the file is minimized, and in fact, with appropriate buffering as
provided by most devices, seeking will not even take place. This leads to
smoother playback performance. Because audio chunks are stored
alongside the corresponding video chunks, if a viewer skips to a particular
point in the video the media player will likely locate the associated audio
and resume playback more promptly.
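The interleaved layout can be sketched in a few lines of Python. This is only an illustration of the idea, not how VirtualDub or the AVI format is actually implemented: the function name, the chunk labels, and the assumption that each audio chunk covers a fixed number of video frames are all made up for the example.

```python
def interleave(video_frames, audio_chunks, frames_per_audio_chunk):
    """Return one list in which each audio chunk sits next to its video frames.

    Hypothetical helper for illustration only; a real AVI writer also
    produces headers and an index, which are ignored here.
    """
    interleaved = []
    for i, audio in enumerate(audio_chunks):
        start = i * frames_per_audio_chunk
        interleaved.append(audio)
        interleaved.extend(video_frames[start:start + frames_per_audio_chunk])
    return interleaved


video = ["V0", "V1", "V2", "V3", "V4", "V5"]
audio = ["A0", "A1", "A2"]  # assumed: each audio chunk covers two frames
layout = interleave(video, audio, frames_per_audio_chunk=2)
print(layout)
```

Reading this list from start to finish never requires jumping backwards or far ahead to find the audio that belongs to a frame, which is why interleaving reduces seeking.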
Note that the interleave options also present a method of adjusting audio/
video sync by adjusting the “Audio skew correction” value, either positively
or negatively in milliseconds.
For example: if the audio was running two seconds behind the video, a
value of -2000 ms would correct for the delay. Conversely, if the audio was
running one and a half seconds ahead of the video, a value of 1500 ms
would be required.
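The sign convention in this example can be written down as a one-line conversion. The function name is hypothetical; it simply restates the two cases given above, with a positive input meaning the audio is ahead of the video and a negative input meaning it is behind.

```python
def audio_skew_correction_ms(audio_lead_seconds):
    """Convert an audio lead (positive) or lag (negative) in seconds
    into the skew correction value in milliseconds."""
    return round(audio_lead_seconds * 1000)


print(audio_skew_correction_ms(-2.0))  # audio two seconds behind the video
print(audio_skew_correction_ms(1.5))   # audio one and a half seconds ahead
```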
3. VirtualDub contains two audio processing modes,
“Direct stream copy” and “Full processing mode”.
Both Direct stream copy and Full processing mode have exactly the same
meaning when applied to the video stream as opposed to the audio
stream.
Tip: Using the Audio skew correction (see note 2) in combination with
Direct Stream Copy mode for both audio and video streams allows you to
correct the audio/video sync of any pre-existing AVI file without requiring
recompression of either stream.
4. You can save a lot of time when performing Multipass encoding if you
enable audio processing only during the last pass you intend to save. It is
wasteful to process audio on every pass because the AVI file, including any
audio track, is overwritten each time.
Forward
General concepts
To understand the encoder features covered in the next section, we must first
Frames
At its most basic a video is a series of pictures shown one after the other in
quick succession. When the pictures are played fast enough the image
appears to move. Each picture in a video is called a frame and the
speed at which they are shown is the frame rate, given in frames per
second, or fps.
[Figure: Frame 1, Frame 2, Frame 3]
Macroblocks and motion
When the encoder finds a matching area for a block
in the previous frame its position is recorded by use
of a vector. A vector is simply a numerical
representation of direction and magnitude. For
example the vector (4, -7) might mean right 4 units,
down 7 units.
[Figure: the vector (4,-7)]
Because these vectors happen to represent the
movement of a macroblock they are called motion vectors. In the
illustrations the red arrows represent the motion vectors.
Where blocks are gray in the compensated frame, the motion search
failed to find a suitable match in the reference frame. These blocks will
be encoded as image data instead of motion vectors.
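A motion search of this kind can be sketched as an exhaustive block match. The code below is a simplified illustration and not the DivX encoder's algorithm: it assumes grayscale frames stored as lists of rows, uses the sum of absolute differences (SAD) as the matching cost, and uses a tiny 2x2 block so the example stays readable (MPEG-4 macroblocks are 16x16 pixels). The vector it returns uses image coordinates, where y grows downward, which differs from the sign convention in the (4, -7) example above.

```python
def sad(ref, cur, rx, ry, cx, cy, size):
    """Sum of absolute differences between a block in cur and one in ref."""
    total = 0
    for y in range(size):
        for x in range(size):
            total += abs(cur[cy + y][cx + x] - ref[ry + y][rx + x])
    return total


def motion_search(ref, cur, cx, cy, size, search_range):
    """Search ref exhaustively for the best match to the block at (cx, cy) in cur.

    Returns ((dx, dy), cost), where (dx, dy) is the offset from the block's
    position in the current frame to the matching area in the reference frame.
    """
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            rx, ry = cx + dx, cy + dy
            if rx < 0 or ry < 0 or rx + size > width or ry + size > height:
                continue  # candidate falls outside the reference frame
            cost = sad(ref, cur, rx, ry, cx, cy, size)
            if best is None or cost < best[1]:
                best = ((dx, dy), cost)
    return best


# Build two 8x8 frames: a 2x2 pattern sits at (1, 1) in the reference frame
# and has moved to (3, 2) in the current frame.
ref = [[0] * 8 for _ in range(8)]
cur = [[0] * 8 for _ in range(8)]
pattern = [[10, 20], [30, 40]]
for y in range(2):
    for x in range(2):
        ref[1 + y][1 + x] = pattern[y][x]
        cur[2 + y][3 + x] = pattern[y][x]

vector, cost = motion_search(ref, cur, cx=3, cy=2, size=2, search_range=3)
print(vector, cost)
```

A cost of zero means a perfect match was found. In an encoder, a block whose best cost is still too high would be treated as a failed search and stored as image data; the threshold for that decision is not given in this guide.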
Intra-frames and Predicted-frames
The DivX encoder uses three frame types: Intra-frames, Predicted-frames and
Bi-directional-frames, commonly known as I, P and B-frames. B-frames
are discussed later in the guide as an advanced topic.
I-frames serve a very important purpose: all the blocks in an I-frame are
stored as images, so decoding an I-frame reveals a complete picture
without dependency on reference frames. For this reason I-frames are also
known as key-frames, and they are the only type of frame completely
independent of all others.
Quantizers
DivX uses a technique called quantization to control the accuracy of
the image data it stores. Quantizers are similar to the denominator in
an expression such as:
Result = Data / Quantizer
Quantizer = 3
Data series            7      14     21     28     35     42     49
Quantization data      2.33   4.66   7      9.33   11.66  14     16.33
Round-down result      2      4      7      9      11     14     16
Inverse quantization   6      12     21     27     33     42     48
Error                  14%    14%    0%     4%     6%     0%     2%
Quantizer = 5
Data series            7      14     21     28     35     42     49
Quantization data      1.4    2.8    4.2    5.6    7      8.4    9.8
Round-down result      1      2      4      5      7      8      9
Inverse quantization   5      10     20     25     35     40     45
Error                  29%    29%    5%     11%    0%     5%     8%
It soon becomes clear that higher quantizers introduce larger errors.
Although lower quantizers give higher quality images they also produce
larger file sizes. This is because the range of results encompassed by
lower quantizers is larger than that of higher quantizers.
Under most modes the DivX encoder will fully manage control of the
quantizer for you, although it is possible to specify a quantizer for
encoding if you want to.
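The two quantizer tables can be reproduced with a short Python sketch. It follows the rows of the tables directly (divide, round down, multiply back, compare with the original); the function name is hypothetical, and real DivX quantization operates on transformed image data rather than on a plain list of numbers.

```python
def quantize_table(data, quantizer):
    """Return (value, round-down result, inverse quantization, error %) rows."""
    rows = []
    for value in data:
        quantized = value // quantizer           # round-down result
        restored = quantized * quantizer         # inverse quantization
        error_pct = round(100 * (value - restored) / value)
        rows.append((value, quantized, restored, error_pct))
    return rows


data = [7, 14, 21, 28, 35, 42, 49]
for q in (3, 5):
    print(f"Quantizer = {q}")
    for value, quantized, restored, error_pct in quantize_table(data, q):
        print(f"  {value:2d} -> {quantized:2d} -> {restored:2d}  error {error_pct}%")
```

The round-down results for the higher quantizer span a smaller range of values (1 to 9 instead of 2 to 16), which is why they take fewer bits to store, at the cost of larger errors after inverse quantization.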
A key to the guide
From this point forward the guide will describe the features available in the DivX
encoder. Throughout the rest of the guide an easy to follow iconic key will be
used.
This indicates a feature is not compatible with the DivX Certified Program.
Use of the feature will mean your video will not be compatible with DivX
Certified devices.
This indicates that use of the feature may have implications for DivX Certified
devices.
This indicates there are performance tips associated with the feature.
This indicates that use of the feature may have implications upon other
options.
This indicates that there are usage notes associated with the feature.
This indicates there are CLI parameters associated with the feature.
Additionally, at the beginning of each feature description you will see the Quick
Guide bar.
Here the Quick Guide bar indicates that there are performance tips and usage
notes associated with the feature. All other icons are grayed out.
Bitrate mode
What is the bitrate?
In a computer the minimum unit of storage is one bit. A bit can represent two
values, 0 or 1, and hence computers use the binary number system (base 2) as
opposed to the denary system (base 10, 0-9) that humans use. When several
bits are combined they can be used to store more complex data, similar to
adding more digits in denary in order to store larger numbers.
For practical purposes the smallest unit of storage software normally works with
is actually one byte. A byte is composed of eight bits.
To avoid confusion when working with quantities of bits and bytes the following
tables should be adhered to:
Bits
One bit = smallest unit
One kilobit = 1,000 bits
One megabit = 1,000 kilobits or 1,000,000 bits
Bytes
One byte = 8 bits
One kilobyte = 1,024 bytes
One megabyte = 1,024 kilobytes or 1,048,576 bytes or 8,388,608 bits
The bitrate describes how many thousands of bits per second on average the
encoder should aim to spend when encoding the video. Given a video of any
fixed duration encoding at a higher bitrate will lead to larger file sizes (and better
quality video), while conversely encoding at a lower bitrate will lead to smaller file
sizes (but lower quality video).
A key aim when encoding is to achieve a desired file size; after all the prime
reason for video compression is to reduce storage requirements. Given the
length of a video and a target file size it is possible to determine a suitable bitrate
in just five simple calculations.
The following example demonstrates step by step the manual bitrate calculation
for encoding 60 minutes of video with 128 kbps MP3 audio to fill a 700MB CD-R.
The algorithm is applicable to any video or target file size.
Bitrate calculation
60 minute video with 128 kbps MP3 audio for 700 MB CD-R
Note: Considering the AVI container itself requires some bits we might use
a slightly lower bitrate to ensure hitting our target, e.g. 1,501 kbps
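The worked steps of the calculation are not reproduced in this text, so the following Python sketch is a reconstruction rather than the guide's own procedure. It assumes the unit definitions from the tables above (1 megabyte = 1,048,576 bytes, 1 kilobit = 1,000 bits) and subtracts the audio bitrate from the total; the function name is hypothetical.

```python
def video_bitrate_kbps(duration_minutes, target_megabytes, audio_kbps):
    """Estimate the video bitrate that fills the target size exactly."""
    duration_seconds = duration_minutes * 60
    target_bits = target_megabytes * 1_048_576 * 8   # megabytes -> bits
    target_kilobits = target_bits / 1_000             # bits -> kilobits
    total_kbps = target_kilobits / duration_seconds   # audio and video together
    return total_kbps - audio_kbps                    # video only


kbps = video_bitrate_kbps(duration_minutes=60, target_megabytes=700, audio_kbps=128)
print(f"{kbps:.0f} kbps")
```

Under these assumptions the result is roughly 1,503 kbps, which is consistent with the note's suggestion to use a slightly lower value such as 1,501 kbps to leave room for the AVI container.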
The bitrate calculated is the average bitrate for the video.
DivX may choose to vary the actual bitrate throughout the
video as it encodes, a technique known as variable
bitrate encoding.
Since not all sequences in a video are equally complex in
terms of image and motion, it is impossible to maintain a
[Figure legend: Average bitrate, Actual bitrate]
1-Pass
In 1-Pass mode the encoder will output a
working DivX video stream as it receives the
source video.
When you set the bitrate mode to 1-Pass you will not be able to use the
Bitrate modulation control or the EKG application.
1-Pass mode is particularly useful when you are capturing from a live source
and you desire control over the bitrate/file size.
When working from static sources (for example re-processing existing stored
video) you will achieve more consistent quality by using Multipass mode.
Quantizer is a floating point value between 1.0 and 31.0 specifying the fixed
quantizer to be used.
Multipass
In Multipass mode you are required to run the
video through the encoder two or more times.
On every successive pass the video must be
identical to that which you used in your first pass.
Let's take a brief look at the log contents and what they mean:
##map version 8
nframes 5837
timescale 30000
passes 1
seq deltaT type total_bits motion_complexity texture_complexity modulation
0 0 I 7712 0.000000 0.251141 1.000000
2 2400 P 2952 0.035268 0.011100 1.000000
1 -1200 B 1144 0.015206 0.000000 1.000000
4 3600 P 6984 0.067981 0.069732 1.000000
3 -1200 B 2216 0.029374 0.000000 1.000000
6 3600 P 8160 0.079010 0.082583 1.000000
5 -1200 B 2264 0.029773 0.000745 1.000000
8 3600 P 14176 0.083506 0.294013 1.000000
Multipass log file
Log header
##map version <Value>
Value is the version number describing the format of the log file.
nframes <Value>
Value is the total number of frames listed in the log file.
timescale <Value>
Value is some arbitrary number representing the number of intervals
in one second of video.
passes <Value>
Value is the number of passes performed when the log was written.
Frame list
seq <Value>
Value is the sequence number of each frame.
The sequence numbers may appear to be out of order, but this is
because frames are listed in decoding order while the sequence
numbers give the display order. For example, in the section of
log shown the sequence reads 0,2,1 for a set of I,P,B frames
respectively. The order in which these frames must actually be
displayed is I,B,P, or 0,1,2 respectively.
deltaT <Value>
Value is the change in time (based upon the timescale value in the
header) between frames.
In the section of log shown the first three deltaT values are 0, 2400,
-1200. Recalling that frames are listed in decoding order, this
means the display times of the first three frames are 0, 1200, 2400.
With respect to the timescale value (30000 units) the interval between
frames (1200 units) is 1200/30000 = 0.04 seconds, and from this we can
derive that the frame rate is 1/0.04 = 25 frames per second.
type <Frame_type>
Frame_type is the type of each frame and can be one of ‘I’, ‘P’ or ‘B’,
representing intra-frame, predicted frame or bi-directional frame
respectively.
total_bits <Value>
Value is the total number of bits used by the encoder to encode the
frame.
motion_complexity <Value>
Value is a floating point number proportional to the number of bits
required to encode all motion data in the frame.
texture_complexity <Value>
Value is a floating point number proportional to the number of bits
required to encode the texture in the frame.
modulation <Value>
Value is a floating point number representing the modulation value
set by the EKG application between passes.
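The field descriptions above are enough to read a log programmatically. The following Python sketch parses the first few lines of the sample log and derives the frame rate. It is not a DivX tool: the function names are hypothetical, and treating deltaT as a running offset from the previously listed frame is an interpretation of the description above, chosen because it reproduces the 0, 1200, 2400 display times.

```python
SAMPLE_LOG = """\
##map version 8
nframes 5837
timescale 30000
passes 1
seq deltaT type total_bits motion_complexity texture_complexity modulation
0 0 I 7712 0.000000 0.251141 1.000000
2 2400 P 2952 0.035268 0.011100 1.000000
1 -1200 B 1144 0.015206 0.000000 1.000000
4 3600 P 6984 0.067981 0.069732 1.000000
3 -1200 B 2216 0.029374 0.000000 1.000000
"""


def parse_log(text):
    """Split a log into a header dict and a list of per-frame dicts."""
    header, frames, columns = {}, [], None
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if columns is not None:          # frame rows follow the column line
            row = dict(zip(columns, fields))
            for key in columns:
                if key != "type":
                    row[key] = float(row[key]) if "." in row[key] else int(row[key])
            frames.append(row)
        elif fields[0] == "seq":         # the line naming the columns
            columns = fields
        elif fields[0] == "##map":       # "##map version <Value>"
            header["version"] = int(fields[2])
        else:                            # nframes, timescale, passes
            header[fields[0]] = int(fields[1])
    return header, frames


def display_times(frames):
    """Accumulate deltaT in listed (decoding) order; map seq to display time."""
    times, clock = {}, 0
    for frame in frames:
        clock += frame["deltaT"]
        times[frame["seq"]] = clock
    return times


header, frames = parse_log(SAMPLE_LOG)
times = display_times(frames)
interval = times[1] - times[0]           # time between frames 0 and 1
fps = header["timescale"] / interval
print(sorted(times.items()))
print(fps)
```

Sorting the accumulated times by sequence number gives evenly spaced display times of 1200 units, which with a timescale of 30000 corresponds to 25 frames per second, as derived above.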
Because DivX Pro includes the EKG tool for graphically displaying and
manipulating the contents of the log file it is rarely necessary to edit it
manually.
The log file will be updated during an nth pass if you have
selected the option to Update log file in the Multipass
encoding files area of the encoder configuration dialogue.
Note that if you do not select to update the log file during nth pass then
running successive passes will not improve the consistency of video
quality because the rate control strategy will not be refined.
Bitrate modulation
By using the Bitrate modulation slider it is
possible to bias the distribution of bandwidth
towards frames with either low or high motion,
allowing you to adapt the rate control to better suit the particular type of
video being encoded.
Moving the slider left of center, or lowering the modulation value, biases
bandwidth in favor of high-motion frames and consequently away from
low-motion frames. Moving the slider right of center, or raising the
modulation value, biases bandwidth in favor of low-motion frames and
consequently away from high-motion frames.
[Figure legend: Motion complexity; Unmodulated bitrate; Bitrate modulated in
favor of high-motion; Bitrate modulated in favor of low-motion]
Fastest Performance/Quality mode is unavailable during Multipass encoding
because it does not perform the motion estimation necessary for the rate
control to work effectively.
Disable audio processing on all but the last nth pass to reduce encoding time.
Bitrate mode is one of:
Quantizer is a floating point value between 1.0 and 31.0 specifying the fixed
quantizer to be used.
Log file:
-log <filename>
Filename is the log file name, fully qualified with a pre-existing path and
enclosed in parentheses. The default is “C:\DivX.log”.
1-Pass, Quality-based
In 1-Pass quality-based mode the encoder will
output a working DivX video stream as it
receives the source video, as in 1-Pass
mode, except that you specify the quality rather than the
target bitrate.
By fixing the quantizer you guarantee a consistent quality throughout your entire
video file by ensuring the encoder stores the image data for each frame with a
consistent accuracy. However, when you fix the quantizer it is not possible to set
or predict the bitrate and hence file size because the encoder will always spend
as many bits as are required to encode each frame at the given quantizer.
Note that even when the quantizer is 1 and the display reads 100% quality the
encoded video will not be identical to the source. DivX is a lossy codec, and
there will always be some degradation from the source video regardless of the
combination of settings that you choose.
After running a 1-Pass quality-based encoding your DivX file can be viewed
immediately.
1-Pass, Quality-based mode is not explicitly unsupported by DivX Certified
devices; however, the Video Buffer Verifier is disabled in this mode and there
is no guarantee that the video stream produced will stay within the
capabilities of the certified device.
When 1-Pass, Quality-based mode is enabled any Video Buffer Verifier CLI
parameters are disregarded.
Use 1-Pass, Quality-based mode when capturing from a live source and
attempting to maintain a consistent quality throughout the video without
consideration of the resulting file size.
1-Pass, Quality-based mode is particularly useful for capture when you intend
to later re-compress a video using Multipass mode. In fact, 1-Pass, Quality-
based mode can be used in place of Multipass, 1st pass mode so long as
Write log file is enabled. In this way it is possible to capture in 1-Pass,
Quality-based mode and proceed directly to Multipass, nth pass. When
making use of this technique take care to avoid accidentally overwriting
your 1st pass file containing the captured video with the Multipass, nth
pass output.
Under the same principle, 1-Pass, Quality-based mode allows you to perform
XVID-like Multipass encoding. XVID performs its first pass with Q=2.
Bitrate mode is one of:
Quantizer is a floating-point value between 1.0 and 31.0 specifying the fixed
quantizer to be used.
Log file:
-log <filename>
Filename is the log file name, fully qualified with a pre-existing path and
enclosed in parentheses. The default is “C:\DivX.log”.
Performance/Quality
Fastest
QUICK GUIDE
The Fastest performance/quality mode causes
the encoder to perform no motion search when
encoding. Because no motion estimation is done,
all blocks will either be intra-blocks or
predicted with a null motion vector. In this
respect the encoded video will resemble an
MJPEG sequence.
If you choose to use Fastest mode in combination with 1-Pass mode, the
encoder may fail to meet the average bitrate unless a very high average
bitrate is specified.
Fastest mode is unsuitable for use during multipass encoding because the
rate control algorithm used by the multipass system relies on accurate motion
complexity statistics to correctly distribute bandwidth throughout the video.
Therefore, multipass does not support Fastest mode.
Fastest mode is ideal for capture situations where encoding rate is critical
and bitrate is not a major concern - for example when you intend to later re-
compress the video using another mode.
Performance/Quality:
-pq <mode>
Fast
QUICK GUIDE
The Fast performance/quality mode causes the
encoder to perform a basic, performance-
optimized motion search when encoding.
Fast mode is ideal for capture situations where encoding rate is critical but
bitrate is also a concern. Capturing in Fast mode with a high bitrate will likely
produce higher quality output than capturing in Fastest mode, but raises the
performance requirements placed upon the CPU.
Performance/Quality:
-pq <mode>
Standard
QUICK GUIDE
The Standard performance/quality mode
enables the motion search algorithm and is
functionally equivalent to Slowest mode in
previous versions of DivX 5, but contains
algorithm enhancements that improve video
quality.
Standard mode respects the Maximum Keyframe Interval and Scene Change
Threshold settings (described later). Using these two settings, it is possible to
control the encoder’s decision process for frame-type selection with respect
to intra-frames.
Performance/Quality:
-pq <mode>
The Rate-Distortion Algorithm
First introduced in DivX 5.1, the Rate-Distortion algorithm is enabled when the
Performance/Quality slider is set to Slow. The rate-distortion algorithm allows the
outcome of various decisions made by the encoder to be evaluated intelligently,
weighing bit spend against quality gain, where previously simpler algorithms
governed the process.
Every decision made as a video is encoded has an impact on both the rate and
the distortion. Rate describes the bits spent; distortion describes the change
in some measure of quality that results from a particular decision.
Here points above the rate-distortion curve evaluate as good decisions, those
below the curve as bad decisions. The best decisions fall towards the upper-left
of the plot (higher quality and lower bit spend).
It follows that the encoder must derive this rate-distortion curve from somewhere.
The curve is approximated by some function inside the encoder accepting a
value that changes the shape of the curve. This value is Lambda, and you can
manipulate it via the Feedback window.
If the rate-distortion curve becomes too steep quality will be degraded because
the encoder will not be able to achieve the quality demanded of it for a given rise
in bit spend. If the rate-distortion curve becomes too flat quality will be degraded
because the encoder will waste a lot of bits for very marginal improvements,
reducing the number of free bits available overall.
Deriving the optimal value for Lambda is very difficult, and under most
circumstances you should not over-ride the encoder default. The encoder
normally varies Lambda automatically during encoding depending on
circumstances; over-riding it instead fixes Lambda for the duration of one pass.
Lambda returns to its default mode of operation at the beginning of each pass.
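One common way to express this kind of trade-off is a Lagrangian cost, J = distortion + Lambda x rate, where the candidate decision with the lowest J is chosen. The sketch below illustrates that general idea only. The guide does not state the function the DivX encoder actually uses, and the candidate names and numbers are invented for illustration.

```python
def best_decision(candidates, lam):
    """Return the candidate with the lowest cost J = distortion + lam * rate."""
    return min(candidates, key=lambda c: c["distortion"] + lam * c["rate"])

# Hypothetical ways of coding one block: bits spent (rate) and resulting error (distortion).
candidates = [
    {"name": "intra", "rate": 900, "distortion": 10.0},
    {"name": "predicted", "rate": 300, "distortion": 25.0},
    {"name": "skip", "rate": 5, "distortion": 120.0},
]

for lam in (0.001, 0.05, 1.0):
    print(lam, best_decision(candidates, lam)["name"])
```

In this sketch a small Lambda favors the expensive, accurate choice, while a large Lambda favors the cheap one.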
Slow
QUICK GUIDE
Slow performance/quality mode enables the
rate-distortion algorithm, designed to
dramatically improve the video quality of low-
bitrate encoding.
In Slow mode the encoder uses the new Rate-Distortion algorithm to make
frame type decisions based upon the best balance of bit spend and quality
gain. Because of this, the Scene Change Threshold setting becomes
redundant and is ignored.
QPel is unsupported in Slow mode. If you have enabled both QPel and Slow
mode, the performance/quality mode will automatically fall back to Standard
mode when encoding begins.
Performance/Quality:
-pq <mode>
Psychovisual Enhancement
What is Psychovisual Enhancement?
DivX is a lossy codec, meaning that after encoding the video will have been
degraded to some extent. In other words, it will have lost some detail.
Storing the color value of every pixel is very expensive in terms of bits and thus
DivX uses a technique known as the Discrete Cosine Transformation to convert
this series of values into frequency information. The process of the DCT is an
advanced mathematical topic that will not be covered here. However, the main
feature of the DCT result is a set of co-efficients that represent the magnitude of
the frequencies composing the input series in
order of ascending frequency.
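As a rough illustration of the idea described above, the following sketch computes a one-dimensional DCT of a short series of values. It is a simplified, assumed example, not the encoder's implementation: video codecs of this family apply a two-dimensional transform to blocks of pixels, and the function name `dct` is chosen here only for illustration.

```python
import math

def dct(series):
    """Orthonormal one-dimensional DCT (type II) of a list of numbers."""
    n = len(series)
    return [
        math.sqrt((1.0 if k == 0 else 2.0) / n)
        * sum(x * math.cos(math.pi * (i + 0.5) * k / n) for i, x in enumerate(series))
        for k in range(n)
    ]

# A perfectly flat series contains no change at all, so only the first
# (lowest-frequency) co-efficient is non-zero.
flat = dct([100] * 8)

# A series that alternates rapidly puts its energy into the last
# (highest-frequency) co-efficient instead.
alternating = dct([100, 0, 100, 0, 100, 0, 100, 0])

print([round(c, 2) for c in flat])
print([round(c, 2) for c in alternating])
```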
The DCT, iDCT and Human Visual System
The human visual system is far less sensitive to high frequencies in an image
than it is to low frequencies. One perceptual technique DivX uses during lossy
compression is the reduction in accuracy of higher frequency co-efficients, saving
bits while causing the least perceivable quality degradation.
It is in fact these DCT co-efficients that are quantized by DivX when image data
is encoded (see Forward—Quantizers). Greater quantization of the DCT co-
efficients means fewer bits are ultimately required to store them, but a less
accurate image results when the inverse discrete cosine transformation (or iDCT)
is performed during decoding—the process of restoring the original series (image
data) from the DCT result. This error between source and encoded image is
known as quantization noise.
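The effect of quantization can be sketched in a few lines. The example below transforms a series, rounds each co-efficient to a multiple of a step size, restores the series with the inverse transform, and measures the mean squared error. The uniform step size and the function names are assumptions made for illustration; the guide does not describe the exact quantization arithmetic used by the encoder.

```python
import math

def dct(series):
    """Orthonormal one-dimensional DCT (type II)."""
    n = len(series)
    return [
        math.sqrt((1.0 if k == 0 else 2.0) / n)
        * sum(x * math.cos(math.pi * (i + 0.5) * k / n) for i, x in enumerate(series))
        for k in range(n)
    ]

def idct(coeffs):
    """Inverse of dct(): restores the original series from its co-efficients."""
    n = len(coeffs)
    return [
        sum(
            c * math.sqrt((1.0 if k == 0 else 2.0) / n)
            * math.cos(math.pi * (i + 0.5) * k / n)
            for k, c in enumerate(coeffs)
        )
        for i in range(n)
    ]

def quantization_noise(series, step):
    """Mean squared error after quantizing the co-efficients with the given step."""
    quantized = [round(c / step) for c in dct(series)]
    restored = idct([q * step for q in quantized])
    return sum((a - b) ** 2 for a, b in zip(series, restored)) / len(series)

samples = [52, 55, 61, 66, 70, 61, 64, 73]
fine = quantization_noise(samples, 1)     # small step: many distinct values to store
coarse = quantization_noise(samples, 31)  # large step: few values, larger error
print(fine, coarse)
```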
1. In flat areas of the image co-efficients are manipulated so that fine details
are enhanced. If we were to encode in 1-Pass, Quality-based mode (i.e. at
a fixed quantizer), this enhancement would naturally increase the bits
spent on flat areas of the image. However, at a fixed average bitrate the
effect is actually that textured areas of the image receive fewer bits.
Hence, when psychovisual enhancements are enabled, artifacts will
be masked in textured areas of the image where they are least visible.
The psychovisual enhancement system
Because both psychovisual enhancement methods mask artifacts in textured
areas where they are least visible, the video quality appears to improve when
psychovisual enhancements are enabled. You can visualize the concept by
imagining two characters having a conversation while standing in front of a tree—
the psychovisual enhancement process might enhance details in the characters'
faces at the expense of causing some artifacting among the leaves of the tree
where it would be least perceivable.
Experiencing PVE
The best way to understand the psychovisual enhancements is to watch them
working on an image. The feedback window provides an excellent method of doing
this, and also allows you to configure the degree to which flat and texture
psychovisual enhancements are performed.
With the video paused, slowly drag each slider back and forth from zero (no
psychovisual enhancement) to one (full psychovisual enhancement), watching
the image carefully as you do so. Since the aim of psychovisual enhancement is
to mask artifacts where you are least likely to perceive them, you should see that
both effects appear to improve quality as they are applied more strongly.
Where information is removed from the image the number overlaid will be
positive and the macroblock tinted red. Where information is added to the image
the number overlaid will be negative and the macroblock tinted blue.
Fast psychovisual enhancement
QUICK GUIDE
Fast psychovisual enhancement mode attempts
to manipulate the DCT co-efficients selectively
so that the resulting noise in the decoded image
is positioned where it will be least visible, for example
in areas of strong texture.
Psychovisual enhancements:
-psy <mode>
Slow psychovisual enhancement
QUICK GUIDE
Slow psychovisual enhancement mode analyzes
each block and the blocks surrounding it in turn
to ensure any enhancement does not introduce
significant blocking and ringing artifacts. Slow
mode is therefore less likely to introduce
artifacts than Fast mode.
Psychovisual enhancements:
-psy <mode>
Source Preprocessing
What is Preprocessing?
In the computer age, digital video can be acquired from a variety of sources. DV
Spatial filtering
Pre-processing consists of two filter types. The first is a spatial filter, which
is concerned with each pixel in the frame and the pixels surrounding it.
Noise manifests itself as high frequency changes between the colors of adjacent
pixels. You can stabilize the colors by applying a low-pass filter (one that rejects
high frequencies) to an image.
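A very simple low-pass filter replaces each pixel with the average of itself and its immediate neighbors. The sketch below shows that idea on a small grayscale image stored as a list of rows. It is an assumed example only; the guide does not say which filter the encoder applies, and `box_blur` is a name chosen for illustration.

```python
def box_blur(image):
    """3x3 mean filter; pixels at the edge average over the neighbors that exist."""
    height, width = len(image), len(image[0])
    result = []
    for y in range(height):
        row = []
        for x in range(width):
            neighbors = [
                image[j][i]
                for j in range(max(0, y - 1), min(height, y + 2))
                for i in range(max(0, x - 1), min(width, x + 2))
            ]
            row.append(sum(neighbors) / len(neighbors))
        result.append(row)
    return result

# One noisy pixel (190) in an otherwise flat area of value 100.
noisy = [
    [100, 100, 100],
    [100, 190, 100],
    [100, 100, 100],
]
smoothed = box_blur(noisy)
print(smoothed[1][1])  # the spike is pulled back towards its surroundings
```

The same averaging that suppresses the noisy pixel also softens genuine detail, which is why the strength of such a filter needs to be chosen with care.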
Temporal filtering
The second type of filter employed by the pre-processing algorithm is a temporal
filter. While the spatial filter can reduce noise within a single frame it does not
consider the changes in each pixel over time, i.e. from frame-to-frame. A good
example of temporal artifacting is flickering video, where even though there may
be little noise within each individual frame, there is temporal noise affecting the
intensity of pixels over time. Temporal noise is also responsible for producing
grain or discoloration patterns that change throughout a video as it is played.
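The principle of temporal filtering can be sketched by blending each pixel with the filtered value of the same pixel in the previous frame. This is a minimal, assumed example: the `strength` parameter and the blending rule are illustrative and are not taken from the encoder.

```python
def temporal_smooth(frames, strength=0.5):
    """Blend every frame with the previous filtered frame, pixel by pixel."""
    filtered = [frames[0]]
    for frame in frames[1:]:
        previous = filtered[-1]
        filtered.append([
            [strength * p + (1 - strength) * c for p, c in zip(prev_row, cur_row)]
            for prev_row, cur_row in zip(previous, frame)
        ])
    return filtered

# A single-pixel "video" that flickers between 100 and 120.
flicker = [[[100]], [[120]], [[100]], [[120]]]
steady = temporal_smooth(flicker)
print([frame[0][0] for frame in steady])
```

In this sketch the frame-to-frame swing shrinks from 20 to under 10. A real filter would also need to avoid blending across genuine motion.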
Source Preprocessing
QUICK GUIDE
Source pre-processing applies spatial and
temporal filtering prior to encoding in order to
reduce noise in the source video.
Source pre-processing:
Crop and Resize
Resizing
cropping by discarding areas of the video and resizing by
scaling the video along its horizontal and/or vertical axis.
By providing its own crop and resize controls, the DivX encoder ensures that
these two filter operations can be performed from any application capable of
compressing DivX video, regardless of whether the application itself includes
this functionality.
The encoder’s internal filters are optimized in terms of quality and performance
and may offer better performance than those in your video application.
Cropping is most useful when processing video sources that feature borders
around the exterior of the picture area, a common attribute of both widescreen
DVD video and analogue capture content.
Although the borders themselves are encoded
with relative ease, where the border meets the
picture a high contrast edge is created that
consumes a large number of bits when
encoded.
Why crop or resize?
This extra bit spend means that throughout the video fewer bits are available to
encode the actual picture, causing video quality to suffer. By cropping borders, no
bits are wasted outside the actual picture area and thus overall quality is
improved.
Resizing is used in a variety of situations, the simplest case being where there is
a desire to reduce the video dimensions. At lower bitrates reducing video
dimensions may assist in preserving quality by reducing the number of blocks
required to make up the picture - with fewer blocks each block should receive a
greater number of bits on average.
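The relationship between dimensions and bits per block can be estimated with simple arithmetic. The sketch below assumes 16x16 macroblocks and spreads the video bitrate evenly over every block of every frame, ignoring headers and the differences between frame types, so the result is only a rough average. The function name and example figures are illustrative.

```python
import math

def bits_per_macroblock(width, height, bitrate_kbps, fps):
    """Average bits available per 16x16 macroblock at the given bitrate and frame rate."""
    blocks = math.ceil(width / 16) * math.ceil(height / 16)
    bits_per_frame = bitrate_kbps * 1000 / fps
    return bits_per_frame / blocks

full = bits_per_macroblock(640, 480, 600, 25)   # 1200 macroblocks per frame
half = bits_per_macroblock(320, 240, 600, 25)   # 300 macroblocks per frame
print(full, half)
```

At the same bitrate, halving both dimensions leaves four times as many bits for each block on average.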
Crop
[Figure: the Crop top, Crop right, Crop left and Crop bottom settings]
You should crop any borders surrounding a video before encoding it to avoid
wasting bits. This will improve the overall quality of your video.
Cropping:
-c <left>,<right>,<top>,<bottom>
Left, right, top and bottom are the number of pixels from each edge of the
image that will be discarded before encoding.
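The effect of the four crop values on the picture dimensions can be checked with a small helper before encoding. This is an illustrative sketch: the helper names are hypothetical, and only the `-c <left>,<right>,<top>,<bottom>` argument format is taken from the text above.

```python
def cropped_size(width, height, left, right, top, bottom):
    """Return the (width, height) that remain after cropping each edge."""
    new_width = width - left - right
    new_height = height - top - bottom
    if new_width <= 0 or new_height <= 0:
        raise ValueError("crop values remove the whole picture")
    return new_width, new_height

def crop_argument(left, right, top, bottom):
    """Format the crop values in the order the -c parameter expects."""
    return f"-c {left},{right},{top},{bottom}"

# Example: a 720x576 source with 72-pixel black bars at the top and bottom.
print(cropped_size(720, 576, 0, 0, 72, 72))
print(crop_argument(0, 0, 72, 72))
```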
Resize
pixel.
[Figure: bilinear sampling example. New color = 70% (65% + 35%) + 30% (65% + 35%)]
Because bilinear sampling considers only 2x2 pixels in the source and
interpolates linearly, it produces a softened image on resizing and can cause
pixelation in enlarged images.
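Bilinear sampling as described here can be sketched as two linear blends: first along each of the two rows of the 2x2 neighborhood, then between the two results. The example is a generic illustration of the technique on a grayscale image, not the encoder's optimized filter.

```python
def bilinear_sample(image, x, y):
    """Sample a grayscale image (list of rows) at a fractional position (x, y)."""
    x0, y0 = int(x), int(y)
    x1 = min(x0 + 1, len(image[0]) - 1)
    y1 = min(y0 + 1, len(image) - 1)
    fx, fy = x - x0, y - y0
    # Blend horizontally along the upper and lower rows of the 2x2 neighborhood.
    upper = image[y0][x0] * (1 - fx) + image[y0][x1] * fx
    lower = image[y1][x0] * (1 - fx) + image[y1][x1] * fx
    # Blend the two results vertically.
    return upper * (1 - fy) + lower * fy

image = [
    [0, 100],
    [100, 200],
]
print(bilinear_sample(image, 0.5, 0.5))   # centre of the four pixels
print(bilinear_sample(image, 0.25, 0.0))  # a quarter of the way along the top row
```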
The encoder resize filters are very highly optimized. Consider using them in
place of external resize filters provided with video editing applications.
Resizing in certain applications, such as VirtualDub, can require colorspace
conversions that reduce encoding performance and minimally degrade the
video. Where crop and resize are the only filtering operations required,
greater encoding performance will be seen using the encoder’s resize filter.
Bilinear sampling is faster than bicubic sampling and is suitable for reducing
video. While it is acceptable to use bilinear sampling when enlarging, better
quality can be obtained with bicubic sampling.
A common use for resizing is aspect correction when encoding from DVD.
You can easily calculate the dimensions for the correct aspect ratio given the
original video dimensions and the aspect details from the DVD.
Example:
The video dimensions are 704x360 and the DVD box reports 2.35:1
aspect.
Or:
Resize:
-r <horizontal>,<vertical>,<mode>
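The aspect calculation for the 704x360, 2.35:1 example can be sketched as follows. The worked figures from the original guide are not reproduced in this text, so the numbers below are only what this sketch produces. Rounding to a multiple of 16 is an assumption made here to keep dimensions aligned to macroblocks, and the function names are hypothetical.

```python
def round_to_multiple(value, multiple=16):
    """Round to the nearest multiple (assumed here to be the macroblock size)."""
    return max(multiple, int(round(value / multiple)) * multiple)

def height_for_aspect(width, aspect, multiple=16):
    """Keep the width and choose the height that gives the target aspect ratio."""
    return round_to_multiple(width / aspect, multiple)

def width_for_aspect(height, aspect, multiple=16):
    """Keep the height and choose the width that gives the target aspect ratio."""
    return round_to_multiple(height * aspect, multiple)

# Either reduce the height to match the width...
print(704, height_for_aspect(704, 2.35))
# ...or enlarge the width to match the height.
print(width_for_aspect(360, 2.35), 360)
```

Keeping the width and reducing the height avoids enlarging the picture, which generally suits lower bitrates better.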
MPEG-4 Tools
Bi-directional Encoding
What is Bi-directional Encoding?
As discussed in Forward—General concepts, there are three different frame
types available for the DivX encoder to select from. These are intra-frames,
predicted-frames and bi-directional-frames, or I, P and B-frames respectively.
Recall that in an I-frame all blocks are intra-blocks and are encoded as the image
filling the block. In a P-frame blocks can be either intra-blocks or forward
predicted blocks—those described by a vector referencing a matching area of
image in the previous frame (known as forward prediction).
Bi-directional frames can contain blocks that are intra, forwards predicted,
backwards predicted, or both forwards and backwards predicted. This means
that B-frames reference not only the previous frame but the next frame also.
Frame 1 (I) Frame 2 (B) Frame 3 (P) Frame 4 (B) Frame 5 (P)
In the diagram above, frames 2 and 4 are bi-directional frames and any
macroblock may be predicted from either the previous (forward) or next
(backward) frame. If a block is both forwards and backwards predicted then when
motion compensation takes place the block image will be the result of blending
the appropriate areas of the forwards and backwards frames together.
Notice that frame 4 contains some intra-blocks because in the example these
blocks could not be predicted from either the forwards or backwards frame.
As described earlier in the guide, predicting blocks consumes far fewer bits than
encoding them as intra-blocks. Because B-frames can be both forwards and
backwards predicted, they will generally have a lower proportion of intra-blocks
than any other frame type.
These two key features of B-frames lead them to offer the greatest compression
ratio of any of the available frame types.
In a non-adaptive scheme where B-frames are forced for every alternate frame
(IBPBPBP), brief but significant artifacting can occasionally occur should the B-
frame fall between two dissimilar reference frames.
Frame 1 (B) Frame 2 (P) Frame 3 (B) Frame 4 (P) Frame 5 (B)
Multiple consecutive adaptive B-frames
Frame 1 (I) Frame 2 (B) Frame 3 (B) Frame 4 (B) Frame 5 (P)
Where a typical frame sequence using only single consecutive B-frames might
read IBPBPBP, the DivX 5.2 encoder can produce a frame sequence that reads
IBBPBPBBBP, depending on the content being encoded.
Single B-frames offer a very low bit spend, but as more B-frames are placed in
sequence the bit spend associated with each frame rises because B-frames do
not reference each other. Therefore, as the picture changes, each new B-frame
must include all the changes from the forwards and backwards reference frames.
Because the following P-frame can’t refer to changes described by the preceding
B-frames, it must describe the complete change from the previous reference
frame. As the number of consecutive B-frames grows, this change becomes
more significant, and so the bit spend for the P-frame tends towards that of an I-
frame, leading to a false economy in many consecutive B-frames.
The ordering of frames
Backward prediction introduces issues related to the sequence in which frames
must be encoded and decoded. Whereas forward prediction, as used by P-
frames, simply references the last frame that was decoded, a backwards
predicted block in a B-frame makes reference to some future frame that has not
yet been decoded. Therefore, in order to decode B-frames it is necessary to
decode future frames first. This is why frames may appear out of order in the
multipass log file (see Bitrate mode—Multipass).
Consider the illustration below again. Here the blue arrows represent forwards
prediction in the P-frames and the red arrows represent either forward (left) or
backward (right) prediction in the B-frames.
Frame 1 (I) Frame 2 (B) Frame 3 (P) Frame 4 (B) Frame 5 (P)
It is clear from the diagram that the B-frames can make reference to future
frames because there are red arrows pointing to the right. The encoder must re-
order these frames so that they are in the correct order for decoding.
Now consider the re-ordered sequence of frames:
Frame 1 (I) Frame 3 (P) Frame 2 (B) Frame 5 (P) Frame 4 (B)
Although frame 2 still makes reference to both frame 1 and frame 3, frame 3 is
now decoded first.
It is now clear that in the re-ordered sequence all frames can be decoded, as
every frame makes reference only to others that have been previously decoded
(all arrows point left). The decoder is responsible for displaying the decoded
frames in the correct order.
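The re-ordering described in this section can be sketched as a small function: each B-frame is held back until the next reference frame (I or P) has been emitted. This is an illustrative model of the principle rather than the encoder's actual logic, and `decode_order` is a name chosen here.

```python
def decode_order(display_types):
    """Map a display-order string such as "IBPBP" to 1-based frame numbers in decode order."""
    order = []
    pending_b = []
    for number, frame_type in enumerate(display_types, start=1):
        if frame_type == "B":
            # A B-frame must wait for the future frame it references.
            pending_b.append(number)
        else:
            order.append(number)
            order.extend(pending_b)
            pending_b = []
    # Trailing B-frames with no later reference frame are emitted last.
    order.extend(pending_b)
    return order

print(decode_order("IBPBP"))  # the example sequence used above
print(decode_order("IBBBP"))  # three consecutive B-frames
```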
Bi-directional Encoding
QUICK GUIDE
Bi-directional encoding lets the encoder use
the Bi-directional frame-type, supporting both
forwards and backwards motion prediction.
Bi-directional encoding can greatly improve video quality at low to medium
bitrates. At extremely high bitrates it may be beneficial to disable it.
Bi-directional encoding:
-b <mode>
Quarter Pixel
What is Quarter-pixel?
In a digitized video each frame is composed of a 2-
dimensional array of pixels, the small colored
squares that when viewed collectively from a
distance make up the picture.
The simplest way to perform this operation would be to compare each block in
the current frame against the surrounding area in the reference frame, moving
the block one pixel in any direction at a time. The relative vector offsets from the
origin of the current block during this search might be (0, 0), (0, 1), (0, 2), (1, 0)
and so on.
This scheme would be called whole-pixel because the search was performed
against the reference frame at 1-pixel resolution.
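A whole-pixel search of this kind can be sketched as an exhaustive comparison using the sum of absolute differences (SAD) as the matching cost. The cost measure, the search range and the function names are assumptions made for illustration; the guide does not specify how the encoder scores candidate positions.

```python
def sad(block, reference, left, top):
    """Sum of absolute differences between the block and the reference area at (left, top)."""
    return sum(
        abs(block[y][x] - reference[top + y][left + x])
        for y in range(len(block))
        for x in range(len(block[0]))
    )

def whole_pixel_search(block, reference, origin, search_range=2):
    """Try every whole-pixel offset within search_range; return (vector, cost) of the best match."""
    origin_x, origin_y = origin
    height, width = len(block), len(block[0])
    best_vector, best_cost = None, None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            left, top = origin_x + dx, origin_y + dy
            inside = (0 <= left <= len(reference[0]) - width
                      and 0 <= top <= len(reference) - height)
            if not inside:
                continue  # candidate area would fall outside the reference frame
            cost = sad(block, reference, left, top)
            if best_cost is None or cost < best_cost:
                best_vector, best_cost = (dx, dy), cost
    return best_vector, best_cost

# Reference frame with a unique value per pixel, and a 2x2 block copied from left=2, top=3.
reference = [[row * 6 + col for col in range(6)] for row in range(6)]
block = [[20, 21], [26, 27]]
vector, cost = whole_pixel_search(block, reference, origin=(1, 2))
print(vector, cost)
```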
Why use Quarter-pixel?
Motion vectors accurate to quarter-pixel resolution allow each motion-
compensated frame to be reconstructed more accurately than those accurate to
half-pixel resolution. The resulting video tends to appear sharper and of slightly
higher quality.
Consider that the eye is most sensitive to motion where objects in a scene are
moving very slowly. It would be difficult, for example, to accurately see how many
pixels a rocket had traveled across the screen after firing, but if instead we were
watching a snail crawling in front of the camera then accuracy would be far more
important.
The effect of perspective (where distant objects appear smaller than near
objects) biases the improvements made by quarter-pixel motion search with
respect to an object's depth in the scene. As seen by the camera, objects
appear to move slowly in the distance, where perspective causes space to appear
compressed, and fast near to the camera, where perspective has the least
significant effect.
Imagine a character walking across the video behind the car while it is positioned
near to the camera. The motion accuracy would not be critical because the
resolution of near objects is already high. In contrast, imagine the same
character walking across the video behind the car while it is distant from the camera.
The character would occupy far fewer pixels and motion accuracy would be more
important for fluid and accurate motion.
Quarter-pixel
QUICK GUIDE
Quarter-pixel increases the resolution of the
motion search to one quarter of one pixel,
double that of the default half-pixel resolution.
If you are not creating content strictly for your personal use be aware of the
decoding implications of quarter-pixel motion accuracy. Slower computers
may perform very poorly when attempting to decode quarter-pixel video—
some may fail to decode in real-time even when the quality slider is at its
minimum setting. Others may require substantially reduced post-processing,
leading to visible artifacting that can counteract the quality improvements
offered by quarter-pixel.
Quarter-pixel:
-q
Global Motion Compensation
Likewise, if the camera were rotated clockwise all of the blocks representing a
static scene would appear to move anti-clockwise around the center of the
picture. If the camera were to zoom in, all blocks would appear to move outwards
from the center of the picture towards the edges. It's entirely possible that the
camera might zoom, rotate and pan at the same time.
During motion estimation, global motion compensation finds and records the
global motion attributable to translation, rotation and zooming of the camera.
Motion vectors for blocks whose motion is consistent with the global motion can
be calculated during decoding as opposed to being explicitly specified, and
hence fewer bits are spent recording motion vectors when global motion
compensation is enabled.
When global motion compensation is not used the encoder remains efficient in
recording translation. Each motion vector is recorded as the difference from the
vector of its neighboring block, so when vectors are largely consistent
throughout a frame very few bits need be spent recording them.
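The benefit of recording differences can be seen in a short sketch. Here each vector is stored as its difference from the previous block's vector, which is a deliberate simplification: the guide only says "neighboring block", and the exact choice of neighbor is an assumption in this example.

```python
def encode_differences(vectors):
    """Record each motion vector as the difference from the previous one."""
    previous = (0, 0)
    differences = []
    for vx, vy in vectors:
        differences.append((vx - previous[0], vy - previous[1]))
        previous = (vx, vy)
    return differences

def decode_differences(differences):
    """Rebuild the original vectors by accumulating the differences."""
    previous = (0, 0)
    vectors = []
    for dx, dy in differences:
        previous = (previous[0] + dx, previous[1] + dy)
        vectors.append(previous)
    return vectors

# A camera pan: every block moves three pixels to the right.
pan = [(3, 0)] * 5
recorded = encode_differences(pan)
print(recorded)  # only the first entry is non-zero
```

Runs of zero differences like these are cheap to store, which is why consistent translation costs few bits even without global motion compensation.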
If you are not creating content strictly for your personal use be aware of the
decoding implications of global motion compensation. Slower computers may
perform poorly when decoding video with global motion compensation.
Global motion compensation can improve video quality by reducing the bits
spent on motion estimation, leading to the rate control reducing quantizers
and thus improving quality.
-g
Advanced
Scene-change threshold
QUICK GUIDE
As described in Forward—Predicted-frames and
Intra-frames, each frame in a video can be one
of three types: intra, predicted or bi-directional. It
follows that there must be some logic controlling frame
type selection—the encoder is not human and can’t
actually recognize where one scene changes into the
next.
The scene-change threshold defines the percentage of blocks not tracked from
the reference frames by the motion search required to trigger a scene-change
detection. When a scene-change is detected a key-frame (intra-frame) will be
used in place of a predicted or bi-directional frame.
Frame 1 (I): 100% untracked
Frame 2 (B): 0% untracked
Frame 3 (P): 9% untracked
Frame 4 (B): 4% untracked
Frame 5 (P): 14% untracked
Here if the threshold were 13% frame 5 would become a key-frame (intra-frame).
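The threshold test in this example can be written out as a short sketch. It assumes that a frame becomes a key-frame when its untracked percentage is strictly greater than the threshold; the guide does not say whether the comparison is strict, and the function name is hypothetical.

```python
def apply_scene_change_threshold(frames, threshold):
    """frames: list of (planned_type, untracked_percent). Returns the resulting frame types."""
    return [
        "I" if untracked > threshold else planned
        for planned, untracked in frames
    ]

# The five frames from the example above.
frames = [("I", 100), ("B", 0), ("P", 9), ("B", 4), ("P", 14)]
print(apply_scene_change_threshold(frames, 13))
print(apply_scene_change_threshold(frames, 20))
```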
Note that what the encoder views as a scene-change does not necessarily
correspond with what humans perceive to be a scene-change. Regardless of
which value you choose for the scene-change threshold, there is no
guarantee that the encoder will key-frame on an actual scene-change, just as
there is no guarantee that the encoder will not key-frame elsewhere.
Scene-change threshold:
-sc <Threshold>
Maximum key-frame interval
QUICK GUIDE
The maximum key-frame interval defines the
maximum number of consecutive frames that
may be a predicted type.
More frequent key-frames can reduce the seek recovery time in media
players when the viewer skips to a specific point in the video. Due to the
nature of predicted frames, a media player must decode every frame from the
nearest past key-frame to the seek target in order to resume playback at the
specified frame.
-key <frames>
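The cost of a seek can be estimated with a simple worst-case sketch. It assumes frames are numbered from zero, that key-frames occur only at the maximum interval (no extra key-frames from scene-change detection), and it ignores bi-directional frames. The function name is illustrative.

```python
def frames_to_decode(target, max_interval):
    """Frames a player must decode to display `target`, starting at the nearest past key-frame."""
    nearest_key_frame = (target // max_interval) * max_interval
    return target - nearest_key_frame + 1

# Seeking to frame 599 with key-frames every 300 frames versus every 50 frames.
print(frames_to_decode(599, 300))
print(frames_to_decode(599, 50))
```

A shorter interval reduces the work per seek, at the price of spending more bits on intra-frames.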
Quantization type
QUICK GUIDE
During encoding picture data is converted from
the spatial domain, i.e. color values per pixel
composing the picture, to the frequency domain.
Support for MPEG2 quantization is not required by any DivX Certified device.
If you encode your video using MPEG2 quantization you may be unable to
play it using DivX Certified devices.
Quantization type:
-qm MPEG
Interlacing
What is interlacing?
Interlacing is a method adopted from the old days of analog television that
allowed the frame rate of a video to be doubled by broadcasting only half a frame
at a time. This was achieved by dividing each frame into two fields, each
occupying alternating horizontal lines. We call these the top and bottom fields.
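Splitting a frame into its two fields amounts to taking alternate lines. The sketch below represents a frame as a list of lines and assumes that the first line belongs to the top field and that the frame has an even number of lines; both are assumptions made for illustration.

```python
def split_fields(frame):
    """Return (top_field, bottom_field) made of alternating horizontal lines."""
    return frame[0::2], frame[1::2]

def weave(top_field, bottom_field):
    """Interleave two fields back into a full frame."""
    frame = []
    for top_line, bottom_line in zip(top_field, bottom_field):
        frame.extend([top_line, bottom_line])
    return frame

frame = ["line 0", "line 1", "line 2", "line 3"]
top, bottom = split_fields(frame)
print(top, bottom)
```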
Why de-interlace?
During playback of an interlaced video half of one frame is continually being
merged with the previous picture. If the encoder were to treat each frame as
progressive (a complete picture that is not interlaced) then inevitably artifacting
would occur when half of a new frame was merged with the remaining half of the
previous frame, particularly where the new frame differs significantly from the
last.
Consider that if the encoder were to assume interlaced content was in fact
progressive then when motion occurred in the image interlacing artifacts would
be present. Greater motion causes greater artifacting and inconsistency in the
texture between frames. The motion search must be able to match blocks in the
current frame back to similar areas in the reference frame and an inconsistent
picture will cause this process to fail. Consequently, a higher proportion of blocks
will be intra-coded leading to a higher bit spend per frame, forcing the rate control
to use higher quantizers and lowering the overall quality of the encoded video.
Maintaining interlacing
In certain circumstances, it is desirable to maintain the original interlaced fields
discretely so they can be recreated without artifacting for display on an interlaced
device; playback via a DivX Certified device to an interlaced television, for
example.
To achieve this, the encoder can be set to Encode as interlaced. Fields will be
preserved; however, encoding each field individually requires a higher bitrate than
is normally required when encoding a progressive source.
Source interlace
QUICK GUIDE
Source interlace specifies the interlacing format
of the source video and how interlacing should
be handled by the encoder.
1. Progressive source
The encoder will assume that the source video
is progressive (not interlaced) and no de-
interlacing will be performed prior to encoding.
2. De-interlace source
The encoder will assume that the source video
is interlaced and will de-interlace it, encoding as
progressive.
3. Preserve interlacing
The encoder will assume that the source video is interlaced and will
encode each field in every frame separately.
DivX 5.2 does not support bi-directional coding when preserving interlacing. If
you choose to preserve interlacing, bi-directional coding will be automatically
disabled.
If your source video is interlaced you should select to de-interlace all frames
to progressive unless you have a specific reason for preserving interlacing.
Encoding as interlaced requires substantially higher bitrates in order to
achieve equal perceptual quality to progressive content.
The DivX 5.2 decoder does not support real-time de-interlacing during
playback. If you encode as interlaced the interlaced fields will be visible when
played via a progressive display, such as a PC monitor.
If you wish to resize and de-interlace, you must de-interlace before resizing. If
you use the encoder’s resize filter then de-interlacing is always performed
before resizing. If you use an external resize filter you must also use an
external de-interlace filter so that de-interlacing takes place before resizing.
Source interlace:
-d <mode>
Video Buffer Verifier
The video buffer verifier forms part of a virtual decoder that is attached to the
encoder, limiting its output in terms of certain device capabilities: the sustainable
rate at which the device can receive the video stream, the size of the buffer
associated with the video stream within the device, and how full that buffer should
be before playback begins.
When playing DivX videos on certain devices (e.g. a fast desktop computer),
these considerations can be overlooked without significant consequences.
However, when encoding for broadcast over an IP network (e.g. the Internet), or
for a hardware device with fixed sustainable throughput, it is critical for
uninterrupted playback that the abilities of the device are not exceeded.
Suppose a sustainable transfer rate of 2000 kbps from the drive and a 4096
kilobit read cache. If the encoder were to encode at 2500 kbps for a long period,
this would be 500 kbps in excess of the drive’s ability to read from the disc.
Assuming the 4096 kilobit cache was full to begin with, a 500 kbps under-run
would exhaust the buffer and cause playback to pause after around 8 seconds.
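The arithmetic behind this example can be written as a one-line calculation: the buffer drains at the difference between the stream bitrate and the sustainable read rate. The function name is chosen for illustration, and the sketch assumes a constant stream bitrate and a buffer that starts full.

```python
def seconds_until_underrun(buffer_kbit, read_kbps, stream_kbps):
    """Time until a full buffer empties, or None if the device can keep up."""
    deficit_kbps = stream_kbps - read_kbps
    if deficit_kbps <= 0:
        return None
    return buffer_kbit / deficit_kbps

print(seconds_until_underrun(4096, 2000, 2500))  # the example above
print(seconds_until_underrun(4096, 2000, 1800))  # within the device's ability
```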
Each DivX Certified Profile has specific VBV parameters associated with it
that ensure playback performance on DivX Certified devices. Although it is
possible to alter the VBV parameters in a way that would not negatively
impact playback performance, changing the VBV parameters is not
recommended except by experienced users with specific reasons for doing
so.
Using the video buffer verifier it is possible to create MPEG4 video streams
suitable for streaming, even though DivX Pro does not include a streaming
server itself.
Although the initial occupancy is always obeyed by the video buffer verifier
and is useful particularly at the beginning of an encoding session, it may not
always be respected by a decoder.
Video buffer verifier:
Bitrate is the maximum sustainable read bitrate in bits per second of the
target system.
Buffer size is the size in bits of the buffer associated with the video stream in
the target system.
Initial occupancy is the number of bits from the video stream assumed by the
encoder to have been pre-buffered by the target system before playback
begins.
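To see how the three parameters interact, the following Python sketch models the target system's buffer frame by frame. It is a simplified illustration, not the encoder's actual algorithm: the function name, the constant frame rate, and the rule that a whole frame is removed from the buffer at once are all assumptions made for the example.

```python
def first_underflow_frame(frame_bits, fps, bitrate, buffer_size, initial_occupancy):
    """Return the index of the first frame the buffer cannot supply, or None.

    Hypothetical model: the buffer refills at `bitrate` bits per second,
    never holds more than `buffer_size` bits, and starts playback holding
    `initial_occupancy` bits. Each frame is removed in one step.
    """
    occupancy = min(initial_occupancy, buffer_size)
    refill_per_frame = bitrate / fps  # bits arriving during one frame period
    for index, bits in enumerate(frame_bits):
        if bits > occupancy:
            return index  # playback would stall here
        occupancy = min(occupancy - bits + refill_per_frame, buffer_size)
    return None


# 25 fps video at a steady 2,500,000 bits per second (100,000 bits per frame),
# read at 2,000,000 bits per second through a full 4,096,000-bit buffer.
frames = [100_000] * 500
print(first_underflow_frame(frames, 25, 2_000_000, 4_096_000, 4_096_000))
```

With these figures the model stalls at frame 200, which is 8 seconds into a 25 fps video. Here a kilobit is taken to mean 1000 bits.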
Profiles
What is a profile?
In truth these players would typically support some, but not all, versions of DivX,
and then only limited features from these versions. The capabilities of hardware
devices claiming DivX compatibility varied so widely that encoding DivX video to
play consistently well on all of them became a hit-and-miss affair, and often
seemed an impossible task.
Creating video streams for DivX Certified devices is as easy as selecting the
profile badge as displayed on the certified device from the Select Profile Wizard
prior to encoding.
QUICK GUIDE
Profiles ensure that the encoder creates a video
stream compatible with DivX® Certified devices.
It is possible to encode video suitable for certified devices even when profiles
are disabled by carefully choosing encoder options.
When encoding for non-certified players that claim DivX compatibility, use the
Home Theatre profile for the greatest compatibility.
Profiles:
-profile <number>
DivX® Certified Program
DivX Certified devices fall into one of four categories, each associated with a
suitable profile that defines its abilities.
What’s in a profile?
Maximum average bitrate                              200 kbps   768 kbps    4000 kbps   8000 kbps
Maximum peak bitrate during any 1 second of video    800 kbps   4000 kbps   8000 kbps   32000 kbps
Bi-directional encoding support                      No         Yes         Yes         Yes
Interlaced video support                             No         No          Yes         Yes
Devices can exceed these minimum requirements but you should never rely
upon this when performing profile encoding.
By selecting the correct profile of the target DivX Certified device from the Select
Profile Wizard prior to encoding you ensure the video stream will always play
correctly on the device.
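When profiles are disabled, the limits in the table above can be checked by hand. The Python sketch below shows one way to do so. It is an illustration only: the four profiles are identified by their column position (0 to 3, left to right) because the table as reproduced here carries no column names, and the function and field names are hypothetical.

```python
# Limits copied from the table above, one entry per column, left to right.
PROFILE_LIMITS = [
    {"max_average_kbps": 200, "max_peak_kbps": 800, "b_frames": False, "interlaced": False},
    {"max_average_kbps": 768, "max_peak_kbps": 4000, "b_frames": True, "interlaced": False},
    {"max_average_kbps": 4000, "max_peak_kbps": 8000, "b_frames": True, "interlaced": True},
    {"max_average_kbps": 8000, "max_peak_kbps": 32000, "b_frames": True, "interlaced": True},
]


def fits_profile(column, average_kbps, peak_kbps, uses_b_frames=False, interlaced=False):
    """Return True if a stream stays within the limits of the given table column."""
    limits = PROFILE_LIMITS[column]
    return (
        average_kbps <= limits["max_average_kbps"]
        and peak_kbps <= limits["max_peak_kbps"]
        and (limits["b_frames"] or not uses_b_frames)
        and (limits["interlaced"] or not interlaced)
    )


def smallest_fitting_column(average_kbps, peak_kbps, uses_b_frames=False, interlaced=False):
    """Return the left-most column whose limits the stream satisfies, or None."""
    for column in range(len(PROFILE_LIMITS)):
        if fits_profile(column, average_kbps, peak_kbps, uses_b_frames, interlaced):
            return column
    return None


print(smallest_fitting_column(3000, 6000, uses_b_frames=True, interlaced=True))
```

The sketch covers only the four rows shown in the table. A real profile may impose further limits that are not modelled here.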
DivX certification requirements
Feature                                                                 Required
All DivX 3.11 movies on 1 CD, anything under 1 Mbps average bitrate     Yes
All DivX 4 content                                                      Yes
DivX 5 content with no GMC and no QPel                                  Yes
DivX video created for Video on Demand                                  Yes
DivX video created on a DivX Certified encoding device                  Yes
MP3 audio in DivX video both CBR and VBR                                Yes
DivX 3.11 movies on 2 CDs (high bitrates)                               No
DivX 5 content with GMC or QPel                                         No
XVID content                                                            No
ADPCM audio, PCM audio, Ogg Vorbis audio                                No
AVI files with bad audio/video interleaving                             No
Electrokompressiongraph™
What is the EKG?
We saw in Bitrate mode—Multipass that after each pass of a Multipass encoding
a log file is written containing an analysis of the video encoded during the pass,
used in turn to optimize the rate-control strategy for any successive pass.
Normally the rate-control will use the information from the log file in an unbiased
fashion, lending equal weight to all frames and basing frame quantizers upon the
bitrate and frame complexity. However, the encoder itself can’t conceive of the
actual content of a video as humans can, or accurately determine how the
human eye will perceive the quality of any particular sequence of video.
How the EKG works
As discussed in Forward-General concepts-Quantizers, DivX uses quantizers to
control the picture quality in each frame. Lower quantizers produce higher quality
and bit-spend, while higher quantizers produce lower quality and bit-spend.
The aim of the EKG application is to control the modulation value associated with
each frame in the multipass log file. This modulation value manipulates the frame
quantizer chosen by the rate-control by acting as a coefficient.
Thus if the rate control would normally encode a particular frame at Q=6 but a
modulation of 2.0 was specified the frame would instead be encoded at Q=12. A
modulation of 1.0 would leave the original quantizer unchanged (1.0 x Q = Q),
and a modulation of 0.5 would halve the quantizer so that the frame would be
encoded at Q=3.
Notice that the EKG can’t show the actual quantizer for any particular frame as
this is a decision made on-the-fly as video is encoded by the rate control.
Modulation allows you to manipulate quality in terms of proportionality, not in
terms of fixed quantizers.
Also notice that the relationship between quantizer value and quality is inverse:
a lower quantizer gives rise to higher quality. Therefore, a modulation of 2.0 is
the equivalent of 50% quality, while a modulation of 0.5 is the equivalent of 200%
quality.
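The relationship between modulation, quantizer and quality can be written out as a short Python sketch. The function names are illustrative and are not part of the EKG application. How the encoder rounds or limits the resulting quantizer is not described here, so the sketch leaves the value unrounded.

```python
def modulated_quantizer(base_quantizer, modulation):
    """Quantizer after modulation: the modulation acts as a simple multiplier."""
    return base_quantizer * modulation


def relative_quality_percent(modulation):
    """Quality relative to the unmodulated frame, using the inverse relationship."""
    return 100.0 / modulation


for modulation in (2.0, 1.0, 0.5):
    q = modulated_quantizer(6, modulation)
    quality = relative_quality_percent(modulation)
    print(f"modulation {modulation}: Q=6 becomes Q={q}, about {quality}% quality")
```

The three cases printed correspond to the three modulation values discussed above.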
Using the EKG
Using the EKG, you can graph the contents of the multipass log file after only the
1st pass. However, to preview the encoded video as you manipulate the modulation
control, at least one nth pass must have been performed. Making best use of the
EKG therefore requires at least three passes, with the EKG application used
between passes two and three.
The two left-most toolbar buttons add grid-lines to the graph area. You can also
select which statistics are graphed and how they are displayed from the Graph
menu.
3. Click the Edit button on the toolbar to enter editing mode. Clicking on the
graph will now operate the modulation control system.
After you release the grab-bar the new modulation for the
frame will be set and reflected in the orange modulation
line that runs throughout the graph.
5. After making your changes, choose Save Changes from the File menu.
You must now run one more nth pass to incorporate these changes into
your DivX video.
Modulation settings are preserved between passes. You can revise your
modulation settings at a later time if you wish.
DivX® Decoder
About the decoder
The DivX decoder plays all DivX® 3, 4 and 5 series video, applies special post-
processing effects to enhance video quality, and makes best use of the
advanced features provided by your graphics hardware for improved playback
performance.
The DivX decoder will be used by all third-party media players to play DivX video.
You need only configure it once to impose changes on all of your media players.
It is worth noting before discussing the decoder features that some third-party
MPEG4 decoder filters can commandeer DivX decoding from the genuine DivX
decoder in media players that are based upon Microsoft’s DirectShow platform.
You can check that the genuine DivX decoder is playing your DivX videos simply
by playing any DivX video in a DirectShow based media player (such as
Microsoft’s Windows Media Player) and watching for the DivX logo in the lower
right hand corner of the movie. The logo should appear for several seconds when
playback begins if the genuine DivX decoder filter is in use.
Post-processing
Post-processing refers to effects applied to the video after decoding to enhance
the picture in some way.
The DivX decoder contains three post-processing techniques. These are de-
blocking, de-ringing, and the film effect.
Deblocking
The most visible artifact occurring in low-bitrate DivX video is called blocking,
and appears as small 8x8 squares in the video. The artifact is caused by the DCT
algorithm used by DivX to encode the image, and becomes strongest where video is
encoded at very low bitrates due to high quantization.
Deblocking blends the transition along the block edges so that individual blocks
become less apparent.
Deringing
Ringing is another artifact caused by the DCT algorithm. Ringing appears as an
outline or shadowing around contrasting edges in the picture.
Automatic post-processing
The DivX decoder can automatically adjust the level of de-blocking and de-ringing
based upon your computer’s real-time playback performance. This prevents the
decoder from dropping frames if the CPU becomes overloaded.
Post-processing levels
The manual de-blocking and de-ringing controls do not control the strength of the
post-processing, but rather where and how it is applied.
the luminance channel and so de-blocking gives preference to this.
De-blocking   Horizontal     Vertical       Horizontal     Vertical
level         luminance      luminance      chrominance    chrominance
              de-blocking    de-blocking    de-blocking    de-blocking
0 (MIN)       No             No             No             No
1             Yes            No             No             No
2             Yes            Yes            No             No
3             Yes            Yes            Yes            No
4 (MAX)       Yes            Yes            Yes            Yes
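The table follows a simple cumulative pattern: each level switches on one more de-blocking stage, in a fixed order. The Python sketch below expresses that pattern. It only restates the table and says nothing about how the decoder performs each stage, and the names used are illustrative.

```python
# Stages in the order the table enables them, left to right.
DEBLOCKING_STAGES = (
    "horizontal luminance",
    "vertical luminance",
    "horizontal chrominance",
    "vertical chrominance",
)


def enabled_stages(level):
    """Return the de-blocking stages active at a level from 0 (MIN) to 4 (MAX)."""
    if not 0 <= level <= len(DEBLOCKING_STAGES):
        raise ValueError("de-blocking level must be between 0 and 4")
    return DEBLOCKING_STAGES[:level]


for level in range(5):
    print(level, enabled_stages(level))
```

Both luminance stages are enabled before either chrominance stage.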
Film effect
The film effect restores warmth to the image and masks the artificial smoothing
effect that can result from low-bitrate encoding by applying a film grain effect to
the picture.
Quality settings
The quality settings page allows configuration of decoder
options relating to the rendering of DivX video.
►Smooth playback
When a DivX video contains bi-directional
frames smooth playback makes use of a buffer
so the decoder need only decode one frame for
every frame it displays.
The last frame of the video may be dropped due to the buffer but
because only one frame is decoded at a time smooth playback
enhances performance for slower CPUs.
With smooth playback disabled, B-frames are decoded at the same time
as the frames they reference and then displayed at the appropriate
time.
If the frame sequence were 1I, 2P, 3B, 4P, 5B, the decoder would
decode and display 1I, decode 2P and 3B and display 3B, wait and
display 2P, decode 4P and 5B and display 5B, then wait and display 4P.
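The re-ordering in this example can be reproduced with a short Python sketch. It models only the order in which frames reach the screen, not the timing or the decoding itself, and it assumes frame labels that end in I, P or B as in the example. The function name is illustrative.

```python
def display_order(stored_order):
    """Return the order frames are shown in, given the order they are stored in.

    Simplified model: an I- or P-frame is held back until any B-frames
    stored directly after it have been shown.
    """
    shown = []
    held = None  # most recent I- or P-frame not yet displayed
    for frame in stored_order:
        if frame.endswith("B"):
            shown.append(frame)  # B-frames are displayed ahead of the held frame
        else:
            if held is not None:
                shown.append(held)
            held = frame
    if held is not None:
        shown.append(held)
    return shown


print(display_order(["1I", "2P", "3B", "4P", "5B"]))
```

For the sequence above the frames are shown in the order 1I, 3B, 2P, 5B, 4P, as described in the text.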
►YUV Extended
When enabled the decoder will output video in YV12 color mode. YV12
is the internal color format used by DivX and thus is the fastest method
of decoding as data can be sent directly to the video card.
►Overlay Extended
When enabled the decoder will attempt to use the hardware overlay
instead of the software overlay. The hardware overlay is a feature of
some graphics cards that increases both rendering performance and
quality when video is scaled.
When overlay extended is enabled, only one video file may be played
at a time.
►Double Buffering
Double buffering causes the video card to reserve additional memory
for receiving video data. Double buffering improves the smoothness of
playback but may not work with older graphics cards that have very little
onboard memory.
►Disable Logo
When enabled, this option will prevent the decoder from overlaying the
DivX Video logo in the lower right-hand corner of the video during the
first few seconds of playback.
Acknowledgements
Thanks and credits
Permission to illustrate this guide with images taken from
Killer Bean 2 was very kindly given by Jeff Lew. See his
work online and read about his DVD, Learning 3D Character
Animation:
http://www.jefflew.com
Thanks to the entire DivX Advanced Research Centre (DARC) team for their
assistance in writing this guide:
Eugene Kuznetsov
Andrea Graziani
John Funnell
Adam Li
Adrian Bourke
Cheng Huang
Jérôme Rota
Darrius Thompson
Legal
Microsoft is a registered trademark of Microsoft Corporation. QuickTime and Macintosh are trademarks of Apple
Computer Inc. VirtualDub is written and maintained by Avery Lee. DivX, “DivX Certified” and the DivX Certified logo are
registered trademarks of DivXNetworks. Killer Bean is the property of Jeff Lew. All other trademarks are the property of
their respective owners.