HEVC Deblocking Filter


Andrey Norkin, Gisle Bjøntegaard, Arild Fuldseth, Matthias Narroschke, Masaru Ikeda,
Kenneth Andersson, Minhua Zhou, and Geert Van der Auwera

Abstract—This paper describes the in-loop deblocking filter used in the upcoming High Efficiency Video Coding (HEVC) standard to reduce visible artifacts at block boundaries. The deblocking filter performs detection of the artifacts at the coded block boundaries and attenuates them by applying a selected filter. Compared to the H.264/AVC deblocking filter, the HEVC deblocking filter has lower computational complexity and better parallel processing capabilities while still achieving significant reduction of the visual artifacts.

Index Terms—Block-based coding, deblocking, video coding, video filtering, video processing.

Manuscript received April 15, 2012; revised July 19, 2012; accepted August 20, 2012. Date of publication October 5, 2012; date of current version January 8, 2013. This paper was recommended by Associate Editor B. Pesquet-Popescu.
A. Norkin and K. Andersson are with Ericsson Research, Stockholm 164 89, Sweden (e-mail: andrey.norkin@ericsson.com; kenneth.r.andersson@ericsson.com).
G. Bjøntegaard and A. Fuldseth are with Cisco Systems Norway, Lysaker 1366, Norway (e-mail: arilfuld@cisco.com; gbjonteg@cisco.com).
M. Narroschke is with the Panasonic Research and Development Center, Langen 63255, Germany (e-mail: matthias.narroschke@eu.panasonic.com).
M. Ikeda is with the Technology Development Group, Sony Corporation, Tokyo 141-8610, Japan (e-mail: masaru.ikeda@jp.sony.com).
M. Zhou is with the Systems and Applications Research and Development Center, Texas Instruments, Inc., Dallas, TX 75243 USA (e-mail: zhou@ti.com).
G. Van der Auwera is with Qualcomm Technologies, Inc., San Diego, CA 92121 USA (e-mail: geertv@qualcomm.com).
Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.
Digital Object Identifier 10.1109/TCSVT.2012.2223053

I. Introduction

HIGH EFFICIENCY Video Coding (HEVC) [1] is a new video coding standard currently being developed jointly by ITU-T SG 16 Q.6, also known as the Video Coding Experts Group (VCEG), and by ISO/IEC JTC 1/SC 29/WG 11, also known as the Moving Picture Experts Group (MPEG), in the Joint Collaborative Team on Video Coding (JCT-VC). The first version of the HEVC standard is planned to be finalized in January 2013, while the development of the scalable and 3-D extensions of HEVC is expected in the following years.

Similar to the previous video coding standards, such as H.264/AVC, the upcoming HEVC standard is based on a hybrid coding scheme using block-based prediction and transform coding. First, the input signal is split into rectangular blocks that can be predicted from previously decoded data either by motion-compensated prediction [3] or by intra prediction. The resulting prediction error is coded by applying block transforms based on an integer approximation of the discrete cosine transform, which is followed by quantization and coding of the transform coefficients. While H.264/AVC [2] divides a picture into fixed-size macroblocks of 16 × 16 samples, HEVC divides a picture into coding tree units (CTUs) of 16 × 16, 32 × 32, or 64 × 64 samples. The coding tree units can be further divided into smaller blocks using a quadtree structure; such a block, called a coding unit (CU), can further be split into prediction units (PUs) and is also the root of a transform quadtree. Each of the child nodes of the transform quadtree defines a transform unit (TU). The size of the transforms used in the prediction error coding can vary from 4 × 4 to 32 × 32 samples, thus allowing transforms larger than in H.264/AVC, which uses 4 × 4 and 8 × 8 transforms. As the optimal size of the above-mentioned blocks typically depends on the picture content, the reconstructed picture is composed of blocks of various sizes, each block being coded using an individual prediction mode and prediction error transform.
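As a rough illustration of the block partitioning just described, the following sketch shows how a CTU might be represented as a recursively split coding unit. This is a minimal illustration only; the type and field names are hypothetical and are not taken from the HEVC specification or its reference software.

    /* Minimal, hypothetical sketch of the CTU/CU quadtree described above.
     * Type and field names are illustrative; they do not come from [1]. */
    typedef struct CodingUnit {
        int x, y;                   /* top-left luma sample position of the block */
        int size;                   /* 64, 32, 16, or 8 luma samples              */
        int split_flag;             /* 1: the CU is split into four sub-CUs       */
        struct CodingUnit *sub[4];  /* child CUs when split_flag == 1             */
        /* When not split, the CU carries its prediction units (PUs) and the
         * root of a transform quadtree whose leaves are the TUs (4x4 .. 32x32). */
    } CodingUnit;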
In a coding scheme that uses block-based prediction and transform coding, discontinuities can occur in the reconstructed signal at the block boundaries. Visible discontinuities at the block boundaries are known as blocking artifacts. A major source of blocking artifacts is the block-transform coding of the prediction error followed by coarse quantization. Moreover, in a motion-compensated prediction process, the predictions for adjacent blocks in the current picture might not come from adjacent blocks in the previously coded pictures, which creates discontinuities at the block boundaries of the prediction signal. Similarly, when intra prediction is applied, the prediction process of adjacent blocks might be different, causing discontinuities at the block boundaries of the prediction signal.

Two approaches to reduce blocking artifacts are post-filtering and in-loop filtering. Post-filtering is not specified by the video coding standard and can be performed, e.g., in the display buffer. The implementer has the freedom to design an algorithm driven by application-specific requirements. In-loop filters, in contrast, operate within the encoding and decoding loops. Therefore, they need to be normative to avoid drift between the encoder and the decoder.

The HEVC draft standard defines two in-loop filters that can be applied sequentially to the reconstructed picture: the deblocking filter and the sample adaptive offset (SAO) filter, both of which are currently included in the main profile. This paper describes the first of these two in-loop filters, the deblocking filter. Depending on the configuration, SAO can be applied to the output of the deblocking filtering process.

The deblocking filter in HEVC has been designed to improve the subjective quality while reducing the complexity. The latter consideration is important since the deblocking filter of the H.264/AVC standard [2], [4] constitutes a significant part of the decoder complexity. As a result, the HEVC deblocking filter is less complex than the H.264/AVC deblocking filter, while still having the capability to improve the subjective and objective quality.

Another aspect that received significant attention in the HEVC deblocking filter design is its suitability for parallel processing. Deblocking in HEVC has been designed in a way that prevents spatial dependences across the picture, which, together with other design features, enables easy parallelization on multiple cores.

In the following sections, an overview of the HEVC deblocking filter design is provided. For more details, the reader is referred to [1] and to the corresponding input contributions to the JCT-VC. The initial deblocking filter design was adopted from [5]. The filtering decisions and operations, as described in Sections II and III, mainly result from the adoption of the contributions in [6] and [7]. For the sequence- and picture-level adaptivity (see Section IV), the main adopted contribution is [12]. The parallel processing capabilities, as described in Section V, mainly result from the adoption of [8]–[10].

II. Filtering Decisions

A. Block Boundaries for Deblocking

As mentioned above, independent coding of blocks creates discontinuities at block boundaries. An example of a block boundary with a blocking artifact is shown in Fig. 1. Blocking artifacts can easily be noticed by the human visual system when the signal on both sides of the block boundary is relatively smooth, but are more difficult to notice when the signal shows high variation. Furthermore, if the original signal across the block boundary shows high variation, it is difficult to say whether changes in the reconstructed signal across the block boundary are caused by the coding or belong to the original signal.

Fig. 1. 1-D example of a block boundary with a blocking artifact.

The main difficulty when designing a deblocking filter is to decide whether or not to filter a particular block boundary, and to decide on the filtering strength to be applied. Excessive filtering may lead to unnecessary smoothing of the picture details, whereas a lack of filtering may leave blocking artifacts that would reduce the subjective quality. Deciding whether to filter a block boundary should, therefore, depend on the characteristics of the reconstructed pixel values on both sides of that block boundary, and on the coded parameters indicating whether it is likely that a blocking artifact has been created by the coding process.

The filtering decisions that are elaborated in the following subsections are made separately for each boundary of four-sample length that lies on the grid dividing the picture into blocks of 8 × 8 samples. Block boundaries on the 8 × 8 grid are illustrated in Fig. 2. Only boundaries on the 8 × 8 grid that are either prediction unit or transform unit boundaries are subjected to deblocking.

Fig. 2. Illustration of picture samples and horizontal and vertical block boundaries on the 8 × 8 grid, and the nonoverlapping blocks of 8 × 8 samples, which can be deblocked in parallel.

Deblocking is, therefore, performed on a four-sample part of a block boundary when all of the following three criteria are true: 1) the block boundary is a prediction unit or transform unit boundary; 2) the boundary strength is greater than zero; and 3) the variation of the signal on both sides of the block boundary is below a specified threshold (see Fig. 4). When certain additional conditions (Section II-D) hold, a strong filter is applied on the block edge instead of the normal deblocking filter.

B. Boundary Strength (Bs) and Edge-Level Adaptivity

Boundary strength (Bs) is calculated for boundaries that are either prediction unit boundaries or transform unit boundaries. The boundary strength can take one of three possible values: 0, 1, and 2. The definition of Bs is given in Table I.

TABLE I
Definition of Bs Values for the Boundary Between Two Neighboring Luma Blocks

Condition                                                                Bs
At least one of the blocks is intra coded                                 2
At least one of the blocks has non-zero coded residual coefficients,
  and the boundary is a transform boundary                                1
Absolute differences between corresponding spatial motion vector
  components of the two blocks are >= 1 in units of integer pixels        1
Motion-compensated prediction for the two blocks refers to different
  reference pictures, or the number of motion vectors is different
  for the two blocks                                                      1
Otherwise                                                                 0

For the luma component, only block boundaries with Bs values equal to one or two are filtered. This implies that there is typically no filtering within static areas, which helps avoid multiple subsequent filtering of the same areas where pixels are copied from one picture to another with a residual equal to zero, since such repeated filtering can cause oversmoothing. The difference in filtering operations between Bs equal to one and Bs equal to two is described in Section III-D.

In the case of the chroma components, only boundaries with Bs equal to two are filtered. This implies that only those block boundaries are filtered where at least one of the two adjacent blocks is intra predicted.
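To illustrate Table I, the Bs derivation can be sketched as follows. This is a simplified sketch rather than the normative derivation in [1]: the BlockInfo type and its fields are hypothetical, and only the first motion vector of each block is compared for brevity.

    #include <stdlib.h>

    /* Simplified sketch of the Bs derivation of Table I (hypothetical types). */
    typedef struct {
        int is_intra;            /* block coded in intra mode                  */
        int has_coded_residual;  /* block has non-zero transform coefficients  */
        int num_mv;              /* number of motion vectors                   */
        int ref_pic;             /* reference picture index (first MV only)    */
        int mv_x, mv_y;          /* motion vector in quarter-pel units         */
    } BlockInfo;

    static int boundary_strength(const BlockInfo *p, const BlockInfo *q,
                                 int is_transform_boundary)
    {
        if (p->is_intra || q->is_intra)
            return 2;
        if (is_transform_boundary &&
            (p->has_coded_residual || q->has_coded_residual))
            return 1;
        /* MV difference of at least one integer sample (4 quarter-pel units) */
        if (abs(p->mv_x - q->mv_x) >= 4 || abs(p->mv_y - q->mv_y) >= 4)
            return 1;
        /* different reference pictures or different number of motion vectors */
        if (p->ref_pic != q->ref_pic || p->num_mv != q->num_mv)
            return 1;
        return 0;
    }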
C. Local Adaptivity and Filtering Decisions

If Bs is greater than zero, additional conditions are checked for luma block edges to determine whether the deblocking filtering should be applied to the block boundary or not.

As can be seen from Fig. 1, a blocking artifact is characterized by low spatial activity on both sides of the block boundary, combined with a discontinuity at the block boundary itself. Therefore, for each block boundary of four-sample length on the 8 × 8 sample grid that satisfies the conditions described above, the following condition is checked to decide whether the deblocking filtering is applied (see Fig. 3):

|p2,0 − 2p1,0 + p0,0| + |p2,3 − 2p1,3 + p0,3| + |q2,0 − 2q1,0 + q0,0| + |q2,3 − 2q1,3 + q0,3| < β    (1)

where the threshold β depends on the quantization parameter QP that is used to adjust the quantization step for quantizing the prediction error coefficients [5]. The threshold is derived from a table that has a piecewise linear dependence on QP, as described in Section IV. Equation (1) evaluates how much the signal on both sides of the block boundary deviates from a straight line (a constant level signal or a ramp). Only the first and fourth lines of a block boundary of length four are evaluated to reduce complexity. The example in Fig. 3 and the equations in this and the following sections only consider the case of a vertical block boundary for the sake of brevity. The example can easily be extended to the deblocking of horizontal block boundaries by rotating the figure by 90° in the clockwise direction and exchanging the row and column subscript indices in the equations.

Fig. 3. Four-pixel long vertical block boundary formed by the adjacent blocks P and Q. Deblocking decisions are based on the lines marked with the dashed lines (lines 0 and 3).

For block boundaries with an associated Bs greater than zero, and for which (1) holds, deblocking filtering is performed. There are two deblocking filtering modes in HEVC, namely, a normal filtering mode and a strong filtering mode. For each block boundary of four samples in length, the deblocking filter switches between the normal and the strong filtering mode based on the local signal characteristics.
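A minimal sketch of the on/off decision in (1) for one four-sample boundary segment is given below. The two-dimensional sample array layout (line index first, then distance from the boundary) and the function name are illustrative assumptions, not part of the standard.

    #include <stdlib.h>

    /* Sketch of decision (1). p[i][j] and q[i][j] denote the sample at
     * distance j from the boundary on line i of the P and Q block,
     * respectively; beta is the QP-dependent threshold. */
    static int filtering_enabled(const int p[4][4], const int q[4][4], int beta)
    {
        int d0 = abs(p[0][2] - 2 * p[0][1] + p[0][0])
               + abs(q[0][2] - 2 * q[0][1] + q[0][0]);   /* line 0 activity */
        int d3 = abs(p[3][2] - 2 * p[3][1] + p[3][0])
               + abs(q[3][2] - 2 * q[3][1] + q[3][0]);   /* line 3 activity */
        return (d0 + d3) < beta;   /* filter only when the activity is low */
    }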
D. Decisions Between Normal and Strong Deblocking

Whether to apply strong or normal deblocking is also determined based on the first and the fourth lines across the block boundary of four samples (see Fig. 3). The following expressions, using information from lines i = 0 and i = 3, are evaluated to make the decision between the normal and the strong filtering [5], [9]:

|p2,i − 2p1,i + p0,i| + |q2,i − 2q1,i + q0,i| < β/8    (2)
|p3,i − p0,i| + |q0,i − q3,i| < β/8    (3)
|p0,i − q0,i| < 2.5 tC.    (4)

The threshold parameter tC depends on QP and is defined by a table (see Section IV for details). If (2), (3), and (4) hold for both lines 0 and 3, the strong filtering is applied to the block boundary. Otherwise, normal filtering is applied. Condition (2) checks that there is low spatial activity on the sides of the block boundary [similar to (1) but using a lower threshold], condition (3) checks that the signal on the sides of the block boundary is flat, and condition (4) checks that the difference in intensities of the samples on the two sides of the block boundary does not exceed a threshold, which is a multiple of the QP-dependent clipping value tC(QP) (see Section IV). The sequence of deblocking filtering decisions described above is summarized in Fig. 4.

Fig. 4. Decisions for each segment of block boundary of four samples in length lying on the 8 × 8 block boundary grid. PU: prediction unit. TU: transform unit.
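The strong/normal decision in (2)–(4) can be sketched as follows for a single line. The sample array layout is again an illustrative assumption, and the 2.5·tC threshold in (4) is realized here in integer arithmetic as (5·tC + 1) >> 1, which matches the fractional threshold up to rounding.

    #include <stdlib.h>

    /* Sketch of conditions (2)-(4) for one line; p[0..3] and q[0..3] are
     * counted outwards from the boundary, beta and tc are the QP-dependent
     * thresholds. Strong filtering is chosen for the four-sample segment
     * only if this test passes for both line 0 and line 3. */
    static int strong_filter_line(const int p[4], const int q[4],
                                  int beta, int tc)
    {
        int cond2 = abs(p[2] - 2 * p[1] + p[0])
                  + abs(q[2] - 2 * q[1] + q[0]) < (beta >> 3);     /* (2) */
        int cond3 = abs(p[3] - p[0]) + abs(q[0] - q[3]) < (beta >> 3); /* (3) */
        int cond4 = abs(p[0] - q[0]) < ((5 * tc + 1) >> 1);        /* (4) */
        return cond2 && cond3 && cond4;
    }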
E. Deblocking Decisions in Normal Filtering Mode

Normal filtering has two modes differing in the number of pixels being modified on each side of the block boundary. One of the two modes is selected for each boundary based on the following two conditions [6]:

|p2,0 − 2p1,0 + p0,0| + |p2,3 − 2p1,3 + p0,3| < (3/16)β    (5)
|q2,0 − 2q1,0 + q0,0| + |q2,3 − 2q1,3 + q0,3| < (3/16)β.    (6)

If (5) is true, the two nearest pixels to the block boundary can be modified in block P. Otherwise, only the nearest pixel in block P can be modified. Similarly, if (6) holds, the two nearest pixels to the block boundary can be modified in block Q. Otherwise, only the nearest pixel can be modified. The thresholds used in (5) and (6) are also dependent on the quantization parameter QP since they are multiples of the threshold β. The values of the thresholds used in (5) and (6) are less than the value of the threshold in (1), but greater than the value of the threshold in (3). This means that the longer (stronger) filtering is allowed for the block boundaries that have lower spatial activity on the sides of the boundaries.
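A sketch of the per-side decision in (5) and (6) follows. Here dp and dq denote the spatial-activity sums on the P and Q sides computed from lines 0 and 3, and (β + (β >> 1)) >> 3 is used as an integer realization of 3β/16; both the helper name and this integer form are assumptions of the sketch.

    /* Sketch of decisions (5) and (6): how many pixels may be modified
     * on each side of the boundary in the normal filtering mode. */
    static void normal_filter_width(int dp, int dq, int beta,
                                    int *modify_p1, int *modify_q1)
    {
        int side_threshold = (beta + (beta >> 1)) >> 3;   /* approx. 3/16 of beta */
        *modify_p1 = dp < side_threshold;   /* (5): p1 and p0 may change on side P */
        *modify_q1 = dq < side_threshold;   /* (6): q1 and q0 may change on side Q */
    }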
III. Filtering Operations

A. Normal Filtering Operations

When a picture contains an inclined surface (a linear ramp signal) that crosses a block boundary, the deblocking filter can be active on that boundary. In these cases, however, the normal deblocking filter operations should not modify the signal.

In the normal filtering mode for a segment of four lines (see Fig. 3), filtering operations are applied to each line. In the following, the second indices of the pixels, indicating the line number, are omitted for brevity.

The filtered pixel values p0′ and q0′ are calculated for each line across the block boundary as follows:

p0′ = p0 + Δ0    (7)
q0′ = q0 − Δ0    (8)

where the value of Δ0 is obtained by clipping δ0

δ0 = (9(q0 − p0) − 3(q1 − p1) + 8) >> 4.    (9)

The clipping operation is described in Section III-D. Neglecting the clipping operation, the impulse response of this filter is (3 7 9 −3)/16. The offset value δ0 corresponds to the deviation of the signal at the sides of the block boundary from a perfect ramp; the offset is zero if the signal across the block boundary forms a ramp.

Furthermore, the deblocking filtering is applied to a row or column of samples across the block boundary if and only if the following expression holds:

|δ0| < 10 tC.    (10)

If this condition does not hold, it is likely that the change of the signal across the block boundary is caused by a natural edge and not by a blocking artifact.

If (5) is true, the modified value p1′ in each line across the block boundary is obtained by

p1′ = p1 + Δp1.    (11)

Similarly, if (6) is true, then q1′ is calculated as

q1′ = q1 + Δq1    (12)

where the offset values Δp1 and Δq1 are obtained by clipping the corresponding δp1 and δq1 values, which are calculated as

δp1 = (((p2 + p0 + 1) >> 1) − p1 + Δ0) >> 1    (13)
δq1 = (((q2 + q0 + 1) >> 1) − q1 − Δ0) >> 1.    (14)

Neglecting the clipping operation, the impulse response of the filter that corresponds to the modification of the pixel at position p1 is (8 19 −1 9 −3)/32.

The sequence of filtering decisions for each line of pixels in the normal filtering mode is summarized in Fig. 5.

Fig. 5. Decisions for the normal filter that are applied to each line of a four-sample segment.
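The normal filtering operations (7)–(14), including the clipping described in Section III-D, can be sketched for one line of samples as follows. The array layout and function names are illustrative, and the sketch is an illustration of the equations above rather than a verbatim copy of the normative process in [1].

    #include <stdlib.h>

    static int clip3(int lo, int hi, int v) { return v < lo ? lo : (v > hi ? hi : v); }

    /* Sketch of the normal filtering for one line across a vertical boundary.
     * p[0..3] and q[0..3] are counted outwards from the boundary;
     * modify_p1/modify_q1 come from decisions (5) and (6). */
    static void normal_filter_line(int p[4], int q[4], int tc,
                                   int modify_p1, int modify_q1)
    {
        int delta0 = (9 * (q[0] - p[0]) - 3 * (q[1] - p[1]) + 8) >> 4;  /* (9) */

        if (abs(delta0) >= 10 * tc)     /* (10): likely a natural edge, skip */
            return;

        int d0  = clip3(-tc, tc, delta0);   /* clipping, Section III-D */
        int p0f = p[0] + d0;                /* (7) */
        int q0f = q[0] - d0;                /* (8) */

        if (modify_p1) {                    /* (11), (13), clipped to +/- tc/2 */
            int dp1 = clip3(-(tc >> 1), tc >> 1,
                            (((p[2] + p[0] + 1) >> 1) - p[1] + d0) >> 1);
            p[1] = p[1] + dp1;
        }
        if (modify_q1) {                    /* (12), (14), clipped to +/- tc/2 */
            int dq1 = clip3(-(tc >> 1), tc >> 1,
                            (((q[2] + q[0] + 1) >> 1) - q[1] - d0) >> 1);
            q[1] = q[1] + dq1;
        }
        p[0] = p0f;
        q[0] = q0f;
        /* the results are additionally clamped to [0, 2^N - 1], see (20) */
    }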
B. Strong Filtering Operations

The strong filter affects more pixels on each side of the block boundary. The modifications of three pixels on each side of the block boundary are similar to the strong filtering in H.264/AVC [4]. The offset values Δ0s, Δ1s, and Δ2s are added to pixels p0, p1, and p2, respectively, after clipping of the following δ0s, δ1s, and δ2s values:

δ0s = (p2 + 2p1 − 6p0 + 2q0 + q1 + 4) >> 3    (15)
δ1s = (p2 − 3p1 + p0 + q0 + 2) >> 2    (16)
δ2s = (2p3 − 5p2 + p1 + p0 + q0 + 4) >> 3.    (17)

The offset values for the modification of pixels q0, q1, and q2 are calculated by exchanging q and p in (15), (16), and (17). The impulse responses of the filters that correspond to the modification of pixels p0, p1, and p2 are (1 2 2 2 1)/8, (1 1 1 1)/4, and (2 3 1 1 1)/8, respectively, if the clipping operation is neglected.
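The strong filtering operations (15)–(17) for the P side of one line can be sketched in the same style; the Q side follows by exchanging p and q. The clipping range of ±2 tC follows Section III-D, and the sample layout is again an illustrative assumption.

    static int clip3(int lo, int hi, int v) { return v < lo ? lo : (v > hi ? hi : v); }

    /* Sketch of the strong filtering offsets (15)-(17) for the P side of
     * one line; all offsets are computed from the unmodified samples and
     * clipped to [-2*tc, 2*tc] before being added. */
    static void strong_filter_p_side(int p[4], const int q[4], int tc)
    {
        int d0s = (    p[2] + 2 * p[1] - 6 * p[0] + 2 * q[0] + q[1] + 4) >> 3; /* (15) */
        int d1s = (    p[2] - 3 * p[1] +     p[0] +     q[0]        + 2) >> 2; /* (16) */
        int d2s = (2 * p[3] - 5 * p[2] +     p[1] +     p[0] + q[0] + 4) >> 3; /* (17) */

        p[0] += clip3(-2 * tc, 2 * tc, d0s);
        p[1] += clip3(-2 * tc, 2 * tc, d1s);
        p[2] += clip3(-2 * tc, 2 * tc, d2s);
    }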
C. Chroma Deblocking

As mentioned previously, chroma deblocking is only performed when Bs is equal to two. In this case, no further deblocking decisions are made. Only the pixels p0 and q0 are modified as in (7) and (8). The deblocking is performed with the Δc value, which is obtained by clipping the following δc offset value:

δc = (((p0 − q0) << 2) + p1 − q1 + 4) >> 3    (18)

which corresponds to filtering with a filter with the impulse response (1 4 4 −1)/8.
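A corresponding sketch of the chroma filtering in (18) is given below; only p0 and q0 are modified, and in this sketch the offset is assumed to be clipped to the range [−tC, tC] before being applied as in (7) and (8).

    static int clip3(int lo, int hi, int v) { return v < lo ? lo : (v > hi ? hi : v); }

    /* Sketch of the chroma filtering (18) for one line of samples. */
    static void chroma_filter_line(int p[2], int q[2], int tc)
    {
        int delta_c = (((p[0] - q[0]) << 2) + p[1] - q[1] + 4) >> 3;  /* (18) */
        int dc = clip3(-tc, tc, delta_c);   /* clipping range assumed here */
        p[0] += dc;                         /* as in (7) */
        q[0] -= dc;                         /* as in (8) */
    }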

D. Clipping

To prevent excessive blurriness, the offsets used in the deblocking filtering are subjected to QP-dependent clipping. The clipping is applied to the δ values after their calculation and before the modification of the pixel values. The Δ values used in the filtering are obtained by clipping the δ values to the range −c to c as in (19). The clipping provides more adaptivity to the deblocking filtering and is applied by performing the following operation:

Δ = Min(Max(−c, δ), c)    (19)

where, in the case of normal filtering, the value of c is equal to tC(n) for p0 and q0, and to tC(n)/2 for p1 and q1. In the case of strong filtering, c is set equal to 2 tC(n). The variable n is equal to QP when both blocks adjacent to the boundary are inter predicted, and to QP + 2 if one of the blocks is intra predicted (Bs = 2).

The dependence of the parameter tC on QP is illustrated in Fig. 7. The blocking artifact strength is generally greater for intra predicted blocks. Therefore, larger modifications of the pixel values are allowed for intra blocks than for inter blocks by using the clipping value tC(QP + 2) for block boundaries with Bs equal to 2.

The filtered pixel values p0′, q0′, p1′, and q1′ for normal filtering and p0′ and q0′ for chroma deblocking are also clipped to stay in the range defined by the bit depth N:

p′′ = Min(Max(0, p′), 2^N − 1).    (20)
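For completeness, the clipping in (19) and the final bit-depth clamp in (20) can be written as the following small helpers; this is merely a restatement of the two equations above in C notation.

    /* (19): clip a filter offset to the QP-dependent range [-c, c]. */
    static int clip_delta(int delta, int c)
    {
        return delta < -c ? -c : (delta > c ? c : delta);
    }

    /* (20): clamp a filtered sample to the range allowed by bit depth N. */
    static int clamp_sample(int value, int bit_depth)
    {
        int max_val = (1 << bit_depth) - 1;
        return value < 0 ? 0 : (value > max_val ? max_val : value);
    }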
IV. Sequence and Picture Level Adaptivity

Since different video sequences have different characteristics, the deblocking strength can be adjusted on a sequence and even on a picture basis.

As mentioned earlier, the main sources of blocking artifacts are block transforms and quantization. The blocking artifact severity therefore depends, to a large extent, on the quantization parameter QP, and the QP value is taken into account in the deblocking filtering decisions. The thresholds β and tC depend on the average QP value of the two neighboring blocks that share the block edge [13] and are typically stored in corresponding tables. The dependence of these parameters on QP is shown in Figs. 6 and 7.

Fig. 6. Dependence of β on QP.

Fig. 7. Dependence of tC on QP.

The parameter β controls which edges are filtered, influences the selection between the normal and strong filter, and determines how many pixels from the block boundary are modified in the normal filtering operation. One can observe that the value of β increases with QP. Therefore, deblocking is enabled more frequently at high QP values than at low QP values (high QP values correspond to coarse quantization, and low QP values to fine quantization). One can also see that the deblocking operation is effectively disabled for low QP values by setting one or both of β and tC to zero.

The parameter tC controls the selection between the normal and strong filter and determines the maximum absolute value of the modifications that are allowed for the pixel values at a certain QP, for both the normal and the strong filtering operations. This helps to adaptively limit the amount of blurriness introduced by the deblocking filtering.
The deblocking parameters tC and β provide adaptivity according to the QP and the prediction type. However, different sequences or parts of the same sequence may have different characteristics. It may be important for content providers to be able to change the amount of deblocking filtering on a sequence or even on a slice or picture basis. Therefore, deblocking adjustment parameters can be sent in the slice header or in the picture parameter set (PPS) to control the amount of deblocking filtering applied. The corresponding parameters are tc_offset_div2 and beta_offset_div2 [12]. These parameters specify the offsets (divided by two) that are added to the QP value before determining the β and tC values. The parameter beta_offset_div2 adjusts the number of pixels to which the deblocking filtering is applied, whereas the parameter tc_offset_div2 adjusts the amount of filtering that can be applied to those pixels, as well as the detection of natural edges.
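The derivation of β and tC from the average QP, the boundary strength, and the slice-level offsets described above can be sketched as follows. The beta_table[] and tc_table[] arrays stand for the tables defined in [1] and are not reproduced here, and the index ranges used in the clipping are assumptions of this sketch rather than quotations from the standard.

    /* Sketch of the threshold derivation described in Sections III-D and IV. */
    static int clip_idx(int v, int max) { return v < 0 ? 0 : (v > max ? max : v); }

    static void derive_thresholds(int qp_p, int qp_q, int bs,
                                  int beta_offset_div2, int tc_offset_div2,
                                  const int beta_table[], const int tc_table[],
                                  int *beta, int *tc)
    {
        int qp_avg = (qp_p + qp_q + 1) >> 1;    /* average QP of blocks P and Q */

        int b_idx = clip_idx(qp_avg + 2 * beta_offset_div2, 51);
        *beta = beta_table[b_idx];

        /* tC is read at QP + 2 for intra (Bs = 2) boundaries, Section III-D */
        int t_idx = clip_idx(qp_avg + (bs == 2 ? 2 : 0) + 2 * tc_offset_div2, 53);
        *tc = tc_table[t_idx];
    }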
V. Computational Complexity and Parallelism

Compared to H.264/AVC, the complexity of the deblocking filter has been significantly reduced in HEVC due to several factors that are described in this section. Performing the deblocking on a grid of 8 × 8 samples, as opposed to a grid of 4 × 4 samples in H.264/AVC, reduces the number of deblocking operations by a factor of two. Deblocking of the chroma components in the 4:2:0 format is also performed on the grid of 8 × 8 samples. Furthermore, the chroma blocks are filtered only in cases when one of the adjacent blocks is intra predicted, which decreases the amount of chroma filtering further for inter-coded slices. Filtering on an 8 × 8 sample grid may potentially lead to a reduction in subjective quality. However, since the number of 4 × 4 blocks in an HEVC picture is generally lower than in H.264/AVC, and 4 × 4 blocks in HEVC are usually used in areas with higher temporal or spatial activity, applying the filtering on an 8 × 8 sample grid is a reasonable tradeoff between computational complexity and subjective quality.

Another source of complexity reduction in HEVC deblocking is related to the transform and prediction unit sizes. In H.264/AVC, the largest transform size is 8 × 8 samples and the largest prediction unit size is 16 × 16 samples, i.e., a macroblock. In HEVC, however, the largest transform size is 32 × 32 samples and the largest prediction unit size is 64 × 64 samples. This additionally reduces the average number of operations (although not necessarily for the worst case) since deblocking is never performed inside these large blocks.

Deblocking in HEVC has also been designed to prevent spatial dependences of the deblocking process across the picture. There is no overlap between the filtering operations for one block edge, which can modify at most three pixels from the block edge, and the filtering decisions for the neighboring parallel block edge, which involve at most four pixels from the block edge. Therefore, any vertical block edge in the picture can be deblocked in parallel to any other vertical edge. The same holds for horizontal edges. Note, however, that the sample values modified by the deblocking of vertical block boundaries are used as the input for the deblocking of horizontal block boundaries.

For CTU-based processing, the deblocking in HEVC can be performed on an 8 × 8 block basis. A picture can be divided into nonoverlapping blocks of 8 × 8 samples (see Fig. 2). Each of those blocks contains all the data required for its deblocking. Consequently, deblocking can be performed independently for each of those 8 × 8 blocks, which makes the HEVC deblocking easily parallelizable for any degree of parallelism by simply replicating the same 8 × 8 deblocking logic.

The order of filtering of vertical and horizontal edges in HEVC is also different from that in H.264/AVC. In H.264/AVC, deblocking is performed on a macroblock basis, whereas the deblocking in HEVC is first applied to all vertical edges and then to all horizontal edges in the picture. Consequently, the order of vertical and horizontal filtering for each of the 8 × 8 blocks shown in Fig. 2 is exactly the same irrespective of the block position. Moreover, the order of filtering the block boundaries does not change with different orders of CTU decoding, which reduces hardware complexity.
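The picture-level filtering order described above can be sketched as two passes over the 8 × 8 grid. Here deblock_vertical_edge() and deblock_horizontal_edge() are hypothetical helpers, picture-boundary handling is simplified, and the independent iterations of each loop nest could be distributed over multiple cores.

    /* Hypothetical helpers, each filtering one 8-sample edge
     * (two four-sample segments) at the given position. */
    void deblock_vertical_edge(int x, int y);
    void deblock_horizontal_edge(int x, int y);

    /* Sketch of the HEVC filtering order: all vertical edges first,
     * then all horizontal edges, on the 8x8 grid. */
    static void deblock_picture(int width, int height)
    {
        /* first pass: vertical edges at columns x = 8, 16, ... */
        for (int y = 0; y < height; y += 8)
            for (int x = 8; x < width; x += 8)
                deblock_vertical_edge(x, y);

        /* second pass: horizontal edges, using the vertically filtered samples */
        for (int y = 8; y < height; y += 8)
            for (int x = 0; x < width; x += 8)
                deblock_horizontal_edge(x, y);
    }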
As HEVC deblocking is independent for each 8 × 8 block, an encoder or decoder has the option of deblocking only the inner blocks of a slice or a tile [11] and leaving the slice or tile boundary blocks out of the deblocking process in a first pass. In a second pass, the encoder or decoder can go back and perform the deblocking along the slice or tile boundaries as a patch. Such an option essentially breaks the in-loop filter (deblocking and SAO) dependence across the slice or tile boundaries and is very useful for parallel processing on multicore platforms when the in-loop filters are enabled across slice or tile boundaries. By taking advantage of this property, each core can process a portion of a picture in parallel while skipping the in-loop filtering for the slice or tile boundary blocks. After the entire picture is processed, a separate core can load the slice or tile boundary blocks back and apply the in-loop filters along the slice or tile boundaries to complete the in-loop filtering for the picture. Therefore, there is no need to decouple the entire in-loop filtering process from the rest of the coding process, which significantly improves the throughput and greatly reduces the memory bandwidth requirements for multicore-based HEVC implementations. This is not possible with the H.264/AVC deblocking filter design, in which the deblocking has to be decoupled if multiple slices are processed in parallel and deblocking across slice boundaries is enabled.

Another advantage of the highly parallelizable HEVC deblocking filter is that it provides enough cycle margin to combine the deblocking filter and SAO in the same building block in hardware implementations. In a typical architecture, the HEVC deblocking filter consumes only 84 to 88 cycles per 16 × 16 block, which is less than half of the typical cycle budget of 200 cycles per 16 × 16 block (for 1080p@120 f/s video running at a 250 MHz clock rate) [14]. Combining the deblocking filter and SAO in the same building block is beneficial in terms of hardware area cost, since SAO and deblocking can then share the same memory interface, in contrast to having separate building blocks and memory interfaces for SAO and deblocking.

Since the deblocking in HEVC is computationally less intensive and more parallelizable than in H.264/AVC, the HEVC deblocking is much less of a bottleneck when implementing a video decoder. The HEVC deblocking represents a better tradeoff among coding efficiency (i.e., subjective and objective quality), throughput, and implementation complexity than the H.264/AVC design.
VI. Results

This section demonstrates the objective and subjective impact of the deblocking filtering. Tables II–V show the BD-rate increase resulting from disabling the deblocking filtering for the various configurations used in the HEVC standardization [16]. These configurations are all-intra, where only intra prediction is used; random-access, which uses intra pictures over certain time intervals and a hierarchical-B coding structure; and two low-delay configurations, which have only one intra picture and in which motion-compensated prediction uses only temporally preceding pictures. The low-delay P configuration does not use bidirectional motion-compensated prediction. The BD-rate is used in the HEVC standardization as a measure of the average bit rate reduction at the same mean squared error [15]. As a positive number in the tables indicates an increased bit rate at the same quality, the HEVC deblocking filter leads to an average bit rate reduction of 1.3%–3.3% at the same quality, depending on the configuration. For certain sequences, more than 6% bit rate reduction is achieved.

TABLE II
Average Bit Rate Increase at the Same Quality by Disabling the Deblocking Filter for the All-Intra Configuration

All Intra Main        Y       U       V
Class A             1.9%    4.2%    3.7%
Class B             1.7%    4.5%    5.1%
Class C             0.9%    3.7%    4.3%
Class D             0.7%    3.0%    3.4%
Class E             2.1%    7.4%    8.8%
Class F             0.6%    1.9%    1.8%
Overall             1.3%    4.0%    4.4%

TABLE III
Average Bit Rate Increase at the Same Quality by Disabling the Deblocking Filter for the Random-Access Configuration

Random Access Main    Y       U       V
Class A             3.6%    2.1%    1.9%
Class B             3.2%    1.9%    1.9%
Class C             2.1%    1.5%    1.9%
Class D             1.5%    1.1%    1.2%
Class F             1.2%    0.9%    0.9%
Overall             2.6%    1.6%    1.7%

TABLE IV
Average Bit Rate Increase at the Same Quality by Disabling the Deblocking Filter for the Low-Delay Configuration

Low Delay B Main      Y       U       V
Class B             3.3%    1.3%    1.6%
Class C             2.1%    1.5%    1.5%
Class D             1.3%    0.8%    1.6%
Class E             3.8%    5.9%    7.3%
Class F             1.3%    0.4%    0.0%
Overall             2.4%    1.8%    2.1%

TABLE V
Average Bit Rate Increase at the Same Quality by Disabling the Deblocking Filter for the Low-Delay P-Frame Configuration

Low Delay P Main      Y       U       V
Class B             4.9%    2.5%    2.7%
Class C             2.6%    1.5%    2.1%
Class D             1.6%    1.4%    0.8%
Class E             6.2%    7.8%    9.0%
Class F             1.7%    1.0%    0.4%
Overall             3.3%    2.5%    2.7%

Figs. 8 and 9 compare the visual quality of coded sequences with the deblocking turned on and with the deblocking turned off. Fig. 8 shows a cropped part of a frame from the Basketball Drive sequence (1080p@50 f/s) coded in the random-access configuration at QP 32, with the deblocking filtering applied and with the deblocking turned off. Fig. 9 provides the same comparison for the Kristen and Sara sequence (720p@60 f/s) coded in the low-delay B configuration at QP 37. It can be seen that the deblocking filter effectively attenuates the blocking artifacts. The HEVC reference software HM6.0 was used in all experiments.

Fig. 8. Basketball Drive sequence coded in the random-access configuration at QP 32. (a) Deblocking turned off. (b) Deblocking turned on.

Fig. 9. Kristen and Sara sequence coded in the low-delay B configuration at QP 37. (a) Deblocking turned off. (b) Deblocking turned on.

VII. Conclusion

The deblocking filter in the upcoming HEVC standard improves both the subjective and objective quality of the coded video sequences, while being less computationally expensive than the deblocking filter in H.264/AVC. The decrease in computational complexity is achieved by reconsidering a number of tools. The HEVC deblocking filtering operations can also easily be performed in parallel on multiple processors, which is important for coding and decoding of higher resolution video sequences.
References

[1] B. Bross, W.-J. Han, G. J. Sullivan, J.-R. Ohm, and T. Wiegand, High Efficiency Video Coding (HEVC) Text Specification Draft 8, ITU-T SG16 WP3 and ISO/IEC JTC1/SC29/WG11 document JCTVC-J1003, Joint Collaborative Team on Video Coding (JCT-VC), Stockholm, Sweden, Jul. 2012.
[2] ITU-T and ISO/IEC JTC 1, Advanced Video Coding for Generic Audiovisual Services, ITU-T Rec. H.264 and ISO/IEC 14496-10 (AVC), May 2003 (and subsequent editions).
[3] T. Wedi and H. G. Musmann, "Motion and aliasing compensated prediction for hybrid video coding," IEEE Trans. Circuits Syst. Video Technol., vol. 13, no. 7, pp. 577–586, Jul. 2003.
[4] P. List, A. Joch, J. Lainema, G. Bjontegaard, and M. Karczewicz, "Adaptive deblocking filter," IEEE Trans. Circuits Syst. Video Technol., vol. 13, no. 7, pp. 614–619, Jul. 2003.
[5] K. Ugur, K. R. Andersson, and A. Fuldseth, Video Coding Technology Proposal by Tandberg, Nokia, and Ericsson, ITU-T SG16 WP3 and ISO/IEC JTC1/SC29/WG11 document JCTVC-A119, Joint Collaborative Team on Video Coding (JCT-VC), Dresden, Germany, Apr. 2010.
[6] A. Norkin, K. Andersson, R. Sjöberg, Q. Huang, J. An, X. Guo, and S. Lei, CE12: Ericsson's and MediaTek's Deblocking Filter, ITU-T SG16 WP3 and ISO/IEC JTC1/SC29/WG11 document JCTVC-F118, Joint Collaborative Team on Video Coding (JCT-VC), Turin, Italy, Jul. 2011.
[7] M. Ikeda and T. Suzuki, Non-CE10: Introduction of Strong Filter Clipping in Deblocking Filter, ITU-T SG16 WP3 and ISO/IEC JTC1/SC29/WG11 document JCTVC-H0275, Joint Collaborative Team on Video Coding (JCT-VC), San Jose, CA, Feb. 2012.
[8] M. Ikeda, J. Tanaka, and T. Suzuki, CE12 Subset 2: Parallel Deblocking Filter, ITU-T SG16 WP3 and ISO/IEC JTC1/SC29/WG11 document JCTVC-E181, Joint Collaborative Team on Video Coding (JCT-VC), Geneva, Switzerland, Mar. 2011.
[9] M. Narroschke, S. Esenlik, and T. Wedi, CE12 Subtest 1: Results for Modified Decisions for Deblocking, ITU-T SG16 WP3 and ISO/IEC JTC1/SC29/WG11 document JCTVC-G590, Joint Collaborative Team on Video Coding (JCT-VC), Geneva, Switzerland, Nov. 2011.
[10] A. Norkin, CE10.3: Deblocking Filter Simplifications: Bs Computation and Strong Filtering Decision, ITU-T SG16 WP3 and ISO/IEC JTC1/SC29/WG11 document JCTVC-H0473, Joint Collaborative Team on Video Coding (JCT-VC), San Jose, CA, Feb. 2012.
[11] A. Fuldseth, M. Horowitz, S. Xu, A. Segall, and M. Zhou, Tiles, ITU-T SG16 WP3 and ISO/IEC JTC1/SC29/WG11 document JCTVC-F335, Joint Collaborative Team on Video Coding (JCT-VC), Turin, Italy, Jul. 2011.
[12] T. Yamakage, S. Asaka, T. Chujoh, M. Karczewicz, and I. S. Chong, CE12: Deblocking Filter Parameter Adjustment in Slice Level, ITU-T SG16 WP3 and ISO/IEC JTC1/SC29/WG11 document JCTVC-G174, Joint Collaborative Team on Video Coding (JCT-VC), Geneva, Switzerland, Nov. 2011.
[13] G. Van der Auwera, X. Wang, M. Karczewicz, M. Narroschke, A. Kotra, and T. Wedi, Support of Varying QP in Deblocking, ITU-T SG16 WP3 and ISO/IEC JTC1/SC29/WG11 document JCTVC-G1031, Joint Collaborative Team on Video Coding (JCT-VC), Geneva, Switzerland, Nov. 2011.
[14] M. Zhou, O. Sezer, and V. Sze, CE12 Subset 2: Test Results and Architectural Study on De-Blocking Filter Without Parallel On/Off Filter Decision, ITU-T SG16 WP3 and ISO/IEC JTC1/SC29/WG11 document JCTVC-G088, Joint Collaborative Team on Video Coding (JCT-VC), Geneva, Switzerland, Nov. 2011.
[15] G. Bjontegaard, Calculation of Average PSNR Differences Between RD-Curves, ITU-T SG16 Q.6 document VCEG-M33, Apr. 2001.
[16] F. Bossen, Common Test Conditions, ITU-T SG16 WP3 and ISO/IEC JTC1/SC29/WG11 document JCTVC-H1100, Joint Collaborative Team on Video Coding (JCT-VC), San Jose, CA, Feb. 2012.

Andrey Norkin received the M.Sc. degree in computer engineering from Ural State Technical University, Yekaterinburg, Russia, in 2001, and the Doctor of Science degree in signal processing from the Tampere University of Technology, Tampere, Finland, in 2007.
From 2002 to 2007, he was a Researcher with the Institute of Signal Processing, Tampere University of Technology. He is currently a Senior Researcher and Work Package Leader with Ericsson Research, Stockholm, Sweden. He has been an active participant in the ITU-T/ISO/IEC Joint Collaborative Team on Video Coding and a Coordinator of the Core Experiment on deblocking filtering in the HEVC standardization. His current research interests include video compression, 3-D video, error-resilient coding, and image processing.

Gisle Bjøntegaard received the Dr. Philos degree in physics from the University of Oslo, Oslo, Norway, in 1974.
From 1974 to 1996, he was a Senior Scientist with Telenor Research and Development, Oslo. His areas of work included radio link network design, reflector antenna design and construction, and digital signal processing. From 1980 to 1996, he was mainly involved in the development of digital video compression methods. He has contributed actively to the development of the ITU video standards H.261, H.262, H.263, and H.264, and the ISO/IEC standards MPEG-2 and MPEG-4. From 1996 to 2002, he was a Group Manager with Telenor Broadband Services, Oslo, engaged in the design of point-to-point satellite communication and the development of a digital satellite TV platform. He has produced numerous contributions toward the development of the ITU-T standards H.263 and H.264. Since 2002, he has been a Principal Scientist with Tandberg Telecom, Lysaker, Norway, working on video coding development and implementation. Since 2006, he has worked on further improvement of digital video coding and is currently taking part in the definition of HEVC, developed jointly between ITU and ISO. In 2010, Tandberg Telecom was acquired by Cisco Systems; he was promoted to a Cisco Fellow and is presently with Cisco Systems, Norway.

Arild Fuldseth received the B.Sc. degree from the Norwegian Institute of Technology, Trondheim, Norway, in 1988, and the M.Sc. and Ph.D. degrees in signal processing from North Carolina State University, Raleigh, and the Norwegian University of Science and Technology, Trondheim, in 1989 and 1997, respectively.
From 1989 to 1994, he was a Research Scientist with SINTEF, Trondheim. From 1997 to 2002, he was a Manager with the Signal Processing Group, Fast Search and Transfer, Oslo, Norway. Since 2002, he has been with Tandberg Telecom, Oslo (now part of Cisco Systems, Oslo), where he is currently a Principal Engineer working on video compression technology.

Matthias Narroschke received the Dipl.-Ing. and Dr.-Ing. degrees in electrical engineering from the University of Hannover, Hannover, Germany, in 2001 and 2008, respectively.
From 2001 to 2007, he was a Research Engineer and Teaching Assistant with the Institut für Informationsverarbeitung, University of Hannover. In 2003, he became the Ober-Ingenieur. Since 2007, he has been with the Panasonic Research and Development Center, Langen, Germany, where he is currently a Principal Engineer. In 2008, he became a Guest Lecturer with the University of Hannover. He has several patents pending in the area of video coding, mostly in cooperation with Panasonic. His current research interests include video coding standardization activities and the development of future video coding schemes.
Dr. Narroschke is an active contributor to the Moving Picture Experts Group of ISO/IEC SC29 and to the Video Coding Experts Group of ITU. He was the recipient of the Robert-Bosch-Prize for the Best Dipl.-Ing. Degree in Electrical Engineering in 2001.

Masaru Ikeda received the B.S. and M.S. degrees in electrical and communication engineering from Tohoku University, Sendai, Japan, in 1999 and 2001, respectively.
From 2001 to 2008, he was a Research Engineer with the Sony Research Center and worked on computer vision. He is currently with the Technology Development Group, Sony Corporation, Tokyo, Japan. His current research interests include video compression, 3-D video compression, and image processing.

Kenneth Andersson received the M.Sc. degree in computer science and engineering from Luleå University, Luleå, Sweden, in 1995, and the Ph.D. degree from Linköping University, Linköping, Sweden, in 2003.
He has been with Ericsson Research, Stockholm, Sweden, since 1994, where he has worked on speech coding and is currently a Senior Researcher working on video coding. His current research interests include image and video signal processing and video coding.

Minhua Zhou received the B.E. degree in electronic engineering and the M.E. degree in communication and electronic systems from Shanghai Jiao Tong University, Shanghai, China, in 1987 and 1990, respectively, and the Ph.D. degree in electronic engineering from Technical University, Braunschweig, Germany, in 1997.
From 1993 to 1998, he was a Researcher with the Heinrich-Hertz Institute, Berlin, Germany. He is currently a Research Manager of video coding technology with the Systems and Applications Research and Development Center, Texas Instruments, Inc., Dallas. His current research interests include video compression, video pre- and postprocessing, end-to-end video quality, joint algorithm and architecture optimization, and 3-D video.
Dr. Zhou was the recipient of the Rudolf-Urtel Prize in 1997 from the German Society for Film and Television Technologies in recognition of his Ph.D. thesis work on the optimization of MPEG-2 video encoding.

Geert Van der Auwera received the Ph.D. degree in electrical engineering from Arizona State University, Tempe, in 2007, and the Belgian MSEE degree from Vrije Universiteit Brussel, Brussels, Belgium, in 1997.
He is currently with Qualcomm Technologies, Inc., San Diego, CA, where he is actively contributing to the JCT-VC standardization effort on high-efficiency video coding. Until January 2011, he was with Samsung Electronics, Irvine, CA. Until December 2004, he was a Scientific Advisor with IWT-Flanders, the Institute for the Promotion of Innovation by Science and Technology, Flanders, Belgium. In 2000, he joined IWT-Flanders after researching wavelet video coding at IMEC's Electronics and Information Processing Department, Brussels. His current research interests include video coding, video traffic and quality characterization, and video streaming mechanisms and protocols.
Dr. Van der Auwera received the Barco and IBM Prizes in 1998 for his thesis on motion estimation in the wavelet domain from the Fund for Scientific Research of Flanders, Belgium.