HEVC Deblocking Filter

Abstract—This paper describes the in-loop deblocking filter used in the upcoming High Efficiency Video Coding (HEVC) standard to reduce visible artifacts at block boundaries. The deblocking filter performs detection of the artifacts at the coded block boundaries and attenuates them by applying a selected filter. Compared to the H.264/AVC deblocking filter, the HEVC deblocking filter has lower computational complexity and better parallel processing capabilities while still achieving significant reduction of the visual artifacts.

Index Terms—Block-based coding, deblocking, video coding, video filtering, video processing.

I. Introduction

which is followed by the quantization and coding of the transform coefficients. While H.264/AVC [2] divides a picture into fixed-size macroblocks of 16×16 samples, HEVC divides a picture into coding tree units (CTU) of 16×16, 32×32, or 64×64 samples. The coding tree units can be further divided into smaller blocks using a quadtree structure; such a block, called a coding unit (CU), can further be split into prediction units (PUs) and is also a root for the transform quadtree. Each of the child nodes of the transform quadtree defines a transform unit (TU). The size of the transforms used in the prediction error coding can vary from 4×4 to 32×32 samples, thus allowing transforms larger than in H.264/AVC, which uses 4×4 and 8×8 transforms. As the optimal size of
NORKIN et al.: HEVC DEBLOCKING FILTER 1747
Fig. 5. Decisions for the normal filter that are applied to each line of a four-sample segment.
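Fig. 5 summarizes the per-line decisions for the normal filter; the underlying equations appear earlier in the paper and are outside this excerpt. As a sketch, assuming the offset formula and the natural-edge check follow the HEVC draft [1]:

```python
def normal_filter_line(p1, p0, q0, q1, tc):
    """Sketch of the normal-filter operation for one line of samples.

    p1, p0 lie on one side of the block edge, q0, q1 on the other;
    tc is the clipping threshold tC(n). The offset formula and the
    10*tc natural-edge check are assumptions based on the HEVC draft,
    since the corresponding equations are not part of this excerpt.
    """
    # Offset: a weighted difference across the block edge.
    delta = (9 * (q0 - p0) - 3 * (q1 - p1) + 8) >> 4
    # A large offset indicates a natural edge; skip filtering this line.
    if abs(delta) >= 10 * tc:
        return p0, q0
    # Clip to [-tc, tc] (Section D) before modifying p0 and q0.
    delta = min(max(-tc, delta), tc)
    return p0 + delta, q0 - delta
```

When the offset exceeds the 10·tC threshold, the discontinuity is treated as a natural edge and the line is left unfiltered.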
C. Chroma Deblocking
As mentioned previously, chroma deblocking is only performed when Bs is equal to two. In this case, no further deblocking decisions are made. Only pixels p0 and q0 are modified as in (7) and (8). The deblocking is performed with the Δc value, which is obtained by clipping the following δc offset value:
δc = (((p0 − q0) << 2) + p1 − q1 + 4) >> 3    (18)

which corresponds to filtering by a filter with the impulse response of (1 4 4 −1)/8.
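The chroma filtering of (18), combined with the clipping described in Section D, can be sketched as follows. The direction in which the clipped offset is applied to p0 and q0 is an assumption standing in for (7) and (8), which are outside this excerpt:

```python
def chroma_deblock(p1, p0, q0, q1, tc):
    """Sketch of HEVC chroma deblocking for one line of samples across
    a block edge: p1, p0 on one side, q0, q1 on the other.

    Computes the offset of (18), clips it to [-tc, tc] as described in
    Section D, and modifies only p0 and q0. The update direction
    (subtract from p0, add to q0) is an assumption standing in for
    (7) and (8), which are not part of this excerpt.
    """
    # Offset value of (18): impulse response (1 4 4 -1)/8 across the edge.
    delta_c = (((p0 - q0) << 2) + p1 - q1 + 4) >> 3
    # Clip to [-tc, tc]; tc is the QP-dependent threshold tC(QP + 2).
    delta_c = min(max(-tc, delta_c), tc)
    # Only the two samples nearest the edge are modified.
    return p0 - delta_c, q0 + delta_c
```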
D. Clipping
Fig. 6. Dependence of β on QP.

To prevent excessive blurriness, deblocking filtering is done on a signal after QP-dependent clipping. Clipping is applied to the δ values after their calculation and before modification of the pixel values. The values used in filtering are obtained by clipping the δ values to the range −c to c as in (19). Clipping provides more adaptivity to deblocking filtering. The clipping is applied by performing the following operation:

Δ = Min(Max(−c, δ), c)    (19)
where the value of c is equal to tC(n) for p0 and q0, and to tC(n)/2 for p1 and q1, in the case of normal filtering. In the case of strong filtering, c is set equal to 2tC(n). The variable n is equal to QP when both blocks adjacent to the boundary are inter predicted, and to QP + 2 when one of the blocks is intra predicted (Bs = 2).
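The clipping rules above can be sketched as follows (tc_n stands for the table value tC(n); the table itself is not reproduced in this excerpt):

```python
def clip_delta(delta, tc_n, position='p0q0', strong=False):
    """Sketch of the QP-dependent clipping of (19).

    position selects the clipping value c: 'p0q0' for the samples
    nearest the edge, 'p1q1' for the next pair (normal filtering only).
    """
    if strong:
        c = 2 * tc_n           # strong filtering: c = 2 tC(n)
    elif position == 'p0q0':
        c = tc_n               # normal filtering, p0 and q0: c = tC(n)
    else:
        c = tc_n // 2          # normal filtering, p1 and q1: c = tC(n)/2
    # Equation (19): Delta = Min(Max(-c, delta), c)
    return min(max(-c, delta), c)
```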
Fig. 7. Dependence of tC on QP.

The dependence of the parameter tC on QP is illustrated in Fig. 7. The blocking artifact strength is generally greater for intra predicted blocks. Therefore, larger modifications of pixel values are allowed for intra-blocks than for inter-blocks by using the clipping value tC(QP + 2) for block boundaries
with Bs equal to 2.

The filtered pixel values p0, q0, p1, and q1 for normal filtering, and p0 and q0 for chroma deblocking, are also clipped to stay in the range defined by the bit depth N:

p′ = Min(Max(0, p′), 2^N − 1).    (20)

IV. Sequence and Picture Level Adaptivity

Since different video sequences have different characteristics, deblocking strength can be adjusted on a sequence and even on a picture basis.

As mentioned earlier, the main sources of blocking artifacts are block transforms and quantization. Blocking artifact severity therefore depends, to a large extent, on the quantization parameter QP, and the QP value is taken into account in the deblocking filtering decisions. The thresholds β and tC depend on the average QP value of the two neighboring blocks that share the block edge [13] and are typically stored in corresponding tables. The dependence of these parameters on QP is shown in Figs. 6 and 7.

The parameter β controls which edges are filtered, the selection between the normal and strong filter, and how many pixels from the block boundary are modified in the normal filtering operation. One can observe that the value of β increases with QP; deblocking is therefore enabled more frequently at high QP values than at low QP values (high QP values correspond to coarse quantization, and low QP values to fine quantization). One can also see that the deblocking operation is effectively disabled for low QP values by setting one or both of β and tC to zero.

The parameter tC controls the selection between the normal and strong filter and determines the maximum absolute value of the modifications that are allowed for the pixel values at a certain QP, for both normal and strong filtering operations. This helps adaptively limit the amount of blurriness introduced by the deblocking filtering.

The deblocking parameters tC and β provide adaptivity according to the QP and prediction type. However, different sequences or parts of the same sequence may have different characteristics. It may be important for content providers to change the amount of deblocking filtering on the sequence or even on a slice or picture basis. Therefore, deblocking adjustment parameters can be sent in the slice header or picture parameter set (PPS) to control the amount of deblocking filtering applied. The corresponding parameters are tc_offset_div2 and beta_offset_div2 [12]. These parameters specify the offsets (divided by two) that are added to the QP
value before determining the β and tC values. The parameter beta_offset_div2 adjusts the number of pixels to which the deblocking filtering is applied, whereas the parameter tc_offset_div2 adjusts the amount of filtering that can be applied to those pixels, as well as the detection of natural edges.

V. Computational Complexity and Parallelism

Compared to H.264/AVC, the complexity of the deblocking filter has been significantly reduced in HEVC due to several factors that are described in this section. Performing deblocking on a grid of 8×8 samples, as opposed to a grid of 4×4 samples in H.264/AVC, reduces the number of deblocking operations by a factor of two. Deblocking of the chroma component in the 4:2:0 format is also performed on the grid of 8×8 samples. Furthermore, the chroma blocks are filtered only in cases when one of the adjacent blocks is intra predicted. This decreases the amount of chroma filtering further for inter-coded slices. Filtering on an 8×8 sample grid may potentially lead to a reduction in subjective quality. However, since the number of 4×4 blocks in a picture is generally lower for HEVC than for H.264/AVC, and 4×4 blocks in HEVC are usually used in areas with higher temporal or spatial activity, applying filtering on an 8×8 sample grid is a tradeoff between computational complexity and subjective quality.

Another source of complexity reduction in HEVC deblocking is related to the transform and prediction unit sizes. In H.264/AVC, the largest transform size is 8×8, whereas the largest prediction unit size is 16×16 samples, i.e., a macroblock. In HEVC, however, the largest transform size is 32×32 and the largest prediction unit size is 64×64 samples. This additionally reduces the average amount of operations (although not necessarily for the worst case) since deblocking is never performed inside these large blocks.

Deblocking in HEVC has been designed to prevent spatial dependences of the deblocking process across the picture. There is no overlap between the filtering operations for one block edge, which can modify at most three pixels from the block edge, and the filtering decisions for the neighboring parallel block edge, which involve at most four pixels from the block edge. Therefore, any vertical block edge in the picture can be deblocked in parallel to any other vertical edge. The same holds for horizontal edges. Note, however, that sample values modified by deblocking of vertical block boundaries are used as the input for deblocking of horizontal block boundaries.

For CTU-based processing, the deblocking in HEVC can be performed on an 8×8 block basis. A picture can be divided into nonoverlapping blocks of 8×8 samples (see Fig. 2). Each of those blocks contains all data required for its deblocking. Consequently, deblocking can be performed independently for each of those 8×8 blocks. This makes the HEVC deblocking easily parallelizable for any degree of parallelism by simply replicating the same 8×8 deblocking logic.

The order of filtering of vertical and horizontal edges in HEVC is also different from that in H.264/AVC. In H.264/AVC, deblocking is performed on a macroblock basis. However, the deblocking in HEVC is first applied to all vertical edges and then to all horizontal edges in the picture. Consequently, the order of vertical and horizontal filtering for each of the 8×8 blocks, as shown in Fig. 1, is exactly the same irrespective of the block position. Moreover, the order of filtering the block boundaries does not change with different orders of CTU decoding, which reduces hardware complexity.

As HEVC deblocking is independent for each 8×8 block, an encoder or decoder has the option of deblocking only the inner blocks of a slice or a tile [11] and leaving the slice or tile boundary blocks out of the deblocking process in the first pass. In the second pass, an encoder or decoder can go back and perform deblocking along the slice or tile boundaries as a patch. Such an option essentially breaks the in-loop filter (deblocking and SAO) dependence across the slice or tile boundaries and is very useful for parallel processing on multicore platforms when the in-loop filters are enabled across slice or tile boundaries. By taking advantage of this property, each core can process a portion of a picture in parallel by skipping the in-loop filtering for the slice or tile boundary blocks. After the entire picture is processed, a separate core can load the slice or tile boundary blocks back and conduct a patch for the in-loop filters along the slice or tile boundaries to complete the in-loop filtering for the picture. Therefore, there is no need to decouple the entire in-loop filtering process from the rest of the coding process, which significantly improves the throughput and greatly reduces memory bandwidth requirements for multicore-based HEVC implementations. This is not possible with the H.264/AVC deblocking filter design, in which the deblocking has to be decoupled if multiple slices are processed in parallel and deblocking across slice boundaries is enabled.

Another advantage of the highly parallelizable HEVC deblocking filter is that it provides enough cycle margin to enable a combination of the deblocking filter and SAO in the same building block in hardware implementations. In a typical architecture, the HEVC deblocking filter consumes only 84 to 88 cycles per 16×16 block, which is less than half of the typical cycle budget of 200 cycles per 16×16 block (for 1080p@120 f/s video running at a 250 MHz clock rate) [14]. Combining the deblocking filter and SAO in the same building block is beneficial in terms of hardware area cost, since SAO and deblocking can share the same memory interface, in contrast to having separate building blocks and memory interfaces for SAO and deblocking.

Since deblocking in HEVC is computationally less intensive and more parallelizable than in H.264/AVC, HEVC deblocking is much less of a bottleneck when implementing a video decoder. The deblocking in HEVC is a better tradeoff among coding efficiency (i.e., subjective and objective quality), throughput, and implementation complexity when compared to the H.264/AVC design.

VI. Results

This section demonstrates the objective and subjective impact of deblocking filtering. Tables II–V show the BD-rate resulting from disabling the deblocking filtering for various configurations used in the HEVC standardization [16]. These configurations are all-intra where only intra prediction is used,
1752 IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, VOL. 22, NO. 12, DECEMBER 2012
TABLE II
Average Bit Rate Increase at the Same Quality by Disabling the Deblocking Filter for the All-Intra Configuration

All Intra Main
          Y     U     V
Class A   1.9%  4.2%  3.7%
Class B   1.7%  4.5%  5.1%
Class C   0.9%  3.7%  4.3%
Class D   0.7%  3.0%  3.4%
Class E   2.1%  7.4%  8.8%
Class F   0.6%  1.9%  1.8%
Overall   1.3%  4.0%  4.4%

TABLE III
Average Bit Rate Increase at the Same Quality by Disabling the Deblocking Filter for the Random-Access Configuration

TABLE IV
Average Bit Rate Increase at the Same Quality by Disabling the Deblocking Filter for the Low-Delay Configuration

TABLE V
Average Bit Rate Increase at the Same Quality by Disabling the Deblocking Filter for the Low-Delay P-Frame Configuration

Low Delay P Main
          Y     U     V
Class B   4.9%  2.5%  2.7%
Class C   2.6%  1.5%  2.1%
Class D   1.6%  1.4%  0.8%
Class E   6.2%  7.8%  9.0%
Class F   1.7%  1.0%  0.4%
Overall   3.3%  2.5%  2.7%

Fig. 8. Basketball Drive sequence coded in random access configuration at QP32. (a) Deblocking turned off. (b) Deblocking turned on.
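The BD-rate figures in Tables II–V are computed with the Bjøntegaard metric [15]: each rate-distortion curve is fitted with a cubic polynomial (log10 of bit rate versus PSNR), and the average gap between the two fits is taken over the overlapping PSNR interval. A minimal pure-Python sketch (the rate/PSNR anchor points used in the test below are illustrative, not taken from the tables):

```python
import math

def _solve(mat, vec):
    """Gaussian elimination with partial pivoting for a small system."""
    n = len(vec)
    a = [list(row) + [vec[i]] for i, row in enumerate(mat)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[piv] = a[piv], a[col]
        for r in range(col + 1, n):
            f = a[r][col] / a[col][col]
            for c in range(col, n + 1):
                a[r][c] -= f * a[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (a[r][n] - sum(a[r][c] * x[c] for c in range(r + 1, n))) / a[r][r]
    return x

def _cubic_fit(xs, ys):
    """Least-squares cubic fit; returns coefficients c0..c3."""
    mat = [[sum(x ** (i + j) for x in xs) for j in range(4)] for i in range(4)]
    vec = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(4)]
    return _solve(mat, vec)

def _integral(coefs, lo, hi):
    """Definite integral of the cubic between lo and hi."""
    f = lambda x: sum(c * x ** (i + 1) / (i + 1) for i, c in enumerate(coefs))
    return f(hi) - f(lo)

def bd_rate(rates1, psnrs1, rates2, psnrs2):
    """BD-rate: average bit rate difference (%) of curve 2 relative to
    curve 1 at the same PSNR, following the Bjontegaard metric."""
    # Fit log10(rate) as a cubic in PSNR; shift PSNR for conditioning.
    shift = sum(psnrs1 + psnrs2) / (len(psnrs1) + len(psnrs2))
    u1 = [p - shift for p in psnrs1]
    u2 = [p - shift for p in psnrs2]
    c1 = _cubic_fit(u1, [math.log10(r) for r in rates1])
    c2 = _cubic_fit(u2, [math.log10(r) for r in rates2])
    # Average the log-rate gap over the overlapping PSNR interval.
    lo, hi = max(min(u1), min(u2)), min(max(u1), max(u2))
    avg_diff = (_integral(c2, lo, hi) - _integral(c1, lo, hi)) / (hi - lo)
    return (10.0 ** avg_diff - 1.0) * 100.0
```

For example, a curve whose bit rates are uniformly 10% higher at the same PSNR yields a BD-rate of +10%.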
is important for coding and decoding higher resolution video sequences.

References

[1] B. Bross, W.-J. Han, G. J. Sullivan, J.-R. Ohm, and T. Wiegand, High Efficiency Video Coding (HEVC) Text Specification Draft 8, ITU-T SG16 WP3 and ISO/IEC JTC1/SC29/WG11 document JCTVC-J1003, Joint Collaborative Team on Video Coding (JCTVC), Stockholm, Sweden, Jul. 2012.
[2] ITU-T and ISO/IEC JTC 1, Advanced Video Coding for Generic Audiovisual Services, ITU-T Rec. H.264 and ISO/IEC 14496-10 (AVC), May 2003 (and subsequent editions).
[3] T. Wedi and H. G. Musmann, "Motion and aliasing compensated prediction for hybrid video coding," IEEE Trans. Circuits Syst. Video Technol., vol. 13, no. 7, pp. 577–586, Jul. 2003.
[4] P. List, A. Joch, J. Lainema, G. Bjontegaard, and M. Karczewicz, "Adaptive deblocking filter," IEEE Trans. Circuits Syst. Video Technol., vol. 13, no. 7, pp. 614–619, Jul. 2003.
[5] K. Ugur, K. R. Andersson, and A. Fuldseth, Video Coding Technology Proposal by Tandberg, Nokia, and Ericsson, ITU-T SG16 WP3 and ISO/IEC JTC1/SC29/WG11 document JCTVC-A119, Joint Collaborative Team on Video Coding (JCTVC), Dresden, Germany, Apr. 2010.
[6] A. Norkin, K. Andersson, R. Sjöberg, Q. Huang, J. An, X. Guo, and S. Lei, CE12: Ericsson's and MediaTek's Deblocking Filter, ITU-T SG16 WP3 and ISO/IEC JTC1/SC29/WG11 document JCTVC-F118, Joint Collaborative Team on Video Coding (JCTVC), Turin, Italy, Jul. 2011.
[7] M. Ikeda and T. Suzuki, Non-CE10: Introduction of Strong Filter Clipping in Deblocking Filter, ITU-T SG16 WP3 and ISO/IEC JTC1/SC29/WG11 document JCTVC-H0275, Joint Collaborative Team on Video Coding (JCTVC), San Jose, CA, Feb. 2012.
[8] M. Ikeda, J. Tanaka, and T. Suzuki, CE12 Subset2: Parallel Deblocking Filter, ITU-T SG16 WP3 and ISO/IEC JTC1/SC29/WG11 document JCTVC-E181, Joint Collaborative Team on Video Coding (JCTVC), Geneva, Switzerland, Mar. 2011.
[9] M. Narroschke, S. Esenlik, and T. Wedi, CE12 Subtest 1: Results for Modified Decisions for Deblocking, ITU-T SG16 WP3 and ISO/IEC JTC1/SC29/WG11 document JCTVC-G590, Joint Collaborative Team on Video Coding (JCTVC), Geneva, Switzerland, Nov. 2011.
[10] A. Norkin, CE10.3: Deblocking Filter Simplifications: Bs Computation and Strong Filtering Decision, ITU-T SG16 WP3 and ISO/IEC JTC1/SC29/WG11 document JCTVC-H0473, Joint Collaborative Team on Video Coding (JCTVC), San Jose, CA, Feb. 2012.
[11] A. Fuldseth, M. Horowitz, S. Xu, A. Segall, and M. Zhou, Tiles, ITU-T SG16 WP3 and ISO/IEC JTC1/SC29/WG11 document JCTVC-F335, Joint Collaborative Team on Video Coding (JCTVC), Turin, Italy, Jul. 2011.
[12] T. Yamakage, S. Asaka, T. Chujoh, M. Karczewicz, and I. S. Chong, CE12: Deblocking Filter Parameter Adjustment in Slice Level, ITU-T SG16 WP3 and ISO/IEC JTC1/SC29/WG11 document JCTVC-G174, Joint Collaborative Team on Video Coding (JCTVC), Geneva, Switzerland, Nov. 2011.
[13] G. Van der Auwera, X. Wang, M. Karczewicz, M. Narroschke, A. Kotra, and T. Wedi (Panasonic), Support of Varying QP in Deblocking, ITU-T SG16 WP3 and ISO/IEC JTC1/SC29/WG11 document JCTVC-G1031, Joint Collaborative Team on Video Coding (JCTVC), Geneva, Switzerland, Nov. 2011.
[14] M. Zhou, O. Sezer, and V. Sze, CE12 Subset 2: Test Results and Architectural Study on De-Blocking Filter Without Parallel on/off Filter Decision, ITU-T SG16 WP3 and ISO/IEC JTC1/SC29/WG11 document JCTVC-G088, Joint Collaborative Team on Video Coding (JCTVC), Geneva, Switzerland, Nov. 2011.
[15] G. Bjontegaard, Calculation of Average PSNR Differences Between RD-Curves, ITU-T SG16 document VCEG-M33, 2001.
[16] F. Bossen, Common Test Conditions, ITU-T SG16 WP3 and ISO/IEC JTC1/SC29/WG11 document JCTVC-H1100, Joint Collaborative Team on Video Coding (JCTVC), San Jose, CA, 2012.

Andrey Norkin received the M.Sc. degree in computer engineering from Ural State Technical University, Yekaterinburg, Russia, in 2001, and the Doctor of Science degree in signal processing from the Tampere University of Technology, Tampere, Finland, in 2007.
From 2002 to 2007, he was a Researcher with the Institute of Signal Processing, Tampere University of Technology. He is currently a Senior Researcher and Work Package Leader with Ericsson Research, Stockholm, Sweden. He has been an active participant in the ITU-T/ISO/IEC Joint Collaborative Team for Video Coding and a Coordinator of the Core Experiment on deblocking filtering in HEVC standardization. His current research interests include video compression, 3-D video, error-resilient coding, and image processing.

Gisle Bjøntegaard received the Dr. Philos degree in physics from the University of Oslo, Oslo, Norway, in 1974.
From 1974 to 1996, he was a Senior Scientist with Telenor Research and Development, Oslo. His areas of work included radio link network design, reflector antenna design and construction, and digital signal processing. From 1980 to 1996, he was mainly involved in the development of digital video compression methods. He has contributed actively to the development of the ITU video standards H.261, H.262, H.263, and H.264, and ISO/IEC MPEG-2 and MPEG-4. From 1996 to 2002, he was a Group Manager with Telenor Broadband Services, Oslo, engaged in the design of point-to-point satellite communication and the development of a digital satellite TV platform. He has produced numerous contributions toward the development of the ITU-T standards H.263 and H.264. Since 2002, he has been a Principal Scientist with Tandberg Telecom, Lysaker, Norway, working on video coding development and implementation. Since 2006, he has worked on further improvement of digital video coding and is currently taking part in the definition of HEVC, developed jointly between ITU and ISO. In 2010, Tandberg Telecom was acquired by Cisco Systems; he was promoted to a Cisco Fellow and is presently with Cisco Systems, Norway.

Arild Fuldseth received the B.Sc. degree from the Norwegian Institute of Technology, Trondheim, Norway, in 1988, and the M.Sc. and Ph.D. degrees in signal processing from North Carolina State University, Raleigh, and the Norwegian University of Science and Technology, Trondheim, in 1989 and 1997, respectively.
From 1989 to 1994, he was a Research Scientist with SINTEF, Trondheim. From 1997 to 2002, he was a Manager with the Signal Processing Group, Fast Search and Transfer, Oslo, Norway. Since 2002, he has been with Tandberg Telecom, Oslo (now part of Cisco Systems, Oslo), where he is currently a Principal Engineer working on video compression technology.

Matthias Narroschke received the Dipl.-Ing. and Dr.-Ing. degrees in electrical engineering from the University of Hannover, Hannover, Germany, in 2001 and 2008, respectively.
From 2001 to 2007, he was a Research Engineer and Teaching Assistant with the Institut für Informationsverarbeitung, University of Hannover. In 2003, he became the Ober-Ingenieur. Since 2007, he has been with the Panasonic Research and Development Center, Langen, Germany, where he is currently a Principal Engineer. In 2008, he became a Guest Lecturer with the University of Hannover. He has several patents pending in the area of video coding, mostly in cooperation with Panasonic. His current research interests include video coding standardization activities and the development of future video coding schemes.
Dr. Narroschke is an active contributor to the Moving Picture Experts Group of ISO/IEC SC29 and to the Video Coding Experts Group of ITU. He was the recipient of the Robert-Bosch-Prize for the Best Dipl.-Ing. Degree in Electrical Engineering in 2001.
Masaru Ikeda received the B.S. and M.S. degrees in electrical and communication engineering from Tohoku University, Sendai, Japan, in 1999 and 2001, respectively.
From 2001 to 2008, he was a Research Engineer with Sony Research Center and worked on computer vision. He is currently with the Technology Development Group, Sony Corporation, Tokyo, Japan. His current research interests include video compression, 3-D video compression, and image processing.

Geert Van der Auwera received the Ph.D. degree in electrical engineering from Arizona State University, Tempe, in 2007, and the Belgian MSEE degree from Vrije Universiteit Brussel, Brussels, Belgium, in 1997.
He is currently with Qualcomm Technologies, Inc., San Diego, CA, where he is actively contributing to the JCT-VC standardization effort on high-efficiency video coding. Until January 2011, he was with Samsung Electronics, Irvine, CA. Until December 2004, he was a Scientific Advisor with IWT-Flanders, Institute for the Promotion of Innovation by Science and Technology, Flanders, Belgium. In 2000, he joined IWT-Flanders after researching wavelet video coding at IMEC's Electronics and Information Processing Department, Brussels. His current research interests include video coding, video traffic and quality characterization, and video streaming mechanisms and protocols.
Dr. Van der Auwera received the Barco and IBM Prizes in 1998 for his thesis on motion estimation in the wavelet domain from the Fund for Scientific Research of Flanders, Belgium.

Kenneth Andersson received the M.Sc. degree in computer science and engineering from Luleå University, Luleå, Sweden, in 1995, and the Ph.D. degree from Linköping University, Linköping, Sweden, in 2003.
He has been with Ericsson Research, Stockholm, Sweden, since 1994, where he has worked on speech coding and is currently a Senior Researcher working on video coding. His current research interests include image and video signal processing and video coding.
Minhua Zhou received the B.E. degree in electronic engineering and the M.E.
degree in communication and electronic systems from Shanghai Jiao Tong
University, Shanghai, China, in 1987 and 1990, respectively, and the Ph.D.
degree in electronic engineering from Technical University, Braunschweig,
Germany, in 1997.
From 1993 to 1998, he was a Researcher with the Heinrich-Hertz Institute,
Berlin, Germany. He is currently a Research Manager of video coding
technology with the Systems and Applications Research and Development
Center, Texas Instruments, Inc., Dallas. His current research interests include
video compression, video pre- and postprocessing, end-to-end video quality,
joint algorithm and architecture optimization, and 3-D video.
Dr. Zhou was the recipient of the Rudolf-Urtel Prize in 1997 from the
German Society for Film and Television Technologies in recognition of his
Ph.D. thesis work on Optimization of MPEG-2 Video Encoding.