Image Coding Based on Image Segmentation and Shape Adaptive All Phase Biorthogonal Transform

Baochen Jiang (1,2), Aiping Yang (1), Chengyou Wang (2), and Zhengxin Hou (1,*)

(1) School of Electronic Information Engineering, Tianjin University, Tianjin 300072, P. R. China
(2) School of Mechanical, Electrical and Information Engineering, Shandong University, Weihai 264209, P. R. China
jbc@sdu.edu.cn; yangaiping@tju.edu.cn; wangchengyou@sdu.edu.cn; zhengxinhou@163.com

*Corresponding author.
This work was supported by the National Natural Science Foundation of China under Grant No. 61201371 and No. 61002027.
978-1-4673-5791-3/13/$31.00 ©2013 IEEE
Authorized licensed use limited to: J.R.D. Tata Memorial Library Indian Institute of Science Bengaluru. Downloaded on April 11,2022 at 14:54:23 UTC from IEEE Xplore. Restrictions apply.

Abstract—Building on the shape adaptive discrete cosine transform (SA-DCT) and its application in image coding, this paper proposes a new image coding algorithm based on image segmentation and the shape adaptive all phase biorthogonal transform (SA-APBT). An image is divided into two parts, the region-of-interest (ROI) and the background area, which can be encoded separately. Intra macroblocks (all pixels are located in the ROI or in the background area) are processed with APBT, while marginal macroblocks (only part of the pixels are located in the ROI) are processed with SA-APBT. Experimental results on the test images show that the coding performance of the proposed algorithm is better than that of conventional algorithms: both the objective quality and the subjective effect are improved.

Keywords—image coding; image segmentation; region-of-interest (ROI); shape adaptive; discrete cosine transform (DCT); all phase biorthogonal transform (APBT)

I. INTRODUCTION

The discrete cosine transform (DCT) [1], [2] has been widely used in image and video compression standards [3], [4]. Many block-based coding methods adopt the DCT to transform an image from the spatial domain to the transform domain. However, these methods suffer from shortcomings such as blocking and ringing effects, especially at low bit rates, which has promoted the development of region-based image compression [5]. Region-based image coding, which can improve the quality of coded images, consists of three procedures: image segmentation, contour coding and texture coding. First, an image is divided into the region-of-interest (ROI) and the background area by image segmentation. Both areas are encoded and transmitted independently, so the decoder can decode and manipulate the ROI or the background area from the bit stream.

The shape adaptive DCT (SA-DCT) proposed by Sikora et al. is a transform approach for arbitrarily shaped image segments [6]. Since its introduction, the low-complexity SA-DCT has been used in many region-based coding systems and was included in the MPEG-4 video standard verification model [7]. However, images reconstructed with SA-DCT show obvious blocking artifacts at low bit rates. On the other hand, the all phase biorthogonal transform (APBT), which is based on the Walsh-Hadamard transform (WHT), the DCT and the inverse discrete cosine transform (IDCT), was presented and used in image compression instead of the DCT [8], [9]. The image compression algorithm based on APBT uses a simpler quantization table and, compared with the DCT-based one, improves both objective and subjective performance at low bit rates [9].

Because APBT-based image compression outperforms DCT-based compression at low bit rates [9], this paper proposes the shape adaptive APBT (SA-APBT) for arbitrarily shaped image segments in place of SA-DCT. Experimental results show that, at the same bit rate, the PSNRs and the reconstructed image quality obtained with SA-APBT are comparable to those of SA-DCT, especially at low bit rates. In addition, the quantization table for SA-APBT is simpler and the computational complexity is reduced.

The rest of this paper is organized as follows. Section II briefly introduces conventional image segmentation techniques, including edge detection, the Otsu threshold selection method and the watershed algorithm, and proposes an improved threshold selection method. APBT and SA-APBT are explained in Section III. Section IV describes the proposed image coding method based on image segmentation and SA-APBT. Experimental results and comparisons between the algorithm based on SA-DCT and the one based on SA-APBT are presented in Section V. Conclusions and directions for future work are given in Section VI.

II. IMAGE SEGMENTATION
Image segmentation partitions an image into different regions and extracts the target of interest. It is an important issue in computer vision, and its result directly affects the performance of a vision system. Image segmentation is a focus of this paper because a shape adaptive image compression system needs the region of interest in the image to be identified. Conventional image segmentation methods include edge detection, threshold segmentation and region growing; newer methods include segmentation based on mathematical morphology and spatial clustering algorithms [10].

A. Edge Detection
Edge detection is very important in digital image processing. Because the edge is the boundary line between
target and background, the two can be distinguished by means of the edge. An edge detection operator detects changes of gray value in the image. Most conventional edge detectors extract edges from the extrema of the first-order derivative or the zero-crossings of the second-order derivative, both of which result from the gray-level discontinuity at an edge. Commonly used first-order derivative operators are the gradient operator, the Sobel operator, the Prewitt operator and the Kirsch operator. Commonly used second-order derivative operators are the Laplacian operator and the Wallis operator. The gradient operator is sensitive not only to edge information, but also to individual noisy pixels.

B. Otsu Threshold Selection Method
The Otsu method is an automatic threshold determination method. Because of its simplicity and high processing speed, it is commonly used in threshold selection. According to a gray-value threshold $T$, the pixels of an image with $L$ gray levels are divided into two classes $C_0$ and $C_1$. The gray values of $C_0$ lie between 0 and $T$, and the gray values of $C_1$ lie between $T+1$ and $L-1$. The means of $C_0$ and $C_1$ are obtained by (1) and (2):

$$u_0 = \sum_{i=0}^{T} i P_i / \omega_0, \tag{1}$$

$$u_1 = \sum_{i=T+1}^{L-1} i P_i / \omega_1, \tag{2}$$

where $P_i$ is the probability of the gray value $i$, and

$$\omega_0 = \sum_{i=0}^{T} P_i, \qquad \omega_1 = \sum_{i=T+1}^{L-1} P_i = 1 - \omega_0. \tag{3}$$

Therefore, the mean of the whole image is

$$u_T = \omega_0 u_0 + \omega_1 u_1. \tag{4}$$

The interclass variance is defined as

$$\sigma_B^2 = \omega_0 (u_0 - u_T)^2 + \omega_1 (u_1 - u_T)^2 = \omega_0 \omega_1 (u_0 - u_1)^2. \tag{5}$$

The value of $T$ that maximizes $\sigma_B^2$ is the optimal threshold in the Otsu method.

C. Watershed Algorithm
The watershed transform is a widely used image segmentation method from mathematical morphology. It yields precise edges that are continuous, closed and one pixel wide. In this algorithm, an image is considered as a topographic surface, and the points and lines along the borders are likened to mountains and valleys. The watershed transform simulates a landscape being flooded by sea water. Its basic idea is that the image is divided into small areas and the direction of water flow is decided by the image gradient value. Following the flow direction at each pixel, the region of minimum value to which the pixel belongs is determined directly. The image is thereby partitioned into two different sets: catchment basins and watershed lines.

D. Improved Threshold Selection Method
This paper improves the original threshold selection method by introducing some related operations from image morphology. The proposed method combines the morphological close operation with a filling operation to obtain the region-of-interest. The process of the improved threshold selection method is shown in Fig. 1. The Sobel operator is used for edge detection, and a morphological close operation with a circular structuring element 5 pixels in diameter, together with filling, is applied in the subsequent steps. The segmentation results are shown in Fig. 2. Because the method can only handle simple images, for complex images its effectiveness is not better than that of manual segmentation. In some of the following experiments, manual segmentation is therefore still used.

Fig. 1. Block diagram of improved threshold selection method.

Fig. 2. Results of improved threshold selection method: (a) Original image with size of 384×256, (b) Segmentation result, (c) Original image with size of 640×400, (d) Segmentation result.

III. SHAPE ADAPTIVE ALL PHASE BIORTHOGONAL TRANSFORM

A. All Phase Biorthogonal Transform (APBT)
On the basis of all phase digital filtering [8], three kinds of all phase biorthogonal transforms based on the WHT, DCT and IDCT were proposed, and the matrices of APBT were deduced in [9]. For example, the matrix $V$ of the all phase inverse discrete cosine biorthogonal transform (APIDCBT) is
$$V(m,n) = \begin{cases} \dfrac{1}{N}, & m = 0,\ n = 0, 1, \ldots, N-1, \\[2ex] \dfrac{N - m + \sqrt{2} - 1}{N^2} \cos\dfrac{m(2n+1)\pi}{2N}, & m = 1, 2, \ldots, N-1,\ n = 0, 1, \ldots, N-1. \end{cases} \tag{6}$$

Similar to the DCT matrix, it can be used in image compression to transform the image from the spatial domain to the frequency domain.

B. Shape Adaptive All Phase Biorthogonal Transform (SA-APBT)
Just like SA-DCT, SA-APBT consists of four main procedures: vertical shift, vertical 1-D variable-length APBT, horizontal shift and horizontal 1-D variable-length APBT. Among the three kinds of all phase biorthogonal transforms based on the WHT, DCT and IDCT, the APIDCBT has the best performance in image compression [9]. Therefore, the shape adaptive APBT based on it (SA-APIDCBT) is mainly introduced and used in this paper.

Fig. 3(a) shows an example of a boundary block segmented into foreground (black) and background (white) [6]. Before the foreground pixels are shifted vertically, the position of the first foreground pixel and the number of foreground pixels are calculated for each column. Then, as shown in Fig. 3(b), all foreground pixels of each column are shifted vertically to the uppermost position according to the calculated shape information. The shifted foreground pixels of column $j$ are grouped into the column vector $x_j$ [11]. The $N_j$ pixels of each column vector $x_j$ are transformed into $N_j$ vertical transform coefficients by the vertical 1-D APIDCBT (Fig. 3(c)). The first coefficient in each transform coefficient column vector $a_j$ is the DC coefficient. Based on the position of the first coefficient and the length of each row, all transform coefficients of each row are shifted horizontally to the left border of the boundary block (Fig. 3(d)). The shifted transform coefficients of row $i$ are grouped into the row vector $b_i$. Afterwards, the $M_i$ coefficients of each row vector $b_i$ are transformed into the final SA-APIDCBT coefficients by the horizontal 1-D APIDCBT (Fig. 3(e)). The two 1-D subtransforms in the vertical and horizontal directions of the forward SA-APIDCBT are given in (7) and (8):

$$a_j = V_{N_j} \cdot x_j, \tag{7}$$

$$c_i = V_{M_i} \cdot b_i, \tag{8}$$

where $V_{N_j}$ and $V_{M_i}$ denote the APIDCBT matrices of size $N_j$ and $M_i$, respectively. Obviously, the number of final SA-APIDCBT coefficients is equal to the number of pixels in the foreground.

On the basis of the transmitted contour data, the decoder can transform the received SA-APIDCBT coefficients back into the original foreground data by the inverse SA-APIDCBT. The two 1-D subtransforms in the horizontal and vertical directions of the inverse SA-APIDCBT are given in (9) and (10):

$$b_i^{*} = V'_{M_i} \cdot c_i^{*}, \tag{9}$$

$$x_j^{*} = V'_{N_j} \cdot a_j^{*}, \tag{10}$$

where the stars indicate the data obtained after the SA-APIDCBT coefficients have been quantized and dequantized, and $V'_{M_i}$ and $V'_{N_j}$ denote the inverses of the APIDCBT matrices $V_{M_i}$ and $V_{N_j}$, respectively.

Fig. 3. The process of forward SA-APIDCBT on an arbitrarily shaped image segment within an 8×8 block: (a) Original boundary block, (b) Boundary block after the pixels in the foreground are shifted vertically to the upper border, (c) Boundary block after vertical 1-D subtransform, (d) Boundary block after the transform coefficients are horizontally shifted to the left border, (e) Final SA-APIDCBT transform coefficients boundary block after horizontal 1-D subtransform.

IV. PROPOSED METHOD BASED ON IMAGE SEGMENTATION AND SA-APBT

A. Algorithm Description
The proposed image coding algorithm based on image segmentation and SA-APBT is shown in Fig. 4. First, the input image is divided into two parts: the region-of-interest and the background area. In some cases this step can be omitted; for example, objects stored in a multimedia database may already be segmented objects of arbitrary shape. After the split, the objects have arbitrary shapes, so it is necessary to represent and encode the contour information of each object. Then the image is encoded with the forward SA-APBT. After quantization and zig-zag scan, the results are combined with the chain code information and Huffman encoded. Finally, the compressed image data are obtained in bit stream format.

After transmission over the channel, the decoding procedures at the receiving terminal are the inverse operations of the encoding process: Huffman decoding, chain decoding, inverse zig-zag scan, inverse quantization, inverse SA-APBT and object merging. Finally, the reconstructed image is obtained.
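As a minimal sketch of the SA-APIDCBT of Section III, the following Python code builds the matrix $V$ of (6) and applies the shift-and-transform steps of (7)-(10) to a small boundary block. It is an illustration under stated assumptions rather than the authors' implementation: the function names (`apidcbt_matrix`, `sa_forward`, `sa_inverse`), the 4×4 demo block and its mask are hypothetical, the inverse matrices are obtained by generic Gauss-Jordan elimination, and quantization is omitted so that the round trip is exact up to floating-point error.

```python
import math

def apidcbt_matrix(n):
    """n x n APIDCBT matrix V of (6)."""
    rows = [[1.0 / n] * n]
    for m in range(1, n):
        scale = (n - m + math.sqrt(2) - 1) / n ** 2
        rows.append([scale * math.cos(m * (2 * k + 1) * math.pi / (2 * n))
                     for k in range(n)])
    return rows

def mat_vec(a, x):
    """Matrix-vector product for lists of lists."""
    return [sum(p * q for p, q in zip(row, x)) for row in a]

def invert(a):
    """Gauss-Jordan inverse with partial pivoting (adequate for n <= 8)."""
    n = len(a)
    aug = [row[:] + [float(r == c) for c in range(n)] for r, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[piv] = aug[piv], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [vr - f * vc for vr, vc in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def sa_forward(block, mask):
    """Forward transform: vertical shift + (7), then horizontal shift + (8)."""
    size = len(block)
    cols = []
    for j in range(size):
        # Gathering the foreground pixels of column j is the shift to the top.
        x_j = [block[i][j] for i in range(size) if mask[i][j]]
        cols.append(mat_vec(apidcbt_matrix(len(x_j)), x_j) if x_j else [])
    rows = []
    for i in range(size):
        # Gathering the i-th coefficient of every long-enough column is the
        # shift to the left border.
        b_i = [a_j[i] for a_j in cols if len(a_j) > i]
        rows.append(mat_vec(apidcbt_matrix(len(b_i)), b_i) if b_i else [])
    return rows

def sa_inverse(rows, mask):
    """Inverse transform: (9), undo horizontal shift, (10), undo vertical shift."""
    size = len(mask)
    col_len = [sum(1 for i in range(size) if mask[i][j]) for j in range(size)]
    cols = [[] for _ in range(size)]
    for i, c_i in enumerate(rows):
        if c_i:
            b_i = mat_vec(invert(apidcbt_matrix(len(c_i))), c_i)
            targets = [j for j in range(size) if col_len[j] > i]
            for j, value in zip(targets, b_i):
                cols[j].append(value)
    out = [[0.0] * size for _ in range(size)]
    for j in range(size):
        if cols[j]:
            x_j = mat_vec(invert(apidcbt_matrix(col_len[j])), cols[j])
            for i, value in zip([i for i in range(size) if mask[i][j]], x_j):
                out[i][j] = value
    return out

# Hypothetical 4x4 boundary block; mask entries equal to 1 mark the foreground.
block = [[10.0, 12.0, 14.0, 0.0],
         [11.0, 13.0, 15.0, 0.0],
         [12.0, 14.0, 0.0, 0.0],
         [0.0, 15.0, 0.0, 0.0]]
mask = [[0, 1, 1, 0],
        [1, 1, 1, 0],
        [1, 1, 0, 0],
        [0, 1, 0, 0]]
coeffs = sa_forward(block, mask)
restored = sa_inverse(coeffs, mask)
print([len(r) for r in coeffs])
```

The number of coefficients equals the number of foreground pixels, as stated above, and a constant foreground leaves only the first coefficient non-zero, which reflects its role as the DC coefficient.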

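The quantization and zig-zag steps of the coding chain can be sketched as follows. This is a minimal illustration, assuming a single uniform step for every coefficient and the JPEG-style zig-zag order over a full square block. The step value 4.0 and the 4×4 coefficient block are made up for the example, the helper names are hypothetical, and the text does not specify how the coefficients of marginal macroblocks are scanned.

```python
def zigzag_order(n=8):
    """(row, col) positions of an n x n block in JPEG-style zig-zag order."""
    order = []
    for s in range(2 * n - 1):
        diag = [(i, s - i) for i in range(n) if 0 <= s - i < n]
        # Even anti-diagonals run from bottom-left to top-right,
        # odd anti-diagonals from top-right to bottom-left.
        order.extend(reversed(diag) if s % 2 == 0 else diag)
    return order

def quantize(block, step):
    """Uniform quantization: the same step for every coefficient."""
    return [[int(round(c / step)) for c in row] for row in block]

def dequantize(levels, step):
    """Inverse quantization: scale the integer levels back by the step."""
    return [[q * step for q in row] for row in levels]

def scan(levels):
    """Read a square block into a 1-D sequence in zig-zag order."""
    return [levels[r][c] for r, c in zigzag_order(len(levels))]

def inverse_scan(sequence, n):
    """Write a zig-zag sequence back into an n x n block."""
    out = [[0] * n for _ in range(n)]
    for value, (r, c) in zip(sequence, zigzag_order(n)):
        out[r][c] = value
    return out

# Hypothetical transform coefficients with energy concentrated at low frequencies.
coeffs = [[52.0, -9.5, 3.2, 0.4],
          [-7.1, 4.0, -1.3, 0.2],
          [2.6, -0.9, 0.5, 0.0],
          [0.3, 0.1, 0.0, 0.0]]
step = 4.0
levels = quantize(coeffs, step)
sequence = scan(levels)
print(sequence)
```

The zig-zag order places the low-frequency levels first, so the trailing zeros form one long run that an entropy coder such as the Huffman coder can represent compactly.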
In the SA-APBT, the image is divided into 8×8 macroblocks. The coding scheme of each macroblock is then determined by the number of its pixels that belong to the region-of-interest. The region-of-interest is described by a binary code: “0” represents a pixel in the background area, while “1” represents a pixel in the region-of-interest, as shown in Fig. 5. The region-of-interest and the background area are encoded separately in this paper. There are two block classes with corresponding coding schemes: intra macroblocks (all pixels are located in the ROI or in the background area) are processed with APBT, while marginal macroblocks (only part of the pixels are located in the ROI) are processed with SA-APBT.

Fig. 5. Binary information of ROI: (a) Original image, (b) Shape information.

At the decoder, the data are decoded with the help of the shape information, and blocks are classified by the type of their pixels.

B. Shape Representation and Chain Coding
Shape representation methods fall into two categories: coding schemes and simplified representations of the contour. Commonly used schemes are the chain code method, the spline method, the polygon approximation method and the labeled graph method. Because the shapes handled by the encoding algorithm proposed in this paper are arbitrary, the chain code method is used to represent the contour.

The chain code is a very common method. It does not simplify shapes, but represents them effectively. In 1961, Freeman introduced the chain code to the field of image processing; he also defined the concept of generalized chain codes [12]. With the chain code, any shape characteristic can be calculated.

The chain code is used to represent boundary lines, which are composed of line segments with designated length and direction. First, a point is selected as the starting point and its coordinates in the x and y directions are recorded. Each subsequent point is determined by its location relative to the previous point and is represented by a value from 1 to 8. The chain code can express contours without distortion while saving a number of bits, as shown in Fig. 6.

Fig. 6. Chain code: (a) 4 chain code, (b) 8 chain code.

C. Uniform Quantization
In the image coding system based on the SA-APBT, the high-frequency components, to which the human eye is not sensitive, should be compressed more strongly. Conventionally this is done by applying different quantization intervals to the transformed coefficients: the quantization step of the low-frequency coefficients is small, while the quantization step of the high-frequency coefficients is large. The relative energy of APBT’s basis images increases with increasing sequency along the down/right diagonal direction, which results in the high-frequency attenuation characteristics
of the coefficients of APBT. Therefore, in the quantization process, uniform quantization can be used instead of the complex quantization matrix used in the JPEG standard, which reduces the complexity of the proposed algorithm.

In addition, because the region-of-interest is separated automatically or manually in advance, the ROI and the background area can be quantized separately. This coding algorithm can not only guarantee a certain subjective quality, but also achieve a better compression ratio. Experimental results show that if the quantization step of the ROI is kept the same and the quantization step of the background is increased, the PSNR decreases but the subjective effect is not reduced significantly. As shown in Table I, the bit rate of Fig. 7(b) is 0.10 bpp lower than that of Fig. 7(a) and the PSNR of the whole image drops by 1.20 dB. Because the quantization step of the region-of-interest is unchanged, the subjective effect is not affected much.

Fig. 7. Subjective effects of reconstructed images Lena using different quantization steps in ROI: (a) 0.60bpp, (b) 0.50bpp.

TABLE I. EFFECTS OF QUANTIZATION STEP ON PSNR

PSNR/dB               Bit rate 0.60bpp    Bit rate 0.50bpp
Whole image           35.79               34.59
Region of interest    37.37               37.37
Background            40.94               37.84

V. EXPERIMENTAL RESULTS AND PERFORMANCE ANALYSIS
To test the ability of the proposed algorithm to compress the region-of-interest, two typical test images, Lena and Zelda, with size of 512×512 and another image, Car, are selected and compressed at various bit rates. To compute the PSNR of the region-of-interest, the whole image uses the same quantization step in this section, regardless of the ROI and the background area. In the APBT-based algorithm, APIDCBT is adopted; in the SA-APBT-based algorithm, SA-APIDCBT is adopted correspondingly. For SA-DCT, the ΔDC-SA-DCT [11] is adopted to test the performance.

The rate distortion curves of images Lena and Zelda are shown in Fig. 8 and Fig. 9, respectively.

Fig. 8. Rate distortion curves of image Lena (PSNR/dB versus bit rate/bpp for DCT-JPEG, SA-DCT, APIDCBT and SA-APIDCBT).

Fig. 9. Rate distortion curves of image Zelda (PSNR/dB versus bit rate/bpp for DCT-JPEG, SA-DCT, APIDCBT and SA-APIDCBT).

Fig. 10 shows the segmentation result of image Car using the image segmentation algorithm proposed in this paper.

Fig. 10. Subjective result using the proposed image segmentation method: (a) Original image, (b) Image segmentation.
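The region-wise PSNR values discussed in this section can be computed with a mask-restricted PSNR. The sketch below assumes the usual definition for 8-bit images, PSNR = 10 log10(255² / MSE), which the text does not state explicitly. The function name `psnr`, its `mask` argument and the toy 2×4 image are hypothetical.

```python
import math

def psnr(original, decoded, mask=None, peak=255.0):
    """PSNR in dB over the pixels where mask is truthy (all pixels if mask is None)."""
    squared_error = 0.0
    count = 0
    for r, (row_o, row_d) in enumerate(zip(original, decoded)):
        for c, (o, d) in enumerate(zip(row_o, row_d)):
            if mask is None or mask[r][c]:
                squared_error += (o - d) ** 2
                count += 1
    if count == 0:
        raise ValueError("the selected region contains no pixels")
    mse = squared_error / count
    # An error-free region has unbounded PSNR.
    return math.inf if mse == 0 else 10.0 * math.log10(peak ** 2 / mse)

# Toy example: the left half is the ROI ("1"), the right half the background ("0").
original = [[10, 20, 30, 40],
            [50, 60, 70, 80]]
roi = [[1, 1, 0, 0],
       [1, 1, 0, 0]]
background = [[1 - v for v in row] for row in roi]
# Only the background pixels are distorted (each by 2 gray levels).
decoded = [[10, 20, 32, 42],
           [50, 60, 72, 82]]

whole_db = psnr(original, decoded)
roi_db = psnr(original, decoded, roi)
background_db = psnr(original, decoded, background)
print(round(whole_db, 2), roi_db, round(background_db, 2))
```

In this toy case the distortion is confined to the background, so the ROI value is unaffected while the whole-image value drops. This is the same qualitative behaviour as the ROI row of Table I, which stays at 37.37 dB when only the background is quantized more coarsely.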

The rate distortion curves of image Car are shown in Fig. 11.

Fig. 11. Rate distortion curves of image Car (PSNR/dB versus bit rate/bpp for DCT-JPEG, SA-DCT, APIDCBT and SA-APIDCBT).

From the experimental results, it can be concluded that, at the same bit rate, the algorithm proposed in this paper achieves good performance in ROI image coding, especially at low bit rates. Both the objective quality and the subjective effect are improved. The coding performance is best for images Lena and Zelda: at a bit rate of 0.15 bpp, SA-APIDCBT gains about 1 dB in PSNR over SA-DCT. For image Car, the coding performance also improves: at a bit rate of 0.15 bpp, the PSNR gain is about 0.7 dB.

VI. CONCLUSION
This paper proposes a new image coding algorithm based on image segmentation and the shape adaptive all phase biorthogonal transform. First, the commonly used image segmentation algorithms are briefly introduced, and on this basis the threshold selection method is improved. Then the all phase biorthogonal transform and the shape adaptive all phase biorthogonal transform are introduced. Finally, the proposed image coding algorithm is presented. Experimental results show that, compared with the conventional algorithms, the proposed algorithm achieves better performance for ROI image coding. Both the objective quality and the subjective effect are improved.

Although the segmentation algorithm proposed in this paper is effective for simple images, its results are not yet ideal, and it can be improved in the future. In addition, SA-APBT can be applied to video compression, because video signals place higher demands on coding performance and SA-DCT has been adopted in the MPEG-4 international standard. It can be foreseen that using SA-APBT in video compression would improve the coding performance to some extent.

ACKNOWLEDGMENT
This work was supported by the National Natural Science Foundation of China (Grant No. 61201371, No. 61002027). The authors would like to thank Xiangyu Niu, Chao Cui, Xiaoyan Wang and Chunhui Guo for their help and valuable suggestions. The authors also thank the anonymous reviewers for their valuable comments to improve the presentation of the paper.

REFERENCES
[1] N. Ahmed, T. Natarajan, and K. R. Rao, “Discrete cosine transform,” IEEE Trans. on Computers, vol. 23, no. 1, pp. 90-93, Jan. 1974.
[2] G. K. Wallace, “The JPEG still picture compression standard,” IEEE Trans. on Consumer Electronics, vol. 38, no. 1, pp. 18-34, Feb. 1992.
[3] ISO/IEC, “Information Technology-Digital Compression and Coding of Continuous-tone Still Images-Part 1: Requirements and Guidelines,” ISO/IEC 10918-1, 1994.
[4] ISO/IEC, “Information Technology-Generic Coding of Moving Pictures and Associated Audio Information,” ISO/IEC 13818-1, 1994.
[5] S. Makrogiannis, P. Schelkens, S. Fotopoulos, and J. Cornelis, “Region-oriented compression of color images using fuzzy inference and shape adaptive DCT,” in Proc. of the IEEE Int. Conf. on Image Processing, Thessaloniki, Greece, vol. 3, pp. 478-481, Oct. 2001.
[6] T. Sikora, “Low complexity shape-adaptive DCT for coding of arbitrarily shaped image segments,” Signal Processing: Image Communication, vol. 7, no. 4-6, pp. 381-395, Nov. 1995.
[7] ISO/IEC JTC1/SC29, “Information technology – Generic coding of audio-visual objects: Part 2 – Visual,” ISO/IEC 14496-2, Jan. 2000.
[8] Z. X. Hou and X. Yang, “The all phase DFT filter,” in Proc. of the 10th IEEE Digital Signal Processing Workshop and the 2nd IEEE Signal Processing Education Workshop, Pine Mountain, Georgia, USA, pp. 221-226, Oct. 2002.
[9] Z. X. Hou, C. Y. Wang, and A. P. Yang, “All phase biorthogonal transform and its application in JPEG-like image compression,” Signal Processing: Image Communication, vol. 24, no. 10, pp. 791-802, Nov. 2009.
[10] R. C. Gonzalez, R. E. Woods, and S. L. Eddins, Digital Image Processing Using MATLAB, Tennessee, USA: Gatesmark Publishing, pp. 486-534, 2009.
[11] P. Kauff and K. Schuur, “Shape-adaptive DCT with block-based DC separation and ΔDC correction,” IEEE Trans. on Circuits and Systems for Video Technology, vol. 8, no. 3, pp. 237-242, Jun. 1998.
[12] J. A. Saghri and H. Freeman, “Analysis of the precision of generalized chain codes for the representation of planar curves,” IEEE Trans. on Pattern Analysis and Machine Intelligence, vol. 3, no. 5, pp. 533-539, Sep. 1981.
