Intensity-based Co-occurrence Local Ternary Patterns for Image Retrieval
doi:10.3966/199115992018082904002
Li Li1,2, Lin Feng1,3,* (corresponding author), Sheng-Lan Liu4, Mu-Xin Sun3, Jun Wu3, Hui-Bing Wang1
1 School of Computer Science and Technology, Faculty of Electronic Information and Electrical Engineering,
Dalian University of Technology, Dalian 116024, China
hdlili@126.com, fenglin@dlut.edu.cn, whb08421005@mail.dlut.edu.cn
2 School of Information and Electrical Engineering, Hebei University of Engineering, Handan 056038, China
3 School of Innovation and Entrepreneurship, Dalian University of Technology, Dalian 116024, China
{111111, shuxuewujun}@mail.dlut.edu.cn
4 School of Control Science and Engineering, Faculty of Electronic Information and Electrical Engineering,
Dalian University of Technology, Dalian 116024, China
liusl@mail.dlut.edu.cn
Abstract. Feature descriptors based on local patterns have been applied successfully to image
retrieval because of their simplicity. However, most local pattern methods consider only the
relationships between the center pixel and its boundary pixels, and they disregard the
co-occurrences between patterns in images. In this paper, we propose a novel feature extraction
algorithm called intensity-based co-occurrence local ternary patterns (CLTP) using the HSV color
space. The brightness level at a center pixel is highly dependent on the brightness levels of its
neighbors. CLTP therefore considers the neighbors' intensity (NI) for a given center pixel, and an
operator named NI-CLTP is proposed. The HSV color space is used in this algorithm to extract
color information. Unlike existing local pattern methods, NI-CLTP encodes the intensity
co-occurrence of similar ternary edges among the surrounding neighbors of a given center pixel
in an image. Furthermore, NI-CLTP is combined with the Gabor transform to extract effective
texture features. Extensive experiments on diverse databases verify the effectiveness of the
proposed method.
Keywords: co-occurrence local ternary patterns, Gabor transform, image retrieval, local binary
pattern, local ternary pattern
1 Introduction
In recent years, with the growth of network technology and multimedia technology, the size of image
databases is increasing rapidly. It is an extremely difficult task to search and retrieve images from a huge
database. Therefore, image retrieval has become an important topic in pattern recognition and image
processing. At present, most web search engines (such as Google and Baidu) retrieve images using
keywords. However, text-based retrieval methods have two drawbacks. First, images are annotated by
human labor, which is time-consuming for large image databases. Second, the retrieval results are
inaccurate, because different annotators tend to describe the same image in different ways.
To overcome these problems, content-based image retrieval (CBIR) has been proposed. CBIR [1-4] first
extracts low-level visual features (such as color, texture and shape) that represent an image,
and then computes the similarities between the query image and the images in the database. Finally, the
top-N images are returned according to the distance measure. Feature extraction is a critical step for retrieving the
images from a huge database and the performance of low-level features directly impacts the results of
image retrieval.
Color and texture are two important visual features in CBIR. Diverse methods have been proposed to
extract color and texture features for image retrieval. Color describes the intensity distribution in
different color channels. Important color descriptors include the color histogram, color moments, color
correlograms and the color coherence vector. The color histogram is invariant to scale and orientation and is
widely applied in image retrieval [5]. However, it lacks spatial information. Therefore,
color moments [6], color correlograms [7] and the color coherence vector [8] have been proposed to exploit
spatial information. Stricker et al. proposed color moments, which consider spatial position
information but consume considerable computation space. Color correlograms characterize
the spatial correlation between pairs of colors. The color coherence vector incorporates the
spatial information of images. Texture describes the
information of the coarseness and repetitive patterns of objects. Many texture descriptors have been
proposed to describe image features. Gabor wavelet transformation has been widely applied in computer
vision [9-10]. The scale-invariant and rotation-invariant Gabor transform features were extracted for
image retrieval by Han et al. [11]. The gray level co-occurrence matrix (GLCM) was introduced to describe
texture using the co-occurrence relationships of pixels in images [12-16]. Siqueira et al. [17]
extended GLCM to multi-scale descriptors for texture description.
Many local patterns have been proposed for image retrieval, texture classification, face recognition, etc.
Viola et al. [18] introduced Haar features for real-time face detection. Ojala et al. [19] presented local
binary patterns (LBP) for image texture classification. Subsequently, many LBP variants have
been presented for texture analysis, face recognition [20], palmprint recognition, etc.
Heikkilä et al. [21] proposed center symmetric local binary pattern (CS-LBP) which compared center-
symmetric pairs of pixels. Tan et al. [22] presented local ternary patterns (LTP) for face recognition
which extends LBP to 3-valued codes. Local configuration pattern (LCP) has been combined with LBP
for texture classification [23]. Guo et al. [24] proposed completed local binary pattern (CLBP), which
integrated CLBP_Center (CLBP_C), CLBP_Sign (CLBP_S) and CLBP_Magnitude (CLBP_M) into joint
histogram for texture classification. Local mesh patterns (LMP) [25] encode the relationships among the
surrounding neighbors of a given center pixel and were designed for biomedical image indexing and retrieval.
During the last decade, most CBIR techniques have used a single feature. However, a single
feature usually has limitations: it is difficult for one property to represent varied image
contents. Some methods therefore fuse color and texture features. Shen et al. [26] combined color,
texture and spatial structure for image representation: images were segmented into small regular regions,
and local color, texture and census transform histogram (CENTRIST) features were extracted from each region. Liu et
al. [27] proposed multi-texton histogram (MTH) that integrates the advantages of co-occurrence matrix
and histogram to characterize color and texture features. ElAlami [28] exploited Gabor filter and 3D
color histogram for image retrieval.
Many local patterns have been proposed and applied to image retrieval. However, most of them
consider only the relationships between a given center pixel and its neighbors within a
single pattern, and they lose the local textural information carried by the co-occurrence of patterns
in an image.
In this paper, we address the co-occurrence of similar local patterns. The main contributions of our
method are summarized as follows:
(1) We propose a novel neighbors intensity co-occurrence local ternary patterns (NI-CLTP) feature
descriptor. NI-CLTP encodes the co-occurrence between adjacent local patterns, which traditional local
patterns lose.
(2) We further extend NI-CLTP to neighbors intensity Gabor co-occurrence local ternary patterns (NI-
GCLTP) by applying Gabor filters, which extract texture information at different scales and orientations.
(3) For a given pixel, NI-CLTP thresholds the neighbors against the average gray level
of the surrounding neighbors. This threshold is more robust than the center-pixel threshold used in LBP.
(4) We extract both color and texture features to represent images, whereas traditional local pattern
methods disregard color information.
(5) We evaluate the proposed method on Corel-1000, Corel-5000 and Corel-10000 databases, and
experimental results demonstrate the effectiveness of the proposed method in CBIR.
The remainder of this paper is organized as follows. In Section 2, color space, LBP and LTP are
introduced. Section 3 presents co-occurrence local ternary patterns (CLTP) scheme. The proposed system
framework is illustrated in Section 4. Experimental results and discussions are illustrated in Section 5.
Finally, Section 6 concludes the paper.
2 Color Space and Local Patterns
Before discussing the proposed method in detail, this section gives brief reviews of color quantization in
the HSV color space, the local binary pattern (LBP) and the local ternary pattern (LTP).
2.1 HSV Color Space and Quantization
There are already many excellent color space models [29], such as RGB, Lab, YIQ and HSV. It has been
suggested that the color space plays an important role in the performance of image representations. The
HSV color space [30] is very close to the human visual perception system and is widely applied for
color feature extraction. The H, S and V components represent hue, saturation and value respectively,
with $H \in [0, 360]$, $S \in [0, 1]$ and $V \in [0, 1]$. The H component is an angle from 0 to 360,
and different angles represent different colors. The S component varies from 0 to 1; as it grows, the
color becomes more saturated. The V component also varies from 0 to 1.
To extract discriminative information, the HSV color space is adopted for image representation in this work.
To reduce the computational complexity, an appropriate color space quantization is necessary: in
this paper, the H and S components are quantized into 72 and 20 bins respectively.
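For concreteness, a minimal sketch of this color quantization (in Python with OpenCV, assuming OpenCV's 8-bit HSV value ranges, which are an implementation detail not fixed by the paper) is:

```python
import cv2
import numpy as np

def hs_color_histogram(bgr_image, h_bins=72, s_bins=20):
    """Quantize the H and S channels into 72 and 20 bins and return the
    two normalized histograms as the global color feature (92 values).
    Assumes OpenCV's 8-bit HSV ranges: H in [0, 180), S in [0, 256)."""
    hsv = cv2.cvtColor(bgr_image, cv2.COLOR_BGR2HSV)
    h = hsv[:, :, 0].ravel()
    s = hsv[:, :, 1].ravel()
    h_hist, _ = np.histogram(h, bins=h_bins, range=(0, 180))
    s_hist, _ = np.histogram(s, bins=s_bins, range=(0, 256))
    n = float(h.size)
    return np.concatenate([h_hist / n, s_hist / n])
```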
2.2 Local Binary Pattern
The local binary pattern is encoded by comparing a given center pixel with its neighbors. LBP takes $P$
points around the center pixel as neighbors, whose gray values are $g_0, g_1, \dots, g_{P-1}$. The
spatial structure of the neighborhood is encoded as a $P$-bit binary number by comparing the center
pixel with its neighbors:
$$LBP_{P,R} = \sum_{m=0}^{P-1} s(g_m - g_c)\, 2^m \tag{1}$$
$$s(x) = \begin{cases} 1, & \text{if } x \ge 0 \\ 0, & \text{otherwise} \end{cases} \tag{2}$$
where $P$ denotes the number of neighbors, $R$ is the radius of the circle defining the neighborhood, and $g_c$ denotes the gray value of the center pixel.
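As an illustration, Eqs. (1)-(2) with $P = 8$, $R = 1$ can be sketched as follows (the circular neighborhood is approximated by the eight integer-offset neighbors rather than by bilinear interpolation):

```python
import numpy as np

def lbp_8_1(gray):
    """LBP code of Eq. (1) for P = 8, R = 1 on integer offsets."""
    g = gray.astype(np.int32)
    c = g[1:-1, 1:-1]                      # center pixels g_c
    # neighbor offsets, counter-clockwise from the right
    offsets = [(0, 1), (-1, 1), (-1, 0), (-1, -1),
               (0, -1), (1, -1), (1, 0), (1, 1)]
    code = np.zeros_like(c)
    for m, (dy, dx) in enumerate(offsets):
        g_m = g[1 + dy : g.shape[0] - 1 + dy, 1 + dx : g.shape[1] - 1 + dx]
        code += ((g_m - c) >= 0).astype(np.int32) << m   # s(g_m - g_c) * 2^m
    return code
```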
2.3 Local Ternary Pattern
Based on the basic idea of LBP, LTP [22] adopts 3-valued codes to describe the neighboring
relationship among pixels: gray levels in the zone of width $\pm t$ around $g_c$ are quantized to 0,
those above $g_c + t$ are quantized to $+1$, and those below $g_c - t$ are quantized to $-1$. The
binary LBP code is thus replaced by a ternary LTP code:
$$s'(x, g_c, t) = \begin{cases} 1, & x \ge g_c + t \\ 0, & |x - g_c| < t \\ -1, & x \le g_c - t \end{cases} \tag{3}$$
where $t$ is a user-specified threshold.
To facilitate the following discussion, we set the number of neighbors as 8 and the radius of the circle
defining the neighborhood as 1.
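A corresponding sketch of the LTP coding of Eq. (3) for the same 8-neighborhood is given below; the threshold t = 5 is an assumed illustrative value, not a setting from the paper:

```python
import numpy as np

def ltp_codes(gray, t=5):
    """3-valued LTP codes of Eq. (3) for the 8-neighborhood."""
    g = gray.astype(np.int32)
    c = g[1:-1, 1:-1]
    offsets = [(0, 1), (-1, 1), (-1, 0), (-1, -1),
               (0, -1), (1, -1), (1, 0), (1, 1)]
    codes = []
    for dy, dx in offsets:
        g_m = g[1 + dy : g.shape[0] - 1 + dy, 1 + dx : g.shape[1] - 1 + dx]
        s = np.zeros_like(c)
        s[g_m >= c + t] = 1                # above the tolerance zone
        s[g_m <= c - t] = -1               # below the tolerance zone
        codes.append(s)                    # |g_m - g_c| < t stays 0
    return np.stack(codes, axis=-1)        # (H-2, W-2, 8) ternary values
```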
3 Co-occurrence Local Ternary Patterns
The local pattern methods above consider only the relationships between a given center pixel
and its neighbors within a single pattern, and the correlation between adjacent patterns in an image is
overlooked. The co-occurrence between patterns carries more discriminative information than a single
pattern without spatial correlation. Therefore, we consider the co-occurrence of similar adjacent patterns
in this paper to obtain a discriminative texture feature.
3.1 Neighbors Intensity Co-occurrence Local Ternary Patterns (NI-CLTP)
In an image, the brightness level at a point is highly dependent on the brightness levels of its neighbors
unless the image is simply random noise [31]. Inspired by this idea, Liu et al. [32] proposed intensity-
based features that consider the intensities of the neighbors (NI-LBP). However, NI-LBP does not
reveal any information regarding the co-occurrence of patterns in the image.
The ideas behind LBP, LTP and co-occurrence matrix features have motivated us to propose co-
occurrence local ternary patterns (CLTP) for image retrieval. In CLTP, we consider the neighbors'
intensity (NI) for a given center pixel, so the new descriptor is named NI-CLTP. Co-occurrence matrix
features motivate the use of co-occurrence information between patterns. Between adjacent local binary
patterns in a local image patch there are three kinds of pixel pairs: $(x, y) = (1, 1)$, $(x, y) = (0, 0)$,
and $(x, y) = (0, 1)$ or $(x, y) = (1, 0)$. These three cases map naturally onto a ternary code in the
spirit of LTP. Therefore, we code co-occurring pixel pairs with the three values (1, 2, 0), using a
ternary coding scheme similar to that of LTP.
NI-CLTP is calculated from the co-occurrence of similar ternary edges among the surrounding
neighbors of a given center pixel in eight directions, as shown in Fig. 1. The co-occurrence among the
surrounding neighbors is computed from the relationships between the neighbors and the average gray
value of the neighbors (following the neighbor-averaging scheme of Ref. [32]). To
facilitate the discussion, we set the parameters to $P = 8$ and $R = 1$ in NI-CLTP.
Fig. 1. Example of the NI-CLTP calculation (the two average gray values of the neighbors on rings $R$ and $R + 1$ along the eight directions are 99.13 and 42.13)
NI-CLTP for a given center pixel is calculated from the intensity distributions of its neighbors as
follows:
$$T_{P,R}(g_m) = I_{P,R}(g_m) - t; \quad m = 1, 2, \dots, P \tag{4}$$
where $P = 8$ is the number of neighbors, $R = 1$ is the radius of the neighborhood, $I_{P,R}(g_m)$ are the gray values of the surrounding neighbors, and $t = \frac{1}{8}\sum_{m=1}^{P} I_{P,R}(g_m)$.
$$T_{P,R+1}(g_m) = I_{P,R+1}(g_m) - t; \quad m = 1, 2, \dots, P \tag{5}$$
where $I_{P,R+1}(g_m)$ are the gray values of the surrounding neighbors on the outer ring, and $t = \frac{1}{8}\sum_{m=1}^{P} I_{P,R+1}(g_m)$.
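A small sketch of Eqs. (4)-(5) on a single 5 × 5 patch, with the eight neighbors of each ring sampled at integer compass offsets (an assumption; the paper's Fig. 1 defines the exact sampling), may clarify the two thresholds:

```python
import numpy as np

def neighbor_intensity_diffs(patch):
    """Eqs. (4)-(5): differences between each neighbor's gray value and
    the mean of the neighbors, on rings R = 1 and R + 1 = 2 around the
    center of a 5x5 patch."""
    p = patch.astype(np.float64)
    cy, cx = 2, 2
    def ring(radius):
        offs = [(0, 1), (-1, 1), (-1, 0), (-1, -1),
                (0, -1), (1, -1), (1, 0), (1, 1)]
        return np.array([p[cy + dy * radius, cx + dx * radius]
                         for dy, dx in offs])
    I_r1, I_r2 = ring(1), ring(2)
    T_r1 = I_r1 - I_r1.mean()              # Eq. (4), t = mean of ring-1 neighbors
    T_r2 = I_r2 - I_r2.mean()              # Eq. (5), t = mean of ring-2 neighbors
    return T_r1, T_r2
```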
After calculating the intensity distributions of the surrounding neighbors, we code them with the
indicator $s(x)$, so that $T^1_{P,R}(g_m)$ and $T^1_{P,R+1}(g_m)$ are obtained as coded values.
NI-CLTP is then calculated from the co-occurrence of these values as follows:
$$NI\text{-}CLTP = \left[\, s_1\!\left(T^1_{P,R}(g_1),\, T^1_{P,R+1}(g_1)\right),\; s_1\!\left(T^1_{P,R}(g_2),\, T^1_{P,R+1}(g_2)\right),\; \dots,\; s_1\!\left(T^1_{P,R}(g_P),\, T^1_{P,R+1}(g_P)\right) \right] \tag{8}$$
$$s_1(x, y) = \begin{cases} 1, & \text{if } x = y = 1 \\ 2, & \text{if } x = y = 0 \\ 0, & \text{otherwise} \end{cases} \tag{9}$$
NI-CLTP is a 3-valued code with values (1, 2, 0), which is converted into two binary patterns.
Using this method, $2 \times 2^P$ distinct values can be obtained for the NI-CLTP code. To reduce the
computational cost, we use uniform patterns [33] in this work. A pattern is uniform if it contains at
most two bitwise transitions from 1 to 0 or vice versa; all other patterns are non-uniform. For
instance, the patterns 00000000 and 11001111 are uniform, while 00100100 and 01010010 are non-uniform.
With 8 sampling points, the uniform patterns produce 59 output labels, so the feature vector length of
NI-CLTP is $2 \times 59 = 118$.
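The following sketch assembles Eqs. (8)-(9) and the uniform-pattern histograms into the 118-dimensional NI-CLTP feature. Since the coding equations (6)-(7) are not reproduced above, the sketch assumes the ring differences are binarized with the indicator $s(x)$ of Eq. (2):

```python
import numpy as np

def _uniform_lookup():
    """Label table for 8-bit patterns: the 58 uniform patterns (at most
    two circular 0/1 transitions) get labels 0-57, all non-uniform
    patterns share label 58, giving 59 histogram bins."""
    table = np.full(256, 58, dtype=np.int64)
    label = 0
    for v in range(256):
        bits = [(v >> i) & 1 for i in range(8)]
        if sum(bits[i] != bits[(i + 1) % 8] for i in range(8)) <= 2:
            table[v] = label
            label += 1
    return table

def ni_cltp_histogram(T_r1, T_r2):
    """118-d NI-CLTP feature from the ring differences of Eqs. (4)-(5),
    given as (N, 8) arrays for N center pixels. Binarizing T with s(x)
    is our assumption (Eqs. (6)-(7) are missing from this excerpt); the
    co-occurrence coding follows Eqs. (8)-(9), and the 3-valued code is
    split into two binary patterns with 59-bin uniform histograms."""
    b1, b2 = T_r1 >= 0, T_r2 >= 0          # indicator s(x) on each ring
    co = np.zeros(b1.shape, dtype=np.int64)
    co[b1 & b2] = 1                        # s1 = 1: both rings fire
    co[~b1 & ~b2] = 2                      # s1 = 2: neither ring fires
    weights = 1 << np.arange(8)
    p1 = ((co == 1) * weights).sum(axis=1) # first binary pattern
    p2 = ((co == 2) * weights).sum(axis=1) # second binary pattern
    table = _uniform_lookup()
    h1 = np.bincount(table[p1], minlength=59)
    h2 = np.bincount(table[p2], minlength=59)
    return np.concatenate([h1, h2])        # 2 x 59 = 118 features
```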
3.2 Analysis
For a given center pixel, the NI-CLTP code is computed by comparing its neighbors with the average gray
level of its neighbors. NI-CLTP then encodes the co-occurrence of 3-valued edges in eight directions.
Finally, the 3-valued code is converted into two binary patterns ($NI\text{-}CLTP\_1$ and $NI\text{-}CLTP\_2$).
Fig. 2 takes an image from the Corel-10000 database as an example to show the NI-CLTP feature
extraction process.
Fig. 2. Feature vector generation for a given sample image based on NI-CLTP
Fig. 3 compares LBP and NI-CLTP on a $5 \times 5$ sample image patch. The two average gray values of
the local neighbors at $R = 1$ and $R + 1 = 2$ along the eight directions are 99.13 and 42.13. From
Fig. 3 we can see that LBP does not match the visual patterns, whereas NI-CLTP outputs co-occurrence
visual patterns. Therefore, NI-CLTP preserves more weak edge information than LBP.
3.3 Neighbors Intensity Gabor Co-occurrence Local Ternary Patterns (NI-GCLTP)
The Gabor transform (GT) has been widely used in texture analysis, and the joint use of LBP and GT has
proven effective for face recognition [34]. This motivates us to propose Gabor co-occurrence local
ternary patterns (GCLTP) for image retrieval. In GCLTP we again consider the neighbors' intensity (NI)
for a given center pixel, so the descriptor is named NI-GCLTP. We use Gabor filters at different scales
and orientations to process the images.
In this work, all filter parameters are set as in the spatial implementation of the Gabor transform [35]. A 2-D
Gabor function can be defined as follows:
$$f(x, y) = \frac{1}{2\pi\sigma_x\sigma_y}\, e^{-\frac{1}{2}\left(x^2/\sigma_x^2 + y^2/\sigma_y^2\right) + 2\pi j\omega x} \tag{10}$$
where $\sigma_x$ and $\sigma_y$ represent the standard deviations of the Gaussian envelope and $\omega$ represents the frequency of the sinusoid.
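Eq. (10) sampled on a finite window can be sketched as follows (the window size of 31 is an assumed illustrative choice, not a parameter from the paper):

```python
import numpy as np

def gabor_kernel(sigma_x, sigma_y, omega, size=31):
    """Complex 2-D Gabor function of Eq. (10) sampled on a square grid."""
    half = size // 2
    grid = np.mgrid[-half:half + 1, -half:half + 1].astype(np.float64)
    y, x = grid[0], grid[1]
    envelope = np.exp(-0.5 * (x**2 / sigma_x**2 + y**2 / sigma_y**2))
    carrier = np.exp(2j * np.pi * omega * x)   # sinusoid along x
    return envelope * carrier / (2.0 * np.pi * sigma_x * sigma_y)
```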
The Gabor filters can be expressed at multiple orientations and scales as follows:
$$f_{s,d}(x, y) = t^{-s} f(\tilde{x}, \tilde{y}) \tag{11}$$
$$\omega_{s,d} = H, \qquad \sigma_{x,s,d} = \frac{(t+1)\sqrt{2\ln 2}}{2\pi\, t^{s}(t-1)\, L}, \qquad \sigma_{y,s,d} = \frac{1}{2\pi\tan\!\left(\frac{\pi}{2D}\right)\sqrt{\dfrac{H^2}{2\ln 2} - \left(\dfrac{1}{2\pi\sigma_{x,s,d}}\right)^2}}$$
The response of a Gabor filter is given by the convolution of the Gabor window with a given image $I$:
$$F_{s,d}(x, y) = \sum_u \sum_v I(x - u, y - v)\, f_{s,d}(u, v) \tag{12}$$
In this work, we choose Gabor filters at three scales and four orientations ($0°, 45°, 90°, 135°$) to process a
given image. Fig. 4 shows the NI-GCLTP feature extraction process for a filtered image.
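A sketch of Eqs. (11)-(12) for the three-scale, four-orientation bank is given below. The scaling factor t, the envelope widths, the frequency omega and the window size are assumed illustrative values rather than the tuned parameters of [35]; the rotated coordinates follow the usual spatial Gabor construction:

```python
import numpy as np
from scipy.signal import convolve2d

def gabor_bank(scales=3, directions=(0, 45, 90, 135), t=2.0,
               sigma_x=2.0, sigma_y=2.0, omega=0.2, size=31):
    """Bank of scaled and rotated Gabor filters in the spirit of
    Eq. (11): the base filter of Eq. (10) is evaluated on rotated,
    t^-s scaled coordinates and attenuated by t^-s."""
    half = size // 2
    grid = np.mgrid[-half:half + 1, -half:half + 1].astype(np.float64)
    y, x = grid[0], grid[1]
    bank = {}
    for s in range(scales):
        for d in directions:
            th = np.deg2rad(d)
            xr = t**-s * (x * np.cos(th) + y * np.sin(th))
            yr = t**-s * (-x * np.sin(th) + y * np.cos(th))
            env = np.exp(-0.5 * (xr**2 / sigma_x**2 + yr**2 / sigma_y**2))
            bank[(s, d)] = (t**-s * env * np.exp(2j * np.pi * omega * xr)
                            / (2.0 * np.pi * sigma_x * sigma_y))
    return bank

def gabor_responses(image, bank):
    """Eq. (12): responses as 2-D convolutions; the magnitudes of the
    complex responses are kept for the subsequent coding."""
    img = np.asarray(image, dtype=np.float64)
    return {key: np.abs(convolve2d(img, kernel, mode='same'))
            for key, kernel in bank.items()}
```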
For a given center pixel, NI-GCLTP is calculated from the intensity distributions of its neighbors
in the filtered images as follows:
$$F_{P,R,s,m}(g_c)\Big|_{P=8,\,R=1} = \begin{cases} F_{P,R,s,d}(g_m)\big|_{d=0°} - t_1, & m = 1, 5 \\ F_{P,R,s,d}(g_m)\big|_{d=45°} - t_1, & m = 2, 6 \\ F_{P,R,s,d}(g_m)\big|_{d=90°} - t_1, & m = 3, 7 \\ F_{P,R,s,d}(g_m)\big|_{d=135°} - t_1, & m = 4, 8 \end{cases} \tag{13}$$
where $t_1 = \frac{1}{8}\sum_{m=1}^{P} F_{P,R,s,d}(g_m)$ and, for a given center pixel, $F_{P,R,s,d}(g_m)$ denotes the GT response at its neighbors.
$$F_{P,R+1,s,m'}(g_c)\Big|_{P=8,\,R=2} = \begin{cases} F_{P,R+1,s,d}(g_{m'})\big|_{d=0°} - t_2, & m' = 1, 5 \\ F_{P,R+1,s,d}(g_{m'})\big|_{d=45°} - t_2, & m' = 2, 6 \\ F_{P,R+1,s,d}(g_{m'})\big|_{d=90°} - t_2, & m' = 3, 7 \\ F_{P,R+1,s,d}(g_{m'})\big|_{d=135°} - t_2, & m' = 4, 8 \end{cases} \tag{14}$$
where $t_2 = \frac{1}{8}\sum_{m'=1}^{P} F_{P,R+1,s,d}(g_{m'})$ and, for a given center pixel, $F_{P,R+1,s,d}(g_{m'})$ denotes the GT response at its neighbors.
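The directional sampling of Eqs. (13)-(14) can be sketched as below: each of the eight neighbors on rings R = 1 and R = 2 is read from the filtered image whose orientation cycles through 0°, 45°, 90° and 135°, and the ring mean (t1 or t2) is subtracted. The mapping of neighbor indices to compass offsets is our assumption; the sketch continues the gabor_responses fragment above:

```python
import numpy as np

def ni_gcltp_ring_diffs(responses, scale, cy, cx):
    """Ring differences of Eqs. (13)-(14) at center pixel (cy, cx).
    `responses` is the {(scale, direction): filtered image} dict built
    by gabor_responses."""
    dirs = (0, 45, 90, 135, 0, 45, 90, 135)
    offs = ((0, 1), (-1, 1), (-1, 0), (-1, -1),
            (0, -1), (1, -1), (1, 0), (1, 1))
    rings = []
    for radius in (1, 2):                  # R and R + 1
        vals = np.array([responses[(scale, d)][cy + dy * radius,
                                               cx + dx * radius]
                         for (dy, dx), d in zip(offs, dirs)])
        rings.append(vals - vals.mean())   # subtract t1 / t2
    return rings[0], rings[1]
```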
After calculating the intensity distributions of the surrounding neighbors, we code them with the
indicator $s(x)$ as before. NI-GCLTP is then converted into two binary patterns for each scale, in the
same way as NI-CLTP. Finally, the NI-GCLTP histograms are constructed for the different scales (as
shown in Fig. 4).
4 The Proposed System Framework
In this work, we propose a novel image feature representation method, namely NI-CLTP, which
represents color and texture features using the HSV color space. The HSV color space is used because it
closely mimics the human visual system. The H and S component histograms extract global color
information; to choose a reasonable quantization, we test the H component at 18, 36 and 72 bins and the
S component at 10 and 20 bins. The V component is employed to extract the texture feature, because the
V component of the HSV color space is very close to the gray-level conversion of the RGB image.
NI-CLTP extracts the correlation of local neighborhood distributions around the reference pixel in eight
directions and preserves the correlation among patterns, while conventional local patterns only extract
relationships between the given center pixel and its surrounding neighbors; NI-CLTP therefore captures
more information about the correlation of pixels. Regarding the selection of the threshold, the average
gray value of the local neighbors is more robust than the gray value of the center pixel.
The flowchart of the proposed retrieval system framework is shown in Fig. 5. Two aspects of the
method deserve emphasis:
(1) The proposed method considers the co-occurrence of local neighborhood distributions. Conventional
methods only consider the relationships between the center pixel and its neighbors within a single
pattern, summarized by a histogram; they carry no information about the co-occurrence of similar edge
patterns in the image.
(2) As explained in Section 2.1, the HSV color space is better suited than the RGB color space, which
has been confirmed experimentally. The H and S components are used to extract global color information,
and the combination of NI-CLTP and the Gabor transform is applied to extract the texture feature from
the V component.
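Since the algorithm listing itself did not survive the layout here, the following sketch chains the earlier code fragments into one plausible feature extraction pipeline. The exact per-scale/per-orientation arrangement that yields the paper's 1416-dimensional texture vector is not fully specified in this excerpt, so the sketch keeps one 118-bin NI-GCLTP histogram per scale; it assumes the functions from the previous sketches are in scope:

```python
import cv2
import numpy as np

def extract_features(bgr_image):
    """End-to-end feature sketch for Section 4: 72 + 20 H/S color bins
    plus one 118-bin NI-GCLTP histogram per Gabor scale on the V
    channel (slow reference loops, equal-weight fusion as in the paper)."""
    color = hs_color_histogram(bgr_image)              # 92 color features
    v = cv2.cvtColor(bgr_image, cv2.COLOR_BGR2HSV)[:, :, 2].astype(np.float64)
    responses = gabor_responses(v, gabor_bank())
    texture = []
    for s in range(3):                                 # three scales
        t1s, t2s = [], []
        for cy in range(2, v.shape[0] - 2):            # margin for R = 2
            for cx in range(2, v.shape[1] - 2):
                t1, t2 = ni_gcltp_ring_diffs(responses, s, cy, cx)
                t1s.append(t1)
                t2s.append(t2)
        texture.append(ni_cltp_histogram(np.array(t1s), np.array(t2s)))
    return np.concatenate([color] + texture)
```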
Good feature representations and similarity measures are the two crucial factors in CBIR that directly
affect retrieval precision. In this work, several common similarity measures are used for similarity
matching: the L2 distance, the L1 distance, the weighted L1 distance and the improved Canberra
distance, as described in [36-38]. Let $Y = [y_1, y_2, \dots, y_n]$ be the feature vector of a query image
and $X = [x_1, x_2, \dots, x_n]$ be the feature vector of an image in the database. The similarity
measures are:
L2 distance: $$D(X, Y) = \left(\sum_{i=1}^{n} (x_i - y_i)^2\right)^{1/2} \tag{18}$$
L1 distance: $$D(X, Y) = \sum_{i=1}^{n} |x_i - y_i| \tag{19}$$
weighted L1 distance: $$D(X, Y) = \sum_{i=1}^{n} \frac{|x_i - y_i|}{1 + x_i + y_i} \tag{20}$$
improved Canberra distance: $$D(X, Y) = \sum_{i=1}^{n} \frac{|x_i - y_i|}{|x_i + u_x| + |y_i + u_y|} \tag{21}$$
where $u_x = \sum_{i=1}^{n} \frac{x_i}{n}$, $u_y = \sum_{i=1}^{n} \frac{y_i}{n}$, and $n$ is the length of the feature vector.
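Eqs. (20)-(21) translate directly into code; Eqs. (18)-(19) are the ordinary Euclidean and city-block norms:

```python
import numpy as np

def weighted_l1(x, y):
    """Weighted L1 distance of Eq. (20)."""
    return np.sum(np.abs(x - y) / (1.0 + x + y))

def improved_canberra(x, y):
    """Improved Canberra distance of Eq. (21); the feature-vector means
    u_x and u_y act as smoothing factors in the denominator."""
    u_x, u_y = x.mean(), y.mean()
    return np.sum(np.abs(x - y) / (np.abs(x + u_x) + np.abs(y + u_y)))
```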
5 Experimental Results and Discussions
To analyze the performance of the proposed method for image retrieval, several experiments were
conducted on three databases. The same preprocessing and color quantization (72 bins for the H
component, 20 bins for the S component; the dimension of the V-component feature equals the feature
vector length of GLBP, GLCP, GCSLBP, GCLBP, GLTP or GLMP) were applied to the selected baseline
methods to evaluate the proposed method fairly. The abbreviations of the different methods are given in
Table 1.
5.1 Databases
In our experiments we use the Corel databases, which are commonly employed in the image retrieval
field. All images come from the Corel Gallery Magic 200,000 (8 CDs) collection, which contains images
of various contents. Three Corel databases are used, as shown in Fig. 6. The first is the Corel-1000
database [39], which consists of 1000 JPEG images of size 256 × 384 or 384 × 256, divided into 10
categories. The second is the Corel-5000 database, which contains 5000 JPEG images in 50 categories,
each of size 192 × 128 or 128 × 192. The third is the Corel-10000 database [40], which consists of
10000 images in 100 categories.
Precision and recall are used to validate retrieval performance [41]. Precision is the ratio between the
number of relevant images retrieved, $S$, and the number of retrieved images, $N$. Recall is the ratio
between the number of relevant images retrieved and the total number of relevant images in the
database, $T$:
$$P = \frac{S}{N} \tag{22}$$
$$R = \frac{S}{T} \tag{23}$$
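In code, Eqs. (22)-(23) for one query reduce to a relevance count:

```python
def precision_recall(retrieved_labels, query_label, total_relevant):
    """Precision (Eq. (22)) over the N retrieved images and recall
    (Eq. (23)) against the T relevant images in the database."""
    s = sum(1 for label in retrieved_labels if label == query_label)
    return s / len(retrieved_labels), s / total_relevant
```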
In the experiments, every image in the database is used in turn as the query image. First, we examine the
performance of the proposed method under different quantization levels in the HSV color space and fix
the final quantization levels of the H and S components. Second, we show that the improved Canberra
distance is the most suitable for our retrieval system. Third, the proposed method is compared with other
image retrieval methods.
In the HSV color space, different quantization levels of the H and S components are used to evaluate the
retrieval performance of the proposed method. Table 2 reports the precision and recall of the proposed
method under different quantization levels on the Corel databases.
In the experiments, the H component is set to 18, 36 and 72 bins and the S component to 10 and 20 bins.
According to the retrieval results, the proposed method performs best with H = 72 and S = 20 on all
Corel databases. However, when the color quantization level is too high, the complexity of the method
increases and the retrieval precision may decrease. Thus, we choose 72 bins for the H component and 20
bins for the S component to characterize the color feature.
Table 2. Precision (number of retrieved images: 10) and recall (number of retrieved images: 50) of the proposed method with different quantization schemes on the three databases, based on the HSV color space. Bold values indicate the best results
Quantization level | Corel-1000 Precision (%) | Corel-1000 Recall (%) | Corel-5000 Precision (%) | Corel-5000 Recall (%) | Corel-10000 Precision (%) | Corel-10000 Recall (%)
HSV(18_10_1416) | 75.95 | 29.43 | 34.39 | 8.67 | 54.86 | 16.12
HSV(18_20_1416) | 76.54 | 29.88 | 34.95 | 8.97 | 55.21 | 16.33
HSV(36_10_1416) | 76.91 | 30.06 | 35.76 | 9.14 | 55.59 | 16.49
HSV(36_20_1416) | 77.54 | 30.45 | 36.45 | 9.40 | 55.99 | 16.71
HSV(72_10_1416) | 78.01 | 30.67 | 36.88 | 9.58 | 56.10 | 16.84
HSV(72_20_1416) | **78.43** | **31.05** | **37.54** | **9.83** | **56.49** | **17.04**
Four different measures (the L1, L2 and weighted L1 distances and the improved Canberra distance)
are used to measure the similarity between the query image and the database images. Table 3 lists the
precision and recall of the proposed method with the four similarity measures, with the number of
retrieved images varying from 10 to 30. The results show that the improved Canberra distance gives
better results than the other similarity measures. The L2 distance is not always the best metric because it
over-emphasizes features that are greatly dissimilar. The L1 and weighted L1 distances are also common
similarity measures, but they are less suitable for the proposed retrieval system. As shown in Eq. (21),
when the weight parameter is taken into account and $u_x$, $u_y$ are used as smoothing factors [37, 42],
the performance of the proposed method increases. We therefore adopt the improved Canberra distance
as the similarity measure between images in the proposed retrieval system.
Table 3. Retrieval precision and recall of the proposed method with different distance measures on the Corel-1000 database (columns: number of retrieved images)

Distance measure | Precision 10 | 15 | 20 | 25 | 30 | Recall 10 | 15 | 20 | 25 | 30
L1 distance | 77.73 | 74.39 | 71.94 | 69.64 | 67.64 | 7.77 | 11.16 | 14.39 | 17.41 | 20.29
L2 distance | 74.24 | 70.79 | 68.28 | 66.39 | 64.54 | 7.42 | 10.62 | 13.66 | 16.60 | 19.36
weighted L1 distance | 78.24 | 74.78 | 72.51 | 70.35 | 68.47 | 7.82 | 11.22 | 14.50 | 17.59 | 20.54
improved Canberra distance | 78.43 | 75.33 | 72.75 | 70.52 | 68.56 | 7.84 | 11.30 | 14.55 | 17.63 | 20.57
Fig. 7 plots the precision-recall curves of the different methods on the Corel-1000 database. The
precision of the proposed method reaches 78.43%, which is much better than the other methods: the
precisions of HSV(72-20-708) GLBP, HSV(72-20-972) GLCP, HSV(72-20-192) GCSLBP,
HSV(72-20-1416) GCLBP, HSV(72-20-1416) GLTP and HSV(72-20-2124) GLMP are only 72.97%,
70.03%, 71.37%, 74.39%, 75.14% and 71.19%, respectively. To evaluate the robustness of the proposed
method per image category, we also compare it with the other methods for each category of the
Corel-1000 database, with the number of retrieved images set to 20. Table 4 shows the per-category
precision and recall (in Table 4, the above six methods are abbreviated to GLBP, GLCP, GCSLBP,
GCLBP, GLTP and GLMP). The precision and recall of the proposed method are significantly higher
than the other descriptors in most categories. They are lower than the other methods on the African,
dinosaur and food categories; GLTP is higher in the flower category; and in the horse category the
proposed method beats GLTP but falls below the remaining methods. Overall, the proposed method is
more robust than the other methods in most categories.
Fig. 7. The performance comparison of proposed method with other methods in terms of precision and
recall on Corel-1000 database
Table 4. The individual category retrieval results of proposed method and other methods on Corel-1000
database
Category | Performance (%) | GLBP | GLCP | GCSLBP | GCLBP | GLTP | GLMP | PM
African | Precision | 72.90 | 69.50 | 70.35 | 72.05 | 68.55 | 70.50 | 68.15
African | Recall | 14.58 | 13.90 | 14.07 | 14.41 | 13.71 | 14.10 | 13.63
Beach | Precision | 37.50 | 32.85 | 33.35 | 38.15 | 42.15 | 34.75 | 54.15
Beach | Recall | 7.50 | 6.57 | 6.67 | 7.63 | 8.43 | 6.95 | 10.83
Building | Precision | 56.75 | 54.75 | 55.55 | 55.85 | 53.25 | 55.45 | 60.95
Building | Recall | 11.35 | 10.95 | 11.11 | 11.17 | 10.65 | 11.09 | 12.19
Bus | Precision | 82.35 | 74.10 | 76.05 | 85.65 | 90.80 | 79.60 | 94.35
Bus | Recall | 16.47 | 14.82 | 15.21 | 17.13 | 18.16 | 15.92 | 18.87
Dinosaur | Precision | 98.75 | 98.35 | 98.65 | 98.75 | 98.80 | 98.10 | 97.55
Dinosaur | Recall | 19.75 | 19.67 | 19.73 | 19.75 | 19.76 | 19.62 | 19.51
Elephant | Precision | 46.55 | 44.10 | 44.75 | 48.95 | 49.85 | 45.60 | 64.50
Elephant | Recall | 9.31 | 8.82 | 8.95 | 9.79 | 9.97 | 9.12 | 12.90
Flower | Precision | 72.85 | 68.95 | 78.35 | 78.70 | 92.25 | 68.55 | 89.25
Flower | Recall | 14.57 | 13.79 | 15.67 | 15.74 | 18.45 | 13.71 | 17.85
Horse | Precision | 91.65 | 91.50 | 90.90 | 90.65 | 83.70 | 91.10 | 85.70
Horse | Recall | 18.33 | 18.30 | 18.18 | 18.13 | 16.74 | 18.22 | 17.14
Mountain | Precision | 46.10 | 40.80 | 41.00 | 48.05 | 42.45 | 44.95 | 48.50
Mountain | Recall | 9.22 | 8.16 | 8.20 | 9.61 | 8.49 | 8.99 | 9.70
Food | Precision | 67.80 | 66.30 | 67.70 | 68.95 | 70.35 | 65.40 | 64.35
Food | Recall | 13.56 | 13.26 | 13.54 | 13.79 | 14.07 | 13.08 | 12.87
On the Corel-5000 database, the precision-recall curves are shown in Fig. 8(a). The precision of the
proposed method reaches 37.54%, higher than the other methods. Fig. 8(b) and Fig. 8(c) show the
per-category precision and recall of the proposed and other methods on the Corel-5000 database; the
number of retrieved images is set to 10 when computing the precision and to 50 when computing the
recall. From Fig. 8(b) and Fig. 8(c), it is clear that the precision and recall of the proposed method
outperform the other methods in most categories.
Fig. 8. Comparison on the Corel-5000 database: (a) precision-recall curves of the different methods; (b) precision for each image category; (c) recall for each image category
On the Corel-10000 database, Fig. 9(a) plots the precision-recall curves and shows that the proposed
method outperforms the other methods: its precision is 56.50%, while the precisions of the other
methods are below 50%. Fig. 9(b) and Fig. 9(c) show the per-category precision and recall of the
different methods on the Corel-10000 database; the number of retrieved images is set to 10 when
computing the precision and to 50 when computing the recall. From Fig. 9(b) and Fig. 9(c), it is clear
that the precision and recall of the proposed method outperform the other methods in most categories.
Fig. 9. Comparison on the Corel-10000 database: (a) precision-recall curves of the different methods; (b) precision for each image category; (c) recall for each image category
Fig. 10 shows two retrieval examples on the Corel-10000 database using the proposed method. In
Fig. 10(a), the query image (No. 5178) is a stamp, and the top 20 retrieved images have texture and
color very similar to the query. In Fig. 10(b), the query image (No. 3768) is a car, and all of the top 20
retrieved images are correct. These examples confirm the good discrimination power of the proposed
method.
Fig. 10. The image retrieval results (stamps and cars) of proposed method on Corel-10000 database
(The top-left image is the query image)
We compare our method with other existing methods in terms of precision and recall on the Corel-1000
database, with the number of retrieved images set to 20. As shown in Table 5, the proposed method
outperforms the other methods.
Table 5. Comparison of different image retrieval methods with 20 returns on the Corel-1000 database

Database | Performance (%) | LBP [19] | Method [43] | Method [44] | Method [45] | Proposed method
Corel-1000 | Precision | 61.80 | 58.20 | 72.51 | 57.85 | 72.75
Corel-1000 | Recall | 12.36 | 11.64 | 14.50 | 11.57 | 14.55
To further test the proposed method, we compare it with LBP and a CNN-based method. Table 6 lists
the precision and recall of the different methods on the Corel-5000 database, with the number of
retrieved images varying from 10 to 50; the proposed method is better than both LBP and the CNN.
We also compare our method with other existing methods in terms of precision and recall on the
Corel-10000 database, with 12 retrieved images. According to Table 7, the precision and recall of the
proposed method outperform the other methods.
26
Journal of Computers Vol. 29, No. 4, 2018
Table 7. Comparison of different image retrieval methods with 12 returns on the Corel-10000 database

Database | Performance (%) | LBP [19] | Method [46] | Method [47] | Method [45] | Proposed method
Corel-10000 | Precision | 35.84 | 45.24 | 47.25 | 46.67 | 53.81
Corel-10000 | Recall | 4.30 | 5.43 | 5.67 | 5.60 | 6.46
5.5 Discussion
LBP is invariant to monotonic intensity changes and hence robust to illumination and contrast
variations, but it is sensitive to noise and small pixel-value fluctuations. LTP is more noise-resistant
than LBP, but selecting its additional threshold is not simple. CLBP preserves additional distinguishing
information by combining the sign, center-pixel and magnitude descriptors, but its quantization of
neighbor pixels into binary form through a global threshold is coarse. CS-LBP is a modified version of
LBP that compares center-symmetric pairs of pixels, but it performs well only for interest region
description. LCP combines microscopic configuration modeling with LBP to capture local structure
information; although LCP is invariant to illumination and image rotation, it is not well suited to texture
image retrieval. LMP encodes the relationships among the surrounding neighbors; although it performs
well in biomedical image retrieval, it has limitations in natural image retrieval.
LBP and LBP-like methods thus show impressive performance in texture analysis but still have
limitations: they disregard the co-occurrences between patterns in images. NI-CLTP is proposed to
solve this problem by considering the co-occurrence of similar ternary edges among the surrounding
neighbors of a given center pixel in an image. With this method, the threshold selection is more robust
than using the center pixel, and the advantages of the ternary coding scheme and of co-occurrence are
integrated. Experimental results demonstrate that the proposed method performs better than LBP-like
methods in CBIR.
Table 8 shows the feature vector dimensions of the various methods. The feature vector of the proposed
method is longer than those of HSV(72-20-708) $GLBP^{u2}_{8,1}$, HSV(72-20-972) $GLCP^{u2}_{8,1}$
and HSV(72-20-192) $GCSLBP^{u2}_{8,1}$, but its performance is far better than the other existing
methods on the different databases, and a feature size of 1508 is not a serious obstacle in practice. In
future work, we will reduce the feature vector dimension of the proposed method while keeping its
retrieval precision.
6 Conclusion
This paper proposes a novel image retrieval algorithm, namely co-occurrence local ternary patterns
(CLTP), using the HSV color space. CLTP considers the neighbors' intensity (NI) for a given center
pixel, from which a new operator, NI-CLTP, is defined. NI-CLTP encodes the intensity-based co-
occurrence of local ternary edges among the surrounding neighbors of a given center pixel in eight
directions, which distinguishes it from existing LBP-like methods. The color feature is extracted from
the H and S components of the HSV color space, while the V component is processed by the
combination of NI-CLTP and the Gabor transform; the combined feature vector thus captures both
color and texture. Extensive experimental results show that the proposed method outperforms other
methods in terms of precision and recall on benchmark image databases.
Although the proposed algorithm achieves superior performance by combining color and texture, its
feature vector dimension needs to be reduced, and we plan to lower the complexity of the method in the
future. In addition, color and texture are weighted equally in this paper, yet weak features can degrade
retrieval performance when fused directly with strong ones; the weight distribution of multi-feature
fusion therefore also deserves further study.
Acknowledgements
This work was supported by the National Natural Science Foundation of China (Grants 61370200,
61672130 and 61602082), the China Postdoctoral Science Foundation (2015M581331), and the Science
and Technology Research and Development Plan Project of Handan, Hebei Province (1721203049-1).
References
[1] Y. Rui, T.S. Huang, S.F. Chang, Image retrieval: current techniques, promising directions, and open issues, Journal of Visual
Communication and Image Representation 10(1)(1999) 39-62.
[2] M.E. ElAlami, A new matching strategy for content based image retrieval system, Applied Soft Computing 14(2014) 407-
418.
[3] O.A. Penatti, F.B. Silva, E. Valle, V. Gouet-Brunet, R.D.S. Torres, Visual word spatial arrangement for image retrieval and
[5] M.J. Swain, D.H. Ballard, Color indexing, International Journal of Computer Vision 7(1)(1991) 11-32.
[6] M.A. Stricker, M. Orengo, Similarity of color images, in: Proc. International Society for Optics and Photonics IS&T/SPIE’s
[8] G. Pass, R. Zabih, J. Miller, Comparing images using color coherence vectors, in: Proc. of the fourth ACM International
Conference on Multimedia, 1997.
[9] J.P. Jones, L.A. Palmer, An evaluation of the two-dimensional Gabor filter model of simple receptive fields in cat striate
[11] J. Han, K.K. Ma, Rotation-invariant and scale-invariant Gabor features for texture image retrieval, Image and Vision
Computing 25(9)(2007) 1474-1481.
[12] R.M. Haralick, K. Shanmugam, Textural features for image classification, IEEE Transactions on Systems, Man, and
[14] A. Baraldi, F. Parmiggiani, An investigation of the textural characteristics associated with gray level cooccurrence matrix
statistical parameters, IEEE Transactions on Geoscience and Remote Sensing 33(2)(1995) 293-304.
[15] L.S. Davis, S.A. Johns, J.K. Aggarwal, Texture analysis using generalized co-occurrence matrices, IEEE Transactions on
[16] G.H. Liu, J.Y. Yang, Z. Li, Content-based image retrieval using computational visual attention model, Pattern Recognition
48(8)(2015) 2554-2566.
[17] F.R. de Siqueira, W.R. Schwartz, H. Pedrini, Multi-scale gray level co-occurrence matrices for texture description,
[19] T. Ojala, M. Pietikainen, T. Maenpaa, Multiresolution gray-scale and rotation invariant texture classification with local
binary patterns, IEEE Transactions on Pattern Analysis and Machine Intelligence 24(7)(2002) 971-987.
[20] D. Huang, C. Shan, M. Ardabilian, Y. Wang, L. Chen, Local binary patterns and its application to facial image analysis: a
survey, IEEE Transactions on Systems, Man, and Cybernetics, Part C (Applications and Reviews) 41(6)(2011) 765-781.
[21] M. Heikkilä, M. Pietikäinen, C. Schmid, Description of interest regions with local binary patterns, Pattern Recognition
42(3)(2009) 425-436.
[22] X. Tan, B. Triggs, Enhanced local texture feature sets for face recognition under difficult lighting conditions, IEEE
Transactions on Image Processing 19(6)(2010) 1635-1650.
[23] Y. Guo, G. Zhao, M. Pietikäinen, Texture classification using a linear configuration model based descriptor, in: Proc.
British Machine Vision Conference, 2011.
[24] Z. Guo, L. Zhang, D. Zhang, A completed modeling of local binary pattern operator for texture classification, IEEE
Transactions on Image Processing 19(6)(2010) 1657-1663.
[25] S. Murala, Q.J. Wu, Local mesh patterns versus local binary patterns: biomedical image indexing and retrieval, IEEE
[26] G.L. Shen, X.J. Wu, Content based image retrieval by combining color, texture and CENTRIST, in: Proc. 2013
Constantinides International Workshop on Signal Processing, 2013.
[27] G.H. Liu, L. Zhang, Y.K. Hou, Z.Y. Li, J.Y. Yang, Image retrieval based on multi-texton histogram, Pattern Recognition
43(7)(2010) 2380-2389.
[28] M.E. ElAlami, A novel image retrieval model based on the most relevant features, Knowledge-Based Systems 24(1)(2011)
23-32.
[29] M. Singha, K. Hemachandran, Performance analysis of color spaces in image retrieval, Assam University Journal of
Science and Technology 7(2)(2011) 94-104.
[30] A. Vadivel, S. Sural, A.K. Majumdar, An integrated color and intensity co-occurrence matrix, Pattern Recognition Letters
28(8)(2007) 974-983.
[31] G.R. Cross, A.K. Jain, Markov random field texture models, IEEE Transactions on Pattern Analysis and Machine
[32] L. Liu, L. Zhao, Y. Long, G. Kuang, P. Fieguth, Extended local binary patterns for texture classification, Image and Vision
Computing 30(2)(2012) 86-99.
[33] Z. Guo, L. Zhang, D. Zhang, Rotation invariant texture classification using LBP variance (LBPV) with global matching,
[34] W. Zhang, S. Shan, W. Gao, X. Chen, H. Zhang, Local Gabor binary pattern histogram sequence (LGBPHS): a novel non-
statistical model for face representation and recognition, in: Proc. the Tenth IEEE International Conference on Computer
Vision, 2005.
[35] S. Murala, R.P. Maheshwari, R. Balasubramanian, Local tetra patterns: a new feature descriptor for content-based image
retrieval, IEEE Transactions on Image Processing 21(5)(2012) 2874-2886.
[36] S. Antani, R. Kasturi, R. Jain, A survey on the use of pattern recognition methods for abstraction, indexing and retrieval of
198.
[38] G.H. Liu, Z.Y. Li, L. Zhang, Y. Xu, Image retrieval based on micro-structure descriptor, Pattern Recognition 44(9)(2011)
2123-2133.
[39] J.Z. Wang, J. Li, G. Wiederhold, SIMPLIcity: Semantics-sensitive integrated matching for picture libraries, IEEE
[42] G.N. Lance, W.T. Williams, Mixed-data classificatory programs I - agglomerative systems, Australian Computer Journal
1(1)(1967) 15-20.
[43] H.A. Jalab, Image retrieval system based on color layout descriptor and Gabor filters, in: Proc. 2011 IEEE Conference on
Open Systems (ICOS), 2011.
[44] M. Subrahmanyam, Q.J. Wu, R.P. Maheshwari, R. Balasubramanian, Modified color motif co-occurrence matrix for image
[45] A. Babenko, A. Slesarev, A. Chigorin, V. Lempitsky, Neural codes for image retrieval, in: Proc. European Conference on
Computer Vision, 2014.
[46] G.H. Liu, J.Y. Yang, Content-based image retrieval using color difference histogram, Pattern Recognition 46(1)(2013) 188-
198.
[47] S. Zeng, R. Huang, H. Wang, Z. Kang, Image retrieval using spatiograms of colors quantized by Gaussian mixture models,