Texture Features for Browsing and Retrieval of Image Data: Manjunath and W.Y
2 TEXTURE FEATURE EXTRACTION
2.1 Gabor Functions and Wavelets
A two dimensional Gabor function g(x, y) and its Fourier transform G(u, v) can be written as:
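For reference, the standard form of this Gabor pair is sketched below. The width parameters $\sigma_x$, $\sigma_y$, $\sigma_u$, $\sigma_v$ are not defined in the surrounding text and are assumed here to be the usual Gaussian standard deviations in space and frequency; $W$ is the modulation frequency. The labels (1) and (2) are chosen to agree with the later reference to $G(0, 0)$ in (2).

```latex
g(x, y) = \frac{1}{2\pi\sigma_x\sigma_y}
          \exp\!\left[-\frac{1}{2}\left(\frac{x^2}{\sigma_x^2} + \frac{y^2}{\sigma_y^2}\right) + 2\pi j W x\right] \tag{1}

G(u, v) = \exp\!\left\{-\frac{1}{2}\left[\frac{(u - W)^2}{\sigma_u^2} + \frac{v^2}{\sigma_v^2}\right]\right\},
\qquad \sigma_u = \frac{1}{2\pi\sigma_x},\quad \sigma_v = \frac{1}{2\pi\sigma_y} \tag{2}
```

With these definitions, $G$ is a Gaussian in the frequency plane centered at $(W, 0)$, which is why each filter can be read as a band-pass detector tuned to one frequency and orientation.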
RETRIEVAL of image data based on pictorial queries is an interesting and challenging problem. The recent emergence of multimedia
databases and digital libraries makes this problem all the more
important. While manual image annotations can be used to a certain extent to help image search, the feasibility of such an approach for large databases is questionable. In some cases,
such as face or texture patterns, simple textual descriptions can be
ambiguous and often inadequate for database search.
The objective of this paper is to study the use of texture as an
image feature for pattern retrieval. An image can be considered as
a mosaic of different texture regions, and the image features associated with these regions can be used for search and retrieval. A
typical query could be a region of interest provided by the user,
such as outlining a vegetation patch in a satellite image. The input
information in such cases is an intensity pattern or texture within a
rectangular window. See Fig. 6 for an example of a texture based
browsing application.
Texture analysis has a long history, and texture analysis algorithms range from using random field models to multiresolution
filtering techniques such as the wavelet transform. Several researchers have considered the use of such texture features for pattern retrieval [18], [19]. This paper focuses on a multiresolution
representation based on Gabor filters. The use of Gabor filters in
extracting textured image features is motivated by various factors.
The Gabor representation has been shown to be optimal in the
sense of minimizing the joint two-dimensional uncertainty in
space and frequency [4]. These filters can be considered as orientation and scale tunable edge and line (bar) detectors, and the statistics of these microfeatures in a given region are often used to
characterize the underlying texture information. Gabor features
have been used in several image analysis applications including
texture classification and segmentation [1], [14], image recognition
[5], [8], [13], image registration, and motion tracking [15].
The main contributions of this paper are summarized below:
1) A simple texture feature representation based on Gabor
features is proposed, and a filter design strategy is sug-
The authors are with the Department of Electrical and Computer Engineering,
University of California at Santa Barbara, Santa Barbara, CA 93106-9560.
E-mail: manj@ece.ucsb.edu, wei@iplab.ece.ucsb.edu.
Manuscript received Dec. 36,1994. Recommended for acceptance by R. Picard.
For information on obtaining reprints of this article, please send e-mail to:
transpami@computer.org, and reference IEEECS Log Number P96055.
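$$g_{mn}(x, y) = a^{-m}\, g(x', y'), \qquad a > 1,\quad m, n = \text{integer},$$
$$x' = a^{-m}(x\cos\theta + y\sin\theta), \qquad y' = a^{-m}(-x\sin\theta + y\cos\theta),$$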
where $W = U_h$ and $m = 0, 1, \ldots, S - 1$. In order to eliminate sensitivity of the filter response to absolute intensity values, the real
(even) components of the 2D Gabor filters are biased by adding a
constant to make them zero mean (this can also be done by setting
$G(0, 0)$ in (2) to zero).
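The construction above can be sketched as a small frequency-domain filter bank. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `gabor_bank`, the geometric scale factor `a`, the orientation angles `theta = n * pi / K`, and both width formulas are assumptions of this sketch (`sigma_u` is chosen so that half-peak contours of adjacent scales touch; `sigma_v` is a simple angular-spacing heuristic). The zero-mean condition is imposed by setting the DC term to zero, as the text suggests.

```python
import numpy as np

def gabor_bank(size=64, U_l=0.05, U_h=0.4, S=4, K=6):
    """Return an (S, K, size, size) array of frequency-domain Gabor filters."""
    a = (U_h / U_l) ** (1.0 / (S - 1))           # assumed geometric scale spacing
    c = np.sqrt(2.0 * np.log(2.0))               # half-peak radius in units of sigma
    sigma_u = (a - 1.0) * U_h / ((a + 1.0) * c)  # assumed: adjacent scales touch
    sigma_v = np.tan(np.pi / (2 * K)) * U_h / c  # assumed angular-spacing heuristic
    f = np.fft.fftfreq(size)                     # cycles per pixel, FFT ordering
    u, v = np.meshgrid(f, f, indexing="xy")
    bank = np.empty((S, K, size, size))
    for m in range(S):
        s = a ** (-m)                            # dilation shrinks center and widths
        for n in range(K):
            theta = n * np.pi / K                # assumed orientation angles
            ur = u * np.cos(theta) + v * np.sin(theta)
            vr = -u * np.sin(theta) + v * np.cos(theta)
            G = np.exp(-0.5 * (((ur - s * U_h) / (s * sigma_u)) ** 2
                               + (vr / (s * sigma_v)) ** 2))
            G[0, 0] = 0.0                        # zero DC response: zero-mean filter
            bank[m, n] = G
    return bank
```

Calling `gabor_bank(size=128)` with the defaults gives 24 filters matched to a 128 x 128 pattern; scale index `m = 0` is the highest center frequency `U_h` and `m = S - 1` the lowest, `U_l`.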
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, VOL. 18, NO. 8, AUGUST 1996
the top 15 retrievals are from the same large image. The performance is measured in terms of the average retrieval rate which is
defined as the average percentage number of patterns belonging to
the same image as the query pattern in the top 15 matches.
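The retrieval-rate measure just defined can be sketched as follows. This is a minimal illustration, assuming that the query itself is excluded from its own result list and that feature vectors are compared with an L1 distance after dividing each component by its standard deviation over the database; both choices, and the name `average_retrieval_rate`, are assumptions of this sketch.

```python
import numpy as np

def average_retrieval_rate(features, image_ids, top_k=15):
    """Average percentage of top_k matches that share the query's source image."""
    features = np.asarray(features, dtype=float)
    image_ids = np.asarray(image_ids)
    scale = features.std(axis=0)
    scale[scale == 0] = 1.0                  # guard against constant components
    z = features / scale
    rates = []
    for q in range(len(z)):
        d = np.abs(z - z[q]).sum(axis=1)     # normalized L1 distance to every pattern
        d[q] = np.inf                        # exclude the query itself (assumption)
        nearest = np.argsort(d)[:top_k]
        rates.append(np.mean(image_ids[nearest] == image_ids[q]))
    return 100.0 * float(np.mean(rates))
```

Here `image_ids[i]` identifies the large image from which pattern `i` was cut, so a perfect feature would score 100 when every source image contributes at least `top_k + 1` patterns.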
We observe that the use of the $\sigma_{mn}$ feature in addition to the mean
improves the retrieval performance considerably. This perhaps
explains the low classification rate of the Gabor filters reported in
[3], where only the mean value was used. On average, 74.37% of
the correct patterns are in the top 15 retrieved images. The performance increases to 92% if the top 100 (about 6% of the entire
database) retrievals are considered instead (i.e., more than 13 of
the 15 correct patterns are present). Some retrieval examples are
shown in Fig. 2. A detailed comparison with other texture features
is given in Section 4.
Fig. 1. The contours indicate the half-peak magnitude of the filter responses in the Gabor filter dictionary. The filter parameters used are
$U_h = 0.4$, $U_l = 0.05$, $K = 6$, and $S = 4$.
$$\mu_{mn} = \iint |W_{mn}(x, y)|\,dx\,dy, \quad \text{and} \quad \sigma_{mn} = \sqrt{\iint \left(|W_{mn}(x, y)| - \mu_{mn}\right)^2 dx\,dy}. \quad (6)$$
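A feature vector is now constructed using $\mu_{mn}$ and $\sigma_{mn}$ as feature components. In the experiments, we use four scales $S = 4$ and
six orientations $K = 6$, resulting in a feature vector
$$\bar{f} = [\mu_{00}\ \sigma_{00}\ \mu_{01}\ \ldots\ \mu_{35}\ \sigma_{35}]. \quad (7)$$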
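A minimal sketch of how such a feature vector could be computed from a bank of frequency-domain filters is given below. The function name `gabor_features` and the `(S, K, H, W)` layout of `bank` are assumptions of this sketch, and the statistics are taken as the area-normalized mean and standard deviation of the response magnitude rather than the raw integrals.

```python
import numpy as np

def gabor_features(pattern, bank):
    """Interleaved [mean, std] of |W_mn| for every filter in an (S, K, H, W) bank."""
    F = np.fft.fft2(pattern)
    feats = []
    for m in range(bank.shape[0]):           # scales
        for n in range(bank.shape[1]):       # orientations
            W_mn = np.abs(np.fft.ifft2(F * bank[m, n]))  # response magnitude
            feats.extend([W_mn.mean(), W_mn.std()])
    return np.array(feats)
```

With four scales and six orientations the result has 24 x 2 = 48 components, ordered scale by scale.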
where
[Figure: block diagram; recoverable labels: input pattern, FFT, #4 selected filters]
where $F_{input}(u, v)$ is the Fourier transform of the input image pattern. $F_{\mu}(u, v)$ and $F_{\sigma}(u, v)$ are the mean and variance associated
with the distribution of Fourier transforms of all image patterns in
the database. $D(u, v)$ is essentially the energy of the difference normalized by the variance associated with each frequency component $(u, v)$. Each filter is evaluated based on the total difference energy within its spectral coverage:
c,,.
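The selection idea described here can be sketched as follows. The exact expressions for $D(u, v)$ and the per-filter score are not legible in this copy, so the forms used below are assumptions: the squared difference between the input spectrum magnitude and the database mean, divided by the database variance, and a score obtained by weighting that difference energy with each filter's frequency response. The names `select_filters`, `F_mean`, `F_var`, and `n_select` are hypothetical.

```python
import numpy as np

def select_filters(pattern, F_mean, F_var, bank, n_select=4, eps=1e-12):
    """Rank filters in an (S, K, H, W) bank by normalized difference energy."""
    F_input = np.abs(np.fft.fft2(pattern))
    D = (F_input - F_mean) ** 2 / (F_var + eps)   # assumed form of D(u, v)
    # Assumed score: difference energy weighted by each filter's response.
    scores = np.tensordot(bank, D, axes=([2, 3], [0, 1]))   # shape (S, K)
    S, K = scores.shape
    order = np.argsort(scores, axis=None)[::-1][:n_select]
    chosen = [tuple(int(i) for i in np.unravel_index(k, (S, K))) for k in order]
    return chosen, scores
```

Only the filters in `chosen` would then be applied to the query, which is where the saving in feature-extraction time comes from.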
thogonal wavelets [6] (same as the ones used in [3] for the TWT).
The 128 × 128 image pattern is decomposed into three levels (4 × 3
= 12 bands) of the wavelet transform. The mean and standard deviation of the energy distribution corresponding to each of the
subbands at each decomposition level are used to construct a (12 × 2)
feature vector.
In [3], decomposition of image subbands at each level is based
on energy considerations, and this results in a tree-structured decomposition where different patterns have different structures. For
pattern retrieval applications, it is convenient to have a fixed
structure. A fixed decomposition tree can be obtained by sequentially decomposing the LL, LH, and HL subbands. The HH band is
not decomposed as this often does not lead to stable features. A
three-level decomposition results in 52 (4(1 + 3 + 9)) subbands. As
in the PWT, the mean and standard deviation in each subband are
used to construct a 52 × 2 component feature vector.
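The two decomposition structures can be sketched with a plain Haar wavelet, which is swapped in here only to keep the example short and dependency-free; it is not the wavelet used in the experiments. The function names, the LH/HL naming convention, and the use of the coefficient magnitude for the subband statistics are assumptions of this sketch. The band counts reproduce the 12 and 52 subbands quoted in the text.

```python
import numpy as np

def haar_step(x):
    """One 2D Haar analysis step; returns LL, LH, HL, HH at half resolution."""
    r = np.sqrt(2.0)
    lo = (x[0::2, :] + x[1::2, :]) / r       # low-pass along rows
    hi = (x[0::2, :] - x[1::2, :]) / r       # high-pass along rows
    ll = (lo[:, 0::2] + lo[:, 1::2]) / r
    lh = (lo[:, 0::2] - lo[:, 1::2]) / r
    hl = (hi[:, 0::2] + hi[:, 1::2]) / r
    hh = (hi[:, 0::2] - hi[:, 1::2]) / r
    return ll, lh, hl, hh

def band_stats(band):
    mag = np.abs(band)
    return [mag.mean(), mag.std()]

def pwt_features(x, levels=3):
    """Pyramid structure: only LL is decomposed again (4 bands per level)."""
    feats = []
    for _ in range(levels):
        bands = haar_step(x)
        for b in bands:
            feats += band_stats(b)
        x = bands[0]
    return np.array(feats)

def twt_features(x, levels=3):
    """Fixed tree: LL, LH, and HL are decomposed again; HH is left alone."""
    feats, current = [], [x]
    for _ in range(levels):
        nxt = []
        for parent in current:
            bands = haar_step(parent)
            for b in bands:
                feats += band_stats(b)
            nxt.extend(bands[:3])
        current = nxt
    return np.array(feats)
```

For a 128 x 128 pattern, `pwt_features` returns 24 values and `twt_features` returns 104, matching the (12 × 2) and 52 × 2 vectors described above.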
The third set of features used are the MR-SAR model features
[17]. Previous work [9], [20] indicates that the MR-SAR features at
levels 2, 3, and 4 provide the best overall performance. At each
level, five parameters are computed to represent the texture, thus
requiring a total of 15 feature components. The Mahalanobis distance is used to compare the feature vectors.
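For reference, the Mahalanobis distance between two feature vectors can be computed as in the sketch below. How the covariance matrix `cov` is estimated for the MR-SAR features is not stated here, so it is simply passed in; the function name is hypothetical.

```python
import numpy as np

def mahalanobis(x, y, cov):
    """Mahalanobis distance sqrt((x - y)^T cov^{-1} (x - y))."""
    diff = np.asarray(x, dtype=float) - np.asarray(y, dtype=float)
    # Solve cov * z = diff instead of forming the explicit inverse.
    return float(np.sqrt(diff @ np.linalg.solve(cov, diff)))
```

With an identity covariance this reduces to the Euclidean distance; a non-trivial covariance down-weights components that vary strongly across the database.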
4.1.1 Summary of Comparisons
Table 1 provides a summary of the experimental results. It shows
the retrieval accuracy of the different texture features for each of
the 116 texture classes in the database. The Gabor features give the
best performance at close to 74% retrieval. This is closely followed
by the MR-SAR features at 73%. The TWT features perform mar-
TABLE 1
AVERAGE RECOGNITION RATE FOR THE 116 TEXTURE IMAGES IN THE DATABASE. THE D* LABELS INDICATE TEXTURES FROM THE BRODATZ ALBUM [2] AND O* LABELS INDICATE TEXTURES FROM THE USC DATABASE.
Average Retrieval Rate (%)
Gabor PWT TWT MRSAR
99.17
52.92
9458
97.08
32.50
7542
97.50
36.25
58 33
98.75
67.92
49.17
D42
D(3
W
D4 100.00 90.83 67.92
D5 72.92 52.92 52.08
D6 100.00 100.00 100.00
D7 35.42 21.25 19.58
D8 95.00 79.58 74.58
D9 93.75 84.58 77.50
D10 85.83 78.75 68.75
D11 100.00 73.75 80.00
88 33
6333
100.00
D45
Dl
D2
D3
52.08
95.42
61.58
79.17
D46
D47
D48
D49
D50
D51
98.33
D52
70.42
30.83
75.00
43 75
10000
8000
D53
D54
D55
100.00
D12
D13
86.25
42.92
7958
38.75
100.00
69.58
8667
100 00
10000
56.67
39.58
90.03
75.83
D26 100.00
88.75
99.17
100.00
1127
36.67
34.58
34.58
10.42
D28
95.42
72.08
8667
60.00
9750
64.58
88.75
67.50
33.75
77 92
23.75
72.50
33.75
57.50
D34 99.17
D35 98.33
D36 49.17
D37 100.00
D38 46.67
D39 39.58
D40 52.08
D41 78.75
92.92
82.92
57.08
78.75
31.67
24.17
56.67
68.33
D29
D30
D73
D56
D57
D58
D59
D60
D61
D62
D63
D64
D65
D66
D67
Gabor
50.00
11.25
1250
14.58
94.17
100.00
1917
100 00
87.92
83.75
72.08
100.00
50.83
lM.00
10300
10000
29.58
20.42
5250
13.75
35.83
3117
9458
100 00
96.67
7000
D68 100.00
D69 42.50
PWT
59.17
13 75
13 33
22.m
70.42
lOO.00
7108
100.00
5625
91.25
55.42
100.00
56.67
9708
10000
9117
18.33
1083
3000
-17.92
15.00
2117
9000
100.00
9000
53.75
TWT MRSAR
56.67
38.75
11 25
9.58
1192
15 00
25 83
3 33
80.12
92.50
10000
9750
77.08
8667
100.00 100.00
75.83
87.08
9333
90.83
6125
70.00
100.00 10000
-17.50
57.92
99.17 100.00
100.00 1OO.00
100.00 100.00
25.42
27.92
14.17
2333
3792
5000
41.25
-15 83
50.83
-13.75
27.08
35.00
97.92
79.17
98.75 100.00
95.00
68.33
6375
6208
9958 I0000
39.17 44.17
D81
D82
D83
D84
D85
D86
D87
D88
D89
D90
D91
92.08
30.00
68.33
62.08
D94 100.00
91 67
92.08
90.83
87.50
98.33
37.08
65 CO
7750
29.17
92.50
94.17
39.58
87.50
99.17
37.92
52.50
87.08
52.08
71.67
65.00
53.33
72.92
74.58
DIM
D107
58.75
53.33
56.U
54.58
63.33
14.17
52.50
6500
5125
7250
59.17
50.00
55.83
59.58
63.75
51.25
73.75
66.67
44.58
54.58
60.83
62.92
56.25
66.67
48.75
53.33
51.67
5542
D108
3750
18.75
29.58
32.92
D109
D110
78 75
71.75
76.67
76.25
87.92
90.83
61.67
7875
9042
50.42
75.42
62.92
58 33
62.92
91.67
63 33
D95
D96
D97
D98
D100
D101
D102
D103
D104
D105
79.17
72.50
79.17
87.50
72.92
TWT MRSAR
95.83
94.17
100 00 10000
99.58
99.17
100.00
99.58
100.00
99.17
71.67
95.00
82.92
92.92
51.67
45 83
2625
10.00
3167
47.08
16.25
27.50
92.50
38 75
D92
D93
100.00
36.25
Gabor PWT
100.00 90.83
100.00 100.00
100.00 98.75
100.00 100.00
99.58 96.67
91.67 60.83
99.58 92.08
41.67 48.75
21.25 22.08
34.58 19.58
25.42 12.92
50.12
92.50
D70
D71
D72
1917
4292
-17.50
15.42
15.83
18.75
57.50
68.75
75 83
98.33
100.00
D73
66.67
51.67
5792
5708
9208
68.33
30.00
84.17
3167
79 58
29 17
27.92
67.50
52.08
99 58
4208
6000
41 67
58.33
94.58
98.75
92.08
100.00
97.50
100.00
100.00
88.33
97.92
'11 67
93.33
100.00
57.50
D111
D112
Avg. 74.37 68.70 69.41 73.18
TABLE 2
CPU TIMES (ON A SUN SPARC20 WITH ONE PROCESSOR) AND FEATURE VECTOR LENGTH FOR THE VARIOUS TEXTURE FEATURES. GABOR FEATURES ARE COMPUTED IN MATLAB AND ALL THE OTHERS ARE WRITTEN IN C LANGUAGE.

                             Gabor (full feature)  Gabor (4 filters)  PWT        TWT         MRSAR
Feature Extraction Time      9.3 sec               2.3 sec            1.3 sec    2.3 sec     34.0 sec
Feature Vector Length        48                    8 (4×2)            24 (12×2)  104 (52×2)  15 (5×3)
Searching and Sorting Time   1.02 sec              0.1 sec            0.98 sec   1.70 sec    0.70 sec
4.3 Discussions
A Gabor wavelet based texture analysis scheme is proposed and
its application to image databases is demonstrated. A comprehensive performance evaluation of the method is given using a large
number of textures, and a comparison with some of the well-known
multiresolution texture classification algorithms is made.
Further, a novel adaptive filter selection strategy is suggested to
reduce the image processing computations while maintaining a
reasonable level of retrieval performance. The experimental results
indicate that these Gabor features are quite robust. Rotation and
scale invariance is important in many applications, and our preliminary results on rotation invariant classification [7] using Gabor
features are very encouraging.
Finally, a note on similarity measures. It is widely acknowledged
that this is an important but difficult problem. Our initial results
using simple hybrid neural network learning algorithms appear
very promising in the context of learning similarity [11], [12].
This research was partially supported by National Science Foundation grant IRI-9411330 and by NASA under grant number
MULTISPECTRAL remote-sensing data are being used for an increasing number of applications in a diverse set of fields including
agriculture, geology, mapping, water resources, and environmental science. The volume of satellite data that is available for
such applications is staggering. A Landsat thematic mapper, for
example, generates seven band images using three visible and four
infrared regions of the spectrum. Even though the Landsat provides coarser spatial resolution than many other remote-sensing
satellites, a single image corresponding to a 170 km by 185 km
region of earth requires over 200 Mbytes of storage and the satellite generates about 5,000 images per week. It has been predicted
that in a few years the amount of data originating from remote-sensing satellites will reach a terabyte per day [2].
Given this large volume of data, effective tools for image access
are essential to allow the information in the database to be fully
exploited. Image database systems traditionally access images
using keywords or text associated with the images. Unfortunately,
it is often difficult to assign textual descriptions to images and
consequently text-based queries often fail. In addition, the task of
manually annotating the current volume of satellite imagery
would involve a large amount of time and expense.
A recent trend in retrieval from image databases has been to
allow queries based on image content [12], [11], [15], [1]. Using this
paradigm, a user can specify a search using image properties such
as shape, color, or texture. In most cases, the user will not specify
numerical values of these properties, but rather will present the
system with example images. The system will compute features
from the example and use these features for database indexing. In
most cases, a metric is defined on these features that is intended to
model perceptual similarity.
The authors are with the Department of Electrical and Computer Engineering, University of California, Irvine, CA 92717.
E-mail: {healey, amit}@ece.uci.edu.
Manuscript received July 5, 1995.
Recommended for acceptance by R. W. Picard.
For information on obtaining reprints of this article, please send e-mail to:
transpami@computer.org, and reference IEEECS Log Number P96052.
0162-8828/96$05.00 © 1996 IEEE