A robust dynamic niching genetic algorithm with niche migration for automatic clustering problem (2010)
Pattern Recognition
journal homepage: www.elsevier.com/locate/pr
Article info

Article history: Received 16 January 2009; received in revised form 1 October 2009; accepted 4 October 2009.

Keywords: Clustering; Genetic algorithms; Niching method; Niche migration; Remote sensing image

Abstract

In this paper, a genetic clustering algorithm based on dynamic niching with niche migration (DNNM-clustering) is proposed. It is an effective and robust approach to clustering on the basis of a similarity function relating to the approximate density shape estimation. In the new algorithm, a dynamic identification of the niches with niche migration is performed at each generation to automatically evolve the optimal number of clusters as well as the cluster centers of the data set without invoking cluster validity functions. The niches can move slowly under the migration operator, which makes the dynamic niching method independent of the radius of the niches. Compared to other existing methods, the proposed clustering method exhibits the following robust characteristics: (1) robust to the initialization, (2) robust to cluster volumes (ability to detect clusters of different volumes), and (3) robust to noise. Moreover, it does not depend on the radius of the niches and does not need the number of clusters to be pre-specified. Several data sets with widely varying characteristics are used to demonstrate its superiority. An application of the DNNM-clustering algorithm in unsupervised classification of a multispectral remote sensing image is also provided.

© 2009 Elsevier Ltd. All rights reserved.

0031-3203/$ - see front matter
doi:10.1016/j.patcog.2009.10.020
ARTICLE IN PRESS
D.-X. Chang et al. / Pattern Recognition 43 (2010) 1346–1360 1347
method lies in how to define the spurious and compatible clusters. Moreover, although overspecification of the cluster number can reduce the initial cluster center effects, there is no way to guarantee that all clusters in the data set will be found. An alternative version of the progressive clustering is to seek one cluster at a time until no more ‘‘good’’ clusters can be found [16,17]. The performances of these techniques are also dependent on the validity functions, which are used to evaluate the individual clusters.

Since the global optimum of the validity function would correspond to the most ‘‘valid’’ solutions with respect to the functions, stochastic clustering algorithms based on genetic algorithms (GAs) [18–21] have been reported to be able to optimize the validity functions to determine the number of clusters and the partitioning of the data set simultaneously. In these GA-based algorithms, the validity functions are regarded as the fitness function to evaluate the fitness of the individual, which guides the evolution to search for the ‘‘valid’’ solution. In recent years, several clustering algorithms based on simple GA or its variants have been developed [22–36]. These algorithms fall into two broad categories based on the representations for the clustering solutions. The first category uses a straightforward encoding, in which the chromosome is encoded as a string of length n, where n is the number of data points and each element of the chromosome denotes the cluster number that the corresponding data point belongs to, such as used in Refs. [22–24]. The desired number of clusters should be specified in advance. Moreover, this approach does not reduce the size of the search space, and searching for the optimal solution can be onerous when the data points proliferate. It is for this reason that some researchers opt to use a relatively indirect approach where the chromosome encodes the centers of the clusters, and each datum is subsequently assigned to the closest cluster center [25–36]. This kind of algorithm can be subdivided into fixed-length encoding algorithms [25–31], which use a fixed-length string to describe the cluster centers and in which the number of clusters is specified a priori, and variable-length encoding algorithms [32–36], which use a variable-length string to describe the cluster centers and in which the number of clusters is automatically evolved. Although the number of cluster centers need not be given in advance in the variable-length encoding algorithms, the initial number of cluster centers is constrained to be in the range from 2 to $k_{\max}$, where $k_{\max}$ is the upper bound of the number of clusters and should be specified beforehand. Because the traditional GAs are suitable for locating the optimum of unimodal functions as they converge to a single solution of the search space, all these GA-based clustering algorithms consider the clustering problem as a unimodal problem. Each chromosome is described by a sequence of the cluster centers. When every cluster center is contained in the chromosome, the fitness function reaches its global optimum. However, a simpler way is to consider the clustering problem as a multimodal problem in which each cluster center corresponds to a local optimum of the fitness function. In this circumstance, each chromosome represents a cluster center and all the local optima of the fitness function should be found. Algorithms that allow the formation and the maintenance of different solutions can be used to solve this multimodal problem.

In order to preserve the population diversity, which prevents GAs being trapped by a single optimum, niching methods have been developed. The basic idea of the niching methods is based upon the natural ecosystems, which maintain population diversity and permit the GA to investigate many optima in parallel. In nature, an ecosystem is typically composed of different physical niches that exhibit different features and allow both the formation and the maintenance of different types of life (species). It is assumed that a species is made up of individuals with similar biological features capable of interbreeding among themselves, but unable to breed with individuals of other species [37]. By analogy, in artificial systems, a niche corresponds to a local optimum of the fitness function, and the individuals in one niche exhibit similar features in terms of a given metric. Among niching methods, fitness sharing (FS) and implicit fitness sharing are the best known and the most widely used methods [38–42]. In the former, the fitness represents the resource for which the individuals belonging to the same niche compete [38], while in the latter [40,41], the sharing effects are achieved by means of a sample-and-match procedure.

In FS, the fitness of an individual is reduced if there are many other individuals near it, and so the GA is forced to maintain diversity in the population [38]. This method requires a similarity metric on the search space and an appropriate niche radius, representing the maximal distance among individuals to be considered similar and therefore belonging to the same niche. In most circumstances, it is difficult to give an effective value for the niche radius without any a priori knowledge. Deb and Goldberg proposed a criterion for estimating the niche radius given the heights of the peaks and their distances [39]. Since in most real applications there is very little prior knowledge about the fitness landscape, it is difficult to estimate the niche radius. In the implicit fitness sharing [40], sharing is accomplished by inducing competition for limited and explicit resources, and there is no specific limitation on the distance between peaks. This method avoids the difficulty of appropriately choosing the niche radius and can be used to deal with problems in which the peaks are not equally spaced [40–42]. So, one of the most important limitations of FS seems to be removed. In fact, some other parameters, such as the size of the sample of individuals that compete, the number of competition cycles and the definition of a matching procedure, need to be set. In order to improve the performance of the FS methods, several dynamic niching methods were proposed [46,47]. These methods are based upon a dynamic, explicit identification of species discovered at each generation, and the FS mechanism is restricted to individuals belonging to the same species. However, the performance of these algorithms is dependent on the niche radius. When a wrong value for the niche radius is selected, the algorithms do not find all the niches. In Ref. [48], a species conserving genetic algorithm (SCGA) was proposed which does not consider any sharing mechanism. Once a new species is discovered, its fittest individual is retained in the next generations until a fitter individual for that species is generated. Therefore, each species populating a region of the fitness landscape survives during the entire evolution, whether or not it corresponds to an actual niche. Moreover, the performance of this algorithm also depends on the niche radius. In addition, none of these algorithms is robust to noise. When the data set contains noise points, the performances of these algorithms are poor.

In this paper, a new clustering algorithm based on dynamic niching with niche migration (DNNM-clustering) is proposed which is robust to noise and cluster volumes. Within the DNNM-clustering, a dynamic niching with niche migration is developed to preserve the diversity of the population. A simpler representation is adopted, whereby each individual represents a single cluster center. All the niches present in the population at each generation are automatically and explicitly identified. Then, the application of FS is limited to individuals belonging to the same niche. In order to overcome the dependence on the niche radius, a niche migration is considered. This makes the algorithm work properly and independently of the niche radius even if some noise points exist and the peaks are not equally spaced and have different cluster volumes.

The rest of this paper is organized as follows. Section 2 provides the fitness function of the clustering problem used in the
algorithm. The dynamic niching with niche migration is presented in Section 3. Section 4 describes the evolutionary clustering algorithm. Experimental results on several artificial data sets and a remote sensing image are given in Section 5. Experimental results demonstrate the effectiveness of the DNNM-clustering algorithm. Finally, conclusions are drawn in Section 6.

detailed explanation can be found in Ref. [49]. Note that the ‘‘’’ in Fig. 1 means the value of $\tilde{J}_s(x_k)$ with respect to the data point $x_k$, $k = 1, 2, \ldots, n$. According to Fig. 1(b), only two clusters will be found when $\gamma = 1$, and all the five peaks will be separated when $\gamma$ increases to 5 and 10, as shown in Figs. 1(c) and (d).

Here, the CCA algorithm [49] is used to estimate $\gamma$. For convenience, it is presented in the following:
Fig. 1. (a) Five-clusters data set. (b), (c) and (d) are plots of (3) (the approximate density shapes) with $\gamma = 1$, 5 and 10, respectively.
3.1. Fitness sharing

Fitness sharing modifies the search landscape by reducing the fitness of an individual in densely populated regions. It works by derating the fitness of each individual by an amount related to the number of similar individuals in the population. Specifically, the shared fitness $f_{sh,t}(i)$ of an individual $i$ at generation $t$ is given by

$$ f_{sh,t}(i) = \frac{f_t(i)}{m_t(i)}, \tag{4} $$

where $f_t(i)$ is the raw fitness of the individual, and $m_t(i)$ is the niche count, which depends on the number and the relative positions of the individuals within the population. The niche count is calculated as

$$ m_t(i) = \sum_{j=1}^{P} \mathrm{sh}(d_{ij}), \tag{5} $$

where $P$ is the population size, $d_{ij}$ is the distance between the individuals $i$ and $j$, and $\mathrm{sh}(d_{ij})$ is the sharing function, which measures the similarity between two individuals. The most commonly used form of sh is

$$ \mathrm{sh}(d_{ij}) = \begin{cases} 1 - \left( \dfrac{d_{ij}}{\sigma_{sh}} \right)^{\alpha_{sh}} & \text{if } d_{ij} < \sigma_{sh}, \\ 0 & \text{otherwise,} \end{cases} \tag{6} $$

where $\sigma_{sh}$ is the niche radius and $\alpha_{sh}$ is a constant parameter which regulates the shape of the sharing function. The value of $\alpha_{sh}$ is commonly set to 1, yielding a triangular form for the sharing function [50]. The distance $d_{ij}$ between individuals $i$ and $j$ is implemented by defining a metric on either the genotypic or the phenotypic space.

It has been proved that when the number of individuals within the population is large enough and the niche radius is properly set, FS provides as many niches in the population as the number of peaks in the fitness landscape [51,52]. However, there are several problems with the fitness sharing approach. In order to ensure that subpopulations are steadily formed and maintained, only the individuals belonging to the same niche should share the resources of the niche. This assumption is not generally true for the FS methods [53], because each individual in the population shares its fitness with all the individuals located at a distance smaller than the niche radius, regardless of the actual peak, i.e., the niche, to which they belong. As a consequence, individuals belonging to different peaks may share their fitness, while they should not. Moreover, the radius of the niches should be specified, and this requires a priori knowledge of how far apart the optima are. However, no information about the search space and the distance between the optima is available in practical optimization problems. When the niche radius is wrong, the algorithm cannot find all the niches. In order to overcome these drawbacks, a dynamic niching with niche migration is proposed.

3.2. Dynamic niching with niche migration

In this section, we propose a dynamic niching method which is independent of the niche radius. From Refs. [38–48], we can see that the radius of the niches plays a crucial role in the identification of the niches. If the niche radius chosen is too small, many niches may be found in every generation. On the other hand, a large value of the radius will make many solutions indistinguishable. This means that too few niches will be conserved. If the radius is so large that only one niche master is found, the algorithm will degenerate into a simple genetic algorithm and only find the one optimum with the largest fitness value.

In our algorithm, two strategies, the initialization of the niche radius and the migration of the niche candidates, are used to achieve independence from the initial niche radius. After the initialization of the niche radius, the dynamic niching method attempts to find the niches according to this radius at each generation. The niche candidates identified then change their location under the migration operator. The skeleton of the dynamic niching algorithm is presented in Table 1 and the initialization of the niche radius in Table 2.

When a niche radius is input, it is preprocessed by the algorithm shown in Table 2 to ensure that the niche candidates will be sufficiently diverse in the first generation. Here $\eta$ is a constant and $\eta \in (1, 2)$.

In phase II of the algorithm shown in Table 1, a migration operator is introduced. In reality, if a city is prosperous and its citizens lead comfortable lives, then it will attract the people living nearby to migrate to it. For the clustering problem, the effect of this migration operation is to change the relative position of the niches in the entire population. Based on this analogy between our society and a clustering problem, a migration operator is introduced and explained in the following. First, several definitions used in the migration operator are given.

Definition 1 (Niche attraction). Suppose $c_1, c_2, \ldots, c_m$ are $m$ individuals in a niche, and their fitness values are $f_1, f_2, \ldots, f_m$, respectively. The attraction one niche acts on another niche is defined as

$$ F = (f_1 + f_2 + \cdots + f_m)/m. \tag{7} $$

Definition 2 (Migration principle). Let $N_i$ and $N_j$ be two niches, and the niche attraction of the two niches are $F_i$ and $F_j$,

Table 1
The dynamic niching algorithm with niche migration.

Input: the population $Pop_t$ at generation $t$, the population size $P$, the niche radius $\sigma$ (this value is obtained by the algorithm shown in Table 2)

Sort the current population according to the raw fitness
v(t) = 0 (the number of actual niches at generation t)
u(t) = 0 (the number of niche master candidates)
NC = ∅ (the niche master candidate set)
DN = ∅ (the dynamic niche set)

Phase I: The niche master candidates identification.

For i = 1 to P do
    if the i-th individual is not marked then
        u(t) = u(t) + 1
        N(u(t)) = 1 (the number of individuals in the u(t)-th niche candidate set)
        For j = i + 1 to P do
            if (d(i, j) < σ) and (j-th individual is not marked)
                insert j-th individual into the niche master candidate set NC,
                N(u(t)) = N(u(t)) + 1
            end if
        end for
        If (N(u(t)) > 1) then
            v(t) = v(t) + 1
            mark i-th individual as the niche master of the v(t)-th niche
            insert the pair (i-th individual, N(u(t))) in DN
        end if
    end if
End For

Phase II: The migration of the niches.

Calculate the distance between the niche master candidates
For l = 1 to u(t)
    If the j-th niche is the nearest neighbor to the l-th niche, then determine the communication edge between these two niches according to Theorem 1.
    If there exists communication between the two niches and F_l < F_j, then niche l migrates toward niche j, otherwise niche l stays in place.
End For
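Phase I of Table 1 can be illustrated with a short Python sketch. It is not the authors' implementation: the function name `identify_niche_candidates`, the use of Euclidean distance for $d(i, j)$, and the choice to mark every individual as soon as it is absorbed into a candidate set are assumptions made only for this illustration.

```python
import numpy as np

def identify_niche_candidates(pop, fitness, sigma):
    """Sketch of Phase I: greedy identification of niche master candidates.

    pop     : (P, dim) array of individuals (cluster-center candidates)
    fitness : (P,) array of raw fitness values
    sigma   : niche radius
    Returns (candidates, masters): candidates is a list of
    (master_index, member_indices); masters lists the masters of
    candidate sets holding more than one individual (actual niches).
    """
    pop = np.asarray(pop, dtype=float)
    fitness = np.asarray(fitness, dtype=float)
    order = np.argsort(-fitness)              # best raw fitness first
    marked = np.zeros(len(pop), dtype=bool)
    candidates = []
    for i in order:
        if marked[i]:
            continue
        marked[i] = True
        members = [int(i)]
        for j in order:
            # Assumption: Euclidean distance, and absorbed individuals are marked.
            if not marked[j] and np.linalg.norm(pop[i] - pop[j]) < sigma:
                marked[j] = True
                members.append(int(j))
        candidates.append((int(i), members))
    masters = [m for m, mem in candidates if len(mem) > 1]
    return candidates, masters
```

Because the population is visited in order of decreasing raw fitness, the first unmarked individual in each region becomes the candidate master, and candidate sets with a single member correspond to isolated individuals.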
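As a worked illustration of the classical fitness sharing of Eqs. (4)–(6) in Section 3.1, the following Python sketch computes the niche count and the shared fitness for a whole population. The function names and the use of Euclidean distance in the phenotypic space are assumptions for this example only.

```python
import numpy as np

def sharing(d, sigma_sh, alpha_sh=1.0):
    """Sharing function of Eq. (6), applied element-wise to distances d."""
    d = np.asarray(d, dtype=float)
    return np.where(d < sigma_sh, 1.0 - (d / sigma_sh) ** alpha_sh, 0.0)

def shared_fitness(pop, raw_fitness, sigma_sh, alpha_sh=1.0):
    """Shared fitness of Eq. (4) using the niche count of Eq. (5)."""
    pop = np.asarray(pop, dtype=float)
    # Pairwise Euclidean distances (assumed metric), shape (P, P).
    diff = pop[:, None, :] - pop[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    # The sum includes j = i, where sh(0) = 1, so the count is at least 1.
    niche_count = sharing(dist, sigma_sh, alpha_sh).sum(axis=1)
    return np.asarray(raw_fitness, dtype=float) / niche_count
```

With three one-dimensional individuals at 0, 0.5 and 3, equal raw fitness 2 and $\sigma_{sh} = 1$, the first two individuals share a niche (niche count 1.5 each) while the third keeps its raw fitness.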
Table 2
The initialization of the niche radius.

For the niche candidate sets identified in phase I of the algorithm shown in Table 1, the nearest neighbor of each niche should be found. Here, a $u(t) \times u(t)$ matrix $D$ is used to indicate the nearest neighbor of the niche candidates,

$$ D_{ij} = \begin{cases} 1 & \text{if } d_N(N_i, N_j) = \min\limits_{k \neq j,\; k = 1, 2, \ldots, u(t)} d_N(N_k, N_j), \\ 0 & \text{otherwise,} \end{cases} \tag{9} $$

where $d_N(N_i, N_j)$ is the distance between niche $i$ and niche $j$. If $D_{ij} = 1$, the two niches are given the ability to communicate. The communication topology is specified by a matrix $S$, where $S_{ij}$ is the number sent from niche $i$ to niche $j$. Here, $S_{ij} = 1$ means that a communication edge exists between these two niches, and $S_{ij} = 0$ indicates no communication edge between them. The value of $S_{ij}$ is determined by Theorem 1.

Theorem 1. Let $N_i$ and $N_j$ be two niches, and let $M_i$ and $M_j$ be the niche masters of these two niches with fitness values $f_i$ and $f_j$. The line through the two niche masters can be written as

$$ x = M_i + k (M_j - M_i), \quad k \in [0, 1]. \tag{10} $$

Then, a series of points $x_1, x_2, \ldots, x_l$ is generated along this line and the fitness of those points is calculated by Eq. (4). If there exists $m \in [1, l]$ that satisfies

$$ f(x_m) < \min(f_i, f_j), \tag{11} $$

then a valley lies between $N_i$ and $N_j$, there is no communication between them, and $S_{ij} = 0$. Otherwise, the communication exists and $S_{ij} = 1$.

The concept of Theorem 1 is simple. Given two end points in Euclidean space, choose a number of points along the line between the two end points and calculate the fitness of those points. In this way, it is possible to determine whether a valley lies between the two end points (i.e., $M_i$ and $M_j$). If a valley lies between the two niches (i.e., there exists $m \in [1, l]$ satisfying the inequality (11)), then the two niches can be seen as two different species and they should not communicate with each other. If no point has lower fitness than either of the end points, then no valley lies between the two niches, and they can communicate with each other. The implementation of Theorem 1 terminates on the first point discovered that has lower fitness than either of the two end points. The theorem thus gives a simple way to decide whether communication exists between two niches.

In fact, the inequality (11) used in Theorem 1 is not robust to noise. For example, consider two end points $M_1$ and $M_4$, and two samples in between (see Fig. 2). From Fig. 2, we can see $f(M_3) < f(M_4)$. Then, according to Theorem 1, there is no communication between $M_1$ and $M_4$. However, the slight variance between the function values of $M_3$ and $M_4$ can be seen as a result of the noise. In order to overcome the influence of noise, we define a noise tolerance factor $\rho$ ($0.8 \le \rho \le 1$), and the inequality (11) is modified as

$$ f(x_m) < \rho \min(f_i, f_j). \tag{12} $$

Then inequality (12) will be used in the determination of communication between two points.

Fig. 2. An example of influence of noise.

After the determination of the communication, the magnitude of migration is defined.

Definition 4 (Migration magnitude). Let $N_i$ and $N_j$ be two niches identified in one generation, and let the niche attractions of the two niches be $F_i$ and $F_j$, respectively. The distance between these two niches is $d_N(N_i, N_j)$ and the niche masters are $M_i$ and $M_j$. If $F_i > F_j$, then the migration magnitude of the individual in $N_j$ is defined as

$$ \Delta l = \delta \, \frac{F_i}{d_N(N_i, N_j)^2} \, r, \tag{13} $$

where $r$ is the direction vector from the individual in $N_j$ to $M_i$, and $\delta$ is a small constant greater than 0, called the migrating rate. We imagine here that the niches are migrating with negligible magnitude.

After the dynamic identification of the niche masters of the population $Pop_t$ at generation $t$, the species belonging to the niche master candidate can be defined as a subset $S_t^i \neq \emptyset$ of individuals in the population $Pop_t$ which have a distance from the master candidate less than the niche radius and do not belong to other species. If the number of the individuals in $S_t^i$ is larger than 1, then this subset is assumed to be an actual niche; otherwise, the single individual in the subset is considered an isolated individual, and all the isolated individuals form the subset $S_t^{*}$. Then, the population $Pop_t$ at the generation is partitioned into a number $v(t)$ of species, say $S_t^1, S_t^2, \ldots, S_t^{v(t)}$, and a number of isolated individuals

$$ Pop_t = \left( \bigcup_{i \in \{1, 2, \ldots, v(t)\}} S_t^i \right) \cup S_t^{*}. \tag{14} $$

After the identification of niches, the shared fitness of each individual is calculated according to (4). Here, the shared fitness
value for an individual within a dynamic niche (identified by the dynamic niching algorithm) is its raw fitness value divided by the niche count. Otherwise, the individual belongs to the isolated category, and its fitness is not modified. The niche count in (5) is modified as

$$ m_t(i) = \sum_{p_j \in S_t^i} \mathrm{sh}(d_{ij}), \tag{15} $$

and $\mathrm{sh}(d_{ij})$ is computed according to (6). Here, only the individuals belonging to the same niche share their fitness, and the fitness of the isolated individuals is not modified.

After all the niches have been found, the new population is constructed by applying the usual genetic operators. Since some niche masters may not survive during the evolution, the species elitist strategy is implemented to enable the niche masters to survive. Here, only the actual masters are conserved.

4.3. Evolutionary operators

Any combination of standard selection, crossover and mutation operators can be employed by our algorithm. Here intermediate recombination and uniform neighborhood mutation are used.

For two randomly chosen parents $c_1$ and $c_2$, the offspring $c$ of the intermediate recombination crossover (with probability $p_c$) is

$$ c = c_1 + r (c_1 - c_2), \tag{18} $$

where $r$ is a uniformly distributed random number over $[0, 1]$.

Each chromosome undergoes mutation with a probability $p_m$. Let the minimum and maximum values of the data set along the $q$-th dimension be $c^q_{\min}$ and $c^q_{\max}$, respectively. If the position to be mutated is the $q$-th dimension of a cluster center with value $c^q$, then after uniform neighborhood mutation the value becomes

$$ c'^q = c^q + r_m R \, (c^q_{\max} - c^q_{\min}), \tag{19} $$
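The two operators of Eqs. (18) and (19) can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names are hypothetical, the recombination follows Eq. (18) exactly as printed (many texts write the difference as $c_2 - c_1$), and since $r_m$ and $R$ are not defined in the text available here, the sketch treats $r_m$ as a user-supplied mutation range and $R$ as a uniform random number in $[-1, 1]$.

```python
import numpy as np

def intermediate_recombination(c1, c2, rng):
    """Offspring of Eq. (18), with r drawn uniformly from [0, 1]."""
    r = rng.uniform(0.0, 1.0)
    return c1 + r * (c1 - c2)          # sign convention as printed in Eq. (18)

def uniform_neighborhood_mutation(c, q, c_min, c_max, r_m, rng):
    """Mutate dimension q of cluster center c following Eq. (19).

    Assumptions: r_m is a mutation range chosen by the user and
    R is uniform in [-1, 1]; neither is defined in the excerpt.
    """
    R = rng.uniform(-1.0, 1.0)
    out = np.array(c, dtype=float)     # copy so the parent is unchanged
    out[q] = c[q] + r_m * R * (c_max[q] - c_min[q])
    return out
```

A caller would typically create one generator with `np.random.default_rng(seed)` and pass it to both operators so that runs are reproducible.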
terminates after some number of generations, fixed either by the user or determined dynamically by the program itself, and the niche masters obtained are taken to be the solution.

The DNNM-clustering algorithm is described as follows:

1. Initialize a group of cluster centers with size of P.
2. Evaluate each chromosome.
3. Apply the dynamic niching algorithm and apply the fitness sharing among the individuals belonging to the same niche.
4. If the termination condition is not reached, go to Step 5. Otherwise, select the niche masters from the population as the final cluster centers.
5. Apply the selection operator.
6. Apply the crossover operator to the selected individuals based on the crossover probability.
7. Apply the mutation operator to the selected individuals based on the mutation probability.
8. Evaluate the newly generated candidates.
9. Apply the elitist strategy.
10. Go back to Step 3.

Fig. 5. The average number of clusters and its standard error by (a) DNS, (b) SCGA, (c) DFS, (d) DNNM-clustering. In all the experiments $P = 100$. (Plots: average number of niches and standard error versus generations, with $\sigma = 0.5$ and $\sigma = 3$.)
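Step 3 of the algorithm relies on Phase II of Table 1, which in turn uses the communication test of Theorem 1 with the noise tolerance of inequality (12) and the migration magnitude of Eq. (13). A minimal Python sketch is given below. The function names, the number of sample points, the default parameter values, and the normalization of the direction vector $r$ to unit length are assumptions for this illustration; `fitness_fn` stands for whatever fitness function is used to evaluate points on the line.

```python
import numpy as np

def communicates(fitness_fn, m_i, m_j, rho=1.0, n_points=10):
    """Return True (S_ij = 1) when no valley lies between two niche masters.

    Interior points of the segment of Eq. (10) are tested against
    inequality (12); the search stops at the first valley point.
    """
    m_i = np.asarray(m_i, dtype=float)
    m_j = np.asarray(m_j, dtype=float)
    threshold = rho * min(fitness_fn(m_i), fitness_fn(m_j))
    for k in np.linspace(0.0, 1.0, n_points + 2)[1:-1]:   # skip the end points
        x = m_i + k * (m_j - m_i)
        if fitness_fn(x) < threshold:
            return False                                   # valley found
    return True

def migration_step(x, m_i, F_i, d_ij, delta=0.01):
    """Move individual x toward master m_i by the magnitude of Eq. (13).

    Assumption: r is the unit vector from x to m_i.
    """
    x = np.asarray(x, dtype=float)
    direction = np.asarray(m_i, dtype=float) - x
    norm = np.linalg.norm(direction)
    if norm == 0.0:
        return x                                           # already at the master
    return x + delta * F_i / d_ij ** 2 * (direction / norm)
```

Setting `rho` below 1 makes the test more tolerant: a sampled point must fall clearly below the lower of the two master fitness values before the niches are treated as separate species.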
Fig. 6. The cluster centers obtained by using (a) DNS, (b) SCGA, (c) DFS, (d) DNNM-clustering. In all the experiments $P = 100$.
Fig. 8. The cluster centers obtained by using (a) DNS, (b) SCGA, (c) DFS, (d) DNNM-clustering. In all the experiments $P = 100$.
6. Experimental results

In order to validate the proposed algorithm, we have performed a set of experiments with several data sets with widely varying characteristics and multispectral remote sensing

6.1. Experiments on artificial data sets

In this section, the performances of the DNS [46], SCGA [48], DFS [47] and DNNM-clustering are compared through the
Fig. 10. The cluster centers obtained by using (a) DNS, (b) SCGA, (c) DFS, (d) DNNM-clustering. In all the experiments $P = 150$.
$$ \langle v(t) \rangle = \frac{1}{R} \sum_{i=1}^{R} W_t^i, \quad t = 1, 2, \ldots, G. \tag{28} $$

Then, the values $\langle v(1) \rangle, \ldots, \langle v(G) \rangle$ represent the average behavior of the algorithm for the assigned values of $P$ and $\sigma$. Finally, we compute the standard errors

$$ e(\langle v(t) \rangle) = \sqrt{ \frac{1}{R} \left( \frac{\sum_{i=1}^{R} \left( W_t^i - \langle v(t) \rangle \right)^2}{R - 1} \right) } \tag{29} $$

of $\langle v(t) \rangle$, $\forall t \in \{1, 2, \ldots, G\}$.

Fig. 11. Data 4.

In the experiments, five artificial data sets with widely varying characteristics are used for comparison. All the algorithms are run with two different radii. The number of niches (i.e., the number of clusters) and the cluster centers obtained are given. Here, we only give the number of niches of the first data set as an example. In the experiments, the value of $\gamma$ in (3) should be determined by the CCA algorithm. The correlations for the five data sets are shown in Table 3.
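The averaging in Eqs. (28) and (29) can be reproduced with a few lines of Python. The sketch assumes that the niche counts of all runs are stored in an array `W` of shape (R, G), one row per run and one column per generation; the array layout and the function name are illustrative choices, not taken from the paper.

```python
import numpy as np

def niche_count_statistics(W):
    """Mean number of niches (Eq. 28) and its standard error (Eq. 29).

    W : array of shape (R, G); W[i, t] is the number of niches found
        in run i at generation t.
    """
    W = np.asarray(W, dtype=float)
    R = W.shape[0]
    mean = W.mean(axis=0)                              # Eq. (28)
    sample_var = ((W - mean) ** 2).sum(axis=0) / (R - 1)
    std_err = np.sqrt(sample_var / R)                  # Eq. (29)
    return mean, std_err
```

For example, three runs reporting 4, 6 and 5 niches at some generation give a mean of 5 and a standard error of $\sqrt{1/3} \approx 0.577$.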
Fig. 12. The cluster centers obtained by using (a) DNS, (b) SCGA, (c) DFS, (d) DNNM-clustering. In all the experiments $P = 100$.
Fig. 14. The cluster centers obtained by using (a) DNS, (b) SCGA, (c) DFS, (d) DNNM-clustering. In all the experiments $P = 100$.
Table 4
The mean of the number of clusters obtained by DNS, SCGA, DFS and DNNM-clustering applied to the two real-world data sets. Here AC denotes the actual number of clusters present in the data set.
DNNM-clustering algorithm has separated the first class (Setosa) from the others. It is known that two classes (Versicolor and Virginica) have a large amount of overlap, while the class Setosa is linearly separable from the other two. In fact, some researchers think that the Iris data set can be divided into two classes [55,56]. For the breast data, only the DNNM-clustering algorithm can provide the correct number of clusters, and the classification error of the DNNM-clustering algorithm is 3.53 percent. DNS, DFS and SCGA are also misled in the clustering because of the aforementioned problem in the selection of the niche radius.

From the experimental results on these data sets, it is seen that the DNNM-clustering algorithm is robust to the initialization, since all the experiments have been repeated $R = 30$ times with different initializations and in all cases the correct estimate of the cluster centers is obtained.

6.2. Experiment on remote sensing image clustering

Remote sensing image analysis is attracting a growing interest in real-world applications. The design of robust and efficient clustering algorithms has become one of the most important issues addressed by the remote sensing community. In this section, we will apply DNS, SCGA, DFS and DNNM-clustering to the clustering of a multispectral remote sensing image based on the spectral data of pixels. Although remote sensing images usually have a large number of overlapping clusters, the experimental results show that the multispectral image can be effectively grouped into several clusters by the proposed method.

In this experiment, the algorithms are used to partition different landcover regions in the remote sensing image. A 512 × 512 remote sensing image of a part of MiYun obtained from Landsat-7 has been chosen. The image considered has three bands in the multispectral mode: band 3 (red band, wavelength 0.63–0.69 μm), band 4 (near-infrared band, wavelength 0.76–0.94 μm) and band 5 (shortwave infrared band, wavelength 1.55–1.75 μm). The pseudocolor images are shown in Fig. 15.

From the pseudocolor images, it can be seen that the landcovers of the images mainly contain five classes: water, vegetation (Veg), mountain (Moun), residential areas (RA) and blank regions (BR). In the experiment, we expect that the four algorithms can partition the remote sensing images into visually distinct clusters automatically. The population size is set to 600 and the maximum number of generations to 200. The crossover and mutation probabilities are the same as those used in the first experiment.

The clustering results for the image are shown in Fig. 16 in gray scale. The numbers of clusters identified by DNS, SCGA, DFS and DNNM-clustering are 3, 4, 3 and 6, respectively. As seen from Fig. 16,
Fig. 16. The clustering results of the remote sensing image using: (a) DNS; (b) SCGA; (c) DFS; (d) DNNM-clustering.
the water and the rivers in the residential areas are distinctly demarcated from the rest by all the four algorithms. For DNS, SCGA and DFS, there is some confusion between the residential areas and blank regions and between the mountain and the vegetation. For the DNNM-clustering algorithm, most of the landcover categories have been correctly distinguished. For example, the vegetation on the top

Fig. 17. The number of clusters obtained by using DNS, SCGA, DFS and DNNM-clustering algorithms for Data 1. (Axes: niche radius versus average number of niches.)

Fig. 18. The number of clusters obtained by using DNS, SCGA, DFS and DNNM-clustering algorithms for Data 4. (Axes: niche radius versus average number of niches.)

Fig. 19. The number of clusters obtained by using DNS, SCGA, DFS and DNNM-clustering algorithms for Data 5. (Axes: niche radius versus average number of niches.)

are averaged over 30 runs for each value of $\sigma$. The results obtained for Data 1, Data 4 and Data 5 are shown in Figs. 17, 18 and 19, respectively. In the experiments, the maximum value of the niche

7. Conclusion

In this paper, a robust clustering algorithm based on dynamic niching with niche migration (DNNM-clustering) has been developed for solving clustering problems with an unknown number of clusters. The DNNM-clustering algorithm can find the optimal number of clusters as well as the cluster centers automatically. As the number of clusters is not known a priori in most practical circumstances, the DNNM-clustering algorithm can be used more widely. In the DNNM-clustering algorithm, each chromosome encodes the center of a cluster as a sequence of real-valued numbers. This is more natural and simple than the representation used by other clustering algorithms based on GA. The dynamic niching is accomplished without assuming any prior knowledge on the number of niches and the niche radius. The introduction of the niche migration makes the DNNM-clustering algorithm is
left of the image, the residential areas and many other structures are insensitive to the choice of the initial radius. The superiority of the
identified by the DNNM-clustering algorithm. So we can conclude DNNM-clustering algorithm over DNS, SCGA and DFS algorithm
that DNNM-clustering algorithm is an efficient clustering algorithm has demonstrated by the experiments. All the experiment results
for differentiating the various landover types present in the image. described in this paper have shown that our algorithm is effective,
because it provides all the actual cluster centers. Moreover, the
DNNM-clustering has been applied to the multispectral remote
6.3. Effect of niche radius sensing image for clustering the pixels into several classes, which
also illustrated its effectiveness and superiority.
As mentioned earlier, the performance of the DNNM-clustering Although the results presented here are extremely encoura-
algorithm is independent of the initial niche radius. To examine ging, there is an issue that deserves in-depth study in the future.
this claim, we conduct a series of experiments, in which we vary The population size is undoubtedly crucial to the performance of
the value of niche radius s and count the number of niches found. the algorithm. In order to steadily maintain the actual number of
For these runs we use pc ¼ 0:8, pm ¼ 0:005, set the population size cluster, we should estimate the minimum population size needed
P ¼ 100, and set the number of generations G ¼ 200. The results by our method.
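The protocol of the niche-radius experiment in Section 6.3 (vary σ, count the niches found, average over 30 runs) can be illustrated with a small sketch. It is not the DNNM-clustering algorithm, nor an implementation of DNS, SCGA or DFS. It assumes a toy one-dimensional population drawn around three hypothetical centers and a simple greedy fixed-radius niche count as a stand-in. All function names, the toy data and the fitness definition are assumptions made for illustration only.

```python
import random
import statistics


def count_niches(points, fitness, sigma):
    """Greedy fixed-radius niche count (illustrative stand-in only).

    Points are visited from best to worst fitness; a point founds a new
    niche if it lies farther than sigma from every existing niche seed.
    """
    order = sorted(range(len(points)), key=lambda i: fitness[i], reverse=True)
    seeds = []
    for i in order:
        if all(abs(points[i] - s) > sigma for s in seeds):
            seeds.append(points[i])
    return len(seeds)


def toy_population(seed, centers=(0.0, 5.0, 10.0), per_cluster=30, spread=0.3):
    """Hypothetical 1-D population: Gaussian samples around each center."""
    rng = random.Random(seed)
    points = [rng.gauss(c, spread) for c in centers for _ in range(per_cluster)]
    # Assumed fitness: the closer a point is to its nearest center, the fitter.
    fitness = [-min(abs(p - c) for c in centers) for p in points]
    return points, fitness


def sweep(sigmas, runs=30):
    """Average the niche count over `runs` seeded runs for each radius."""
    results = {}
    for sigma in sigmas:
        counts = [count_niches(*toy_population(seed), sigma) for seed in range(runs)]
        results[sigma] = statistics.mean(counts)
    return results


if __name__ == "__main__":
    for sigma, avg in sweep([0.05, 0.5, 2.5, 6.0, 20.0]).items():
        print(f"sigma={sigma:5.2f}  average number of niches={avg}")
```

In this toy setting the fixed-radius count matches the three underlying clusters only for intermediate radii; a very small radius splits the clusters and a very large one merges them. This is the kind of radius sensitivity that the migration operator is intended to remove.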
About the Author—DONGXIA CHANG received the B.S. and M.S. degrees in mathematics from Xidian University, in 2000 and 2003, respectively. She is currently
pursuing the Ph.D. degree at the Department of Automation, Tsinghua University. Her current research interests include evolutionary computation, clustering and
intelligent signal processing.
About the Author—XIANDA ZHANG received the B.S. degree in radar engineering from Xidian University, in 1969, the M.S. degree in instrument engineering from Harbin
Institute of Technology in 1982, and the Ph.D. degree in electrical engineering from Tohoku University, Sendai, Japan, in 1987. Since 1992, he has been with the Department
of Automation, Tsinghua University. His current research interests are signal processing with applications in radar and communications and intelligent signal processing.
About the Author—CHANGWEN ZHENG received the B.S. degree in mathematics and Ph.D. degree in control science and engineering from Huazhong Normal University in
1992 and 2003, respectively. He has been with the General Software Laboratory, Institute of Software, Chinese Academy of Sciences since 2003. His current research interests
include route planning, evolutionary computation and neural networks.
About the Author—DAOMING ZHANG received the B.S. and M.S. degrees in physics from National University of Defense Technology, in 2000 and 2003, respectively. He is
currently pursuing the Ph.D. degree at the Department of Automation, Tsinghua University. His current research interests include image fusion and intelligent signal
processing.