A clustering and dimensionality reduction based evolutionary algorithm for large-scale multi-objective problems

Article history: Received 21 January 2019; Received in revised form 14 January 2020; Accepted 18 January 2020; Available online 23 January 2020

Keywords: Large-scale multi-objective problems; Cooperative coevolution; Decision variable clustering; Dimensionality reduction

∗ Corresponding author. E-mail address: ruochenliu@xidian.edu.cn (R. Liu).
https://doi.org/10.1016/j.asoc.2020.106120

Abstract: When solving multi-objective problems (MOPs) with a large number of variables, analysis of the linkage between decision variables can be useful for avoiding "the curse of dimensionality". In this work, a clustering and dimensionality reduction based evolutionary algorithm for large-scale multi-objective problems is suggested, which focuses on clustering decision variables into two categories and then utilizes a dimensionality reduction approach to get a lower-dimensional representation for those variables that affect the convergence of the evolution. Interdependence analysis is carried out next, aiming to decompose the convergence variables into a number of subcomponents that are easier to tackle. The algorithm presented in this article is promising on a series of test functions, and the results of these experiments reveal that our suggested algorithm is able to prominently enhance performance; meanwhile, it can save computing costs to a large extent compared with some recent evolutionary algorithms (EAs). In addition, the proposed algorithm can be extended to solve MOPs with dimensions up to 5000, with good performance obtained.

© 2020 Elsevier B.V. All rights reserved.
1. Introduction

In the academic sector of evolutionary computation, earlier interest was mainly drawn to single-objective optimization. With the appearance of many complicated problems in the real world, more and more researchers have devoted themselves to solving optimization problems which usually have no less than two conflicting objectives and are referred to as multi-objective problems (MOPs). It is inspiring that many of the suggested evolutionary algorithms have been successfully used to settle MOPs, such as NSGA-II [1], MOEA/D [2] and NSGA-III [3].

It has been proved that these multi-objective evolutionary algorithms (MOEAs) have the ability to obtain satisfactory performance on some test problems, and they have further been used to tackle some practical issues [4–8]. However, in the real world, there still exist many problems that need to be solved which contain a mass of decision variables [9–12]. The effectiveness and efficacy of existing optimization approaches deteriorate with the increasing number of dimensions due to "the curse of dimensionality". As more and more applications with high dimensionality occur, some special multi-objective evolutionary algorithms need to be designed.

A popular means to solve this type of problem is to apply a divide-and-conquer tactic, which was first proposed in [13]. It aims to decompose all variables into a set of smaller groups that are easier to handle. Cooperative coevolution optimizes each sub-problem independently after the decomposition is achieved.

In cooperative coevolution (CC), a pivotal step is how to choose the decomposition strategy. As CC optimizes each sub-problem one by one, a satisfying decomposition strategy is supposed to place interacting variables in one group and independent variables in different groups at the same time. From our point of view, the current grouping approaches fall into two categories: one is fixed grouping strategies [14–16] and the other is dynamic grouping strategies [17,18]. For single-objective optimization problems [19–23], the dependency relationships between variables are simple, so this kind of problem is relatively easy to solve. But different from [24–27], in multi-objective problems there are dependencies among variables and among multiple objectives, which makes it tougher to get perfect solutions.

When it comes to solving a multi-objective problem, the linkage between decision variables needs to be taken into account. Several works have been done on this topic; among them, grouping-based methods are fast-growing. These methods divide all variables into a set of smaller sub-groups in the search space to deal with them one by one. The many-objective evolutionary algorithm for large-scale variables (LMEA) [28] and the decision variable analysis based evolutionary algorithm (MOEA/DVA) [29] are two typical algorithms of this kind.
They both categorize the variables in the decision space into diversity-related clusters and convergence-related groups, which control the distribution of the final population on the obtained front and how close it approaches the true Pareto front (PF).

In addition to the linkage between variables, another point that should not be ignored in multi-objective optimization problems is the mapping relationship between decision variables and objective functions. Wang et al. put forward the dimensionality reduction based memetic optimization strategy (DRMOS) in [30]. An unbalanced mapping relationship means objective values are influenced by variables differently. DRMOS aims to reduce the dimensionality of the search space by recognizing the mapping relationship to evolve the population. In [31], the nonlinear correlation information entropy is used to measure the mapping relationship between decision variables and objective functions because of its effectiveness in describing both linear and nonlinear correlation, which improves the computational efficiency of the efficient decomposed search strategy.

In this work, a clustering and dimensionality reduction based evolutionary algorithm for multi-objective problems (MOPs) with large-scale variables is suggested. Firstly, we conduct a clustering strategy to separate all variables in the decision space into two clusters, named diversity-related variables and convergence-related variables. Then principal component analysis (PCA) [32] is utilized to get a lower representation for the convergence-related variables, which will next be decomposed into a set of sub-groups by the interdependence analysis procedure. In the end, every subcomponent is optimized cooperatively to evolve the population. The main contributions can be given as follows:

(1) A decision variable clustering strategy separates all variables into two groups, diversity-related variables and convergence-related variables, which can group variables more accurately and benefits the whole evolution. The approach uses the angles between the convergence direction and the sampled solutions as features and applies the k-means approach to group these features. A smaller angle means the variable contributes more to convergence, and a bigger angle means it contributes more to diversity. Thus all variables can be grouped as either convergence variables or diversity variables.

(2) PCA is used to get a lower representation for the original convergence variables. It represents the most crucial information of the data set in a new orthogonal coordinate system. PCA generates principal components (PCs) by linearly combining the original variables. In this way, the PCs have no correlations with each other. They also retain the maximum variation of the original data, so the majority of the information in the original data set is maintained under such a representation. This contributes to smaller computation costs and more satisfactory results.

(3) We have conducted empirical evaluations on several test suites to make comparisons with some MOEAs that have the potential to solve large-scale MOPs, in order to assess the properties of our suggested evolutionary algorithm. The final experimental outcome shows that our algorithm can solve large-scale optimization problems that have as many as 5000 decision variables, and satisfactory results are obtained at acceptable computational cost.

In the rest of the article, we first review related concepts in Section 2. Some details of the primary procedure of the suggested algorithm are introduced in Section 3. Section 4 elaborates experiments as well as results to empirically evaluate the performance. Section 5 comes to conclusions and points out our future work.

2. Background

Several elementary definitions in the academic sector of multi-objective optimization are first presented in the following part. Then, we concisely recall the existing MOEAs for MOPs. After that, we recall some related works on large-scale optimization. Finally, knowledge about PCA is elaborated.

2.1. Multi-objective optimization

In general, we can formulate an MOP as

min F(x) = (f1(x), f2(x), ..., fm(x))  subject to x ∈ Ω    (1)

where F(x) is an m-dimensional objective function that includes m real-valued continuous objectives (f1(x), f2(x), ..., fm(x)), and x ∈ R^D is a vector that contains D decision variables in the search space.

In fact, there are numerous optimal solutions that cannot dominate each other in multi-objective optimization. In general, we can define the dominance relationship between two solutions of a minimization MOP as follows:

∀i ∈ {1, ..., M}: Fi(x) ≤ Fi(y) ∧ ∃j ∈ {1, ..., M}: Fj(x) < Fj(y)    (2)

If the abovementioned condition is satisfied, we say that x dominates y, written x ≺ y. Pareto optimal refers to those solutions that are not dominated by any other solution.

Generally speaking, it is tough to get all Pareto optimal solutions. Thus, we have three aims to achieve: (1) enhance the convergence ability of the algorithm so that all final solutions come near to the true Pareto front (PF); (2) improve the diversity of the algorithm so that the objective vectors spread diversely over the PF on condition that the convergence speed is satisfactory; (3) use as few computing resources as possible on condition that convergence and diversity are both guaranteed. During the past couple of years, a lot of attention has been attracted to the area of multi-objective optimization and many researchers have devoted themselves to this field. Therefore, a host of outstanding MOEAs [33,34] have been proposed. Generally, we can classify MOEAs into three categories: (1) dominance-based MOEAs [1,3]; (2) decomposition-based MOEAs [2,35]; (3) indicator-based MOEAs [36–38].

In the first category, namely dominance-based MOEAs, the most prevalent one is NSGA-II [1]. It determines the dominance relationship among all solutions and then applies a non-dominated sorting strategy to sort them. To ensure the diversity of the population, an elitist preserving approach and a crowding distance mechanism are put into use. During the evolutionary process, once N offspring are generated using genetic algorithm operators, where N is the size of the population, the offspring are combined with the parents, and then the aforementioned three mechanisms are applied to choose N individuals from the combined population to carry through the next evolution. Based on this fundamental algorithm, many variants were raised, such as [3,35].

The most famous algorithm in the field of decomposition-based MOEAs is MOEA/D [2]. Its core idea is the use of a decomposition strategy, by which it can transform an MOP into a series of single-objective optimization problems. It achieves this goal by attributing a weight to each objective. There are three significant decomposition methods; for example, we can transform an MOP into N single-objective problems using the Tchebycheff method. After this step, N individuals are produced and co-evolved. Following this, a crowd of researchers have proposed many other new versions of this type of MOEA, including [33,36].
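To make the two notions above concrete, the following minimal sketch implements the Pareto-dominance test of Eq. (2) and the Tchebycheff scalarization commonly used in MOEA/D. The function names and the symbols lam (weight vector) and z_star (ideal point) are our own illustrative choices, not identifiers fixed by the paper.

```python
import numpy as np

def dominates(fx, fy):
    """Pareto dominance for minimization (Eq. (2)): x dominates y iff it is
    no worse in every objective and strictly better in at least one."""
    fx, fy = np.asarray(fx), np.asarray(fy)
    return bool(np.all(fx <= fy) and np.any(fx < fy))

def tchebycheff(fx, lam, z_star):
    """Tchebycheff subproblem g(x | lam, z*) = max_i lam_i * |f_i(x) - z*_i|;
    minimizing it for N different weight vectors yields N subproblems."""
    fx, lam, z_star = map(np.asarray, (fx, lam, z_star))
    return float(np.max(lam * np.abs(fx - z_star)))
```

For instance, dominates([1.0, 2.0], [1.5, 2.0]) returns True, while two mutually non-dominated vectors return False in both directions.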
As for indicator-based MOEAs, they generally adopt a performance indicator to guide the evolution. Hypervolume (HV) [33] is the most commonly used indicator because it can measure convergence performance and diversity performance simultaneously. [36] is an instance of an indicator-based MOEA.

Even though the abovementioned MOEAs are already promising for solving MOPs, they are all efficient and valid only in relatively low-dimensional spaces. Once they are applied to solve multi-objective problems with high-dimensional decision variables, their performance decreases sharply. The primary reason is the exponential increase of the search space when the dimension grows. On the other hand, it is easy for solutions to be trapped in local areas because many local PFs exist in high-dimensional spaces. Hence, it is necessary to design particular MOEAs to deal with large-scale MOPs effectively.

2.2. Large-scale multi-objective optimization

We have noticed that there are numerous MOPs that contain thousands of decision variables in real-world applications, but the existing multi-objective evolutionary algorithms cannot solve these problems perfectly. As a matter of fact, scalability in the variable space is an issue which has not been researched deeply in the area of MOEAs. Experience shows that as the quantity of variables in the search space increases, the effectiveness of most MOEAs decreases dramatically [39,40].

In nature, coevolution occurs between interacting species or groups. In cooperative algorithms, individuals who cooperate well with others will be rewarded, while those who do not perform well together will be punished. Antonio and Coello first utilized the framework of cooperative coevolution (CC) in evolutionary algorithms aiming to solve large-scale MOPs in [13]. It applies a divide-and-conquer method to randomly decompose the variables in the search space into several smaller subcomponents of the same size. Then the subcomponents evolve collaboratively using a separate evolutionary algorithm. Following this achievement, many other cooperative coevolutionary methods have been proposed; the majority of them aim at solving large-scale global optimization problems [17].

For MOPs, however, we must consider multiple objectives at the same time, which makes it much more complicated to use the abovementioned CC framework. Lately, an MOEA has been put forward by Ma et al. for settling large-scale multi-objective problems, known as MOEA/DVA [29]. It first uses an approach called decision variable analysis to separate all decision variables into three categories, which is achieved by perturbing the values of certain variables and then checking the dominance relationships between solutions. Based on the dominance relationships, a decision variable can be classified into one of three groups. More specifically, if the generated solutions do not dominate and are not dominated by any other solutions after variable perturbation, the variable is called a distance variable. If the generated solutions have dominance relationships with each other after perturbation, the variable is called a position variable. Otherwise, the variable is called a mixed variable. After the decision variable grouping step is achieved, MOEA/DVA conducts an interdependence analysis to decompose the high-dimensional variables into several independent subcomponents with a lower number of variables and then optimizes each subcomponent separately. Once the solutions are optimized to reach the Pareto front, all decision variables are optimized together to make sure the final solutions are distributed uniformly on the front. One problem in MOEA/DVA is that it simply treats all mixed variables as diversity-related variables. However, in fact, these variables still need to be further divided into either diversity-related groups or convergence-related groups because their contributions to convergence and diversity vary.

Another novel evolutionary algorithm adopting a decision variable clustering approach is presented in [28]. It is specially tailored for dealing with many-objective optimization problems with a high-dimensional decision space. It first clusters all decision variables into two classes, where one class influences convergence performance and the remaining variables are correlated with diversity. Then, during the optimization process, two strategies are adopted separately for the two classes.

Both MOEA/DVA [29] and LMEA [28] apply a grouping-based strategy to partition decision variables into different categories and then deal with them separately. The grouping strategy they use is the so-called fixed grouping approach. There are dynamic grouping strategies as well. In [41], firstly, a decomposition pool that contains different group sizes is designed. Then, one of the group sizes is chosen probabilistically from this pool. Based on the selected group size, the whole dimension is divided into many groups with different sizes. This method can save many function evaluations.

2.3. Principal component analysis (PCA)

PCA was put forward by Karl Pearson [42] and has become one important statistical method since then. It has been applied in many areas including dimensionality reduction, feature elimination, multivariate data analysis, image recognition, data visualization and machine learning tasks. In PCA, the data is transformed to a new coordinate system determined by the data itself. In the transformation of the coordinate system, the direction of maximum variance is chosen as the direction of the first orthogonal coordinate axis, since the maximum variance of the data comprises its most crucial information. The next coordinate axis takes the direction which has the second largest variance and is at the same time orthogonal to the first coordinate axis. Repeating this process as many times as the feature dimension of the original data, we notice that the first several coordinate axes contain most of the variance, and the latter ones contain almost zero variance. Based on this knowledge, the remaining coordinate axes can be ignored, and only the preceding several coordinate axes, which lose almost no variance, are retained, so as to realize dimensionality reduction of the data features.

PCA was first put into use to reduce a high-dimensional space in [43]. In higher-dimensional analysis cases, the more explanatory variables there are, the greater the possibility of over-fitting. Therefore, principal component analysis can mitigate the influence of over-fitting, particularly when there exist strong relationships between variables.
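The description above can be summarized by a standard variational characterization of PCA (a textbook identity, not a formula introduced by this paper): each principal axis maximizes the variance of the projected, zero-mean data subject to unit length and orthogonality to the previous axes,

```latex
\mathbf{w}_1 \;=\; \arg\max_{\|\mathbf{w}\|=1} \mathbf{w}^{\top}\Sigma\,\mathbf{w},
\qquad
\mathbf{w}_k \;=\; \arg\max_{\substack{\|\mathbf{w}\|=1,\;\; \mathbf{w}\,\perp\,\mathbf{w}_1,\dots,\mathbf{w}_{k-1}}} \mathbf{w}^{\top}\Sigma\,\mathbf{w},
```

where Σ is the covariance matrix of the data. The maximizers are exactly the eigenvectors of Σ in descending order of eigenvalue, which is why discarding the trailing axes loses the least variance.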
3. The proposed algorithm

In this section, we design a novel clustering and dimensionality reduction based evolutionary algorithm for MOPs with numerous decision variables, referred to as PCA-MOEA. As aforementioned, we first adopt a clustering approach to classify all decision variables into two categories, one correlated with convergence and the other correlated with diversity. As a result, we can handle each group specifically. Firstly, the diversity variables are initialized by the uniform sampling technique [44] and the convergence variables are initialized randomly. After that, we use PCA to obtain a lower representation for the convergence variables, because their number is still large. In this step, we can control the percentage of variance retained, which decides the number of decision variables after processing. Then interdependence analysis is carried out to detect the interdependence relationships among the convergence variables in the low-dimensional space. Variables in the same subgroup interact with each other, while there are no interactions between two different subgroups. After this step, the resulting small groups of variables can be optimized easily. Finally, the cooperative coevolution framework is employed to evolve the population. We present the primary steps of PCA-MOEA in Algorithm 1. Several crucial components of PCA-MOEA are further illustrated in the coming parts.

Algorithm 1 PCA-MOEA
Input: the size of the population N, FEmax (the maximal number of function evaluations), dimension n
Output: final population pop, corresponding objective vectors Val
1: Set FE = 0. Adopt Algorithm 2 to cluster the decision variables and generate [Diver, Conver];
2: pop(:, Diver) ← Employ the uniform sampling technique [44] to initialize all diversity variables;
3: pop(:, Conver) ← Initialize the convergence variables (CVs) randomly;
4: Evaluate the individuals, Val = F(pop), OldVal = Val, FE = FE + N;
5: [pop, Val] ← Apply PCA and Algorithm 3 to get the low-dimensional representation lowCV of the large-scale Conver and form the new pop;
6: [Subcomponents, pop, Val] ← divide lowCV and evolve the population [pop, Val];
7: Threshold ← 0.01, Bound ← 1;
8: while Threshold ≤ Bound and FE < FEmax do
9:   for j = 1 : size(Subcomponents) do
10:    [pop, Val] ← SubcomponentOptimizer(pop, Val, Subcomponents[j]);
11:  end for
12:  utility ← CalculateUtility(Val, OldVal);
13: end while

3.1. Decision variable clustering

For each decision variable, several sampled candidate solutions are perturbed along that variable, and a line L is fitted to the resulting solutions in the normalized objective space (Algorithm 2); the fitted line reflects how strongly the variable relates to the convergence direction. Following this, we can get the angle between each fitted line L and the normal line of the hyperplane. In this work we choose two candidate solutions for clustering, so each variable is bound up with two angles. Variables with larger angles make more contribution to diversity, while variables with smaller angles make more contribution to convergence. We need to note that the more candidate solutions we select for clustering decision variables, the more angles are bound up with each variable. As a result, we can get more accurate measurements.

Algorithm 2 Decision Variable Clustering
Input: pop, number of selected candidate solutions SelNum, number of perturbations PerNum
Output: [Diver, Conver]
1: for i = 1 : n do
2:   C ← Choose SelNum solutions from pop randomly;
3:   for j = 1 : SelNum do
4:     Perturb the i-th variable of C[j] PerNum times to generate a solution set SP and then normalize it;
5:     Generate a fitted line L for SP in the objective space;
6:     Angle[i][j] ← the angle between the fitted line L and the normal line of the hyperplane;
7:     MSE[i][j] ← the mean squared error of the fitting;
8:   end for
9: end for
10: CV ← {i | mean(MSE[i]) < 1e−2, i = 1, ..., D};
11: [C1, C2] ← apply k-means to group all variables into two categories, adopting Angle as the feature;
12: if CV ∩ C1 ≠ ∅ and CV ∩ C2 ≠ ∅ then
13:   CV ← CV ∩ C, where C is either C1 or C2, whichever has the smaller average Angle;
14: end if
15: DV ← {j | j ∉ CV, j = 1, ..., D}

In the end, we adopt the k-means clustering approach to separate all variables in the search space into two groups considering the properties of each variable. Variables with smaller average angles are grouped in the convergence-related cluster, while the others are grouped in the diversity-related cluster.

We should notice that the technique we adopt here can gain more precise results. Specifically speaking, those variables that are identified as mixed variables in MOEA/DVA no longer exist; they are classified as either diversity-related variables or convergence-related variables. Besides, some diversity-related variables may also be considered convergence-related variables if they make more contribution to convergence.

Another point that we should pay attention to is that, as the case stands, a certain decision variable may change its class in different regions. The more solutions we select to be perturbed, the more areas we will probably sample; however, the global consistency of the final solutions still cannot be guaranteed.
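The clustering step can be sketched as follows. This is an illustrative simplification for a two-objective problem (the objective function f, the bounds lb/ub and all function names are hypothetical), using a straight-line fit and scikit-learn's k-means rather than the authors' exact implementation.

```python
import numpy as np
from sklearn.cluster import KMeans

def angle_feature(f, x, var_idx, per_num, lb, ub):
    """Perturb one variable of a candidate solution, fit a line to the
    perturbed solutions in normalized 2-D objective space, and return the
    angle between that line and the normal of the hyperplane f1 + f2 = c."""
    objs = []
    for v in np.linspace(lb[var_idx], ub[var_idx], per_num):
        y = x.copy()
        y[var_idx] = v
        objs.append(f(y))
    objs = np.asarray(objs)
    objs = (objs - objs.min(0)) / (objs.max(0) - objs.min(0) + 1e-12)
    slope = np.polyfit(objs[:, 0], objs[:, 1], 1)[0]    # fitted line L
    d = np.array([1.0, slope]) / np.hypot(1.0, slope)   # direction of L
    normal = np.array([1.0, 1.0]) / np.sqrt(2.0)        # hyperplane normal
    return float(np.arccos(np.clip(abs(d @ normal), 0.0, 1.0)))

def cluster_variables(angle_matrix):
    """k-means on the per-variable angle features (n_vars x SelNum); the
    cluster with the smaller mean angle is the convergence-related group."""
    labels = KMeans(n_clusters=2, n_init=10).fit_predict(angle_matrix)
    conv = int(np.argmin([angle_matrix[labels == k].mean() for k in (0, 1)]))
    cv = np.flatnonzero(labels == conv)   # convergence-related variables
    dv = np.flatnonzero(labels != conv)   # diversity-related variables
    return cv, dv
```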
3.2. Dimensionality reduction based on PCA

After the clustering step, PCA is applied to obtain a low-dimensional representation of the convergence-related variables. The data is first normalized to have zero mean. To guarantee that different features are processed on the same scale, the next step is to re-scale the coordinates, because different attributes have different scales; after this step each coordinate has unit variance:

\sigma_j^2 = \frac{1}{m}\sum_{i=1}^{m} \bigl(x_j^{(i)}\bigr)^2    (4)

After that, we replace each x_j^{(i)} with x_j^{(i)}/σ_j. Once the preprocessing step is achieved, we calculate the covariance matrix, symbolized by Σ and defined as follows:

\Sigma = \frac{1}{m}\sum_{i=1}^{m} x^{(i)} \bigl(x^{(i)}\bigr)^{\top}    (5)

The covariance matrix is symmetric and its size is N × N. In order to reduce the dimension, we need to calculate the eigenvectors of the matrix Σ; we can use singular value decomposition (SVD) [45] to achieve this. By SVD we can gain the eigenvalues as well as the eigenvector matrices, where 'U' is an N-square matrix with column vectors {u^(1), ..., u^(m)}, and 'S' is a diagonal matrix, whose dimension is also N, with only the elements {s^(1), ..., s^(m)} on its diagonal.

We can construct a new matrix Ureduce ∈ R^(n×k) by selecting the first k columns of U to reduce the dimension. A low-dimensional representation 'W' is then obtained as W = (Ureduce)^T × x, where W ∈ R^(k×1).

After the eigenvectors of the covariance matrix are obtained, we rank them in descending order of their eigenvalues. Small eigenvalues mean that their components are less important, so we can ignore them without losing important information. Finally, we get a feature matrix, which is formed by placing the selected eigenvectors in the columns.

We can formulate the average squared error ASE by Eq. (6), where x^(i) is the original data and x_proj^(i) is the projected data in the low-dimensional subspace:

ASE = \frac{1}{m}\sum_{i=1}^{m} \bigl\|x^{(i)} - x_{proj}^{(i)}\bigr\|^2    (6)

The total variation is shown as follows:

\frac{1}{m}\sum_{i=1}^{m} \bigl\|x^{(i)}\bigr\|^2    (7)

A representative method to determine k is to select the minimum value that can maintain 99% of the variance:

\frac{\frac{1}{m}\sum_{i=1}^{m}\|x^{(i)} - x_{proj}^{(i)}\|^2}{\frac{1}{m}\sum_{i=1}^{m}\|x^{(i)}\|^2} \le 0.01    (8)
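The preceding procedure — standardization (Eq. (4)), covariance (Eq. (5)), SVD, and choosing the smallest k satisfying the retained-variance criterion (Eq. (8)) — can be sketched in a few lines. This is an illustrative NumPy version with an adjustable retention ratio (the paper varies it between 10% and 99%), not the authors' Matlab code.

```python
import numpy as np

def pca_reduce(X, retain=0.99):
    """X: m x n data matrix (rows are samples). Returns the reduced data W
    and the projection matrix U_reduce (first k columns of U)."""
    X = X - X.mean(axis=0)                   # zero-mean each feature
    sigma = np.sqrt((X ** 2).mean(axis=0))   # Eq. (4): per-feature scale
    X = X / (sigma + 1e-12)                  # unit variance per coordinate
    cov = (X.T @ X) / X.shape[0]             # Eq. (5): covariance matrix
    U, S, _ = np.linalg.svd(cov)             # SVD of the symmetric matrix
    ratio = np.cumsum(S) / S.sum()           # fraction of variance retained
    k = min(int(np.searchsorted(ratio, retain)) + 1, U.shape[1])
    U_reduce = U[:, :k]                      # keep the first k directions
    W = X @ U_reduce                         # low-dimensional representation
    return W, U_reduce
```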
3.3. Interdependence analysis

In this section, we detect the interactions between the convergence-related variables, because their number is relatively large and they influence the convergence speed towards the true PF. According to these relationships, the variables are further decomposed into a set of sub-groups. In this decomposition, variables that interact with each other are in the same subgroup, and there is no interaction between any two subgroups. The variables in the same subgroup are also called interacting variables, and none of them can be optimized independently, since interactions exist among them. Algorithm 3 gives the specifics of the interaction detection for the variables correlated with convergence, and the technique we apply is proposed in [29]. Many single-objective large-scale optimization algorithms [19,25] have also adopted similar approaches for variable interaction analysis.

We define the interaction between two variables as follows: suppose we have an MOP f = min(f1, f2, ..., fm); if there exist x, a1, a2, b1, b2 such that, for at least one fk among f1, f2, ..., fm, 1 ≤ k ≤ m,

fk(x)|_{xi=a2, xj=b1} < fk(x)|_{xi=a1, xj=b1}  and  fk(x)|_{xi=a2, xj=b2} > fk(x)|_{xi=a1, xj=b2}

hold simultaneously, then variables xi and xj are considered to interact with each other, where fk(x)|_{xi=a2, xj=b1} = fk(x1, ..., xi−1, a2, ..., xj−1, b1, ..., xn). In Algorithm 3, we first initialize an empty set of interacting variable groups and then assign the convergence-related variables to disparate subgroups according to the pairwise interactions between variables. More specifically, a variable will be assigned into an existing subgroup on condition that it interacts with at least one current variable in subCVs; otherwise, the variable will be assigned to a new subgroup. We can infer that there are two extreme cases: in one situation, there is only one subgroup, which signifies that the convergence variables are completely inseparable; in the other case, if the decision variables are completely separable, the number of subgroups would be |CV|.

Algorithm 3 Interdependence Analysis
Input: pop, Conver, number of chosen solutions CorNum
Output: subCVs
1: set subCVs as ∅;
2: for all ν ∈ CV do
3:   set CorSet as ∅;
4:   for all Group ∈ subCVs do
5:     for all u ∈ Group do
6:       flag ← False;
7:       for i = 1 : CorNum do
8:         Choose an individual p from pop randomly;
9:         if ν has an interaction with u in individual p then
10:          flag ← True;
11:          CorSet = {Group} ∪ CorSet;
12:          Break;
13:        end if
14:      end for
15:      if flag is True then
16:        Break;
17:      end if
18:    end for
19:  end for
20:  if CorSet = ∅ then
21:    subCVs = subCVs ∪ {{ν}};
22:  else
23:    subCVs = subCVs \ CorSet;
24:    Group ← all variables in CorSet and ν;
25:    subCVs = subCVs ∪ {Group};
26:  end if
27: end for

3.4. Subcomponent optimization

Just as proposed in MOEA/DVA [29], Algorithm 4 introduces the subcomponent optimization in detail. When optimizing one sub-MOP, the information of its neighborhood is meaningful. Here we use the Euclidean distances between diversity variables to define the neighborhood relations: if the diversity variables of the i-th sub-MOP are close to those of the j-th sub-MOP, then we say the i-th sub-MOP and the j-th sub-MOP are neighbors. Since the values of the diversity-related variables were fixed for each sub-MOP earlier, now only the offspring population of the i-th sub-MOP is used to renew the current solution. Differential evolution [46] is used to generate new solutions in this step.

Algorithm 4 Subcomponent Optimization
Input: pop, Obj, indexes of the convergence variables in the identical subcomponent indexes
Output: pop, corresponding objective vectors Val
1: for i = 1 : N do
2:   The p-th and q-th individuals are randomly chosen from the adjacent individuals. Adopt the differential evolution (DE) strategy [46] to form new variables y′ ← pop(i, indexes) + F ∗ (pop(p, indexes) − pop(q, indexes)), and then execute a mutation operation on y′;
3:   When a component of y′ is out of its boundary, its value is replaced by a randomly generated value within the boundary. Then evaluate the new individual and set FE = FE + 1;
4:   if Σ_{j=1}^{m} fj(y) < Σ_{j=1}^{m} Val(i, j) then
5:     let pop(i, :) = y and Val(i, :) = F(y);
6:   end if
7: end for
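The two mechanisms just described can be sketched as follows. The objective function f, the bounds lb/ub and the helper names are hypothetical; the trial values a1/a2/b1/b2 are drawn at random, and the boundary repair and extra mutation step of Algorithm 4 are omitted for brevity.

```python
import numpy as np

def interact(f, x, i, j, lb, ub, rng):
    """Pairwise test of Section 3.3: variables i and j interact if changing
    x_i reverses the effect of x_j on at least one objective f_k."""
    a1, a2 = rng.uniform(lb[i], ub[i], 2)
    b1, b2 = rng.uniform(lb[j], ub[j], 2)
    def fk(ai, bj):
        y = x.copy()
        y[i], y[j] = ai, bj
        return np.asarray(f(y))
    d1 = fk(a2, b1) - fk(a1, b1)
    d2 = fk(a2, b2) - fk(a1, b2)
    return bool(np.any((d1 < 0) & (d2 > 0)))   # same objective index k

def de_update(f, pop, val, i, idx, neighbors, F=0.5, rng=None):
    """One Algorithm-4 style step: a DE difference vector applied to the
    subcomponent idx, accepted if the sum of objectives improves."""
    rng = rng or np.random.default_rng()
    p, q = rng.choice(neighbors, size=2, replace=False)
    y = pop[i].copy()
    y[idx] = pop[i, idx] + F * (pop[p, idx] - pop[q, idx])
    fy = np.asarray(f(y))
    if fy.sum() < val[i].sum():
        pop[i], val[i] = y, fy
    return pop, val
```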
Table 1
Average and standard deviation of IGD obtained by the compared algorithms on problems with 30 variables.

Problems | PCA-MOEA | MOEA/DVA | LMEA | NSGA-III | MOEA/D
UF1 | 3.6156E−03 (8.3932E−05) | 4.1434E−03 (8.2635E−05) ≈ | 5.9665E−03 (3.4239E−03) − | 7.5532E−03 (5.3728E−03) − | 5.9238E−03 (2.3132E−01) −
UF2 | 3.9060E−03 (5.3910E−05) | 4.2738E−03 (8.3613E−04) ≈ | 7.3550E−03 (2.4186E−04) − | 8.2805E−03 (4.3291E−03) − | 6.2931E−03 (7.3813E−04) −
UF3 | 1.1219E−01 (9.2592E−02) | 2.2714E−02 (7.2599E−03) + | 7.8110E−02 (7.5420E−03) + | 4.3821E−02 (2.3748E−02) + | 8.2734E−03 (6.2831E−03) +
UF4 | 1.3683E−02 (4.0392E−04) | 3.5067E−02 (1.0070E−03) − | 3.2457E−02 (3.2281E−03) − | 4.3279E−02 (6.4132E−03) − | 4.5277E−02 (5.3288E−03) −
UF5 | 2.5660E−02 (3.2931E−02) | 3.2592E−02 (4.6786E−03) ≈ | 1.2013E−01 (6.4271E−03) − | 4.2841E−02 (8.3749E−02) − | 4.2885E−02 (4.2173E−02) −
UF6 | 6.2344E−01 (3.4904E−02) | 5.6134E−02 (1.3729E−02) + | 1.2823E−01 (4.3980E−03) + | 3.2118E−01 (4.8871E−02) ≈ | 3.2813E−01 (3.2382E−01)
UF7 | 5.5550E−02 (4.0288E−04) | 3.7667E−03 (4.6437E−05) + | 6.0919E−02 (8.7121E−04) ≈ | 6.7283E−03 (4.2142E−03) + | 5.2318E−03 (5.3821E−03) +
UF8 | 4.7767E−02 (3.2915E−03) | 5.7788E−02 (1.1960E−02) − | 9.9038E−02 (7.5730E−03) − | 7.3877E−02 (3.2375E−02) − | 6.4738E−02 (3.2732E−02) −
UF9 | 4.4875E−02 (3.0980E−03) | 1.2333E−01 (1.6254E−01) − | 1.3572E−01 (3.4729E−02) ≈ | 5.3729E−02 (2.3841E−02) − | 3.2831E−01 (4.7694E−02)
UF10 | 1.0134E−01 (2.1218E−02) | 1.0352E−01 (3.3009E−03) ≈ | 4.6055E−01 (6.4729E−02) − | 4.3728E−01 (7.2713E−01) − | 3.8716E−01 (3.2371E−02) −
WFG1 | 9.2360E−01 (3.2132E−02) | 2.1730E+00 (1.4480E−02) − | 1.5998E+00 (3.2194E−03) − | 1.3230E+00 (6.4182E−03) − | 3.2166E−01 (6.2713E−02) ≈
WFG2 | 3.3146E−01 (9.2194E−02) | 2.2200E−01 (3.4970E−02) ≈ | 2.7077E−01 (5.2188E−02) ≈ | 3.2931E−01 (3.8321E−02) ≈ | 4.2318E−02 (1.2131E−02) −
WFG3 | 4.4629E−02 (4.1213E−02) | 7.7500E−02 (1.2320E−02) ≈ | 5.9578E−02 (6.3823E−02) − | 3.2774E−01 (2.3238E−02) − | 5.2886E−01 (2.3813E−02) −
WFG4 | 4.5311E−01 (1.2102E−02) | 2.2543E−01 (1.6949E−03) + | 2.1377E−01 (6.3831E−03) + | 3.8692E−01 (2.3828E−02) ≈ | 3.8125E−01 (3.2842E−02) ≈
WFG5 | 3.9820E−01 (8.2913E−03) | 2.1210E−01 (5.5937E−03) + | 2.3260E−01 (3.2881E−03) + | 3.0281E−01 (1.8792E−02) ≈ | 2.9789E−01 (3.2813E−02) ≈
WFG6 | 2.0512E−01 (3.2019E−03) | 2.1900E−01 (1.9370E−03) ≈ | 2.1807E−01 (8.5283E−03) ≈ | 3.2981E−01 (3.2933E−03) − | 2.7688E−01 (2.3891E−02) −
WFG7 | 5.0831E−01 (8.2812E−03) | 2.1500E−01 (1.1140E−03) + | 2.2136E−01 (8.4731E−03) + | 3.0218E−01 (5.8079E−03) + | 3.2831E−01 (4.2813E−03) +
WFG8 | 3.1863E−01 (9.3201E−03) | 2.9030E−01 (7.6020E−03) ≈ | 2.5522E−01 (4.2383E−03) + | 3.2813E−01 (5.2739E−02) ≈ | 3.0084E−01 (4.3823E−02) ≈
WFG9 | 2.2627E−01 (8.2938E−03) | 2.4660E−01 (1.8960E−02) ≈ | 2.3309E−01 (2.3841E−03) ≈ | 2.9743E−01 (1.0836E−02) − | 2.7649E−02 (7.3723E−03) −
+/−/≈ | | 6/4/9 | 6/9/4 | 3/10/6 | 4/10/5
4. Experiments and results

For the sake of demonstrating the performance of PCA-MOEA, several widely used suites of benchmark problems are experimentally tested in this part, i.e., the DTLZ test suite [47], the ZDT test suite [48], the complicated UF1–UF10 problems [49] and the WFG test suite [50]. All the simulations were run on a personal computer with Windows 7, configured with an Intel(R) Core(TM) i3 CPU M 350 and 4 GB RAM.

4.1. Performance measurements

The inverted generational distance (IGD) [50] is adopted in the following experimental studies, as it is an index that can assess convergence and uniformity performance simultaneously. Let P* be a set of evenly distributed solutions on the true PF, and let P be the approximate Pareto front. d(ν, P) expresses the minimum Euclidean distance from ν in P* to the solutions in P. Then we can define the IGD value from P* to P as

IGD(P^*, P) = \frac{\sum_{\nu \in P^*} d(\nu, P)}{|P^*|}

For two-objective test problems, we set the number of solutions in P* as 500, and for three-objective instances that number is set as 2500 in this work. Each experiment runs 30 times independently, and the best results are highlighted on the basis of the average IGD.
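A direct implementation of this indicator (an illustrative, vectorized NumPy sketch, not the evaluation code used by the authors):

```python
import numpy as np

def igd(ref_front, approx_front):
    """IGD(P*, P): mean, over reference points v in P*, of the minimum
    Euclidean distance from v to the obtained front P (lower is better)."""
    ref = np.asarray(ref_front, dtype=float)[:, None, :]     # |P*| x 1 x m
    app = np.asarray(approx_front, dtype=float)[None, :, :]  # 1 x |P| x m
    dist = np.linalg.norm(ref - app, axis=2)                 # pairwise distances
    return float(dist.min(axis=1).mean())
```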
4.2. Experimental results and analysis

4.2.1. Results of problems with 30 variables
In this part, we apply all algorithms to test suites with low-dimensional decision variables, i.e., UF1–UF9 with 30 variables and the WFG1–WFG9 test problems with 24 variables. The reason why we use these test problems is that we want to make complete comparisons with MOEA/DVA [29]. The details of the mathematical descriptions as well as their ideal PFs can be found in [46] and [51]. For 2-objective functions, the size of the population is set as 100, while for 3-objective test instances it is set as 150, which is the same as in [29]. For fair comparisons, some general parameters used in the compared algorithms are stated here. The distribution indexes of SBX as well as polynomial mutation were set at 15. Besides, we set the mutation probability as 1/n, where n refers to the quantity of variables in the decision space. The crossover rate (CR) and scaling factor (F) of differential evolution are set as 1 and 0.5, respectively, as recommended in [52]. In MOEA/D, we set the size of the neighbor subproblems (T) as 0.1N as usual, where N is the size of the population, and another parameter, the parent selection probability, is set as 0.9. These parameters are fixed in the following parts of the experiments. As in [29], the maximal number of function evaluations is set as 300 000 for the UF test suites and 100 000 for the WFG test suites, respectively. In this section, to make a fair comparison, we set the same number of function evaluations as in [29]. But it needs to be pointed out that the number only needs to be set as 100 000 for the UF test suites and 10 000 for the WFG problems in the proposed algorithm, because with this setting we already have satisfactory results, which will be proven in a later section. The code of LMEA can be found at http://www.soft-computing.de/jin-pub_year.html. The code of MOEA/D and NSGA-III can be found in PlatEMO [53]. MOEA/DVA is implemented by ourselves in Matlab according to [29].

Table 1 presents the statistical results of the five compared algorithms in terms of IGD over 30 runs. In the table, the best results of the five algorithms are highlighted. The IGD of the five compared algorithms is also analyzed using the Wilcoxon signed-rank test [41]. "+", "−" and "≈" in Table 1 denote that the performance of the compared algorithm is better than, worse than, or similar to that of the proposed algorithm, respectively.

From the statistical results, we can see that PCA-MOEA has better performance on the UF1–UF2, UF4–UF5, UF8, UF10, WFG3, WFG6 and WFG9 problems, while MOEA/DVA performs best on UF6–UF7, WFG5 and WFG7, and LMEA gets the best results on the WFG4 and WFG8 problems. As for UF3, WFG1 and WFG2, MOEA/D has more satisfactory results than the other algorithms. The proposed PCA-MOEA significantly outperforms LMEA, NSGA-III and MOEA/D according to the Wilcoxon signed-rank test, and performs similarly to MOEA/DVA. The success of PCA-MOEA and MOEA/DVA comes from two facets. On the one hand, they both decompose the convergence-related variables into a series of smaller sub-components, which can reduce the optimization difficulty. On the other hand, in the last stage of the whole evolution process, all variables are optimized together to get a uniformly distributed population. We can reach the conclusion that if the MOP is decomposable, we can use the structure of the problem to simplify the optimization.

For most UF test problems, all variables related to convergence are independent for each objective function, while for most WFG test suites the mapping from the PS to the PF has a great deviation. We kept eighty percent of the variance, so the second parameter of PCA is set as 80%. After dimensionality reduction, the number of convergence-related variables turns to 18 and 10, respectively, while the original numbers are 29 and 20. This difference is not very significant, as such problems are still small-scale MOPs; however, by doing this we indeed reduce computational costs because a smaller number of function evaluations is needed.

Figs. 1 and 2 show the PFs obtained by the five algorithms on the UF2 and UF4 test functions. We can clearly find that the final populations obtained by MOEA/DVA, MOEA/D and NSGA-III do not spread over the whole true Pareto front on the UF2 problem, even though their convergence ability is satisfactory. LMEA has better convergence, but its diversity is less desirable than that of PCA-MOEA. For the UF4 problem, LMEA and NSGA-III have a not too bad diversity performance, but neither of them converges to the true PF; in addition, MOEA/DVA and MOEA/D do not obtain good results. However, our PCA-MOEA not only has great convergence ability, its diversity performance is also excellent on the two problems.

Fig. 1. The PF obtained by five algorithms on UF2 function with 30 decision variables.
Fig. 2. The PF obtained by five algorithms on UF4 function with 30 decision variables.

4.2.2. Results of test problems with 200 variables
MOEA/DVA dealt with MOPs with 200 variables to demonstrate its effectiveness. In this section of experiments, we conduct all algorithms on test functions with 200 decision variables. To make fair comparisons, we apply the test functions ZDT4, DTLZ1, DTLZ3, UF1–UF6 and UF10, as in [29]. The population size is 100 for the ZDT4 and UF1–UF7 problems, while for the DTLZ1, DTLZ3 and UF8–UF10 problems it is set as 150. Table 2 shows the comparison results. For the ZDT4, DTLZ and UF1–UF2 test functions, the maximum number of function evaluations is set as 1 200 000, and for the UF3–UF6 and UF10 problems it is set as 3 000 000 for MOEA/DVA, LMEA and MOEA/D. When applying MOEA/DVA, a smaller number is needed in different cases.

The statistical results of IGD over 30 runs of the compared algorithms are described in Table 2. In the table, the results of the best performing algorithm, i.e., the smallest mean IGD values, are highlighted. The Wilcoxon signed-rank test [41] is conducted to compare the significance of the difference between PCA-MOEA and the compared algorithms. "+", "−" and "≈" indicate that the performance of PCA-MOEA is better than, worse than, or similar to that of the algorithm under comparison, respectively.
Table 2
Average and standard deviation of IGD obtained by the compared algorithms on problems with 200 variables.

Problems | PCA-MOEA (90%) | PCA-MOEA (10%) | MOEA/DVA | LMEA | MOEA/D
ZDT4 | 3.8682E−03 (3.9829E−05) | 3.9283E−03 (4.3872E−04) | 3.9231E−03 (3.5337E−05) ≈ | 2.1664E−02 (6.4812E−03) − | 9.4592E−01 (2.7429E−02) −
DTLZ1 | 4.1248E−02 (5.3819E−04) | 1.6330E−02 (3.2938E−04) | 2.2652E−02 (1.1502E−04) − | 8.1169E−02 (4.1739E−03) − | 3.5817E+02 (6.3812E+01) −
DTLZ3 | 3.1999E−02 (3.2911E−04) | 4.2469E−02 (8.2839E−04) | 5.8425E−02 (1.7293E−04) − | 2.8812E−01 (4.3719E−03) − | 5.3798E+02 (3.2189E+02) −
UF1 | 3.9391E−03 (5.3297E−04) | 2.7804E−03 (2.1937E−04) | 4.0108E−03 (8.1582E−05) − | 8.8718E−03 (2.3184E−02) − | 8.3810E−02 (5.2739E−02) −
UF2 | 3.6308E−03 (8.3239E−05) | 2.9190E−03 (8.3726E−05) | 4.0657E−03 (2.1424E−05) − | 1.3663E−02 (4.3812E−03) − | 3.4829E−02 (6.4822E−02) −
UF3 | 5.2287E−02 (6.4132E−05) | 3.6361E−02 (9.4273E−05) | 3.9059E−03 (5.0902E−05) + | 2.7957E−02 (4.2831E−04) ≈ | 2.3193E−02 (2.3139E−02) ≈
UF4 | 1.1352E−02 (8.4223E−05) | 1.0076E−02 (2.1837E−05) | 3.2392E−02 (2.8345E−04) − | 3.1062E−02 (6.3719E−02) − | 1.2931E−01 (2.3139E−02) −
UF5 | 3.1679E−02 (4.2813E−03) | 3.0466E−02 (6.8372E−03) | 3.2378E−02 (5.2747E−03) ≈ | 9.6097E−02 (8.2713E−03) − | 1.3822E−01 (3.2819E−01) −
UF6 | 1.4097E−02 (8.2831E−04) | 3.6369E−02 (4.1832E−03) | 1.8064E−02 (2.5301E−03) − | 3.5603E−02 (3.2881E−03) − | 4.3829E−02 (5.3723E−02) −
UF10 | 6.8384E−02 (2.3928E−02) | 1.0426E−01 (9.4737E−01) | 2.3715E−01 (4.8283E−01) − | 4.1564E−01 (6.3811E−02) − | 1.3562E+00 (2.3913E−01) −
+/−/≈ | | | 1/7/2 | 0/9/1 | 0/9/1
We can observe that on most problems the newly presented algorithm is superior to MOEA/DVA, LMEA and MOEA/D. Not only does it have a smaller IGD value, which means better convergence and diversity ability, but the number of function evaluations is also smaller. We need only 400 000 function evaluations for the first five test instances and 1 000 000 for the last five instances to get ideal results when the parameter in PCA is set as 95% (the number in parentheses under the title of PCA-MOEA). When it comes to 10%, the needed number of function evaluations is much smaller. In the procedure, it can be seen that the number of convergence-related variables becomes 77 and 5, respectively, from the original 200 for the 2-objective UF problems. The maximal number of function evaluations for the first five problems is set as 1 200 000, and for the others it is set as 3 000 000, when all the other algorithms work.

It should be noted that the number of convergence-related variables after dimensionality reduction correlates with the population size, and is no more than the initial population size.

Figs. 3 and 4 present the final populations obtained by the compared algorithms on the UF2 and ZDT4 test functions, respectively. Clearly, the PF obtained by PCA-MOEA is better than the others; both its convergence ability and its diversity ability are more excellent. The result obtained by PCA-MOEA when the parameter is set as 10% is slightly better than the other case, where the parameter is set as 95%, in Fig. 3. In Fig. 4, the difference between the first two figures is smaller, which means that the parameter setting has little influence in this case.

Fig. 5 shows the evolution curves of the IGD values generated by the compared algorithms on four functions with 200 variables. From these figures, we can observe that PCA-MOEA converges faster and can converge at lower IGD values than the other algorithms. The reason is that the dimensionality reduction method reduces the difficulty of optimization and saves a lot of function evaluations. Furthermore, the independent optimization of each sub-component can accelerate the convergence rate.
Fig. 3. The PF obtained by five algorithms on UF2 function with 200 decision variables.
Fig. 4. The PF obtained by five algorithms on ZDT4 function with 200 decision variables.
4.2.3. Results of test problems with 1000 variables
MOEA/DVA is promising for solving multi-objective problems with 200 variables in the decision space; however, when the dimensionality increases to 1000, it sharply loses its efficiency, which is mainly on account of the large expense of function evaluations in the procedures of decision variable analysis and interdependence analysis. Thus it is urgently needed to reduce the dimensionality. In this part, we will make comparisons on the UF test suites and WFG test suites with 1000 variables. These two test suites are adopted in [54].

In [54], a random-based dynamic grouping approach is proposed to solve MOPs with numerous decision variables; it decomposes the whole dimension into a number of groups, each containing some variables. In this approach, the group size and the components in each group are all dynamic. So in this part, we also add MOEA/D-RDG developed in [54] as a comparative algorithm. MOEA/D-RDG is implemented by ourselves in Matlab according to [54].

In these experiments, for fair comparison, we set the population size as 300 for 2-objective functions and 600 for 3-objective functions. The population size and the adopted test problems stay the same as in [54]. We compare PCA-MOEA with MOEA/DVA, LMEA, MOEA/D-RDG and MOEA/D. The statistical results of IGD over 30 runs of the compared algorithms can be found in Table 3. In the table, the results of the best performing algorithm are highlighted. The Wilcoxon signed-rank test [41] is conducted to compare the significance of the difference between PCA-MOEA and the compared algorithms. We can obviously see from the table that on 1000-dimensional MOPs, PCA-MOEA has much better performance than MOEA/DVA, LMEA and MOEA/D on the UF test problems. PCA-MOEA beats MOEA/D-RDG on 8 UF functions, but does not perform as well on the WFG test problems; MOEA/D-RDG obtains 6 best results out of 9 problems. In this case, we kept ten percent of the variance in these experiments, hence the second parameter of PCA is set as 10%.

Another point where our algorithm is better than the other algorithms is that it saves much more computational cost. On the 2-objective UF1–UF7 functions, our algorithm only needs 6.0E+5 evaluations to get satisfactory results, and 1.0E+6 for the 3-objective UF8–UF10 functions, and for the 3-objective WFG1–WFG9 functions,
Fig. 5. The evolution curves of IGD values generated by the compared algorithms on four functions with 200 variables.
Table 3
Average and standard deviation of IGD obtained by the compared algorithms on problems with 1000 variables.

Problems | PCA-MOEA | MOEA/DVA | LMEA | MOEA/D-RDG | MOEA/D
UF1 | 1.4917E−03 (3.9433E−03) | 9.4342E−03 (4.2341E−03) − | 4.3791E−03 (2.4810E−03) − | 4.6088E−03 (8.7010E−04) − | 2.0250E−01 (4.7538E−02) −
UF2 | 1.8824E−03 (9.8941E−03) | 6.3209E−03 (3.2052E−03) − | 2.0764E−02 (6.4822E−03) − | 5.4708E−03 (2.5791E−03) − | 9.0383E−02 (4.2934E−02) −
UF3 | 3.6242E−02 (4.3213E−03) | 2.0931E−02 (3.2395E−03) ≈ | 7.4728E−02 (5.3829E−03) + | 2.6605E−03 (6.5686E−04) + | 9.1349E−02 (4.2983E−02) ≈
UF4 | 7.2061E−03 (2.4920E−02) | 4.0328E−02 (4.3291E−04) − | 7.6210E−03 (4.3711E−04) ≈ | 1.0796E−01 (1.2308E−02) − | 8.3742E−02 (4.2938E−02) −
UF5 | 2.5029E−02 (3.9130E−04) | 4.2980E−01 (8.3712E−03) − | 3.6433E−01 (4.8792E−03) − | 1.0984E−01 (1.3889E−01) − | 9.4829E−01 (7.4872E−02) −
UF6 | 1.0547E−02 (4.2932E−03) | 5.3920E−02 (1.4391E−03) − | 4.6511E−02 (2.6549E−03) − | 2.4528E−02 (9.5862E−04) − | 7.4763E−01 (4.3872E−02) −
UF7 | 3.2817E−02 (2.3873E−03) | 8.4240E−02 (8.3092E−03) − | 5.7547E−02 (5.7671E−03) − | 5.3030E−03 (7.2415E−04) + | 7.3729E−01 (4.3710E−02) −
UF8 | 2.8438E−02 (7.6791E−02) | 5.3920E−01 (4.3239E−02) − | 7.6535E−01 (1.3431E−02) − | 1.6902E−01 (1.1688E−01) − | 4.2931E−01 (9.3861E−02) −
UF9 | 2.3988E−02 (3.2938E−02) | 4.2989E−01 (4.3942E−02) − | 3.6541E−03 (5.7641E−03) + | 2.6720E−02 (1.6021E−02) − | 6.2094E−01 (3.2339E−02) −
UF10 | 9.0738E−02 (3.2187E−03) | 7.1352E+00 (7.8921E−02) − | 2.0582E+00 (8.7735E−03) − | 2.5661E+00 (2.0235E−01) − | 9.3719E−01 (3.2188E−02) −
WFG1 | 9.0033E−01 (3.2032E−03) | 2.3981E+00 (3.4894E−02) − | 2.5068E+00 (4.7641E−03) − | 9.0189E−01 (1.2965E−02) ≈ | 3.4792E+00 (3.3019E−02) −
WFG2 | 2.9691E−01 (8.2712E−01) | 5.3903E−01 (3.4740E−02) − | 3.0864E−01 (7.5432E−02) ≈ | 1.2485E−01 (2.4283E−02) + | 1.2817E+00 (4.8392E−02) −
WFG3 | 1.7017E−02 (3.2091E−02) | 5.3910E+00 (4.2139E−02) − | 7.9849E−01 (3.5312E−03) − | 1.5543E+00 (1.7457E−02) − | 2.4729E+00 (1.2934E−03) −
WFG4 | 5.0524E−01 (3.2934E−02) | 8.4912E−01 (3.4929E−03) − | 8.7512E−02 (5.6412E−03) + | 8.8967E−02 (1.4603E−03) + | 7.8371E−01 (4.3910E−02) −
WFG5 | 4.3920E−01 (9.4883E−03) | 5.3291E−01 (4.9024E−03) − | 3.6511E−01 (2.7759E−03) ≈ | 1.1670E−01 (2.2352E−04) + | 1.6791E+00 (3.2938E−03) −
WFG6 | 2.4909E−01 (1.3249E−03) | 4.3294E−02 (3.4898E−03) + | 6.3712E−02 (4.2810E−03) + | 8.8139E−02 (5.1434E−04) + | 8.3742E−01 (1.2938E−02) −
WFG7 | 7.3981E−01 (7.9301E−03) | 4.3920E−01 (9.4232E−03) ≈ | 9.8273E−02 (2.3719E−04) + | 8.7095E−02 (1.1844E−04) + | 1.4923E+00 (9.4728E−03) −
WFG8 | 9.1930E−01 (7.3889E−03) | 4.4903E−01 (5.3291E−03) ≈ | 5.2718E−01 (7.4390E−03) ≈ | 1.0856E−01 (3.1555E−03) + | 8.3872E−01 (3.0984E−03) −
WFG9 | 8.2839E−01 (3.1283E−03) | 5.3931E−01 (3.2939E−03) ≈ | 5.3719E−01 (5.3182E−03) ≈ | 9.0481E−02 (1.0489E−03) + | 7.3763E−01 (9.3848E−02) ≈
+/−/≈ | | 1/14/4 | 5/9/5 | 9/9/1 | 0/17/2
Table 4
Average and standard deviation of IGD obtained by the algorithms on problems with 2000 and 5000 variables.

Problems | 2000 variables | 5000 variables
UF1 | 2.6929e−3 (5.3811e−4) | 2.6471e−3 (3.2473e−4)
UF2 | 2.7500e−3 (7.5819e−4) | 1.4911e−3 (8.3728e−4)
UF3 | 6.9666e−2 (4.2819e−4) | 3.5940e−2 (5.2938e−5)
UF4 | 9.2341e−3 (6.5829e−4) | 7.7227e−3 (2.3948e−4)
UF5 | 2.1823e−2 (2.3180e−4) | 2.2412e−2 (4.2938e−4)
UF6 | 3.4261e−2 (7.5918e−5) | 3.2074e−2 (4.9847e−5)
UF7 | 3.8879e−2 (3.1823e−3) | 3.3842e−2 (3.2938e−3)
UF8 | 4.4441e−2 (9.7281e−2) | 4.4154e−2 (4.3984e−3)
UF9 | 3.7932e−2 (3.2810e−3) | 3.1010e−2 (2.9381e−3)
UF10 | 9.1263e−2 (3.1890e−3) | 1.0446e−1 (5.4421e−3)

Table 5
The components of each algorithm.

Algorithm | Clustering approach | Dimensionality reduction technique | Optimization method
PCA-MOEA | k-means | PCA | Cooperative coevolution
PCA-MOEA without PCA | k-means | No | Cooperative coevolution
MOEA/DVA | Decision variable analysis | No | Cooperative coevolution and MOEA/D
LMEA | k-means | No | Convergence optimization and diversity optimization
MOEA/D | No | No | MOEA/D
In order to verify the effectiveness of PCA, experiments with and without PCA are carried out under the same algorithmic framework. We choose the UF1 test problem and run the above two algorithms on 30, 100, 200, 300, 400, 500 and 1000 dimensions of UF1, respectively. As mentioned earlier, the introduction of PCA greatly saves resources, which also means less time. We compare the time required for the two kinds of experiments when the IGD is around 0.003 in Table 6.

Fig. 7 is an image representation. From Table 5 and Fig. 7, we can see that the use of PCA significantly enhances the effectiveness of the algorithm and achieves excellent results in a very short time. The higher the dimension is, the more obvious the superiority is. In conclusion, the proposed PCA-MOEA
Table 6
Comparisons of time consumed with PCA and without PCA.
Dimension 30 100 200 300 400 500 1000
Time consumed with PCA 0.57 s 0.76 s 0.98 s 1.17 s 1.37 s 1.72 s 3.06 s
Time consumed without PCA 5.43 s 22 s 42 s 137 s 239 s 412 s 2520 s
Fig. 9. IGD metric value of PCA-MOEA on UF2 problem with different parameter values, averaging over 20 runs.
Table 7
The variation of IGD value with parameters.
Parameter 10% 20% 30% 40% 50% 60%
Reduced dimension 21 43 67 93 120 149
IGD 1.4801e−3 1.6628e−3 1.8537e−3 1.6453e−3 1.8097e−3 1.7721e−3
Function evaluation 5e5 7e5 1e6 1.4e6 1.8e6 2e6
Parameter 70% 80% 85% 90% 97% 99%
Reduced dimension 180 215 234 254 285 294
IGD 2.0497e−3 2.0265e−3 1.9794e−3 2.1146e−3 1.9870e−3 1.8273e−3
Function evaluation 2.8e6 3e6 3.5e6 4.5e6 5e6 5.4e6
Table 8
Average and standard deviation value of relative HV by PCA-MOEA, MOEA/DVA, LMEA and NSGA-II for Wind Turbine Placement Optimization with 200 turbines.

Combination | Scenario | PCA-MOEA | MOEA-DVA | LMEA | NSGA-II
1 | Scenario 1 | 0.9999 (0.0325) | 0.9360 (0.0245) | 0.5145 (0.0962) | 0.9652 (0.0325)
1 | Scenario 2 | 0.9969 (0.0523) | 0.9522 (0.0265) | 0.8831 (0.0652) | 0.9437 (0.0259)
2 | Scenario 1 | 0.9977 (0.0298) | 0.9896 (0.0685) | 0.9936 (0.0631) | 0.9885 (0.0325)
2 | Scenario 2 | 0.9865 (0.0256) | 0.9830 (0.0365) | 0.9825 (0.0360) | 0.9856 (0.0532)
3 | Scenario 1 | 0.9922 (0.0092) | 0.9788 (0.0101) | 0.9290 (0.0206) | 0.9652 (0.0355)
3 | Scenario 2 | 0.9882 (0.0156) | 0.9527 (0.0139) | 0.9489 (0.0130) | 0.9685 (0.0326)
Fig. 10. The obtained fitness value by the four algorithms with 200 turbines in Scenario 1 when maximizing the power output (yield) and minimizing the area of the convex hull.

…the wind turbine, these point sets being given by the Euclidean distance between any pair of turbines. The minimum spanning tree of this graph is calculated and used as a target for the cable length costs.

In our experiments, the cost of a convex hull is defined as the area enclosed by the set of points that make up the convex hull. This value is the minimum land area required for the wind farm layout. We calculate the convex hull using Graham's scan algorithm. An introduction to calculating the energy output can be found in [55]. The code for these objective combinations can be found at https://github.com/d9w/WindFLO.
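These two geometric objectives can be computed compactly. The sketch below is illustrative and uses SciPy's convex-hull and minimum-spanning-tree routines instead of a hand-written Graham scan.

```python
import numpy as np
from scipy.spatial import ConvexHull, distance_matrix
from scipy.sparse.csgraph import minimum_spanning_tree

def layout_costs(turbines):
    """turbines: k x 2 array of (x, y) positions. Returns the convex-hull
    land area and the total cable length of a minimum spanning tree over
    pairwise Euclidean distances."""
    area = ConvexHull(turbines).volume          # in 2-D, 'volume' is the area
    dists = distance_matrix(turbines, turbines) # pairwise distances
    cable = minimum_spanning_tree(dists).sum()  # total MST edge length
    return float(area), float(cable)
```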
This value is the minimum land area required for wind farm domain that satisfy constraints. And we choose 200 turbines
layout. We calculate convex hull using Graham’s scan algorithm. scenario. Scenario 1 and Scenario 2 are different scenarios and
And the introduction to calculating energy output can be found the latter is rather complex. The population size is 50, which is
in [55]. The code of these objective combinations can be found in the same as [55]. For all the experiment, the maximum number
https://github.com/d9w/WindFLO. of function evaluations is set as 1 000 000, and all the tests are
Fig. 11. The obtained fitness value by the four algorithms with 200 turbines in Scenario 2 when maximizing the power output (yield) and minimizing the area of the convex hull.
Fig. 12. The obtained fitness value by the four algorithms with 200 turbines in Scenario 1 when maximizing the power output (yield) and minimizing the total distance of the minimum spanning tree.
Fig. 13. The obtained fitness value by the four algorithms with 200 turbines in Scenario 2 when maximizing the power output (yield) and minimizing the total distance of the minimum spanning tree.
Fig. 14. The obtained fitness value by the four algorithms with 200 turbines in Scenario 1 when combining power, convex hull area and minimum spanning tree.
Fig. 15. The obtained fitness value by the four algorithms with 200 turbines in Scenario 2 when combining power, convex hull area and minimum spanning tree.
For all the experiments, the maximum number of function evaluations is set as 1 000 000, and all the tests are run 20 times independently. The obtained objective values are normalized into the following ranges: area 10 km²–100 km², yield 1 000 000 kW–1 440 000 kW, spanning tree length 60 km–150 km.

Table 8 lists the relative HV values obtained by the four algorithms, and the reference point for the HV calculation is set to (10 × 1.1, 1 440 000 × 1.1, 150 × 1.1). The relative HV is the ratio of the HV of the obtained objective values to the HV of the lowest point of the normalized region.
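For the two-objective combinations, the relative-HV metric just described can be sketched as follows. This is an illustrative minimization-form example after normalization, not the authors' code; a real HV computation for the three-objective combination would need a 3-D routine.

```python
import numpy as np

def hypervolume_2d(front, ref):
    """Area dominated by a 2-D front with respect to reference point 'ref'
    (both objectives to be minimized)."""
    pts = sorted(map(tuple, np.asarray(front)))   # ascending in f1
    hv, prev_f2 = 0.0, ref[1]
    for f1, f2 in pts:
        if f2 < prev_f2:                          # non-dominated step
            hv += (ref[0] - f1) * (prev_f2 - f2)
            prev_f2 = f2
    return hv

def relative_hv(front, ref, lowest):
    """Ratio of the front's HV to the HV of the lowest point of the
    normalized region, as described in the text above."""
    return hypervolume_2d(front, ref) / hypervolume_2d([lowest], ref)
```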
In the table, the results of the best performing algorithm are highlighted. From the results in Table 8, we can see that the proposed algorithm can deal with wind turbine placement optimization better than MOEA/DVA and LMEA. In more complex scenarios, our algorithm has more obvious advantages. We plot the results of the four algorithms in Figs. 10 to 15.

From Figs. 10 and 11, which plot objective combination 1 (maximizing the power output (yield) and minimizing the area of the convex hull), we can see that PCA-MOEA can obtain a similar yield while greatly saving the area of the convex hull when compared with MOEA/DVA and LMEA. For example, in Scenario 1, when the yield is 1.3 × 10⁶ kW, the area PCA-MOEA needs is just about 7.4 × 10⁷ m², but MOEA/DVA and LMEA need 9.08 × 10⁷ m² and 9.18 × 10⁷ m², respectively.

For objective combination 2 (maximizing the power output (yield) and minimizing the total distance of the minimum spanning tree), we can see from Figs. 12 and 13 that PCA-MOEA can use less cable length to achieve higher productivity when compared with MOEA/DVA and LMEA. In Scenario 2, when the yield is 2.54 × 10⁶ kW, the cable length PCA-MOEA needs is just about 7.09 × 10⁴ m, but MOEA/DVA and LMEA need 8.2 × 10⁴ m and 8.42 × 10⁴ m, respectively.

As for the three-objective problem, i.e., objective combination 3, we can see from Figs. 14 and 15 that the proposed algorithm obtains a smaller area and minimum spanning tree while achieving higher productivity when compared with the other two algorithms, which means that our algorithm can achieve high productivity and save some costs at the same time. So our algorithm can deal with Wind Turbine Placement Optimization problems.

5. Conclusion

This article suggests a principal component analysis based multi-objective evolutionary algorithm for multi-objective problems with numerous decision variables, in which a clustering method is also applied. We first conduct our algorithm on some benchmark functions with low dimensionality, i.e., the UF and WFG test suites with 30 variables, in the abovementioned experiments. We can get ideal results with less computational cost compared with MOEA/DVA. Besides, we have also applied our technique on several test problems with 200 variables and 1000 variables, which validates the effectiveness and efficiency of our algorithm. Finally, we made an attempt to solve problems with up to 5000 variables. It is inspiring that the proposed approach can also tackle such problems well, obtaining solutions whose convergence and diversity are both satisfactory.

Although PCA-MOEA is promising in solving large-scale optimization problems, there are still some problems that need to be solved in the years ahead. One of its drawbacks is that it consumes too many computing resources in the variable grouping stage. How to deal with decision variables in a more effective way is a valuable research direction. Another interesting direction is the parameter setting in PCA.
Declaration of competing interest

No author associated with this paper has disclosed any potential or pertinent conflicts which may be perceived to have impending conflict with this work. For full disclosure statements refer to https://doi.org/10.1016/j.asoc.2020.106120.

CRediT authorship contribution statement

Ruochen Liu: Conceptualization, Methodology, Software. Rui Ren: Software, Formal analysis, Writing - original draft. Jin Liu: Software, Formal analysis, Investigation, Writing - review & editing. Jing Liu: Data curation, Investigation.

Acknowledgments

This work was supported by the National Natural Science Foundation of China (Nos. 61876141 and 61373111), and the Provincial Natural Science Foundation of Shaanxi of China (No. 2019JZ-26).

References

[1] K. Deb, A. Pratap, S. Agarwal, T. Meyarivan, A fast and elitist multiobjective genetic algorithm: NSGA-II, IEEE Trans. Evol. Comput. 6 (2) (2002) 182–197.
[2] Q. Zhang, H. Li, MOEA/D: A multiobjective evolutionary algorithm based on decomposition, IEEE Trans. Evol. Comput. 11 (6) (2007) 712–731.
[3] K. Deb, H. Jain, An evolutionary many-objective optimization algorithm using reference-point-based nondominated sorting approach, part I: Solving problems with box constraints, IEEE Trans. Evol. Comput. 18 (4) (2014) 577–601.
[4] C. Liu, J. Liu, Z. Jiang, A multiobjective evolutionary algorithm based on similarity for community detection from signed social networks, IEEE Trans. Cybern. 44 (12) (2014) 2274–2287.
[5] H. Zhu, Y. Shi, Brain storm optimization algorithm for full area coverage of wireless sensor networks, in: 2016 Eighth International Conference on Advanced Computational Intelligence (ICACI), 2016, pp. 14–20.
[6] Y. Guo, D. Liu, M. Chen, Y. Liu, An energy-efficient coverage optimization method for the wireless sensor networks based on multi-objective quantum-inspired cultural algorithm, in: Advances in Neural Networks – ISNN 2013, 2013, pp. 343–349.
[7] Y. Guo, D. Liu, Multi-population cooperative particle swarm cultural algorithms, in: 2011 Seventh International Conference on Natural Computation, Vol. 3, 2011, pp. 1351–1355.
[8] Y. Guo, P. Zhang, J. Cheng, C. Wang, D. Gong, Interval multi-objective quantum-inspired cultural algorithms, Neural Comput. Appl. 30 (3) (2018) 709–722.
[9] L.M. Antonio, C.A.C. Coello, Use of cooperative coevolution for solving large scale multiobjective optimization problems, in: 2013 IEEE Congress on Evolutionary Computation, 2013, pp. 2758–2765.
[10] M. Guo, C. Wang, Multi-objective quantum cultural algorithm and its application in the wireless sensor networks’ energy-efficient coverage optimization, in: Intelligent Data Engineering and Automated Learning – IDEAL 2013, 2013, pp. 161–167.
[11] R. Scheffermann, M. Bender, A. Cardeneo, Robust solutions for vehicle routing problems via evolutionary multiobjective optimization, in: 2009 IEEE Congress on Evolutionary Computation, 2009, pp. 1605–1612.
[12] Y. Guo, J. Cheng, S. Luo, D. Gong, Y. Xue, Robust dynamic multi-objective vehicle routing optimization method, IEEE/ACM Trans. Comput. Biol. Bioinform. 15 (6) (2018) 1891–1903.
[13] M.A. Potter, K.A.D. Jong, A cooperative coevolutionary approach to function optimization, in: Parallel Problem Solving from Nature – PPSN III, Lecture Notes in Comput. Sci. 866 (1994) 249–257.
[14] X. Li, Y. Mei, X. Yao, M.N. Omidvar, Cooperative co-evolution with differential grouping for large scale optimization, IEEE Trans. Evol. Comput. 18 (3) (2014) 378–393.
[15] W. Chen, T. Weise, Z. Yang, K. Tang, Large-scale global optimization using cooperative coevolution with variable interaction learning, in: International Conference on Parallel Problem Solving from Nature, 2010, pp. 300–309.
[16] Y. Sun, M. Kirley, S.K. Halgamuge, Extended differential grouping for large scale global optimization with direct and indirect variable interactions, in: Proceedings of the 2015 Annual Conference on Genetic and Evolutionary Computation (GECCO ’15), 2015, pp. 313–320.
[17] M.N. Omidvar, X. Li, X. Yao, Cooperative co-evolution with delta grouping for large scale non-separable function optimization, in: Evolutionary Computation, 2011, pp. 1–8.
[18] Z. Yang, K. Tang, X. Yao, Multilevel cooperative coevolution for large scale optimization, in: Evolutionary Computation, 2008, pp. 1663–1670.
[19] R. Cheng, Y. Jin, A competitive swarm optimizer for large scale optimization, IEEE Trans. Cybern. 45 (2) (2015) 191–204.
[20] X. Li, X. Yao, Cooperatively coevolving particle swarms for large scale optimization, IEEE Trans. Evol. Comput. 16 (2) (2012) 210–224.
[21] K.H. Hajikolaei, Z. Pirmoradi, G.H. Cheng, G.G. Wang, Decomposition for large-scale global optimization based on quantified variable correlations uncovered by metamodelling, Eng. Optim. 47 (4) (2015) 429–452.
[22] J. Ghorpadeaher, V.A. Metre, Clustering multidimensional data with PSO based algorithm, Comput. Sci. (2014).
[23] R. Cheng, Y. Jin, A social learning particle swarm optimization algorithm for scalable optimization, Inform. Sci. 291 (2015) 43–60.
[24] W.N. Chen, J. Zhang, H.S.H. Chung, W.L. Zhong, W.G. Wu, Y.H. Shi, A novel set-based particle swarm optimization method for discrete optimization problems, IEEE Trans. Evol. Comput. 14 (2) (2010) 278–300.
[25] W.N. Chen, J. Zhang, Y. Lin, N. Chen, Z.H. Zhan, S.H. Chung, Y. Li, Y.H. Shi, Particle swarm optimization with an aging leader and challengers, IEEE Trans. Evol. Comput. 17 (2) (2013) 241–258.
[26] W.J. Yu, M. Shen, W.N. Chen, Z.H. Zhan, Y.J. Gong, Y. Lin, O. Liu, J. Zhang, Differential evolution with two-level parameter adaptation, IEEE Trans. Cybern. 44 (7) (2014) 1080–1099.
[27] K.P. Wong, Z.Y. Dong, Differential evolution, an alternative approach to evolutionary algorithm, in: International Conference on Intelligent Systems Application to Power Systems, 2005.
[28] X. Zhang, Y. Tian, R. Cheng, Y. Jin, A decision variable clustering based evolutionary algorithm for large-scale many-objective optimization, IEEE Trans. Evol. Comput. 22 (1) (2018) 97–112.
[29] X. Ma, F. Liu, Y. Qi, X. Wang, L. Li, L. Jiao, M. Yin, M. Gong, A multiobjective evolutionary algorithm based on decision variable analyses for multiobjective optimization problems with large-scale variables, IEEE Trans. Evol. Comput. 20 (2) (2016) 275–298.
[30] H. Wang, L. Jiao, R. Shang, S. He, F. Liu, A memetic optimization strategy based on dimension reduction in decision space, Evol. Comput. 23 (1) (2015) 69–100.
[31] H. Wang, Y. Jin, Efficient nonlinear correlation detection for decomposed search in evolutionary multi-objective optimization, in: 2017 IEEE Congress on Evolutionary Computation (CEC), 2017, pp. 649–656.
[32] J. Shlens, A Tutorial on Principal Component Analysis, Systems Neurobiology Laboratory, Salk Institute for Biological Studies, 2005.
[33] Y. Qi, X. Ma, F. Liu, L. Jiao, J. Sun, J. Wu, MOEA/D with adaptive weight adjustment, Evol. Comput. 22 (2) (2014) 231–264.
[34] Z. He, G.G. Yen, Diversity improvement in decomposition-based multi-objective evolutionary algorithm for many-objective optimization problems, in: IEEE International Conference on Systems, Man and Cybernetics, 2014, pp. 2409–2414.
[35] H.L. Liu, F. Gu, Q. Zhang, Decomposition of a multiobjective optimization problem into a number of simple multiobjective subproblems, IEEE Trans. Evol. Comput. 18 (3) (2014) 450–455.
[36] E. Zitzler, S. Künzli, Indicator-based selection in multiobjective search, Lecture Notes in Comput. Sci. 3242 (2004) 832–842.
[37] N. Beume, M. Emmerich, SMS-EMOA: Multiobjective selection based on dominated hypervolume, European J. Oper. Res. 181 (3) (2007) 1653–1669.
[38] C.A.R. Villalobos, C.A.C. Coello, A new multi-objective evolutionary algorithm based on a performance assessment indicator, 20 (1) (2012) 16–37.
[39] J.J. Durillo, A.J. Nebro, C.A.C. Coello, F. Luna, E. Alba, A comparative study of the effect of parameter scalability in multi-objective metaheuristics, in: Evolutionary Computation, 2008, pp. 1893–1900.
[40] G.R. Zavala, A.J. Nebro, F. Luna, C.A.C. Coello, A survey of multi-objective metaheuristics applied to structural optimization, Struct. Multidiscip. Optim. 49 (4) (2013) 537–558.
[41] D.A. Wolfe, M. Hollander, Nonparametric statistical methods, in: Biostatistics and Microbiology: A Survival Manual, 2009.
[42] K. Pearson, LIII. On lines and planes of closest fit to systems of points in space, Phil. Mag. 2 (11) (1901) 559–572.
[43] R. Frisch, Correlation and Scatter in Statistical Variables, Sosialøkonomisk Institutt, Universitetet i Oslo, 1951.
[44] K.T. Fang, D.K.J. Lin, Uniform experimental designs and their applications in industry, Handbook of Statist. 22 (2003) 131–170.
[45] R. Bro, E. Acar, T.G. Kolda, Resolving the sign ambiguity in the singular value decomposition, J. Chemom. 22 (2) (2008) 135–140.
[46] R. Storn, K. Price, Differential evolution – a simple and efficient heuristic for global optimization over continuous spaces, J. Global Optim. 11 (4) (1997) 341–359.
[47] K. Deb, L. Thiele, M. Laumanns, E. Zitzler, Scalable multi-objective optimization test problems, in: Proceedings of the 2002 Congress on Evolutionary Computation (CEC ’02), 2002, pp. 825–830.
[48] E. Zitzler, K. Deb, L. Thiele, Comparison of multiobjective evolutionary algorithms: Empirical results, Evol. Comput. 8 (2) (2000) 173–195.
[49] Q. Zhang, A. Zhou, S. Zhao, P.N. Suganthan, W. Liu, S. Tiwari, Multiobjective Optimization Test Instances for the CEC 2009 Special Session and Competition, Technical Report 264, University of Essex, Colchester, UK and Nanyang Technological University, Singapore, 2008.
[50] S. Huband, P. Hingston, L. Barone, R.L. While, A review of multiobjective test problems and a scalable test problem toolkit, IEEE Trans. Evol. Comput. 10 (5) (2006) 477–506.
[51] Q. Zhang, W. Liu, H. Li, The performance of a new version of MOEA/D, in: 2009 IEEE Congress on Evolutionary Computation (CEC ’09), 2009, pp. 203–208.
[52] H. Li, Q. Zhang, Multiobjective optimization problems with complicated Pareto sets, MOEA/D and NSGA-II, IEEE Trans. Evol. Comput. 13 (2) (2009) 284–302.
[53] Y. Tian, R. Cheng, X. Zhang, Y. Jin, PlatEMO: A MATLAB platform for evolutionary multi-objective optimization [educational forum], IEEE Comput. Intell. Mag. 12 (4) (2017) 73–87.
[54] A. Song, Q. Yang, W.-N. Chen, J. Zhang, A random-based dynamic grouping strategy for large scale multi-objective optimization, in: 2016 IEEE Congress on Evolutionary Computation (CEC), 2016, pp. 468–475.
[55] R. Tran, J. Wu, C. Denison, T. Ackling, M. Wagner, F. Neumann, Fast and effective multi-objective optimisation of wind turbine placement, in: Proceedings of the 15th Annual Conference on Genetic and Evolutionary Computation, ACM, 2013, pp. 1381–1388, http://dx.doi.org/10.1145/2463372.2463541.