Opposition-Based Differential Evolution
Abstract: Evolutionary algorithms (EAs) are well-known optimization approaches for dealing with nonlinear and complex problems. However, these population-based algorithms are computationally expensive due to the slow nature of the evolutionary process. This paper presents a novel algorithm to accelerate differential evolution (DE). The proposed opposition-based DE (ODE) employs opposition-based learning (OBL) for population initialization and also for generation jumping. In this work, opposite numbers have been utilized to improve the convergence rate of DE. A comprehensive set of 58 complex benchmark functions covering a wide range of dimensions is employed for experimental verification. The influence of dimensionality, population size, jumping rate, and various mutation strategies is also investigated. Additionally, the contribution of opposite numbers is empirically verified. We also provide a comparison of ODE with fuzzy adaptive DE (FADE). Experimental results confirm that ODE outperforms the original DE and FADE in terms of convergence speed and solution accuracy.

Index Terms: Differential evolution (DE), evolutionary algorithms, opposition-based learning, opposite numbers, optimization.
I. INTRODUCTION
TABLE I
COMPARISON OF DE AND ODE ON FOUR DIFFERENT MUTATION STRATEGIES. THE BEST RESULT OF EACH MUTATION STRATEGY IS EMPHASIZED IN BOLDFACE AND THE BEST RESULT AMONG THE FOUR MUTATION STRATEGIES (EIGHT RESULTS) IS HIGHLIGHTED IN ITALIC FONT
TABLE II
COMPARISON OF DE AND ODE ON FOUR DIFFERENT MUTATION STRATEGIES, CONTINUED FROM TABLE I. THE BEST RESULT OF EACH MUTATION STRATEGY IS EMPHASIZED IN BOLDFACE AND THE BEST RESULT AMONG THE FOUR MUTATION STRATEGIES (EIGHT RESULTS) IS HIGHLIGHTED IN ITALIC FONT
into the classical DE. They employed fittest individual refinement, which is a crossover-based local search. Fan and Lampinen [23] introduced a new local search operation, trigonometric mutation, to obtain a better tradeoff between convergence speed and robustness. Kaelo and Ali [24] employed reinforcement learning and different schemes for generating fitter trial points.
Although the algorithm proposed in this paper also attempts to enhance DE, its methodology is completely different from all of the aforementioned works: it is a first attempt to accelerate the convergence speed of DE by utilizing the scheme of OBL. We use OBL for population initialization and for the production of new generations.
III. OPPOSITION-BASED LEARNING (OBL)
Generally speaking, evolutionary optimization methods
start with some initial solutions (initial population) and try to
improve them toward some optimal solution(s). The process
of searching terminates when some predefined criteria are
satisfied. In the absence of a priori information about the solution, we usually start with random guesses. The computation time, among other factors, is related to the distance of these initial guesses from the optimal solution. We can improve our chance of starting with a closer (fitter) solution by simultaneously checking the opposite solution. By doing this, the fitter one (guess or opposite guess) can be chosen as an initial solution. In fact, according to probability theory, 50% of the time a guess is farther from the solution than its opposite guess. Therefore, starting with the closer of the two guesses (as judged by its fitness) has the potential to accelerate convergence. The
same approach can be applied not only to initial solutions but
also continuously to each solution in the current population.
However, before concentrating on OBL, we need to define the
concept of opposite numbers [7].
Definition (Opposite Number): Let x be a real number defined on the interval [a, b]. Its opposite number, x̆, is defined by

x̆ = a + b - x.    (4)
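As a minimal illustration of this definition, applied componentwise to a box-constrained point, the following Python sketch (the helper name and example values are ours, not from the paper) computes the opposite of a candidate solution:

    import numpy as np

    def opposite(x, a, b):
        """Opposite point: each component is reflected as a_j + b_j - x_j."""
        return a + b - x

    # Example: a 3-D point in [-5, 5]^3
    a = np.full(3, -5.0)
    b = np.full(3, 5.0)
    x = np.array([1.0, -2.5, 4.0])
    print(opposite(x, a, b))  # [-1.  2.5 -4.]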
TABLE III
PSEUDOCODE FOR OPPOSITION-BASED DIFFERENTIAL EVOLUTION (ODE). P0: INITIAL POPULATION, OP0: OPPOSITE OF INITIAL POPULATION, Np: POPULATION SIZE, P: CURRENT POPULATION, OP: OPPOSITE OF CURRENT POPULATION, V: NOISE VECTOR, U: TRIAL VECTOR, D: PROBLEM DIMENSION, [a_j, b_j]: RANGE OF THE jTH VARIABLE, BFV: BEST FITNESS VALUE SO FAR, VTR: VALUE TO REACH, NFC: NUMBER OF FUNCTION CALLS, MAX_NFC: MAXIMUM NUMBER OF FUNCTION CALLS, F: MUTATION CONSTANT, rand(0, 1): UNIFORMLY GENERATED RANDOM NUMBER, Cr: CROSSOVER RATE, f(·): OBJECTIVE FUNCTION, P': POPULATION OF NEXT GENERATION, Jr: JUMPING RATE, MIN_j^p: MINIMUM VALUE OF THE jTH VARIABLE IN THE CURRENT POPULATION, MAX_j^p: MAXIMUM VALUE OF THE jTH VARIABLE IN THE CURRENT POPULATION. STEPS 1-5 AND 26-32 ARE IMPLEMENTATIONS OF OPPOSITION-BASED INITIALIZATION AND OPPOSITION-BASED GENERATION JUMPING, RESPECTIVELY
OP0_{i,j} = a_j + b_j - P0_{i,j},    i = 1, 2, ..., Np;  j = 1, 2, ..., D    (6)

where P0_{i,j} and OP0_{i,j} denote the jth variable of the ith vector of the population and the opposite population, respectively.

3) Select the Np fittest individuals from {P0 ∪ OP0} as the initial population.
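A compact Python sketch of this opposition-based initialization (function and variable names are ours; minimization is assumed) could look as follows:

    import numpy as np

    def opposition_based_init(f, a, b, Np, D, rng):
        """Evaluate a random population and its opposite population (Eq. (6)),
        then keep the Np fittest individuals of the union."""
        P0 = rng.uniform(a, b, size=(Np, D))    # random initial population
        OP0 = a + b - P0                        # opposite population
        union = np.vstack([P0, OP0])
        fitness = np.apply_along_axis(f, 1, union)
        return union[np.argsort(fitness)[:Np]]  # Np fittest of {P0 U OP0}

    # Usage example on the sphere model
    rng = np.random.default_rng(0)
    sphere = lambda x: float(np.sum(x**2))
    P = opposition_based_init(sphere, -5.0, 5.0, Np=10, D=4, rng=rng)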
B. Opposition-Based Generation Jumping
By applying a similar approach to the current population, the evolutionary process can be forced to jump to a new solution candidate, which ideally is fitter than the current one. Based on a jumping rate Jr (i.e., a jumping probability), after generating new populations by mutation, crossover, and selection, the opposite population is calculated and the Np fittest individuals are selected from the union of the current population and the opposite population. Unlike opposition-based initialization, generation jumping calculates the opposite population dynamically: instead of using the variables' predefined interval boundaries, the opposite of each variable is computed from the minimum (MIN_j^p) and maximum (MAX_j^p) values of that variable in the current population.
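Under the same assumptions as the sketch above (our naming; minimization), the generation-jumping step might be written as:

    import numpy as np

    def generation_jumping(f, P, fitness, Jr, rng):
        """With probability Jr, reflect each variable inside the population's
        current [min, max] interval and keep the Np fittest of the union."""
        if rng.random() >= Jr:
            return P, fitness                      # no jump this generation
        lo, hi = P.min(axis=0), P.max(axis=0)      # dynamic interval per variable
        OP = lo + hi - P                           # opposite of current population
        union = np.vstack([P, OP])
        union_fit = np.concatenate([fitness, np.apply_along_axis(f, 1, OP)])
        best = np.argsort(union_fit)[:len(P)]      # Np fittest individuals
        return union[best], union_fit[best]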
Parameter settings for all conducted experiments are as follows (the same settings have been used in the literature cited after each parameter):
• Population size, Np = 100 [15], [33], [34].
• Differential amplification factor, F = 0.5 [2], [4], [13], [15], [18].
• Crossover probability constant, Cr = 0.9 [2], [4], [13], [15], [18].
• Jumping rate constant, Jr = 0.3 (discussed in Section V-F).
• Mutation strategy [2]: DE/rand/1/bin (classic version of DE) [2], [3], [15], [16], [26]; a sketch of this strategy is given after this list.
• Maximum NFCs, MAX_NFC = 10^6.
• Value to reach, VTR = 10^-8 [35].
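For reference, a minimal Python sketch of one DE/rand/1/bin trial-vector construction with these settings (F = 0.5, Cr = 0.9; the function and variable names are ours) is:

    import numpy as np

    def de_rand_1_bin(P, i, rng, F=0.5, Cr=0.9):
        """Build the trial vector for individual i via DE/rand/1 mutation
        and binomial crossover."""
        Np, D = P.shape
        # three mutually distinct indices, all different from i
        r1, r2, r3 = rng.choice([k for k in range(Np) if k != i],
                                size=3, replace=False)
        V = P[r1] + F * (P[r2] - P[r3])   # noise (mutant) vector
        U = P[i].copy()
        j_rand = rng.integers(D)          # at least one component comes from V
        mask = rng.random(D) < Cr
        mask[j_rand] = True
        U[mask] = V[mask]
        return U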
In order to maintain a reliable and fair comparison, for all seven experiment series: 1) the parameter settings are the same as above for all experiments, unless new settings are mentioned for one or some of them to serve the purpose of a particular parameter study; 2) for all conducted experiments, the reported values are the averages of the results of 50 independent runs; and, most importantly, 3) the extra fitness evaluations required for the opposite points (in both the population initialization and the generation jumping phases) are counted.
Results: A comprehensive set of experiments is conducted and categorized as follows. In Section V-A, DE and ODE are compared in terms of convergence speed and robustness. The effect of problem dimensionality is investigated in Section V-B. The contribution of opposite points to the achieved acceleration results is demonstrated in Section V-C. The effect of population size is studied in Section V-D. A comparison of DE and ODE over some other mutation strategies is performed in Section V-E. A discussion of the newly added control parameter, the jumping rate, is covered in Section V-F. Finally, a comparison of ODE with FADE is given in Section V-G.
TABLE IV
COMPARISON OF DE AND ODE. D: DIMENSION, NFC: NUMBER OF FUNCTION CALLS, SR: SUCCESS RATE, AR: ACCELERATION RATE. THE LAST ROW OF THE TABLE PRESENTS THE AVERAGE SUCCESS RATE (SR_ave) AND THE AVERAGE ACCELERATION RATE (AR_ave). THE BEST RESULTS FOR EACH CASE ARE HIGHLIGHTED IN BOLDFACE
where rand(a_j, b_j) generates a uniformly distributed random number on the interval [a_j, b_j]. This change is for the initialization part, so the predefined boundaries of the variables ([a_j, b_j]) have been used to generate new random numbers. In fact, instead of generating Np random individuals, this time we generate 2Np candidate solutions.
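A sketch of this control variant's initialization (our naming; minimization assumed), which mirrors ODE's selection step but replaces opposite points with fresh uniform samples:

    import numpy as np

    def random_based_init(f, a, b, Np, D, rng):
        """RDE-style initialization: 2*Np uniform random candidates in [a, b],
        keep the Np fittest (no opposite points involved)."""
        candidates = rng.uniform(a, b, size=(2 * Np, D))
        fitness = np.apply_along_axis(f, 1, candidates)
        return candidates[np.argsort(fitness)[:Np]]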
Fig. 1. Sample graphs (best solution versus NFC) for performance comparison between DE and ODE. (a) ODE is 1.83 times faster. (b) ODE is 1.81 times faster. (c) ODE is 1.64 times faster. (d) ODE is 1.72 times faster.
which means the RDE is 13% slower than its parent algorithm.
The average SR is almost the same for all of them (0.86, 0.86,
and 0.87 for DE, ODE, and RDE, respectively).
Results analysis: Just by replacing the opposite numbers with additional random numbers, while the random numbers are generated uniformly in the variables' dynamic intervals and the rest of the proposed algorithm is kept untouched, the average AR drops from 1.44 to 0.87, which is a 57% reduction in speed. This clearly demonstrates that the achieved improvements are due to the usage of opposite points, and that the same level of improvement cannot be achieved via additional random sampling.
D. Experiment Series 4: Effect of Population Size
In order to investigate the effect of the population size, the same experiments (conducted in Section V-A for Np = 100) are repeated for Np = 50 and Np = 200. The results for Np = 50 and Np = 200 are given in Tables XII and XIII, respectively. In order to discuss the population size, the overall results of the three tables (Tables IV, XII, and XIII) are summarized in Table VII.
For Np = 50, the average SR for DE and ODE is 0.79 and 0.77, respectively (DE performs marginally better than ODE). However, DE fails to solve nine functions, while ODE fails on seven. ODE outperforms DE on 35 functions; this number is 15 for DE. The average AR is 1.05 for this case (Np = 50).
TABLE V
COMPARISON OF DE AND ODE FOR DIMENSION SIZES D/2 AND 2D FOR ALL SCALABLE FUNCTIONS OF THE TEST SUITE. AT THE BOTTOM OF THE TABLE, THE AVERAGE SUCCESS RATES AND THE AVERAGE ACCELERATION RATES FOR FUNCTIONS WITH D/2, 2D, AND FOR BOTH (OVERALL) ARE PRESENTED. THE BEST RESULTS OF NFC AND SR FOR EACH CASE ARE HIGHLIGHTED IN BOLDFACE
TABLE VI
COMPARISON OF DE, ODE, AND RDE. THE BEST RESULT FOR EACH CASE IS HIGHLIGHTED IN BOLDFACE
SP = (mean NFC of successful runs) / SR.    (12)

By this definition, two algorithms have equal performance whenever their mean NFC and SR scale proportionally; for example, an algorithm with mean NFC = 50 and SR = 0.5 and an algorithm with mean NFC = 100 and SR = 1 both obtain SP = 100.
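A one-line helper (ours) makes the metric and the example above concrete:

    def success_performance(mean_nfc_successful, success_rate):
        """Success performance (SP): lower is better; unreliability is penalized."""
        return mean_nfc_successful / success_rate

    assert success_performance(50, 0.5) == success_performance(100, 1.0) == 100.0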
Now, we repeat the experiments conducted in Section V-A for Jr ∈ (0, 0.6] with a step size of 0.1 (i.e., 50 trials per function per jumping rate value). Due to space limitations, we do not show all the results; only the obtained optimal values of the jumping rate with respect to the success performance are given in Table IX.
TABLE VII
THE SUMMARIZED RESULTS FROM TABLES IV, XII, AND XIII. n_DE AND n_ODE ARE THE NUMBERS OF FUNCTIONS FOR WHICH DE OUTPERFORMS ODE AND VICE VERSA. n_U IS THE NUMBER OF UNSOLVED FUNCTIONS BY THE ALGORITHM (SR = 0)
As seen, the optimal values for the jumping rate are distributed over the discrete interval (0, 0.6]. However, jumping rates of 0.3 and 0.6 are repeated more often than other values in this table. Higher jumping rates mostly belong to the low-dimensional functions and lower ones to the high-dimensional functions. The average value of the obtained optimal jumping rates is equal to 0.37 for our test functions.
Some sample graphs (SP versus Jr) are shown in Fig. 2 to illustrate the effect of the jumping rate on the success performance. The point specified by Jr = 0 indicates the success performance of DE; the rest of the points (Jr = 0.1, 0.2, 0.3, 0.4, 0.5, 0.6) show the success performance of ODE. As mentioned before, we can observe a sharp increase in the SP for hard functions at higher jumping rates. Also, the SP decreases for easy functions as the jumping rate increases. An almost smooth behavior is recognizable for all functions for Jr ∈ (0, 0.4] (it was observed even for many functions whose graphs are not presented here). Hence, working in this interval could be more reliable for unknown optimization problems.
Results analysis: Like DE's other control parameters, the optimal jumping rate should have a problem-oriented value. Our limited experiments suggest the range Jr ∈ (0, 0.4] for an unknown optimization problem; a first attempt can be conducted with Jr = 0.3. Furthermore, for high-dimensional problems, a smaller jumping rate is suggested.
G. Experiment Series 7: ODE Versus FADE
The primary purpose of this work is to introduce the notion
of opposition into the design and implementation of DE and
demonstrate its benefits. Many other extensions of DE, if not all,
can also be reconsidered to incorporate the opposition concept.
In this sense, ODE should be regarded as an example and not as a competitor to other DE versions. However, in order to assess the performance of ODE, a comparison with at least one other algorithm may be beneficial.
We have compared ODE with the FADE method of Liu and Lampinen [13]. They tested FADE on ten well-known benchmark functions, nine of which are in our testbed. The comparison strategy is different for this experiment: the algorithms are run 100 times, and subsequently, for equal (fixed) NFCs, the average and standard deviation of the best solutions are calculated for the purpose of comparison. The same parameter settings as in [13] have been used in the current experiment to ensure a fair comparison; the population size is likewise set as in [13].
TABLE VIII
THE SUMMARIZED RESULTS FROM TABLES I AND II. ΣNFC IS THE SUMMATION OF THE NUMBER OF FUNCTION CALLS (JUST FOR THE FUNCTIONS WHICH ALL EIGHT COMPETITORS COULD SOLVE). n_DE AND n_ODE ARE THE NUMBERS OF FUNCTIONS FOR WHICH DE OUTPERFORMS ODE AND VICE VERSA. n_U IS THE NUMBER OF UNSOLVED FUNCTIONS (SR = 0). N_best IS THE NUMBER OF FUNCTIONS FOR WHICH THE ALGORITHM OUTPERFORMS THE OTHER ALGORITHMS. AR_ave IS THE AVERAGE ACCELERATION RATE
TABLE IX
OPTIMAL JUMPING RATE J_r* FOR ALL TEST FUNCTIONS WITH RESPECT TO THE SUCCESS PERFORMANCE (SP) ON THE INTERVAL (0, 0.6] WITH STEP SIZE OF 0.1
VI. CONCLUSION

The main motivation for the current work was utilizing the notion of opposition to accelerate DE. In order to have a comparison with methods other than the original DE, ODE was also compared with FADE. The results clearly confirmed that ODE performs better than FADE in terms of convergence rate and solution accuracy on the 58 utilized benchmark functions.

Utilizing opposite numbers to accelerate an optimization method is a new concept. Further studies are still required to investigate its benefits, weaknesses, and limitations. This work can be considered a first step in this direction. The main claim is not defeating DE or any of its numerous versions, but to introduce a new notion into optimization via metaheuristics: the notion of opposition.

Possible directions for future work include the adaptive setting of the jumping rate, other possibilities for implementing ODE (e.g., opposition-based mutation strategies), and applying the same or a similar scheme to accelerate other population-based methods (e.g., GA and PSO).
APPENDIX A
LIST OF BENCHMARK FUNCTIONS

Sphere model, axis parallel hyper-ellipsoid, Rosenbrock's valley, Rastrigin's function, Griewangk's function, Beale function, Colville function, Easom function, Hartmann functions 1 and 2, Levy function, Matyas function, Zakharov function, Branin's function, Perm function, step function, Kowalik's function, quartic function (i.e., noise), Shekel's family, tripod function, De Jong's function 4 (no noise), Alpine function, Schaffer's function 6, and the pathological function.
TABLE X
COMPARISON OF DE, ODE, AND FUZZY ADAPTIVE DE (FADE). MEAN BEST AND STANDARD DEVIATION (STD DEV) OF 100 RUNS ARE REPORTED. FOR DE AND ODE, AN EQUAL NUMBER OF FUNCTION CALLS (Np × #Gen) IS USED INSTEAD OF GENERATION NUMBERS. A T-TEST IS USED TO COMPARE ODE AGAINST DE AND FADE; A MARKED ENTRY INDICATES THAT THE t-VALUE WITH 99 DEGREES OF FREEDOM IS SIGNIFICANT AT A 0.05 LEVEL OF SIGNIFICANCE BY A TWO-TAILED T-TEST. f* INDICATES THE OPTIMAL MINIMUM OF THE FUNCTION
TABLE XI
NAME OF PROBLEMS FOR f -f
Fig. 2. Graphs of success performance (SP) versus jumping rate (Jr ∈ (0, 0.6] with step size of 0.1) for some sample functions. The point at Jr = 0 shows the SP of DE; the rest of the points (0.1, 0.2, 0.3, 0.4, 0.5, 0.6) show the SP of ODE.
TABLE XII
COMPARISON OF DE AND ODE (Np = 50)
TABLE XIII
COMPARISON OF DE AND ODE (Np = 200)
APPENDIX B
COMPLEMENTARY RESULTS

The results for Np = 50 and Np = 200 are given in Tables XII and XIII, respectively.
ACKNOWLEDGMENT
The authors would like to thank Prof. X. Yao and three anonymous referees for their detailed and constructive comments that
helped us to increase the quality of this work.
REFERENCES
[13] J. Liu and J. Lampinen, "A fuzzy adaptive differential evolution algorithm," Soft Computing - A Fusion of Foundations, Methodologies and Applications, vol. 9, no. 6, pp. 448-462, 2005.
[14] R. Storn, "On the usage of differential evolution for function optimization," in Proc. Biennial Conf. North Amer. Fuzzy Inf. Process. Soc., 1996, pp. 519-523.