Monkey Algorithm for Global Numerical Optimization
R. Zhao∗ and W. Tang
Tianjin University
∗ Corresponding author. Email: zhao@tju.edu.cn (R. Zhao).
Abstract
In this paper, the monkey algorithm (MA) is designed to solve global numerical optimization problems with continuous variables. The algorithm mainly consists of the climb process, the watch-jump process, and the somersault process, in which the climb process is employed to search for locally optimal solutions, the watch-jump process to look for other points whose objective values exceed those of the current solutions so as to accelerate the monkeys' search, and the somersault process to make the monkeys transfer rapidly to new search domains. The proposed algorithm is applied to solve benchmark problems of global optimization with 30, 1000, or even 10000 dimensions. The computational results show that the MA can find optimal or near-optimal solutions to problems with large dimensions and very large numbers of local optima.
© 2008 World Academic Press, UK. All rights reserved.
Keywords: monkey algorithm, evolution algorithm, genetic algorithm, gradient algorithm, multivariate
optimization
1 Introduction
In almost all real-world optimization problems, it is necessary to use a mathematical algorithm that iteratively seeks out the solutions, because an analytical solution is rarely available [16]. Therefore, many evolutionary algorithms such as the genetic algorithm [7], the ant algorithm [4, 5], and particle swarm optimization [8] have been developed and have received a great deal of attention in the literature (for surveys on evolution strategies, see Back et al. [1] and Beyer and Schwefel [2]). These algorithms have been successfully applied in various optimization areas. However, one of their disadvantages is that their ability to solve optimization problems deteriorates markedly as the dimension of the problem grows.
The aim of this paper is to design a new method called monkey algorithm (MA) to solve the optimization
of multivariate systems. The method derives from the simulation of mountain-climbing processes of monkeys.
Assume that there are many mountains in a given field (i.e., in the feasible space of the optimization problem). In order to find the highest mountaintop (i.e., the maximal value of the objective function), the monkeys climb up from their respective positions (this action is called the climb process). When a monkey gets to the top of its mountain, it is natural for it to have a look around and find out whether there are other mountains nearby that are higher than its present position. If so, it will jump from its current position to some point on the higher mountain it has watched (this action is called the watch-jump process) and then repeat the climb process until it reaches the top of that mountain. After repetitions of the climb process and the watch-jump process, each monkey will find a locally maximal mountaintop near its initial point. In order to find a much higher mountaintop, it is natural for each monkey to somersault to a new search domain (this action is called the somersault process).
After many repetitious iterations of the climb process, the watch-jump process, and the somersault process,
the highest mountaintop found by the monkeys will be reported as an optimal value.
In optimization problems, it is well known that the number of local optima increases exponentially with the dimension of the decision vector if the objective function is multimodal. On the one hand, an algorithm may be trapped in the local optima of an optimization problem with large dimensions [11]. On the other hand, much CPU time will be expended because of the mass of computation involved. In the MA, the purpose of the somersault process is to make the monkeys find new search domains, and this action effectively prevents the search from being trapped in local optima. In addition, the time consumed by the MA mainly lies in using the climb process to search for locally optimal solutions. The essential feature of this process is the calculation of the pseudo-gradient of the objective function, which requires only two measurements of the objective function regardless of the dimension of
the optimization problem. This feature allows for a significant decrease in the cost of optimization, especially in optimization problems with large dimensions [16].
The paper is organized as follows. Section 2 describes the definition of optimization problems. The MA
for the global numerical optimization is described in Section 3. In Section 4, the proposed method is applied
to seek global optimal solutions to the benchmark problems with dimensions of 30, 1000 or even 10000.
2 Optimization Model
The mathematical representation of most optimization problems is the maximization (or minimization) of
some objective functions with respect to a decision vector. Without loss of generality, this paper will discuss
optimization in the context of maximization because a minimization problem can be trivially converted to a
maximization one by changing the sign of the objective function. The general form of the global optimization
model may be written as follows,
    max  f(x)
    s.t.  gj(x) ≤ 0,  j = 1, 2, · · · , p,                    (1)
where x = (x1, x2, · · · , xn) is the decision vector in R^n, f : R^n → R is the objective function, and gj : R^n → R, j = 1, 2, · · · , p, are the constraint functions.
Example 1 The following optimization model was provided by Koziel and Michalewicz [10],
    max  f(x1, x2) = sin^3(2πx1) sin(2πx2) / [x1^3 (x1 + x2)]
    s.t.  x1^2 − x2 + 1 ≤ 0
          1 − x1 + (x2 − 4)^2 ≤ 0                             (2)
          0 ≤ xi ≤ 10,  i = 1, 2.
Figure 1 depicts geometrically the contour map of the objective function f (x1 , x2 ) in [0, 10] × [0, 10] which
indicates that f (x1 , x2 ) is a multimodal function.
Figure 1: The objective function f(x1, x2) on [0, 10] × [0, 10]
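For reference in the sketches that follow, the objective and constraint functions of Example 1 might be coded in C as follows. This is only an illustrative sketch: the names f, g1, g2 and feasible are ours, and the unused dimension argument is kept only so that the signatures match the later sketches.

#include <math.h>

/* Objective of Example 1: f(x1, x2) = sin^3(2*pi*x1) * sin(2*pi*x2) / (x1^3 * (x1 + x2)). */
double f(const double *x, int n) {
    const double PI = 3.14159265358979323846;
    double s1 = sin(2.0 * PI * x[0]);
    double s2 = sin(2.0 * PI * x[1]);
    (void)n;                                   /* dimension is fixed at 2 for this example */
    return s1 * s1 * s1 * s2 / (x[0] * x[0] * x[0] * (x[0] + x[1]));
}

/* Constraint functions of model (2); a point is feasible when both are <= 0
   and 0 <= x1, x2 <= 10. */
double g1(const double *x) { return x[0] * x[0] - x[1] + 1.0; }
double g2(const double *x) { return 1.0 - x[0] + (x[1] - 4.0) * (x[1] - 4.0); }

int feasible(const double *x, int n) {
    (void)n;
    return g1(x) <= 0.0 && g2(x) <= 0.0
        && x[0] >= 0.0 && x[0] <= 10.0 && x[1] >= 0.0 && x[1] <= 10.0;
}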
3 Monkey Algorithm
In this section, an MA will be designed to solve the above-mentioned optimization problem. The main components of the algorithm (the representation of solutions, initialization, the climb process, the watch-jump process, and the somersault process) are presented in turn, as follows.
3.2 Initialization
It is necessary to initialize a position for each monkey. Here we assume that a region which contains the
potential optimal solutions can be determined in advance. Usually, this region will be designed to have a
nice shape, for example, an n-dimensional hypercube, because the computer can easily sample points from a
hypercube. Then a point is generated randomly from the hypercube. The point is taken as a monkey's position provided that it is feasible; otherwise, we re-sample points from the hypercube until a feasible point is found. Repeating the process M times, we obtain M feasible points xi = (xi1, xi2, · · · , xin) which will be employed to represent the initial positions of monkeys i, i = 1, 2, · · · , M, respectively. For example, in Example 1, we can first take the 2-dimensional hypercube [0, 10] × [0, 10] and use the following subfunction to initialize the M positions:
#include <stdlib.h>

/* Sample each monkey's initial position uniformly from [0, 10] x [0, 10] and
   re-sample until it satisfies the constraints of model (2). */
for (i = 0; i < M; i++) {
    do {
        for (j = 0; j < 2; j++)
            x[i][j] = 10.0 * rand() / RAND_MAX;                        /* uniform on [0, 10] */
    } while (x[i][0] * x[i][0] - x[i][1] + 1.0 > 0.0                   /* x1^2 - x2 + 1 <= 0 */
          || 1.0 - x[i][0] + (x[i][1] - 4.0) * (x[i][1] - 4.0) > 0.0); /* 1 - x1 + (x2 - 4)^2 <= 0 */
}
where rand(), declared in <stdlib.h> of the C standard library, returns an integer between 0 and RAND_MAX, so that 10.0 * rand()/RAND_MAX produces a real number between 0 and 10.
Figure 2: Climb process with a = 0.001
Figure 3: Climb process with a = 0.00001
3.3 Climb process

The climb process is employed to improve each monkey's position step by step. For monkey i with position xi = (xi1, xi2, · · · , xin), i = 1, 2, · · · , M, the process is as follows.

1). Randomly generate a vector ∆xi = (∆xi1, ∆xi2, · · · , ∆xin), where ∆xij = a or −a, each with probability 1/2, j = 1, 2, · · · , n, respectively. The parameter a (a > 0), called the step length of the climb process, can be determined by specific situations. For example, in Example 1, we can take a = 0.00001.

2). Calculate

    f'ij(xi) = [f(xi + ∆xi) − f(xi − ∆xi)] / (2∆xij),

j = 1, 2, · · · , n, respectively. The vector f'i(xi) = (f'i1(xi), f'i2(xi), · · · , f'in(xi)) is called the pseudo-gradient of the objective function f(·) at the point xi.

3). Set yj = xij + a · sign(f'ij(xi)), j = 1, 2, · · · , n, respectively, and let y = (y1, y2, · · · , yn).

4). Update the position of monkey i with y provided that y is feasible; otherwise, keep xi unchanged.

5). Repeat steps 1) to 4) until the maximal number of allowed iterations (called the climb number, denoted by Nc) is reached.
Remark 1 The step length a plays a crucial role in the precision of the approximation of the local solution
in the climb process. The curve in Figure 2 depicts a segment of the climb track of a monkey with the step
length a = 0.001 while the curve in Figure 3 depicts that with a = 0.00001 in solving the model (2) in Example
1. From the figures, we can see that the spread of the values of the objective function reduces markedly with
the decrease of the step length a. Usually, the smaller the parameter a is, the more precise the solutions are. Alternatively, one can consider using a positive sequence {ak} which decreases to zero asymptotically as k → +∞ instead of the constant a in step 1). In addition, using standard SPSA (see Spall [15, 16]) for the climb process is also a feasible idea.
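As a concrete illustration of one iteration of steps 1) to 4), a minimal C sketch might look as follows. The helper names (f, feasible) and the function-pointer signatures are assumptions of this sketch, not part of the paper; note that the pseudo-gradient needs only two evaluations of f, whatever the dimension n.

#include <stdlib.h>

/* One climb iteration for a monkey at position x[0..n-1] with step length a.
   f is the objective and feasible the constraint check (assumed helpers). */
void climb_step(double x[], int n, double a,
                double (*f)(const double *, int),
                int (*feasible)(const double *, int)) {
    double *dx = malloc(n * sizeof(double));
    double *xp = malloc(n * sizeof(double));
    double *xm = malloc(n * sizeof(double));
    double *y  = malloc(n * sizeof(double));
    int j;

    for (j = 0; j < n; j++) {                        /* step 1): dx_j = +a or -a, prob. 1/2 each */
        dx[j] = (rand() % 2 == 0) ? a : -a;
        xp[j] = x[j] + dx[j];
        xm[j] = x[j] - dx[j];
    }
    double diff = f(xp, n) - f(xm, n);               /* only two evaluations of f */
    for (j = 0; j < n; j++) {                        /* steps 2)-3): one step along the sign of */
        double grad = diff / (2.0 * dx[j]);          /* each pseudo-gradient component          */
        y[j] = x[j] + a * (grad > 0.0 ? 1.0 : grad < 0.0 ? -1.0 : 0.0);
    }
    if (feasible(y, n))                              /* step 4): accept y only if it is feasible */
        for (j = 0; j < n; j++) x[j] = y[j];

    free(dx); free(xp); free(xm); free(y);
}

Repeating climb_step Nc times yields the climb process described in step 5).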
3.4 Watch-jump process

Remark 2 The eyesight b of a monkey, which governs how far it can watch in the watch-jump process, can be determined by specific situations. For example, we may take b = 0.5 in solving the model (2) in Example 1. Usually, the larger the feasible space of the optimization problem is, the larger the value of b should be taken.
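The watch-jump steps themselves do not appear above; based on the description in the Abstract and the Introduction (a monkey watches within its eyesight b and jumps to a point with a better objective value), one plausible sketch is the following. The acceptance rule and the helper names are assumptions of this sketch, not necessarily the authors' exact procedure.

#include <stdlib.h>

/* A possible watch-jump step: look at a random point y within eyesight b of the
   current position x and jump there if it is feasible and not worse (assumed rule). */
void watch_jump_step(double x[], int n, double b,
                     double (*f)(const double *, int),
                     int (*feasible)(const double *, int)) {
    double *y = malloc(n * sizeof(double));
    int j;
    for (j = 0; j < n; j++)                              /* uniform point in (x_j - b, x_j + b) */
        y[j] = x[j] - b + 2.0 * b * rand() / RAND_MAX;
    if (feasible(y, n) && f(y, n) >= f(x, n))            /* jump only to a better feasible point */
        for (j = 0; j < n; j++) x[j] = y[j];
    free(y);
}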
3.5 Somersault process

1). Randomly generate a real number α from the interval [c, d] (called the somersault interval), where the somersault interval [c, d] can be determined by specific situations.

2). Set

    yj = xij + α(pj − xij),                                   (3)

where pj = (x1j + x2j + · · · + xMj)/M, j = 1, 2, · · · , n, respectively. The point p = (p1, p2, · · · , pn) is called the somersault pivot.

3). Set xi = y if y = (y1, y2, · · · , yn) is feasible. Otherwise, repeat steps 1) and 2) until a feasible solution y is found.
Remark 3 The somersault interval [c, d] in the somersault process governs the maximum distance that mon-
keys can somersault. For example, we may take [c, d] = [−1, 1] in Example 1. If we hope to enlarge the search
space of monkeys for those problems that have large feasible spaces, we can increase the values of |c| and d,
respectively. This alteration sometimes can accelerate the convergence behavior of the MA.
Remark 4 In step 2), if α ≥ 0, monkey i somersaults in the direction from its current position toward the somersault pivot; otherwise, it somersaults in the direction from the somersault pivot toward its current position.
Remark 5 The choice of the somersault pivot is not unique. For example, we can set p'j = (x1j + x2j + · · · + xMj − xij)/(M − 1), and further replace equation (3) with

    yj = p'j + α(p'j − xij)                                    (4)

or

    yj = xij + α|p'j − xij|,                                   (5)

j = 1, 2, · · · , n, respectively.
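A C sketch of one somersault step using equation (3) with the standard pivot might look as follows. The population layout pos[i][j] and the helper feasible are assumptions of this sketch.

#include <stdlib.h>

/* Somersault step for monkey i: move toward (or away from) the barycentre of all
   M monkeys by a random factor alpha drawn from the somersault interval [c, d]. */
void somersault_step(double **pos, int M, int i, int n, double c, double d,
                     int (*feasible)(const double *, int)) {
    double *p = malloc(n * sizeof(double));
    double *y = malloc(n * sizeof(double));
    int j, k;

    for (j = 0; j < n; j++) {                          /* somersault pivot p_j = average of x_ij */
        p[j] = 0.0;
        for (k = 0; k < M; k++) p[j] += pos[k][j];
        p[j] /= M;
    }
    do {                                               /* steps 1)-2): resample until y is feasible */
        double alpha = c + (d - c) * rand() / RAND_MAX;
        for (j = 0; j < n; j++)
            y[j] = pos[i][j] + alpha * (p[j] - pos[i][j]);     /* equation (3) */
    } while (!feasible(y, n));
    for (j = 0; j < n; j++) pos[i][j] = y[j];          /* step 3): accept the feasible point */

    free(p); free(y);
}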
3.6 Termination
Following the climb process, the watch-jump process, and the somersault process, all monkeys are ready for
their next actions. The MA will terminate after a given number (called the cyclic number, denoted by N ) of
cyclic repetitions of the above steps. We now give a flow chart for the MA in Figure 4.
Since the best position does not necessarily appear in the last iteration, the best position found so far should be kept from the beginning. If the monkeys find a better position in a new iteration, the old one is replaced by it. This position is reported as the optimal solution at the end of the iterations.
The experimental results found by 20 runs of the MA for solving Model (2) are listed in Table 1. From
the table, we can see that the expected value of the results is 0.095824 which is very close to the result
fmax = 0.095825 obtained by Koziel and Michalewicz [10].
Figure 4: The flow chart of the MA (Start → Initialization → Climb process → Watch-jump process → Climb process → Somersault process → if the stopping criterion is not met, return to the climb process; otherwise, display the optimal solution and the objective value → End)
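Putting the pieces together, the cycle in Figure 4 could be driven by a loop of the following form. This is only a skeleton under the same assumptions as the earlier sketches (climb_step, watch_jump_step, somersault_step and the function-pointer signatures are ours); Nc and N denote the climb number and the cyclic number, and the best position found so far is tracked as described above.

/* Skeleton of the MA: N cycles of climb, watch-jump, climb and somersault over a
   population of M monkeys, keeping the best position and objective value seen so far. */
void monkey_algorithm(double **pos, int M, int n, int N, int Nc,
                      double a, double b, double c, double d,
                      double (*f)(const double *, int),
                      int (*feasible)(const double *, int),
                      double *best_x, double *best_f) {
    int cycle, i, t, j;
    *best_f = f(pos[0], n);
    for (j = 0; j < n; j++) best_x[j] = pos[0][j];

    for (cycle = 0; cycle < N; cycle++) {
        for (i = 0; i < M; i++) {
            for (t = 0; t < Nc; t++) climb_step(pos[i], n, a, f, feasible);
            watch_jump_step(pos[i], n, b, f, feasible);
            for (t = 0; t < Nc; t++) climb_step(pos[i], n, a, f, feasible);
            if (f(pos[i], n) > *best_f) {              /* keep the best position found so far */
                *best_f = f(pos[i], n);
                for (j = 0; j < n; j++) best_x[j] = pos[i][j];
            }
        }
        for (i = 0; i < M; i++)                        /* all monkeys somersault to new domains */
            somersault_step(pos, M, i, n, c, d, feasible);
    }
}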
4 Numerical Examples
In this section, we use the benchmark functions (see Table 2) to test the effectiveness of the MA. It is well known that some of these benchmark functions have so many local minima that they are challenging enough for performance evaluation. These benchmark functions were tested widely by Chellapilla [3], He et al. [6], Tsai et al. [17], Tu and Lu [18], and Yao et al. [20] (for more details on benchmark functions, see http://www.ise.org.cn/∼benchmark).
The convergence behavior of the MA is governed by the selections of the parameters in the climb process,
watch-jump process, and somersault process. These parameters used in our experiments are listed in Table 3.
The results by 20 runs of the MA for the test functions are shown in Table 4. According to the table, for
the functions f2 (x)—f5 (x) and f7 (x)—f12 (x), the MA can find optimal or near-optimal solutions with much
faster convergence rates.
For f1(x), all the search processes of the MA in the 20 executions were trapped by poor local minima and then stagnated. One of the reasons is that the feasible space of f1(x) is too large while the somersault interval [c, d] taken is too small. If we hope to enlarge the search space of the monkeys for problems that have large feasible spaces, we can increase the values of |c| and d, respectively. For example, we set [c, d] = [−10, 30] instead of [c, d] = [−1, 1] for the test function f1(x). This alteration can often help the MA avoid getting stuck in local search. Another reason is that the step length a and the climb number Nc are too small, so that sometimes the monkeys cannot reach their mountaintops in the climb process before the maximal climb number is reached. One choice is to increase the value of a or Nc. Increasing the value of a may reduce the precision of the solution, while increasing the climb number Nc will result in more CPU time being spent. Here, we replace a = 0.001 with a = 0.1 and keep the value of Nc unchanged.
The revised parameters, the expected value and the variance of the results obtained by 20 runs of the MA
with the revised parameters are provided in Table 5. From the table, we can see that the expected value of
the results reported by the MA is −12569.1378 which is very close to the maximal value fmax = −12569.18
of f1(x).

Table 4: The results obtained by 20 runs of the MA for the test functions

    Test functions     The mean value    The variance
    f1(x)              −7126.7787        526102.4096
    f2(x)              0.0055            7.39459E-08
    f3(x)              0.0027            4.05818E-08
    f4(x)              0.0001            2.15E-11
    f5(x)              0.0000            5.15789E-13
    f6(x)              —                 —
    f7(x)              0.0103            0.001627665
    f8(x)              0.0157            3.67884E-06
    f9(x)              0.1852            0.001201088
    f10(x)             0.0594            7.53108E-06
    f11(x)             0.0004            2.07521E-07
    f12(x)             0.0004            4.80708E-07

Table 5: The revised parameters and the results for f1(x)

    The revised parameters       M = 5, a = 0.1, Nc = 2000, b = 0.5, [c, d] = [−10, 30], N = 60
    The expected value of f1     −12569.1378
    The variance Var(f1)         0.0007
The only exception is the test function f6(x), for which the MA does not fully converge when the maximum cyclic number is reached. The reason is that the large stochastic perturbations in the objective function jumble up the pseudo-gradient of the objective function, resulting in a disordered climbing direction.
In fact, it is better to set different parameters for different test functions. The step length a in the climb process is a crucial parameter for the precision of the solutions reported by the MA. If an objective function (for example, f2, f3, or f4) is so sharp that slight changes in the decision variables result in tremendous changes in the objective value, the parameter a should be set to a small value. For instance, we set a = 10^−6 for f2, f3, and f4 in our experiments instead of the a = 0.001 used in the previous experiments.
For the eyesight b, we may follow the rule that the value of b increases with the size of the feasible space. For example, we set b = 1 in our experiments for the test functions f2, f3, f5, f7, f9, and f10, since their feasible spaces are very small, while we set b = 10 for f1, f4, f8, f11, and f12, since they have large feasible spaces.
The somersault interval [c, d] in the somersault process governs the maximum distance that monkeys can somersault. Usually, we may take [c, d] = [−1, 1]. If we hope to enlarge the search space of the monkeys for problems that have large feasible spaces, we can increase the values of |c| and d. For example, we set [c, d] = [−10, 30] for f1. This alteration can sometimes accelerate the convergence of the MA.
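Since different test functions use different settings of M, Nc, N, a, b and [c, d], it can be convenient to group them in a small structure, as in the following illustrative sketch (the struct and field names are ours; the values shown are the revised ones for f1 from Table 5).

/* Per-problem MA parameters (illustrative). */
struct ma_params {
    int M;               /* population size (number of monkeys) */
    int Nc;              /* climb number                        */
    int N;               /* cyclic number                       */
    double a;            /* step length of the climb process    */
    double b;            /* eyesight of the watch-jump process  */
    double c, d;         /* somersault interval [c, d]          */
};

struct ma_params params_f1 = { 5, 2000, 60, 0.1, 0.5, -10.0, 30.0 };   /* Table 5 values for f1 */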
The revised parameters and the results obtained by the MA for all the test functions are listed in Table 6, while Figures 5–15 depict the convergence processes of the MA. From Table 6 and Figures 5–15, we can see that the MA finds the optimal values of the test functions very well.
We will end this section by considering the cases in which the problem dimensions of f2–f5 are set to 1000 and 10000 instead of 30, respectively. The population size for the MA is still set to 5. The results obtained by the MA are shown in Tables 7 and 8, respectively. The tables reveal that the MA can find optimal or near-optimal solutions to these problems. This fact indicates that the population size required by the MA is almost insensitive to the dimension of the problems.
Figure 5: The convergence process for f1(x) with n = 30
Figure 6: The convergence process for f2(x) with n = 30
Figure 7: The convergence process for f3(x) with n = 30
Figure 8: The convergence process for f4(x) with n = 30
Figure 9: The convergence process for f5(x) with n = 30
Figure 10: The convergence process for f7(x) with n = 30
Figure 11: The convergence process for f8(x) with n = 30
Figure 12: The convergence process for f9(x) with n = 30
Table 6: The revised parameters and the results for the test functions

Figure 13: The convergence process for f10(x) with n = 30
Figure 14: The convergence process for f11(x) with n = 30
5 Further Applications
The designed MA can also be used to solve a variety of uncertain programming models provided in [12, 13, 14, 19], such as

    max  E[f(x, ξ)]
    subject to  E[gj(x, ξ)] ≤ 0,  j = 1, 2, . . . , p,                    (6)

where x is a decision vector, ξ is a fuzzy vector, E is the fuzzy expected value operator, f is the objective function, and gj are the constraint functions for j = 1, 2, . . . , p. For a given x, the values of E[f(x, ξ)] and E[gj(x, ξ)] can be estimated by fuzzy simulation [12], and then the MA can be used to search for the optimal solutions of the model.
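As a sketch of how the MA plugs into model (6): once a routine estimating E[f(x, ξ)] and E[gj(x, ξ)] by fuzzy simulation is available (see [12]), it simply takes the place of the deterministic objective and constraints in the earlier sketches. The routines below are assumed black boxes with hypothetical names, not implementations of fuzzy simulation.

#define P 2                                   /* number of constraints (illustrative) */

/* Assumed black boxes: fuzzy-simulation estimates of E[f(x, xi)] and E[g_j(x, xi)]. */
double fuzzy_expected_f(const double *x, int n);
double fuzzy_expected_g(int j, const double *x, int n);

/* Objective and feasibility test in the form used by the MA sketches above. */
double expected_objective(const double *x, int n) {
    return fuzzy_expected_f(x, n);
}

int expected_feasible(const double *x, int n) {
    int j;
    for (j = 0; j < P; j++)                   /* E[g_j(x, xi)] <= 0 for each constraint */
        if (fuzzy_expected_g(j, x, n) > 0.0) return 0;
    return 1;
}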
Figure 15: The convergence process for f12(x) with n = 30
Table 7: The parameters and results for test functions with dimension n = 1000
Table 8: The parameters and results for test functions with dimension n = 10000
6 Conclusions
The monkey algorithm is a population-based evolutionary algorithm inspired by the mountain-climbing behavior of monkeys. Like other population-based algorithms, it can solve a variety of difficult optimization problems featuring non-linearity, non-differentiability, and high dimensionality, and it does so with a fast convergence rate. Another advantage of the monkey algorithm is that it has only a few parameters to adjust, which makes it particularly easy to implement.
Acknowledgments
This work was partly supported by the National Natural Science Foundation of China (Grants No. 70571056 and 70471049) and the Program for New Century Excellent Talents in University. The authors are grateful to Prof. Ningzhong Shi, Prof. Baoding Liu, and the members of the project team for helpful discussions, but take full responsibility for the views expressed in this paper.
References
[1] Back, T., F. Hoffmeister, and H. Schwefel, A survey of evolution strategies, Proceedings of the Fourth International Conference on Genetic Algorithms, San Diego, pp.2-9, 1991.
[2] Beyer, H., and H. Schwefel, Evolution strategies - A comprehensive introduction, Natural Computing, vol.1, pp.3-52, 2002.
[3] Chellapilla, K., Combining mutation operators in evolutionary programming, IEEE Transactions on
Evolutionary Computation, vol.2, pp.91-96, 1998.
[4] Dorigo, M., Optimization, learning and natural algorithms, Ph.D. Thesis, Politecnico di Milano, Italy,
1992.
[5] Dorigo, M., V. Maniezzo, and A. Colorni, The ant system: optimization by a colony of cooperating agents, IEEE Transactions on Systems, Man, and Cybernetics, Part B, vol.26, pp.29-41, 1996.
[6] He, S., Q. Wu, et al., A particle swarm optimizer with passive congregation, BioSystems, vol.78, pp.135-
147, 2004.
[7] Holland, J.H., Adaptation in natural and artificial systems, University of Michigan Press, 1975, Extended
new Edition, MIT Press, Cambridge, 1992.
[8] Kennedy, J., and R. Eberhart, Particle swarm optimization, Proceedings of IEEE International Confer-
ence on Neural Networks, vol.4, pp.1942-1948, 1995.
[9] Kiefer, J., and J. Wolfowitz, Stochastic estimation of a regression function, Ann. Math. Stat., vol.23,
pp.462-466, 1952.
[10] Koziel, S., and Z. Michalewicz, Evolutionary algorithms, homomorphous mappings, and constrained
parameter optimization, Evolutionary Computation, vol.7, pp.19-44, 1999.
[11] Leung, Y., and Y. Wang, An orthogonal genetic algorithm with quantization for global numerical opti-
mization, IEEE Transactions on Evolutionary Computation, vol.5, pp.41-53, 2001.
[12] Liu, B., Theory and Practice of Uncertain Programming, Physica-Verlag, Heidelberg, 2002.
[13] Liu, B., A survey of entropy of fuzzy variables, Journal of Uncertain Systems, vol.1, pp.4-13, 2007.
[14] Liu, B., and Y.K. Liu, Expected value of fuzzy variable and fuzzy expected value model, IEEE Transactions on Fuzzy Systems, vol.10, no.4, pp.445-450, 2002.
[15] Spall, J., Multivariate stochastic approximation using a simultaneous perturbation gradient approxima-
tion, IEEE Transactions on Automatic Control, vol.37, pp.332-341, 1992.
[16] Spall, J., An overview of the simultaneous perturbation method for efficient optimization, Johns Hopkins
APL Technical Digest, vol.19, pp.482-492, 1998.
[17] Tsai, J., T. Liu, and J. Chou, Hybrid Taguchi-genetic algorithm for global numerical optimization, IEEE Transactions on Evolutionary Computation, vol.8, pp.365-377, 2004.
[18] Tu, Z., and Y. Lu, A robust stochastic genetic algorithm (StGA) for global numerical optimization, IEEE
Transactions on Evolutionary Computation, vol.8, pp.456-470, 2004.
[19] Wang, C., W. Tang, and R. Zhao, The continuity and convexity analysis of the expected value function
of a fuzzy mapping, Journal of Uncertain Systems, vol.1, pp.148-160, 2007.
[20] Yao, X., Y. Liu, and G. Lin, Evolutionary programming made faster, IEEE Transactions on Evolutionary Computation, vol.3, pp.82-102, 1999.