Genetic and Other Global Optimization Algorithms - Comparison and Use in Calibration Problems
ABSTRACT: Many issues related to water resources require the solution of optimization
problems. If the objective function is not known analytically, traditional methods are not
applicable and multi-extremum (global) optimization (GO) methods must be used. In the present
paper a brief overview of GO methods is given and nine of them are compared in terms of
effectiveness (accuracy), efficiency (the number of function evaluations needed) and reliability on
several problems including two problems of model calibration. Two algorithms - adaptive cluster
covering (ACCO) and controlled random search (CRS4) - show better performance than the
popular genetic algorithm. The global optimization tool GLOBE used to perform the experiments
can be downloaded from www.ihe.nl/hi.
1. INTRODUCTION
Many issues related to water resources require the solution of optimization problems. These
include reservoir optimization, problems of optimal allocation of resources and planning,
calibration of models, and many others. Traditionally, optimization problems were solved using
linear and non-linear optimization techniques which normally assume that the minimized
function (objective function) is known in analytical form and that it has a single minimum.
(Without loss of generality we will assume that the optimization problem is a minimization
problem.)
In practice, however, there are many problems that cannot be described analytically, and many
objective functions have multiple extrema. In these cases it is necessary to pose a multi-extremum
(global) optimization problem (GOP), to which the traditional optimization methods are not
applicable, and other solutions must be investigated. One typical GOP is that of
automatic model calibration, or parameter identification. The objective function is then the
discrepancy between the model output and the observed data, i.e. the model error, normally
measured as the weighted RMSE. One of the approaches to solving GOPs that has become popular
in recent years is the use of so-called genetic algorithms (GAs) (Goldberg 1989,
Michalewicz 1996). A considerable number of publications related to water resources are
devoted to their use (Wang 1991, Babovic et al. 1994, Cieniawski 1995, Savic & Walters 1997,
Franchini & Galeati 1997). (Evolutionary algorithms (EA) are variations of the same idea used
in GAs, but were developed by a different school; it is possible to say that EAs include GAs as a
particular case.)
Other GO algorithms are used for solving calibration problems as well (Duan et al. 1993,
Kuczera 1997), but GAs seem to be preferred. Our experience, however, shows that many
practitioners are unaware of the existence of other GO algorithms that are more efficient and
effective than GAs. This serves as a motivation for writing this paper which has the following
main objectives:
function value is taken as an approximation of the global value. If all previously chosen points
{x_1, ..., x_k} and function values {f(x_1), ..., f(x_k)} are used when choosing the next point x_(k+1),
then the algorithm is called a sequential (active) covering algorithm (and passive if there is no such
dependency). These algorithms were found to be inefficient.
The following algorithms belong to the group of random search methods.
Pure direct random search (uniform sampling). N points are drawn from a uniform
distribution in X and f is evaluated at these points; the smallest function value found is taken as the
assessment of the minimum f*. If f is continuous then there is an asymptotic guarantee of
convergence, but the number of function evaluations grows exponentially with n. An improvement
is to generate the evaluation points sequentially, taking the already known function values into
account when choosing the next point, thus producing an adaptive random search
(Pronzato et al. 1984).
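To illustrate, a minimal sketch of pure direct random search over a box-shaped domain (in
Python; the function and variable names are ours, purely illustrative):

    import random

    def pure_random_search(f, bounds, n_points):
        """Uniform sampling: evaluate f at n_points random points in the
        box given by bounds = [(a_1, b_1), ..., (a_n, b_n)]; keep the best."""
        best_x, best_f = None, float('inf')
        for _ in range(n_points):
            x = [random.uniform(a, b) for a, b in bounds]
            fx = f(x)
            if fx < best_f:
                best_x, best_f = x, fx
        return best_x, best_f

    # Example: a 2-D quadratic with minimum at the origin
    xmin, fmin = pure_random_search(lambda x: x[0]**2 + x[1]**2,
                                    [(-5.0, 5.0), (-5.0, 5.0)], 1000)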
Controlled random search (CRS) is associated with the name of W.L. Price, who proposed
several versions of an algorithm in which a new trial point in the search (parameter) space is
generated on the basis of a randomly chosen subset of previously generated points; the most widely
cited version is CRS2 (Price 1983). At each iteration, a simplex is formed from a sample and a
new trial point is generated as a reflection of one point through the centroid of the other points in
this simplex. If the new point is better than the worst point in the current set, it replaces the latter.
The ideas of CRS algorithms have been further extended by Ali and Storey 1994a,
producing CRS4 and CRS5. In CRS4, if a new best point is found, it is 'rewarded' by an
additional search around it, sampling points from the beta distribution. This method is
reportedly very efficient.
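As an illustration, a rough sketch of one CRS2-style iteration under the description above; it is
a simplification, not Price's exact formulation (e.g., the rule for choosing the reflected point
varies between versions):

    import random

    def crs2_step(population, f_values, f, bounds):
        """One simplified CRS2-style iteration. population is a list of
        points, f_values the corresponding function values."""
        n = len(bounds)
        # Form a simplex from n+1 randomly chosen points of the population
        idx = random.sample(range(len(population)), n + 1)
        simplex = [population[i] for i in idx]
        # Centroid of the first n simplex points
        centroid = [sum(p[d] for p in simplex[:n]) / n for d in range(n)]
        # New trial point: reflection of the remaining point through the centroid
        trial = [2 * centroid[d] - simplex[n][d] for d in range(n)]
        # Accept the trial point if it lies in the box and beats the worst point
        if all(a <= t <= b for t, (a, b) in zip(trial, bounds)):
            f_trial = f(trial)
            worst = max(range(len(f_values)), key=f_values.__getitem__)
            if f_trial < f_values[worst]:
                population[worst], f_values[worst] = trial, f_trial

Price recommends a population of about N = 10(n+1) points (the formula also used in the
experiments below), from which such iterations are repeated until a stopping criterion is met.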
Evolutionary strategies and genetic algorithms. The family of evolutionary algorithms is
based on the idea of modelling the search process of natural evolution, though these models are
crude simplifications of biological reality. Evolutionary algorithms (EA) are variants of
randomized search that use terminology from biology and genetics. For example, given a
random sample at each iteration, pairs of parent individuals (points), selected on the basis of their
'fitness' (function value), recombine and generate new 'offspring'. The best of these are selected for
the next generation. Offspring may also 'mutate', that is, randomly change their position in space.
The idea is that fit parents are likely to produce even fitter children. In fact, any random search
may be interpreted in terms of biological evolution: generating a random point is analogous to a
mutation, and the step made towards the minimum after a successful trial may be treated as a
selection.
Historically, evolutionary algorithms have been developed in three variations - evolution strategies
(ES), evolutionary programming (EP), and genetic algorithms (GA). Back & Schwefel 1993 give
an overview of these approaches, which differ mainly in the types of mutation, recombination
and selection operators. In GA, binary coding of the coordinates is introduced, so that an l-bit
binary variable is used to represent the integer code of one coordinate x_i, with the value ranging
from 0 to 2^l - 1, which can be mapped onto the real-valued interval [a_i, b_i]. An overall binary
string G of length nl, called a chromosome, is obtained for each point by concatenating the codings
of all coordinates. The mutation operator changes a randomly chosen bit in the string G to its
negation. The recombination (or crossover) operator is applied as follows: select two points
(parents) S and T from the population according to some rule (e.g., randomly), select a cut position
(e.g., randomly) between 1 and nl, and form either one new point S', or two new points S' and T',
by taking the left-hand side bits of the coordinate values from the first parent S, and the right-hand
side bits from the other parent T.
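A small illustrative sketch of this binary coding, one-point crossover and bit-flip mutation
(15-bit coding, as in the experiments reported below; all names are ours):

    import random

    L_BITS = 15  # bits per coordinate; codes range from 0 to 2**15 - 1 = 32767

    def decode(bits, a, b):
        """Map an l-bit string (the code of one coordinate) onto [a, b]."""
        k = int(bits, 2)  # integer code, 0 ... 2^l - 1
        return a + k * (b - a) / (2**L_BITS - 1)

    def crossover(S, T):
        """One-point crossover: cut both chromosomes at a random position,
        take left-hand bits from S and right-hand bits from T (and vice versa)."""
        c = random.randint(1, len(S) - 1)
        return S[:c] + T[c:], T[:c] + S[c:]

    def mutate(G, p=0.01):
        """Flip each bit of chromosome G to its negation with probability p."""
        return ''.join(b if random.random() > p else '10'[int(b)] for b in G)

    # Example with a one-coordinate chromosome on the interval [0, 10]
    S, T = format(12345, '015b'), format(23456, '015b')
    S1, T1 = crossover(S, T)
    x = decode(mutate(S1), 0.0, 10.0)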
There are various versions of GA, varying in the way crossover, selection and construction of
the new population are performed. In evolutionary strategies (ES), mutation of the coordinates is
performed with respect to the corresponding variances of a certain n-dimensional normal
distribution, and various versions of recombination are introduced. For applications of GAs see,
e.g., Wang 1991, Babovic et al. 1994, Cieniawski 1995, Savic & Walters 1997, Franchini &
Galeati 1997.
Multistart and clustering. The basic idea of the family of multistart methods is to apply a local
search procedure several times and then to choose an assessment of the global optimizer. One of
the popular versions of multistart used in global optimization is based on clustering, that is,
creating groups of mutually close points that hopefully correspond to relevant regions of
attraction, providing potential starting points (Törn & Žilinskas 1989). The region (area) of
attraction of a local minimum x* is the set of points in X starting from which a given local search
procedure P converges to x*.
converges to x*. For the global optimization tool GLOBE used in the present study, we developed
two multistart algorithms - Multis and M-Simplex. They are both constructed according to the
following pattern:
1. Generate a set of N random points and evaluate f at these points.
2. (Reduction) Reduce the initial set by choosing the p best points (those with the lowest f_i).
3. (Local search) Launch a local search procedure from each of the p points. The best point
reached is taken as the minimizer assessment.
In Multis, at step 3 the Powell-Brent local search (see Powell 1964, Brent 1973, Press et al.
1991) is started. In M-Simplex the downhill simplex descent of Nelder & Mead 1965 is used.
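A compact sketch of this three-step pattern, with SciPy local searches standing in for the
Powell-Brent and Nelder-Mead procedures (illustrative only; GLOBE's own implementations
differ):

    import random
    from scipy.optimize import minimize

    def multistart(f, bounds, N=50, p=5, method='Nelder-Mead'):
        """Steps 1-3 of the multistart pattern; method='Powell' would mimic
        Multis, 'Nelder-Mead' mimics M-Simplex."""
        # 1. Generate N random points (f is evaluated by the sort below)
        sample = [[random.uniform(a, b) for a, b in bounds] for _ in range(N)]
        # 2. Reduction: keep the p points with the lowest f
        sample.sort(key=f)
        # 3. Local search from each of the p best points
        results = [minimize(f, x0, method=method) for x0 in sample[:p]]
        best = min(results, key=lambda r: r.fun)
        return best.x, best.fun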
The ACCO strategy, developed by the author and covered below, also uses clustering as the first
step, but it is followed by global randomized search rather than local search.
Adaptive cluster covering (ACCO) (Solomatine 1995, 1998) is a workable combination of
generally accepted ideas of reduction, clustering and covering (Fig.1).
1. Clustering. Clustering (identification of groups of mutually close points) is used to identify
the most promising subregions of the search domain.
2. Covering. Each promising subregion is covered by points drawn at random, e.g. from a uniform
distribution. Covering is repeated multiple times and each time the subdomain is progressively
reduced in size.
3. Adaptation. Adaptive algorithms update their behaviour depending on the new
information revealed about the problem. In ACCO, adaptation is performed by shifting the
subregion of search, shrinking it, and changing the density (number of points) of each covering,
depending on the previous assessments of the global minimizer.
4. Periodic randomization. Due to the probabilistic character of point generation, any strategy
of randomized search may simply miss a promising region. In order to reduce this
danger, the initial population is re-randomized, i.e. the problem is solved several times.
Depending on the implementation of each of these principles, it is possible to generate a family
of algorithms suitable for particular situations, e.g. with non-rectangular domains (hulls),
non-uniform sampling, and various versions of cluster generation and stopping criteria.
Figure 1 shows an example of the initial sampling and iterations 1 and 2 for one of the clusters
in a two-dimensional case.
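As a rough illustration of principles 1-4 above, the following simplified loop covers a subregion
and then shifts and shrinks it around the best points found; the published ACCO treats each
cluster of the reduced initial sample in this way, and the sketch (ours) omits clustering,
re-randomization and clamping to the original domain:

    import random

    def acco_like_loop(f, bounds, N=50, p=10, iters=20, shrink=0.8):
        """Adaptive covering of a shrinking subregion (single-cluster sketch)."""
        n = len(bounds)
        lo = [a for a, b in bounds]
        hi = [b for a, b in bounds]
        best_f, best_x = float('inf'), None
        for _ in range(iters):
            # Covering: sample the current subregion uniformly
            pts = [[random.uniform(lo[d], hi[d]) for d in range(n)]
                   for _ in range(N)]
            pairs = sorted(((f(x), x) for x in pts), key=lambda t: t[0])
            if pairs[0][0] < best_f:
                best_f, best_x = pairs[0]
            # Adaptation: shift the subregion to the centre of the p best
            # points and shrink its size
            good = [x for _, x in pairs[:p]]
            for d in range(n):
                c = sum(x[d] for x in good) / p
                w = (hi[d] - lo[d]) * shrink / 2.0
                lo[d], hi[d] = c - w, c + w
        return best_x, best_f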
The ACCOL strategy is the combination of ACCO with multiple local searches:
1. ACCO phase. The ACCO strategy is used to find several regions of attraction, represented by
promising points that are close to potential minimizers (such points we will call 'potent'). The
potent set P1 is formed by taking the best point found for each cluster during the progress of
ACCO. After ACCO stops, the set P1 is reduced to P2 by keeping only several (m = 1...4) best
points which are also distant from each other, the distance in each dimension being larger than,
for example, 10% of the range of that dimension;
2. Local search (LS) phase. An accurate local search algorithm is started from each of the
potent points of P2 (multistart) to locate the minimum accurately; a version of the Powell-Brent
search is used.
Experiments have shown that, in comparison to traditional multistart, ACCOL brings
significant economy in function evaluations.
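The reduction of P1 to P2 described in step 1 can be sketched as follows (parameter names are
ours; the 10% threshold is the example value given above):

    def reduce_potent_set(P1, f_values, bounds, m=4, frac=0.10):
        """Keep at most m best points of P1 that are pairwise distant:
        differing by more than frac of the range in every dimension."""
        ranges = [b - a for a, b in bounds]
        order = sorted(range(len(P1)), key=lambda i: f_values[i])
        P2 = []
        for i in order:
            distant_from_all = all(
                all(abs(P1[i][d] - q[d]) > frac * ranges[d]
                    for d in range(len(ranges)))
                for q in P2)
            if distant_from_all:
                P2.append(P1[i])
            if len(P2) == m:
                break
        return P2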
The ACD algorithm (Solomatine 1998) is also a random search algorithm; it combines ACCO
with the downhill simplex descents (DSD) of Nelder & Mead 1965. Its basic idea is to identify
the area around a possible local optimizer by clustering, and then to apply covering and
DSD in this area. The main steps of ACD are:
- sample points (e.g., uniformly), and reduce the sample to contain only the best points;
- cluster the points, and reduce the clusters to contain only the best points;
- in each cluster, apply a limited number of steps of DSD to each point, thus moving
the points closer to an optimizer;
- if a cluster is potentially 'good', that is, contains points with low function values, cover
the proximity of its several best points by sampling more points, e.g. from a uniform or beta
distribution;
- apply local search (e.g., DSD, or some other algorithm of direct optimization) starting
from the best point in each 'good' cluster. In order to limit the number of steps, the fractional
tolerance is set to be, say, 10 times greater than the final tolerance, so that the accuracy achieved
is only moderate (see the sketch after this list);
- apply the final accurate local search (again, DSD) starting from the very best point
reached so far; the resulting point is the assessment of the global optimizer.
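The two-level tolerance scheme used in the last two steps can be sketched with SciPy's
Nelder-Mead standing in for DSD (the option names xatol/fatol are those of recent SciPy
versions; the tolerance value is an assumed example):

    from scipy.optimize import minimize

    FINAL_TOL = 1e-4  # final fractional tolerance (an assumed example value)

    def intermediate_dsd(f, x0):
        """Intermediate descent, stopped early: tolerance ~10x coarser
        than the final one, so only moderate accuracy is reached."""
        return minimize(f, x0, method='Nelder-Mead',
                        options={'xatol': 10 * FINAL_TOL,
                                 'fatol': 10 * FINAL_TOL})

    def final_dsd(f, x0):
        """Final accurate descent from the best point reached so far."""
        return minimize(f, x0, method='Nelder-Mead',
                        options={'xatol': FINAL_TOL, 'fatol': FINAL_TOL})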
The ACDL algorithm, combining ACD with multiple local searches, has been built and tested
as well.
4. GLOBAL OPTIMIZATION TOOL GLOBE
A PC-based system GLOBE incorporating 9 GO algorithms was built. GLOBE can be
configured to use an external program as a supplier of the objective function values. The number
of independent variables and the constraints imposed on their values are supplied by the user in
the form of a simple text file. Figure 2 shows how GLOBE is used in problems of automatic
calibration. The model must be an executable module (program) which does not require any user
input, and the user has to supply two transfer programs P1 and P2. These three programs (Model,
P1, P2) are activated from GLOBE in a loop. GLOBE runs in DOS protected mode (DPMI),
providing enough memory to load the program modules. A Windows version is being developed.
The user interface includes several graphical windows displaying the progress of minimization in
projections onto different coordinate planes. The parameters of the algorithms can be easily
changed by the user.
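Schematically, one pass of the calibration loop of Figure 2 could look as follows (all file and
program names are hypothetical placeholders, not GLOBE's actual conventions):

    import subprocess

    def evaluate(params):
        """One pass of the GLOBE-style loop: write the candidate parameters,
        run P1, the Model and P2, then read back the objective value."""
        # The optimizer writes the candidate parameter vector to a text file
        with open('params.txt', 'w') as fh:
            fh.write('\n'.join(str(p) for p in params))
        subprocess.run(['p1.exe'], check=True)     # P1: parameters -> model input
        subprocess.run(['model.exe'], check=True)  # Model: runs with no user input
        subprocess.run(['p2.exe'], check=True)     # P2: model output -> objective
        with open('objective.txt') as fh:
            return float(fh.read())                # e.g. the weighted RMSE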
5. EXPERIMENTS
The most comprehensive experiments with all 9 algorithms included in the GLOBE tool were set
up for the problems listed in Table 1. The size of this paper does not allow a description of all
the results; Figure 3 shows several typical examples of the progress of minimization (averaged
over 5 runs), including those for two hydrological conceptual rainfall-runoff models (the
Sugawara-type tank model SIRT, see Solomatine 1995b, and the distributed model ADM,
Franchini & Galeati 1997).
The number N of points in the initial sample and the number of points in the reduced sample
were chosen according to the rule that these numbers must grow linearly with the dimension n,
from N=50 at n=2 to N=300 at n=30. For CRS2 and CRS4 the formula recommended by their
authors, N=10(n+1), was used. In ACCOL, ACDL, Multis and M-Simplex a fractional tolerance
of 0.001 was used. In the GA, fitness-rank elitist selection is used together with a complex
stopping rule preventing premature termination.
Since the GA uses discretized variables (we used 15-bit coding, i.e. the range of codes is
0...32767), an accurate comparison would only be possible if the values of the variables for the
other algorithms were discretized in the same range as well. This has been done for ACCO, ACD
and CRS4. The other algorithms, including the local search stages of ACCOL and ACDL, use
real-valued variables.
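The discretization amounts to snapping each real value onto the grid induced by the 15-bit
coding; a sketch (ours):

    def discretize(x, a, b, l=15):
        """Snap x in [a, b] onto the 2**l-point grid of the binary coding
        (codes 0 ... 32767 for l = 15), so that all algorithms compared
        search the same discrete set."""
        k = round((x - a) * (2**l - 1) / (b - a))  # nearest integer code
        return a + k * (b - a) / (2**l - 1)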
6. DISCUSSION
Algorithms which are permanently oriented towards the whole function domain have to
perform more function evaluations, that is, they have low efficiency (CRS2 and Multis). The lower
efficiency of the GA can also be attributed to the type of 'crossover' used (exchange of some of the
parents' coordinate values), which often leads to redundant evaluations of 'offspring' located in the
search space quite far from their highly fit parents, and hence normally having lower fitness. Thus
the fitness gained by the parents may not be inherited by many of their offspring. It was also found
that the GA often converges prematurely, especially in the variant with tournament selection.
Whether this feature is inherent to the whole class of evolutionary algorithms following the ideas
of natural evolution, which are indeed quite appealing but highly redundant, or whether it is a
feature of the GA version implemented in this study, has yet to be investigated. It is worth
mentioning that, reportedly, other types of crossover, like intermediate recombination in evolution
strategies (Back & Schwefel 1993), may improve the efficiency of evolutionary algorithms.
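For contrast with the bit-exchange crossover sketched earlier, intermediate recombination keeps
each offspring between its parents, so the parents' fitness is more likely to be inherited; a
minimal illustration (ours):

    def intermediate_recombination(S, T):
        """ES-style intermediate recombination: each child coordinate is
        the mean of the parents' coordinates, keeping offspring close to
        their (fit) parents."""
        return [(s + t) / 2.0 for s, t in zip(S, T)]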
The relatively higher efficiency of ACCOL and CRS4 can be explained by their orientation
towards smaller search domains, which is especially efficient in high dimensions. On some runs
ACDL has shown high efficiency, but its reliability was not the best.
7. CONCLUSIONS
1. Our experience showed that GO techniques are useful in solving various classes of
optimization problems. Among the GO algorithms compared, ACCOL and CRS4 showed the
highest effectiveness, efficiency and reliability. In many practical problems where one function
evaluation is expensive (slow), and the total number of evaluations is therefore the critical
parameter, ACCO (without the local search phase) would be the first choice to obtain a reasonable
optimizer assessment.
The ACDL algorithm proved to be efficient and effective on some of the runs with functions of
higher dimensions. However, accurate tuning of its parameters is needed to improve its
reliability.
M-Simplex performs very well on functions of low dimension, but in higher dimensions it
often converges prematurely to a local minimum.
GA, CRS2 and Multis provide reasonable solutions as well. However, all of them require
considerably more function evaluations, and the GA may also converge prematurely before it
reaches the global minimum. So for problems involving 'expensive' functions with continuous
variables there are better alternatives, such as ACCOL or CRS4. Our other experiments (Abebe
and Solomatine 1998), however, show that for certain classes of problems with highly discrete
variables, e.g. in water distribution network optimization, the GA, due to its inherently discrete
nature, can actually be more accurate than other algorithms built originally for continuous
variables (while still being less efficient than, for example, ACCO).
2. The choice between various methods of global optimization may depend on the type of
problem, and more research is needed to compare reportedly efficient methods like simulated
annealing, evolution strategies, topographical multilevel linkage, shuffled complex evolution and
others (see Ali and Storey 1994b, Locatelli and Schoen 1996, Neumaier 1998, Duan et al. 1993,
Kuczera 1997). The best results can probably be achieved by structural adaptation, that is, by
switching between different algorithms in the process of search.
3. In practically all problems with continuous variables where the use of GAs has been reported,
other GO algorithms can be used as well.
4. The GLOBE tool showed itself to be an efficient engine for model calibration; it can be
downloaded from www.ihe.nl/hi/.
REFERENCES
Abebe A.J. & Solomatine D.P. 1998. Application of global optimization to the design of pipe networks. Proc. Int.
Conf. Hydroinformatics-98.
Ali, M.M. & Storey, C. 1994a. Modified controlled random search algorithms. Intern. J. Computer Math., 53, pp.
229-235.
Ali, M.M. & Storey, C. 1994b. Topographical multilevel single linkage. J. of Global Optimization, 5, pp. 349-358.
Babovic, V., Wu, Z. & Larsen L.C. 1994. Calibrating hydrodynamic models by means of simulated evolution,
Proc. Int. Conf. on Hydroinformatics, Delft, The Netherlands. Balkema, Rotterdam, pp.193-200.
Back, T. & Schwefel, H.-P. 1993. An overview of evolutionary algorithms for parameter optimization.
Evolutionary Computation, 1, No. 1, pp. 1-23.
Brent, R.P. 1973. Algorithms for minimization without derivatives. Prentice-Hall, Englewood-Cliffs, N.J., 195p.
Cieniawski, S.E., Eheart, J.W. & Ranjithan, S. 1995. Using genetic algorithms to solve a multiobjective
groundwater monitoring problem. Water Resour. Res., 31 (2), 399-409.
Constantinescu A. 1996. Calibration of hydrodynamic numerical models using global optimization techniques.
M.Sc. thesis No. HH262, IHE, Delft, 85p.
Dixon, L.C.W. & Szego, G.P. (eds.) 1978. Towards global optimization, North-Holland, Amsterdam, 472p.
Duan, Q., Gupta, V. & Sorooshian, S. 1993. Shuffled complex evolution approach for effective and efficient global
minimization. J. of Optimiz. Theory Appl., 76 (3), pp. 501-521.
Franchini, M. & Galeati, G. 1997. Comparing several genetic algorithm schemes for the calibration of conceptual
rainfall-runoff models. Hydrol. Sci. J., 42 (3), 357 - 379.
Goldberg, D.E. 1989. Genetic algorithms in search, optimization and machine learning. Reading, MA: Addison-Wesley.
Griewank, A.O. 1981. Generalized descent for global optimization. J. Optimiz. Theory Appl., 34 (1), 11-39.
Jacobs, D.A.H. 1977. The state of the art in numerical analysis, Academic Press, London.
Kuczera, G. 1997. Efficient subspace probabilistic parameter optimization for catchment models. Water Resour. Res.