__________________________________________________________________________________
Chapter 1
Genetic Algorithms
[Diagram: GA flowchart. Start -> Select Individuals -> Mutate Individuals -> Satisfy
Terminating Criteria? (Yes: End; No: continue)]
GENETIC ALGORITHMS
__________________________________________________________________________________
1.2 REPRESENTATIONS OF GA
The early GA systems primarily involved binary strings, which remain popular
because of their theoretical convenience. The range of representations has widened
considerably in recent years, and individuals are now commonly represented as
sequences of real numbers, structures, etc. These different forms maintain
precision, allow non-binary genetic operations to be performed and eliminate
the need to convert numbers to strings (Wright 1991; Davis 1991).
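To illustrate the conversion step that a real-valued representation avoids, the
following minimal sketch decodes a binary-string individual into a real number. The
function name decode_binary, the interval bounds and the 8-bit length are assumptions
chosen for illustration; they are not taken from the cited works.

```python
def decode_binary(bits, lower, upper):
    """Map a list of 0/1 genes onto a real value in [lower, upper]."""
    as_int = int("".join(str(b) for b in bits), 2)   # binary string -> integer
    max_int = 2 ** len(bits) - 1                      # largest encodable integer
    return lower + (upper - lower) * as_int / max_int


# A binary individual must be decoded before it can be evaluated ...
binary_individual = [1, 0, 0, 0, 0, 0, 0, 0]
x = decode_binary(binary_individual, 0.0, 1.0)

# ... whereas a real-valued individual carries its values directly.
real_individual = [0.5, -1.25]
```

With 8 bits the interval is covered in steps of 1/255, which is the kind of precision
limit that a real-valued representation does not have.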
1.3 GA PARAMETERS
There are several important parameters to consider when working with GA.
These include the population size, the crossover rate and the mutation rate. The
following lists the parameter ranges that are currently popular:
population size : 10 to 100 (see footnote 1)
crossover rate : 0.4 to 1.0
mutation rate : 0.005 to 0.01
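The three parameters can be grouped into a single settings object. The sketch below is
only one possible arrangement: the class name GAParameters, the method name and the
default values are assumptions, and the range check simply restates the ranges listed
above.

```python
from dataclasses import dataclass


@dataclass
class GAParameters:
    # Default values are illustrative assumptions inside the listed ranges.
    population_size: int = 50
    crossover_rate: float = 0.7
    mutation_rate: float = 0.01

    def within_typical_ranges(self):
        """Return True if every setting lies inside the ranges listed above."""
        return (10 <= self.population_size <= 100
                and 0.4 <= self.crossover_rate <= 1.0
                and 0.005 <= self.mutation_rate <= 0.01)


params = GAParameters(population_size=30, crossover_rate=0.9)
typical = params.within_typical_ranges()
```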
1.4 GA POPULATION
1.4.1 Population Size
It has been reported that a binary representation needs only a small population to
work efficiently, while other representations require bigger populations (Reeves 1993).
To obtain answers quickly, that is, to obtain good online performance, a small
population, a high crossover rate and a high mutation rate are preferred. However, a
small population may not cover the search space completely and readily loses useful
schemata. It has also been observed (on a uniformly scaled linear problem and a
massively multimodal deceptive problem) that computational cost tends to increase
quickly when the population size becomes smaller than a critical size.
A large population is preferred if little is known about the problem (Goldberg
et al. 1995). On the other hand, a large population can result in suboptimal solutions
too, especially if computing resources are limited and do not permit sufficient
generations.
1
Some researchers suggested a population size of 20 - 30 and a crossover rate of 0.75 - 0.95
instead (Schaffer, Caruana, Eshelman & Das 1989).
From the above discussion, it can be gathered that GA attempts to achieve two
conflicting ‘goals’. GA tries to locate the optimal solution (accuracy) and also tries to
converge as fast as possible to save computing resources (efficiency). It is difficult to
balance the parameters such that both goals can be satisfied.
Some of the above approaches can be applied repeatedly, once for every generation
of the population. For example, methods (b) and (c) can help to apply selective
pressure on the population, method (d) can maintain diversity and prevent premature
convergence, and method (e) can guide the GA search.
1.6 SELECTION
Each individual is ‘reproduced’ according to its fitness and the number of instances
of the individual in the population. The following sections list a few of the selection
methods which have been designed to affect the population in different ways. Most
methods involve selecting individuals for the next generation. Others investigate
how the individuals in a population can be replaced.
2
Assume that each individual has 10 genes and that there are 3 possible alleles (0, 1, *) for each
gene. There will then be a maximum of 3^10 (59,049) different schemata for the individual. If we have
a population of 100 individuals, we will get a total of almost 6 million schemata.
3
The Schema Theorem probably still needs further ‘enhancement’ (Grefenstette & Baker 1989).
Implicit parallelism describes the searching of many schemata in parallel and does not describe
the effort required for various search regions. The theorem does not attempt to explain why GA is able
to generate better individuals via crossover (Mühlenbein 1991). Besides, the theorem assumes that GA
knows beforehand which schemata are relevant so that higher fitness can be allocated.
individuals (obtained via genetic operations) are fitter than the worst individuals in the
population. This process has O(n) time complexity.
This selection mechanism is used in subsequent studies on drug design.
Unlike in most TGAs (diagram 1-1), this selection is performed after the genetic
operators. This allows the worst individuals to be replaced by new individuals
during the first generation.
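A replacement step of this kind can be sketched as follows. This is a minimal
illustration under stated assumptions, not the mechanism used in the drug design
studies: the function name replace_worst is hypothetical, fitness is assumed to be
maximised, and a single new individual is handled per call. One linear scan finds the
worst individual, which is where the O(n) cost comes from.

```python
def replace_worst(population, child, fitness):
    """Replace the least fit individual with `child` if the child is fitter.

    Returns True if a replacement took place. Assumes higher fitness is better.
    """
    # One pass over the population to find the worst individual: O(n).
    worst_index = min(range(len(population)),
                      key=lambda i: fitness(population[i]))
    if fitness(child) > fitness(population[worst_index]):
        population[worst_index] = child
        return True
    return False


def count_ones(individual):
    # Toy fitness function, used only for this example.
    return sum(individual)


pop = [[1, 1, 0], [0, 0, 0], [1, 0, 0]]
replaced = replace_worst(pop, [1, 1, 1], count_ones)   # [0, 0, 0] is replaced
```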
1.7 CROSSOVER
A pair of children is obtained by crossing over a pair of individuals
(parents). Crossover is applied so that schemata from the two parents can be
combined to give better children.
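As an illustration, the sketch below implements one-point crossover, one common way of
producing a pair of children from a pair of parents. The function name and the rng
parameter are assumptions, and parents are assumed to be lists of equal length with at
least two genes.

```python
import random


def one_point_crossover(parent_a, parent_b, rng=random):
    """Swap the tails of two parents at a random point; return two children."""
    # The point lies strictly inside the string, so each child mixes both parents.
    point = rng.randrange(1, len(parent_a))
    child_a = parent_a[:point] + parent_b[point:]
    child_b = parent_b[:point] + parent_a[point:]
    return child_a, child_b


children = one_point_crossover([0, 0, 0, 0, 0, 0], [1, 1, 1, 1, 1, 1])
```

At every gene position the two children together carry exactly the two parental
alleles, so the schemata of both parents are recombined rather than lost.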
1.8 MUTATION
A gene of an individual is randomly selected. The allele is then randomly
changed to another value. In this way, mutation helps to introduce new schemata and
insures against the loss of useful schemata through crossover and selection.
It is suggested that mutation is capable of higher levels of disruption than
crossover and that mutation can perform whatever exploration is done by
crossover (Spears 1993). Crossover tends to preserve alleles that are common to the
pair of parents, and this may limit the amount of exploration possible. Mutation has
been given more importance in recent works, as described in the following sections
(Back 1995).
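The mutation step described above can be sketched as follows. The function name, the
default binary allele set and the rng parameter are assumptions made for illustration.

```python
import random


def mutate(individual, alleles=(0, 1), rng=random):
    """Return a copy of `individual` with one randomly chosen gene changed."""
    mutant = list(individual)
    site = rng.randrange(len(mutant))                  # randomly selected gene
    # Choose among the other alleles so that the gene really changes value.
    choices = [a for a in alleles if a != mutant[site]]
    mutant[site] = rng.choice(choices)
    return mutant


parent = [0, 1, 0, 1]
child = mutate(parent)
```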
1.11 SEGA
Self Adaptive GA (SEGA) (Hiroshi 1994) involves Reorder, HillCrossover and
HillMutation. SEGA differs from TGA in several respects. SEGA does not have
any explicit selection mechanism and uses a Reorder mechanism instead (section
1.11.1). In addition, SEGA makes use of hillclimbing operators, which are absent
in TGA.
1.11.1 Reorder
The population of individuals is reshuffled so that different pairs of individuals
can be crossed over. There is no selection in this process. The reshuffling process
has O(n) time complexity.
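A reshuffling step with this complexity can be sketched as below. The function name
reorder and the pairing of neighbours after the shuffle are assumptions; with an odd
population size the last individual is simply left unpaired in this sketch.

```python
import random


def reorder(population, rng=random):
    """Shuffle the population and pair up neighbours for crossover."""
    shuffled = list(population)
    rng.shuffle(shuffled)          # in-place Fisher-Yates shuffle, O(n)
    # Adjacent individuals in the shuffled order become crossover partners.
    return [(shuffled[i], shuffled[i + 1])
            for i in range(0, len(shuffled) - 1, 2)]


pairs = reorder(["a", "b", "c", "d", "e", "f"])
```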
1.11.2 HillClimbers
The HillClimbing operation searches by moving to a fitter neighbour and
replacing the original individual by the neighbour. It is able to obtain solutions faster
than genetic operators. However, it is often only capable of locating a local optimum
and needs to be started in a good region of the search space. This is especially so if
the search space has many peaks. There are three main hillclimbing strategies (see
footnote 4):
- Random-mutation (RMHC) : mutates a site randomly until improvement is
obtained. TGA has been found to perform worse than RMHC on a number of
test functions (Forrest & Mitchell 1993).
- Next Ascent (NAHC) : changes bits from left to right of the chromosome
until improvement is obtained.
- Steepest Ascent (SAHC) : takes only the best possible improvement. (Forrest
& Mitchell 1993)
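The three strategies can be compared in a small sketch. Each function below performs a
single hillclimbing move on a binary string, following the short descriptions in the
list above rather than the full procedures of Forrest & Mitchell. The function names,
the max_tries cut-off and the toy fitness function are assumptions, and fitness is
assumed to be maximised.

```python
import random


def flip(bits, i):
    """Return a copy of `bits` with bit i inverted."""
    out = list(bits)
    out[i] = 1 - out[i]
    return out


def rmhc_step(bits, fitness, rng=random, max_tries=100):
    """Random-mutation: flip random sites until one improves the fitness."""
    base = fitness(bits)
    for _ in range(max_tries):
        candidate = flip(bits, rng.randrange(len(bits)))
        if fitness(candidate) > base:
            return candidate
    return bits                      # no improvement found within max_tries


def nahc_step(bits, fitness):
    """Next ascent: try the bits from left to right, take the first improvement."""
    base = fitness(bits)
    for i in range(len(bits)):
        candidate = flip(bits, i)
        if fitness(candidate) > base:
            return candidate
    return bits


def sahc_step(bits, fitness):
    """Steepest ascent: evaluate every single-bit flip, take the best one."""
    best = max((flip(bits, i) for i in range(len(bits))), key=fitness)
    return best if fitness(best) > fitness(bits) else bits


def weighted(bits):
    # Toy fitness: the bits are worth 1, 3 and 2 points respectively.
    return sum(w * b for w, b in zip((1, 3, 2), bits))


start = [0, 0, 0]
first_found = nahc_step(start, weighted)   # [1, 0, 0]: first improving bit
steepest = sahc_step(start, weighted)      # [0, 1, 0]: largest improvement
```

On the same starting point, next ascent accepts the leftmost improvement while steepest
ascent pays for evaluating all neighbours in order to take the largest one.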
The hill operators confer a number of advantages on SEGA. In SEGA, every
individual controls its own rate of adaptation (see footnote 5). This results in a
self-adaptive population. Because each individual has a variable mutation rate,
convergence may occur only after many generations (Back 1995).
1.11.3 HillCrossover
The HillCrossover operator is a hybrid of RMHC and OnePointCrossover,
and it has been adopted by SEGA. HillCrossover succeeds only if at least one of the
children has better fitness. If neither child has better fitness, another crossover
point is randomly selected and OnePointCrossover is performed again. If none of the
crossover points results in fitter children, no HillCrossover is performed for the pair
of parents.
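The retry logic of HillCrossover can be sketched as follows. This is an illustration
under stated assumptions rather than the SEGA implementation: the function name is
hypothetical, "better fitness" is taken to mean fitter than the fitter of the two
parents, fitness is maximised, and each crossover point is tried at most once, in
random order.

```python
import random


def hill_crossover(parent_a, parent_b, fitness, rng=random):
    """Try one-point crossover at random points until a child beats both parents.

    Returns the pair of children, or None if no crossover point succeeds.
    """
    target = max(fitness(parent_a), fitness(parent_b))
    points = list(range(1, len(parent_a)))
    rng.shuffle(points)                     # visit the points in random order
    for point in points:
        child_a = parent_a[:point] + parent_b[point:]
        child_b = parent_b[:point] + parent_a[point:]
        if max(fitness(child_a), fitness(child_b)) > target:
            return child_a, child_b
    return None                             # crossover is not performed


def count_ones(individual):
    # Toy fitness function, used only for this example.
    return sum(individual)


result = hill_crossover([1, 1, 0, 0], [0, 0, 1, 1], count_ones)
```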
HillCrossover is found to perform better than GA or HillClimber:
- HillCrossover integrates the benefits of GA and HillClimber. GA is good at
locating fit regions quickly but tends to spend a lot of time hunting for the
optimal solution. On the other hand, HillClimber is useful for finding the global
minimum rapidly but may get stuck in a local minimum if it is started in an unfit
region (Baluja 1993). Hence, HillClimbers are often used in the late stages of
search (Mühlenbein et al. 1991).
- Positional bias is lowered when Crossover is repeatedly performed. More
exploration of search space occurs and results in less spurious correlation by
bad schemata. (Eshelman, Caruana & Schaffer 1989)
4
Different hillclimbing methods seem to work well for different problems. RMHC is able to locate
the optimum faster than the rest when applied to the Royal Road functions (Forrest & Mitchell 1993).
1.11.4 HillMutation
In SEGA, the mutation operator has been designed to succeed only if
hillclimbing occurs. No new individual is produced if it is weaker than the original
parental individual.
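A mutation operator with this acceptance rule can be sketched as below. The function
name, the binary allele set and the rng parameter are assumptions. Following the
wording above, the mutant is discarded only when it is weaker than its parent, so a
mutant of equal fitness is kept in this sketch; fitness is assumed to be maximised.

```python
import random


def hill_mutation(individual, fitness, alleles=(0, 1), rng=random):
    """Mutate one random gene and keep the mutant unless it is weaker."""
    mutant = list(individual)
    site = rng.randrange(len(mutant))
    mutant[site] = rng.choice([a for a in alleles if a != mutant[site]])
    # A weaker mutant is rejected and the parental individual is returned.
    return mutant if fitness(mutant) >= fitness(individual) else individual


def count_ones(individual):
    # Toy fitness function, used only for this example.
    return sum(individual)


improved = hill_mutation([0, 0, 0], count_ones)    # any flip adds a one
unchanged = hill_mutation([1, 1, 1], count_ones)   # any flip would be weaker
```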
5
There are other forms of adaptation. These include representation adaptation (Shaefer 1987), fitness
adaptation (Whitley 1987) and probability-of-operator-application adaptation (Davis 1989).