Topics in Complex Adaptive Systems, Spring Semester 2006
Genetic Algorithms
Stephanie Forrest, FEC 355E
http://cs.unm.edu/~forrest/cas-class-06.html forrest@cs.unm.edu 505-277-7104
Genetic Algorithms
Principles of natural selection applied to computation:
Variation
Selection
Inheritance
Evolution in a computer:
Individuals (genotypes) stored in the computer's memory
Evaluation of individuals (artificial selection)
Differential reproduction through copying and deletion
Variation introduced by analogy with mutation and crossover
Simple algorithm captures much of the richness seen in naturally evolving populations.
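A minimal sketch of such a loop, assuming a toy "one-max" fitness function, fitness-proportionate selection, 1-point crossover, and bit-flip mutation (all parameter values here are illustrative, not from the slides):

```python
import random

# Illustrative settings (assumptions, not taken from the slides)
L, POP_SIZE, P_MUT, GENERATIONS = 20, 50, 0.01, 100

def fitness(bits):
    """Toy 'one-max' fitness: number of 1 bits (stand-in for a real problem)."""
    return sum(bits)

def select(pop):
    """Fitness-proportionate selection of one parent."""
    return random.choices(pop, weights=[fitness(x) for x in pop], k=1)[0]

def crossover(a, b):
    """1-point crossover: child takes a prefix of one parent, suffix of the other."""
    cut = random.randint(1, L - 1)
    return a[:cut] + b[cut:]

def mutate(bits):
    """Bit-flip mutation, applied to each bit with probability P_MUT."""
    return [b ^ 1 if random.random() < P_MUT else b for b in bits]

pop = [[random.randint(0, 1) for _ in range(L)] for _ in range(POP_SIZE)]
for gen in range(GENERATIONS):
    pop = [mutate(crossover(select(pop), select(pop))) for _ in range(POP_SIZE)]
print(max(fitness(x) for x in pop))
```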
[Figures: the population at time Tn, and a plot of mean fitness and max fitness vs. generation]
F(x, y) = y·x² − x⁴
Decoding a candidate solution: bit string (Gray coded) → degray → base 2 → base 10; e.g., Gray-coded 1 1 1 degrays to 1 0 1 (base 2) = 5 (base 10).
Decimal   Binary   Gray code
0         000      000
1         001      001
2         010      011
3         011      010
4         100      110
5         101      111
6         110      101
7         111      100
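A small sketch of the standard binary-reflected Gray conversion behind this table (the function names are mine):

```python
def binary_to_gray(b: int) -> int:
    """Binary-reflected Gray code: adjacent integers differ by exactly one bit."""
    return b ^ (b >> 1)

def gray_to_binary(g: int) -> int:
    """Invert the Gray coding ('degray') by cumulative XOR over the bits."""
    b = 0
    while g:
        b ^= g
        g >>= 1
    return b

# Reproduces the table above, e.g. decimal 5 -> binary 101 -> Gray 111.
for d in range(8):
    print(d, format(d, '03b'), format(binary_to_gray(d), '03b'))
```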
Modeling:
Rule discovery in cognitive systems.
Learning strategies for games.
Affinity maturation in immune systems.
Ecosystem modeling.
Genetic Programming
Evolve populations of computer programs:
Typically use the language Lisp.
Select a set of primitive functions for each problem.
Represent the program as a syntax tree.
Many applications:
Optimal control (e.g., the pole-balancing problem)
Circuit design
Symbolic regression (data fitting)
Genetic Programming
Expression: x² + 3xy + y²
LISP: (+ (* x x) (* 3 x y) (* y y))
[Figure: the corresponding syntax tree, with + at the root and * nodes over the arguments x, 3, and y]
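One way to picture this in code, under the assumption that the reconstructed expression above is x² + 3xy + y²: represent the program as a nested structure mirroring the Lisp form and evaluate it recursively (the tuple encoding and names below are mine, not from the slides).

```python
import operator

# Primitive function set (an assumption for this example): + and *
FUNCS = {'+': operator.add, '*': operator.mul}

# The expression x^2 + 3xy + y^2 as a nested tuple mirroring the Lisp form
# (+ (* x x) (* 3 x y) (* y y)).
TREE = ('+', ('*', 'x', 'x'), ('*', 3, 'x', 'y'), ('*', 'y', 'y'))

def evaluate(node, env):
    """Recursively evaluate a syntax tree given variable bindings in env."""
    if isinstance(node, tuple):
        op = FUNCS[node[0]]
        args = [evaluate(child, env) for child in node[1:]]
        result = args[0]
        for a in args[1:]:          # fold n-ary calls, as Lisp does
            result = op(result, a)
        return result
    if isinstance(node, str):       # a variable
        return env[node]
    return node                     # a constant

print(evaluate(TREE, {'x': 2, 'y': 1}))   # 2^2 + 3*2*1 + 1^2 = 11
```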
References
J. H. Holland, Adaptation in Natural and Artificial Systems. Univ. of Michigan Press (1975); second edition, MIT Press (1992).
D. E. Goldberg, Genetic Algorithms in Search, Optimization, and Machine Learning. Addison-Wesley (1989).
M. Mitchell, An Introduction to Genetic Algorithms. MIT Press (1996).
S. Forrest, "Genetic algorithms: Principles of natural selection applied to computation." Science 261:872-878 (1993).
J. Koza, Genetic Programming. MIT Press (1992).
Schema processing
Schema Theorem
Implicit parallelism
Building block hypothesis
K-armed bandit analogy
[Diagram: the search space partitioned by the schemas 1***, 0***, *1**, *0**]
Schemas
Schemas capture important regularities in the search space, e.g., the individual 1 0 0 1 1 1 0 1 0 0 1 1 is an instance of the schemas * * * * 1 1 * * 0 * * * and * * 0 * * 1 * * 0 * 1 1.
Implicit parallelism: one individual samples many schemas simultaneously.
Schema Theorem: reproduction and crossover guarantee exponentially increasing samples of the observed best schemas.
Order of a schema, O(s) = number of defined bits.
Defining length of a schema, D(s) = distance between the outermost defined bits.
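A small sketch of these definitions (function names are mine); the example reuses the individual and first schema shown above.

```python
def matches(schema: str, individual: str) -> bool:
    """True if the individual is an instance of the schema ('*' = don't care)."""
    return all(s == '*' or s == b for s, b in zip(schema, individual))

def order(schema: str) -> int:
    """O(s): number of defined (non-*) bits."""
    return sum(c != '*' for c in schema)

def defining_length(schema: str) -> int:
    """D(s): distance between the outermost defined bits."""
    defined = [i for i, c in enumerate(schema) if c != '*']
    return defined[-1] - defined[0] if defined else 0

s = '****11**0***'
print(matches(s, '100111010011'), order(s), defining_length(s))  # True 3 4
```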
Schema Theorem
(Holland, 1975)
Let:
s be a schema in the population at time t,
N(s, t) be the number of instances of s at time t.
Question: What is the expected N(s, t+1)?
Assume fitness-proportionate selection:
Expected number of offspring of x = F(x) / F̄(t),
where F̄(t) is the mean fitness of the population at time t.
Ignoring crossover and mutation,
N(s, t+1) = N(s, t) · F̂(s, t) / F̄(t),
where F̂(s, t) is the observed mean fitness of the instances of s at time t.
Note: If F̂(s, t) / F̄(t) = c, a constant, then N(s, t) = c^t · N(s, 0), i.e., above-average schemas receive exponentially increasing numbers of samples.
Including the disruptive effects of crossover and mutation,
N(s, t+1) ≥ N(s, t) · (F̂(s, t) / F̄(t)) · [1 − p_c · D(s)/(l − 1)] · (1 − p_m)^O(s),
where p_c is the crossover probability, p_m the per-bit mutation probability, and l the string length.
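A tiny numeric illustration of this bound (all parameter values here are made up for the example):

```python
def schema_bound(N, fit_ratio, p_c, p_m, D, O, l):
    """Lower bound on E[N(s, t+1)] from the schema theorem."""
    return N * fit_ratio * (1 - p_c * D / (l - 1)) * (1 - p_m) ** O

# Example: 20 instances of a schema whose observed fitness is 1.2x the
# population mean, with p_c = 0.7, p_m = 0.001, D(s) = 4, O(s) = 3, l = 12.
print(schema_bound(20, 1.2, 0.7, 0.001, 4, 3, 12))   # ~17.8 expected instances
```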
[Figure: fitness vs. generation (0-300) for schema 2 and schema 3]
Study Question
Given a population of N individuals, each L bits long,
how many schemas are sampled by the population (in one generation)?
Hint:
Minimum value: 2^L
Maximum value: N × 2^L
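A brute-force check of this on a tiny example (the small L and N are my choices):

```python
from itertools import product

def schemas_of(individual):
    """All 2^L schemas an L-bit individual is an instance of."""
    return {''.join(c if keep else '*' for c, keep in zip(individual, mask))
            for mask in product([False, True], repeat=len(individual))}

pop = ['1011', '1011', '0000']            # N = 3 individuals, L = 4 bits
sampled = set().union(*(schemas_of(x) for x in pop))
print(len(sampled))                        # between 2^L = 16 and N * 2^L = 48
```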
Questions
When will genetic algorithms work well, and when will they not?
Not appropriate for problems where it is important to find the exact global optimum.
GA domains are typically those about which we have little analytical knowledge (complex, noisy, dynamic, poorly specified, etc.).
We would like a mathematical characterization that is predictive.
What makes a problem easy for genetic algorithms?
What distinguishes genetic algorithms from other optimization methods, such as hill climbing?
What does it mean for a genetic algorithm to perform well?
Building Block Hypothesis
1. The GA initially detects biases in low-order schemas,
2. Over time, information from low-order schemas is combined through crossover,
3. The GA detects biases in higher-order schemas,
4. Eventually converging on the most fit region of the space.
Implies that crossover is central to the success of the GA.
Claim: The GA allocates samples to schemas in a near-optimal way:
K-armed bandit argument.
Two-armed bandit: a gambler must decide how to allocate a fixed number of coins between the two arms of a slot machine.
The payoff processes of the two arms are each stationary and independent of one another.
The gambler does not know these payoffs and can estimate them only by playing coins on the different arms.
Claim: Optimal strategy is to exponentially increase the sampling rate of the observed best arm, as more samples are collected. Apply to schema sampling in GA:
The 3^L schemas in an L-bit search space can be viewed as the 3^L arms of a multi-armed slot machine.
The observed payoff of a schema H is simply its observed fitness.
Claim: The GA is a near-optimal strategy for sampling schemas; it maximizes on-line performance.
Common traps:
Infinite populations.
Analysis intractable except for short strings (16 bits or fewer).
Enumerating all possible populations.
Convergence proofs based on the idea that any string can mutate into any other string.
Weak bounds.
Implementation Issues
Genetic programming:
Example: Trigonometric functions
Modeling applications:
Example: Prisoner's Dilemma
Example: Classifier Systems
Example: Echo
Implementation Issues
Data structures:
Packed arrays of bits
Byte arrays
Vectors of real numbers
Lists and trees
Feature lists
Representation:
Binary encodings, Gray codes
Real numbers
Permutations
Trees
Crossover (sketched below):
1-point, 2-point, n-point
Uniform
Special operators
Mutation (sketched below):
Bit flips
Creep (Gaussian noise)
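A minimal sketch of the most common of these operators, for bit strings and real-valued vectors (the default parameter values are illustrative):

```python
import random

def one_point_crossover(a, b):
    """1-point crossover: swap the tails of the two parents at a random cut."""
    cut = random.randint(1, len(a) - 1)
    return a[:cut] + b[cut:], b[:cut] + a[cut:]

def uniform_crossover(a, b):
    """Uniform crossover: each position inherited from a randomly chosen parent."""
    pairs = [(x, y) if random.random() < 0.5 else (y, x) for x, y in zip(a, b)]
    return [p[0] for p in pairs], [p[1] for p in pairs]

def bit_flip(bits, p_m=0.01):
    """Bit-flip mutation with per-bit probability p_m."""
    return [b ^ 1 if random.random() < p_m else b for b in bits]

def creep(xs, sigma=0.1):
    """Creep mutation for real-valued genomes: add small Gaussian noise."""
    return [x + random.gauss(0.0, sigma) for x in xs]
```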
Selection: on next slide
Scaling: on next slide
Selection Methods
Fitness-proportionate selection (used in theoretical studies):
Expected value of individual i: ExpVal(i) = f_i / f̄
Implement as a roulette wheel, with each slice proportional to fitness:
[Pie chart: slices of 55%, 20%, 17%, 8%, and 0%]
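A minimal roulette-wheel sketch (the toy fitness values mirror the slices above):

```python
import random

def roulette(population, fitnesses):
    """Spin once: pick an individual with probability f_i / sum(f)."""
    total = sum(fitnesses)
    spin = random.uniform(0.0, total)
    running = 0.0
    for individual, f in zip(population, fitnesses):
        running += f
        if running >= spin:
            return individual
    return population[-1]            # guard against floating-point round-off

pop, fits = ['a', 'b', 'c', 'd', 'e'], [55, 20, 17, 8, 0]
print(roulette(pop, fits))           # 'a' is returned about 55% of the time
```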
(SKIP)
Rank selection:
Intended to prevent premature convergence (slow down evolution).
Each individual is ranked according to fitness; expected value depends on rank (Min and Max are constants).
Exponential scaling: f' = k^f
Normalizes the difference between 1.0 and 1.5, and between 1000.0 and 1000.5.
Example: tours (as in the Traveling Salesman Problem) represented as permutations of cities, e.g., parent tours 3 2 1 4 and 4 1 2 3.
Problem: Standard mutation and crossover do not produce legal tours. For example, 1-point crossover of 3 2 1 4 and 4 1 2 3 can yield 3 2 2 3, which visits cities 2 and 3 twice and never visits 1 and 4.
Solutions:
Other representations.
Other operators.
Penalize illegal solutions through the fitness function.
Specialized Operators
What information (schemas) should be preserved?
Absolute position in the sequence
Relative ordering in the sequence (precedence)
Adjacency relations
How much randomness is introduced?
Order crossover (Davis, 1985; sketched below)
Partially-mapped crossover (PMX) (Goldberg et al., 1985)
Cycle crossover (Oliver et al., 1987)
Edge recombination:
1. Try to preserve adjacencies present in the parents
2. Favor adjacencies common to both parents
3. When 1 and 2 fail, make a random selection
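A hedged sketch of one of these operators, a simplified variant of Davis's order crossover that fills unassigned positions left to right (the example tours are the 4-city parents from the earlier slide):

```python
import random

def order_crossover(p1, p2):
    """Order crossover (OX), simplified variant: keep a slice of p1, then fill
    the remaining positions with the leftover cities in the order they appear
    in p2, so every city appears exactly once (always a legal tour)."""
    n = len(p1)
    i, j = sorted(random.sample(range(n), 2))
    child = [None] * n
    child[i:j + 1] = p1[i:j + 1]                 # copy a slice from parent 1
    fill = [c for c in p2 if c not in child]     # remaining cities, in p2 order
    for k in range(n):
        if child[k] is None:
            child[k] = fill.pop(0)
    return child

print(order_crossover([3, 2, 1, 4], [4, 1, 2, 3]))  # a legal permutation of 1-4
```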
Evolution of Cooperation
(Axelrod, 1984)
Evolution of Cooperation
~1980: Two tournaments:
Each strategy (entry) encoded as a computer program.
Round robin (everyone plays everyone).
Tit-For-Tat (TFT) won both times.
Encoding:
Need to remember each player's moves for 3 time steps ==> 6 pieces of information.
At each point, either player could cooperate or defect (a binary decision).
2^6 = 64 possible histories.
The value of the bit at a given position tells the strategy what to do (0 = cooperate, 1 = defect) in the context of that history.
An additional 6 bits encode assumptions about the interactions that preceded the start of the game.
E.g.,
A history of mutual cooperation for 3 time steps ==> position 0.
A history of mutual defection for 3 time steps ==> position 63.
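A sketch of this table lookup; the bit conventions (0 = cooperate, 1 = defect, most recent moves in the low-order bits) are assumptions made for illustration:

```python
def history_index(my_moves, opp_moves):
    """Pack the last 3 moves of each player (0 = cooperate, 1 = defect) into
    a 6-bit index, ranging from 0 (mutual cooperation) to 63 (mutual defection)."""
    bits = []
    for mine, theirs in zip(my_moves[-3:], opp_moves[-3:]):
        bits += [mine, theirs]
    index = 0
    for b in bits:
        index = (index << 1) | b
    return index

# A strategy is a 64-entry table plus 6 bits of assumed pre-game history.
table = [0] * 64                                    # e.g., "always cooperate"
print(history_index([0, 0, 0], [0, 0, 0]))          # 0  (mutual cooperation)
print(history_index([1, 1, 1], [1, 1, 1]))          # 63 (mutual defection)
print(table[history_index([0, 0, 0], [0, 0, 0])])   # next move: 0 = cooperate
```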
Co-evolutionary environment:
Strategies play against other strategies in the population.
[Table: key/value pairs 3 → 1, 1 → 5, 5 → 4, 4 → 2, 2 → 3; rotated layout, starting at position 2]