Introduction To Genetic Programming Tutorial GECCO-2004-SEATTLE SUNDAY JUNE 27, 2004
INTRODUCTION TO GENETIC
PROGRAMMING
TUTORIAL
GECCO-2004—SEATTLE
SUNDAY JUNE 27, 2004
John R. Koza
Consulting Professor (Medical Informatics)
Department of Medicine
School of Medicine
Consulting Professor
Department of Electrical Engineering
School of Engineering
Stanford University
Stanford, California 94305
E-MAIL: koza@stanford.edu
http://www.smi.stanford.edu/people/koza/
http://www.genetic-programming.org
http://www.genetic-programming.com
THE CHALLENGE
MAIN POINTS
• Decision trees
• If-then production rules (e.g., expert systems)
• Horn clauses
• Neural nets (matrices of numerical weights)
• Bayesian networks
• Frames
• Propositional logic
• Binary decision diagrams
• Formal grammars
• Vectors of numerical coefficients for polynomials (adaptive
systems)
• Tables of values (reinforcement learning)
• Conceptual clusters
• Concept sets
• Parallel if-then rules (e.g., genetic classifier systems)
A COMPUTER PROGRAM
REPRESENTATION
• "Our view is that computer programs are the best
representation of computer programs."
[Figure: flowchart of one run of genetic programming. Within each generation, a counter i runs from 0 to the population size M; when i = M, Gen := Gen + 1. For each new individual, a genetic operation is selected probabilistically:]
• Pr: select one individual based on fitness, perform reproduction, and copy it into the new population
• Pc: select two individuals based on fitness, perform crossover, and insert the offspring into the new population (i := i + 1)
• Pm: select one individual based on fitness, perform mutation, and insert the mutant into the new population
• Select an architecture-altering operation based on its specified probability
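The generational loop in the flowchart can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the implementation behind the slides: the names `tournament`, `next_generation`, and `run` are hypothetical, tournament selection stands in for "select based on fitness", lower fitness is treated as better, and architecture-altering operations are left out.

```python
import random

def tournament(population, fitness, rng, size=2):
    """Pick one individual based on fitness (assumption: lower is better)."""
    contenders = [rng.choice(population) for _ in range(size)]
    return min(contenders, key=fitness)

def next_generation(population, fitness, crossover, mutate, rng,
                    p_reproduce=0.1, p_crossover=0.8):
    """Fill a new population of the same size M, one genetic operation at a time."""
    m = len(population)
    new_population = []
    while len(new_population) < m:
        r = rng.random()
        if r < p_reproduce:
            # Pr: copy one selected individual unchanged
            new_population.append(tournament(population, fitness, rng))
        elif r < p_reproduce + p_crossover:
            # Pc: two selected parents yield two offspring
            a = tournament(population, fitness, rng)
            b = tournament(population, fitness, rng)
            new_population.extend(crossover(a, b, rng))
        else:
            # Pm: one selected parent yields one mutant
            new_population.append(mutate(tournament(population, fitness, rng), rng))
    return new_population[:m]  # crossover can overshoot M by one

def run(population, fitness, crossover, mutate, max_generations, is_solution, seed=0):
    """Iterate generations until the success predicate holds or the limit is hit."""
    rng = random.Random(seed)
    for gen in range(max_generations):
        best = min(population, key=fitness)
        if is_solution(best):
            return gen, best
        population = next_generation(population, fitness, crossover, mutate, rng)
    return max_generations, min(population, key=fitness)
```

The `crossover` and `mutate` callables are supplied by the caller, so the same loop works for any program representation.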
COMPUTER PROGRAM
=PARSE TREE=PROGRAM TREE
=PROGRAM IN LISP=DATA=LIST
(+ 1 2 (IF (> TIME 10) 3 4))
• Terminal set T = {1, 2, 10, 3, 4, TIME}
• Function set F = {+, IF, >}
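[Figure: program tree for this expression. The root + has three children: 1, 2, and IF. IF has three children: >, 3, and 4. > has two children: TIME and 10.]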
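To make the "program = parse tree = data" identity concrete, the expression (+ 1 2 (IF (> TIME 10) 3 4)) can be written as a nested Python tuple and evaluated recursively. This is an illustrative sketch: the `evaluate` function and the tuple encoding are assumptions made for this example, not part of the tutorial.

```python
def evaluate(expr, env):
    """Evaluate a program tree given as nested tuples (function, arg1, arg2, ...)."""
    if isinstance(expr, str):          # a variable terminal such as TIME
        return env[expr]
    if not isinstance(expr, tuple):    # a constant terminal such as 1 or 10
        return expr
    op, *args = expr
    if op == "IF":                     # evaluate only the branch that is taken
        cond, then_branch, else_branch = args
        chosen = then_branch if evaluate(cond, env) else else_branch
        return evaluate(chosen, env)
    values = [evaluate(a, env) for a in args]
    if op == "+":
        return sum(values)
    if op == ">":
        return values[0] > values[1]
    raise ValueError(f"unknown function: {op}")

# (+ 1 2 (IF (> TIME 10) 3 4))
PROGRAM = ("+", 1, 2, ("IF", (">", "TIME", 10), 3, 4))

print(evaluate(PROGRAM, {"TIME": 11}))  # 1 + 2 + 3
print(evaluate(PROGRAM, {"TIME": 5}))   # 1 + 2 + 4
```

Because the program is ordinary data, the genetic operations can cut and splice it with plain tuple manipulation.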
MUTATION OPERATION
[Figure: mutation example on Boolean program trees over inputs D0, D1, and D2 using AND, NOR, and NOT, with points numbered 1 to 7]
CROSSOVER (SEXUAL
RECOMBINATION) OPERATION FOR
COMPUTER PROGRAMS
• Select two parents probabilistically based on fitness
• Randomly pick a number from 1 to NUMBER-OF-POINTS
– independently for each of the two parental programs
• Identify the two subtrees rooted at the two picked points
[Figure: the two parental program trees with their points numbered depth-first, 1 to 7 in Parent 1 and 1 to 9 in Parent 2]
Parent 1:
(+ (* 0.234 Z) (- X 0.789))
Parent 2:
(* (* Z Y) (+ Y (* 0.314 Z)))
[Figure: the two offspring program trees, equivalent to $Y + 0.314Z + X - 0.789$ and $0.234Z^2Y$]
Offspring 1:
(+ (+ Y (* 0.314 Z))
(- X 0.789))
Offspring 2:
(* (* Z Y) (* 0.234 Z))
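The crossover steps can be sketched in Python on the two parents shown. The helper names (`points`, `subtree`, `replace_at`, `crossover_at`) are hypothetical. Picking point 2 in Parent 1 and point 5 in Parent 2 is inferred from the offspring listed on the slide, under a depth-first numbering of points.

```python
def points(tree, path=()):
    """Yield the path to every point of the tree in depth-first order."""
    yield path
    if isinstance(tree, tuple):
        for i, child in enumerate(tree[1:], start=1):
            yield from points(child, path + (i,))

def subtree(tree, path):
    """Return the subtree rooted at the given path."""
    for i in path:
        tree = tree[i]
    return tree

def replace_at(tree, path, new):
    """Return a copy of tree with the subtree at path replaced by new."""
    if not path:
        return new
    i = path[0]
    return tree[:i] + (replace_at(tree[i], path[1:], new),) + tree[i + 1:]

def crossover_at(parent1, n1, parent2, n2):
    """Swap the subtrees rooted at point n1 of parent1 and point n2 of parent2."""
    path1 = list(points(parent1))[n1 - 1]   # points are numbered from 1
    path2 = list(points(parent2))[n2 - 1]
    child1 = replace_at(parent1, path1, subtree(parent2, path2))
    child2 = replace_at(parent2, path2, subtree(parent1, path1))
    return child1, child2

def crossover(parent1, parent2, rng):
    """Pick each crossover point independently, from 1 to NUMBER-OF-POINTS."""
    n1 = rng.randint(1, len(list(points(parent1))))
    n2 = rng.randint(1, len(list(points(parent2))))
    return crossover_at(parent1, n1, parent2, n2)

PARENT1 = ("+", ("*", 0.234, "Z"), ("-", "X", 0.789))
PARENT2 = ("*", ("*", "Z", "Y"), ("+", "Y", ("*", 0.314, "Z")))
OFFSPRING1, OFFSPRING2 = crossover_at(PARENT1, 2, PARENT2, 5)
```

Since whole subtrees are exchanged, both offspring remain syntactically valid programs whatever points are picked.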
[Figure: the preparatory steps (terminal set, function set, fitness measure, parameters, and termination criterion) are the inputs to GP, which produces a computer program]
SYMBOLIC REGRESSION OF
QUADRATIC POLYNOMIAL $X^2 + X + 1$
GENERATION 0
(a) (- (+ x 1) 0), equivalent to $x + 1$
(b) (+ 1 (* x x)), equivalent to $x^2 + 1$
(c) (+ 2 0), equivalent to 2
(d) (* x (- -1 -2)), equivalent to $x$
GENERATION 1
(a) (- (+ x 1) 0), equivalent to $x + 1$
(b) (+ (% x x) 0), equivalent to 1
(c) (- x 0), equivalent to $x$
(d) (+ 1 (* (+ x 1) x)), equivalent to $x^2 + x + 1$
SYMBOLIC REGRESSION
OF QUARTIC POLYNOMIAL $X^4 + X^3 + X^2 + X$
(WITH 21 FITNESS CASES)
Independent variable X (input)   Dependent variable Y (output)
-1.0 0.0000
-0.9 -0.1629
-0.8 -0.2624
-0.7 -0.3129
-0.6 -0.3264
-0.5 -0.3125
-0.4 -0.2784
-0.3 -0.2289
-0.2 -0.1664
-0.1 -0.0909
0 0.0
0.1 0.1111
0.2 0.2496
0.3 0.4251
0.4 0.6496
0.5 0.9375
0.6 1.3056
0.7 1.7731
0.8 2.3616
0.9 3.0951
1.0 4.0000
TABLEAU FOR SYMBOLIC REGRESSION
OF QUARTIC POLYNOMIAL $X^4 + X^3 + X^2 + X$
Objective: Find a function of one independent variable, in symbolic form, that fits a given sample of 21 $(x_i, y_i)$ data points.
Terminal set: X (the independent variable).
Function set: +, -, *, %, SIN, COS, EXP, RLOG.
Fitness cases: The given sample of 21 data points $(x_i, y_i)$, where the $x_i$ are in the interval [-1, +1].
Raw fitness: The sum, taken over the 21 fitness cases, of the absolute value of the difference between the value of the dependent variable produced by the individual program and the target value $y_i$ of the dependent variable.
Standardized fitness: Equals raw fitness.
Hits: Number of fitness cases (0 to 21) for which the value of the dependent variable produced by the individual program comes within 0.01 of the target value $y_i$ of the dependent variable.
Wrapper: None.
Parameters: Population size, M = 500. Maximum number of generations to be run, G = 51.
Success predicate: An individual program scores 21 hits.
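The raw-fitness and hits measures of the tableau can be sketched in Python as below. Assumptions are labeled in the comments: % is taken to be protected division (returning 1 when the divisor is 0) and RLOG a protected logarithm (returning 0 for an argument of 0 and otherwise the log of the absolute value), and programs are encoded as nested tuples. The two programs evaluated are the generation-0 and generation-34 individuals quoted in this tutorial.

```python
import math

def pdiv(a, b):
    """% (assumed protected division): returns 1 when the divisor is 0."""
    return 1.0 if b == 0 else a / b

def rlog(a):
    """RLOG (assumed protected log): 0 for argument 0, else log of |a|."""
    return 0.0 if a == 0 else math.log(abs(a))

FUNCTIONS = {
    "+": lambda a, b: a + b,
    "-": lambda a, b: a - b,
    "*": lambda a, b: a * b,
    "%": pdiv,
    "SIN": math.sin, "COS": math.cos, "EXP": math.exp, "RLOG": rlog,
}

def evaluate(expr, x):
    """Evaluate a nested-tuple program at the value x of the terminal X."""
    if expr == "X":
        return x
    if not isinstance(expr, tuple):
        return float(expr)
    op, *args = expr
    return FUNCTIONS[op](*(evaluate(a, x) for a in args))

def target(x):
    return x**4 + x**3 + x**2 + x

# 21 fitness cases: x = -1.0, -0.9, ..., 1.0
FITNESS_CASES = [(i / 10, target(i / 10)) for i in range(-10, 11)]

def raw_fitness(program):
    """Sum of absolute errors over the fitness cases."""
    return sum(abs(evaluate(program, x) - y) for x, y in FITNESS_CASES)

def hits(program):
    """Number of fitness cases matched to within 0.01."""
    return sum(1 for x, y in FITNESS_CASES if abs(evaluate(program, x) - y) < 0.01)

# (* X (+ (+ (- (% X X) (% X X)) (SIN (- X X))) (RLOG (EXP (EXP X)))))
BEST_GEN0 = ("*", "X", ("+", ("+", ("-", ("%", "X", "X"), ("%", "X", "X")),
                              ("SIN", ("-", "X", "X"))),
                        ("RLOG", ("EXP", ("EXP", "X")))))

# (+ X (* (+ X (* (* (+ X (- (COS (- X X)) (- X X))) X) X)) X))
BEST_OF_RUN = ("+", "X", ("*", ("+", "X", ("*", ("*", ("+", "X",
               ("-", ("COS", ("-", "X", "X")), ("-", "X", "X"))), "X"), "X")), "X"))

print(raw_fitness(BEST_GEN0), hits(BEST_GEN0))
print(raw_fitness(BEST_OF_RUN), hits(BEST_OF_RUN))
```

The raw fitness computed here for the generation-0 individual need not reproduce the figure quoted on the slide, since that depends on the exact fitness cases and function conventions used in the original run.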
SYMBOLIC REGRESSION
OF QUARTIC POLYNOMIAL $X^4 + X^3 + X^2 + X$
WORST-OF-GENERATION INDIVIDUAL
IN GENERATION 0 WITH RAW FITNESS
OF 1038
(EXP (- (% X (- X (SIN X)))
(RLOG (RLOG (* X X)))))
Equivalent to
$e^{x/(x - \sin x) - \log \log x^2}$
SYMBOLIC REGRESSION
OF QUARTIC POLYNOMIAL $X^4 + X^3 + X^2 + X$
(COS (COS (+ (- (* X X) (% X
X)) X)))
Equivalent to
$\cos[\cos(x^2 + x - 1)]$
[Figure: plot of this individual against the target $x^4 + x^3 + x^2 + x$ for x from -1 to 1]
SYMBOLIC REGRESSION
OF QUARTIC POLYNOMIAL $X^4 + X^3 + X^2 + X$
BEST-OF-GENERATION INDIVIDUAL IN
GENERATION 0 WITH RAW FITNESS OF
4.47 (AVERAGE ERROR OF 0.2)
(* X (+ (+ (- (% X X) (% X X))
(SIN (- X X))) (RLOG (EXP (EXP
X)))))
Equivalent to
$x e^x$
[Figure: plot of $x e^x$ against the target $x^4 + x^3 + x^2 + x$ for x from -1 to 1]
SYMBOLIC REGRESSION
OF QUARTIC POLYNOMIAL $X^4 + X^3 + X^2 + X$
BEST-OF-GENERATION INDIVIDUAL IN
GENERATION 2 WITH RAW FITNESS OF
2.57 (AVERAGE ERROR OF 0.1)
(+ (* (* (+ X (* X (* X (% (% X
X) (+ X X)))))
(+ X (* X X))) X) X)
Equivalent to...
$x^4 + 1.5x^3 + 0.5x^2 + x$
SYMBOLIC REGRESSION
OF QUARTIC POLYNOMIAL $X^4 + X^3 + X^2 + X$
BEST-OF-RUN INDIVIDUAL IN
GENERATION 34 WITH RAW FITNESS
OF 0.00 (100%-CORRECT)
(+ X (* (+ X (* (* (+ X (- (COS
(- X X)) (- X X))) X) X)) X))
Equivalent to
$x^4 + x^3 + x^2 + x$
[Figure: program tree of the best-of-run individual]
SYMBOLIC REGRESSION
OF QUARTIC POLYNOMIAL $X^4 + X^3 + X^2 + X$
OBSERVATIONS
• GP works on this problem
• GP determines the size and shape of the
solution
– number of operations needed to solve the problem
– size and shape of the program tree
– content of the program tree (i.e., sequence of operations)
• GP operates the same whether the solution
is linear, polynomial, a rational fraction of
polynomials, exponential, trigonometric, etc.
• It's not how a human programmer would
have done it
– Cos (X - X) = 1
– Not parsimonious
• The extraneous functions (SIN, EXP,
RLOG, and COS) are absent in the best
individual of later generations because they
are detrimental
– Cos (X - X) = 1 is the exception that proves the rule
• The answer is algebraically correct (hence
no further cross validation is needed)
CLASSIFICATION PROBLEM
INTER-TWINED SPIRALS
WALL-FOLLOWING PROBLEM
12 SONAR SENSORS
[Figure: robot with 12 sonar sensors and example readings: S00 = 12.4, S01 = 16.4, S02 = 12.0, S03 = 12.0, S04 = 16.4, S05 = 9.0, S06 = 16.2, S09 = 9.4, S11 = 12.4]
WALL-FOLLOWING PROBLEM
FITNESS MEASURE
WALL-FOLLOWING PROBLEM
BEST PROGRAM OF GENERATION 57
• Scores 56 hits (out of 56)
• 145-point program tree
AUTOMATICALLY DEFINED
FUNCTIONS (ADFS, SUBROUTINES)
Fitness case  L0  W0  H0  L1  W1  H1  Dependent variable D
1 3 4 7 2 5 3 54
2 7 10 9 10 3 1 600
3 10 9 4 8 1 6 312
4 3 9 5 1 6 4 111
5 4 3 2 7 6 1 –18
6 3 3 1 9 5 4 –171
7 5 9 9 1 7 6 363
8 1 2 9 3 9 2 –36
9 2 6 8 2 6 10 –24
10 8 1 10 7 5 1 45
[Figure: program tree without ADFs, built from multiplications over L0, W0, H0 and L1, W1, H1]
[Figure: program tree with a function-defining branch (defun, with dummy arguments such as ARG1 and ARG2) and a result-producing branch over L0, W0, H0, L1, W1, H1]
AUTOMATICALLY DEFINED
FUNCTIONS (ADFS, SUBROUTINES)
6-DIMENSIONAL NON-LINEAR
REGRESSION PROBLEM BECOMES AN
8-DIMENSIONAL PROBLEM
Fitness case  L0  W0  H0  L1  W1  H1  V0  V1  D
1 3 4 7 2 5 3 84 30 54
2 7 10 9 10 3 1 630 30 600
3 10 9 4 8 1 6 360 48 312
4 3 9 5 1 6 4 135 24 111
5 4 3 2 7 6 1 24 42 –18
6 3 3 1 9 5 4 9 180 –171
7 5 9 9 1 7 6 405 42 363
8 1 2 9 3 9 2 18 54 –36
9 2 6 8 2 6 10 96 120 –24
10 8 1 10 7 5 1 80 35 45
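The table rows are consistent with D = L0·W0·H0 - L1·W1·H1, which is where a reusable subroutine pays off: one three-argument product, called twice. The Python sketch below illustrates that structure. The function names are hypothetical stand-ins for an ADF and a result-producing branch; this is not the evolved program itself.

```python
def volume(arg0, arg1, arg2):
    """Plays the role of an automatically defined function: a reusable product."""
    return arg0 * arg1 * arg2

def result_producing_branch(l0, w0, h0, l1, w1, h1):
    """Calls the same subroutine twice with different arguments."""
    return volume(l0, w0, h0) - volume(l1, w1, h1)

# Fitness cases from the table: (L0, W0, H0, L1, W1, H1, D)
FITNESS_CASES = [
    (3, 4, 7, 2, 5, 3, 54),
    (7, 10, 9, 10, 3, 1, 600),
    (10, 9, 4, 8, 1, 6, 312),
    (3, 9, 5, 1, 6, 4, 111),
    (4, 3, 2, 7, 6, 1, -18),
    (3, 3, 1, 9, 5, 4, -171),
    (5, 9, 9, 1, 7, 6, 363),
    (1, 2, 9, 3, 9, 2, -36),
    (2, 6, 8, 2, 6, 10, -24),
    (8, 1, 10, 7, 5, 1, 45),
]

matches = sum(1 for *inputs, d in FITNESS_CASES
              if result_producing_branch(*inputs) == d)
print(matches, "of", len(FITNESS_CASES), "fitness cases reproduced")
```

The intermediate values `volume(l0, w0, h0)` and `volume(l1, w1, h1)` correspond to the columns V0 and V1 of the 8-dimensional version of the table.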
AUTOMATICALLY DEFINED
FUNCTIONS (ADFS, SUBROUTINES)
• Identify regularities
AUTOMATICALLY DEFINED
FUNCTIONS (ADFS, SUBROUTINES)
• ADFs work.
• ADFs do not solve problems in the style of human
programmers.
• ADFs reduce the computational effort required to solve a
problem.
• ADFs usually improve the parsimony of the solutions to a
problem.
• As the size of a problem is scaled up, the size of solutions
increases more slowly with ADFs than without them.
• As the size of a problem is scaled up, the computational
effort required to solve a problem increases more slowly
with ADFs than without them.
• The advantages in terms of computational effort and
parsimony conferred by ADFs increase as the size of the
problem is scaled up.
REUSE
AUTOMATICALLY DEFINED
ITERATIONS (ADIS)
REUSE
TRANSMEMBRANE SEGMENT
IDENTIFICATION PROBLEM
• Goal is to classify a given protein segment as being a
transmembrane domain or non-transmembrane area of the
protein
• Generation 20 ⎯ Run 3 ⎯ Subset-creating version
• in-sample correlation of 0.976
• out-of-sample correlation of 0.968
• out-of-sample error rate 1.6%
(progn
(defun ADF0 ()
(values (ORN (ORN (ORN (I?) (H?)) (ORN (P?) (G?))) (ORN (ORN
(ORN (Y?) (N?)) (ORN (T?) (Q?))) (ORN (A?) (H?))))))
(defun ADF1 ()
(values (ORN (ORN (ORN (A?) (I?)) (ORN (L?) (W?)))
(ORN (ORN (T?) (L?)) (ORN (T?) (W?))))))
(defun ADF2 ()
(values (ORN (ORN (ORN (ORN (ORN (D?) (E?)) (ORN (ORN
(ORN (D?) (E?)) (ORN (ORN (T?) (W?)) (ORN (Q?)
(D?)))) (ORN (K?) (P?)))) (ORN (K?) (P?))) (ORN (T?)
(W?))) (ORN (ORN (E?) (A?)) (ORN (N?) (R?))))))
(progn (loop-over-residues
(SETM0 (+ (- (ADF1) (ADF2)) (SETM3 M0))))
REUSE
[Figure: program tree with an automatically defined loop (defloop) and a result-producing branch (values) over M0, M1, LEN, and READV]
REUSE
AUTOMATICALLY DEFINED
RECURSION (ADR0) AND A RESULT-
PRODUCING BRANCH
• a recursion condition branch, RCB
• a recursion body branch, RBB
• a recursion update branch, RUB
• a recursion ground branch, RGB
[Figure: program tree (progn) with branches for ADL0 and ADR0, using IFGTZ, RLI, LIST, ARG0, and constants such as 1, -1, 3, and 5]
GP TECHNIQUES
ARCHITECTURE-ALTERING
OPERATIONS
[Figures: program trees illustrating architecture-altering operations, with a function-defining branch for ADF0 over ARG0 and ARG1 and a result-producing branch over D0 to D4 using AND and NOR]
SPECIALIZATION – REFINEMENT –
CASE SPLITTING
• Branch duplication
• Argument duplication
• Branch creation
• Argument creation
GENERALIZATION
• Branch deletion
• Argument deletion
GENETIC PROGRAMMING
PROBLEM SOLVER (GPPS)
⎯ VERSION 2.0
[Figure: a GPPS 2.0 program maps an input vector INPUT(0), INPUT(1), INPUT(2), ..., INPUT(N1) to an output vector OUTPUT(0), OUTPUT(1), OUTPUT(2), ..., OUTPUT(N2)]
IMPLEMENTATION OF GP IN
ASSEMBLY CODE – COMPILED
GENETIC PROGRAMMING SYSTEM
(NORDIN 1994)
• Nordin, Peter. 1997. Evolutionary Program Induction of
Binary Machine Code and its Application. Munster,
Germany: Krehl Verlag.
• Opportunity to speed up GP, which otherwise slowly
INTERPRETS GP program trees.
• Instead of interpreting the GP program tree, EXECUTE
this sequence of assembly code.
• Can identify small set of primitive functions that is useful
for large group of problems, such as +, -, *, % and also use
some conditional operations (IFLTE), some logical
functions (AND, OR, XOR, XNOR) and perhaps others (e.g.,
SRL, SLL, SETHI from Sun 4).
• Then, generate random sequence of assembly code
instructions at generation 0 from this small set of machine
code instructions (referring to certain registers).
• If ADFs are involved, generate fixed header and footer of
function and appropriate function call.
• Perform crossover possibly so as to preserve the integrity
of subtrees.
• If ADFs are involved, perform crossover so as to preserve
the integrity of the header and footer of function and the
function call.
CELLULAR ENCODING
(DEVELOPMENTAL GENETIC
PROGRAMMING)
AUTOMATIC PARALLELIZATION OF
SERIAL PROGRAMS USING GP
• Ryan, Conor. 1999. Automatic Re-engineering of Software
Using Genetic Programming. Amsterdam: Kluwer Academic
Publishers.
• Start with working serial computer program (embryo)
• GP program tree contains validity-preserving functions
that modify the current program. That is, the functions in
the program tree side-effect the current program.
• Execution of the complete GP program tree progressively
modifies the current program
• Fitness is based on execution time on the parallel computer
system
DEVELOPMENTAL GP
[Figure: circuit-constructing program tree with numbered points, using the functions C, L, FLIP, SERIES, NOP, and END and numerical values such as 0.963, 0.880, and -0.657]
EVALUATION OF FITNESS OF A
CIRCUIT
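[Figure: the program tree is applied to the embryonic circuit (with input IN and output OUT), and the resulting circuit is evaluated to obtain its fitness]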
GENETICALLY EVOLVED 10 DB
AMPLIFIER FROM GENERATION 45
[Figure: evolved circuit with a voltage gain stage and a Darlington emitter-follower stage]
REGISTER-CONTROLLED CAPACITOR
CIRCUIT
VOLTAGE-CURRENT-CONVERSION
CIRCUIT
BEST-OF-RUN FROM GENERATION 109
21 PREVIOUSLY PATENTED
INVENTIONS REINVENTED BY GP
No. | Invention | Date | Inventor | Place | Patent
1 | Darlington emitter-follower section | 1953 | Sidney Darlington | Bell Telephone Laboratories | 2,663,806
2 | Ladder filter | 1917 | George Campbell | American Telephone and Telegraph | 1,227,113
3 | Crossover filter | 1925 | Otto Julius Zobel | American Telephone and Telegraph | 1,538,964
4 | “M-derived half section” filter | 1925 | Otto Julius Zobel | American Telephone and Telegraph | 1,538,964
5 | Cauer (elliptic) topology for filters | 1934–1936 | Wilhelm Cauer | University of Gottingen | 1,958,742, 1,989,545
6 | Sorting network | 1962 | Daniel G. O’Connor and Raymond J. Nelson | General Precision, Inc. | 3,029,413
7 | Computational circuits | See text | See text | See text | See text
8 | Electronic thermometer | See text | See text | See text | See text
9 | Voltage reference circuit | See text | See text | See text | See text
10 | 60 dB and 96 dB amplifiers | See text | See text | See text | See text
11 | Second-derivative controller | 1942 | Harry Jones | Brown Instrument Company | 2,282,726
12 | Philbrick circuit | 1956 | George Philbrick | George A. Philbrick Researches | 2,730,679
13 | NAND circuit | 1971 | David H. Chung and Bill H. Terrell | Texas Instruments Incorporated | 3,560,760
14 | PID (proportional, integrative, and derivative) controller | 1939 | Albert Callender and Allan Stevenson | Imperial Chemical Limited | 2,175,985
15 | Negative feedback | 1937 | Harold S. Black | American Telephone and Telegraph | 2,102,670, 2,102,671
16 | Low-voltage balun circuit | 2001 | Sang Gug Lee | Information and Communications University | 6,265,908
17 | Mixed analog-digital variable capacitor circuit | 2000 | Turgut Sefket Aytur | Lucent Technologies Inc. | 6,013,958
18 | High-current load circuit | 2001 | Timothy Daun-Lindberg and Michael Miller | International Business Machines Corporation | 6,211,726
19 | Voltage-current conversion circuit | 2000 | Akira Ikeuchi and Naoshi Tokuda | Mitsumi Electric Co., Ltd. | 6,166,529
20 | Cubic function generator | 2000 | Stefano Cipriani and Anthony A. Takeshian | Conexant Systems, Inc. | 6,160,427
21 | Tunable integrated active filter | 2001 | Robert Irvine and Bernd Kolb | Infineon Technologies AG | 6,225,859
NOVELTY-DRIVEN EVOLUTION
NOVELTY-DRIVEN EVOLUTION ⎯
CONTINUED
• For circuits not scoring the maximum number (101) of
hits, the fitness of a circuit is the product of the two factors.
• For circuits scoring 101 hits (100%-compliant individuals),
fitness is the number of shared nodes and edges divided by
10,000.
SOLUTION NO. 1
SOLUTION NO. 5
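[Figure: circuit layout with component placements and values: C13 at (-31.5, 8.2), 8.91 nF; C19 at (-25.5, 8.2), 1.75 nF; L16 at (-17.5, 8.2), 42700 uH; L33 at (17.5, 8.2), 90200 uH; C17 at (-21.5, 4.2), 165 nF; C29 at (5.5, 4), 311 nF; C40 at (28.5, 0.2), 295 nF; L23 at (-5.5, -7.2), 90200 uH; output at VOUT]
BEST-OF-RUN CIRCUIT OF
GENERATION 138 WITH 4 INDUCTORS
AND 4 CAPACITORS, AREA OF 359.4
[Figure: circuit layout including C12 at (-10, 0.5), 155 nF; C18 at (-4, 1), 256 nF; C27 at (2, 1.2), 256 nF; C34 at (8, 1.4), 256 nF; output at VOUT]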
COMPARISON
Gen | Components | Area | Four penalties | Fitness
65 | 27 | 8,234 | 33.034348 | 33.042583
101 | 19 | 4,751 | 0.061965 | 0.004751
[Figure: circuit schematic with transistors (Q36, Q39, Q43, Q45, Q48) and resistors (R3, R4, R18, R21, R24, R27, R30, R33)]
PID CONTROLLER
Block diagram of a plant and a PID controller composed of
proportional (P), integrative (I), and derivative (D) blocks
[Figure: block diagram of the controller (500) with gain blocks +214.0, +1000.0, and +15.5, a 1/s block, and summing junctions, connecting the reference signal, the control variable, and the plant output]
T = { REFERENCE_SIGNAL,
CONTROLLER_OUTPUT, PLANT_OUTPUT,
CONSTANT_0}
ARITHMETIC-PERFORMING SUBTREES
FOR THE TWO-LAG PLANT PROBLEM
• Signal processing blocks such as GAIN, LEAD, LAG,
and LAG2 possess numerical parameter(s)
• Parameter values can be established by an arithmetic-
performing subtree
• A constrained syntactic structure enforces a different
function and terminal set for the arithmetic-performing
subtrees (as opposed to all other parts of the program tree).
• Terminal set, Taps, for the arithmetic-performing subtrees
Taps = {ℜ}
where ℜ denotes constant numerical terminals in the range
from -1.0 to +1.0
BEST-OF-RUN GENETICALLY
EVOLVED CONTROLLER FROM
GENERATION 32 FOR THE TWO-LAG
PLANT
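[Figure: block diagram of the evolved controller with inputs R(s) and Y(s) and output U(s), containing gains of 918.8 and 8.15 and blocks involving $1 + 0.168s$, $1 + 0.156s$, $1 + 0.0385s$, $1 + 0.515s$, and $1 + 0.0837s$]
[Figure: plot against time from 0 to 1 s, vertical axis from 0 to 800m]
OVERALL MODEL
[Figure: feedback loop with blocks Gp(s), Gc(s), G(s), and H(s) and signals R(s), D(s), U(s), and Y(s)]
[Figure: plot against time from 0 to 1 s, vertical axis from -2m to 8m]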
REVERSE ENGINEERING OF
METABOLIC PATHWAYS (4-REACTION
NETWORK IN PHOSPHOLIPID CYCLE)
BEST-OF-GENERATION 66
[Figure: evolved network with inputs C00162 Fatty Acid and C00116 Glycerol and measured output C00165 Diacyl-glycerol (cell membrane)]
DESIRED
[Figure: desired network with inputs C00162 Fatty Acid and C00116 Glycerol]
• Bit-string chromosome
Resistor | 2.5 Ω | Node 3 | Node 6
0 1 0 0 1 0 1 0 0 0 0 1 1 1 1 0
• Maintain constraints
Chromosome #1
1st Component | 2nd Component
L .220 1 2 C 403. 2 0
Chromosome #2
1st Component | 2nd Component
R 250. 0 1 C 100. 1 2
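One way to read the 16-bit example above is as four fixed-width fields. The Python sketch below decodes it under a layout inferred only from the single example on the slide: 2 bits of component type, an 8-bit value with 4 fractional bits, and two 3-bit node numbers. The layout, the `decode_component` name, and the type-code table are assumptions for illustration.

```python
# Only the code 01 = Resistor can be read off the slide; other codes are unknown.
TYPE_NAMES = {0b01: "Resistor"}

def decode_component(bits):
    """Decode a 16-bit chromosome into (type, value, node_a, node_b).

    Assumed layout: 2 type bits, 8 value bits (fixed point, 4 fractional bits),
    then 3 bits for each of the two nodes.
    """
    b = "".join(ch for ch in bits if ch in "01")   # drop spaces between bits
    if len(b) != 16:
        raise ValueError("expected exactly 16 bits")
    type_code = int(b[0:2], 2)
    value = int(b[2:10], 2) / 16                   # 4 fractional bits
    node_a = int(b[10:13], 2)
    node_b = int(b[13:16], 2)
    return TYPE_NAMES.get(type_code, f"type-{type_code}"), value, node_a, node_b

print(decode_component("0 1 0 0 1 0 1 0 0 0 0 1 1 1 1 0"))
```

A fixed layout like this is what makes constraint maintenance necessary: after crossover or mutation of raw bits, fields such as node numbers can take values that do not describe a valid circuit.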
(PROGN3
  (TURN-RIGHT 0.125)
  (LANDMARK
    (REPEAT 2
      (PROGN2
        (DRAW 1.0 HALF-MM-WIRE)
        (DRAW 0.5 NO-WIRE))))
  (TRANSLATE-RIGHT 0.125 0.75))
[Figures: plots of the drawn wire geometry, with horizontal axis x (m) from 0 to 2 and vertical axis y (m)]
REUSE
LOWPASS FILTER USING ADFS
• Generation 9: two-rung ladder
• Generation 16: three-rung ladder
• Generation 20: four-rung ladder, with a quadruply-called two-ported ADF0
• Generation 31: topology of Cauer (elliptic) filter, with a quintuply-called three-ported ADF0
PASSING A PARAMETER TO A
SUBSTRUCTURE
• The set of potential terminals for each construction-
continuing subtree of an automatically defined function,
Tccs-adf-potential, is
Tccs-adf-potential = {ARG0}
EMERGENCE OF A PARAMETERIZED
ARGUMENT IN A CIRCUIT
SUBSTRUCTURE
HIERARCHY OF BRANCHES FOR THE
BEST-OF-RUN CIRCUIT- FROM
GENERATION 158
execute
PASSING A PARAMETER TO A
SUBSTRUCTURE
THREE-PORTED AUTOMATICALLY
DEFINED FUNCTION ADF3 OF THE
BEST-OF-RUN CIRCUIT FROM
GENERATION 158
$1x^2 - 3x + 2 = 0$
↓ | ↓
0 0 0 1 0 0 0 0 0 0 1 0 0 0 0 0
1.0 2.0
$ax^2 + bx + c = 0$
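EXAMPLE CIRCUIT
[Figure: circuit-constructing program tree with numbered points, using the functions C, L, FLIP, SERIES, NOP, and END and numerical values such as 0.963, 0.880, and -0.657]
ARITHMETIC-PERFORMING SUBTREE
[Figure: a C function whose first argument is the arithmetic-performing subtree (+ 2.963 (* 1.234 3.292)) and whose second argument is END; a second C function has the constant 4.809 and END]
FREE VARIABLE
[Figure: a C function whose first argument is the subtree (+ F (* 1.234 3.292)), containing the free variable F, and whose second argument is END]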
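$$L_2 = \frac{1.3406 \times 10^{-8}\,(4.7387 \times 10^{12} + f)\,(1.3331 \times 10^{16} + 9.3714 \times 10^{5} f + f^2)}{f\,(3.4636 \times 10^{12} + f)} + \ln f \approx \frac{2.4451 \times 10^{8}}{f} + \ln f$$
$$L_1 = \frac{8.0198 \times 10^{7}}{f}, \qquad L_3 = \frac{2.0262 \times 10^{8}}{f} + 2 \ln f$$
$$L_4 = \frac{3.7297 \times 10^{7}}{f}, \qquad C_3 = \frac{1.3552 \times 10^{5}}{f}, \qquad C_5 = \frac{1.1056 \times 10^{5}}{f}$$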
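The approximation for L2 can be checked numerically. The Python sketch below compares the full rational expression with the simplified form $2.4451 \times 10^{8}/f + \ln f$ over a range of f. The function names and the sample values of f are choices made for this illustration, and no units are assumed.

```python
import math

def l2_exact(f):
    """Full evolved expression for L2 as a function of the free variable f."""
    numerator = 1.3406e-8 * (4.7387e12 + f) * (1.3331e16 + 9.3714e5 * f + f**2)
    return numerator / (f * (3.4636e12 + f)) + math.log(f)

def l2_approx(f):
    """Simplified form, valid while f is small next to the large constants."""
    return 2.4451e8 / f + math.log(f)

def relative_error(f):
    exact = l2_exact(f)
    return abs(exact - l2_approx(f)) / abs(exact)

for f in (1.0, 1e3, 1e6):
    print(f, l2_exact(f), l2_approx(f), relative_error(f))
```

The simplification works because, for such f, the factors $(4.7387 \times 10^{12} + f)$, $(3.4636 \times 10^{12} + f)$, and the quadratic are each dominated by their constant terms, leaving a single constant divided by f.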
VARIABLE-CUTOFF
LOWPASS/HIGHPASS FILTER CIRCUIT
• Best-of-run circuit from generation 93 when inputs call for
a highpass filter (i.e., F1 > F2).
C1 = 100 F / F1, C2 = 57.2 F / F1, C3 = 49.9 F / F1, C4 = 57.2 F / F1, C5 = 49.9 F / F1, C6 = 49.9 F / F1
C1 = 183 F / F1, C2 = 219 F / F1, C3 = 219 F / F1, C4 = 91.7 F / F1, L5 = 58.9 H / F1
PARALLELIZATION BY
SUBPOPULATIONS ("ISLAND" OR
"DEME" MODEL OR "DISTRIBUTED
GENETIC ALGORITHM")
[Figure: a host (Pentium PC) with an optional debugger, an output file, and a video display, connected to a mesh of processing nodes]
PETA-OPS
• Human brain operates at $10^{12}$ neurons operating at $10^{3}$ per second = $10^{15}$ ops per second
• $10^{15}$ ops = 1 peta-op = 1 bs (brain second)
PROGRESSION OF RESULTS
System | Period | Speed-up | Qualitative nature of the results produced by genetic programming
Serial LISP machine | 1987–1994 | 1 (base) |
  • Toy problems of the 1980s and early 1990s from the fields of artificial intelligence and machine learning
64-node Transtech 8-bit transputer | 1994–1997 | 9 |
  • Two human-competitive results involving one-dimensional discrete data (not patent-related)
64-node Parsytec parallel machine | 1995–2000 | 22 |
  • One human-competitive result involving two-dimensional discrete data
  • Numerous human-competitive results involving continuous signals analyzed in the frequency domain
  • Numerous human-competitive results involving 20th-century patented inventions
70-node Alpha parallel machine | 1999–2001 | 7.3 |
  • One human-competitive result involving continuous signals analyzed in the time domain
  • Circuit synthesis extended from topology and sizing to include routing and placement (layout)
1,000-node Pentium II parallel machine | 2000–2002 | 9.4 |
  • Numerous human-competitive results involving continuous signals analyzed in the time domain
  • Numerous general solutions to problems in the form of parameterized topologies
  • Six human-competitive results duplicating the functionality of 21st-century patented inventions
Long (4-week) runs of 1,000-node Pentium II parallel machine | 2002 | 9.3 |
  • Generation of two patentable new inventions
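The speed-up column can be combined into one cumulative figure. The Python sketch below multiplies the listed factors, assuming that each speed-up is measured relative to the preceding system; the slide does not state this, but it is consistent with a total of about five orders of magnitude.

```python
import math

# (system, speed-up), with each speed-up assumed relative to the previous row
SPEEDUPS = [
    ("64-node Transtech transputer", 9),
    ("64-node Parsytec parallel machine", 22),
    ("70-node Alpha parallel machine", 7.3),
    ("1,000-node Pentium II parallel machine", 9.4),
    ("Long (4-week) runs of 1,000-node Pentium II", 9.3),
]

cumulative = math.prod(factor for _, factor in SPEEDUPS)
orders_of_magnitude = math.log10(cumulative)

print(f"cumulative speed-up: about {cumulative:,.0f}")
print(f"orders of magnitude: about {orders_of_magnitude:.1f}")
```

Each step contributes roughly one order of magnitude, so the five steps together come to a little over 10^5 relative to the serial LISP machine.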
PROGRESSION OF QUALITATIVELY
MORE SUBSTANTIAL RESULTS
PRODUCED BY GENETIC
PROGRAMMING IN RELATION TO FIVE
ORDER-OF-MAGNITUDE INCREASES IN
COMPUTATIONAL POWER
• toy problems
EVOLVABLE HARDWARE
SORTING NETWORKS
FUNDAMENTAL DIFFERENCES
BETWEEN GP AND OTHER
APPROACHES TO AI AND ML
(1) Representation: Genetic programming overtly conducts
its search for a solution to the given problem in program
space.
(2) Role of point-to-point transformations in the search:
Genetic programming does not conduct its search by
transforming a single point in the search space into another
single point, but instead transforms a set of points into
another set of points.
(3) Role of hill climbing in the search: Genetic
programming does not rely exclusively on greedy hill
climbing to conduct its search, but instead allocates a certain
number of trials, in a principled way, to choices that are
known to be inferior.
(4) Role of determinism in the search: Genetic
programming conducts its search probabilistically.
(5) Role of an explicit knowledge base: None.
(6) Role of formal logic in the search: None.
(7) Underpinnings of the technique: Biologically inspired.
37 HUMAN-COMPETITIVE RESULTS
(LIST AS OF APRIL 2004)
No. | Claimed instance of human-competitiveness | Basis for claim | Reference
1 | Creation of a better-than-classical quantum algorithm for the Deutsch-Jozsa “early promise” problem | B, F | Spector, Barnum, and Bernstein 1998
2 | Creation of a better-than-classical quantum algorithm for Grover’s database search problem | B, F | Spector, Barnum, and Bernstein 1999
3 | Creation of a quantum algorithm for the depth-two AND/OR query problem that is better than any previously published result | D | Spector, Barnum, Bernstein, and Swamy 1999; Barnum, Bernstein, and Spector 2000
4 | Creation of a quantum algorithm for the depth-one OR query problem that is better than any previously published result | D | Barnum, Bernstein, and Spector 2000
5 | Creation of a protocol for communicating information through a quantum gate that was previously thought not to permit such communication | D | Spector and Bernstein 2003
6 | Creation of a novel variant of quantum dense coding | D | Spector and Bernstein 2003
7 | Creation of a soccer-playing program that won its first two games in the Robo Cup 1997 competition | H | Luke 1998
8 | Creation of a soccer-playing program that ranked in the middle of the field of 34 human-written programs in the Robo Cup 1998 competition | H | Andre and Teller 1999
9 | Creation of four different algorithms for the transmembrane segment identification problem for proteins | B, E | Sections 18.8 and 18.10 of GP-2 book and sections 16.5 and 17.2 of GP-3 book
10 | Creation of a sorting network for seven items using only 16 steps | A, D | Sections 21.4.4, 23.6, and 57.8.1 of GP-3 book
11 | Rediscovery of the Campbell ladder topology for lowpass and highpass filters | A, F | Section 25.15.1 of GP-3 book and section 5.2 of GP-4 book
12 | Rediscovery of the Zobel “M-derived half section” and “constant K” filter sections | A, F | Section 25.15.2 of GP-3 book
13 | Rediscovery of the Cauer (elliptic) topology for filters | A, F | Section 27.3.7 of GP-3 book
14 | Automatic decomposition of the problem of synthesizing a crossover filter | A, F | Section 32.3 of GP-3 book
15 | Rediscovery of a recognizable voltage gain stage and a Darlington emitter-follower section of an amplifier and other circuits | A, F | Section 42.3 of GP-3 book
16 | Synthesis of 60 and 96 decibel amplifiers | A, F | Section 45.3 of GP-3 book
17 | Synthesis of analog computational circuits for squaring, cubing, square root, cube root, logarithm, and Gaussian functions | A, D, G | Section 47.5.3 of GP-3 book
18 | Synthesis of a real-time analog circuit for time-optimal control of a robot | G | Section 48.3 of GP-3 book
1. LOGIC-BASED SEARCH
One approach that Turing identified is a search through the
space of integers representing candidate computer
programs.
2. CULTURAL SEARCH
Another approach is the "cultural search" which relies on
knowledge and expertise acquired over a period of years
from others (akin to present-day knowledge-based systems).
3. GENETICAL OR EVOLUTIONARY
SEARCH
"There is the genetical or evolutionary search by
which a combination of genes is looked for, the
criterion being the survival value."
17 AUTHORED BOOKS ON GP
Banzhaf, Wolfgang, Nordin, Peter, Keller, Robert E., and Francone, Frank D. 1998.
Genetic Programming - An Introduction. San Francisco, CA: Morgan Kaufmann
Publishers and Heidelberg, Germany: dpunkt.verlag.
Babovic, Vladan. 1996b. Emergence, Evolution, Intelligence: Hydroinformatics. Rotterdam,
The Netherlands: Balkema Publishers.
Blickle, Tobias. 1997. Theory of Evolutionary Algorithms and Application to System
Synthesis. TIK-Schriftenreihe Nr. 17. Zurich, Switzerland: vdf Hochschul Verlag AG
an der ETH Zurich. ISBN 3-7281-2433-8.
Jacob, Christian. 1997. Principia Evolvica: Simulierte Evolution mit Mathematica.
Heidelberg, Germany: dpunkt.verlag. In German. English translation forthcoming in
2000 from Morgan Kaufmann Publishers.
Jacob, Christian. 2001. Illustrating Evolutionary Computation with Mathematica. San
Francisco: Morgan Kaufmann.
Iba, Hitoshi. 1996. Genetic Programming. Tokyo: Tokyo Denki University Press. In
Japanese.
Koza, John R. 1992. Genetic Programming: On the Programming of Computers by Means of
Natural Selection. Cambridge, MA: The MIT Press.
Koza, John R. 1994a. Genetic Programming II: Automatic Discovery of Reusable Programs.
Cambridge, MA: The MIT Press
Koza, John R., Bennett III, Forrest H, Andre, David, and Keane, Martin A. 1999a. Genetic
Programming III: Darwinian Invention and Problem Solving. San Francisco, CA:
Morgan Kaufmann Publishers.
Koza, John R., Keane, Martin A., Streeter, Matthew J., Mydlowec, William, Yu, Jessen,
and Lanza, Guido. 2003. Genetic Programming IV. Routine Human-Competitive
Machine Intelligence. Kluwer Academic Publishers.
Langdon, William B. 1998. Genetic Programming and Data Structures: Genetic Programming
+ Data Structures = Automatic Programming! Amsterdam: Kluwer Academic
Publishers.
Langdon, William B. and Poli, Riccardo. 2002. Foundations of Genetic Programming.
Berlin: Springer-Verlag.
Nordin, Peter. 1997. Evolutionary Program Induction of Binary Machine Code and its
Application. Munster, Germany: Krehl Verlag.
O’Neill, Michael and Ryan, Conor. 2003. Grammatical Evolution: Evolutionary Automatic
Programming in an Arbitrary Language. Boston: Kluwer Academic Publishers.
Ryan, Conor. 1999. Automatic Re-engineering of Software Using Genetic Programming.
Amsterdam: Kluwer Academic Publishers.
Spector, Lee. 2004. Automatic Quantum Computer Programming: A Genetic Programming
Approach. Boston: Kluwer Academic Publishers.
Wong, Man Leung and Leung, Kwong Sak. 2000. Data Mining Using Grammar Based
Genetic Programming and Applications. Amsterdam: Kluwer Academic Publishers.
4 VIDEOTAPES ON GP
Koza, John R., and Rice, James P. 1992. Genetic Programming: The Movie.
Cambridge, MA: The MIT Press.
Koza, John R. 1994b. Genetic Programming II Videotape: The Next Generation.
Cambridge, MA: The MIT Press.
Koza, John R., Bennett III, Forrest H, Andre, David, Keane, Martin A., and
Brave, Scott. 1999. Genetic Programming III Videotape: Human-
Competitive Machine Intelligence. San Francisco, CA: Morgan Kaufmann
Publishers.
Koza, John R., Keane, Martin A., Streeter, Matthew J., Mydlowec, William,
Yu, Jessen, Lanza, Guido, and Fletcher, David. 2003. Genetic
Programming IV Video: Routine Human-Competitive Machine
Intelligence. Kluwer Academic Publishers.
Visit
http://www.cs.bham.ac.uk/~wbl/biblio/
or
http://liinwww.ira.uka.de/bibliography/Ai/genetic.programming.html
GP MAILING LIST
To subscribe to the Genetic Programming e-mail list,
• send e-mail message to:
genetic_programming-subscribe@yahoogroups.com
• visit the web page
http://groups.yahoo.com/group/genetic_programming/