GADataMining CNA

Genetic Algorithms for Data Mining
Sid Bhattacharyya
Overview
Genetic Algorithms: a gentle introduction
What are GAs?
How do they work / Why?
Critical issues
Using genetic algorithms (effectively)
Use in data mining
Natural Genetics to AI
Computational models inspired by biological evolution
survival of the fittest
reproduction through cross-breeding
Genetic Algorithms
Population based search (parallel)
simultaneous search from multiple points in search space
population members: potential solutions
Fitness function (search objective)

numerical figure of merit/utility measure of an individual
selection
Mating and reproduction of individuals

crossover, mutation
Evolution from one generation to the next

iterative search, convergence
Advantage GAs
General purpose, robust search technique
application to varied problem types
Data mining
fitness function: flexible expression of modeling criteria
tradeoffs amongst multiple objectives
models optimized to specific business objectives
diverse model representation: linear, non-linear interaction terms, rules, sequences, etc.
GA Application Examples
Function optimizers: difficult, discontinuous, multi-modal, noisy functions
Combinatorial optimization: layout of VLSI circuits, factory scheduling, traveling salesman problem
Design and Control: bridge structures, neural networks, communication networks design; control of chemical plants, pipelines
Machine learning: classification rules, economic modeling, scheduling strategies
Portfolio design, optimized trading models, direct marketing models, sequencing of TV advertisements, adaptive agents, data mining, etc.
GAs: Basic Principles

Representation of individuals
String of parameters (genes) : chromosome
e.g. F(p,q,r,s,t): p q r s t
Bit-string representation (?):

100110101101100
genotype and phenotype

Survival of the fittest (Fitness function)
numerical figure of merit/utility measure of an individual
tradeoff amongst multiple evaluation criteria
efficient evaluation

Reproduction to create offspring
Selection
Crossover
Mutation

Convergence
progression towards uniformity in population
premature convergence? (local optima)
GA: Basic Operation

Generation t: Solution1 (f1), Solution2 (f2), Solution3 (f3), Solution4 (f4), ..., SolutionN (fN)
Selection: Solution1, Solution2, Solution2, Solution4, ..., SolutionX
Recombination (Crossover, Mutation): Offspring1(1,4), Offspring2(1,4), Offspring3(2,7), Offspring4(2,7), ..., OffspringN(x,y)
Generation t+1: the offspring
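The generation-to-generation cycle above can be sketched in a few lines. This is a minimal illustration, not the presenter's implementation: the toy fitness `one_max`, the function name `run_ga`, and every parameter value are assumptions chosen only to make the loop runnable.

```python
import random

def one_max(bits):
    """Toy fitness (an assumption for illustration): count of 1-bits."""
    return sum(bits)

def run_ga(n_bits=20, pop_size=30, generations=40,
           p_cross=0.7, p_mut=0.01, seed=0):
    """Selection -> crossover -> mutation, repeated for a fixed number of generations.
    pop_size is assumed to be even so that parents pair up exactly."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    best = max(pop, key=one_max)
    for _ in range(generations):
        # Selection: fitness-proportionate sampling with replacement.
        weights = [one_max(ind) + 1e-9 for ind in pop]
        parents = rng.choices(pop, weights=weights, k=pop_size)
        nxt = []
        for a, b in zip(parents[0::2], parents[1::2]):
            a, b = a[:], b[:]
            # Recombination: single-point crossover with probability p_cross.
            if rng.random() < p_cross:
                cut = rng.randint(1, n_bits - 1)
                a, b = a[:cut] + b[cut:], b[:cut] + a[cut:]
            # Mutation: flip each gene with small probability p_mut.
            for child in (a, b):
                for i in range(n_bits):
                    if rng.random() < p_mut:
                        child[i] ^= 1
                nxt.append(child)
        pop = nxt
        # Track the best individual seen so far across generations.
        best = max(pop + [best], key=one_max)
    return best
```

Because the search is stochastic, different seeds give different runs; the fixed seed here only makes the sketch repeatable.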
GAs: Parallel Search

[Figure: fitness landscape with GA search points (X) compared with a single hill climber]
Typical GA Run
[Figure: best and average fitness plotted against generations]
Operators: Selection
Fitness proportionate selection (f_i / f_avg): expected number of reproductive trials for individuals
Selection
Roulette-wheel selection
(stochastic sampling with replacement)
wheel spaced in proportion to fitness values
N (pop size) spins of the wheel
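A minimal sketch of roulette-wheel selection as described above, assuming non-negative fitness values with a positive total. The function name `roulette_wheel` and its arguments are illustrative, not taken from the slides.

```python
import random
from bisect import bisect_right
from itertools import accumulate

def roulette_wheel(fitness, n_spins, rng):
    """Return n_spins selected indices; each spin lands on index i
    with probability fitness[i] / sum(fitness)."""
    cum = list(accumulate(fitness))      # right edge of each wheel slot
    total = cum[-1]
    picks = []
    for _ in range(n_spins):
        r = rng.random() * total         # where the wheel stops
        # First slot whose right edge lies beyond r; the min() guards rounding.
        picks.append(min(bisect_right(cum, r), len(cum) - 1))
    return picks
```

Each spin is independent, so an individual can be picked many more (or fewer) times than its expected number of trials.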
Selection
Stochastic universal sampling
N equally spaced pins on wheel
single turn of the wheel
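Stochastic universal sampling can be sketched the same way: one random offset, then N equally spaced pointers. The function name and arguments below are assumptions for illustration; fitness values are assumed non-negative with a positive total.

```python
import random

def stochastic_universal_sampling(fitness, n, rng):
    """Return n selected indices using one spin and n equally spaced pointers."""
    total = sum(fitness)
    step = total / n                      # distance between pins
    start = rng.random() * step           # the single turn of the wheel
    picks, i, cum = [], 0, fitness[0]
    for k in range(n):
        pointer = start + k * step
        # Advance to the slot that contains this pointer.
        while cum <= pointer and i < len(fitness) - 1:
            i += 1
            cum += fitness[i]
        picks.append(i)
    return picks
```

With fitness values 1, 1, 2 and four pointers, the individuals receive 1, 1 and 2 copies respectively, matching their expected trials; roulette-wheel selection only achieves this on average.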
Selection
Premature convergence
Fitness scaling: f' = f - (2*avg. - max.)
Ranked fitness
Elitism
Steady-state selection
Demetic grouping
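One way to read the fitness-scaling rule above: subtracting (2*avg. - max.) from every fitness makes the scaled best exactly twice the scaled average, so under fitness-proportionate selection the best individual expects about two reproductive trials. The symbols f', f_avg and f_max are notation introduced here for the check.

```latex
\begin{align*}
f' &= f - (2 f_{avg} - f_{max}) \\
f'_{max} &= f_{max} - 2 f_{avg} + f_{max} = 2\,(f_{max} - f_{avg}) \\
f'_{avg} &= f_{avg} - 2 f_{avg} + f_{max} = f_{max} - f_{avg} \\
\frac{f'_{max}}{f'_{avg}} &= 2
\end{align*}
```

Individuals with f below 2*avg. - max. receive a negative scaled value; the slides do not say how these are handled.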
Operators: Crossover
Parent 1:    11010 | 101100101
Parent 2:    xxyxx | yxyyxxyxy
                   ^ crossover site
Offspring 1: 11010 | yxyyxxyxy
Offspring 2: xxyxx | 101100101

(Single-pt. crossover)
combining good building blocks
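The single-point example above can be reproduced with string slicing. The function name and the `site` argument are illustrative.

```python
def single_point_crossover(p1, p2, site):
    """Swap the tails of two equal-length parents after position `site`."""
    return p1[:site] + p2[site:], p2[:site] + p1[site:]

# The parents from the slide, cut after the fifth gene.
o1, o2 = single_point_crossover("11010101100101", "xxyxxyxyyxxyxy", 5)
```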
Crossover
Parent 1:    axpsqvqbtpihd
Parent 2:    qzxxaycgbtphw
crossover sites
Offspring 1: azpsavcbtpphd
Offspring 2: qxxxqyqgbtihw

(Uniform crossover)
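Uniform crossover decides gene by gene which parent each offspring inherits from. In the sketch below the decision is passed in as an explicit mask so the slide's example can be reproduced; in practice the mask would be drawn at random. The mask shown is inferred from the example strings, and the function name is an assumption.

```python
def uniform_crossover(p1, p2, mask):
    """mask[i] == 1 means offspring 1 takes gene i from parent 2
    (and offspring 2 takes it from parent 1); 0 means the reverse."""
    o1 = "".join(b if m else a for a, b, m in zip(p1, p2, mask))
    o2 = "".join(a if m else b for a, b, m in zip(p1, p2, mask))
    return o1, o2

# Mask inferred from the slide's offspring: genes 2, 5, 7 and 11 are swapped.
MASK = [0, 1, 0, 0, 1, 0, 1, 0, 0, 0, 1, 0, 0]
o1, o2 = uniform_crossover("axpsqvqbtpihd", "qzxxaycgbtphw", MASK)
```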
Crossover
[Figure: fitness landscape showing parents and the offspring (X) produced by crossover]
Operators: Mutation
alters each gene with small probability
Before: x1yx0y0yy0x yxy
After:  x1yx0y1yy0x xxy
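For a bit-string chromosome, per-gene mutation can be sketched as below. The function name and the probability argument `p_mut` are illustrative.

```python
import random

def mutate(bits, p_mut, rng):
    """Flip each bit independently with probability p_mut."""
    return [b ^ 1 if rng.random() < p_mut else b for b in bits]
```

With a small `p_mut` most offspring pass through unchanged, and the occasional flip reintroduces gene values that selection and crossover alone may have lost from the population.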
Recombination operators
Mutation & premature convergence
Mutation vs. Crossover
operator probabilities
which is more important?
Optimal parameter settings (!)
Non-Binary Representations
Integer, real-number, order-based, rules, ...
Binary or Real-valued?
real representations give faster, more consistent, more accurate results
High-level representation
intuitive, can utilize specialized crossover and mutation
effective search over complex spaces
design of representation and operators -- forma theory
Real-valued representation
Parent1:    3.45  0.56  6.78  0.976  2.5
Parent2:    0.98  1.06  4.20  0.34   1.8
Offspring1: 3.22  0.56  6.78  0.65   2.12
Offspring2: 1.43  1.06  4.20  0.41   1.93
(Arithmetic crossover)
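A common formulation of arithmetic crossover blends the parents gene by gene with a weight alpha. The sketch below uses that textbook form as an assumption; it does not reproduce the slide's offspring, which appear to blend only some genes with per-gene weights.

```python
def arithmetic_crossover(p1, p2, alpha):
    """Whole arithmetic crossover: each child is a weighted average of the parents."""
    c1 = [alpha * a + (1 - alpha) * b for a, b in zip(p1, p2)]
    c2 = [(1 - alpha) * a + alpha * b for a, b in zip(p1, p2)]
    return c1, c2

P1 = [3.45, 0.56, 6.78, 0.976, 2.5]
P2 = [0.98, 1.06, 4.20, 0.34, 1.8]
c1, c2 = arithmetic_crossover(P1, P2, 0.5)   # alpha = 0.5 gives the midpoint twice
```

Every child gene stays between the two parent values, so this operator narrows the search toward the region the parents span.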
Parent1: {(1.2 ≤ x1 ≤ 3.4) (5.8 ≤ x2 ≤ 6.0) (0.2 ≤ x7 ≤ 0.61)}
Parent2: {(2.3 ≤ x6 ≤ 4.1) (3.6 ≤ x2 ≤ 5.1) (5.1 ≤ x4 ≤ 5.61) (0.3 ≤ x3 ≤ 1.1) (2.2 ≤ x9 ≤ 2.7)}
Offspring1: {(1.2 ≤ x1 ≤ 3.4) (2.2 ≤ x9 ≤ 2.7) (5.1 ≤ x4 ≤ 5.61)}
Offspring2: {(2.3 ≤ x6 ≤ 4.1) [(3.6 ≤ x2 ≤ 5.1) (5.8 ≤ x2 ≤ 6.0)] (0.3 ≤ x3 ≤ 1.1) (0.2 ≤ x7 ≤ 0.61)}
Generalize/Specialize
{(0.3 ≤ x3 ≤ 1.1) (2.2 ≤ x9 ≤ 2.7)}
{(0.3 ≤ x3 ≤ 1.1) (2.2 ≤ x9 ≤ 2.7) (5.1 ≤ x4 ≤ 6.2)}

{(0.3 ≤ x3 ≤ 1.1) (2.2 ≤ x9 ≤ 2.7)}
{(0.45 ≤ x3 ≤ 0.9) (1.9 ≤ x9 ≤ 2.9)}
Tree-structured representation (GP)

[Figure: two expression trees]
Tree 1: (x * log(y)) / 5
Tree 2: if (y < 7) and (x > 2) then 0, else 2x + y
Genetic search: Issues

Coding scheme, fitness function critical
General mechanism so robust that, within reasonable margins, parameter settings are not critical.
exploiting problem-specific knowledge
the art in GA design!
Genetic search: Issues

Stochastic search
multiple runs with different random streams
Exploration vs. exploitation of search
Does not guarantee optimality! But ...
Structured population models
Parallelizable for large data
GAs and Optimization

Search space: representation
Global search without gradient information
functions with multiple local optima
non-differentiable functions
Robust, assumption-free, and very general
Hybrid approaches -- GAs with conventional optimization techniques
Using GAs ?
When to use a GA?
GA and traditional techniques
How long does it take?
Will it perform better?
Using GAs
population size
mutation, crossover rates
how many generations
multiple runs
Is it a black-box?
[Figure: a box marked "?" with inputs: data characteristics, fitness function, GA parameters; caption "Huh?"]
GA Application Examples
Function optimizers
difficult, discontinuous, multimodal, noisy functions
Combinatorial optimization
layout of VLSI circuits, factory scheduling
Design and Control

bridge structures, neural networks, communication networks design; control of chemical plants, pipelines
Machine learning
classification rules, economic modeling, scheduling strategies
Portfolio design, optimized trading models, direct marketing models, sequencing of TV advertisements, adaptive agents, data mining, etc.
GAs and Data Mining

Discovery
Prediction
Hypothesis testing and refinement
Data Mining
Pattern templates
([attribute in {v1,v2}] and [attribute=value]) or ([attribute in {v1,v2,v3}] and [attribute>value]) or
when S, if C then P
when region=ne, if inc > 41K and child > 2 then x-sales > 100
(C: inc > 41K and child > 2; P: x-sales > 100)
when S, C and P are positively correlated
the mean of A when S and C is significantly different from the mean of A when S
Data mining
How good are the patterns
accuracy = (# cases in C and P) / (# cases in C)
coverage = (# cases in C and P) / (# cases in P)
support = (# cases in C) / (# cases in S)
Understandability
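The three rule-quality ratios above can be computed directly from a list of records. This is a sketch under assumptions: the field names, the example records, and the choice to count C and P only within the segment S are all illustrative, not taken from the slides.

```python
def rule_quality(records, in_s, cond, pred):
    """Accuracy, coverage and support of 'when S, if C then P'.
    Counts are taken within the segment S (an assumption)."""
    s = [r for r in records if in_s(r)]
    c = [r for r in s if cond(r)]
    p = [r for r in s if pred(r)]
    cp = [r for r in c if pred(r)]
    return {
        "accuracy": len(cp) / len(c),   # cases in C and P / cases in C
        "coverage": len(cp) / len(p),   # cases in C and P / cases in P
        "support": len(c) / len(s),     # cases in C / cases in S
    }

# Hypothetical records, loosely modelled on the region/income example.
RECORDS = [
    {"region": "ne", "inc": 50, "child": 3, "sales": 150},
    {"region": "ne", "inc": 45, "child": 3, "sales": 80},
    {"region": "ne", "inc": 42, "child": 3, "sales": 130},
    {"region": "ne", "inc": 30, "child": 1, "sales": 120},
    {"region": "ne", "inc": 20, "child": 0, "sales": 50},
    {"region": "sw", "inc": 60, "child": 4, "sales": 200},
]
quality = rule_quality(
    RECORDS,
    in_s=lambda r: r["region"] == "ne",
    cond=lambda r: r["inc"] > 41 and r["child"] > 2,
    pred=lambda r: r["sales"] > 100,
)
```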
GA for Data Mining

Fitness evaluation
Expected values, Chi-square

                 P      not P
S ∩ C            n11    n12      r1 = n11 + n12
S ∩ not C        n21    n22      r2 = n21 + n22
                 c1     c2       n

\( e_{ij} = \frac{r_i \, c_j}{n} \)
\( \chi^2 = \sum_i \sum_j \frac{(n_{ij} - e_{ij})^2}{e_{ij}} \)
Cramer's V = \( \sqrt{\chi^2 / n} \) for a 2 x 2 table
higher values imply C and P are related
Correlation
linear correlation -- product moment corr. coefficient
monotonically correlated -- Spearman's rank corr. coeff.
Correlation coefficient x support
Interesting rule: |S ∩ C ∩ P| / |S ∩ C| compared with |S ∩ P| / |S|
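As a sketch of how a chi-square based fitness could be evaluated for one rule, the function below takes the four cell counts of the C-by-P table and returns the chi-square statistic and Cramér's V. It uses the standard definitions and assumes every row and column total is positive; the function name is illustrative.

```python
import math

def chi_square_2x2(n11, n12, n21, n22):
    """Chi-square statistic and Cramér's V for a 2 x 2 contingency table."""
    obs = ((n11, n12), (n21, n22))
    rows = (n11 + n12, n21 + n22)          # r1, r2
    cols = (n11 + n21, n12 + n22)          # c1, c2
    n = rows[0] + rows[1]
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = rows[i] * cols[j] / n   # e_ij = r_i * c_j / n
            chi2 += (obs[i][j] - expected) ** 2 / expected
    cramers_v = math.sqrt(chi2 / n)        # min(rows-1, cols-1) = 1 for 2 x 2
    return chi2, cramers_v
```

An independent table such as (10, 10, 10, 10) gives V = 0, and a perfectly associated one such as (10, 0, 0, 10) gives V = 1.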
DM application
Symbolic models of consumer choice
{(35 ≤ inc ≤ 40K) and (age < 43) or (inc > 63K) and (age > 55)} then Buy
assumption-free
behavioral insights for targeting promotions
advantage over decision tree algorithms?
DTs are stepwise optimal, but not globally so
high noise-sensitivity of DTs
advantages over neural networks
Performance evaluation
Accuracy/Error rate
will higher accuracy give better performance for the target task?
"The use of error rate often suggests insufficiently careful thought about the real objectives of the research." David J. Hand, Construction and Assessment of Classification Rules.

              Predicted P    Predicted N
Actual P      True P         False N
Actual N      False P        True N

sensitivity, specificity
misclassification costs
Of course, with a 99:1 split in the data, a default dummy model gives 99% accuracy.
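The 99:1 remark is easy to check numerically. The sketch below computes accuracy, sensitivity and specificity from actual and predicted labels (1 = positive); the function name and the toy data are illustrative.

```python
def confusion_metrics(actual, predicted):
    """Accuracy, sensitivity (true-positive rate) and specificity (true-negative rate)."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == 1 and p == 1)
    fn = sum(1 for a, p in pairs if a == 1 and p == 0)
    fp = sum(1 for a, p in pairs if a == 0 and p == 1)
    tn = sum(1 for a, p in pairs if a == 0 and p == 0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn) if tp + fn else 0.0,
        "specificity": tn / (tn + fp) if tn + fp else 0.0,
    }

# 99 negatives, 1 positive; the dummy model predicts the majority class for everyone.
actual = [0] * 99 + [1]
dummy = confusion_metrics(actual, [0] * 100)
```

The dummy model scores 99% accuracy while finding none of the positives, which is why accuracy alone says little about performance on the target task.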
Model Representation
Non-linear tree-structured models (GP)
Non-linear interaction terms
Function set: internal nodes
{+,-,*,/,log}
Terminal set: leaf nodes
{constants, variables}
Example tree: (x1 * log(x3)) / 5
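A tree-structured model like the example above can be represented as nested tuples and evaluated recursively. The tuple encoding, the `FUNCTIONS` table and the `evaluate` helper are assumptions made for this sketch; GP systems differ in how they store trees and guard operations such as division and log.

```python
import math

# Function set: internal nodes.
FUNCTIONS = {
    "+": lambda a, b: a + b,
    "-": lambda a, b: a - b,
    "*": lambda a, b: a * b,
    "/": lambda a, b: a / b,
    "log": lambda a: math.log(a),
}

def evaluate(node, variables):
    """Evaluate a tree: tuples are function nodes, strings are variables,
    anything else is a constant (terminal set: constants and variables)."""
    if isinstance(node, tuple):
        op, *args = node
        return FUNCTIONS[op](*(evaluate(a, variables) for a in args))
    if isinstance(node, str):
        return variables[node]
    return node

# (x1 * log(x3)) / 5
TREE = ("/", ("*", "x1", ("log", "x3")), 5)
```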
DM Performance: Decile Analysis

Decile   Number of   Number of   Response   Cumulative   Cumulative          Cumulative
         Customers   Responses   Rate (%)   Responses    Response Rate (%)   Response Lift
top      2500        2179        87.2       2179         87.2                447
2        2500        1753        70.1       3932         78.6                403
3        2500        396         15.8       4328         57.7                296
4        2500        111         4.4        4439         44.4                228
5        2500        110         4.4        4549         36.4                187
6        2500        85          3.4        4634         30.9                158
7        2500        67          2.7        4701         26.9                138
8        2500        69          2.8        4770         23.9                122
9        2500        49          2.0        4819         21.4                110
bottom   2500        55          2.2        4874         19.5                100
Total    25,000      4874        19.5

\( \text{Cumulative Lift}_{\text{decile}} = \frac{\text{cum. avg. performance}_{\text{decile}}}{\text{overall avg. performance}} \times 100 \)
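The cumulative lift column follows directly from the formula above. The sketch below recomputes it from the customer and response counts in the decile table; the function name is illustrative.

```python
def cumulative_lift(customers, responses):
    """Cumulative response rate through each decile, relative to the
    overall response rate, on a base of 100."""
    overall = sum(responses) / sum(customers)
    lifts, cum_c, cum_r = [], 0, 0
    for c, r in zip(customers, responses):
        cum_c += c
        cum_r += r
        lifts.append(100 * (cum_r / cum_c) / overall)
    return lifts

CUSTOMERS = [2500] * 10
RESPONSES = [2179, 1753, 396, 111, 110, 85, 67, 69, 49, 55]
lifts = cumulative_lift(CUSTOMERS, RESPONSES)
```

Rounded to whole numbers, the result matches the Cumulative Response Lift column (447, 403, 296, ..., 100).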
Decile Maximization(DMAX)
Objective
Find model f(x) (predictor variables x) such that performance in upper deciles (specified depth-of-file) is maximized
Explicitly manages resource constraint
mailings to particular depths-of-file
Performance at different mailing depths
models optimized for different mailing depths
[Figure: decile column (top, 2, ..., 9, bottom) against number of responders / profit, with the top three deciles marked "max"]
DMAX: Illustrative Example

[Figure: scatter plot of ten customers labelled by profit ($1 to $10), axes running 0 to 45 and 0 to 40, with the DMAX 40% ($32) and OLS ($28) models marked]

Profit   $10   $9   $8   $7   $6   $5   $4   $3   $2   $1
X1       45    35   31   30   6    45   30   23   16   12
X2       5     21   38   30   10   37   10   30   13   30

OLS: .14 X1 + .06 X2
DMAX 40%: .19 X1 + .07 X2
GA DMAX
Representation: w1 w2 w3 .. wk
Integrated variable selection
Fitness evaluation
classification accuracy
model reliability
maximize specified decile performance
response, profit, etc.
Hybrid algorithm
Comparative Performance: Case I

Response modeling
maximize response in top 3 deciles
4.6% response to mailing
DMAX (30%): - 0.01X1 - 2.51X2 - 0.008X3 - 0.08X4
LOGIT: - 0.40 - 0.01X2 - 0.007X3 - 3.25X4
Neural Network: 3 layers, 2 hidden nodes, 12 coefficients
Case I: Genetic Algorithm DMAX (30%)

Decile   Number of   Number of   Decile          Cum             Cum
         Customers   Responses   Response Rate   Response Rate   Response Lift
top      4,617       865         18.7%           18.7%           411
2        4,617       382         8.3%            13.5%           296
3        4,617       290         6.3%            11.1%           244
4        4,617       128         2.8%            9.0%            198
5        4,617       97          2.1%            7.6%            167
6        4,617       81          1.8%            6.7%            146
7        4,617       79          1.7%            5.9%            130
8        4,617       72          1.6%            5.4%            118
9        4,617       67          1.5%            5.0%            109
bottom   4,617       43          0.9%            4.6%            100
TOTAL    46,170      2,104       4.6%
Case I: Cum Response Lift Comparison

Decile   Genetic Algorithm   Logistic     Neural
         DMAX (30%)          Regression   Network
top      411                 384          385
2        296                 284          277
3        244                 227          221
4        198                 194          186
5        167                 166          164
6        146                 146          146
7        130                 131          131
8        118                 119          118
9        109                 108          108
bottom   100                 100          100
Case II 2% Response Rate
Cum Response Lift Comparison

Decile   GA           GA           GA           GA           Logistic
         DMAX (10%)   DMAX (20%)   DMAX (30%)   DMAX (40%)   Regression
1        220          186          191          192          194
2        174          195          166          166          165
3        157          173          179          150          148
4        148          158          158          161          154*
5        139          145          146          146          146
6        131          135          138          138          138
7        122          124          127          127          127
8        114          116          117          117          117
9        108          108          109          109          109
bottom   100          100          100          100          100
Case II: 2% Response Rate Smoothness: Logistic Regression

Decile   Number of   Number of   Decile          Cum             Cum
         Customers   Responses   Response Rate   Response Rate   Response Lift
top      7,203       283         3.9%            3.9%            194
2        7,220       200         2.8%            3.3%            165
3        7,225       165         2.3%            3.0%            148
4        7,215       255*        3.5%            3.1%            154*
5        7,227       167         2.3%            3.0%            146
6        7,220       140         1.9%            2.8%            138
7        7,209       89          1.2%            2.6%            127
8        7,228       68          0.9%            2.4%            117
9        7,205       65          0.9%            2.2%            109
bottom   7,232       32          0.4%            2.0%            100
TOTAL    72,184      1,464       2.0%
Case II: 2% Response Rate Smoothness: GA DMAX (10%)

Decile   Number of   Number of   Decile          Cum             Cum
         Customers   Responses   Response Rate   Response Rate   Response Lift
top      7,203       322         4.5%            4.5%            220
2        7,220       188         2.6%            3.5%            174
3        7,225       178         2.5%            3.2%            157
4        7,215       178         2.5%            3.0%            148
5        7,227       151         2.1%            2.8%            139
6        7,220       133         1.8%            2.7%            131
7        7,209       103         1.4%            2.5%            122
8        7,228       84          1.2%            2.3%            114
9        7,205       81          1.1%            2.2%            108
bottom   7,232       46          0.6%            2.0%            100
TOTAL    72,184      1,464       2.0%
Case II: 2% Response Rate Smoothness: GA DMAX (20%)

Decile   Number of   Number of   Decile          Cum             Cum
         Customers   Responses   Response Rate   Response Rate   Response Lift
top      7,203       271         3.8%            3.8%            186
2        7,220       299*        4.1%            4.0%            195*
3        7,225       191         2.6%            3.5%            173
4        7,215       162         2.2%            3.2%            158
5        7,227       140         1.9%            2.9%            145
6        7,220       119         1.8%            2.7%            135
7        7,209       90          1.2%            2.5%            124
8        7,228       85          1.2%            2.3%            116
9        7,205       69          1.0%            2.2%            108
bottom   7,232       38          0.5%            2.0%            100
TOTAL    72,184      1,464       2.0%
Comparative Performance: Case III

Profit modeling
maximize profit in top 2 deciles
mailing (profit / size)
Non-responder: -$0.29 / 92.55%
Unpaid responder: -$5.65 / 7.10%
Paid responder: +$275 / 0.35%
Average profit for mailing: +$0.32
DMAX (20%): - .36X1 - .23X2 + .005X3 + .24X4
LOGIT(PR): - .01X1 - .03X2 + .322X3 + .25X4
Case IV: Profit Model Genetic Algorithm DMAX (20%)

Decile   Number of   Percent PAID   Percent UNPAID   Decile           Cum              Cum
         Customers   Responders     Responders       Average Profit   Average Profit   Profit Lift
top      8,171       0.82%          10.1%            $1.43            $1.43            444
2        8,171       0.62%          8.7%             $0.96            $1.20            371
3        8,171       0.37%          8.2%             $0.28            $0.89            277
4        8,171       0.34%          8.4%             $0.20            $0.72            223
5        8,171       0.29%          5.9%             $0.20            $0.62            191
6        8,171       0.32%          7.4%             $0.19            $0.54            169
7        8,171       0.23%          4.0%             $0.13            $0.49            151
8        8,171       0.18%          4.8%             -$0.04           $0.42            130
9        8,171       0.24%          8.3%             -$0.06           $0.37            114
bottom   8,171       0.17%          4.9%             -$0.08           $0.32            100
TOTAL    81,710      0.35%          7.1%
Case IV: Profit Model
Cum Profit Lift Comparison

Decile   Genetic Algorithm   Logistic
         DMAX (20%)          Regression
top      444                 385
2        371                 294
3        277                 235
4        223                 190
5        191                 184
6        169                 163
7        151                 146
8        130                 123
9        114                 111
bottom   100                 100
Modeling on Multiple Objectives

Model [y1,..,yk] = f (x)
simultaneously optimize on multiple objectives
Some common DM modeling desirables

response and high purchase revenues
likely churners with high usage of services
high tenure and usage
purchase and non-return
cross-selling, etc.
[or CPR (Combined Profit and Response) Models]
Multiple objectives
Traditional approaches
multiple single-objective models, and combine
weighted average of objectives
conflicting objectives
different levels of tradeoffs
frontier of non-dominated solutions
choice of final model based on diverse decision-maker objectives, can also be subjective
Pareto Frontier
Non-dominated solutions
multiple objectives \( \phi_i \); \( f_a(x) \) is better than \( f_b(x) \) if
\( \forall i: \phi_i(f_a(x)) \ge \phi_i(f_b(x)) \) and \( \exists j: \phi_j(f_a(x)) > \phi_j(f_b(x)) \)
[Figure: non-dominated models and dominated models]
Single GA run obtains tradeoff frontier of non-dominated solutions \( f_k(x) \)
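The dominance test above translates into a short filter for the non-dominated set. In this sketch each model is represented only by its tuple of objective values, with larger values assumed better on every objective; the function names and the example points are illustrative.

```python
def dominates(a, b):
    """a dominates b: at least as good on every objective, strictly better on one."""
    return (all(x >= y for x, y in zip(a, b))
            and any(x > y for x, y in zip(a, b)))

def non_dominated(points):
    """Keep the points that no other point dominates (the Pareto frontier)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]

# Hypothetical (objective 1, objective 2) scores for five models.
MODELS = [(1, 5), (2, 4), (3, 3), (2, 2), (1, 1)]
front = non_dominated(MODELS)
```

Here (2, 2) and (1, 1) are dominated by (3, 3), while the first three models each trade one objective against the other and stay on the frontier.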
Multi-objective GA
Pareto-Based Selection (Louis and Rawlins, 93)
randomly select a pair of solutions from population
generate two new offspring
determine the Pareto-optimal set from parents and offspring, and choose two solutions for new population
Elitism
retain best solution intact in next population
fosters local search around best solution
retain non-dominated set of solutions intact in next generation
Fitness evaluation
DMAX approach
fitness at specified depth-of-file d
Experimental Study: Data

Cellular-phone provider seeking to identify potential high-value churners
two dependent variables
binary Churn variable
continuous variable measuring revenue ($)
predictors: minutes-of-use (peak and off-peak), average charges, and payment information, etc.
obtained after EDA, normalized to 0 mean, 1 s.d.
50,000 sample: 25,000 for training, 25,000 for test set
Multiple Objectives: Performance

Churn-Lift = \( \frac{C_d / N_d}{C / N} \)
model capturing more churners in top deciles is better
$-Lift = \( \frac{R_d / N_d}{R / N} \)
model giving high revenue customers in upper deciles is better
overall modeling objective

maximize expected revenue saved through identification of high-value churners
Churn-Lift * $-Lift
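Both lifts can be computed with one helper that ranks customers by model score and compares the top of the file with the whole file. This is a sketch under assumptions: lifts are reported on a base of 100 as in the decile analysis, the product is divided by 100 to stay on that base, and the function name and toy data are illustrative.

```python
def lift_at_depth(scores, values, depth):
    """Mean of `values` among the top `depth` fraction of customers
    (ranked by score), relative to the overall mean, on a base of 100."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    k = max(1, round(depth * len(scores)))
    top_mean = sum(values[i] for i in order[:k]) / k
    overall_mean = sum(values) / len(values)
    return 100 * top_mean / overall_mean

# Ten hypothetical customers, already listed from highest to lowest score.
SCORES = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.0]
CHURN = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
REVENUE = [30, 10, 10, 10, 10, 10, 10, 10, 0, 0]

churn_lift = lift_at_depth(SCORES, CHURN, 0.2)
dollar_lift = lift_at_depth(SCORES, REVENUE, 0.2)
product_of_lifts = churn_lift * dollar_lift / 100
```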
Experimental Study
Non-dominated models: Decile 1 (Training)

[Figure: Decile 1 (training): $-Lift (0 to 400) against Churn-Lift (0 to 600) for GP, GA, Logistic and OLS models]
5 independent GA runs, aggregate the sets of non-dominated solutions
Experimental Study
Non-dominated models: Decile 1 (Test)

[Figure: Decile 1 (Test): $-Lift (0 to 400) against Churn-Lift (0 to 500) for GP, GA, Logistic and OLS models]
Experimental Study

[Figure: Decile 2 (Test): $-Lift (0 to 300) against Churn-Lift (0 to 450) for GP, GA, Logistic and OLS models]
Experimental Study

[Figure: Decile 3 (Test): $-Lift (0 to 250) against Churn-Lift (0 to 350) for GP, GA, Logistic and OLS models]
Experimental Study

[Figure: Decile 7 (Test): $-Lift (60 to 140) against Churn-Lift (80 to 150) for GP, GA, Logistic and OLS models]
Experimental Study
Performance Summary
Performance                                 Decile 1       Decile 2       Decile 3       Decile 7
GA-best               Churn-Lift, $-Lift    304.9, 261.7   265.4, 207.4   272.3, 155.0   138.8, 126.9
                      Product of Lifts      797.8          550.4          422.2          176.1
GP-best               Churn-Lift, $-Lift    343.7, 256.5   343.5, 182.1   275.1, 178.3   139.4, 131.2
                      Product of Lifts      881.5          625.5          490.4          182.9
Logistic Regression   Churn-Lift, $-Lift    447.1, 111.8   403.4, 72.6    295.9, 57.4    137.8, 66.7
                      Product of Lifts      499.8          292.7          169.96         91.9
OLS Regression        Churn-Lift, $-Lift    116.2, 360.5   108.1, 271.7   99.7, 223.2    91.8, 136.2
                      Product of Lifts      418.8          293.71         222.5          125.1
OLS * Logistic        Churn-Lift, $-Lift    79, 357        76, 263        74, 217        78, 136
                      Product of Lifts      282            201            160            106
General Optimization of Lifts

Fitness function
Seeks a general maximization of lifts at all deciles
Specific vs. General Lift Opt

Performance                                 Decile 1       Decile 2       Decile 3       Decile 7
GA-best Lift-Opt      Churn-Lift, $-Lift    304.9, 261.7   265.4, 207.4   272.3, 155.0   138.8, 126.9
                      Product of Lifts      797.8          550.4          422.2          176.1
GA-best General-Opt   Churn-Lift, $-Lift    303.2, 261     288.3, 188.8   276.7, 151.3   138.1, 104.5
                      Product of Lifts      791.4          544.3          418.6          144.3
GP-best Lift-Opt      Churn-Lift, $-Lift    343.7, 256.5   343.5, 182.1   275.1, 178.3   139.4, 131.2
                      Product of Lifts      881.5          625.5          490.4          182.9
GP-best General-Opt   Churn-Lift, $-Lift    332, 252.5     265, 223.1     233.9, 186.5   132.3, 133.1
                      Product of Lifts      838.3          591.2          436.2          176.1
Table: Best Prod-Lifts in Deciles
Specific vs. General Lift Opt.

Performance                        GA-best Lift-Opt   GA-best General-Opt
Decile 1   $-Lift                  361.4              361.7
           Churn-Lift              464.7              421
Decile 2   $-Lift                  271.6              273.3
           Churn-Lift              401.3              398.1
Decile 3   $-Lift                  223.9              223.9
           Churn-Lift              309.8              304.1
Decile 7   $-Lift                  136.6              136.6
           Churn-Lift              139.5              138.4

Performance                        GA-best Lift-Opt   GA-best General Opt
Decile 1   $-Lift                  372.7              372.1
           Churn-Lift              475.2              421.3
Decile 2   $-Lift                  276.5              276.8
           Churn-Lift              417.9              378.3
Decile 3   $-Lift                  226.1              226.6
           Churn-Lift              310.3              296.7
Decile 7   $-Lift                  137.2              137.1
           Churn-Lift              139.8              139.8
Table: Best $-Lift and Churn-Lifts in Deciles
Case Study EC challenge
EDA, Variable-selection
Problem
15,178 obs., 79 variables, response dependent
Seeking maximum lift in the top decile
Logistic regression model
15 variables, after EDA, transformation
(many of them combinations of multiple vars.)
This is the hard part!
Lift of 126 in the top decile
EC approach
Include all variables
Explore simple terms: non-linear GP models
small populations, looking for robust terms
Final model(s) using obtained terms
Case Study EC
Various 2-5 var. terms show some predictability
Lifts ranging in 122-127
Models on these terms

Non-linear, Linear model: lifts in 126-132
Examples
3 tan(HC211) + EC31
Trg: 122.5 Test: 122.5
(OCC81 - log10(ORDTERM1/IC191))*STATE2*HHAS21
Trg: 124.9 Test: 126.4
STATE2 * HHAS21
Trg: 121.3 Test: 121.3
(OCC81 - log10(B)) * B * (A + B + (ORDTERM1 * (A + B))), where A = (STATE2 - SECGENDE) and B = STATE2*HHAS21
Trg: 131.5 Test: 126.9
B + tan(2B + HHAS21) + EC31 + (ORDTERM1)*(B + tan[B + HHAS21 + ((HHAS21*HV31)/2.1)])
Trg: 131.1 Test: 127.8
AB^3 (1 + OCC81) + AB(OCC81) + 2DEB(OCC81)^2
Trg: 134.4 Test: 131.6
4A + B + C + 2D + E + 2*OCC81 (10 vars. total)
Trg: 132.5 Test: 131.7
