This paper describes how fitness inheritance can be used to estimate fitness for a proportion of newly sampled candidate solutions in the Bayesian optimization algorithm (BOA). The goal of estimating fitness for some candidate solutions is to reduce the number of fitness evaluations on problems where fitness evaluation is expensive. The Bayesian networks used in BOA to model promising solutions and generate new ones are extended to allow not only for modeling and sampling candidate solutions, but also for estimating their fitness. The results indicate that fitness inheritance is a promising concept in BOA, because the population sizes required for building appropriate models of promising solutions lead to good fitness estimates even if only a small proportion of candidate solutions is evaluated using the actual fitness function. This can reduce the number of actual fitness evaluations by a factor of 30 or more.
1. Fitness Inheritance in BOA
Martin Pelikan (Dept. of Math and CS, Univ. of Missouri at St. Louis)
Kumara Sastry (Illinois GA Lab, Univ. of Illinois at Urbana-Champaign)
2. Motivation
Bayesian optimization algorithm (BOA)
Scales up on decomposable problems
O(n) to O(n²) evaluations until convergence
Expensive evaluations
Real-world evaluations can be complex
FEA, simulation, …
Even O(n²) evaluations can be too expensive
This paper
Extend probabilistic model to include fitness info
Use model to evaluate part of the population
3. Outline
BOA basics
Fitness inheritance in BOA
Extend Bayesian networks with fitness.
Use extended model for evaluation.
Experiments
Future work
Summary and conclusions
4. Bayesian Optimization Alg. (BOA)
Pelikan, Goldberg, and Cantú-Paz (1998)
Similar to genetic algorithms (GAs)
Replace mutation + crossover by:
Build Bayesian network to model selected solutions.
Sample Bayesian network to generate new candidate solutions.
5. BOA
[Flow diagram: current population → selection → build Bayesian network → sample new population]
Restricted tournament replacement
6. Bayesian Networks (BNs)
Two components
Structure
Directed acyclic graph
Nodes = variables (string positions)
Edges = dependencies between variables
Parameters
Conditional probabilities p(X|Px), where
X is a variable
Px are the parents of X (variables that X depends on)
7. BN example
Structure: edges B → A and A → C

B  p(B)
0  0.25
1  0.75

A  B  p(A|B)
0  0  0.10
0  1  0.60
1  0  0.90
1  1  0.40

C  A  p(C|A)
0  0  0.80
0  1  0.55
1  0  0.20
1  1  0.45
8. Extending BNs with fitness info
Basic idea
Don't work only with conditional probabilities
Add also fitness info for fitness estimation

A  B  p(A|B)  f(A|B)
0  0  0.10    -0.5
0  1  0.60     0.5
1  0  0.90     0.3
1  1  0.40    -0.3

Fitness info attached to p(X|Px) denoted by f(X|Px)
Contribution of X restricted by Px:
f(X = x | Px = px) = f(X = x, Px = px) − f(Px = px)
f(X = x, Px = px)
avg. fitness of solutions with X = x and Px = px
f(Px = px)
avg. fitness of solutions with Px = px
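The contribution formula above can be computed directly from an evaluated sample. A minimal sketch follows; the toy population and the helper names `avg_fitness` and `contribution` are made up for illustration, not taken from the talk.

```python
pop = [(0, 0), (0, 1), (1, 0), (1, 1)]   # made-up evaluated solutions
fits = [0, 1, 1, 2]                      # their fitness values (onemax here)

def avg_fitness(pop, fits, cond):
    """Average fitness over solutions matching all (index, value) pairs in cond."""
    vals = [f for s, f in zip(pop, fits)
            if all(s[i] == v for i, v in cond)]
    return sum(vals) / len(vals)

def contribution(pop, fits, x_idx, x_val, parent_cond):
    """f(X=x | Px=px): avg fitness with X=x and Px=px, minus avg fitness with Px=px."""
    return (avg_fitness(pop, fits, [(x_idx, x_val)] + list(parent_cond))
            - avg_fitness(pop, fits, list(parent_cond)))
```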
9. Estimating fitness
Equation
f(X1, X2, …, Xn) = f_avg + Σ_{i=1}^{n} f(Xi | PXi)
In words
Fitness = avg. fitness + avg. contribution of each bit
Avg. contributions taken w.r.t. context from BN
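The estimate on this slide is a lookup-and-sum over the model. A sketch under assumed data structures: `parents` maps each variable to its parent indices and `contrib` maps `(i, x_i, px_i)` to `f(Xi | PXi)`; neither name is part of BOA's actual implementation.

```python
def estimate_fitness(x, f_avg, parents, contrib):
    """f(x) ≈ f_avg + sum over i of f(X_i = x_i | Px_i = px_i)."""
    total = f_avg
    for i, xi in enumerate(x):
        px = tuple(x[j] for j in parents[i])  # parent context taken from x itself
        total += contrib[(i, xi, px)]
    return total

# Example with empty parent sets (onemax-like); all numbers are made up:
parents = {0: (), 1: ()}
contrib = {(0, 1, ()): 0.5, (0, 0, ()): -0.5,
           (1, 1, ()): 0.5, (1, 0, ()): -0.5}
est = estimate_fitness((1, 1), 1.0, parents, contrib)
```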
10. BNs with decision trees
Local structures in BNs
More efficient representation for p(X | Px)
Example for p(A | B, C):
[Decision tree: root tests B; B = 0 leads to leaf p(A | B = 0);
B = 1 tests C, with leaves p(A | B = 1, C = 0) and p(A | B = 1, C = 1)]
11. BNs with decision trees + fitness
Same idea
Attach fitness info to each probability
[Decision tree: root tests B; B = 0 leads to leaf with p(A | B = 0) and f(A | B = 0);
B = 1 tests C, with leaves holding p(A | B = 1, C = 0), f(A | B = 1, C = 0)
and p(A | B = 1, C = 1), f(A | B = 1, C = 1)]
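One way to picture the leaf lookup is a small tree of nested tuples: internal nodes test a variable, and each leaf holds the (p, f) pair for one context. The layout mirrors this slide's example, but the leaf numbers are placeholders, not values from the talk.

```python
# Decision tree for one variable A: split on B, then on C when B = 1.
tree = ('B', {0: ('leaf', {'p': 0.3, 'f': -0.2}),
              1: ('C', {0: ('leaf', {'p': 0.7, 'f': 0.1}),
                        1: ('leaf', {'p': 0.5, 'f': 0.4})})})

def lookup(tree, assignment):
    """Walk the tree using assignment (dict var -> value) down to a leaf record."""
    kind, branches = tree
    while kind != 'leaf':
        kind, branches = branches[assignment[kind]]
    return branches  # the leaf's {'p': ..., 'f': ...} record
```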
12. Estimating fitness again
Same as before, because both representations encode the same model
Equation
f(X1, X2, …, Xn) = f_avg + Σ_{i=1}^{n} f(Xi | PXi)
In words
Fitness = avg. fitness + avg. contribution of each bit
Avg. contributions taken w.r.t. context from BN
13. Where to learn fitness from?
Evaluate entire initial population
Choose inheritance proportion, pi
After that
Evaluate a (1 − pi) proportion of the offspring
Use evaluated parents + evaluated offspring to learn
Estimate fitness of the remaining proportion pi
Sample for learning: N(1 − pi) to N + N(1 − pi) solutions
Often, 2N(1 − pi)
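The per-generation schedule above might be sketched as follows. The helper names `real_fitness` and `estimate` are stand-ins for the expensive evaluation and the model-based estimate; the random split is one simple way to choose which offspring get real evaluations.

```python
import random

def evaluate_offspring(offspring, p_i, real_fitness, estimate, rng=random):
    """Evaluate a (1 - p_i) fraction for real; estimate the remaining p_i fraction."""
    fits = []
    evaluated = []  # (solution, fitness) pairs available for model learning
    for x in offspring:
        if rng.random() < 1.0 - p_i:
            f = real_fitness(x)        # expensive, real evaluation
            evaluated.append((x, f))
        else:
            f = estimate(x)            # cheap, model-based estimate
        fits.append(f)
    return fits, evaluated
```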
14. Simple example: Onemax
Onemax
f(X1, X2, …, Xn) = Σ_{i=1}^{n} Xi
What happens?
Average fitness grows (as predicted by theory)
No context is necessary
Fitness contributions stay constant
f(Xi = 1) = +0.5
f(Xi = 0) = −0.5
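The constant ±0.5 contributions claimed above can be checked by brute force over the full Boolean cube, using the empty parent context as the slide notes. A sketch; `contrib` is an illustrative helper, not from the talk.

```python
from itertools import product

n = 4
pop = list(product((0, 1), repeat=n))   # all 2^n binary strings
fits = [sum(s) for s in pop]            # onemax fitness
f_avg = sum(fits) / len(fits)           # n/2 over the full cube

def contrib(i, v):
    """Avg. fitness of strings with X_i = v, minus the overall average."""
    vals = [f for s, f in zip(pop, fits) if s[i] == v]
    return sum(vals) / len(vals) - f_avg
```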
15. Experiments
Problems
50-bit onemax
10 traps of order 4
10 traps of order 5
Settings
Inheritance proportion from 0 to 0.999
Minimum population size for reliable convergence
300 runs for each setting
Output
Speed-up (in terms of real fitness evaluations)
19. Discussion
Inheritance proportion
High proportions of inheritance work great.
Speed-up
Optimal speed-up of 30-53
High speed-up for almost any setting
The tougher the problem, the better the speed-up
Why so good?
Learning the probabilistic model is hard anyway, so accurate fitness info can be added at little extra cost.
20. Conclusions
Fitness inheritance works great in BOA
Theory now exists (Sastry et al., 2004) that explains these results
High proportions of inheritance lead to high speed-ups
Challenging problems allow much speed-up
Useful for practitioners with computationally complex fitness functions
21. Contact
Martin Pelikan
Dept. of Math and Computer Science, 320 CCB
University of Missouri at St. Louis
8001 Natural Bridge Rd.
St. Louis, MO 63121
E-mail: pelikan@cs.umsl.edu
WWW: http://www.cs.umsl.edu/~pelikan/