1
A Genetic Algorithm
1. Obtain several creatures! (Spontaneous generation??)
2. Evolve! Perform selective breeding:
(a) Run a couple of tournaments
(b) Let the winners breed
(c) Mutate and test their children
(d) Let the children live in the losers' homes
2
Evolution Runs Until:
A perfect individual appears (if you know what the goal is),
Or: improvement appears to be stalled,
Or: you give up (your computing budget is exhausted).
3
Comments
Stochastic algorithm: randomness has an essential role in genetic algorithms; both selection and reproduction need random procedures.
Considers a population of solutions: it evaluates more than a single solution at each iteration; this assortment is amenable to parallelisation.
Robustness: the ability to perform consistently well on a broad range of problem types; there are no particular requirements on the problems before using GAs.
4
Benefits of Genetic Algorithms
Concept is easy to understand
Modular, separate from application
Supports multi-objective optimization
Good for “noisy” environments
Always an answer; answer gets better with time
Inherently parallel; easily distributed
5
Benefits of Genetic Algorithms (cont.)
Many ways to speed up and improve a GA-based application as knowledge about problem domain is gained
Easy to exploit previous or alternate solutions
Flexible building blocks for hybrid applications
Substantial history and range of use
6
Uses of GAs
GAs (and SAs): the algorithms of despair. Use a GA when
you have no idea how to reasonably solve a problem
calculus doesn't apply
generation of all solutions is impractical
but, you can evaluate posed solutions
7
Problem & Representations
Chromosomes represent problems' solutions as genotypes
They should be amenable to:
Creation (spontaneous generation)
Evaluation (fitness) via development of phenotypes
Modification (mutation)
Crossover (recombination)
8
How GAs Represent Problems' Solutions: Genotypes
Bit strings -- this is the most common method
Strings on small alphabets (e.g., C, G, A, T)
Permutations (Queens, Salesmen)
Trees (Lisp programs).
Genotypes must allow for: creation, modification, and crossover.
9
Bit Strings
Bit strings (B0, B1, ..., BN-1) often represent solutions well and permit easy fitness evaluations. Individuals are bit strings we call the chromosomes.
Sample problems:
Maximize the ones count = sum over k of Bk
Optimize f(x), letting x = 0.B0B1B2...BN-1 in binary
Map coloring (2 bits per country color)
Music (3 or 4 bits per note)
Creation, modification, and crossover are easy.
10
Sample Problems
Search for the best bit string pattern for some application, such as:
"Baby Problem 1"
Find the bit string with the largest number of 1's.
Not very interesting,
but this simple problem can prove that the system works.
It's a "stub."
11
Sample Problems
Find the maximum value of the function
f(x) = sin(2x^3) · sin(2^5 x) for 0 <= x < 1
The bit string (b1, b2, ..., bn) represents x in binary:
x = 0.b1b2...bn = sum over k of bk·2^-k
12
Sample Problems
Find the maximum value of the function
f(x, y) = sin(2x^3) · cos(cos(4^2 y)) · sin(2^5 x) + y^2
for 0 <= x < 1 and 0 <= y < 1.
The bit string represents both x and y.
13
Gleason's Problem
Given a “random” matrix, M, with values ±1.
Change all the signs in some rows, and in some columns.
Try to maximize the number of +1's.
14
Hill Climbing Gets Local Optima
Locate rows or columns with negative sums and invert them.
This gets caught in local optima!
Example: 5 × 5 matrix with 17 +1's:
+1 +1 +1 -1 -1
+1 +1 +1 -1 -1
+1 +1 +1 +1 +1
-1 -1 +1 +1 +1
-1 -1 +1 +1 +1
Every row and column has a positive sum. No single inversion can improve it!
15
Solution Representation
The size of M is C × R.
Use a bit string of length C + R: B = (b1, ..., bC, bC+1, ..., bC+R)
Meaning of B:
for 1 <= j <= C: bj = 1 means invert column j
for C + 1 <= j <= C + R: bj = 1 means invert row j - C
Apply the changes dictated by B to M, to get M'.
Then the fitness of B is the sum of all the elements in M'.
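The fitness rule above can be sketched in C (a minimal sketch: the function name and the flattened row-major matrix layout are illustrative assumptions, not the lecture's actual code):

```c
#include <assert.h>

/* Fitness of bit string B for Gleason's problem. B has C + R bits:
   B[0..C-1] flag column inversions, B[C..C+R-1] flag row inversions.
   M is R x C, row-major, with entries +1 or -1. An entry's sign flips
   once per applicable inversion, so it ends up flipped only when
   exactly one of its row/column bits is set. */
int gleason_fitness(const int *B, const int *M, int R, int C)
{
    int r, c, sum = 0;
    for (r = 0; r < R; r++)
        for (c = 0; c < C; c++) {
            int v = M[r * C + c];
            if (B[c])     v = -v;  /* column c inverted */
            if (B[C + r]) v = -v;  /* row r inverted    */
            sum += v;
        }
    return sum;
}
```

On the 5 × 5 example above, the all-zero string scores 17 − 8 = 9, while inverting rows 1–2 and columns 1–2 scores 21 − 4 = 17.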
16
Four Moves Improves
Invert rows 1 and 2, to lose two +1's:
-1 -1 -1 +1 +1
-1 -1 -1 +1 +1
+1 +1 +1 +1 +1
-1 -1 +1 +1 +1
-1 -1 +1 +1 +1
Invert columns 1 and 2, to gain six +1's:
+1 +1 -1 +1 +1
+1 +1 -1 +1 +1
-1 -1 +1 +1 +1
+1 +1 +1 +1 +1
+1 +1 +1 +1 +1
17
The Structure of My GA Code
Start-up
Main program loop
Parent and loser selection
Initialization subroutine (somewhat problem dependent)
“fv” -- the fitness evaluation code (is problem dependent)
Crossover
Mutation
18
GA Main Program: Init
int main(int argc, char **argv)
{
    int who;
    params( argc, argv );
    init();
    for( who = 0; who < POP_SIZE; who++ )
        fitness[who] = fv(who);
    printf( "End of initial pop\n. . . Now evolve!\n" );
    /* . . . main loop goes here . . . */
}
19
Notes on GA Main Program: Init
1. arg gives the name of a param file
2. params( argc, argv ) processes that
3. POP_SIZE is an example of a run-time param
20
Initialization Subroutine
init()
{
    MAX_HERO = N; /* for problem #1 */
    for( i = 0; i < POP_SIZE; i++ )
        for( j = 0; j < N; j++ )
            p[i][j] = random_int( 2 );
}
21
Fitness Calculation
int fv(int who)
{
int i, the_fitness = 0;
fitness_evals++;
for( i = 0; i < N; i++ ) the_fitness += p[who][i];
if( print_every_fitness ) printf( "%4d fitness: ... " );
if( the_fitness > hero ) {
hero = the_fitness;
printf( "New hero %4d ... " );
}
return( the_fitness );
}
22
GA Main Program: Loop
for( trial = 0; trial < LOOPS; trial++ ) {
    if( hero >= MAX_HERO ) {
        printf( "Goal reached: %d\n", hero );
        break;
    }
    selection();
    crossover();
    mutation();
    for( who = 0; who < POP_SIZE; who++ )
        fitness[who] = fv(who);
}
printf( "%d evaluations", fitness_evals );
printf( "Hero = %d\n", hero );
23
Selection
Selects individuals for reproduction randomly, with a probability depending on the relative fitness of the individuals, so that the best ones are chosen for reproduction more often than the poor ones.
– Proportionate-based selection: picks out individuals based upon their fitness values relative to the fitness of the other individuals in the population.
– Ordinal-based selection: selects individuals based upon their rank within the population; independent of the fitness distribution.
24
Roulette Wheel Selection
Here is a common technique: let F = sum over j = 1 to POP_SIZE of fitness_j.
Select individual k to be a parent with probability fitness_k / F.
25
Roulette Wheel Selection
Assigns to each solution a sector of a roulette wheel whose size is proportional to the appropriate fitness measure, then chooses a random position on the wheel (spin the wheel).
[Figure: wheel with sectors a–g sized by fitness: a:1, b:3, c:5, d:3, e:2, f:2, g:8]
26
Roulette Wheel Example
For each chromosome, evaluate the fitness and the cumulative fitness.
N times, generate a random number.
Select the chromosome whose cumulative fitness is the first value greater than the generated random number.

Individual  Chromosome  Fitness  Cumulative
x1          101100      20       20
x2          111000      7        27
x3          001110      6        33
x4          101010      10       43
x5          100011      12       55
x6          011011      9        64
27
Roulette Wheel Example
Individual  Chromosome  Fitness  Cumulative  Random  Selected
x1          101100      20       20          42.80   x4
x2          111000      7        27          19.78   x1
x3          001110      6        33          42.73   x4
x4          101010      10       43          58.44   x6
x5          100011      12       55          27.31   x3
x6          011011      9        64          28.31   x3
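The cumulative-fitness rule in the example above can be sketched in C (a sketch with illustrative names; the random draw r, taken from [0, total fitness), is passed in so the selection rule itself is deterministic and testable):

```c
#include <assert.h>

/* Roulette wheel selection: return the first individual whose
   cumulative fitness exceeds the draw r. */
int roulette_select(const int *fitness, int pop_size, double r)
{
    int k;
    double cum = 0.0;
    for (k = 0; k < pop_size; k++) {
        cum += fitness[k];
        if (r < cum)
            return k;
    }
    return pop_size - 1;  /* guard for r at the very top edge */
}
```

With the fitnesses 20, 7, 6, 10, 12, 9 above, a draw of 42.8 selects x4 and a draw of 19.78 selects x1, matching the table.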
28
Roulette Wheel Selection
There are some problems here:
fitnesses shouldn't be negative
it is directly useful only for maximization problems
probabilities should be “right”: avoid skewing by super-heroes
29
Parent Selection: Rank
Here is another technique.
Order the individuals by fitness rank.
The worst individual has rank 1; the best individual has rank POP_SIZE.
Let F = 1 + 2 + 3 + ... + POP_SIZE = POP_SIZE·(POP_SIZE + 1)/2.
Select individual k to be a parent with probability rank_k / F.
Benefits of rank selection:
the probabilities are all positive
the probability distribution is “even”
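Rank selection can be sketched the same way as the roulette wheel (illustrative names; the draw r is an integer in [0, F) so the rule is testable):

```c
#include <assert.h>

/* Rank selection: ranks[k] holds the rank of individual k
   (1 = worst ... pop_size = best). With F = pop_size*(pop_size+1)/2,
   a draw r in [0, F) picks individual k with probability rank_k / F
   via cumulative sums. */
int rank_select(const int *ranks, int pop_size, int r)
{
    int k, cum = 0;
    for (k = 0; k < pop_size; k++) {
        cum += ranks[k];
        if (r < cum)
            return k;
    }
    return pop_size - 1;
}
```

Because ranks are always positive and bounded, this avoids both problems noted for raw fitness: negative values and skewing by super-heroes.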
30
Parent Selection: Rank Power
Yet another technique.
Order the individuals by fitness rank: the worst individual has rank 1; the best has rank POP_SIZE.
Let F = 1^s + 2^s + 3^s + ... + POP_SIZE^s.
Select individual k to be a parent with probability rank_k^s / F.
benefits:
the probabilities are all positive
the probabilities can be skewed to use more “elitist” selection
31
Tournament Selection
Pick k members of the population at random, then select one of them in some manner that depends on fitness.
32
Tournament Selection
void tournament(int *winner, int *loser)
{
    int size = tournament_size, i, winfit, losefit;
    for( i = 0; i < size; i++ ) {
        int j = random_int( POP_SIZE );
        if( i == 0 || fitness[j] > winfit )
            winfit = fitness[j], *winner = j;
        if( i == 0 || fitness[j] < losefit )
            losefit = fitness[j], *loser = j;
    }
}
33
Crossover Methods
Crossover is a primary tool of a GA. (The other main tool is selection.)
CROSS_RATE: determines whether a chromosome takes part in the crossover
Common techniques for bit string representations:
One-point crossover: Parents exchange a random prefix
Two-point crossover: Parents exchange a random substring
Uniform crossover: Each child bit comes arbitrarily from either parent
(We need more clever methods for permutations & trees.)
34
1-point Crossover
Suppose we have 2 strings a and b, each consisting of 6 variables,
a1, a2, a3, a4, a5, a6
b1, b2, b3, b4, b5, b6
representing two solutions to a problem.
A crossover point is chosen at random and a new solution is produced by combining the pieces of the original solutions. If the crossover point was 2:
a1, a2, b3, b4, b5, b6
b1, b2, a3, a4, a5, a6
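The exchange described above might look like this in C (a sketch in the style of the lecture's code; names are illustrative):

```c
#include <assert.h>

/* One-point crossover: children copy the parents up to `point`,
   then swap tails. Each array holds n genes. */
void one_point_crossover(const int *a, const int *b,
                         int *c1, int *c2, int point, int n)
{
    int i;
    for (i = 0; i < n; i++) {
        if (i < point) {
            c1[i] = a[i];
            c2[i] = b[i];
        } else {
            c1[i] = b[i];
            c2[i] = a[i];
        }
    }
}
```

Crossing 01101 and 11000 at point 4 yields 01100 and 11001, the same result that appears in the worked simulation later in these slides.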
35
1-point Crossover
[Figure: two parents and the two children produced by 1-point crossover]
36
2-point Crossover
With one-point crossover, the head and the tail of one chromosome cannot be passed together to the offspring.
If both the head and the tail of a chromosome contain good genetic information, none of the offspring obtained directly with one-point crossover will share the two good features.
A 2-point crossover avoids such a drawback
[Figure: two parents and the two children produced by 2-point crossover]
37
Uniform Crossover
Each gene in the offspring is created by copying the corresponding gene from one or the other parent, chosen according to a randomly generated binary crossover mask of the same length as the chromosomes.
Where there is a 1 in the crossover mask, the gene is copied from the first parent; where there is a 0 in the mask, the gene is copied from the second parent.
A new crossover mask is randomly generated for each pair of parents.
38
Uniform Crossover
[Figure: parents and child combined under the crossover mask 1 0 0 1 0 1 1 0]
39
Uniform Crossover
make_children(int p1, int p2, int c1, int c2)
{
    int i;
    for( i = 0; i < N; i++ ) {
        if( random_int(2) )
            p[c1][i] = p[p1][i], p[c2][i] = p[p2][i];
        else
            p[c1][i] = p[p2][i], p[c2][i] = p[p1][i];
    }
}
40
Another Clever Crossover
Select three individuals, A, B, and C. Suppose A has the highest fitness and C the lowest.
Create a child like this:
for( i = 0; i < length; i++ ) {
    if( A[i] == B[i] ) child[i] = A[i];
    else child[i] = 1 - C[i];
}
We just suppose C is a “bad example.”
41
Crossover Methods & Schemas
Crossovers try to combine good schemas in the good parents.
The schemas are the good genes, building blocks to gather.
The simplest schemas are substrings.
1-point & 2-point crossovers preserve short substring schemas.
Uniform crossover is uniformly hostile to all kinds of schemas.
42
Limit Consistency of Crossover Operator
lim (k → ∞) P_k(x_i1, ..., x_im) = P_0(x_i1) · · · P_0(x_im)
(i.e., under repeated crossover alone, the joint distribution over any set of gene positions converges to the product of the initial marginal distributions)
43
Crossover for Permutations (A Tricky Issue)
Small-alphabet techniques fail. Some common methods are:
OX: ordered crossover
PMX: partially matched crossover
CX: cycle crossover
We will address these and others later.
44
Crossover for Trees
These trees often represent computer programs.
Think Lisp
Interchange randomly chosen subtrees of parents.
45
Mutation: Preserve Genetic Diversity
Mutation is a minor GA tool.
It provides the opportunity to reach parts of the search space which perhaps cannot be reached by crossover alone. Without mutation we may get premature convergence to a population of identical clones.
– Mutation helps the exploration of the whole search space by maintaining genetic diversity in the population.
– Each gene of a string is examined in turn and, with a small probability, its current allele is changed.
– 011001 could become 010011 if the 3rd and 5th alleles are mutated.
46
Mutate Strings & Permutations
Bit strings (or small alphabets)
Flip some bits
Reverse a substring (nature does this)
Permutations
Transpose some pairs
Reverse a substring
Trees . . .
47
Mutation
Randomize each bit with probability MUT_RATE
mutate(int who)
{
int j;
for( j = 0; j < N; j++ ) {
if( MUT_RATE > drand48( ) )
p[who][j] = random_int(2);
}
}
48
Mutation
The mutation rate determines the probability that a mutation will occur.
Mutation is employed to give new information to the population,
and it also prevents the population from becoming saturated with similar chromosomes.
Large mutation rates increase the probability that good schemata will be destroyed, but increase population diversity.
The best mutation rate is application dependent but for most applications is between 0.001 and 0.1.
49
Mutation
Some researchers have published "rules of thumb" for choosing the best mutation rate based on the length of the chromosome and the population size.
De Jong suggested that the mutation rate be made inversely proportional to the population size; another common rule of thumb is a rate of 1/L.
Hesser and Männer suggest that the optimal mutation rate is approximately
(M · √L)^-1
where M is the population size and L is the length of the chromosome.
50
Recombination vs Mutation
Recombination
modifications depend on the whole population
decreasing effects with convergence
exploitation operator
Mutation
mandatory to escape local optima
strong causality principle
exploration operator
51
Recombination vs Mutation
Historical “irrationale”:
GAs emphasize crossover
ES and EP emphasize mutation
Problem-dependent rationale:
Is the fitness partially separable?
Do building blocks exist?
Is there a semantically meaningful recombination operator?
52
Replacement
A method to determine which of the current members of the population, if any, should be replaced by the new solutions.
Generational updates
Steady state updates
53
Generational Updates Replacement
• Produce N children from a population of size N to form the population at the next time step; this new population of children completely replaces the parent population.
• A derived generational update scheme can also be used:
– (μ + λ)-update and (μ, λ)-update, where μ is the size of the parent population and λ is the number of children produced
– the best μ individuals from either the offspring population or the combined parent and offspring populations form the next generation
54
Steady state replacement
• New individuals are inserted in the population as soon as they are created, by replacing an existing member of the population:
– the worst or the oldest member
– tournament replacement: as tournament selection, but this time the less good solutions are picked more often than the good ones
– the most similar member
– elitism: never replace the best individuals in the population with an inferior solution, so the best solution is always available for reproduction; this makes it harder to escape from a local optimum
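A replace-the-worst steady-state update, the first option above, might be sketched as (illustrative names; fitness array as in the lecture's code):

```c
#include <assert.h>

/* Steady-state "replace the worst": find the index of the least-fit
   individual; a newly created child would then overwrite that slot. */
int worst_index(const int *fitness, int pop_size)
{
    int k, worst = 0;
    for (k = 1; k < pop_size; k++)
        if (fitness[k] < fitness[worst])
            worst = k;
    return worst;
}
```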
55
Simple GA Simulation
Optimisation problem: maximise the function f(x) = x^2,
with x between 0 and 31 (the objective function).
Code x using a binary representation: 10011_2 = 19_10
1·2^4 + 0·2^3 + 0·2^2 + 1·2^1 + 1·2^0 = 19
0 is then 00000; 31 is then 11111.
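The decoding above can be sketched in C (illustrative names): read the bit array most significant bit first, then apply the objective f(x) = x².

```c
#include <assert.h>

/* Decode a most-significant-bit-first bit array to an integer. */
int decode(const int *bits, int n)
{
    int i, x = 0;
    for (i = 0; i < n; i++)
        x = 2 * x + bits[i];
    return x;
}

/* Objective function f(x) = x^2 applied to a chromosome. */
int objective(const int *bits, int n)
{
    int x = decode(bits, n);
    return x * x;
}
```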
56
Simple GA Simulation: Initial Population
Let's assume that we want to create an initial population of 4.
Randomly flip a coin 20 times (population size of 4 × strings of 5).

String No.  Initial Population
1           01101
2           11000
3           01000
4           10011
57
Simple GA Simulation: Fitness Function
String No.  Initial Population  f(x)  Prob. = f_i / Sum
1           01101               169   0.14
2           11000               576   0.49
3           01000               64    0.06
4           10011               361   0.31
            Sum                 1170
            Average             293
            Max                 576
58
Simple GA Simulation: Roulette Wheel
String No.  Initial Population  f(x)  Prob.  Roulette Wheel
1           01101               169   0.14   1
2           11000               576   0.49   2
3           01000               64    0.06   0
4           10011               361   0.31   1
            Sum 1170   Average 293   Max 576
59
Simple GA Simulation: Mating Pool
String No.  Initial Population  f(x)  Prob.  Roulette Wheel  Mating Pool
1           01101               169   0.14   1               01101
2           11000               576   0.49   2               11000
3           01000               64    0.06   0               11000
4           10011               361   0.31   1               10011
            Sum 1170   Average 293   Max 576
60
Simple GA Simulation: Mate
String No.  Initial Population  f(x)  Prob.  Roulette Wheel  Mating Pool  Mate
1           01101               169   0.14   1               01101        2
2           11000               576   0.49   2               11000        1
3           01000               64    0.06   0               11000        4
4           10011               361   0.31   1               10011        3
            Sum 1170   Average 293   Max 576
Mate is randomly selected.
61
Simple GA Simulation: Crossover
String No.  Initial Population  f(x)  Prob.  RW  Mating Pool  Mate  Crossover
1           01101               169   0.14   1   01101        2     4
2           11000               576   0.49   2   11000        1     4
3           01000               64    0.06   0   11000        4     2
4           10011               361   0.31   1   10011        3     2
            Sum 1170   Average 293   Max 576
Crossover point is randomly selected.
62
Simple GA Simulation: New Population
String No.  Initial Population  f(x)  Prob.  RW  Mating Pool  Mate  Crossover  New Pop.
1           01101               169   0.14   1   01101        2     4          01100
2           11000               576   0.49   2   11000        1     4          11001
3           01000               64    0.06   0   11000        4     2          11011
4           10011               361   0.31   1   10011        3     2          10000
            Sum 1170   Average 293   Max 576
63
Simple GA Simulation: Mutation
String No.  Initial Population  f(x)  Prob.  RW  Mating Pool  Mate  Crossover  New Pop.
1           01101               169   0.14   1   01101        2     4          01100
2           11000               576   0.49   2   11000        1     4          11001
3           01000               64    0.06   0   11000        4     2          11011
4           10011               361   0.31   1   10011        3     2          10000
            Sum 1170   Average 293   Max 576
64
Simple GA Simulation: Mutation
String No.  Initial Population  f(x)  Prob.  RW  Mating Pool  Mate  Crossover  New Pop.
1           01101               169   0.14   1   01101        2     4          01100
2           11000               576   0.49   2   11000        1     4          10001
3           01000               64    0.06   0   11000        4     2          11011
4           10011               361   0.31   1   10011        3     2          10000
            Sum 1170   Average 293   Max 576
(A mutation has flipped the 2nd bit of string 2: 11001 became 10001.)
65
Simple GA Simulation: Evaluation New Population
String No.  Initial Population  f(x)  Prob.  RW  Mating Pool  Mate  Crossover  New Pop.  f(x)
1           01101               169   0.14   1   01101        2     4          01100     144
2           11000               576   0.49   2   11000        1     4          10001     289
3           01000               64    0.06   0   11000        4     2          11011     729
4           10011               361   0.31   1   10011        3     2          10000     256
            Sum 1170                                                                     1418
            Average 293                                                                  354
            Max 576                                                                      729
66
One Dimensional Optimization
Locate x, 0 <= x < 1, where the function F(x) is maximized.
F(x) = a1·e^-((x-c1)/s1)^2 + a2·e^-((x-c2)/s2)^2
a1, a2, c1, c2, s1, s2 are parameters.
s2 << s1 and a2 > a1, so the max is near x = c2 and hard to find.
67
A Function to Optimize
68
A Function to Optimize
69
A Function to Optimize: See Next Visual
70
A Function to Optimize: Zoomed in to See the Max
71
Fitness Calculation for 1D Function Optimization
#define sqr(X) ((X)*(X))

double F(double x)
{
    return a1*exp(-sqr((x-c1)/s1)) + a2*exp(-sqr((x-c2)/s2));
}

double fv(int who)
{
    int i;
    double x = 0.0;
    for( i = 0; i < N; i++ )
        x = (x + p[who][i])/2.0;
    return( F(x) );
}
72
Another Function to Optimize
Find the max of the “peaks” function:
z = f(x, y) = 3*(1-x)^2*exp(-(x^2) - (y+1)^2) - 10*(x/5 - x^3 - y^5)*exp(-x^2 - y^2) - 1/3*exp(-(x+1)^2 - y^2)
73
Derivatives of the “peaks” function
dz/dx = -6*(1-x)*exp(-x^2-(y+1)^2) - 6*(1-x)^2*x*exp(-x^2-(y+1)^2) - 10*(1/5-3*x^2)*exp(-x^2-y^2) + 20*(1/5*x-x^3-y^5)*x*exp(-x^2-y^2) - 1/3*(-2*x-2)*exp(-(x+1)^2-y^2)
dz/dy = 3*(1-x)^2*(-2*y-2)*exp(-x^2-(y+1)^2) + 50*y^4*exp(-x^2-y^2) + 20*(1/5*x-x^3-y^5)*y*exp(-x^2-y^2) + 2/3*y*exp(-(x+1)^2-y^2)
d(dz/dx)/dx = 36*x*exp(-x^2-(y+1)^2) - 18*x^2*exp(-x^2-(y+1)^2) - 24*x^3*exp(-x^2-(y+1)^2) + 12*x^4*exp(-x^2-(y+1)^2) + 72*x*exp(-x^2-y^2) - 148*x^3*exp(-x^2-y^2) - 20*y^5*exp(-x^2-y^2) + 40*x^5*exp(-x^2-y^2) + 40*x^2*exp(-x^2-y^2)*y^5 -2/3*exp(-(x+1)^2-y^2) - 4/3*exp(-(x+1)^2-y^2)*x^2 -8/3*exp(-(x+1)^2-y^2)*x
d(dz/dy)/dy = -6*(1-x)^2*exp(-x^2-(y+1)^2) + 3*(1-x)^2*(-2*y-2)^2*exp(-x^2-(y+1)^2) + 200*y^3*exp(-x^2-y^2)-200*y^5*exp(-x^2-y^2) + 20*(1/5*x-x^3-y^5)*exp(-x^2-y^2) - 40*(1/5*x-x^3-y^5)*y^2*exp(-x^2-y^2) + 2/3*exp(-(x+1)^2-y^2)-4/3*y^2*exp(-(x+1)^2-y^2)
74
GA process
[Figures: population scatter at the initial population, the 5th generation, and the 10th generation]
75
Performance profile
76
Setting the Parameters
How do we choose the “params”?
There is a statistical discipline: experimental design
For the present, we will treat this issue informally.
Try different settings and get the problem to work.
Then systematically try various settings.
77
Params Settings for Problem #1
print_the_params 1
POP_SIZE 200
tournament_size 2
print_every_hero 1
uniform 1
FITNESSES 10000
N 400
problem_number 1
MUT_RATE 0.002000
seed 0.000000
These settings worked! Result: stopped after 8824 fitness evals. Hero = 400.
78
Pop Sizes 5, 10, 100, 300, 1000 For Problem 1
5: too much exploitation. 1000: too much exploration.
79
Mutation Rates 0.001, 0.005, 0.007, 0.01 For Problem 1
0.01: too much exploration
80
Improving Performance
Although GAs are conceptually simple, to the newcomer the number of configuration choices can be overwhelming.
When a scientist using a GA for the first time receives less-than-satisfactory results, there are several steps that can be taken to improve search performance.
81
Improving Performance
The first approach to improving search performance is simply to use different values for the mutation rate, population size, etc.
Many times, search performance can be improved by making the optimization more stochastic or more hill-climbing in nature.
This trial and error approach, although time-consuming, will usually result in improved search performance.
If changing the configuration parameters has no effect on search performance a more fundamental problem may be the cause.
82
Improving Performance
A simple method of improving GA performance is to simply change the genetic representation.
Many researchers publish results showing that binary codings worked better for their application, while other researchers report different results.
Another simple modification which has been known to increase search performance for binary-coded problems is changing to a Gray coding.
If changing representations or codings does not work or if the application involves a form of feature selection, the problem may be more complicated.
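The Gray coding mentioned above is the standard reflected binary code; a minimal sketch of the conversion and its inverse (adjacent integers then differ in exactly one bit, which is the usual motivation for Gray-coded chromosomes):

```c
#include <assert.h>

/* Standard reflected binary -> Gray code: XOR each bit with the
   bit above it. */
unsigned gray_encode(unsigned b)
{
    return b ^ (b >> 1);
}

/* Inverse: fold the Gray bits back down with repeated XOR. */
unsigned gray_decode(unsigned g)
{
    unsigned b = 0;
    for (; g; g >>= 1)
        b ^= g;
    return b;
}
```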