g5baim artificial intelligence methods graham kendall genetic algorithms

45
G5BAIM Artificial Intelligence Methods Graham Kendall Genetic Algorithms

Upload: demarcus-cossey

Post on 01-Apr-2015

225 views

Category:

Documents


1 download

TRANSCRIPT

Page 1: G5BAIM Artificial Intelligence Methods Graham Kendall Genetic Algorithms

G5BAIMArtificial Intelligence Methods

Graham KendallGenetic Algorithms

Page 2: G5BAIM Artificial Intelligence Methods Graham Kendall Genetic Algorithms

Charles Darwin 1809 - 1882

"A man who dares to waste an hour of life has not discovered the value of life"

G5BAIM Genetic AlgorithmsG5BAIM Genetic Algorithms

Page 3: G5BAIM Artificial Intelligence Methods Graham Kendall Genetic Algorithms

G5BAIM Genetic AlgorithmsG5BAIM Genetic Algorithms

Genetic Algorithms

• Based on survival of the fittest

• Developed extensively by John Holland in mid 70’s

• Based on a population based approach

• Can be run on parallel machines

• Only the evaluation function has domain knowledge

• Can be implemented as three modules; the evaluation module, the population module and the reproduction module.

• Solutions (individuals) often coded as bit strings

• Algorithm uses terms from genetics; population, chromosome and gene

Page 4: G5BAIM Artificial Intelligence Methods Graham Kendall Genetic Algorithms

G5BAIM Genetic AlgorithmsG5BAIM Genetic Algorithms

GA Algorithm• Initialise a population of chromosomes

• Evaluate each chromosome (individual) in the population

• Create new chromosomes by mating chromosomes in the current population (using crossover and mutation)

• Delete members of the existing population to make way for the new members

• Evaluate the new members and insert them into the population

• Repeat stage 2 until some termination condition is reached (normally based on time or number of populations produced)

• Return the best chromosome as the solution

Page 5: G5BAIM Artificial Intelligence Methods Graham Kendall Genetic Algorithms

G5BAIM Genetic AlgorithmsG5BAIM Genetic Algorithms

GA Algorithm - Evaluation Module

• Responsible for evaluating a chromosome

• Only part of the GA that has any knowledge

about the problem. The rest of the GA

modules are simply operating on (typically)

bit strings with no information about the

problem

• A different evaluation module is needed for

each problem

Page 6: G5BAIM Artificial Intelligence Methods Graham Kendall Genetic Algorithms

G5BAIM Genetic AlgorithmsG5BAIM Genetic Algorithms

GA Algorithm - Population Module

• Responsible for maintaining the population

• Initilisation– Random

– Known Solutions

Page 7: G5BAIM Artificial Intelligence Methods Graham Kendall Genetic Algorithms

G5BAIM Genetic AlgorithmsG5BAIM Genetic Algorithms

GA Algorithm - Population Module

• Deletion– Delete-All : Deletes all the members of the current

population and replaces them with the same number of chromosomes that have just been created

– Steady-State : Deletes n old members and replaces them with n new members; n is a parameterBut do you delete the worst individuals, pick them at random or delete the chromosomes that you used as parents?

– Steady-State-No-Duplicates : Same as steady-state but checks that no duplicate chromosomes are added to the population. This adds to the computational overhead but can mean that more of the search space is explored

Page 8: G5BAIM Artificial Intelligence Methods Graham Kendall Genetic Algorithms

G5BAIM Genetic AlgorithmsG5BAIM Genetic Algorithms

GA Parent Selection - Roulette Wheel

Chromosome 1 2 3 4 5 6 7 8 9 10Fitness 12 18 15 17 3 136 12 15 20 15

Running Total 12 30 45 62 65 201 213 228 248 263

1 2 3 4 5 6 7 8 9 10

Random Number 234 156 8 174 219 255 143 94 210 31

Chromsome Chosen 9 6 #N/A 6 8 10 6 6 7 3

NotesPress F9 to regenerate random numbersNotice how C6 tends to dominate due to its dominant fitness

Roulette Wheel Selection

•Sum the fitnesses of all the population members, TF

•Generate a random number, m, between 0 and TF

•Return the first population member whose fitness added to the preceding population members is greater than or equal to m

Page 9: G5BAIM Artificial Intelligence Methods Graham Kendall Genetic Algorithms

G5BAIM Genetic AlgorithmsG5BAIM Genetic Algorithms

GA Parent Selection - Tournament

• Select a pair of individuals at random. Generate a random number, R, between 0 and 1. If R < r use the first individual as a parent. If the R >= r then use the second individual as the parent. This is repeated to select the second parent. The value of r is a parameter to this method

• Select two individuals at random. The individual with the highest evaluation becomes the parent. Repeat to find a second parent

Page 10: G5BAIM Artificial Intelligence Methods Graham Kendall Genetic Algorithms

G5BAIM Genetic AlgorithmsG5BAIM Genetic Algorithms

GA Fitness Techniques

• Fitness-Is-Evaluation : Simply have the fitness of the

chromosome equal to its evaluation

• Windowing : Takes the lowest evaluation and assigns each

chromosome a fitness equal to the amount it exceeds this

minimum.

• Linear Normalization : The chromosomes are sorted by

decreasing evaluation value. Then the chromosomes are

assigned a fitness value that starts with a constant value and

decreases linearly. The initial value and the decrement are

parameters to the techniques

Page 11: G5BAIM Artificial Intelligence Methods Graham Kendall Genetic Algorithms

G5BAIM Genetic AlgorithmsG5BAIM Genetic Algorithms

GA Population Module - Parameters

• Population Size

• Elitism

Page 12: G5BAIM Artificial Intelligence Methods Graham Kendall Genetic Algorithms

G5BAIM Genetic AlgorithmsG5BAIM Genetic Algorithms

GA Reproduction - Crossover Operators

One Point Crossover inOne Point Crossover inGenetic AlgorithmsGenetic Algorithms

© Graham Kendall

[email protected]

http://cs.nott.ac.uk/~gxk

Partially MatchedPartially MatchedCrossover (PMX) inCrossover (PMX) inGenetic AlgorithmsGenetic Algorithms

© Graham Kendall

[email protected]

http://cs.nott.ac.uk/~gxk

Uniform Crossover inUniform Crossover inGenetic AlgorithmsGenetic Algorithms

© Graham Kendall

[email protected]

http://cs.nott.ac.uk/~gxk

Order Based Crossover

Cycle Crossover

Page 13: G5BAIM Artificial Intelligence Methods Graham Kendall Genetic Algorithms

G5BAIM Genetic AlgorithmsG5BAIM Genetic Algorithms

GA Example

• Crossover probability, PC = 1.0• Mutation probability, PM = 0.0• Maximise f(x) = x3 - 60 * x2 + 900 * x +100• 0 <= x >= 31• x can be represented using five binary digits

0 100

1 9412 16683 22874 28045 32256 35567 38038 39729 4069

10 410011 407112 398813 385714 368415 347516 323617 297318 269219 239920 210021 180122 150823 122724 96425 72526 51627 34328 21229 12930 100

0

500

1000

1500

2000

2500

3000

3500

4000

4500

1 3 5 7 9

11

13

15

17

19

21

23

25

27

29

31

33

35

37

Max : x = 10

f(x) = x^3 - 60x^2 + 900x + 100

Page 14: G5BAIM Artificial Intelligence Methods Graham Kendall Genetic Algorithms

G5BAIM Genetic AlgorithmsG5BAIM Genetic Algorithms

GA Example

• Generate random individuals

Chromosome Binary String x f(x)P1 11100 28 212P2 01111 15 3475P3 10111 23 1227P4 00100 4 2804

TOTAL 7718AVERAGE 1929.50

Page 15: G5BAIM Artificial Intelligence Methods Graham Kendall Genetic Algorithms

G5BAIM Genetic AlgorithmsG5BAIM Genetic Algorithms

GA Example

• Choose Parents, using roulette wheel selection

• Crossover point is chosen randomly

Roulette Wheel Parent Chosen Crossover Point4116 P3 N/A1915 P2 1

Page 16: G5BAIM Artificial Intelligence Methods Graham Kendall Genetic Algorithms

G5BAIM Genetic AlgorithmsG5BAIM Genetic Algorithms

GA Example - Crossover

1 0 1 1 1

0 1 1 1 1

P3

P2

1 1 1 1 1

0 0 1 1 1

C1

C2

0 0 1 0 0

0 1 1 1 1

P4

P2

0 0 1 1 1

0 1 1 0 0

C3

C4

Page 17: G5BAIM Artificial Intelligence Methods Graham Kendall Genetic Algorithms

G5BAIM Genetic AlgorithmsG5BAIM Genetic Algorithms

GA Example - After First Round of Breeding• The average evaluation has risen• P2, was the strongest individual in the initial

population. It was chosen both times but we have lost it from the current population

• We have a value of x=7 in the population which is the closest value to 10 we have found

Chromosome Binary String x f(x)P1 11111 31 131P2 00111 7 3803

P3 00111 7 3803P4 01100 12 3998

TOTAL 11735AVERAGE 2933.75

Page 18: G5BAIM Artificial Intelligence Methods Graham Kendall Genetic Algorithms

G5BAIM Genetic AlgorithmsG5BAIM Genetic Algorithms

GA Example - Question?

• Assume the initial population was 17, 21, 4 and 28. Using the same GA methods we used above (PC = 1.0, PM = 0.0), what chance is there of finding the global optimum?

• The answer is in the handout - but try it first

Page 19: G5BAIM Artificial Intelligence Methods Graham Kendall Genetic Algorithms

G5BAIM Genetic AlgorithmsG5BAIM Genetic Algorithms

GA Example - Mutation

• A method of ensuring premature convergence does not occur

• Usually set to a small value

• Dynamic mutation and crossover rates

Page 20: G5BAIM Artificial Intelligence Methods Graham Kendall Genetic Algorithms

G5BAIM Genetic AlgorithmsG5BAIM Genetic Algorithms

GA - Schema Theorem - Introduction

• Developed by John Holland

• Question : How likely is a schema to survive from one generation to the next?

• Question : How many schema are likely to be present in the next generation?

Page 21: G5BAIM Artificial Intelligence Methods Graham Kendall Genetic Algorithms

G5BAIM Genetic AlgorithmsG5BAIM Genetic Algorithms

GA - Schema Theorem - What is a Schema?

0 0 0 1 0 1 1

0 1 0 0 0 0 1

* * 0 * 0 * 1

C1

C2

Schema

Another Schema * * * * * * 1

Page 22: G5BAIM Artificial Intelligence Methods Graham Kendall Genetic Algorithms

G5BAIM Genetic AlgorithmsG5BAIM Genetic Algorithms

GA - Schema Theorem - Implicit Parallelism

• If a chromosome is of length n then it contains 3n schemata (as each position can have the value 0, 1 or *)

• In theory, this means that for a population of M individuals we are evaluating up to M3n schemata

• But, bear in mind that some schemata will not be represented and others will overlap with other schemata

• This is exactly what we want. We eventually want to create a population that is full of fitter schemata and we will have lost weaker schemata

• It is the fact that we are manipulating M individuals but M3n schemata that gives genetic algorithms what has been called implicit parallelism

Page 23: G5BAIM Artificial Intelligence Methods Graham Kendall Genetic Algorithms

G5BAIM Genetic AlgorithmsG5BAIM Genetic Algorithms

GA - Schema Theorem - Definitions

• Length : is defined as the distance between the start of the schema and the end of the schema minus one (Goldberg, 1989)

• Order : is defined as the number of defined positions

• Fitness Ratio : is defined as the ratio of the fitness of a schema to the average fitness of the population

* * 0 * 0 * 1 Length = 6

Order = 3

Page 24: G5BAIM Artificial Intelligence Methods Graham Kendall Genetic Algorithms

G5BAIM Genetic AlgorithmsG5BAIM Genetic AlgorithmsGA - Schema Theorem - Intuition about length

• The longer the length of the schema, the more chance there is of the schema being disrupted by a crossover operation

• This implies that shorter schemata have a better chance of surviving from one generation to the next

• In turn, this implies that if we know that certain attributes of a problem fit well together then these should be placed as close as possible together in the coding

Page 25: G5BAIM Artificial Intelligence Methods Graham Kendall Genetic Algorithms

G5BAIM Genetic AlgorithmsG5BAIM Genetic Algorithms

GA - Schema Theorem - Intuition about order

• This observation is also true for the order of the chromosome. If we are not worried about the number of defined positions (i.e. we allow as many ‘*’ as possible) then a crossover operation has less chance of disrupting good schemata

• Intuitively, it would seem better to have short, low-order schema

• This is only based on empirical evidence but it is widely believed that these assumptions are true and the following theory makes some sense of this

Page 26: G5BAIM Artificial Intelligence Methods Graham Kendall Genetic Algorithms

G5BAIM Genetic AlgorithmsG5BAIM Genetic Algorithms

GA - Schema Theorem

• Using a technique where we choose parents relative to their fitness (e.g. roulette wheel selection), fitter schema should find their way from one generation to another

• Intuitively, if a schema is fitter than average then it should not only survive to the next generation but should also increase its presence in the population

• If is the number of instances of any particular schema S within the population at time t, then at t+1 we would expect

(S, t +1) > (S)to hold for above average fitness schemata

Page 27: G5BAIM Artificial Intelligence Methods Graham Kendall Genetic Algorithms

G5BAIM Genetic AlgorithmsG5BAIM Genetic Algorithms

GA - Schema Theorem - Number of Schema

• Going one stage further we can estimate the number of schema present at t +1

if

SfntStS

)(),()1,(

n is the size of the populationf(S) is the fitness of the schemafi is the fitness of the population

avgf

SftStS

)(),()1,(

favg is the average fitness of the population

Page 28: G5BAIM Artificial Intelligence Methods Graham Kendall Genetic Algorithms

G5BAIM Genetic AlgorithmsG5BAIM Genetic Algorithms

GA - Schema Theorem - Reproduction of Schema

• If a particular schema stays a constant, c, above the average we can say even more about the effects of reproduction

avg

avgavg

f

cfftStS

)(),()1,(

= (S, t)(1 + c)

(S, t) = (S, t)(1 + c)t

• Setting t=0

• Notice that the number of schema rises exponentially

Page 29: G5BAIM Artificial Intelligence Methods Graham Kendall Genetic Algorithms

G5BAIM Genetic AlgorithmsG5BAIM Genetic Algorithms

Probability of non-disruption through crossover

• Given a schema, what is the probability of it not being disrupted by a crossover operation?

1

)(1

n

SlPP

CNC

PC is the probability of crossover,l(s) is the length of the schema,n is the length of the chromosome

Page 30: G5BAIM Artificial Intelligence Methods Graham Kendall Genetic Algorithms

G5BAIM Genetic AlgorithmsG5BAIM Genetic Algorithms

Probability of non-disruption through crossover

• l(s) = 4 and n = 11• Assume PC = 1

• The probability of the schema being disrupted by a crossover operation is 1- 1 x 4 / 10 = 0.6

• We can easily confirm this by seeing that there are six crossover positions, of a possible ten (we assume we do not pick crossover points at the “outside”) that will not disrupt the schema

0 1 1 1 0 0 1 0 0 1 1

1

)(1

n

SlPP

CNC

Page 31: G5BAIM Artificial Intelligence Methods Graham Kendall Genetic Algorithms

G5BAIM Genetic AlgorithmsG5BAIM Genetic Algorithms

Probability of non-disruption through crossover

• But what if we crossover this schema with one that is the same?

0 1 1 1 0 0 1 0 0 1 1

1 0 1 1 0 0 1 1 1 0 1

Page 32: G5BAIM Artificial Intelligence Methods Graham Kendall Genetic Algorithms

G5BAIM Genetic AlgorithmsG5BAIM Genetic Algorithms

Probability of non-disruption through crossover

The probability that the schema in the other parent is an instance of a different schema is given by

(1-P[S,t])

where P[S, t] is the probability that the schema in the other parent is the same as the schema in the initial parent

We need to do is multiply our original definition of PNC by the probability it is an instance of a different schema

]),[1(1

)(1 tSP

n

SlPP

CNC

Page 33: G5BAIM Artificial Intelligence Methods Graham Kendall Genetic Algorithms

G5BAIM Genetic AlgorithmsG5BAIM Genetic Algorithms

Probability of non-disruption through crossover

0 1 1 1 0 0 1 0 0 1 1

PC = 1l(s) = 4n = 11P[S, t] = 1 (i.e. the other parent’s schema is the same as the initial parent – therefore we would expect the schema to appear in the next population)

]),[1(1

)(1 tSP

n

SlPP

CNC

= 1

P[S, t] = 0 ]),[1(1

)(1 tSP

n

SlPP

CNC

= 0.6

Page 34: G5BAIM Artificial Intelligence Methods Graham Kendall Genetic Algorithms

G5BAIM Genetic AlgorithmsG5BAIM Genetic Algorithms

Probability of non-disruption through mutation

0 1 1 * * 0 1 0 0 1 1

• As mutation can be applied to all the genes in a chromosome we do not need worry about the length of the chromosome, nor do we need worry about the length of the schema

• We are concerned with the order

• For example, a schema of length 4 but only of order 2. It is only the bits that are defined within the schema that are of concern to us. The “don’t care” (*’s) can be mutated without affecting the schema

Page 35: G5BAIM Artificial Intelligence Methods Graham Kendall Genetic Algorithms

G5BAIM Genetic AlgorithmsG5BAIM Genetic Algorithms

Probability of non-disruption through mutation

0 1 1 * * 0 1 0 0 1 1

• The probability of a single bit within a schema surviving mutation is

1 - PM

• The probability of surviving mutation is

(1 - PM)K(S)

• which can be approximated to

1 - PMK(S) ≈ 1 – K(S)PM

Page 36: G5BAIM Artificial Intelligence Methods Graham Kendall Genetic Algorithms

G5BAIM Genetic AlgorithmsG5BAIM Genetic Algorithms

Probability of non-disruption through mutation

0 1 1 * * 0 1 0 0 1 1

Assume PM = 0.01 then the probability of the above schema surviving is

(1 - PM)K(S) = (1 - 0.01)3 = 0.97

If the schema had a higher order, say K(S) = 100, then the probability of the schema surviving

(1 - PM)K(S) = (1 - 0.01)100 = 0.366

demonstrating that short schema have a better chance of surviving

Page 37: G5BAIM Artificial Intelligence Methods Graham Kendall Genetic Algorithms

G5BAIM Genetic AlgorithmsG5BAIM Genetic Algorithms

Schema Theory

Assume PM = 0.01 then the probability of the above schema surviving is

MK(S)P]),[1(

1

)(1

)(),()1,( tSP

n

SlP

f

SftStS

C

avg

Number of schema present at t

Probability of schema surviving crossover

Probability of schema surviving mutation

Page 38: G5BAIM Artificial Intelligence Methods Graham Kendall Genetic Algorithms

G5BAIM Genetic AlgorithmsG5BAIM Genetic Algorithms

Schema Theory - Try it

Parameters (you need to supply)Probability of Crossover - P C 1.00

Probability of Mutation - P M 0.01

Length of Schema - l(S) 4.00Order of Schema - O(S) 5.00Length of Chromosome - n 100.00Probability of Schema's being the same 0.00

Fitness Ratio - f(S) / f avg 9.88

Instances of S at time t 10.00Constant - c 3.00

Reproduction Formulae

Expected Instances of Schema at t+1 98.75

How instances of schema rises exponentially with timeTime 10.00

1 40.002 160.003 640.004 2560.005 10240.00

Crossover FormulaeProbability of non-disruption of schema 0.96

Mutation FormulaeProbability of non-disruption of schema 0.95Approximation 0.95

Page 39: G5BAIM Artificial Intelligence Methods Graham Kendall Genetic Algorithms

G5BAIM Genetic AlgorithmsG5BAIM Genetic Algorithms

Coding Schemes

• When applying a GA to a problem one of the decisions we have to make is how to represent the problem

• The classic approach is to use bit strings and there are still some people who argue that unless you use bit strings then you have moved away from a GA

• Bit strings are useful as• How do you represent and define a neighbourhood

for real numbers?• How do you cope with invalid solutions?

• Bit strings seem like a good coding scheme if we can represent our problem using this notation

Page 40: G5BAIM Artificial Intelligence Methods Graham Kendall Genetic Algorithms

G5BAIM Genetic AlgorithmsG5BAIM Genetic Algorithms

Coding Schemes

Gray codes have the property that adjacent integers only differ in one bit position. Take, for example, decimal 3. To move to decimal 4, using

binary representation, we have to change all three bits. Using the gray code only one bit changes

Decimal Binary Gray Code0 000 0001 001 0012 010 0113 011 0104 100 1105 101 1116 110 1017 111 100

Page 41: G5BAIM Artificial Intelligence Methods Graham Kendall Genetic Algorithms

G5BAIM Genetic AlgorithmsG5BAIM Genetic Algorithms

Coding Schemes

• Hollstien, 1971 investigated the use of GAs for optimizing functions of two variables and claimed that a Gray code representation worked slightly better than the binary representation

• He attributed this difference to the adjacency property of Gray codes

• In general, adjacent integers in the binary representaion often lie many bit flips apart (as shown with 3 and 4)

• This fact makes it less likely that a mutation operator can effect small changes for a binary-coded chromosome

Page 42: G5BAIM Artificial Intelligence Methods Graham Kendall Genetic Algorithms

G5BAIM Genetic AlgorithmsG5BAIM Genetic Algorithms

Coding Schemes

• A Gray code representation seems to improve a mutation operator's chances of making incremental improvements. Why?

• In a binary-coded string of length N, a single mutation in the most significant bit (MSB) alters the number by 2N-1

• In a Gray-coded string, fewer mutations lead to a change this large

1 1 0 0 1 1 = 51

0 1 0 0 1 1 = 19

2N-1 = 32

Page 43: G5BAIM Artificial Intelligence Methods Graham Kendall Genetic Algorithms

G5BAIM Genetic AlgorithmsG5BAIM Genetic Algorithms

Coding Schemes

The use of Gray codes does pay a price for this feature. The "fewer mutations" which lead to large changes, lead to much larger changes

In the Gray code illustrated above, for example, a single mutation of the left-most bit changes a zero to a seven and vice-versa, while the largest change a single mutation can make to a corresponding binary-coded individual is always four

However most mutations will make only small changes, while the occasional mutation that effects a truly big change may allow exploration of a new area of the search space

Page 44: G5BAIM Artificial Intelligence Methods Graham Kendall Genetic Algorithms

G5BAIM Genetic AlgorithmsG5BAIM Genetic Algorithms

Coding Schemes

• The algorithm for converting between the Gray code described above (there are others) and the decimal binary representation is as follows

• Label the bits of a binary-coded string B[i], where larger i's represent more significant bits

• Label the corresponding Gray-coded string G[i]• Convert one to the other as follows

• Copy the most significant bit• For each smaller i do G[i] = XOR(B[i+1], B[i]) (to

convert binary to Gray)Or• B[i] = XOR(B[i+1], G[i]) (to convert Gray to

binary)

Page 45: G5BAIM Artificial Intelligence Methods Graham Kendall Genetic Algorithms

G5BAIMArtificial Intelligence Methods

Graham KendallEnd of Genetic Algorithms