Evolutionary Optimization Algorithms with Tree-Structured Encoding

Yuehui Chen (陈月辉), Computational Intelligence Lab, University of Jinan
yhchen@ujn.edu.cn, http://cilab.ujn.edu.cn
Jinan, 2007

Genetic Programming
₪ Developed: USA in the 1990s
₪ Early names: J. Koza
₪ Typically applied to:
■ machine learning tasks (prediction, classification, …)
₪ Attributed features:
■ competes with neural nets and the like
■ needs huge populations (thousands)
■ slow
₪ Special:
■ non-linear chromosomes: trees, graphs
■ mutation possible but not necessary (disputed!)

GP technical summary tableau

Representation       Tree structures
Recombination        Exchange of subtrees
Mutation             Random change in trees
Parent selection     Fitness proportional
Survivor selection   Generational replacement

Introductory example: credit scoring
₪ Bank wants to distinguish good from bad loan applicants
₪ Model needed that matches historical data

ID     No. of children   Salary   Marital status   OK?
ID-1   2                 85000    Married          1
ID-2   0                 30000    Single           0
ID-3   1                 40000    Divorced         0
…

Introductory example: credit scoring
₪ A possible model:
  IF (NOC = 2) AND (S > 80000) THEN good ELSE bad
₪ In general:
  IF formula THEN good ELSE bad
₪ Only unknown is the right formula, hence
₪ Our search space (phenotypes) is the set of formulas
₪ Natural fitness of a formula: percentage of well classified cases of the model it stands for
₪ Natural representation of formulas (genotypes) is: parse trees

Introductory example: credit scoring
IF (NOC = 2) AND (S > 80000) THEN good ELSE bad
can be represented by the following tree

          AND
        /     \
       =       >
      / \     / \
   NOC   2   S   80000
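The tree for this rule can be evaluated recursively, and the "natural fitness" (percentage of well classified cases) computed from it. A minimal sketch, assuming a nested-tuple encoding and the hypothetical helper names `evaluate` and `fitness`; the three records are the example rows of the credit-scoring table:

```python
# Parse tree for (NOC = 2) AND (S > 80000) as nested tuples (op, left, right).
# The tuple encoding and the helper names are illustrative assumptions.
RULE = ("AND", ("=", "NOC", 2), (">", "S", 80000))

# Example rows from the credit-scoring table: children, salary, OK? flag.
DATA = [
    {"NOC": 2, "S": 85000, "OK": 1},
    {"NOC": 0, "S": 30000, "OK": 0},
    {"NOC": 1, "S": 40000, "OK": 0},
]

def evaluate(node, row):
    """Recursively evaluate a parse tree on one applicant record."""
    if isinstance(node, str):          # variable terminal
        return row[node]
    if not isinstance(node, tuple):    # constant terminal
        return node
    op, left, right = node
    a, b = evaluate(left, row), evaluate(right, row)
    if op == "AND":
        return a and b
    if op == "=":
        return a == b
    if op == ">":
        return a > b
    raise ValueError(f"unknown operator {op!r}")

def fitness(tree, data):
    """Percentage of records whose predicted class matches the OK? column."""
    hits = sum(int(bool(evaluate(tree, row))) == row["OK"] for row in data)
    return 100.0 * hits / len(data)

print(fitness(RULE, DATA))
```

On these three rows the rule classifies every record correctly; a real run would score each candidate formula on the full historical data set.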

Tree based representation
₪ Trees are a universal form, e.g. consider
₪ Arithmetic formula: $2\pi + \left((x + 3) - \frac{y}{5 + 1}\right)$
₪ Logical formula: $(x \wedge \mathrm{true}) \rightarrow ((x \vee y) \vee (z \leftrightarrow (x \wedge y)))$
₪ Program:
  i = 1;
  while (i < 20) { i = i + 1 }

Tree based representation
[Figure: parse tree of the arithmetic formula $2\pi + \left((x + 3) - \frac{y}{5 + 1}\right)$]

Tree based representation
[Figure: parse tree of the logical formula $(x \wedge \mathrm{true}) \rightarrow ((x \vee y) \vee (z \leftrightarrow (x \wedge y)))$]

Tree based representation
[Figure: parse tree of the program i = 1; while (i < 20) { i = i + 1 }]

Tree based representation
₪ In GA, ES, EP chromosomes are linear structures (bit strings, integer strings, real-valued vectors, permutations)
₪ Tree shaped chromosomes are non-linear structures
₪ In GA, ES, EP the size of the chromosomes is fixed
₪ Trees in GP may vary in depth and width

Tree based representation
₪ Symbolic expressions can be defined by
■ Terminal set T
■ Function set F (with the arities of function symbols)
₪ Adopting the following general recursive definition:
■ Every $t \in T$ is a correct expression
■ $f(e_1, \dots, e_n)$ is a correct expression if $f \in F$, $arity(f) = n$ and $e_1, \dots, e_n$ are correct expressions
■ There are no other forms of correct expressions
₪ In general, expressions in GP are not typed (closure property: any $f \in F$ can take any $g \in F$ as argument)
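The recursive definition translates almost line by line into a validity check. A minimal sketch, assuming nested tuples `(f, e1, ..., en)` for function applications; the concrete symbols in `F` and `T` are illustrative and not taken from the slides:

```python
# Function set F with arities and terminal set T (assumed example symbols).
F = {"+": 2, "*": 2, "sin": 1}
T = {"x", "y", 1.0}

def is_correct(expr):
    """Apply the recursive definition of a correct expression."""
    if not isinstance(expr, tuple):
        return expr in T                       # rule 1: every t in T
    if not expr:
        return False
    f, args = expr[0], expr[1:]
    return (f in F and F[f] == len(args)       # rule 2: f in F, arity(f) = n
            and all(is_correct(e) for e in args))

print(is_correct(("+", "x", ("sin", "y"))))    # well-formed expression
print(is_correct(("sin", "x", "y")))           # arity of sin violated
```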

Offspring creation scheme
₪ Compare
₪ GA scheme using crossover AND mutation sequentially (be it probabilistically)
₪ GP scheme using crossover OR mutation (chosen probabilistically)
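The difference between the two schemes is easiest to see side by side. A minimal sketch, assuming the variation operators are passed in as plain functions; the names `ga_offspring`, `gp_offspring`, `pc` and `pm` are hypothetical:

```python
import random

def ga_offspring(p1, p2, crossover, mutate, pc, rng):
    """GA scheme: crossover (with probability pc), THEN mutation on the result."""
    child = crossover(p1, p2) if rng.random() < pc else p1
    return mutate(child)        # mutate() itself may act only probabilistically

def gp_offspring(p1, p2, crossover, mutate, pm, rng):
    """GP scheme: EITHER mutation (probability pm) OR crossover, never both."""
    if rng.random() < pm:
        return mutate(p1)
    return crossover(p1, p2)
```

With `pm = 0` the GP scheme reduces to pure crossover.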

Flowchart
[Figures: GA flowchart and GP flowchart]

Mutation
₪ Most common mutation: replace randomly chosen subtree by randomly generated tree

Mutation cont'd
₪ Mutation has two parameters:
■ Probability $p_m$ to choose mutation vs. recombination
■ Probability to choose an internal point as the root of the subtree to be replaced
₪ Remarkably, $p_m$ is advised to be 0 (Koza '92) or very small, like 0.05 (Banzhaf et al. '98)
₪ The size of the child can exceed the size of the parent
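Subtree mutation with its second parameter can be sketched as follows. This is an illustration under assumptions: trees are nested tuples `(function, child, ...)`, `random_tree` is a caller-supplied generator, and `p_internal` is a hypothetical name for the probability of picking an internal point.

```python
import random

def paths(tree, prefix=()):
    """Yield the path (tuple of child indices) of every node in the tree."""
    yield prefix
    if isinstance(tree, tuple):
        for i, child in enumerate(tree[1:], start=1):
            yield from paths(child, prefix + (i,))

def get(tree, path):
    """Return the subtree found by following path from the root."""
    for i in path:
        tree = tree[i]
    return tree

def replace(tree, path, new):
    """Return a copy of tree with the subtree at path replaced by new."""
    if not path:
        return new
    i = path[0]
    return tree[:i] + (replace(tree[i], path[1:], new),) + tree[i + 1:]

def subtree_mutation(tree, random_tree, p_internal, rng):
    """Replace a randomly chosen subtree by a randomly generated tree."""
    all_paths = list(paths(tree))
    internal = [p for p in all_paths if isinstance(get(tree, p), tuple)]
    leaves = [p for p in all_paths if not isinstance(get(tree, p), tuple)]
    pool = internal if internal and rng.random() < p_internal else leaves
    return replace(tree, rng.choice(pool), random_tree())
```

Because the new subtree is generated independently of the one it replaces, the child can be larger than its parent.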

Recombination = Crossover
₪ Most common recombination: exchange two randomly chosen subtrees among the parents
₪ Recombination has two parameters:
■ Probability $p_c$ to choose recombination vs. mutation
■ Probability to choose an internal point within each parent as crossover point
₪ The size of offspring can exceed that of the parents
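Subtree crossover can be sketched in a few lines. This is a simplified illustration: crossover points are chosen uniformly over all nodes (the internal-point probability is left out), trees are assumed to be nested tuples, and all helper names are hypothetical.

```python
import random

def paths(tree, prefix=()):
    """Yield the path (tuple of child indices) of every node in the tree."""
    yield prefix
    if isinstance(tree, tuple):
        for i, child in enumerate(tree[1:], start=1):
            yield from paths(child, prefix + (i,))

def get(tree, path):
    for i in path:
        tree = tree[i]
    return tree

def replace(tree, path, new):
    """Return a copy of tree with the subtree at path replaced by new."""
    if not path:
        return new
    i = path[0]
    return tree[:i] + (replace(tree[i], path[1:], new),) + tree[i + 1:]

def size(tree):
    return sum(1 for _ in paths(tree))

def subtree_crossover(p1, p2, rng):
    """Swap one randomly chosen subtree of p1 with one of p2."""
    a = rng.choice(list(paths(p1)))
    b = rng.choice(list(paths(p2)))
    return replace(p1, a, get(p2, b)), replace(p2, b, get(p1, a))
```

The two children together contain exactly as many nodes as the two parents, but a single child may be larger than either parent.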

Crossover
[Figure: Parent 1 and Parent 2 exchange subtrees, giving Child 1 and Child 2]

Selection
₪ Parent selection typically fitness proportionate
₪ Over-selection in very large populations
■ rank population by fitness and divide it into two groups:
■ group 1: best x% of population, group 2: other (100-x)%
■ 80% of selection operations choose from group 1, 20% from group 2
■ for pop. size = 1000, 2000, 4000, 8000: x = 32%, 16%, 8%, 4%
■ motivation: to increase efficiency; the %'s come from a rule of thumb
₪ Survivor selection:
■ Typical: generational scheme (thus none)
■ Recently steady-state is becoming popular for its elitism
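Over-selection is straightforward to sketch. Two assumptions are made here that the slide does not state: higher fitness is better, and the pick inside each group is uniform. The name `over_select` is hypothetical.

```python
import random

# Rule-of-thumb share of group 1 (in %), keyed by population size.
X_PERCENT = {1000: 32, 2000: 16, 4000: 8, 8000: 4}

def over_select(population, fitness, rng):
    """Pick one parent: 80% of picks from the best x%, 20% from the rest."""
    ranked = sorted(population, key=fitness, reverse=True)   # best first
    cut = len(ranked) * X_PERCENT[len(ranked)] // 100
    group = ranked[:cut] if rng.random() < 0.8 else ranked[cut:]
    return rng.choice(group)
```

For a population of 1000 this concentrates 80% of the selection effort on the top 320 individuals.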

Initialization
₪ Maximum initial depth of trees $D_{max}$ is set
₪ Full method (each branch has depth $= D_{max}$):
■ nodes at depth $d < D_{max}$ randomly chosen from function set F
■ nodes at depth $d = D_{max}$ randomly chosen from terminal set T
₪ Grow method (each branch has depth $\le D_{max}$):
■ nodes at depth $d < D_{max}$ randomly chosen from $F \cup T$
■ nodes at depth $d = D_{max}$ randomly chosen from T
₪ Common GP initialisation: ramped half-and-half, where grow & full method each deliver half of initial population
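The full and grow methods differ only in which set is sampled above the maximum depth. A minimal sketch, assuming nested-tuple trees with the root at depth 0 and illustrative symbol sets; the depth ramp of "ramped" half-and-half is not shown, since the slide only specifies the half/half split:

```python
import random

F = {"+": 2, "*": 2, "sin": 1}   # function set with arities (assumed symbols)
T = ["x", "y", 1.0]              # terminal set (assumed symbols)

def gen_tree(d_max, method, rng, depth=0):
    """Build one tree with the 'full' or 'grow' method."""
    if depth == d_max:
        return rng.choice(T)                    # d = Dmax: terminals only
    if method == "full":
        symbol = rng.choice(list(F))            # d < Dmax: functions only
    else:
        symbol = rng.choice(list(F) + T)        # grow: F union T
    if symbol not in F:
        return symbol
    return (symbol,) + tuple(gen_tree(d_max, method, rng, depth + 1)
                             for _ in range(F[symbol]))

def depth_of(tree):
    if not isinstance(tree, tuple):
        return 0
    return 1 + max(depth_of(c) for c in tree[1:])

def half_and_half(pop_size, d_max, rng):
    """Grow and full each deliver half of the initial population."""
    return [gen_tree(d_max, "full" if i % 2 else "grow", rng)
            for i in range(pop_size)]
```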

Bloat
₪ Bloat = “survival of the fattest”, i.e., the tree sizes in the population are increasing over time
₪ Ongoing research and debate about the reasons
₪ Needs countermeasures, e.g.
■ Prohibiting variation operators that would deliver “too big” children
■ Parsimony pressure: penalty for being oversized
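Both countermeasures are small additions to an existing GP loop. A minimal sketch, assuming nested-tuple trees and an error-type fitness that is minimised; the penalty weight and the size limit are arbitrary illustrative values:

```python
def size(tree):
    """Number of nodes in a nested-tuple tree (function, child, child, ...)."""
    if not isinstance(tree, tuple):
        return 1
    return 1 + sum(size(child) for child in tree[1:])

def penalised_error(raw_error, tree, weight=0.01):
    """Parsimony pressure: add a size penalty to an error that is minimised."""
    return raw_error + weight * size(tree)

def accept_child(child, max_size=50):
    """Size limit: reject offspring that a variation operator made too big."""
    return size(child) <= max_size
```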

Problems involving “physical” environments
₪ Trees for data fitting vs. trees (programs) that are “really” executable
₪ Execution can change the environment → the calculation of fitness
₪ Example: robot controller
₪ Fitness calculations mostly by simulation, ranging from expensive to extremely expensive (in time)
₪ But evolved controllers are often very good

Example application: symbolic regression
₪ Given some points in $\mathbb{R}^2$: $(x_1, y_1), \dots, (x_n, y_n)$
₪ Find function $f(x)$ s.t. $\forall i = 1, \dots, n : f(x_i) = y_i$
₪ Possible GP solution:
■ Representation by $F = \{+, -, /, \sin, \cos\}$, $T = \mathbb{R} \cup \{x\}$
■ Fitness is the error $err(f) = \sum_{i=1}^{n} (f(x_i) - y_i)^2$
■ All operators standard
■ pop. size = 1000, ramped half-and-half initialisation
■ Termination: n “hits” or 50000 fitness evaluations reached (where “hit” is if $|f(x_i) - y_i| < 0.0001$)
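The error measure and the "hit" count used for termination can be written directly from their definitions. A minimal sketch; the sample points and the candidate function are hypothetical and only serve to exercise the two functions:

```python
import math

def err(f, points):
    """Sum of squared errors of f over the sample points (lower is better)."""
    return sum((f(x) - y) ** 2 for x, y in points)

def hits(f, points, tol=0.0001):
    """Number of points reproduced to within the hit tolerance."""
    return sum(abs(f(x) - y) < tol for x, y in points)

# Hypothetical target data sampled from sin(x); not taken from the slides.
points = [(x / 10, math.sin(x / 10)) for x in range(20)]
candidate = lambda x: x - x ** 3 / 6       # a formula GP might evolve

print(err(candidate, points), hits(candidate, points), len(points))
```

A run would stop once `hits` equals the number of points, or after the evaluation budget is used up.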

Discussion
₪ Is GP:
₪ The art of evolving computer programs?
₪ Means to automated programming of computers?
₪ GA with another representation?

CREATING RANDOM PROGRAMS
₪ Available functions: F = {+, -, *, %, IFLTE}
₪ IFLTE: if arg1 <= arg2 return arg3 else return arg4
₪ Available terminals: T = {X, Y, Random-Constants}
₪ The random programs are:
■ Of different sizes and shapes
■ Syntactically valid
■ Executable
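Programs over this function and terminal set are executable because every function returns a value for every input. A minimal interpreter sketch, assuming nested-tuple programs and assuming that `%` denotes protected division returning 1 on division by zero (a common GP convention; the slide does not define it):

```python
def pdiv(a, b):
    """Protected division for '%': assumed to return 1 when dividing by 0."""
    return a / b if b != 0 else 1.0

def run(tree, x, y):
    """Evaluate a program over F = {+, -, *, %, IFLTE}, T = {X, Y, constants}."""
    if tree == "X":
        return x
    if tree == "Y":
        return y
    if not isinstance(tree, tuple):
        return tree                                   # random constant
    op, args = tree[0], [run(a, x, y) for a in tree[1:]]
    if op == "+":
        return args[0] + args[1]
    if op == "-":
        return args[0] - args[1]
    if op == "*":
        return args[0] * args[1]
    if op == "%":
        return pdiv(args[0], args[1])
    if op == "IFLTE":                 # if arg1 <= arg2 return arg3 else arg4
        return args[2] if args[0] <= args[1] else args[3]
    raise ValueError(op)

prog = ("IFLTE", "X", "Y", ("%", "X", "Y"), ("*", "X", 0.5))
print(run(prog, 1.0, 2.0), run(prog, 3.0, 2.0), run(("%", "X", "Y"), 1.0, 0.0))
```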


CREATING RANDOM PROGRAMS

MUTATION OPERATION
₪ Select 1 parent probabilistically based on fitness
₪ Pick point from 1 to NUMBER-OF-POINTS
₪ Delete subtree at the picked point
₪ Grow new subtree at the mutation point in same way as generated trees for initial random population (generation 0)
₪ The result is a syntactically valid executable program
₪ Put the offspring into the next generation of the population


MUTATION OPERATION

CROSSOVER OPERATION
₪ Select 2 parents probabilistically based on fitness
₪ Randomly pick a number from 1 to NUMBER-OF-POINTS for 1st parent
₪ Independently randomly pick a number for 2nd parent
₪ Identify the subtrees rooted at the two picked points and exchange them
₪ The result is a syntactically valid executable program
₪ Put the offspring into the next generation of the population


CROSSOVER OPERATION

Architecture-Altering Operations
₪ 1. Subroutine duplication
₪ 2. Argument duplication
₪ 3. Subroutine creation
₪ 4. Subroutine deletion
₪ 5. Argument deletion

FIVE MAJOR PREPARATORY STEPS

• Determining the set of terminals
• Determining the set of functions
• Determining the fitness measure
• Determining the parameters for the run
• Determining the method for designating a result and the criterion for terminating a run

Probabilistic Incremental Program Evolution (PIPE)
₪ Salustowicz & Schmidhuber (1997)
■ Probabilistic incremental program evolution (PIPE)
₪ Model:
■ Probabilistic prototype tree (PPT)
■ Each node: distribution over instruction set
■ Can grow and shrink (variable size)
₪ Update algorithm
■ Similar to PBIL
■ Elitism is incorporated

Probabilistic Prototype Tree
₪ Complete n-ary tree
₪ Each node $N_{d,w}$ contains
■ Random constant $R_{d,w}$
■ Variable probability vector
  ■ $l+k$ components (instructions)
  ■ $d$: node's depth, $w$: horizontal position
■ $p_{d,w}(i)$: probability of choosing instruction $i$

Program Generation
₪ Start with root node: $d = w = 0$
₪ Depth first, left-to-right traversal
■ Choose instruction $i$ with probability $p_{d,w}(i)$
■ If $i$ is a random constant
  ■ If $p_{d,w}(i) > T_R$ use $R_{d,w}$
  ■ Otherwise use a uniformly random number
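Sampling a program from the prototype tree can be sketched as a recursive traversal. Everything concrete here is an assumption for illustration: the instruction set `ARITY`, the dictionary layout of a node, the uniform starting distribution, and the `max_depth` safety cap, which is not part of the slides.

```python
import random

ARITY = {"+": 2, "*": 2, "x": 0, "R": 0}   # assumed instruction set
T_R = 0.3                                   # random-constant threshold

def new_node(rng):
    # Uniform start for brevity; a terminal-biased start is equally possible.
    return {"R": rng.random(),
            "p": {i: 1 / len(ARITY) for i in ARITY},
            "children": {}}

def generate(node, rng, depth=0, max_depth=4):
    """Sample a program depth first, left to right, from the PPT at node."""
    names = list(node["p"])
    if depth == max_depth:                  # safety cap (added assumption)
        names = [i for i in names if ARITY[i] == 0]
    instr = rng.choices(names, weights=[node["p"][i] for i in names])[0]
    if instr == "R":
        # Reuse the stored constant if its probability exceeds T_R,
        # otherwise draw a fresh uniform random number.
        return node["R"] if node["p"]["R"] > T_R else rng.random()
    args = []
    for w in range(ARITY[instr]):
        if w not in node["children"]:       # PPT grows on demand
            node["children"][w] = new_node(rng)
        args.append(generate(node["children"][w], rng, depth + 1, max_depth))
    return (instr, *args) if args else instr
```

Child nodes are created only when a sampled function needs them, which keeps the prototype tree no larger than the programs drawn so far.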

Example: PPT & Generation

PIPE Algorithm
₪ Initialize probabilistic prototype tree
₪ Repeat until termination criterion is met
■ Create population of programs
  ■ Grow PPT if required
■ Evaluate population
  ■ Favor smaller programs if all is equal
■ Update & mutate PPT
■ Prune PPT

Flowchart
[Figure: flowchart of the PIPE algorithm. Node labels: initialization; iteration count = 0; iteration count != 0; test involving r and $P_{el}$; elitist learning; population-based learning; iteration count + 1; satisfactory solution found; stop]

PPT Initialization
₪ Random constant $R_{d,w}$ drawn from $U[0,1)$
₪ $p_t$ = probability of using terminal set
₪ For all terminal instructions
■ $p_{d,w}(i) = p_t / l$
₪ For all function instructions
■ $p_{d,w}(i) = (1 - p_t) / k$
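The initialization rule splits the probability mass $p_t$ evenly over the $l$ terminals and the remainder evenly over the $k$ functions. A minimal sketch, assuming a dictionary as the node layout; the instruction names and the value 0.8 are illustrative:

```python
import random

def init_node(terminals, functions, p_t, rng):
    """Create one PPT node: random constant R plus an instruction distribution."""
    l, k = len(terminals), len(functions)
    probs = {t: p_t / l for t in terminals}               # p(i) = p_t / l
    probs.update({f: (1 - p_t) / k for f in functions})   # p(i) = (1 - p_t) / k
    return {"R": rng.random(), "p": probs}                # R drawn from U[0, 1)

node = init_node(["x", "R"], ["+", "-", "*", "%"], 0.8, random.Random(0))
print(node["p"])
```

By construction the probabilities in each node sum to 1.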

PPT Growth

Growth “on demand”

Updating & Mutating PPT
₪ Want best tree probability to be $P_T$:
  $P_T = P_b + (1 - P_b) \cdot lr \cdot \frac{\varepsilon + f_{el}}{\varepsilon + f_b}$, where $P_b = \prod_{d,w} p_{d,w}(i)$ over the nodes used by the best program
₪ $p_{d,w}(i)$ updated iteratively:
  $p_{d,w}(i) = p_{d,w}(i) + lr \cdot (1 - p_{d,w}(i))$
₪ Mutation:
  $p_{d,w}(i) = p_{d,w}(i) + mr \cdot (1 - p_{d,w}(i))$
₪ Normalize probabilities
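The iterative update can be sketched as a loop that raises the probabilities of the instructions used by the best program until the program's probability reaches the target. This is a simplified illustration: the target $P_T$ is taken as given, the step factor is named `lr` as an assumption, the prototype tree is flattened to a list of per-node dictionaries, and `max_iter` is an added safety guard.

```python
def prob_of_program(nodes, used):
    """P_b: product of the probabilities of the instructions the program used."""
    prob = 1.0
    for node, instr in zip(nodes, used):
        prob *= node[instr]
    return prob

def normalise(node):
    """Rescale one node's probabilities so that they sum to 1."""
    total = sum(node.values())
    for instr in node:
        node[instr] /= total

def update_towards(nodes, used, p_target, lr, max_iter=10_000):
    """Raise p(i) for the used instructions until P_b reaches p_target."""
    for _ in range(max_iter):
        if prob_of_program(nodes, used) >= p_target:
            break
        for node, instr in zip(nodes, used):
            node[instr] += lr * (1.0 - node[instr])   # p = p + lr * (1 - p)
            normalise(node)
```

Normalising after every step keeps each node a valid distribution while the favoured instruction still gains probability.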

PPT Pruning
₪ Prune if any $p_{d,w}(i) > T_p$
₪ $T_p = 0.9$

Summary: PIPE
₪ PBIL-like algorithm for evolving programs
■ Probabilistic prototype tree
■ Variable length
■ PBIL update rule
₪ Results better than GP
₪ Many user-defined constants
■ Effects are not understood

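Example 1: curve fitting
₪ sin(x) can be expanded as the standard Taylor series
  $\sin(x) = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \frac{x^7}{7!} + \dots$ for $x \in \mathbb{R}$
• The instruction set can therefore be chosen as $I = F \cup T = \{+, -, *, \%\} \cup \{x, R\}$
• Design a fitness function to compute the fitness of each individual (in this experiment, the sum of absolute errors between the desired output and the model output)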

Parameter Setting

Parameter                        Symbol   Value
Population size                  PS       10
Initial terminal probability     P_T      0.8
Elitist learning probability     P_el     0.01
Learning rate                    lr       0.01
Fitness constant                 ε        0.000001
Overall mutation probability              0.4
Mutation rate                    mr       0.4
Prune threshold                  T_p      0.999999
Random constant threshold        T_R      0.3

Result
[Figure: curve fitting of the sine function]

Example 2: nonlinear system identification
[Figure: identification block diagram. Signals shown: u(k), u(k-1), …, u(k-m) and y(k), y(k-1), …, y(k-m); blocks: PLANT and PIPE model; the plant output y(k+1) and the model output are compared (+/-) to form the error e(k+1)]

Simulation
Plant to be identified:
  $y(k+1) = f[y(k), y(k-1)] + u(k)$
  $f[y(k), y(k-1)] = \frac{y(k)\, y(k-1)\, [y(k) + 2.5]}{1 + y^2(k) + y^2(k-1)}$
Excitation (sampling) signal:
  $u(k) = \sin(2\pi k / 25)$
Function set: {+, -, *, %, sin, cos}
Terminal set: {x0, x1, x2, R}

Result
[Equation: the evolved model $g(x_0, x_1, x_2)$, a deeply nested expression over *, %, sin and cos with numeric constants]
$x_0, x_1, x_2$ and $g(x_0, x_1, x_2)$ denote $y(k-1)$, $y(k)$, $u(k)$ and $y(k+1)$, respectively

Test result
Test signal: $u(k) = 0.3\sin(k\pi/25) + 0.1\sin(k\pi/32) + 0.1\sin(k\pi/10)$

Gene Expression Programming
₪ In December 2001, Candida Ferreira formally proposed Gene Expression Programming (GEP), building on ideas from both GA and GP
₪ Relationship between GEP, GA and GP
Advantages of GEP: it inherits the rigidity, regularity, speed and ease of use of GA together with the flexibility, variability and versatility of GP. It is 10 to 1000 times faster than GA and GP!

What is a gene expression?
Gene expression: sqrt,*,+,*,a,*,sqrt,a,b,c,/,1,-,c,d
What does this string actually represent?
A gene expression corresponds to a tree structure

GA
Encoding
  Chromosome - fixed-length binary string (common technique)
  Gene - each bit of the string
  e.g. chromosome 1 0 0 1 1 0 1 1, where each bit is a gene
Reproduction
  Recombination (crossover) - exchanges parts of two chromosomes (usual rate 0.7)
  Mutation - changes the gene value (usual rate 0.001-0.0001)
  [Figure: two bit strings exchanging segments at a randomly chosen point]
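The two GA operators on bit strings can be sketched in a few lines. The function names are hypothetical, and chromosomes are assumed to be Python lists of 0/1 genes of equal length:

```python
import random

def one_point_crossover(a, b, rng):
    """Exchange the tails of two equal-length bit lists at a random point."""
    point = rng.randrange(1, len(a))
    return a[:point] + b[point:], b[:point] + a[point:]

def bit_flip_mutation(bits, rate, rng):
    """Flip each gene (bit) independently with the given mutation rate."""
    return [1 - g if rng.random() < rate else g for g in bits]
```

Both operators preserve the chromosome length, which is what makes the fixed-length encoding easy to handle.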

GP
GP searches for the computer program that solves the problem, not for the solution to the problem.
Computer program - any computing language (in principle) - LISP (List Processor) (in practice)
LISP - highly symbol-oriented

Encoding
  Mathematical expression: a*b-c
  S-expression: (-(*ab)c)
  Graphical representation of S-expression:
        -
       / \
      *   c
     / \
    a   b
  functions (+,*) and terminals (a,b,c)
  Chromosome: S-expression
  - variable length => more flexibility
  - syntax constraints => invalid expressions produced in the evolution process must be eliminated => waste of CPU

Reproduction
  Recombination (crossover) and Mutation (usually)

Gene Expression Programming
works with two entities: chromosomes and expression trees
searches for the computer program that solves the problem (as GP)

Encoding
Candidate solution represented by an expression tree (ET) (similar to a GP tree), e.g. $\sqrt{(a - b) \cdot (c + d)}$:
          Q
          |
          *
        /   \
       -     +
      / \   / \
     a   b c   d
ET encoded in a chromosome: read ET from left to right and from top to bottom
  Q*-+abcd   (Q means sqrt)
Decoding the chromosome (translates the chromosome into an ET)
• first line of ET (root): first element of the chromosome
• next line of ET: as many arguments as needed by the elements in the previous line
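The decoding rule (fill the tree level by level, giving each element as many arguments as it needs) can be sketched as a breadth-first loop. The list-based node layout and the helper names `decode` and `infix` are assumptions for illustration:

```python
ARITY = {"Q": 1, "*": 2, "/": 2, "+": 2, "-": 2}   # Q means sqrt; terminals: 0

def decode(gene):
    """Build the expression tree level by level from the chromosome string."""
    root = [gene[0]]                   # a node is a list: [symbol, child, ...]
    level, pos = [root], 1
    while level:
        next_level = []
        for node in level:
            for _ in range(ARITY.get(node[0], 0)):
                child = [gene[pos]]    # next unread element becomes an argument
                pos += 1
                node.append(child)
                next_level.append(child)
        level = next_level
    return root, pos                   # pos = number of elements actually used

def infix(node):
    """Render a decoded tree as an infix string."""
    sym, kids = node[0], node[1:]
    if not kids:
        return sym
    if sym == "Q":
        return f"sqrt({infix(kids[0])})"
    return f"({infix(kids[0])} {sym} {infix(kids[1])})"

tree, used = decode("Q*-+abcd")
print(infix(tree), used)
```

For the chromosome `Q*-+abcd` all eight elements are consumed and the tree reads as the square root of a product of a difference and a sum.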

GEP
Chromosome: has one or more genes of equal length
Gene
  head: contains both functions and terminals (length h)
  tail: contains only terminals (length t)
  t = h(n-1) + 1, where n is the number of arguments of the function with the highest number of arguments
e.g. set of functions: Q, *, /, -, +; set of terminals: a, b
  n = 2; h = 15 (chosen) => t = 16 => length of gene = 15 + 16 = 31
  *b+a-aQab+//+b+babbabbbababbaaa

        *
       / \
      b   +
         / \
        a   -
           / \
          a   Q
              |
              a
ET ends before the end of the gene!
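The tail-length formula and the observation that the ET ends early can both be checked on the example gene. A minimal sketch; the function names are hypothetical, and `coding_length` simply counts how many leading elements are needed until every argument slot is filled:

```python
ARITY = {"Q": 1, "*": 2, "/": 2, "-": 2, "+": 2}   # terminals a, b: arity 0

def tail_length(h, n):
    """t = h(n - 1) + 1, enough terminals to complete any head of length h."""
    return h * (n - 1) + 1

def coding_length(gene):
    """Number of leading elements of the gene that the expression tree uses."""
    needed, pos = 1, 0            # 'needed' = argument slots still to fill
    while needed > 0:
        needed += ARITY.get(gene[pos], 0) - 1
        pos += 1
    return pos

gene = "*b+a-aQab+//+b+babbabbbababbaaa"
h = 15
print(len(gene), h + tail_length(h, 2), gene[:coding_length(gene)])
```

Only the first eight elements of the 31-element gene are expressed; the rest is non-coding material that later mutations can activate.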

GEP
Reproduction
Genetic operators are applied to chromosomes, not to ETs => they always produce syntactically correct structures!
  Recombination
  Mutation
  Transposition: a part of the chromosome is moved to another part of the same chromosome
e.g. Mutation: Q replaced with *
  before: *b+a-aQab+//+b+babbabbbababbaaa
  after:  *b+a-a*ab+//+b+babbabbbababbaaa
[Figure: ET before and after the mutation; the Q node with argument a becomes a * node with arguments a and b]

Gene Expression Programming
For details: http://www.gene-expression-programming.com/
Other GEP training algorithms
₪ Tabu search (TS)
₪ Immune algorithms, etc.