Introduction to Evolutionary Computation
Matthew Evett
Eastern Michigan University
Evolutionary Computation is….
Umbrella term for machine learning techniques that are modeled on the processes of neo-Darwinian evolution: genetic algorithms, genetic programming, artificial life, evolutionary programming.
Survival of the fittest; evolutionary pressure.
Techniques for automatically finding solutions, or near-solutions, to very difficult problems.
Why is EC Cool?
EC techniques have found solutions better than any previously known in many domains: electronic circuit design, scheduling, pharmaceutical design.
Autonomous solution discovery is fun: Look Ma! No hands!
Darwinian Evolution
Works at the population scale, not the individual.
Chance plays a part: variation affects viability; the fittest don't always survive!
Heredity of traits.
Finite resources yield competition.
EC is not...
"Real" evolution, or "real" genetics.
It is modeled on natural genetic systems only in a simple sense; the term "genetic" is really used to mean "heredity".
Real genetics is much more complicated.
Overview of the Talk
We'll look at two related techniques: genetic algorithms and genetic programming.
We’ll look at some demos of evolutionary systems.
History of EC
Friedberg's induced compilers (1958)
Evolutionary Programming (Fogel, Owens & Walsh, 1965)
Evolution Strategies (Rechenberg, 1972)
Genetic Algorithms (Holland, 1975)
Genetic Programming:
 tree-based GA (Cramer '85; Koza '89)
 "true" GP (Koza '92)
Basic evolutionary algorithm
Generate random population.
Score population; Generation_Count = Generation_Count + 1.
Found acceptable solution? If yes: Success.
Reached maximum number of generations? If yes: Failure.
Otherwise: select individuals for reproduction, create a new population via recombination & mutation, and repeat.
Population of individuals, each representing a potential solution to the problem in question.
Genetic Algorithms (GA)
Population individuals are (fixed-length) binary strings (the "genome").
Start with a population of random strings.
Measure the "fitness" of individuals.
Each generation forms a new population from the old via recombination and mutation.
Solutions improve over generations.
Three steps to setting up a GA
1) Devise a binary encoding representing the potential solutions to a problem.
2) Define a fitness function: an objective measure of the quality of an individual.
3) Set control parameters: population size, maximum number of generations, probability of mutation and crossover, etc.
Example: Designing a Truss
10 members; 16 diameters available.
Different costs, different strengths.
Find the cheapest design that is strong enough.
40-bit genome: each 4-bit sequence represents the diameter of one member.
[Diagram: truss with members A1–A10, carrying two 50 lb loads.]
0010 1110 0001 0010 1010 1111 0001 0110 0110 1010
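The decoding of this genome can be sketched in a few lines: read each 4-bit group as an integer in 0..15 that indexes the table of 16 available diameters. This is a generic sketch (the function name `decode_genome` is ours, not from the original system), and it assumes the genome is passed without the spaces shown above.

```c
#include <stddef.h>

/* Decode a binary-string genome into per-member diameter indices:
 * each 4-bit group is read as an integer in 0..15 that selects one
 * of the 16 available diameters for that truss member. */
void decode_genome(const char *genome, int *indices, size_t members) {
    for (size_t m = 0; m < members; m++) {
        int v = 0;
        for (size_t b = 0; b < 4; b++)          /* 4 bits per member */
            v = (v << 1) | (genome[m * 4 + b] - '0');
        indices[m] = v;
    }
}
```

Decoding the genome shown above yields the diameter indices 2, 14, 1, 2, 10, 15, 1, 6, 6, 10.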
Running a GA
Generate an initial population of random binary strings.
Calculate the "fitness" of each individual.
 Fitness is the cost of the design, plus a penalty if the design fails.
Create the next generation:
 Select on the basis of "fitness".
 Recombination/mating.
 Select some elements for mutation.
  Typically one or two random bits will be flipped.
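The bit-flip step above can be sketched as a per-bit coin flip. This is a generic bit-flip mutation, not the slide authors' exact code; the rate `p` is a control parameter, typically set so that only one or two bits of a genome flip on average.

```c
#include <stdlib.h>

/* Bit-flip mutation: each bit of the genome is flipped independently
 * with probability p. */
void mutate(char *genome, size_t len, double p) {
    for (size_t i = 0; i < len; i++) {
        double r = (double)rand() / ((double)RAND_MAX + 1.0);
        if (r < p)
            genome[i] = (genome[i] == '0') ? '1' : '0';
    }
}
```

With p = 0 the genome is untouched; with p = 1 every bit flips; a typical setting is p = 1/len.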
Crossover in GA
Single-point crossover (there are many other forms).
Randomly select a crossover point.
Swap the crossover fragments.
Offspring will have a combination of randomly selected parts of both parents.
Parents:  00101101  10010011
Children: 00101011  10010101
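The parent/child example above can be reproduced with a short sketch (a generic implementation, not the original code): with the crossover point after bit 5, parents 00101101 and 10010011 yield exactly the children shown.

```c
#include <string.h>

/* Single-point crossover on equal-length bit strings: copy each
 * parent's prefix up to `point`, then swap the suffixes. */
void crossover(const char *p1, const char *p2, size_t point,
               char *c1, char *c2) {
    size_t n = strlen(p1);
    memcpy(c1, p1, point);
    memcpy(c1 + point, p2 + point, n - point);
    memcpy(c2, p2, point);
    memcpy(c2 + point, p1 + point, n - point);
    c1[n] = '\0';
    c2[n] = '\0';
}
```

In a real run the crossover point itself is chosen at random; it is passed in here so the result is easy to check.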
Running a GA (continued)
Repeatedly create new generations; calculate fitness each time.
Terminate when an acceptable solution has been found or when the specified maximum number of generations is reached.
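Putting the pieces together, the whole generational loop can be sketched on a toy problem (OneMax: maximize the number of 1-bits). Everything here is an illustrative choice, not the parameters of the truss run: population 20, 8-bit genomes, tournament selection of size 2, single-point crossover, roughly one bit-flip mutation per genome, and elitism of one.

```c
#include <stdlib.h>
#include <string.h>

#define POP  20
#define LEN  8
#define GENS 50

/* Fitness for the toy OneMax problem: count of '1' bits. */
static int fitness(const char *g) {
    int f = 0;
    for (int i = 0; i < LEN; i++) f += (g[i] == '1');
    return f;
}

/* Tournament of size 2: return the index of the fitter of two
 * randomly chosen individuals. */
static int tournament(char pop[POP][LEN + 1]) {
    int a = rand() % POP, b = rand() % POP;
    return fitness(pop[a]) >= fitness(pop[b]) ? a : b;
}

/* A minimal generational GA: random init, selection, single-point
 * crossover, bit-flip mutation, elitism of one.  Copies the best
 * final individual into `best` and returns its fitness. */
int run_ga(char best[LEN + 1]) {
    char pop[POP][LEN + 1], next[POP][LEN + 1];
    for (int i = 0; i < POP; i++) {
        for (int j = 0; j < LEN; j++) pop[i][j] = '0' + rand() % 2;
        pop[i][LEN] = '\0';
    }
    for (int gen = 0; gen < GENS; gen++) {
        int b = 0;                         /* elitism: keep the best */
        for (int i = 1; i < POP; i++)
            if (fitness(pop[i]) > fitness(pop[b])) b = i;
        strcpy(next[0], pop[b]);
        for (int i = 1; i < POP; i++) {
            const char *p1 = pop[tournament(pop)];
            const char *p2 = pop[tournament(pop)];
            int cut = rand() % LEN;        /* single-point crossover */
            for (int j = 0; j < LEN; j++)
                next[i][j] = (j < cut) ? p1[j] : p2[j];
            next[i][LEN] = '\0';
            if (rand() % LEN == 0) {       /* ~1 bit flipped per genome */
                int j = rand() % LEN;
                next[i][j] = (next[i][j] == '0') ? '1' : '0';
            }
        }
        memcpy(pop, next, sizeof pop);
    }
    int b = 0;
    for (int i = 1; i < POP; i++)
        if (fitness(pop[i]) > fitness(pop[b])) b = i;
    strcpy(best, pop[b]);
    return fitness(pop[b]);
}
```

For a real problem such as the truss, only `fitness` (and the genome length) changes; the loop itself is problem-independent.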
Running a GA/GP
Major phases of evolutionary algorithms:
Generate random population.
Score population; Generation_Count = Generation_Count + 1.
Found acceptable solution? If yes: Success.
Reached maximum number of generations? If yes: Failure.
Otherwise: select for & do direct reproduction; select for & mate; select for & mutate; repeat.
Results of Truss Example
Optimal solution is known, but rare: the number of possible designs is 2^40.
Typical run: 200 individuals/pop.; 40 generations.
Yields an answer within 1% of optimal…
…but examines only 8000 individuals (0.0000007% of designs)!
Genetic Programming (GP)
GP is a domain-independent method for inducing programs by searching a space of S-expressions.
GP's search technique is similar to GA's.
The elements of a population are programs, encoded as S-expressions.
 The Lisp programming language is based on S-expressions.
 Original GP work was done in Lisp.
Genetic Programming Elements
S-expressions: prefix notation.
Programs, encoded as trees, are evaluated via post-order traversal.
Example: the tree corresponding to the S-expression (sqrt (/ (+ a b) 2.0)).
Representation of a Program
S-expressions can be converted to C. The tree (if (> a 10.0) 20.0 (/ a 2.0)) corresponds to:

float treeFunc(float a) {
    if (a > 10.0) {
        return 20.0;
    } else {
        return a / 2.0;
    }
}
Looping constructs and subroutine calls are also possible.
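A GP system can also evaluate such a tree directly, without translating it to C first: one recursive function walks the tree. The node type and names below are an illustrative sketch, not the representation used by any particular GP system; it hard-codes the if/>//÷ tree from the slide.

```c
#include <stdlib.h>

/* One node of a program tree: a constant, the variable `a`, or an
 * operator applied to child subtrees. */
typedef enum { CONST, VAR_A, OP_GT, OP_DIV, OP_IF } Kind;
typedef struct Node {
    Kind kind;
    double value;            /* used only by CONST */
    struct Node *kids[3];    /* condition/then/else, or left/right */
} Node;

static Node *node(Kind k, double v, Node *c0, Node *c1, Node *c2) {
    Node *n = malloc(sizeof *n);
    n->kind = k; n->value = v;
    n->kids[0] = c0; n->kids[1] = c1; n->kids[2] = c2;
    return n;
}

/* Recursive evaluation: an operator node evaluates its operands and
 * applies the operator; `if` evaluates its condition, then one branch. */
double eval(const Node *n, double a) {
    switch (n->kind) {
    case CONST:  return n->value;
    case VAR_A:  return a;
    case OP_GT:  return eval(n->kids[0], a) > eval(n->kids[1], a);
    case OP_DIV: return eval(n->kids[0], a) / eval(n->kids[1], a);
    case OP_IF:  return eval(n->kids[0], a) != 0.0
                        ? eval(n->kids[1], a)
                        : eval(n->kids[2], a);
    }
    return 0.0;
}

/* Build (if (> a 10.0) 20.0 (/ a 2.0)) -- the tree from the slide. */
Node *slide_tree(void) {
    return node(OP_IF, 0,
        node(OP_GT, 0, node(VAR_A, 0, 0, 0, 0),
                       node(CONST, 10.0, 0, 0, 0), 0),
        node(CONST, 20.0, 0, 0, 0),
        node(OP_DIV, 0, node(VAR_A, 0, 0, 0, 0),
                        node(CONST, 2.0, 0, 0, 0), 0));
}
```

Evaluating this tree agrees with `treeFunc`: a = 15 gives 20.0, a = 4 gives 2.0.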
Three steps to setting up a GP
1) Define an appropriate set of functions and terminals.
 Must have closure; the functions and terminals must be sufficient.
2) Define a fitness function.
3) Set control parameters: the GA parameters (population size, etc.), plus the maximum size or depth of the individual trees, the size and shape of the original trees, etc.
terminal set = {a, b, c, 0, 1, 2}
function set = {+, -, *, /, SQRT}
Starting a GP
Generate an initial population of random S-expression trees.
Calculate a fitness value for each individual, often over a set of test cases.
Running a GP
Create the next generation (population).
Select elements for reproduction:
 random, fitness-proportionate, or tournament selection.
Reproduce by:
 direct reproduction (cloning)
 mating (the mating method differs from a GA's)
 mutation (also differs from a GA's)
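Fitness-proportionate ("roulette wheel") selection, one of the options above, can be sketched as follows. This is a generic sketch, not the original code; the uniform random number `r` in [0, 1) is passed in so the choice is easy to test.

```c
#include <stddef.h>

/* Fitness-proportionate selection: individual i is chosen with
 * probability fitness[i] / sum(fitness).  r must be in [0, 1). */
size_t roulette_select(const double *fitness, size_t n, double r) {
    double total = 0.0;
    for (size_t i = 0; i < n; i++) total += fitness[i];
    double threshold = r * total, cum = 0.0;
    for (size_t i = 0; i < n; i++) {
        cum += fitness[i];            /* walk the "wheel" */
        if (threshold < cum) return i;
    }
    return n - 1;   /* guard against floating-point round-off */
}
```

With fitnesses {1, 2, 3} the cumulative wheel is 1/6, 3/6, 6/6, so r = 0.4 lands on the second individual.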
GP Crossover
Randomly choose crossover points.
Swap rooted subtrees.
“Closure” property guarantees viability of offspring
[Diagram: crossover points are selected in the parents (/ (+ a b) 3) and (+ 7 (* (* 2 a) b)), at the (+ a b) and (* (* 2 a) b) subtrees; the rooted subtrees are swapped, yielding (/ (* (* 2 a) b) 3) and (+ 7 (+ a b)).]
S-expression Mating Process
Step 1) Randomly select crossover points in both parents.
Step 2) Detach the subtrees and swap them, resulting in these offspring:
Parents:   (a+b)/3  and  7+((2*a)*b)
Offspring: 7+(a+b)  and  ((2*a)*b)/3
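In tree form, the swap is just an exchange of two child pointers. The sketch below reproduces the example above; the node type and printing helper are illustrative, not from the original system.

```c
#include <stdlib.h>
#include <string.h>

/* A program-tree node: an operator with two children, or a leaf. */
typedef struct Node {
    const char *label;          /* "+", "*", "/", "a", "b", "3", ... */
    struct Node *left, *right;  /* NULL for leaves */
} Node;

static Node *mk(const char *label, Node *l, Node *r) {
    Node *n = malloc(sizeof *n);
    n->label = label; n->left = l; n->right = r;
    return n;
}

/* Append the tree to buf as a fully parenthesized infix expression. */
static void show(const Node *n, char *buf) {
    if (!n->left) { strcat(buf, n->label); return; }
    strcat(buf, "(");
    show(n->left, buf);
    strcat(buf, n->label);
    show(n->right, buf);
    strcat(buf, ")");
}

/* GP crossover: swap the subtrees rooted at the two chosen points.
 * The "points" are the addresses of the parents' child pointers. */
void swap_subtrees(Node **point1, Node **point2) {
    Node *tmp = *point1;
    *point1 = *point2;
    *point2 = tmp;
}
```

Building (a+b)/3 and 7+((2*a)*b), then swapping the (+ a b) and (* (* 2 a) b) subtrees, yields ((2*a)*b)/3 and 7+(a+b), exactly the offspring shown above.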
Mutation with GP
[Diagram: in the tree (* (+ a 10.0) (/ a 20)), a mutation point is randomly selected at the (+ a 10.0) subtree, which is replaced with a randomly generated subtree, yielding (* (- (* (* b b) a) 4) (/ a 20)).]
Elements that are selected for mutation will have some randomly selected node (and any subtree under it) replaced with a randomly generated subtree.
Two forms: point mutation, and tree growth (shown here).
Running a GP (continued)
Repeatedly create new generations.
Terminate when an acceptable solution is found or when a specified maximum number of generations is reached.
The termination criterion is often based on a number of hits, where a hit is defined as the successful completion of some subgoal.
Example: Santa Fe Trail
Ant animats, acquiring food. Some gaps in the trail; 89 food "pellets".
Evolve a control strategy to consume all pellets in acceptable time.
Representing "Ants"
"Terminals" are functions whose evaluation causes the ant to move.
Fitness = number of pellets consumed in 400 terminal evaluations.
 Prevents infinite runs, and weak solutions.
T = {ahead, left, right}
F = {if-food-ahead, progn2, progn3}
(if-food-ahead (move) (progn2 (left) (move)))
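The primitives can be given a direct interpretation on a grid. The sketch below hard-codes the example strategy (if-food-ahead (move) (progn2 (left) (move))) as a C function; the Ant struct, the 8×8 toroidal grid, and the function names are illustrative choices, not the original Santa Fe trail setup.

```c
#define W 8
#define H 8

/* Ant state: position, heading (0=E, 1=S, 2=W, 3=N), pellets eaten. */
typedef struct { int x, y, dir, eaten; } Ant;
static const int DX[4] = { 1, 0, -1, 0 };
static const int DY[4] = { 0, 1, 0, -1 };

/* Is there food in the cell directly in front of the ant? */
static int food_ahead(const Ant *a, const int grid[H][W]) {
    int x = (a->x + DX[a->dir] + W) % W;   /* toroidal wrap-around */
    int y = (a->y + DY[a->dir] + H) % H;
    return grid[y][x];
}

/* Advance one cell; consume any pellet found there. */
static void move(Ant *a, int grid[H][W]) {
    a->x = (a->x + DX[a->dir] + W) % W;
    a->y = (a->y + DY[a->dir] + H) % H;
    if (grid[a->y][a->x]) { grid[a->y][a->x] = 0; a->eaten++; }
}

static void left(Ant *a) { a->dir = (a->dir + 3) % 4; }

/* One evaluation of the evolved strategy:
 * (if-food-ahead (move) (progn2 (left) (move))) */
void step(Ant *a, int grid[H][W]) {
    if (food_ahead(a, grid)) {
        move(a, grid);
    } else {
        left(a);
        move(a, grid);
    }
}
```

Calling `step` repeatedly while counting terminal evaluations, and stopping at 400, gives the fitness measure described above.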
Demo: Santa Fe Ant
During a run, shows the path of the best-of-generation and best-of-run individuals. (Chong, 1998)
Applet: http://studentweb.cs.bham.ac.uk/~fsc/DGP.html
GP-Generated Military Tactics
A squadron has a destination.
It is ordered either to evade or to attack.
A population of strategies is evolved. (Porto, Fogel & Fogel, 1998)
Generating tactics
Every 20 seconds of real time, do a GP run of 40 generations.
Predicts 20 minutes ahead.
Allows adaptation to a changing situation.
Here, the order is changed from "evade" to "attack".
Co-evolution
Simulation uses GP-developed strategy for both squadrons.
Real-time success
Platform: Sparc 20.
An actual Pentagon military simulation.
The blue squad fires on the red.
Learning to Walk with GP
Evolve control strategies for the movement of arbitrarily articulated animats. (Karl Sims, 1995)
Fitness is rate of travel; uses a physics model; LOTS of CPU cycles!
GA-learned bipedal motion: individual strategies can be observed on the applet (http://www.jsh.net/andy/gat/environ.html).
The user can view all trials, or just the best-of-generation.
Constrained skeletons. (Dick, 1998)
Financial Symbolic Regression
The goal is time series prediction, where the target points are a financial time series.
In this case we use a target time series derived from the daily closing prices of the S&P 500 from the years 1994 and 1995.
Uses 33 independent variables, taken from time series derived from the S&P 500 itself and from the daily closing prices of 32 Fidelity Select mutual funds.
(Evett & Fernandez, 1996, 1997)
[Graph: daily closing prices (left axis, 0–700) and preprocessed target values (right axis, -0.02 to 0.05), from 940103 through 951226.]
Solving Financial Problems
The top line in the graph is the daily closing price of the S&P 500. The solid line below it is the graph of the target time series after preprocessing.
The dotted line is a function evolved using GP. It is included here only as an example, to illustrate that the criterion for success does not require a great deal of accuracy.
The Example Evolved Function
y = ((0.38 - (-0.20923 - (FSPTX - ((-0.79706 / 0.38) * ((FSUTX - FSCSX) * (FSCGX - (-0.34247))))))) * (SPX * (0.82794 / 0.54431)))
The independent variables used by this evolved function are derived from the following time series:
• FSPTX  Fidelity Select Technology Portfolio
• FSUTX  Fidelity Select Utility Portfolio
• FSCSX  Fidelity Select Software Portfolio
• FSCGX  Fidelity Select Capital Goods Portfolio
• SPX    S&P 500 Index
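Transcribed into C, the evolved individual is just an arithmetic function of the five series values; the function name and argument order below are ours, not part of the evolved result.

```c
/* The evolved predictor from the slide, transcribed literally.
 * Inputs are the (preprocessed) values of the five time series. */
double evolved(double FSPTX, double FSUTX, double FSCSX,
               double FSCGX, double SPX) {
    return (0.38 - (-0.20923
                    - (FSPTX - ((-0.79706 / 0.38)
                                * ((FSUTX - FSCSX)
                                   * (FSCGX - (-0.34247)))))))
           * (SPX * (0.82794 / 0.54431));
}
```

Note that SPX multiplies the entire second factor, so the evolved prediction is zero whenever the SPX-derived input is zero.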
Conclusions
Evolutionary algorithms are a powerful technique for problem solving in domains that:
 are variable
 are difficult, if not impossible, to optimize
GP is especially useful for problems for which the form of the solution is not known.
Evolutionary techniques are becoming widespread.
Overview of the Software
Object-oriented C++.
Windows 95 (MS Visual C++ 5.0).
Ported to UNIX (GNU C++).
Extended to run cooperatively on multiple machines using MPI.
Thank You! Are there any Questions?