11 Implementation - Making Genetic Programming Work

Tradeoffs to Consider

This chapter addresses fundamental “how to” issues of GP.

Coding a GP system that is fast, memory efficient, easily maintainable, portable, and flexible is a difficult task.

There are many tradeoffs, and this chapter will describe how major GP systems have addressed those tradeoffs.

We will present examples to show why GP uses CPU cycles and memory so profusely.

We will describe systematically various low-level representations of evolving programs.

We will give an overview of the parameters that must be set during GP runs and suggest rules of thumb for setting those parameters.

Why Is GP so Computationally Intensive?

Large Populations: 500 to 50,000, or even 600,000 individuals

Large Individual Programs: long solutions, introns

Many Fitness Evaluations

Why Does GP Use so Much CPU Time?

G = 50, F = 200, P = 2000
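To make these figures concrete, here is a back-of-the-envelope count. It assumes that G is the number of generations, F the number of fitness cases, and P the population size (the symbols are not spelled out here), and that every individual is run on every fitness case in every generation.

```python
# Assumed meanings: G = generations, F = fitness cases, P = population size.
G, F, P = 50, 200, 2000

# One program execution per individual, per fitness case, per generation.
program_executions = G * F * P
print(program_executions)  # 20000000
```

Each execution in turn interprets the nodes of the program, so the node-level work grows again with the average program size.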

Why Does GP Use so Much Main Memory?

Limits of Efficient Memory Use

Most GP systems represent the individuals/programs symbolically in some sort of high-level data structure.

The decision to represent the GP individual in data structures that are less memory-efficient is often a deliberate tradeoff made by the software designer to gain, for instance, easier application of the genetic operators or better portability.

A Memory Usage Example

Three pointers represent each node: two pointers to the input nodes and one pointer for the function represented by the node.

3 pointers × 4 bytes + 4 bytes = 16 bytes per node

<Nmax> × P × 16 bytes = 800 × 2000 × 16 bytes = 25.6 Mbytes
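The same estimate can be reproduced in a few lines. This is only a sketch of the arithmetic in the example; the variable names are illustrative, and the purpose of the extra 4 bytes per node is not stated in the example, so it is carried along as an opaque constant.

```python
POINTER_BYTES = 4   # 4-byte pointers, as in the example
EXTRA_BYTES = 4     # the additional 4 bytes per node from the example
bytes_per_node = 3 * POINTER_BYTES + EXTRA_BYTES

n_max = 800         # nodes per individual, from the example
population = 2000   # P, from the example

total_bytes = n_max * population * bytes_per_node
print(bytes_per_node, total_bytes / 1e6)  # 16 25.6
```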

Effects of Extensive Memory Use

Memory access is slow

Garbage collection

Computer Representation of Individuals

1. LISP lists

2. Data structures in compiled languages such as C, PASCAL, or FORTRAN

3. Native machine code

Implementation Using LISP

Lists and Symbolic Expressions

The “How to” of LISP Implementations

LISP S-expressions make it very easy to perform genetic operations like tree-based crossover.
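Why list-structured programs make crossover easy can be sketched with nested Python lists standing in for S-expressions. This is an illustrative sketch, not a LISP implementation; the convention that element 0 of each list is the function symbol, and all function names, are assumptions made for the example.

```python
import copy
import random

def subtree_paths(tree, path=()):
    """Yield the index path of every subtree (including the root)."""
    yield path
    if isinstance(tree, list):
        # tree[0] is the function symbol; the remaining items are arguments.
        for i, child in enumerate(tree[1:], start=1):
            yield from subtree_paths(child, path + (i,))

def get_subtree(tree, path):
    for i in path:
        tree = tree[i]
    return tree

def replace_subtree(tree, path, new):
    """Return a copy of tree with the subtree at path replaced by new."""
    if not path:
        return copy.deepcopy(new)
    result = copy.deepcopy(tree)
    node = result
    for i in path[:-1]:
        node = node[i]
    node[path[-1]] = copy.deepcopy(new)
    return result

def crossover(parent_a, parent_b, rng=random):
    """Swap one randomly chosen subtree between two parents."""
    path_a = rng.choice(list(subtree_paths(parent_a)))
    path_b = rng.choice(list(subtree_paths(parent_b)))
    sub_a = get_subtree(parent_a, path_a)
    sub_b = get_subtree(parent_b, path_b)
    return (replace_subtree(parent_a, path_a, sub_b),
            replace_subtree(parent_b, path_b, sub_a))

a = ["+", ["*", "x", "y"], "z"]
b = ["sqrt", ["-", "w", 1]]
child_a, child_b = crossover(a, b, random.Random(0))
```

Because every subtree is itself a list (or a single terminal), swapping two subtrees never produces a syntactically invalid program. The parents are left untouched because the children are built from copies.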

The Disadvantages of LISP S-Expressions

Memory problem: lists are constantly being created and destroyed during evolution.

Although many flavors of LISP have built-in garbage collection, GP may create garbage faster than it can be collected.

Speed: C-based GP is more than ten times as fast as LISP-based GP.

LISP does not have the same advantage in manipulating other GP genome types, such as linear or graph genomes.

Some Necessary Data Structures

Arrays

Variable-size individuals

Complex crossover operators

Variable amounts of data per node

Although arrays are easy to manipulate and to access, there are tradeoffs in using arrays to hold GP individuals.

Linked Lists

Ease in resizing

Simple to crossover

Flexible in size of elements

Fast insertion/deletion

Demanding in memory requirements

Slow access

Stacks

Implementations With Arrays or Stacks

Postfix Expressions with a Stack

One advantage of postfix ordering is that, if one evaluates the postfix expression from left to right, one will always have evaluated the operands to each operator before it is necessary to process the operator.

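Postfix Evaluation

Stack Evaluation

[Example: a SQRT expression over the variables w, x, y, and z, shown in infix form and in postfix order; the operator symbols are missing from the source.]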
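A minimal sketch of left-to-right postfix evaluation with an operand stack follows. The function set, the variable bindings, and the example program are assumptions chosen for illustration; they are not the expression from the original slide.

```python
import math

# Arity and implementation of each function symbol (an illustrative set).
FUNCTIONS = {
    "+": (2, lambda a, b: a + b),
    "*": (2, lambda a, b: a * b),
    "SQRT": (1, math.sqrt),
}

def eval_postfix(program, env):
    """Evaluate a postfix token list left to right with an operand stack."""
    stack = []
    for token in program:
        if token in FUNCTIONS:
            arity, fn = FUNCTIONS[token]
            args = stack[-arity:]      # operands are already evaluated
            del stack[-arity:]
            stack.append(fn(*args))
        else:
            stack.append(env[token])   # terminal: push the variable's value
    assert len(stack) == 1, "malformed postfix program"
    return stack[0]

# Hypothetical program: SQRT((w * x) + (y * z)) in postfix order.
program = ["w", "x", "*", "y", "z", "*", "+", "SQRT"]
result = eval_postfix(program, {"w": 2.0, "x": 8.0, "y": 3.0, "z": 3.0})
print(result)  # 5.0
```

By the time an operator is reached, its operands are already on top of the stack, so no look-ahead or recursion is needed.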

Postfix Crossover

Start at any node, and move to the right to another node.

If the number of items on the stack is never less than zero, and the final number of items on the stack is one, then the visited nodes represent a subtree, that is, a subexpression.
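The stack-count test can be sketched as follows. The arity table and the example program are assumptions for illustration, and the stack count is checked right after an operator removes its operands, which is one reading of "never less than zero".

```python
ARITY = {"+": 2, "*": 2, "SQRT": 1}   # terminals have arity 0

def is_subtree(program, start, end):
    """True if program[start:end + 1] is a complete postfix subexpression."""
    depth = 0
    for token in program[start:end + 1]:
        depth -= ARITY.get(token, 0)   # an operator pops its operands
        if depth < 0:                  # it would need items from before `start`
            return False
        depth += 1                     # every node pushes one result
    return depth == 1

program = ["w", "x", "*", "y", "z", "*", "+", "SQRT"]
print(is_subtree(program, 0, 2))   # True:  w x *
print(is_subtree(program, 3, 5))   # True:  y z *
print(is_subtree(program, 1, 3))   # False: x * y
print(is_subtree(program, 0, 7))   # True:  the whole program
```

Two spans that pass this test can be exchanged between parents, and both offspring remain valid postfix programs.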

Compact representation of the individual in memory

A little slower than the tree arrangements

The compactness of the representation could make the system difficult to extend.

It does not allow for skipping evaluation of parts of the program that do not need to be evaluated.

Prefix Expressions with Recursive Evaluation

Evaluating a Prefix Array

The system calls EvalNextArg over and over again until it has completely evaluated the individual.

SQRT(EvalNextArg)

SQRT(EvalNextArg * EvalNextArg)
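The recursive scheme can be sketched in Python. Only the name EvalNextArg comes from the text (rendered here as `eval_next_arg`); the function set and the example program are assumptions for illustration.

```python
import math

FUNCTIONS = {
    "+": (2, lambda a, b: a + b),
    "*": (2, lambda a, b: a * b),
    "SQRT": (1, math.sqrt),
}

def eval_prefix(program, env):
    """Evaluate a prefix token list by recursive descent."""
    pos = 0  # index of the next unread element

    def eval_next_arg():
        nonlocal pos
        token = program[pos]
        pos += 1
        if token in FUNCTIONS:
            arity, fn = FUNCTIONS[token]
            # Each argument is itself the next complete subexpression.
            args = [eval_next_arg() for _ in range(arity)]
            return fn(*args)
        return env[token]

    value = eval_next_arg()
    assert pos == len(program), "trailing elements after the expression"
    return value

# Hypothetical program: SQRT(x * y) in prefix order.
result = eval_prefix(["SQRT", "*", "x", "y"], {"x": 4.0, "y": 9.0})
print(result)  # 6.0
```

The first call reads SQRT and asks for one argument; that call reads the multiplication and asks for two more, mirroring the expansion shown above.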

Prefix Crossover

Starting at any element, take the arity of each element minus one and sum these numbers from left to right.

Wherever the sum equals minus one, a complete subexpression is covered by the visited nodes.
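The running-sum rule translates directly into code. In this sketch the arity table, the function names, and the example programs are assumptions for illustration.

```python
ARITY = {"+": 2, "*": 2, "SQRT": 1}   # terminals have arity 0

def subtree_end(program, start):
    """Index of the last element of the subexpression rooted at start."""
    total = 0
    for i in range(start, len(program)):
        total += ARITY.get(program[i], 0) - 1
        if total == -1:
            return i
    raise ValueError("malformed prefix program")

def crossover(a, start_a, b, start_b):
    """Swap the subexpressions rooted at start_a and start_b."""
    end_a = subtree_end(a, start_a) + 1
    end_b = subtree_end(b, start_b) + 1
    child_a = a[:start_a] + b[start_b:end_b] + a[end_a:]
    child_b = b[:start_b] + a[start_a:end_a] + b[end_b:]
    return child_a, child_b

a = ["SQRT", "*", "x", "y"]
b = ["+", "w", "*", "z", "z"]
print(subtree_end(a, 1))          # 3: the subexpression is * x y
print(crossover(a, 1, b, 1))
# (['SQRT', 'w'], ['+', '*', 'x', 'y', '*', 'z', 'z'])
```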

A Graph Implementation of GP

Start and end nodes

PADO use of the stack

PADO use of indexed memory

From an implementation viewpoint, PADO is much more difficult than tree or linear genomes.

Dual Representation of Individuals

For the purpose of storage, crossover, and mutation, individuals are stored in a linked list.

For the purpose of execution, individuals are stored as arrays.

Implementations Using Machine Code

Evolving Machine Code with AIMGP

An individual represented in a problem-specific language is executed by a virtual machine.

High ability to customize the language depending on the properties of the problem at hand.

The need for the virtual machine involves a large programming and run-time overhead.

An alternative is to compile each individual from a higher-level representation into machine code before evaluation.

This approach provides GP with problem-specific and powerful operators and also results in high-speed execution of the individual.

Long execution times may be due to a long-running loop or a large number of fitness cases in the training set.

Problem-specific operators are frequently required.

Automatic Induction of Machine Code with GP (AIMGP)

Individuals are represented as machine code programs which are directly executable.

Each individual is a piece of machine code.

There are no virtual machines, intermediate languages, interpreters, or compilers involved.

AIMGP has accelerated individual execution speed by a factor of 2000 compared to LISP implementations.

The Structure of Machine Code Functions

The header deals with the administration necessary when a function is entered during execution.

The footer “cleans up” after a function call.

The return instruction follows the footer and forces the system to leave the function and to return program control to the calling procedure.

The function body consists of the actual program representing an individual.

A buffer is reserved at the end of each individual to allow for length variations.

Genetic Operators

A mutation operator changes the content of an instruction by mutating op-codes, constants, or register references.

The crossover operator: protected and instruction
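The mutation operator can be illustrated on a toy linear program. This sketch does not manipulate real machine code: the four-field instruction format, the op-code names, and the register and constant ranges are all assumptions made for illustration.

```python
import random

# Toy instruction format (an assumption for illustration, not real machine
# code): (op-code, destination register, source register, constant).
OPCODES = ["ADD", "SUB", "MUL"]
NUM_REGISTERS = 4
MAX_CONSTANT = 255

def mutate_instruction(instr, rng):
    """Return a copy of instr with exactly one field re-drawn at random."""
    opcode, dest, src, const = instr
    field = rng.choice(["opcode", "register", "constant"])
    if field == "opcode":
        opcode = rng.choice(OPCODES)
    elif field == "register":
        if rng.random() < 0.5:
            dest = rng.randrange(NUM_REGISTERS)
        else:
            src = rng.randrange(NUM_REGISTERS)
    else:
        const = rng.randint(0, MAX_CONSTANT)
    return (opcode, dest, src, const)

def mutate(program, rng):
    """Mutate one randomly chosen instruction; the length never changes."""
    child = list(program)
    i = rng.randrange(len(child))
    child[i] = mutate_instruction(child[i], rng)
    return child

program = [("ADD", 0, 1, 7), ("MUL", 2, 0, 3), ("SUB", 1, 2, 0)]
child = mutate(program, random.Random(1))
```

Because each field is re-drawn only from its own legal range, the mutated instruction stays well formed, which is the property a machine-code mutation operator has to guarantee.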

A Guide to Parameter Choices

Population Size

Bigger populations take more time when evolving a generation but have more genetic diversity, explore more areas of the search space, and may even reduce the number of evaluations required for finding a solution.

A starting point of P = 1000 is usually acceptable for smaller problems.

As a rule of thumb, for sufficiently difficult problems the population size should start at around P = 10000.

This number should be increased if the other parameters tend to exert heavy selection pressure.

A larger number of training cases requires an increase in the population size.

Maximum Number of Generations

It is impractical to run most GP systems for a very large number of generations: there is not enough CPU time available.

Start testing with a relatively low setting for Gmax, such as 50 ≤ Gmax ≤ 100 generations.

If you are not getting the results you want, first raise the population size and then raise the number of generations.

Explosive growth of introns almost always marks the end of effective evolution.

As a rule of thumb, when destructive crossover falls below 10% of all crossover events, no further effective evolution will occur.

Terminal and Function Set

Make the terminal and function set as small as possible.

Larger sets usually mean longer search time.

It is not that important to have (all) customized functions in the function set: the system often evolves its own approximations.

It is very important, however, that the function set contain functions permitting non-linear behavior, such as if-then functions.

The function set should also be adapted to the problem.

Sometimes transformations on data are very valuable, for instance, fast Fourier transforms.

Mutation and Crossover Balance

The typical settings of mutation and crossover probabilities in GP involve very high rates of crossover and very low rates of mutation.

Experiments suggest that a different balance (pc = 0.5, pm = 0.5) between the two operators may lead to better results on harder problems.

Selection pressure: the authors have had very good experiences with low selection pressure (tournaments of 4).
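Tournament selection with a tournament of 4 can be sketched as follows. The function name, the assumption that higher fitness is better, and the toy population are illustrative choices, not taken from the text.

```python
import random

def tournament_select(population, fitness, size=4, rng=random):
    """Pick `size` distinct individuals at random and return the fittest."""
    contestants = rng.sample(population, size)
    return max(contestants, key=fitness)  # assumes higher fitness is better

# Toy population of integers; fitness peaks at the value 10.
population = list(range(20))
fitness = lambda x: -abs(x - 10)
winner = tournament_select(population, fitness, rng=random.Random(0))
```

A smaller tournament gives weaker individuals a better chance of being selected, which is why tournament size acts as the knob for selection pressure.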

Parsimony Pressure

Variable parsimony pressure produces very nice, short, and elegant solutions.

Some researchers have reported good results with adaptive parsimony, which is applied only when a solution that performs well is found.
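One common way to express parsimony pressure is a size penalty added to the error being minimized. The sketch below assumes that form; the linear penalty, the threshold `good_enough`, and all names are hypothetical and are not taken from the text.

```python
def penalized_error(error, size, pressure):
    """Error to minimize, with a linear penalty on program size (assumed form)."""
    return error + pressure * size

def adaptive_pressure(best_error, good_enough, pressure):
    """Apply pressure only once a well-performing solution has been found."""
    return pressure if best_error <= good_enough else 0.0

# Before a good solution exists, size is not penalized at all.
early = penalized_error(0.10, 40, adaptive_pressure(0.10, 0.05, 0.001))
# Once the best error drops below the threshold, longer programs pay a cost.
late = penalized_error(0.04, 40, adaptive_pressure(0.04, 0.05, 0.001))
```

Delaying the penalty lets evolution search freely for a working solution first and only afterwards rewards shortening it.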

Maximum Program Size

The maximum depth of trees or the maximum program size should be set such that the programs can contain about ten times the number of nodes as the expected solution size.

Initial Program Size

Typically, the initial program size should be very small compared to the maximum size.

When no success results from this approach, we suggest trying longer programs at the start to allow the system to start with some complexity already and to avoid local minima early on.
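The rules of thumb in this guide can be gathered into one starting configuration. The function, the dictionary keys, and the `hard_problem` switch are hypothetical; the numbers are the ones suggested above, and they are starting points rather than recommended final settings.

```python
def starting_parameters(expected_solution_size, hard_problem=False):
    """Collect the rules of thumb above into one starting configuration."""
    return {
        "population_size": 10000 if hard_problem else 1000,
        "max_generations": 50,              # low end of the suggested range
        "crossover_probability": 0.5,       # balance suggested for harder problems
        "mutation_probability": 0.5,
        "tournament_size": 4,
        "max_program_size": 10 * expected_solution_size,
    }

params = starting_parameters(expected_solution_size=30)
print(params["max_program_size"])  # 300
```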
