

550 IEEE TRANSACTIONS ON EVOLUTIONARY COMPUTATION, VOL. 10, NO. 5, OCTOBER 2006

Graph-Based Evolutionary Algorithms

Kenneth Mark Bryden, Daniel A. Ashlock, Steven Corns, and Stephen J. Willson

Abstract: Evolutionary algorithms use crossover to combine information from pairs of solutions and use selection to retain the best solutions. Ideally, crossover takes distinct good features from each of the two structures involved. This process creates a conflict: progress results from crossing over structures with different features, but crossover produces new structures that are like their parents and so reduces the diversity on which it depends. As evolution continues, the algorithm searches a smaller and smaller portion of the search space. Mutation can help maintain diversity but is not a panacea for diversity loss. This paper explores evolutionary algorithms that use combinatorial graphs to limit possible crossover partners. These graphs limit the speed and manner in which information can spread, giving competing solutions time to mature. This use of graphs is a computationally inexpensive method of picking a global level of tradeoff between exploration and exploitation. The results of using 26 graphs with a diverse collection of graphical properties are presented. The test problems used are: one-max, the De Jong functions, the Griewangk function in three to seven dimensions, the self-avoiding random walk problem in 9, 12, 16, 20, 25, 30, and 36 dimensions, the plus-one-recall-store (PORS) problem with $n = 15$, $16$, and $17$, location of length-six one-error-correcting DNA barcodes, and solving a simple differential equation semi-symbolically. The choice of combinatorial graph has a significant effect on the time-to-solution. In the cases studied, the optimal choice of graph improved solution time as much as 63-fold, with typical impact being in the range of 15% to 100% variation. The graph yielding superior performance is found to be problem dependent. In general, the optimal graph diameter increases and the optimal average degree decreases with the complexity and difficulty of the fitness landscape. The use of diverse graphs as population structures for a collection of problems also permits a classification of the problems. A phylogenetic analysis of the problems using normalized time to solution on each graph groups the numerical problems as a clade together with one-max; self-avoiding walks form a clade with the semisymbolic differential equation solution; and the PORS and DNA barcode problems form a superclade with the numerical problems but are substantially distinct from them. This novel form of analysis has the potential to aid researchers choosing problems for a test suite.

Index Terms: Evolutionary algorithm, graph-based algorithms, population structure, test suite.

Manuscript received October 18, 2004; revised March 14, 2005. This work was supported in part by a Grant from the National Energy Technology Laboratory, U.S. Department of Energy.

K. M. Bryden and S. Corns are with the Department of Mechanical Engineering, Iowa State University, Ames, IA 50011 USA (e-mail: kmbryden@iastate.edu; [email protected]).

D. A. Ashlock is with the Department of Mathematics and Statistics, University of Guelph, Guelph, ON N1G 2R4, Canada (e-mail: dashlock@uoguelph.ca).

S. J. Willson is with the Department of Mathematics, Iowa State University, Ames, IA 50011 USA (e-mail: [email protected]).

Digital Object Identifier 10.1109/TEVC.2005.863128

I. INTRODUCTION

IN NATURE, constraints such as geography, mutual infertility, or partner selection mechanisms are imposed on an individual's ability to reproduce sexually with other individuals. In the simple genetic algorithm (SGA) [19], the only constraint on reproduction is that fitter individuals have a higher probability of being selected to participate. In nature, individuals separated by great distances, no matter what their respective fitnesses, have a very low probability of reproducing with each other. Within many species, one also finds cultural or behavioral constraints on the probability of two individuals reproducing. Birds have complex mating dances that help to identify good partners; frogs use distinctive calls for the same purpose; insects employ pheromones; and human partner selection techniques are complex and variable. Examples of this kind of premating nongeographic isolation can be found in [2]. Any widespread biological phenomenon that appears over and over in populations subject to natural selection probably conveys a selective advantage. Limiting mate choice is thus likely to be desirable in an evolutionary algorithm. In a complex polymodal fitness landscape, it can prevent so-called premature convergence. As we will see subsequently, it can be counterproductive in simple, unimodal fitness landscapes.

One of the standard issues in population genetics is explaining why there are not greater problems with loss of diversity in natural populations even though simple mathematical models show that diversity should vanish rapidly. The theory of isolation by distance [44] gives one reason why diversity loss is lower than expected: the separation imposed by the geography slows the spread of genetic information. Kimura and Crow [24] examined the rate at which populations on different graphical structures lose their genetic diversity under simple reproduction without selection.

Analogously, one of the fundamental problems in evolutionary algorithms is maintaining useful diversity in the population as the algorithm progresses. It is important to note that for some problems the useful level of diversity is almost nil; in others, rich diversity prevents convergence to an undesirable local optimum. During reproduction, individuals in the population are replaced by individuals with parts copied from a stochastically restricted subset of the population, and so diversity loss is acute if not carefully managed. Currently, the primary tool for such management is setting the rate of application of mutation operators. Imposing geography on the algorithm is another management tool. Implemented properly, such geography can have a very low runtime cost.

Except possibly for raising the mutation rate, imposing a geography is the least computationally intensive of the extant diversity preservation techniques. If diversity preservation is required and the randomness of preserving it with a high mutation rate is undesirable, then imposing a geography may be a good choice. The geography is selected as part of the algorithm design and need not impact runtime significantly. This paper explores this by imposing various geographical structures coded as combinatorial graphs on an evolutionary algorithm. We call the result a “graph-based evolutionary algorithm” (GBEA). A unique feature of this paper is the exploration of many different geographic structures rather than a small group of highly related geographies. A small, initial study in this area appears in [8].

1089-778X/$20.00 © 2006 IEEE

Various approaches to managing diversity loss appear in the literature. These include using a high mutation rate, reducing the fitness of organisms in proportion to the number of organisms representing similar solutions (niche specialization), directly rejecting duplicate solutions (e.g., taboo search), and attempting to intelligently manage diversity loss. Many of these methods suffer from requiring the ability to compute the degree to which creatures are similar. With a plain string representation in which each character has meaning, this is easy. More complex representations such as finite state machines [18], parse trees (with their potential for bloat) [9], or GP-automata [3], all of which permit a single solution to have multiple encodings, render this kind of distance computation challenging. An example of this type of distance computation appears in [39], in which the authors make diversity part of a multiobjective optimizer.

Intelligent management of diversity loss is a potentially valuable approach. Intelligent management removes diversity rapidly when it is not required and conserves it when it is. An example of this type of technique appears in [21]. As presented, the technique requires the ability to estimate which building blocks within population members are better or worse than others. This restricts its utility to problems that are represented in a fashion such that i) there are identifiable building blocks for which ii) meaningful estimates of relative worth can be made. As with other schemes for managing diversity, it comes with some degree of computational overhead. In [21], two variations of the technique are compared on a variety of parameter estimation problems and are shown to enhance performance. This success relies in part on incorporating domain knowledge into the representation of the parameter estimation problems so that the building blocks are transparently available to the algorithms.

Another approach to diversity management is to impose a geography upon the population. In [1], a population is placed on each processor of a multiprocessor machine with occasional migration. This differs from the work presented here in that each vertex of the graph contains an entire population rather than a single population member. It also uses a single graph, the connection topology wired into the multiprocessor machine on which the work was performed. The current paper generalizes this work in that it considers many different graphs and differs from it in its choice of what to place at a vertex. Placing whole populations on a vertex is an option. It is also possible to create graphs that simulate placing a population at each vertex. The graph $Z$, defined subsequently, is an example of this type of simulation.

In [29], a version of Darwin’s ideas about the origin of diversity on islands and its later winnowing on continents appears. That paper used a much smaller collection of graphs than the current one and evaluated population members on their competitive ability to play the iterated prisoner’s dilemma rather than on optimization problems. The only real point of commonality is the recognition that graphs may be valuable as geographic structures. In spite of this lack of commonality, [29] contains ideas that may be valuable for extending and improving GBEAs. The idea of the continent/island interaction suggests the use of graphs not in this paper, and the notion of training competitive agents is a potentially interesting application.

One of a large series of investigations by Whitley [31] explores island model algorithms. Distinct populations are placed on islands, and migration rates and population sizes are tuned, with the resulting performance enhancement at least partially attributable to the geographic preservation of diversity. This work is the most similar to GBEAs of which we are aware. As with the continent/island cycle, the island model can be approximated by choosing the correct graph.

Davidor et al. [15] tried using a steady-state ecological model on a grid called the ECOlogical framework. In this work, an individual could breed with its eight neighbors in the grid. Davidor demonstrated improved performance over a baseline algorithm for job shop scheduling with geographically constrained mating. In all the ECOlogical studies, the geography corresponded to an 8-neighbor toroidal graph with size chosen as $32 \times 32$, $45 \times 45$, or $71 \times 71$ depending on a heuristic estimate of the correct population size.

Here is an outline of the remainder of this paper. Section II gives the background mathematical definitions including the choice of graphs used in the experiments. In Section III, graph-based evolutionary algorithms are defined, and the 23 test problems are described. Section IV gives the precise design of the experiments. Section V describes the outcomes of the experiments and discusses the results. Section VI provides the taxonomic analysis of the results and discusses their significance. Section VII draws overall conclusions for this paper. Section VIII discusses what directions might be valuable for additional study.

II. MATHEMATICAL BACKGROUND

We assume some familiarity with graph theory [41] in this paper. A combinatorial graph or graph $G$ is a collection $V(G)$ of vertices and $E(G)$ of edges, where $E(G)$ is a set of unordered pairs from $V(G)$. Two distinct vertices of the graph are neighbors if they are members of the same edge. The number of edges containing a vertex is the degree of that vertex. If all vertices in a graph have the same degree, then the graph is said to be regular. If the common degree of a regular graph is $k$, then the graph is said to be $k$-regular. A graph is connected if one can go from any vertex to any other vertex by traversing a sequence of vertices and edges. The diameter of a graph is the largest number of edges in a shortest path between any two of the vertices. The diameter is, in some sense, the length of the shortest path across the graph.

In this paper, a graph used to constrain mating in a population will be called the population structure. The general strategy is to use the graph to specify the geography on which a population lives, permitting mating only between neighbors, and to find graphs that can preserve diversity without hindering any potential progress due to heterogeneous crossover.
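The definitions of degree, regularity, connectedness, and diameter can be checked mechanically. The following is a minimal sketch, assuming a graph is stored as a dictionary mapping each vertex to the set of its neighbors; the function names are illustrative and do not come from the paper.

```python
from collections import deque

def degrees(adj):
    """Map each vertex to its degree (the number of edges containing it)."""
    return {v: len(nbrs) for v, nbrs in adj.items()}

def is_regular(adj):
    """True when every vertex has the same degree."""
    return len(set(degrees(adj).values())) <= 1

def eccentricity(adj, source):
    """Longest shortest-path distance from source; None if a vertex is unreachable."""
    dist = {source: 0}
    queue = deque([source])
    while queue:                      # breadth-first search from source
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    if len(dist) != len(adj):
        return None
    return max(dist.values())

def diameter(adj):
    """Largest number of edges in a shortest path; None for a disconnected graph."""
    eccs = [eccentricity(adj, v) for v in adj]
    if any(e is None for e in eccs):
        return None
    return max(eccs)

# Usage: a ring of eight vertices, each adjacent to its two ring neighbors.
cycle8 = {i: {(i - 1) % 8, (i + 1) % 8} for i in range(8)}
```

Running one breadth-first search per vertex is quadratic in the number of vertices for sparse graphs, which is acceptable at the population sizes discussed here.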

This paper utilizes a nonstandard operation on graphs called simplexification. Simplexification at a vertex $v$ replaces $v$ with a cluster of vertices, one for each neighbor of $v$, so that all the new vertices are neighbors of one another and each is a neighbor of exactly one of $v$’s former neighbors. Simplexification of a vertex with four neighbors is shown in Fig. 1. The effect of simplexification is to create small groups of vertices that are closely coupled to one another but less closely coupled to the rest of the graph. This creates an analog of a biological refuge in the graphical connection topology. By simplexification of a graph, we mean simultaneous simplexification of all the graph’s vertices.

Fig. 1. Simplexification of a vertex with four neighbors.
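Simultaneous simplexification of every vertex can be sketched as follows. This is one possible reading of the description above, assuming an adjacency-set representation; the labeling of new vertices as pairs `(v, u)` and the helper `complete_bipartite` are choices made for this illustration only. The starting graph, a complete bipartite graph with four vertices on each side, is likewise just a convenient 4-regular example.

```python
def simplexify(adj):
    """Simplexify every vertex of an undirected graph {vertex: set_of_neighbors}.

    Each vertex v is replaced by a cluster of new vertices (v, u),
    one for each neighbor u of v.
    """
    new_adj = {}
    for v, nbrs in adj.items():
        for u in nbrs:
            # (v, u) is adjacent to every other member of v's cluster ...
            links = {(v, w) for w in nbrs if w != u}
            # ... and to exactly one vertex outside it, inside u's cluster.
            links.add((u, v))
            new_adj[(v, u)] = links
    return new_adj

def complete_bipartite(n, m):
    """All edges between a set of n vertices and a disjoint set of m vertices."""
    left = [("a", i) for i in range(n)]
    right = [("b", j) for j in range(m)]
    adj = {v: set(right) for v in left}
    adj.update({v: set(left) for v in right})
    return adj

# Usage: simplexify an 8-vertex, 4-regular graph three times.
graph = complete_bipartite(4, 4)
for _ in range(3):
    graph = simplexify(graph)
```

Each pass multiplies the vertex count of a $k$-regular graph by $k$ and leaves it $k$-regular, since a new vertex has $k - 1$ neighbors inside its cluster and one outside. Here that gives $8 \cdot 4^3 = 512$ vertices.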

A. List of Graphs

This section provides some necessary mathematical definitions and describes the combinatorial graphs used in this paper.

Definition 1: The complete graph on $n$ vertices, denoted $K_n$, has $n$ vertices and all possible edges. An example of a complete graph is shown in Fig. 2.

Definition 2: The complete bipartite graph with $n$ and $m$ vertices, denoted $K_{n,m}$, has $n + m$ vertices divided into disjoint sets of $n$ and $m$ vertices and all possible edges that have one end in each of the two disjoint sets. The three-pre-$Z$ graph shown in Fig. 2 is the complete bipartite graph $K_{4,4}$.

Definition 3: The $n$-cycle, denoted $C_n$, has vertex set $\mathbb{Z}_n$. Edges join pairs of vertices that differ by $\pm 1 \pmod{n}$ so that the vertices form a ring with each vertex having two neighbors.

Definition 4: The $n$-hypercube, denoted $H_n$, has the set of all $n$-character binary strings as its set of vertices. Edges consist of pairs of strings that differ in exactly one position. A 4-hypercube is shown in Fig. 2.

Definition 5: The $n \times m$-torus, denoted $T_{n,m}$, has vertex set $\mathbb{Z}_n \times \mathbb{Z}_m$. Edges are pairs of vertices that differ either by $\pm 1 \pmod{n}$ in their first coordinate or by $\pm 1 \pmod{m}$ in their second coordinate but not both. These graphs are $n \times m$ grids that wrap (as tori) at the edges. A $12 \times 6$-torus is shown in Fig. 2.

Definition 6: The generalized Petersen graph with parameters $n$ and $k$, with $k$ relatively prime to $n$, is denoted $P_{n,k}$ and has vertex set $\{0, 1, \ldots, 2n - 1\}$. The vertices $0, 1, \ldots, n - 1$ are connected in a standard $n$-cycle. The vertices $n, n + 1, \ldots, 2n - 1$ are also connected in an $n$-cycle but with the $i$th vertex connected to the $(i + k)$th vertex. Finally, pairs of vertices $i$, $n + i$ are connected. A generalized Petersen graph is shown in Fig. 2.
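The families defined so far are simple to construct directly. The sketch below assumes an adjacency-set representation and uses the usual textbook labeling for the generalized Petersen graph (an outer cycle on `0..n-1`, an inner cycle on `n..2n-1` with step `k`, and spokes joining `i` to `n + i`); all function names are illustrative.

```python
from itertools import product

def make_graph(vertices, edges):
    """Build {vertex: set_of_neighbors} from a vertex list and an edge list."""
    adj = {v: set() for v in vertices}
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    return adj

def cycle(n):
    """Ring of n vertices; vertex i is joined to i + 1 (mod n)."""
    return make_graph(range(n), [(i, (i + 1) % n) for i in range(n)])

def hypercube(n):
    """Vertices are n-character binary strings; edges join strings differing in one position."""
    verts = ["".join(bits) for bits in product("01", repeat=n)]
    edges = []
    for s in verts:
        for pos in range(n):
            if s[pos] == "0":  # list each edge once, from its 0 side
                edges.append((s, s[:pos] + "1" + s[pos + 1:]))
    return make_graph(verts, edges)

def torus(n, m):
    """n-by-m grid that wraps in both coordinates (intended for n, m >= 3)."""
    verts = [(i, j) for i in range(n) for j in range(m)]
    edges = []
    for i, j in verts:
        edges.append(((i, j), ((i + 1) % n, j)))
        edges.append(((i, j), (i, (j + 1) % m)))
    return make_graph(verts, edges)

def generalized_petersen(n, k):
    """Outer n-cycle, inner n-cycle with step k, and spokes i -- n + i."""
    edges = []
    for i in range(n):
        edges.append((i, (i + 1) % n))
        edges.append((n + i, n + (i + k) % n))
        edges.append((i, n + i))
    return make_graph(range(2 * n), edges)
```

For example, `hypercube(9)`, `torus(16, 32)`, and `generalized_petersen(256, 3)` each have 512 vertices, with regular degrees 9, 4, and 3 respectively; these particular parameter choices are examples, not a list taken from the paper.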

Definition 7: A tree is a connected graph with no cycles. Degree zero or one vertices are termed leaves of the tree. A regular balanced tree of degree $k$ is a tree constructed in the following manner. Begin with a single vertex. Attach $k$ neighbors to that vertex and place these neighbors in a queue. Processing the queue in order, add $k - 1$ neighbors to the vertex most recently removed from the queue and add these neighbors to the end of the queue. Continue in this fashion until the tree has the desired number of vertices. The resulting graph is a tree in which all nonleaves have degree $k$ and which has, constructively, the smallest possible diameter among trees with all nonleaves having degree $k$. These graphs are parameterized by the degree $k$ and the number of vertices $n$. Notice that not all $n$ are possible for a given $k$.
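The queue-based construction in Definition 7 translates almost line for line into code. This is a sketch under the assumption that vertices are labeled by consecutive integers; the function name and the use of `ValueError` to signal an unreachable size are choices made for this example.

```python
from collections import deque

def regular_balanced_tree(k, n):
    """Build the degree-k regular balanced tree on n vertices.

    Raises ValueError when no such tree has exactly n vertices.
    """
    if k < 2:
        raise ValueError("degree must be at least 2")
    adj = {0: set()}
    queue = deque()
    next_label = 1

    def attach(parent, count):
        """Add `count` new leaves to `parent` and enqueue them."""
        nonlocal next_label
        for _ in range(count):
            child = next_label
            next_label += 1
            adj[child] = {parent}
            adj[parent].add(child)
            queue.append(child)

    if n == 1:
        return adj
    attach(0, k)                        # the first vertex receives k neighbors
    while len(adj) < n:
        attach(queue.popleft(), k - 1)  # each later vertex receives k - 1 more
    if len(adj) != n:
        raise ValueError(f"no degree-{k} regular balanced tree has {n} vertices")
    return adj
```

With $k = 5$ the reachable sizes are $1$ and $6 + 4j$ for $j \ge 0$, so 510 vertices is attainable ($j = 126$) while 512 is not.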

Definition 8: The graph $Z$ is created by starting with $K_{4,4}$ and then simplexifying the entire graph three times. Two of the steps leading to the graph $Z$ are shown in Fig. 2.

In addition, four classes of random graphs are used in this paper. A random graph is specified by the algorithm used to create it. Three instances from each class of random graph are used.

Definition 9: An edge move is performed as follows. Two edges $\{a, b\}$ and $\{c, d\}$ are found that have the property that none of $\{a, c\}$, $\{a, d\}$, $\{b, c\}$, or $\{b, d\}$ are themselves edges. The edges $\{a, b\}$ and $\{c, d\}$ are deleted from the graph, and the edges $\{a, c\}$ and $\{b, d\}$ are added. Notice that edge moves preserve the regularity of a graph if it is regular.

Definition 10: Regular random graphs are generated by the following algorithm. Start with a regular graph (recall that a regular graph has all vertices of the same degree) and perform 3000 edge moves on edges selected uniformly at random from those that are valid for edge moves. The 3-regular, 4-regular, and 9-regular random graphs each use a regular graph of the corresponding degree as the starting point. Each such graph is identified by its number of vertices, its regular degree, and the instance $i$ of the graph in this paper.
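An edge move and the repeated-move randomization can be sketched as below. The sketch assumes integer vertex labels and an adjacency-set representation, and it finds valid pairs by rejection sampling (draw two edges uniformly, retry if the move is invalid); the names `try_edge_move` and `randomize` and the retry cap are illustrative assumptions rather than details from the paper.

```python
import random

def try_edge_move(adj, rng):
    """Attempt one edge move in place; return True if the graph changed."""
    edges = [(a, b) for a in adj for b in adj[a] if a < b]
    (a, b), (c, d) = rng.sample(edges, 2)
    if len({a, b, c, d}) < 4:
        return False  # the two edges share a vertex
    if c in adj[a] or d in adj[a] or c in adj[b] or d in adj[b]:
        return False  # one of the four cross pairs is already an edge
    # Delete {a, b} and {c, d}, then add {a, c} and {b, d}.
    for x, y in ((a, b), (c, d)):
        adj[x].discard(y)
        adj[y].discard(x)
    for x, y in ((a, c), (b, d)):
        adj[x].add(y)
        adj[y].add(x)
    return True

def randomize(adj, moves, seed=0):
    """Apply `moves` successful edge moves to adj in place and return it."""
    rng = random.Random(seed)
    done = 0
    tries = 0
    while done < moves:
        tries += 1
        if tries > 1000 * moves:
            raise RuntimeError("too few valid edge moves in this graph")
        if try_edge_move(adj, rng):
            done += 1
    return adj

# Usage: scramble a 20-vertex ring with 50 edge moves.
ring = {i: {(i - 1) % 20, (i + 1) % 20} for i in range(20)}
randomize(ring, 50, seed=1)
```

Every vertex loses one edge and gains one edge in a move, so all degrees are unchanged. Note that an edge move does not by itself guarantee that the graph stays connected.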

Definition 11: Generate random toroidal graphs as follows. A set of 512 points is randomly placed onto the unit torus (the unit square wrapped at the edges, not the torus graph), and edges are created between those at distance 0.07 or less from one another. This distance was chosen to give an average degree of about six. After generation, the graph is checked to see if it is connected. Graphs that are not connected are rejected. Each such graph is identified by the radius $r$ for edge creation and the instance $i$ of the graph in this paper.
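The generate-and-reject procedure of Definition 11 can be sketched as follows. The wrapped Euclidean metric, the uniform placement of points, and the cap on retries are assumptions made for this illustration; the text specifies only the point count, the radius, and the connectivity check.

```python
import random

def torus_distance(p, q):
    """Euclidean distance on the unit square with opposite sides identified."""
    dx = abs(p[0] - q[0])
    dy = abs(p[1] - q[1])
    dx = min(dx, 1.0 - dx)
    dy = min(dy, 1.0 - dy)
    return (dx * dx + dy * dy) ** 0.5

def is_connected(adj):
    """Depth-first search from an arbitrary vertex reaches every vertex."""
    start = next(iter(adj))
    seen = {start}
    stack = [start]
    while stack:
        u = stack.pop()
        for w in adj[u]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return len(seen) == len(adj)

def random_toroidal_graph(n_points=512, radius=0.07, seed=0, max_tries=1000):
    """Return (points, adjacency) for the first connected graph generated."""
    rng = random.Random(seed)
    for _ in range(max_tries):
        pts = [(rng.random(), rng.random()) for _ in range(n_points)]
        adj = {i: set() for i in range(n_points)}
        for i in range(n_points):
            for j in range(i + 1, n_points):
                if torus_distance(pts[i], pts[j]) <= radius:
                    adj[i].add(j)
                    adj[j].add(i)
        if is_connected(adj):
            return pts, adj
    raise RuntimeError("no connected graph found within max_tries")
```

The all-pairs distance check is quadratic in the number of points, which is inexpensive for a one-time construction at this size.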

See Table I for a list of the graphs used in the work reported in this paper. It should be noted that all graphs used, including the random graphs but excluding the 5-regular balanced tree, have 512 vertices so as to control for population size. The one off-size graph has 510 vertices since a 5-regular balanced tree cannot have 512 vertices. Exploration of the tradeoffs involved in varying the number of vertices more than a tiny amount is a topic for future research. The complete graph $K_{512}$ is included as a baseline. Graph-based evolutionary algorithms become equivalent to standard evolutionary algorithms when the graph used is $K_{512}$.

III. GRAPH-BASED EVOLUTIONARY ALGORITHMS

This section defines a GBEA as it is used in this paper. (Clearly, many other methods of incorporating graphs into evolutionary algorithms are possible.) Choose a graph $G$ with vertex set $V(G)$ and edge set $E(G)$ to use as a population structure. Place one individual on each vertex of $G$. Then, use a steady-state evolutionary algorithm [32], [37], [42] in which evolution proceeds one mating event at a time. A mating event is performed as follows. Pick a vertex $v \in V(G)$ uniformly at random. A neighbor of $v$ is then chosen for mating. The variation operators, crossover and mutation, are used to produce a single new individual that may or may not be used to replace the individual on vertex $v$. The details of how the neighbor is picked for mating and how to decide if the new individual replaces the individual on $v$ are together called the local mating rule of the GBEA. This research used local mating rules that pick a neighbor in direct proportion to its fitness (local roulette selection) and permit the new individual to replace the old either automatically or only if it is at least as fit. These local mating rules are called local roulette mating and local elite roulette mating, respectively. Section IV will specify which local mating rule is used for each test problem.

Fig. 2. Examples of complete, Petersen, torus, and hypercube graphs, and some of the steps leading to the $Z$ graph. These examples are all smaller than the graphs actually used but are members of the same families of graphs.

TABLE I. GRAPHS USED AND THEIR INDEX NAMES. INDEX NAMES ARE USED TO INDEX THE GRAPHS IN FIGURES
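A single mating event under the two local mating rules can be sketched as follows. The sketch assumes nonnegative fitness values that are being maximized, a crossover function that returns one child, and caller-supplied `evaluate`, `crossover`, and `mutate` functions; these signatures, and the uniform fallback when all neighbor fitnesses are zero, are assumptions for this example rather than details given in the text.

```python
import random

def roulette_neighbor(adj, fitness, v, rng):
    """Pick a neighbor of v with probability proportional to its fitness."""
    nbrs = sorted(adj[v])
    weights = [fitness[u] for u in nbrs]
    if sum(weights) <= 0:
        return rng.choice(nbrs)  # assumption: uniform choice if all weights are zero
    return rng.choices(nbrs, weights=weights, k=1)[0]

def mating_event(adj, pop, fitness, evaluate, crossover, mutate, rng, elite=True):
    """Perform one steady-state mating event in place and return the chosen vertex.

    elite=False: local roulette mating (the child always replaces).
    elite=True:  local elite roulette mating (replace only if at least as fit).
    """
    v = rng.choice(sorted(adj))                    # vertex picked uniformly at random
    mate = roulette_neighbor(adj, fitness, v, rng)
    child = mutate(crossover(pop[v], pop[mate], rng), rng)
    child_fit = evaluate(child)
    if not elite or child_fit >= fitness[v]:
        pop[v] = child
        fitness[v] = child_fit
    return v
```

Because only the individual at the chosen vertex can change, information spreads through the population no faster than the graph's edges allow, which is the mechanism the population structure relies on.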

A graph-based evolutionary algorithm need not be steady state. Its steady-state character in this paper is a choice. A generational graph-based algorithm could be implemented in a number of ways. For example, roulette-select a neighbor for each vertex to be the coparent based on fitness. Run some form of reproduction on the population member at the vertex and the coparent to obtain the structure that will occupy the vertex in the next generation. The use of a generational form of GBEAs may be desirable in the following circumstances. Suppose that the fitness evaluation has a variable component: either it changes with the population or it changes as new cases of the problem being solved are generated. An example of the former would be the training of agents to play the iterated prisoner’s dilemma [7], [16], [17]. An example of the latter would be the Tartarus task [5], [38]. A generational algorithm evaluates fitness across the population against the same opponents or test cases, yielding a fairness unavailable in a steady-state algorithm.

Fig. 3. Examples of two optimal and two suboptimal walks for the $4 \times 4$ instance of the SAW problem. The fitnesses of the examples are 16, 16, 14, and 15, corresponding to the number of squares visited.

Fig. 4. An optimal PORS tree located by a GBEA with graph $C_{512}$ for $n = 16$ nodes shown in LISP-like notation.

A recent paper [14] by Choi and Moon uses the term “graph-based” in a different sense. In that paper, an analysis of the graph theory underlying the sorting network problem is used to obtain substantial performance improvement. Other than the chance similarity of terminology, it is a distinct type of research.

IV. EXPERIMENTAL DESIGN

The test problems were chosen because they represented

different classes of problems that have been well studied and

have known solutions. For evolutionary algorithms, one-max is

a standard test problem. The De Jong functions are well known

and permit comparison with other work using those functions,

although they do not meet the criteria given in [43] to be a test

suite. The lower-dimensional cases of the Griewangk function

are dif ficult functions for optimization. Plus-one-recall-store

(PORS) is a test problem with an exceptionally well-character-ized fitness landscape for genetic programming. The

case is a deceptive problem, containing a unique and narrow

global optimum and many broader local optima, while the

and cases are not. The DNA barcode problem

is a new problem, included as an applied problem with the

parameters that have been most studied. The ordinary differen-

tial equation solution is a precursor to many applied problems

including heat transfer, fluid flow, and combustion [10], [13],

[33].Simulations were performed for 23 test problems on each

of the 26 graphs given in Table I. For 22 of the problems,

5000 independent evolutionary simulations were performed,

and for one problem (differential equation solution), 10,000

simulations were performed to obtain tighter confidence inter-

vals. The number of mating events required to find a correct

solution to the problem was saved for each of these 3,120,000

simulations. If more than 1,000,000 mating events were re-

quired, the simulation was recorded as having failed to find an

answer. For each graph and problem, the mean and standard

deviation of the number of mating events to solution were used

to construct 95% confidence intervals for the mean time tosolution. These are displayed in Figs. 5–12. The test problems,

Page 6: graph based EA

8/7/2019 graph based EA

http://slidepdf.com/reader/full/graph-based-ea 6/18

BRYDEN et al.: GRAPH-BASED EVOLUTIONARY ALGORITHMS 555

Fig. 5. Mean mating events to solution with 95% confidence intervals for theone-max problem.

the local mating rule used with each test problem, and the exact

character and rate of the variation operators used are described

in the following sections.
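The interval construction described above can be sketched as follows. This is a minimal illustration that assumes a normal-approximation interval (mean plus or minus 1.96 standard errors); the text does not state which interval formula was used, and the function name `mean_confidence_interval` is hypothetical. The last line simply checks the reported total number of simulations.

```python
import math
import statistics


def mean_confidence_interval(samples, z=1.96):
    """Return (mean, lower, upper) using a normal-approximation interval."""
    n = len(samples)
    mean = statistics.fmean(samples)
    # Sample standard deviation (n - 1 in the denominator).
    sd = statistics.stdev(samples)
    half_width = z * sd / math.sqrt(n)
    return mean, mean - half_width, mean + half_width


# Run count reported in the text: 22 problems at 5000 runs per graph,
# one problem at 10,000 runs per graph, 26 graphs each.
total_runs = 22 * 26 * 5000 + 1 * 26 * 10_000
```

With 5000 or 10,000 runs per cell, the normal approximation is a reasonable default, which is why it is assumed here.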

A. One-Max

The one-max problem uses a string of bits for a chromo-

some. In this paper, we used 20-bit strings. The fitness of a

string is its weight (the number of ones in it). For the one-max

study, we used local roulette mating. The crossover operator was

two-point crossover. The mutation operator flipped one bit se-

lected uniformly at random. The choice not to use elite replace-

ment on the one-max problem reflected the essentially trivial

character of the problem. The search problem is harder without

elite replacement and so more likely to yield information about

the relative merit of graphs.
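A minimal sketch of the one-max fitness function and the two variation operators named above is given here. The function names and the list-of-ints chromosome encoding are illustrative assumptions; the local roulette mating rule is not shown.

```python
import random


def onemax_fitness(bits):
    """Fitness is the weight of the string: the number of ones."""
    return sum(bits)


def two_point_crossover(a, b, rng):
    """Exchange the segment between two distinct cut points."""
    i, j = sorted(rng.sample(range(len(a) + 1), 2))
    child1 = a[:i] + b[i:j] + a[j:]
    child2 = b[:i] + a[i:j] + b[j:]
    return child1, child2


def flip_one_bit(bits, rng):
    """Flip a single bit chosen uniformly at random; the input is not modified."""
    k = rng.randrange(len(bits))
    out = list(bits)
    out[k] ^= 1
    return out
```

For example, crossing an all-zeros parent with an all-ones parent of length 20 always yields two children whose weights sum to 20.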

B. De Jong Functions

The De Jong functions $F_1$ through $F_5$ are described in detail in
[22]. $F_1$ is a three-dimensional bowl. $F_2$ is a fourth-degree
bivariate polynomial surface featuring a broad suboptimal peak.
$F_3$ is a sum of integer parts of five independent variables,
creating a function that is flat where it is not discontinuous, a kind
of six-dimensional ziggurat. $F_4$ is a fourth-order paraboloid in
30 dimensions with distinct diameters in different numbers of
dimensions, made more complex by adding Gaussian noise.
$F_5$ is the so-called “foxhole” function with many narrow local
optima placed on a grid. These functions are traditional test
problems in function optimization but do not serve as a complete test

suite. See [43] for incisive comments.

C. The Griewangk Function

The Griewangk function is a sum of quadratic bowls, one per dimension, with cosine terms added to them, subsequently

translated to yield a positive function. It has a plethora of local

optima and is a natural member of a test suite. As the dimension

of the Griewangk function increases, it approaches a unimodal

bowl [43]. For this reason, we include this function in five cases
of relatively low dimension, namely three through seven dimensions.
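For orientation, the form of the Griewangk function most often given in the optimization literature can be sketched as below. This is an assumption: the exact constants and translation used in the experiments are not stated in this section and may differ.

```python
import math


def griewangk(x):
    """Common textbook form: 1 + sum(x_i^2) / 4000 - prod(cos(x_i / sqrt(i)))."""
    quad = sum(xi * xi for xi in x) / 4000.0
    cos_prod = math.prod(
        math.cos(xi / math.sqrt(i)) for i, xi in enumerate(x, start=1)
    )
    return 1.0 + quad - cos_prod
```

In this form the global minimum is 0 at the origin, and the function is non-negative everywhere because the cosine product never exceeds 1.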

D. Self-Avoiding Walks

The self-avoiding walk (SAW) problem uses a string as its

chromosome. The string is over the alphabet {U, D, L, R},

with the letters corresponding to up, down, left, and right

moves on a grid, respectively. The cases of the SAW problem

on grids of size 3 × 3, 3 × 4, 4 × 4, 4 × 5, 5 × 5, 5 × 6,
and 6 × 6 are used. The length of a SAW chromosome is

equal to the number of cells in the grid minus one. Fitness

is evaluated by starting in the lower left corner of the grid

and then making the moves specified by the chromosome.

The sequence of moves made is referred to as the walk.

If a move is made that would cause the walk to leave the

grid, then that move is ignored. The walk can also revisit

cells of the grid. Fitness is equal to the number of squares

visited when the walk is completed. The problem is called

the self-avoiding walk problem because optimal solutions do

not revisit squares; they are self-avoiding walks. Examples

of SAW chromosomes and their fitness evaluations are given

in Fig. 3.
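The fitness evaluation just described can be sketched as follows. The letters U, D, L, and R and the coordinate convention (x to the right, y upward, start at the lower left cell) are assumptions made for illustration. The starting cell is counted as visited, which is what allows a chromosome of length one less than the number of cells to reach the maximum fitness.

```python
# Move letters are an assumption; the text only fixes their meaning.
MOVES = {"U": (0, 1), "D": (0, -1), "L": (-1, 0), "R": (1, 0)}


def saw_fitness(chromosome, width, height):
    """Count distinct cells visited by the walk on a width x height grid."""
    x, y = 0, 0  # lower left corner
    visited = {(x, y)}
    for letter in chromosome:
        dx, dy = MOVES[letter]
        nx, ny = x + dx, y + dy
        # A move that would leave the grid is ignored.
        if 0 <= nx < width and 0 <= ny < height:
            x, y = nx, ny
            visited.add((x, y))
    return len(visited)
```

On a 3 × 3 grid, the serpentine walk `"RRULLURR"` visits all nine cells, while `"LLLLLLLL"` never leaves the starting cell.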

The self-avoiding walk functions fill a role similar to those of 

NK-landscapes [23]. Both types of problem are scalable with a

large degree of epistasis, and both possess many global and local

optima. The fourth example given in Fig. 3 has fitness 15 but no

near neighbors (in the Hamming metric) with fitness 16. It is an

example of a local optimum. The SAW problem differs from the

NK-landscape problems in several ways. Every instance of the

SAW problem has a known best fitness; it is possible to know

when you have succeeded. This makes the collection of statis-

tics on algorithm behavior easier. The walk for a given SAW

chromosome yields a simple and intuitive visualization that can

be used to help in analysis.

As SAW problems are a new type of test problem, they should

be checked against the list of criteria for good test suite problems

given in [43].

Criterion 1): SAW problems are quite resistant to hill

climbing. Testing with a single mutation hill climber using the

mutation operator of this paper showed that the ratio of local to

global optima located explodes combinatorially as the problem

size increases.

Criterion 2): The SAW problem is constructively nonlinear,

nonseparable, and asymmetric. If we permute the order of 

moves made, the fitness of a given chromosome varies substantially.
A perfect walk’s moves can be reordered so that the

majority of moves are made off the grid, reducing its fitness

substantially. Since sequences of moves are good only from a

particular starting position, the SAW problem is quite nonsepa-

rable. Loci near the beginning of the chromosome have fitness

independent of later loci, but the fitness of later loci deeply

depends on the values of earlier loci; the fitness is thus not even close to additive and the problem is nonlinear.


Fig. 6. Mean mating events to solution with 95% confidence intervals for the De Jong test suite, functions $F_1$–$F_5$.

Criterion 3): The SAW problem is scalable. The SAW

problem contains an infinite number of cases that can be scaled from trivial to hard.

Criterion 4): This is the sole criterion that the SAW problem

fails to satisfy. The evaluation cost of a SAW problem is small when its size is such that there is any hope of solving it.


Fig. 7. Mean mating events to solution with 95% confidence intervals for the Griewangk function in three to seven dimensions.

Criterion 5): The SAW problem uses a canonical representation, a string over a four-letter alphabet. The SAW problem
thus satisfies four of the five criteria needed for members of a good test suite and so, paired with a problem that has scalable


Fig. 8. Mean mating events to solution with 95% confidence intervals for self-avoiding walks of size 3 × 3, 3 × 4, 4 × 4, and 4 × 5.

cost, yields an acceptable test suite. Work on using GBEAs with
a high evaluation cost problem appears in [12] and [40].

E. Plus-One-Recall-Store

The PORS problem is described in detail in [6]. It is a type of 

maximum problem within the domain of genetic programming

[9], [25], [26] with a small operation set and a calculator-style

memory. The goal of the test problem, called the PORS efficient
node use problem, is to find parse trees that evaluate to the

largest integer result possible given a fixed maximum number

of parse tree nodes. The language has two operations: integer

addition and a store operation that places its argument in an ex-

ternal memory location. The language also has two terminals:

the integer 1 and recall from an external memory. The difficulty
of the PORS efficient node use problem varies strongly
according to the congruence class (mod 3) of the number of
nodes permitted. We ran experiments on $n = 15$, $n = 16$, and
$n = 17$ nodes, representing all three classes. The hardest case
is $n = 15$; the easiest is $n = 16$. An example of a located
solution is given in Fig. 4. Fitness for a given parse
tree was the size of the number it produced when evaluated. In
this set of experiments, the initial population was composed of
randomly generated trees with exactly $n$ nodes. A successful
individual was defined to be a tree that produced the largest
possible number (these numbers are computed in [6]). Crossover
was performed by the usual subtree exchange [26]. If this
produced a tree with more than $n$ nodes, then a subtree of the root
node iteratively replaced the root node until the tree had fewer
than $n + 1$ nodes. This operation is called chopping. Mutation was
performed by replacing a subtree picked uniformly at random


Fig. 9. Mean mating events to solution with 95% confidence intervals for self-avoiding walks of size 5 × 5, 5 × 6, and 6 × 6.

with a new random subtree of the same size for each new tree

produced. For the PORS experiments, local elite roulette mating

was used.
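To make the PORS language and the chopping operation concrete, here is a small sketch. Several details are assumptions not fixed by this section: trees are nested tuples, the node labels `"+"`, `"STO"`, `"1"`, and `"RCL"` are illustrative, the memory starts at 0, arguments are evaluated left to right, the store operation returns its argument, and chopping picks the replacement child at random. See [6] for the authoritative definition.

```python
import random


def pors_eval(tree):
    """Evaluate a PORS parse tree and return its integer value."""
    memory = [0]  # assumption: the external memory starts at 0

    def go(node):
        if node == "1":
            return 1
        if node == "RCL":
            return memory[0]
        op = node[0]
        if op == "+":
            left = go(node[1])  # assumption: left argument first
            right = go(node[2])
            return left + right
        if op == "STO":
            value = go(node[1])
            memory[0] = value
            return value  # assumption: store passes its argument through
        raise ValueError(f"unknown node: {node!r}")

    return go(tree)


def node_count(tree):
    """Count operations and terminals in the tree."""
    if isinstance(tree, str):
        return 1
    return 1 + sum(node_count(child) for child in tree[1:])


def chop(tree, max_nodes, rng):
    """Replace the root by one of its subtrees until the tree is small enough."""
    while node_count(tree) > max_nodes:
        tree = rng.choice(tree[1:])  # assumption: child chosen at random
    return tree
```

For instance, the six-node tree `("+", ("STO", ("+", "1", "1")), "RCL")` stores 2 and then adds the recalled 2, giving 4 under these assumptions.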

F. DNA Barcodes

DNA barcodes [4] are error correcting codes [28] over the

DNA alphabet {A, C, G, T} which are able to correct errors relative
to the edit metric [20]. They are used as embedded markers

in genetic constructs to permit retention of source information

when sequencing pooled genetic libraries. An example of their

successful use to retrieve sequence source information appears

in [30].

Unlike binary error correcting codes over the Hamming

metric, edit metric codes lack a beautiful algebraic theory.

Those used were located with a greedy closure evolutionary

algorithm [4]. This type of evolutionary algorithm uses a representation consisting of a partial structure. The fitness of an

individual partial structure is the quality (in this case size) of its

completion by a greedy algorithm. When searching for DNA

barcodes, the partial structure is a choice of three random DNA codewords, and the greedy algorithm is Conway’s lexicode

algorithm [4]. Fitness is simply the size of the code located

by Conway’s algorithm. The DNA barcode search problem

exhibits a high degree of epistasis, and work thus far suggests

it has an exceedingly rugged fitness landscape.

The algorithm in this paper searches for six-letter DNA words

that are at a mutual distance of at least three. These are the pa-

rameters used for the wet lab testing of the technique in [30].

Barcodes of this size and distance can correct one sequencing

(edit) error.
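The greedy closure fitness described in this section can be sketched as follows. This is an illustration, not the authors' code: the edit metric is taken to be the standard Levenshtein distance, the lexicographic order is taken over the alphabet string `"ACGT"`, and the seed words are assumed to already be mutually at the required distance. The names `greedy_closure` and `barcode_fitness` are hypothetical.

```python
from itertools import product


def edit_distance(a, b):
    """Levenshtein distance: insertions, deletions, and substitutions cost 1."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cur.append(min(prev[j] + 1,                # delete ca
                           cur[j - 1] + 1,             # insert cb
                           prev[j - 1] + (ca != cb)))  # substitute or match
        prev = cur
    return prev[-1]


def greedy_closure(seeds, length, min_dist, alphabet="ACGT"):
    """Extend the seed words greedily, scanning all words in lexicographic order."""
    code = list(seeds)
    for letters in product(alphabet, repeat=length):
        word = "".join(letters)
        # Accept a word only if it is far enough from everything chosen so far.
        if all(edit_distance(word, c) >= min_dist for c in code):
            code.append(word)
    return code


def barcode_fitness(seeds, length=6, min_dist=3):
    """Fitness of a partial structure is the size of its greedy completion."""
    return len(greedy_closure(seeds, length, min_dist))
```

The defaults of `barcode_fitness` match the six-letter, distance-three setting of this paper; smaller values of `length` are convenient for quick checks.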

G. Differential Equation Solution

Solving differential equations is a common genetic programming problem. Modifying the usual technique, the algorithm in


Fig. 10. Mean mating events to solution with 95% confidence intervals for the PORS problem with $n = 15$ (top left), $n = 16$ (top right), and $n = 17$ (bottom).

this paper extracts the derivatives symbolically when computing fitness.
Code for solving differential equations with symbolic derivatives used in the fitness function is available by contacting

the second author.

We solve the differential equation

(1)

a simple homogeneous equation with a two-dimensional solu-

tion space

(2)

for any constants , .

The parse tree language used has operations and terminals

given in Table II. Trees were initialized to have six total oper-

ations and terminals. Fitness for a parse tree coding a function

was computed as the sum of the error function

over 100 equally spaced sample points in the range . This is the squared deviation

from agreement with the differential equation. This function is

to be minimized, and the algorithm continues until 1,000,000

mating events have taken place (this did not happen in prac-

tice), or until the fitness function summed over all 100 sample

points drops below 0.001. A filter was included to prevent trivial

solutions, e.g., , and trivial solutions were assigned a

fitness of when they were detected.

Crossover and chopping were performed as in the PORS ex-

periments; trees were chopped if they had in excess of 22 total

operations and terminals. In addition to subtree mutation of the

sort used in the PORS experiments, a constant mutation was ap-

plied to each new parse tree. Constant mutation has no effect on


TABLE II. OPERATIONS AND TERMINALS FOR THE DIFFERENTIAL EQUATION PARSE TREE

LANGUAGE. THE SYMBOL r DENOTES A REAL NUMBER

parse trees that do not contain ephemeral real constants. For a tree that does contain such constants, either as a terminal or as

part of a unary scaling operation, one of the constants is selected

uniformly at random, and a number uniformly distributed in the

region [-0.1, 0.1] is added to the constant. Ephemeral constants
are initialized in the range [-1, 1] but may be taken outside of

this range by constant mutation. Local elite roulette mating was

used. Each new tree produced resulted from a subtree crossover

and was subjected to both a subtree mutation and a constant mu-

tation.
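The constant mutation operator can be sketched as below. The tree encoding is an assumption made only for this illustration: a tree is a nested list whose first element is an operation name, terminals are strings, and every Python `float` is an ephemeral constant (for example `["scale", 0.5, "x"]` for a unary scaling operation). The actual operations of Table II are not reproduced here.

```python
import random


def constant_paths(tree, path=()):
    """Yield index paths to every float (ephemeral constant) in a nested-list tree."""
    if isinstance(tree, float):
        yield path
    elif isinstance(tree, list):
        # Element 0 is the operation name; the rest are arguments.
        for i, child in enumerate(tree[1:], start=1):
            yield from constant_paths(child, path + (i,))


def constant_mutation(tree, rng, step=0.1):
    """Add uniform noise in [-step, step] to one randomly chosen constant."""
    paths = list(constant_paths(tree))
    if not paths:
        return tree  # no ephemeral constants: the operator does nothing
    path = rng.choice(paths)
    delta = rng.uniform(-step, step)
    if not path:
        return tree + delta  # the whole tree is a single constant
    node = tree
    for i in path[:-1]:
        node = node[i]
    node[path[-1]] += delta  # in-place update of the chosen constant
    return tree
```

Because the perturbation is added without clamping, repeated mutations can carry a constant outside its initialization range, as the text notes.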

Equations (3)–(5) are examples of solutions found by a

GBEA on the graph . All of these are in fact analytical solu-

tions to the equation as were the majority of solutions located

(3)

(4)

(5)

V. RESULTS

The primary objective of this paper was to determine the po-

tential impact of population structure in the form of combinato-

rial graphs on solution speed. It also sought to document which

graphs yield superior performance for a specific problem. The

major result can be summarized by saying that choice of graph substantially impacts solution time and that the correct choice

of graph varies from problem to problem.

For each graph and test problem, 5000 tests were completed,

except in the case of the differential equation where 10,000 tests

were completed for each graph. In each case, time-to-solution

numbers were saved with time measured in mating events. For

the one-max, SAW, and PORS problems, the solution consisted

of the appearance of the first instance of the known correct so-

lution. In the function optimization problems, a simulation was

said to have found the solution when it obtained a value within

0.001 of the known optimal value. DNA barcodes were evolved

until they achieved the size of the current best known solution.

For the differential equation problem, the correct solution was taken to be a total squared error over all 100 sample points of at

Fig. 11. Mean mating events to solution with 95% confidence intervals for the differential equation solution problem.

Fig. 12. Mean mating events to solution with 95% confidence intervals for the

DNA barcode problem.

most 0.001. Figs. 5–12 show the relative performance of each

graph as scatter plots with 95% confidence intervals and the

graphs sorted in increasing order of time to solution.

As used in this discussion, “performance” refers to the

number of mating events required to find an acceptable solution

to the problem. The top-to-bottom impact that the choice of 

graph has on problems is shown in Table III. An initial examina-

tion of the confidence intervals shows that performance varies

from graph to graph, often significantly. Also, the degree to

which performance varies is problem dependent. This indicates

that graph-based evolutionary algorithms have the potential

to significantly reduce convergence time for many classes of challenging problems. It is important to remember that the


one-max problem as presented here uses a nonelitist algorithm,

increasing its difficulty. The other 20 sets of simulations use

elitist algorithms.

The test problems can be divided into the following groups.

1) Problems with a simple fitness landscape (one-max, $F_1$,
$F_3$, and $F_4$). The fitness landscape for these problems is
a single hill that fills the entire landscape, adjusted in the
case of $F_4$ by noise. The landscapes for one-max and $F_3$
are very similar (discontinuous pyramids), though they

use different representations. These are relatively straight-

forward optimization problems.

2) Problems in which the fitness landscape has many local

optima and several global optima (PORS16, PORS17, dif-

ferential equation, SAW). Although the PORS16 problem

has multiple optimal hills in its fitness landscape, each of 

these hills has the same fitness value. Additionally, many

of the suboptimal solutions for the PORS16 problem con-

tain “building blocks” that are tree fragments that create

the numbers 2 or 3 or multiply a single argument by those numbers. The optimal answer requires two fragments
creating or multiplying by two and two fragments creating

or multiplying by three. The effect of this is that the ma-

jority of the local optima in the search space contain the

tree fragments needed in each of the 24 optimal answers.

These optimal answers differ only in the details of how

they use the building blocks. See [6] for details.

The fitness landscape for the differential equation

problem is the most intricate of these test problems. It is

far larger and weirder than landscapes for the other test

problems. As a search problem, it is dense with small

correct answers [e.g., (3) and (5)], so much of the space

is not involved in most searches. Unlike the PORS16

problem, the majority of these solutions cannot be built

from fragments of each other.

3) Mildly deceptive or difficult landscapes with a global
optimum hidden by a larger local optimum ($F_2$ and some of the
lower-dimensional Griewangk functions). This class of
problems had the hypercube or one of its random variants as their

best graph, with a modest improvement in performance

from 2% to 16%.

4) Problems with very difficult, possibly deceptive
landscapes (PORS15, $F_5$, DNA barcodes). The PORS15
problem is the hardest search problem among the test
problems. The difficulty arises because the correct solution
is a unique tree that computes 32 ($2^5$), and because

trees that generate threes are local optima that use large

(five node) tree fragments that are of no use at all in an

optimal solution. See [6] for details. The foxhole function

also has a large number of traps.

From Table III, it is clear that the use of graphs has a substantial
impact on the difficult or deceptive problems in the test set.
However, for the three hardest problems ($F_5$, PORS15, DNA
barcodes), the best graph to use was very different. To compare
$F_5$ and PORS15, examine Fig. 13. The baseline evolutionary
algorithm, the GBEA with the complete graph, corresponds to the lowest
diameter graph in both plots, with Log Diameter $= 0$. For the
foxhole function $F_5$, the complete graph is an outlier, whereas

TABLE III. IMPACT OF CHOICE OF GRAPH ON SOLUTION TIME. FOR EACH PROBLEM THE

MINIMUM AND MAXIMUM MEAN TIME TO SOLUTION FOR ANY OF

THE 26 GRAPHS USED IS GIVEN TOGETHER WITH THEIR RATIO. THE

BENEFIT COLUMN GIVES THE IMPROVEMENT OVER THE BASELINE

STANDARD EVOLUTIONARY ALGORITHM

for PORS15, the complete graph is part of a smooth inverse cor-

relation between diameter and time to solution.

A. Performance of the Complete Graph

The complete graph, which configures a GBEA to run as a standard
evolutionary algorithm, yielded the best results for the following
problems: one-max, the noisy unimodal function $F_4$, the
simplest of the three PORS problems, all cases of the SAW

problem, and the differential equation problem. This last had

the shortest time to solution on average of any of the problems

checked. These problems include both unimodal and the most

highly polymodal problems in the test set. They do not include

the difficult problems: PORS15, DNA barcodes, and the foxhole
function $F_5$.

B. Performance of Degree-9 Graphs

The hypercube or one of the three random graphs derived
from it was the best graph for $F_1$, $F_2$, $F_3$, $F_5$, and all instances of
the Griewangk function. In most of these cases, the hypercube
and its random analogs outperformed the complete graph by a
modest margin. For the deceptive function $F_5$, the difference
was quite large. If only one graph must be chosen, then the suite
of problems used in this paper suggests the hypercube is a good
compromise choice. It did, however, perform poorly on both
PORS15 and DNA barcodes.

C. Performance on the Hardest Problem

If we rate problem difficulty by time to solution, the PORS15
problem is the most difficult problem in the test suite. The worst
graph for this problem is the complete graph. The second worst
are the four degree-9 graphs. Thus, the two types of graph that
between them are the best for all other problems in the test suite


Fig. 13. A plot of graph diameter versus time to solution for $F_5$ and PORS15.

are the worst for PORS15. The best graphs for this problem are

, , and .

The graphs , are the two highest diameter graphs.

The graph is the only one specifically designed for GBEAs. It

is essentially fractal in character, with closely coupled smaller

groups organized into less closely coupled larger groups across

three levels of scale; it is intended to create local demes. Having

a high diameter is another method of creating disparate demes through isolation by distance.

D. The Impact of Randomization

For degree 3, 4, and 9, three randomized graphs each were

included. The experiments demonstrate very little impact of 

this randomization. In most experiments, the randomized

graphs group with the nonrandom graphs of the same degree.

Even when a statistically significant separation appeared, e.g.,

degree-4 graphs for PORS15, the graphs were in the middle of the distribution of performances. There are combinatorially

huge numbers of randomized versions of a given type of regular

graph. Some of these may in fact exhibit significantly superior

performance; nevertheless, the randomly sampled graphs within a

degree family used in this paper do not exhibit useful levels of 

enhanced performance.

E. The Very Worst Type of Graph

The regular trees were extraordinary in having only one

problem in which any of them performed well, the DNA

barcode problem. For the SAW problems they were in the

middle of the pack. The SAW problems were best solved with

a standard algorithm, i.e., a GBEA using the complete graph. For PORS15,

they were in the bottom half but beat the complete graph and

the degree-9 (hypercube) family. For all other problems they

were the worst, often by a large margin. The current test suite

of problems gives no reason to think these graphs should ever

be used.

F. The Deceptive Functions

The De Jong function $F_5$ and PORS15 are the deceptive
problems in the test suite. Fig. 13, which displays time to solution
versus the log of graph diameter, shows that the behavior of the
graphs on these problems is very different. For $F_5$, there is a

rough correlation of log diameter and time to solution, with the

complete graph and the regular trees behaving as outliers. For

PORS15, there is a fairly strict inverse correlation of log diam-

eter with time to solution.

The behavior of and on demonstrate that di-

ameter and degree do not tell the whole story. If we dismiss the

behavior of the regular trees as pathological, the two graphs with

extreme degree and diameter have almost the same average time

to solution on . The way that beats on many prob-

lems is additional evidence that graph structure beyond degree

and diameter impacts performance.

PORS15 has a simpler behavior than $F_5$. It was hypothesized
that high diameter graphs act like island models. The water between
the islands is made of majority low fitness members of

the initial population. The islands form around distinct higher

fitness individuals. Each island is a chance not to fall into one

of the local optima of the fitness landscape. In order to check 

this hypothesis, a set of runs was performed with 128 disjoint

copies of . The time to solution was comparable to that of 

.

VI. PHYLOGENETIC ANALYSIS

The data available after performing the 26-graph 23-problem comparison permit a novel sort of analysis of the problems. A


taxonomy is a hierarchical classification of a set. Linnaeus estab-

lished the first definite hierarchy used to classify living organ-

isms. Each organism was assigned a kingdom, phylum, class,

order, family, genus, and species. This hierarchy gave a tree

structure to the taxonomy for all living creatures. Modern tax-

onomy has nineteen levels of classification, extending Linnaeus’

original seven. A cladogram is a tree diagram showing the evolutionary relationship among various taxonomic groups. The

reader should see [27] for details on modern taxonomic proce-

dures for living organisms. The data gathered in the GBEA study

were used to create a taxonomy of test problems. Making such

a cladogram required extracting taxonomic characters from the

collection of problems. A taxonomic character is simply a mea-

surable or computable quantity such as number of legs or max-

imum number of teeth in a healthy adult. Using the taxonomic

characters, hierarchical clustering produced a cladogram that

classified the problems as more or less similar. Hierarchical

clustering starts with the members of a set (thought of as sin-

gleton subsets), finds the two closest, and replaces them with

their union or average, repeating until all members are merged.

The choice of taxonomic characters used for clustering is critical.
They must avoid bias; they must vary across the set of problems;
and they must avoid arbitrary judgments to the greatest

degree possible. Using color in a numerical tree-building algo-

rithm, for instance, requires numbers be assigned to colors in a

fashion that arbitrarily ranks some colors as closer to one an-

other than others. The preceding brief discussion gives only a

taste of the difficulty of choosing good taxonomic characters.

Readers familiar with choosing decision variables for automatic

classification, decision trees, and related branches of machine

learning will recognize the issues. Any taxonomic character or

decision variable must be relevant to the decision being made, vary across the set of objects being classified, and be cleanly

computable for all members of the set of objects being classified.

GBEAs provide a source of taxonomic characters that are

computable for any evolutionary computation problem that has

a detectable solution or end point. The time to solution for a

problem varies in a complex manner with the choice of graph-

ical connection topology. This complexity is itself the genesis of 

the taxonomic characters. The taxonomic characters used to de-

scribe a problem are the normalized mean solution times for the

problem on each graph. These characters are purely numerical.

They are objective in the sense that they do not favor any par-

ticular choice of representation or parameter setting. This gives

each of our 23 problems a set of 26 taxonomic characters. The

resulting taxonomy is given in Fig. 14.

A. Details of the Taxonomic Technique

For each of the 23 problems $P$, a real vector $m(P)$ with 26
entries corresponding to the normalized mean solution time
in each of the 26 graphs was created. The entry $m(P)_G$ of $m(P)$
corresponding to graph $G$ was the normalized mean
number of mating events required to solve problem $P$ on graph
$G$. The linear normalization was set so that the solution of $P$
on the graph which required the largest mean number of
mating events among the 26 different graphs received the score
1, and the graph which required the smallest
mean number of mating events received the score 0.

Fig. 14. Results of taxonomic analysis of the test problems.

For each pair of problems $P$ and $Q$, the Euclidean distance
$d(P, Q)$ between the vectors $m(P)$ and $m(Q)$ was then computed
by the formula

$$d(P, Q) = \sqrt{\sum_{G} \left( m(P)_G - m(Q)_G \right)^2},$$

where the sum runs over the 26 graphs. The value $d(P, Q)$ was
interpreted as the distance between the problems
$P$ and $Q$. An “UPGMA” tree was used to describe the taxonomic
relationships among the 23 problems.
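The normalization and distance computation described above can be sketched as follows. The function names are hypothetical, and the sketch assumes that the mean times for a problem are not all equal, since otherwise the linear rescaling to [0, 1] is undefined.

```python
import math


def normalize(mean_times):
    """Linearly rescale mean solution times so the smallest is 0 and the largest is 1."""
    lo, hi = min(mean_times), max(mean_times)
    return [(t - lo) / (hi - lo) for t in mean_times]


def problem_distance(times_p, times_q):
    """Euclidean distance between the normalized per-graph vectors of two problems.

    Both lists must give the graphs in the same order.
    """
    vp, vq = normalize(times_p), normalize(times_q)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(vp, vq)))
```

Two problems whose mean times differ only by a scale factor are at distance 0, which illustrates that the comparison captures relative performance across graphs rather than absolute difficulty.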

UPGMA is a clustering method commonly used to transform

distance data into a tree. It received attention in [34], and a good
recent description may be found in [36]. It is especially reliable
if the distances have a uniform meaning. Normalization of the
mean solution times makes the widely different rates of convergence
comparable so that the inferred distances are appropriate

for analysis by UPGMA.

UPGMA is an acronym for “unweighted pair group method
with arithmetic mean.” Given a collection of taxa and distances
$d_{xy}$ between taxa $x$ and $y$, the method first links the two taxa $x$
and $y$ that are least distant. The taxa $x$ and $y$ are merged into a
new unit $z$. For all taxa $w$ other than $x$ and $y$, a new distance $d_{zw}$
is computed as the average of $d_{xw}$ and $d_{yw}$, and it is noted that the
new taxon $z$ really represents the average of two original taxa.
Henceforth, $x$ and $y$ are ignored, and the procedure is repeated to
find the next pair of taxa that are least distant. When two taxa


$x$ and $y$ are combined into a new taxon $z$, the new distance $d_{zw}$ is
the average of $d_{xw}$ and $d_{yw}$, weighted according to the number of
original taxa in $x$ and $y$, respectively; $z$ contains all the original
taxa in both $x$ and $y$. The procedure ends when the last two taxa
are merged.
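The UPGMA procedure described above can be sketched in a few lines. This is an illustrative implementation, not the PAUP code used for the paper: the function name, the input format (a dict keyed by pairs of labels), and the representation of a merged taxon as a tuple of its two parts are all assumptions, and ties between equally close pairs are broken by insertion order.

```python
from itertools import combinations


def upgma(names, dist):
    """Cluster taxa by UPGMA.

    names: list of hashable labels.
    dist: dict mapping a pair (a, b) of labels to their distance.
    Returns (root, merges), where root is a nested tuple and merges lists
    (x, y, distance) in the order the merges happened.
    """
    size = {name: 1 for name in names}  # number of original taxa per cluster
    d = {frozenset(k): v for k, v in dist.items()}
    merges = []
    while len(size) > 1:
        # Link the two current clusters that are least distant.
        x, y = min(combinations(size, 2), key=lambda p: d[frozenset(p)])
        z = (x, y)
        merges.append((x, y, d[frozenset((x, y))]))
        for w in size:
            if w in (x, y):
                continue
            # Average weighted by the number of original taxa in x and y.
            d[frozenset((z, w))] = (
                size[x] * d[frozenset((x, w))] + size[y] * d[frozenset((y, w))]
            ) / (size[x] + size[y])
        size[z] = size[x] + size[y]
        del size[x], size[y]
    (root,) = size
    return root, merges
```

With four taxa where A and B merge first and C joins them next, the distance from the three-taxon cluster to D weights the (A, B) distance twice as heavily as the C distance, which is the size weighting described in the text.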

The UPGMA tree was computed using the standard software
package PAUP [35]. The tree is shown in Fig. 14. Horizontal
distances are proportional to the edge lengths, while vertical

distances are arbitrary and selected for legibility. Problems sepa-

rated by a small horizontal distance (such as Griewangk5, 6, and

7) should be regarded as very similar. Wide separations should

be regarded as significant.

B. Discussion of the Taxonomic Results

The tree given in Fig. 14 has several striking features.

1) All the numerical problems are grouped into a single clade

with the one-max problem.

2) All the SAW problems are grouped into a single clade.
Moreover, the SAW problems break into two subclades
(one of which is 3 × 3, 3 × 4, and 4 × 4) of comparable

horizontal extent as the numerical problems and hence of 

comparable diversity of problem type as the numerical

problems.

3) The three PORS functions appear substantially different

both from each other and from the numerical and SAW

clades, as indicated both by their large horizontal extent

and their placement so as not to form a clade.

The utility of the taxonomy is demonstrated by the

two-member clade containing the PORS15 and DNA barcode

problems, the most difficult problems tested here. Suppose that PORS15 were part of a standard test suite of problems, and

the DNA barcode problem was regarded as a new practical

problem, not as part of the test suite. Taxonomic analysis places

the DNA barcode problem with PORS15, which suggests that

the graphs which worked well on PORS15 would be most

likely to work well on the DNA barcode problem. In fact, this

expectation is realized in this case. Examining Figs. 10 and

12, we see that these two problems perform best on the same

three graphs ( , , and ) and also perform worst

on the same five graphs ( (512,9,1), (512,9,2), (512,9,3),

, and complete). The good performance is on comparatively

sparse graphs, and the poor performance is on graphs of high

regular degree. This suggests that future searches for better

DNA barcodes should use GBEAs with sparse graphs (and

avoid graphs of high regular degree). This information is of 

substantial worth in an ongoing project to create DNA barcode

sets for new sequencing projects.

The SAW and PORS problems demonstrate their worth as test

suite problems by exhibiting substantial diversity in problem

characteristics (horizontal extent in the tree). These results confirm

the mathematical analysis in [6] that suggests that the three

PORS problems have substantially different characteristics. The

placement of PORS17 between PORS15 and PORS16 confirms

that PORS17 is of an intermediate nature compared to the other

two problems. The SAW problem set generates substantial diversity by simply varying its parameter. By contrast, the numerical

problems generate less diversity, and an effective test suite

might omit some of the problems as being redundant.

It is important to note that the taxonomy reflects relative performance

on different graphs and not problem difficulty. The

normalization of mean times into the range [0,1] before use in

comparing the problems eliminates all information about the

amount of time required to solve the problem. This explains why the semisymbolic differential equation problem ended up

as a sister group to the SAW clade even though it is enormously

easier than most of the SAW problems. This comparative sim-

plicity is shown by the small number of mating events required

for solution in Fig. 11 compared with Fig. 9.
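A small sketch of why normalization hides difficulty: the text states only that mean times are mapped into [0,1], so the min-max rescaling used here is an assumption, and the function name and numbers are hypothetical. Two problems whose absolute times differ by three orders of magnitude produce identical normalized profiles.

```python
def normalize_mean_times(mean_times):
    """Min-max rescale one problem's mean times (one per graph) into [0, 1].

    Assumption: min-max scaling; the text specifies only the target range.
    """
    lo, hi = min(mean_times), max(mean_times)
    if hi == lo:
        # All graphs performed identically; no ordering information remains.
        return [0.0 for _ in mean_times]
    return [(t - lo) / (hi - lo) for t in mean_times]

# Hypothetical mean mating events to solution on three graphs.
easy = [100.0, 150.0, 200.0]
hard = [100000.0, 150000.0, 200000.0]

easy_profile = normalize_mean_times(easy)
hard_profile = normalize_mean_times(hard)
print(easy_profile == hard_profile)
```

Only the relative ordering and spacing of the graphs survives the rescaling, which is why an easy problem can sit next to a hard clade in the taxonomy.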

Overall, the taxonomy in Fig. 14 is plausible and agrees

with what the authors know of the test problems. The tech-

nique shows promise for helping to decide which problems are

similar. It may also help to winnow large test problem suites

by picking representatives from groups of similar problems

(such as selecting only a few representative numerical problems

rather than including all of them).

VII. CONCLUSION

Graph-based evolutionary algorithms can improve per-

formance on some problems. Among the problems used in

this paper, performance gain was the greatest on the hardest

problems. The largest improvement in performance was in

excess of 1200%, but roughly half of all test problems showed

no improvement from using a GBEA. The choice of correct

graph for a GBEA is clearly problem dependent. The taxonomy

given in Fig. 14 gives some guidance as to which problems are

similar, at least in the sense of being solved quickly or slowly

on the same graphs. As a rule of thumb, difficult and deceptive problems work best with sparser graphs.

The additional runtime cost of using a GBEA, over that of a

standard evolutionary algorithm for the same problem, is very

low. If a good graph for the problem can be located, then there

is potential for substantial benefit at very low cost. For the

Griewangk functions and the SAW problems, the performance

ordering of the graphs was robust as the dimension of the

problem changed. This suggests that locating a “correct” graph

for a problem could be done on lower dimensional or smaller

problem cases and then scaled. This notion requires additional

study.
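One way to check whether a graph ordering found on a small instance carries over to a larger one is a rank correlation between the two sets of mean times. The sketch below uses Spearman's rho in its simple no-ties form; this statistic is a suggestion and is not taken from the paper, and the function names and timing values are hypothetical.

```python
def ranks(values):
    """Rank positions (0 = smallest); assumes no tied values."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0] * len(values)
    for rank, i in enumerate(order):
        r[i] = rank
    return r

def spearman_rho(a, b):
    """Spearman rank correlation for equal-length lists without ties (n >= 2)."""
    n = len(a)
    ra, rb = ranks(a), ranks(b)
    d2 = sum((x - y) ** 2 for x, y in zip(ra, rb))
    return 1.0 - 6.0 * d2 / (n * (n * n - 1))

# Hypothetical mean times for the same four graphs at two problem sizes.
low_dim = [120.0, 300.0, 210.0, 450.0]
high_dim = [900.0, 2500.0, 1600.0, 4100.0]

rho = spearman_rho(low_dim, high_dim)
print(rho)
```

A value near 1 would support choosing the graph on the cheaper instance; a value near 0 or below would argue against scaling the choice.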

The behavior of a problem on a suite of graphs forms an in-

teresting description of the problem itself. By looking at which

graphs work well or badly with a given problem, the problem

can be characterized. This gives an objective taxonomic tool

which could be quite useful for classifying problems. It is worth

noting that the taxonomy, as presented here, is an essentially ex-

ploratory technique for data analysis.

GBEAs have been applied to the problem of designing a

wood-burning stove for use in Nicaragua. The goal was to

decide where to place baffles in the flow of combustion gasses

to make the temperature of the stove top as even as possible.

The design and deployment of these stoves is described in

[40]. The details of thermal systems engineering of the stoves

appear in [12]. A description of the impact of using GBEAs on the problem is given in [11]. To summarize the results: it was


found that preserving diversity uniformly hindered progress.

The better diversity was maintained, the worse the average

value of the objective function for the stove.

The application of placing baffles in a wood-burning stove is

an example of a problem for which the GBEA technique is not

required. The evidence inversely correlating diversity preservation

with performance suggests that the baffle placement problem is best run on a highly connected graph if a GBEA

is used at all. The evidence also suggests that a standard EA

is a better choice than a GBEA for this problem. At present

there are no obvious features of this thermal systems problem

that suggest ab initio that it was a problem for which diversity

preservation was bad.

VIII. FUTURE DIRECTIONS

The graphs used in this paper are highly symmetric graphs,

regular graphs, regular trees, or random regular graphs. The

sole exception is the random toroidal graphs which, while not

regular, are isotropic in the sense that the neighborhood of any

vertex is generated with the same kind of random process as

all the other vertex-neighborhoods. The idea of interactions

between a continent and nearby islands which motivated the

work in [29] suggests the use of a different sort of graph. This

conjectural class of graphs would have a highly connected core

with sparse connections outward to other connected regions

with fewer vertices than the core. The core area would serve as

the continent and the smaller connected regions as the islands.

The graph is somewhat similar in its connectivity to an

archipelago, at best a distant approach to a continent/island

graph.

During the review procedure, it was pointed out that examination

of GBEA behavior without crossover would be valuable.

In such a GBEA, a population member would be selected at

random, a neighbor selected in a fitness-proportional manner,

and that neighbor copied over the selected population member

and then mutated. A comparison of the work presented here with

data from a crossover-free version would permit examination of 

the utility of crossover. It would help to distinguish between two

different explanations for the observed changes in performance.

Are they due to the effects of geographic isolation or to hetero-

geneous crossover? It is hypothesized in this paper that enabling

the maximum number of crossovers between dissimilar parents

will enhance performance on at least some of the test problems.

It is just as plausible that isolation, enabled by the connectivity

of the various graphs, is varying the effective number of sub-

populations exploring distinct solutions.
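The crossover-free variant described above can be sketched as a single update step. This is an illustration under stated assumptions, not the authors' implementation: fitness values are assumed nonnegative so that they can serve directly as selection weights, the graph is given as an adjacency dictionary, and the names `crossover_free_event` and `flip_one_bit`, the six-vertex cycle, and the one-max fitness are all chosen here for the example.

```python
import random

def crossover_free_event(population, fitness, neighbors, mutate, rng):
    """One update: pick a vertex at random, copy a neighbor over it, then mutate."""
    v = rng.randrange(len(population))            # population member chosen uniformly
    nbrs = neighbors[v]
    weights = [fitness(population[u]) for u in nbrs]
    if sum(weights) > 0:
        # Fitness-proportional choice among the graph neighbors of v.
        donor = rng.choices(nbrs, weights=weights, k=1)[0]
    else:
        donor = rng.choice(nbrs)                  # all weights zero: fall back to uniform
    population[v] = mutate(list(population[donor]), rng)  # copy, then mutate the copy
    return v, donor

def flip_one_bit(genome, rng):
    """Hypothetical mutation operator: flip one randomly chosen bit."""
    i = rng.randrange(len(genome))
    genome[i] = 1 - genome[i]
    return genome

rng = random.Random(0)
cycle = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}  # 6-vertex cycle graph
pop = [[rng.randint(0, 1) for _ in range(8)] for _ in range(6)]
for _ in range(200):
    crossover_free_event(pop, sum, cycle, flip_one_bit, rng)  # one-max fitness = sum of bits
print(max(sum(g) for g in pop))
```

Running the same problem with and without crossover on identical graphs would give the comparison the reviewers asked for, since the graph still restricts where information can spread in both versions.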

In addition to crossover, a number of other standard evolu-

tionary algorithm parameters have not been tested for sensi-

tivity. Additional work has already demonstrated that popula-

tion size has a substantial impact on the performance of GBEAs.

In this paper, the degree of a graph was strongly predictive of 

performance on a problem. Often graphs of the same degree

sorted together in the ordering from best to worst performance.

This ordering by degree changes when the number of vertices

in the graph is varied, and a manuscript addressing this feature of GBEAs is in preparation.

The taxonomic analysis technique can benefit from win-

nowing the list of graphs. In many cases several graphs yield

essentially the same taxonomic information. Using the time

to solution data, a smaller set of graphs has been selected

that is conjectured to yield similar taxonomic information. In

particular, random graphs derived from the same regular graph

seem to yield similar performance to their progenitor on all problems and hence provide no additional taxonomic information.

The reduced list of graphs recommended is: (510,5),

, (512,3), , , , , ,

, , , , (0.07,1), , and . Re-

ducing the list of graphs from 23 to 15 permits generation of the

taxonomic characters with 40 000 fewer evolutionary algorithm

runs with the current experimental design. Researchers wanting

to apply GBEAs to their problems on these graphs in a manner

that can be incorporated into the current taxonomic effort may

contact the second author for exact descriptions of the graphs,

especially (0.07,1), which is an instance of running an algo-

rithm for generating graphs and so not completely specified

here.

In this paper, the taxonomic technique is used to compare 26

different problems each of which appears with exactly one rep-

resentation and exactly one setting of the possible evolutionary

algorithm parameters. A distinct application of the technique

would be to taxonomize the impact of changing representa-

tion and evolutionary algorithm parameters within a problem.

This would be a step toward understanding which versions of a

problem (encompassing both representation and algorithm pa-

rameter settings) are substantially different from one another

and which are essentially the same.

ACKNOWLEDGMENT

The authors would like to thank the members of the Iowa

State Complex Adaptive Systems Program for helpful com-

ments and discussions.

REFERENCES

[1] D. L. Ackley and M. L. Littman, “A case for distributed Lamarckian evolution,” in Artificial Life III: Santa Fe Institute Studies in the Sciences of Complexity, C. Langton, C. Taylor, J. D. Farmer, and S. Ramussen, Eds. Redwood City, CA: Addison-Wesley, 1993, vol. 10.

[2] J. Alcock, Animal Behavior, an Evolutionary Approach, 7th ed. Sunderland, MA: Sinauer Associates, 2003.

[3] D. Ashlock, “GP-automata for dividing the dollar,” in Proc. 2nd Annu. Conf. Genetic Programming, San Francisco, CA, 1997, pp. 18–26.

[4] D. Ashlock, L. Guo, and F. Qiu, “Greedy closure genetic algorithms,” in Proc. Congr. Evol. Comput., Piscataway, NJ, 2002, pp. 1296–1301.

[5] D. Ashlock and M. Joenks, “ISAc lists, a different representation for program induction,” in Proc. 3rd Annual Genetic Programming Conf., San Francisco, CA, 1998, pp. 3–10.

[6] D. Ashlock and J. I. Lathrop, “A fully characterized test suite for genetic programming,” in Evolutionary Programming VII. New York: Springer-Verlag, 1998, pp. 537–546.

[7] D. Ashlock, M. D. Smucker, E. A. Stanley, and L. Tesfatsion, “Preferential partner selection in an evolutionary study of prisoner’s dilemma,” Biosystems, vol. 37, pp. 99–125, 1996.

[8] D. Ashlock, J. Walker, and M. Smucker, “Graph based genetic algorithms,” in Proc. Congr. Evol. Comput., San Francisco, CA, 1999, pp. 1362–1368.

[9] W. Banzhaf, P. Nordin, R. E. Keller, and F. D. Francone, Genetic Programming: An Introduction. San Francisco, CA: Morgan Kaufmann, 1998.

[10] J. Bebernes and D. Eberly, Mathematical Problems From Combustion Theory. New York: Springer-Verlag, 1989.


[11] K. M. Bryden, D. Ashlock, and D. McCorkle, “An application of graph based evolutionary algorithms for diversity preservation,” in Proc. 2004 Congr. Evol. Comput., vol. 1, 2004, pp. 419–426.

[12] K. M. Bryden, D. A. Ashlock, D. S. McCorkle, and G. L. Urban, “Optimization of heat transfer utilizing graph based evolutionary algorithms,” Int. J. Heat Fluid Flow, vol. 24, no. 2, pp. 267–277, 2003.

[13] H. S. Carslaw and J. C. Jaeger, Conduction of Heat in Solids, 2nd ed. London, U.K.: Oxford Univ. Press, 1959.

[14] S. Choi and B. Moon, “A graph-based Lamarckian-Baldwinian hybrid for the sorting network problem,” IEEE Trans. Evol. Comput., vol. 9, no. 1, pp. 105–114, 2005.

[15] Y. Davidor, T. Yamada, and R. Nakano, “The ECOlogical framework II: Improving GA performance at virtually zero cost,” in Proc. 5th Int. Conf. Genetic Algorithms, San Mateo, CA, 1993, pp. 171–176.

[16] D. B. Fogel, “Evolving behaviors in the iterated prisoners dilemma,” Evol. Comput., vol. 1, no. 1, pp. 77–97, 1993.

[17] D. B. Fogel, “On the relationship between the duration of an encounter and the evolution of cooperation in the iterated prisoner’s dilemma,” Working Paper, Jul. 1994.

[18] L. J. Fogel, A. J. Owens, and M. J. Walsh, “Intelligent decision making through a simulation of evolution,” Behav. Sci., vol. 11, no. 4, pp. 253–272, 1965.

[19] D. E. Goldberg, Genetic Algorithms in Search, Optimization, and Machine Learning. Reading, MA: Addison-Wesley, 1989.

[20] D. Gusfield, Algorithms on Strings, Trees and Sequences: Computer Science and Computational Biology. Cambridge, U.K.: Cambridge Univ. Press, 1997.

[21] S. Ho, L. Shu, and J. Chen, “Intelligent evolutionary algorithms for large parameter optimization problems,” IEEE Trans. Evol. Comput., vol. 8, no. 6, pp. 522–541, 2004.

[22] K. A. De Jong, “An analysis of the behavior of a class of genetic adaptive systems,” Ph.D. dissertation, Univ. of Michigan, Ann Arbor, MI, 1975.

[23] S. A. Kauffman, The Origins of Order. New York: Oxford Univ. Press, 1993.

[24] M. Kimura and J. Crow, “On the maximum avoidance of inbreeding,” Genet. Res., vol. 4, pp. 399–415, 1963.

[25] K. Kinnear, Advances in Genetic Programming. Cambridge, MA: MIT Press, 1994.

[26] J. R. Koza, Genetic Programming. Cambridge, MA: The MIT Press, 1992.

[27] E. Mayr and P. D. Ashlock, Principles of Systematic Zoology. New York: McGraw-Hill, 1991.

[28] R. McEliece, The Theory of Information and Coding. Reading, MA: Addison-Wesley, 1977.

[29] H. Mühlenbein, “Darwin’s continent cycle theory and its simulation by the prisoner’s dilemma,” Complex Syst., vol. 5, pp. 459–478, 1991.

[30] F. Qiu, L. Guo, T. J. Wen, D. A. Ashlock, and P. S. Schnable, “DNA sequence-based bar-codes for tracking the origins of ESTs from a maize cDNA library constructed using multiple mRNA sources,” Plant Physiology, vol. 133, pp. 475–481, 2003.

[31] D. Whitley, S. Rana, and R. Heckendorn, “Island model genetic algorithms and linearly separable problems,” in Proc. AISB Workshop Evol. Comput., D. Corne and J. Shapiro, Eds., New York, 1997, pp. 109–125.

[32] C. Reynolds, “An evolved, vision-based behavioral model of coordinated group motion,” in From Animals to Animats 2, J.-A. Meyer, H. L. Roitblat, and S. Wilson, Eds. Cambridge, MA: MIT Press, 1992, pp. 384–392.

[33] H. Schlichting, Boundary Layer Theory, 7th ed. New York: McGraw-Hill, 1979.

[34] P. H. A. Sneath and R. R. Sokal, Numerical Taxonomy; the Principles and Practice of Numerical Classification. San Francisco, CA: Freeman, 1973.

[35] D. L. Swofford, PAUP*. Phylogenetic Analysis Using Parsimony (*and Other Methods). Version 4. Sunderland, MA: Sinauer, 2002.

[36] D. L. Swofford, G. J. Olsen, P. J. Waddell, and D. M. Hillis, “Phylogenetic inference,” in Molecular Systematics, 2nd ed., D. Hillis, C. Moritz, and B. Mable, Eds. Sunderland, MA: Sinauer, 1996, pp. 407–514.

[37] G. Syswerda, “A study of reproduction in generational and steady state genetic algorithms,” in Foundations of Genetic Algorithms. San Mateo, CA: Morgan Kaufmann, 1991, pp. 94–101.

[38] A. Teller, “The evolution of mental models,” in Advances in Genetic Programming, K. Kinnear, Ed. Cambridge, MA: The MIT Press, 1994, ch. 9.

[39] A. Toffolo and E. Benini, “Genetic diversity as an objective in multi-objective evolutionary algorithms,” Evol. Comput., vol. 11, no. 2, pp. 151–167, 2004.

[40] G. L. Urban, K. M. Bryden, and D. Ashlock, “Engineering optimization of an improved plancha stove,” Energy Sustain. Develop., vol. 6, no. 2, pp. 5–15, 2002.

[41] D. B. West, Introduction to Graph Theory. Upper Saddle River, NJ: Prentice-Hall, 1996.

[42] D. Whitley, “The genitor algorithm and selection pressure: Why rank based allocation of reproductive trials is best,” in Proc. 3rd Int. Conf. Genetic Algorithms, 1989, pp. 116–121.

[43] D. Whitley, K. Mathias, and R. J. Dzubera, “Evaluating evolutionary algorithms,” Artif. Intell., vol. 85, pp. 245–276, 1996.

[44] S. Wright, Evolution, W. B. Provine, Ed. Chicago, IL: Univ. of Chicago Press, 1986.

Kenneth Mark Bryden is an Associate Professor and Associate Chair of the Mechanical Engineering Department, Iowa State University (ISU), Ames, IA. He currently heads the Virtual Engineering Research Laboratory with the Virtual Reality Applications Center. The Virtual Engineering Research Group focuses on integration of information technologies and cognition into the engineering process to support decision making for and realization of complex systems. Prior to his arrival at ISU, he worked 14 years in a wide range of engineering positions with Westinghouse Electric Corporation within the Naval Reactors Program. This included eight years in power plant operations and testing and six years in engineering support. His primary research interests are in the integration of virtual reality, high-performance computing, and new computational algorithms to solve complex, tightly coupled engineering, and decision analysis problems.

Daniel A. Ashlock received the doctoral degree from the California Institute of Technology, Pasadena.

He is a Researcher with interests in bioinformatics and the theory and practice of evolutionary computation. His doctoral work was in combinatorics. During 13 years at Iowa State University, he was Head of the Complex Adaptive Systems Program and developed courses in both evolutionary computation and bioinformatics. Joining the faculty of the Department of Mathematics and Statistics in the University of Guelph as their Bioinformatics Chair, he continues to work in both evolutionary computation and bioinformatics. This work appears in more than 50 peer-reviewed scientific publications with topics as diverse as corn genomics, automatic programming, and the design of efficient wood burning stoves for use in the third world.

Steven Corns received the B.S. and M.S. degrees in mechanical engineering from Iowa State University, Ames, in 2001 and 2003, respectively. He is currently working towards the Ph.D. degree in mechanical engineering with the Virtual Engineering Research Group.

His main research interests are in the area of evolutionary computation applied to biological systems and the mechanics of information transfer in evolutionary algorithms.

Stephen J. Willson received the A.B. degree from Harvard University, Cambridge, MA, in 1968, and the M.A. and Ph.D. degrees from the University of Michigan, Ann Arbor, in 1970 and 1973, respectively, all in mathematics.

His dissertation was in algebraic topology under the supervision of A. G. Wasserman. He joined Iowa State University, Ames, in 1973, where he is currently a Professor of mathematics. His research interests include computational biology (especially phylogenetics), fractals, cellular automata, and game theory.