Evolution of Complex Behavior Controllers using Genetic Algorithms Title

Upload: silas-brown

Post on 18-Jan-2016


Evolution of Complex Behavior Controllers using Genetic Algorithms

http://www.cs.unr.edu/~gruber

• Kerry Gruber (UNR, now Intel)
• Jason Baurick (UNR)
• Sushil J. Louis (UNR)
• Funded in part by NSF grant number 9624130


G.A.T.O.R.S

Genetic Algorithm Training of Robot Simulations


General Description

• Use artificial neural networks for control of the simulated robots

• Evolve the weights of the neural networks using a Genetic Algorithm (GA)


Goals

• Develop controllers exhibiting generalized complex behavior
– Perform complex spatially-independent tasks
– Able to perform adequately under varying environmental conditions
• Performance which meets or exceeds that of controllers designed by humans
• Use a minimum of state information


Why use a GA to train the neural network?

• Training is based on actual performance instead of expected performance
– Supervised training models rely on the designer’s understanding of the environment and the expected consequences of input to output relationships
– GA uses whole-run performance instead of single-step input to output relationships


Vacuum Cleaning (Cover as large an area as possible in the time allotted.)

• Move through the environment without retracing previous steps, and do so with no spatial information of the current or previous locations occupied

• Recharge by locating and accessing energy supplies/outlets (prey capture scenarios)

– Energy supplies may only be used by one unit at a time and are not accessible for a time afterwards (only one vacuum cleaner to an outlet)

• Interact with obstacles in the environment without crashing (obstacle avoidance)

• Negotiate obstacles without crashing (wall following)


Simulator

• Robots
– Predators (Vacuum cleaners)
– Prey (Energy supplies/outlets)
• Environment
– 300X300 Spatially Independent Grid
– Contains Obstacles
• Simulation Process
– 1000 time steps


[Figure: Simulation, with obstacles, prey, and predators labeled]


Predators

• Two independent motors
– 1 on each side
– 4 possible states per motor
• Battery
– Depleted as the robot moves
– May be recharged by consuming prey
• Five binary touch sensors
– 4 feelers, 1 crash sensor
• Two real-valued hearing sensors


[Figure: Robot Sensor Positions. Labels: rear, hearing sensor (variable length), touch sensor; dimensions 10 units and 6 units]


Prey/Battery

• Stationary
• Emit Sound (signal) Audible to Predators
– Inversely proportional to square of distance
– Cut off outside hearing range
• May be Consumed by Predators
– Only a single predator may have access at a time
– May not be re-used for a certain period of time
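The signal model above can be sketched in a few lines of Python. This is only an illustration: the function name `signal_level`, the `source_strength` scale factor, and the handling of zero distance are assumptions, not details from the slides.

```python
import math


def signal_level(predator_xy, prey_xy, hearing_range, source_strength=1.0):
    """Sound level one predator hears from one prey (hypothetical helper).

    Inverse-square falloff with a hard cutoff outside the hearing range.
    source_strength is an assumed scale factor, not a value from the slides.
    """
    distance = math.dist(predator_xy, prey_xy)
    if distance > hearing_range:
        return 0.0  # cut off outside hearing range
    if distance == 0.0:
        return math.inf  # assumed handling of the degenerate case
    return source_strength / distance ** 2
```

A prey 5 units away gives a level of 1/25 of the source strength; the same prey heard from beyond `hearing_range` gives 0.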


Environment

• 300X300 Spatially Independent Grid
• 3-10 Obstacles (5-10% of area each)
• 5 Unit Border (Assures entire area not covered)


[Figure: Random Environment, with obstacles, prey, and predators labeled]


Simulation Process

• 1000 Steps
• Each predator randomly given a chance to move
– Provided touch and hearing sensor levels
• New position is determined and battery levels decreased in accordance with motor settings
– If boundary crossed, moved outside of boundary and crash registered (battery still decreased as if no crash occurred)
• If in contact with “live” prey, battery recharged and prey consumed
• New sensor states determined
• If battery depleted, predator considered “dead”
• Sleeping prey awaken if timer expired
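The bookkeeping part of one simulation step might look like the sketch below. It covers only the battery, prey consumption, and sleep timer rules; movement, crash handling, and sensor updates are left out. All names (`Predator`, `Prey`, `run_step`, and the `move_cost` and `touching` callbacks) are hypothetical, and the recharge amount and timer countdown are assumptions.

```python
import random
from dataclasses import dataclass


@dataclass
class Prey:
    sleep_timer: int = 0  # 0 means the prey is "live"


@dataclass
class Predator:
    battery: float
    alive: bool = True


def run_step(predators, prey_list, move_cost, touching, recharge, sleep_time):
    """Advance the battery and prey bookkeeping by one time step.

    move_cost(predator) -> energy used by the current motor settings.
    touching(predator)  -> the Prey in contact with the predator, or None.
    Both callbacks stand in for simulator logic not shown on the slides.
    """
    order = list(range(len(predators)))
    random.shuffle(order)  # predators get their turn in random order
    for index in order:
        predator = predators[index]
        if not predator.alive:
            continue
        predator.battery -= move_cost(predator)
        prey = touching(predator)
        if prey is not None and prey.sleep_timer == 0:
            predator.battery = recharge    # assumed: recharge to a fixed level
            prey.sleep_timer = sleep_time  # consumed prey goes to sleep
        if predator.battery <= 0:
            predator.alive = False         # depleted battery means "dead"
    for prey in prey_list:
        if prey.sleep_timer > 0:
            prey.sleep_timer -= 1          # prey awakens when this reaches 0
```

Calling `run_step` 1000 times would correspond to one full run as described above.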


Neural Network

• Two-layer fully connected artificial neural network
• Sigmoid activation function
• Each node has a bias
• 10 Inputs
– 5 binary touch sensors
– 2 hearing sensors (Using binary states dependent on side and presence)
– 2 binary hearing states
– Battery level
• 1-10 Hidden nodes
• 4 Output nodes; output threshold of 0.5
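A forward pass through a network of this shape can be sketched as follows. The function names are hypothetical, and two details are assumptions rather than facts from the slides: an output counts as "on" only when it is strictly greater than 0.5, and the four output bits are read as two bits per motor (which would give the 4 states per motor mentioned earlier).

```python
import math


def sigmoid(x):
    """Logistic function, written to avoid overflow for large |x|."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def layer(inputs, weights, biases):
    """weights[j][i] connects input i to node j; each node has its own bias."""
    return [
        sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


def controller_outputs(inputs, hidden_w, hidden_b, out_w, out_b, threshold=0.5):
    """Two-layer pass: inputs -> hidden nodes -> thresholded output bits."""
    hidden = layer(inputs, hidden_w, hidden_b)
    outputs = layer(hidden, out_w, out_b)
    return [1 if o > threshold else 0 for o in outputs]


def motor_states(bits):
    """Assumed mapping: two bits per motor, giving states 0..3 for each side."""
    return 2 * bits[0] + bits[1], 2 * bits[2] + bits[3]
```

The overflow-safe sigmoid matters here because weights of up to 100 and much larger biases push the net input far outside the range where a naive `exp` call is safe.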


[Figure: Virtual Prey Location. Labels: hearing sensors, actual prey location, virtual prey location]


Hearing Sensor States

Presence of a virtual food source required a minimum of state information in order to find and capture prey.

• Hearing sensor levels used to determine whether any prey could be heard and the side they were on
• Input as two binary levels
• Saved and used as state information during the next step
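One way to picture how the hearing states feed the network is the sketch below, which assembles the 10 inputs for a step and returns the bits to save for the next one. The encoding is an assumption: one bit per side, set when prey is heard at least as loudly on that side, with (0, 0) meaning nothing is heard. The slides do not give the actual encoding, and the names `hearing_bits` and `build_inputs` are hypothetical.

```python
def hearing_bits(left_level, right_level):
    """Assumed encoding: (heard on left, heard on right); (0, 0) = silence."""
    if left_level <= 0.0 and right_level <= 0.0:
        return (0, 0)
    return (int(left_level >= right_level), int(right_level >= left_level))


def build_inputs(touch_bits, left_level, right_level, previous_bits, battery_level):
    """Return the 10 network inputs and the hearing bits to save for next step.

    Order used here: 5 touch bits, 2 current hearing bits,
    2 saved hearing bits from the previous step, battery level.
    """
    current_bits = hearing_bits(left_level, right_level)
    inputs = (
        list(touch_bits) + list(current_bits) + list(previous_bits) + [battery_level]
    )
    return inputs, current_bits
```

With this arrangement a controller that has just lost the signal still sees which side the prey was on one step earlier.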


GA-Encoding

• 16-bit binary representation of weights (1024-bit string for 4 hidden nodes)
• Input weights range from -100 to 100
• Biases range from -100*N to 100*N (N=number of inputs)

Node 1: W11 W21 ... WN1 B1
...
Node N: W1N W2N ... WNN BN
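The 1024-bit figure can be checked with a little arithmetic: 4 hidden nodes with 10 weights and a bias each give 44 values, 4 output nodes with 4 weights and a bias each give 20, and 64 values at 16 bits each is 1024 bits. The sketch below reproduces that count and shows one possible decoding. The linear mapping from a 16-bit unsigned integer onto the value range is an assumption, since the slides give only the bit width and the ranges; the function names are hypothetical.

```python
def chromosome_bits(n_inputs=10, n_hidden=4, n_outputs=4, bits_per_value=16):
    """Total chromosome length for a two-layer network with biases."""
    hidden_values = n_hidden * (n_inputs + 1)   # weights plus one bias per node
    output_values = n_outputs * (n_hidden + 1)
    return (hidden_values + output_values) * bits_per_value


def decode(bits, low, high):
    """Map a bit string linearly onto [low, high] (assumed decoding)."""
    raw = int(bits, 2)
    return low + (high - low) * raw / (2 ** len(bits) - 1)
```

For example, `decode(bits, -100, 100)` would decode an input weight, and `decode(bits, -1000, 1000)` a bias for a node with 10 inputs.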


GA-Initialization

• Input weights randomly initialized over full range
• Operating point initialization
– Biases set at -0.5*Input Weights
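The effect of operating point initialization can be illustrated with a small experiment. The sketch assumes that "-0.5*Input Weights" means minus half the sum of a node's input weights, and that all inputs are binary; both are readings of the slide, not confirmed details. Under those assumptions each input pattern and its complement produce net inputs of opposite sign, so the node is "on" for half of all patterns. The names `init_node` and `count_active` are hypothetical.

```python
import itertools
import random


def init_node(n_inputs, rng, operating_point=True, w_range=100.0):
    """Draw random input weights and choose the bias by one of two methods."""
    weights = [rng.uniform(-w_range, w_range) for _ in range(n_inputs)]
    if operating_point:
        bias = -0.5 * sum(weights)  # assumed meaning of "-0.5*Input Weights"
    else:
        bias = rng.uniform(-w_range * n_inputs, w_range * n_inputs)
    return weights, bias


def count_active(weights, bias):
    """Count binary input patterns with positive net input (sigmoid > 0.5)."""
    return sum(
        1
        for pattern in itertools.product((0, 1), repeat=len(weights))
        if sum(w * x for w, x in zip(weights, pattern)) + bias > 0
    )
```

A node with a randomly drawn bias can land anywhere between 0 and 1024 active patterns, including the extremes where its output never changes.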


[Figure: Transition Probability Distribution. Probability (log scale, 1.E-05 to 1.E+01) vs. Number of Transitions (0 to 1024); series: Op. Region, Random; data labels 0.996 and 0.750]

75% of all nodes with the bias set by random generation never have a state transition during the first generation. With operating point initialization, 99.6% transition for 512 of 1024 possible input combinations. (10 million randomly generated nodes.)


[Figure: First Generation Coverage Probability Distribution. Probability (log scale, 1.E-06 to 1.E+00) vs. Coverage (0 to 1000); series: Op. Region, Random]

Operating point initialization more evenly distributes the initial coverage values used for fitness determination. Other feature measurements show similar differences. (250K random networks.)


[Figure: Fitness vs. Initialization Method. Fitness (50 to 190) vs. Generation (0 to 100); series: Random, Op. Point]

Using operating point initialization, the GA progresses at a higher rate because the initial nodes are actually operational. (Differences in the random environments cause the high fluctuation of values.)


GA-Fitness Function

• Use five features
– Area coverage
– Number of prey consumed
– Distance covered
– Number of crashes
– Number of obstacle touches
• Relative fitness based on the averages and standard deviations of each generation

$$F_i = W_f \cdot 2^{(X_{if} - \mu_f)/\sigma_f}$$

where:

$F_i$ is the fitness for the ith feature
$W_f$ is the weight of the feature
$X_{if}$ is the score for a given feature
$\mu_f$ is the average value of a feature for a given generation
$\sigma_f$ is the standard deviation of a feature for a given generation
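As a worked example, the relative fitness for one feature can be computed as below. This reads the flattened slide formula as the feature weight times two raised to the number of standard deviations above the generation average, which is an interpretation of the extracted text. The function name and the zero-deviation fallback are assumptions.

```python
def feature_fitness(score, mean, std_dev, weight):
    """Relative fitness for one feature: weight * 2 ** ((score - mean) / std_dev)."""
    if std_dev == 0:
        return float(weight)  # assumed fallback: treat every score as average
    return weight * 2.0 ** ((score - mean) / std_dev)
```

With a coverage weight of 100, a score one standard deviation above the average yields 200, an average score yields 100, and a score one deviation below yields 50. Under this reading the result is always positive, and each additional standard deviation doubles it.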


GA-Fitness Function Cont.

• Only feature scores which indicate operation are included in the average and deviation calculation
– Non-functioning units tend to lower averages
– Non-functioning units increase deviations
– Leads to insufficient selection pressure
• The deviation for crashes is set to the average if the deviation is greater than the average
– There is a clear cut-off for this feature
– High deviations and low averages lead to insufficient selection pressure as the GA matures
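The two adjustments above can be sketched as a single statistics helper. How the original code decided that a unit "indicates operation" is not stated, so the sketch takes that as an explicit per-unit flag. The use of the population standard deviation and the name `generation_stats` are also assumptions.

```python
import statistics


def generation_stats(scores, operating, clamp_to_mean=False):
    """Mean and standard deviation of one feature over operating units only.

    scores    -- one feature score per unit in the generation
    operating -- matching flags; False excludes a non-functioning unit
    clamp_to_mean -- if True, cap the deviation at the mean (crash feature rule)
    """
    used = [score for score, ok in zip(scores, operating) if ok]
    if not used:
        return 0.0, 0.0
    mean = statistics.fmean(used)
    std_dev = statistics.pstdev(used)  # assumed: population standard deviation
    if clamp_to_mean and std_dev > mean:
        std_dev = mean
    return mean, std_dev
```

For example, scores of 0, 0, 4, 6 with only the last two units operating give a mean of 5 and a deviation of 1, instead of the lower mean and larger deviation the idle units would produce.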


Training

• Competitive environment; 3 human-programmed controllers
• Random obstacles, prey, initial positions, and environment variables generated for each generation
• Random variables same for each chromosome


Training Variables

Variable                  Min. Value   Max. Value
Number of Obs.            3            10
Area Coverage per Obs.    5%           10%
Ear Length                1            10
Hearing Range             20           200
Number of Prey            1            4
Prey Length               1            10
Prey Sleep Time           100          1000

Variables are set at the beginning of each generation and maintained for all chromosomes.
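Drawing the training variables once per generation could look like the sketch below. The ranges come from the table; the dictionary keys, the function name, and the choice of uniformly distributed integers are assumptions.

```python
import random

# (min, max) ranges taken from the Training Variables table.
TRAINING_RANGES = {
    "num_obstacles": (3, 10),
    "obstacle_area_pct": (5, 10),
    "ear_length": (1, 10),
    "hearing_range": (20, 200),
    "num_prey": (1, 4),
    "prey_length": (1, 10),
    "prey_sleep_time": (100, 1000),
}


def sample_generation_settings(rng):
    """Draw one set of variables; reuse it for every chromosome in the generation."""
    return {
        name: rng.randint(low, high)  # assumed: uniform integers, inclusive
        for name, (low, high) in TRAINING_RANGES.items()
    }
```

Sampling once and passing the same dictionary to every evaluation keeps chromosomes within a generation comparable, while consecutive generations still see different conditions.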


GA Settings for Final Controller

Variable              Setting
Hidden Nodes          4
Generations           1000
Population Size       100
Crossover Rate        100%
Crossover Points      1
Mutation Rate         0.001
Elite Percentage      30%
Consumption Weight    50
Touch Weight          20
Crash Weight          75
Distance Weight       25
Coverage Weight       100
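To show how these settings fit together, here is a minimal sketch of producing one new generation with one-point crossover applied to every pair, per-bit mutation at 0.001, and the top 30% carried over. The slides do not name the selection scheme or say exactly how the elite percentage is applied, so fitness-proportional parent selection and unchanged elite copies are assumptions, as is the function name.

```python
import random


def next_generation(population, fitnesses, rng,
                    elite_fraction=0.30, mutation_rate=0.001):
    """Build a new population of bit strings from the current one.

    population -- list of equal-length strings of '0' and '1'
    fitnesses  -- matching positive fitness values
    """
    ranked = sorted(zip(fitnesses, population),
                    key=lambda pair: pair[0], reverse=True)
    n_elite = round(len(population) * elite_fraction)
    new_population = [chromosome for _, chromosome in ranked[:n_elite]]
    while len(new_population) < len(population):
        # Assumed selection scheme: fitness-proportional (roulette wheel).
        mother, father = rng.choices(population, weights=fitnesses, k=2)
        point = rng.randrange(1, len(mother))  # single crossover point
        child = mother[:point] + father[point:]
        child = "".join(
            ("1" if bit == "0" else "0") if rng.random() < mutation_rate else bit
            for bit in child
        )
        new_population.append(child)
    return new_population
```

With a population of 100 and 1024-bit chromosomes, a mutation rate of 0.001 flips about one bit per child on average.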


Final Selection

• For testing purposes, the final controller was selected by hand based on objective performance in pre-defined environments
• Web demonstration uses the controller with the best sum of fitnesses among the top 20 controllers over 10 different environments


Implementation

• 8-node Beowulf cluster
– PII 400MHz machines
– Red Hat Linux
– LAM version of MPI
• 3 Java interface applet/applications
– Configuration
– Training
– Simulation Display


Speed-Up

[Figure: Speed-up (0.0 to 4.5) vs. Number of Nodes (1 to 8). Data labels in extraction order, which may not match node order: 1.71, 2.34, 3.48, 3.37, 2.88, 1.00, 3.93, 0.93]


Results-Average Area Coverage

[Figure: Coverage (300 to 800) vs. Random Environment Set (0 to 100); series: 1, 2, 3, Test Unit]

Averages using 100 different random simulation settings in 100 different environments.


Results-Average Number of Touches

[Figure: Number of Touches (0 to 700) vs. Random Environment Set (0 to 100); series: 1, 2, 3, Test Unit]

Averages using 100 different random simulation settings in 100 different environments.


Results-Average Distance

[Figure: Distance (1000 to 1400) vs. Random Environment Set (0 to 100); series: 1, 2, 3, Test Unit]

Test unit covers more area, but less distance. Indicates a slower speed and better energy conservation.


Results-Coverage vs. Prey Sleep Time

[Figure: Coverage (300 to 800) vs. Prey Sleep Time (100 to 1000); series: 1, 2, 3, Test Unit]

Final controller is less affected by variations in prey sleep time. (Sleep time incremented over 100 iterations; results averaged from 100 random environments at each setting.)


Results-Crashes vs. Hearing Range

[Figure: Crashes (0 to 14) vs. Hearing Range (20 to 200); series: 1, 2, 3, Test Unit]

Final controller performs poorly in relation to crashes as the hearing range is increased. (Hearing range incremented over 100 iterations; results averaged from 100 random environments at each setting.)


Results-Crashes vs. Noise Bias

[Figure: Crashes (0 to 120) vs. Noise Bias (0 to 100); series: 1, 2, 3, Test Unit]

Final controller performs adequately in the presence of noise. (Noise incremented over 100 iterations; results averaged from 100 random environments at each setting.)


Results-Coverage for Robots Trained w/wo Noise

[Figure: Coverage (0 to 800) vs. Noise Bias (0 to 100); series: Test Unit, Noise-Trained]

Controllers trained in the presence of noise are less susceptible to its effects, but do not reach peak performance. (Noise incremented over 100 iterations; results averaged from 100 random environments at each setting.)


Conclusions

• Resulting controller surpassed those produced by humans in the areas of coverage and energy conservation
• All stages of the GA's progression must be taken into account in the fitness function to achieve acceptable results
– Operating point initialization guarantees functioning nodes during early generations, and appears to increase GA performance
– Relative scoring functions appear to provide good selection pressure over the life of the GA