A NEAT Way for Evolving Echo State Networks
TRANSCRIPT
ECAI 2010
A NEAT Way for Evolving Echo State Networks
Kyriakos C. Chatzidimitriou, Pericles A. Mitkas
Intelligent Systems and Software Engineering Labgroup, Electrical and Computer Eng. Dept., Aristotle University of Thessaloniki
Informatics and Telematics Institute, Centre for Research and Technology-Hellas
Thessaloniki, Greece
ECAI 2010, Lisbon A NEAT Way for Evolving ESNs 2
The problem
• Engineer fully autonomous, intelligent agents
• Model as reinforcement learning problems
• Need good function approximators (FAs)
• Plus properties like:
  – Non-linear
  – Non-Markovian
  – Generalization
Adaptive function approximators
• Problems remain:
  – Q: Which FA to choose?
  – A: Something powerful and suitable
  – Q: How to adjust its parameters?
    • For neural nets: number of neurons? topology? weights?
  – A: Adaptive function approximators
• FAs built automatically, ad hoc, per problem/environment
• How? Through the synthesis of learning and evolution
  – Good for autonomy
  – Good for the user
Our proposal
• An adaptive FA methodology
  – Built bottom-up, combining powerful ideas and algorithms from the research literature into a single methodology
  – Each one fills a different gap, developing into something complete
  – Design goal: cover as many aspects as possible
The Ingredients
• 1 ESN (Echo State Network) [H. Jaeger]
• 50% NEAT (NeuroEvolution of Augmenting Topologies) [K. Stanley]
• 50% TD-learning
Basic Echo State Network
• If output units are linear: y(t) = w u(t) + w' x(t)
  – A linear function over (a) linear and (b) non-linear, temporal features
• Reservoir properties:
  – Large number of features
  – Sparse
  – Mean around 0
  – Spectral radius less than 1
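The reservoir properties above can be made concrete in a short sketch. This is an illustrative implementation, not the authors' code; the function names and defaults (`make_reservoir`, `density=0.1`, `spectral_radius=0.95`) are my own assumptions.

```python
import numpy as np

def make_reservoir(n, density=0.1, spectral_radius=0.95, seed=0):
    """Build a sparse reservoir matrix with the properties listed above."""
    rng = np.random.default_rng(seed)
    W = rng.uniform(-1.0, 1.0, size=(n, n))   # weights in [-1, 1], mean ~0
    W *= rng.random((n, n)) < density         # keep only a sparse fraction
    rho = max(abs(np.linalg.eigvals(W)))      # current spectral radius
    if rho > 0:
        W *= spectral_radius / rho            # rescale so the radius is below 1
    return W

def esn_output(W_in, W_res, w, w_prime, u, x):
    """Linear readout y(t) = w u(t) + w' x(t) over input and reservoir state."""
    x_new = np.tanh(W_in @ u + W_res @ x)     # non-linear, temporal features
    return w @ u + w_prime @ x_new, x_new
```

Because the readout is linear in u(t) and x(t), only `w` and `w_prime` need to be trained, which is what makes the linear RL/SL algorithms mentioned later applicable.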
NEAT – Basic Principles
• Start minimally and complexify
• Weight & structural mutation
• Speciation through clustering to protect innovation
• Crossover of networks through historical markings on connections
• Adapt its principles and use it as a meta-search algorithm for ESNs
Evolution and learning
• Combine global and local search
• Evolution helps learning "avoid traps"
• Learning helps evolution "find better locations" nearby
• ESNs allow us to do that easily
  – Linear learning schemes, so one can use all the classic linear RL/SL learning algorithms (TD, LS, LSTD, etc.)
  – Need to adjust the evolution part: adapt NEAT
Initialization
• Start minimally with 1 reservoir neuron
  – XOR problem
• Input weights: random in [-1, 1]
• Output weights: 0
• Reservoir weights:
  – Random in [-1, 1]
  – With a given density
  – Mean(Wres) = 0
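A minimal sketch of this initialization, assuming a simple dict-based genome; the field names and the default density/spectral-radius values are hypothetical, not from the talk.

```python
import numpy as np

def init_genome(n_in, seed=0):
    """Minimal starting genome: a single reservoir neuron (start small, as in NEAT)."""
    rng = np.random.default_rng(seed)
    return {
        "W_in": rng.uniform(-1.0, 1.0, size=(1, n_in)),  # input weights in [-1, 1]
        "W_res": np.zeros((1, 1)),                       # reservoir weights, Mean(Wres) = 0
        "W_out": np.zeros(n_in + 1),                     # output weights start at 0
        "density": 0.1,                                  # assumed default density
        "spectral_radius": 0.9,                          # assumed default radius
    }
```

Starting the readout at zero means the initial network outputs nothing until learning or evolution adjusts it, mirroring NEAT's minimal-start philosophy.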
Mutation – Add node
• A node is added
  – Adds a new feature; the genome grows
  – Historical markings are assigned to the node
  – All its reservoir connections are initially disabled
• They are later enabled through link mutation or crossover
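A sketch of the add-node mutation under the same hypothetical genome layout as above: growing the reservoir matrix by one row and column of zeros is equivalent to adding a node whose connections are all disabled.

```python
import numpy as np

def mutate_add_node(genome, innovation_counter):
    """Add one reservoir neuron; its connections start disabled (all-zero)."""
    n = genome["W_res"].shape[0]
    W = np.zeros((n + 1, n + 1))
    W[:n, :n] = genome["W_res"]          # keep the existing reservoir weights
    genome["W_res"] = W                  # new row/column are zero = disabled links
    genome["node_markings"].append(innovation_counter)  # historical marking
    return innovation_counter + 1
```

The historical marking lets crossover later align this node against nodes in other genomes, as in NEAT.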
More Mutations
• Add/remove connections
• Mutate weights
  – Restart
  – Perturb
• Weights are added/changed so as to keep Mean(Wres) = 0
• Mutate density/spectral radius
  – Restart
  – Perturb
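Restart and perturb can share one sketch, since the slide applies both to weights and to the global density/spectral-radius parameters; the probabilities and noise scale below are illustrative, not the talk's values.

```python
import numpy as np

def mutate_value(v, rng, p_restart=0.1, sigma=0.1, low=-1.0, high=1.0):
    """Restart (redraw uniformly) or perturb (add small Gaussian noise) a
    weight or a global parameter such as density or spectral radius."""
    if rng.random() < p_restart:
        return float(rng.uniform(low, high))                       # restart
    return float(np.clip(v + rng.normal(0.0, sigma), low, high))   # perturb
```

For the spectral radius, `low`/`high` would be set to keep it below 1, preserving the echo state property.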
Crossover
[Diagram: two parent networks with innovation-numbered nodes]
• Let's assume the smallest genome is also the fittest.
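The alignment step shown in the next slides can be sketched as follows. This is a simplified illustration assuming each parent is a dict mapping historical markings to weights; the representation is mine, not the paper's.

```python
def align_genes(parent_a, parent_b):
    """Align two gene sets by their historical (innovation) markings.
    Returns the markings present in both parents and those in only one."""
    markings = sorted(set(parent_a) | set(parent_b))
    matched  = [m for m in markings if m in parent_a and m in parent_b]
    disjoint = [m for m in markings if (m in parent_a) != (m in parent_b)]
    return matched, disjoint
```

Matched genes can be inherited from either parent, while disjoint/excess genes typically come from the fittest parent, as in NEAT.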
Alignment
[Diagram: the two parents' genes aligned by historical markings]
Fittest
[Diagram: the genes of the fittest parent]
Alignment
[Diagram: the two parents' genes aligned by historical markings]
Largest
[Diagram: the genes of the largest parent]
Speciation
• ESNs are supposed to be sparse
• Structural similarity on connections, as in NEAT, would eliminate the notion of speciation
• Instead, similarity on the basic macroscopic ESN properties:
  – spectral radius, density, number of nodes
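A compatibility distance over these macroscopic properties might look like the sketch below; the coefficients `c1`–`c3` are illustrative placeholders, not values from the paper.

```python
def esn_distance(g1, g2, c1=1.0, c2=1.0, c3=0.1):
    """Compatibility distance on macroscopic ESN properties instead of
    per-connection structure. Genomes are dicts with the three properties."""
    return (c1 * abs(g1["spectral_radius"] - g2["spectral_radius"])
            + c2 * abs(g1["density"] - g2["density"])
            + c3 * abs(g1["n_nodes"] - g2["n_nodes"]))
```

Two genomes whose distance falls below a threshold would be clustered into the same species, protecting structural innovations just as NEAT's distance does.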
Learning
• Use simple gradient-descent (GD) TD-learning for RL
• Use least squares for time series; online updating is not required there
• Tons of methods can be used here (either TD-learning, or policy search using EC)
• Also tested Darwinian versus Lamarckian evolution
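Both learning modes are standard and fit the linear readout directly. A minimal sketch of a TD(0) readout update and a batch least-squares fit, with illustrative step sizes:

```python
import numpy as np

def td_update(w, phi, reward, phi_next, alpha=0.01, gamma=0.99):
    """One gradient-descent TD(0) step on the linear readout weights,
    where phi is the feature vector (input plus reservoir state)."""
    delta = reward + gamma * (w @ phi_next) - (w @ phi)  # TD error
    return w + alpha * delta * phi

def fit_readout(states, targets):
    """Batch least-squares readout for time series (no online updating)."""
    return np.linalg.lstsq(states, targets, rcond=None)[0]
```

Because only the readout weights change, the learned weights can either be written back into the genome (Lamarckian) or discarded after fitness evaluation (Darwinian).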
Basic Flow
Init Pop → Simulation + Learning → Fitness → Speciation → Selection → Mutation → Crossover → Next Gen (loop)
Champion → Generalization Performance
Experiments
• Reinforcement learning
  – Mountain Car
  – Single & double pole balancing
• Time series
  – Mackey-Glass
Time Series – Mackey-Glass
• Better prediction errors than another recent TWEANN-on-ESN algorithm
  – One order of magnitude, for both test and generalization errors
• Main differences: we start minimally, and we perform crossover and speciation
Mountain Car
• Same behavior as the NEAT+Q algorithm
  – NEAT+Q = "NEAT" + "Q-learning through back-propagation" – "recurrences"
• Same generalization behavior (around -50)
• Our approach also solves the non-Markovian 2D and 3D mountain car problems with learning (only position, not a speed signal, is available)
Pole Balancing
• Performance comparable to NEAT with respect to the number of networks evaluated
• Our approach takes more time due to the learning procedure
• A first warning bell: accommodate a more advanced learning algorithm than simple GD
Vs.
• Simple ESN
  – Problems probably due to online learning (online learning, RL, and NNs are not a good triplet)
  – Not a problem for our approach, since neuroevolution finds ESNs "that are better able to learn"
• Linear TD-learning (no reservoir)
  – No reservoir means no non-Markovian signals; worse behavior
• No learning, only evolution
  – No clear conclusions, but versus simple GD TD-learning (a second warning bell)
Experiments
• Reinforcement learning
  – Mountain Car
  – Single & double pole balancing
  – 3D Mountain Car
  – Server Job Scheduling [+++]: 15% improvement over NEAT+Q
• Time series
  – Mackey-Glass [+]
  – MSO [-]
  – Lorentz Attractor [+]
Future Work
• Even more automation, driven by the problem at hand
  – For example, adapting operator probabilities online
• Test new RL-TD learning techniques, e.g. iLSTD, GQ
• More difficult test-beds
  – TAC Ad Auctions
  – Poker
  – Real-time strategy games
ECAI 2010
Thank you for your attention
Questions?
Kyriakos Chatzidimitriou ([email protected])
http://issel.ee.auth.gr