innorobo 2016 keynote - making robots dream to face open environments

Making robots dream to face open environments

Stéphane Doncieux

What a machine can do

YASKAWA BUSHIDO PROJECT / industrial robot vs sword master

Deep Blue vs G. Kasparov 1997

Motion Problem resolution

Doncieux, S. (to appear) Creativity: A Driver for Research on Robotics in Open Environments, Intellectica

Performance

Context

Robot A

Robot B

??? ???

Known

Unknown Unknown

How can a robot face a new environment?

1. Robustness

2. Learning

3. Development

Manual development

Autonomous development

1. Robustness

Manual development


Learning

2. Learning

Reward

High

Low A

Learning the action to apply in a state to maximize reward.

Sutton, R. S., & Barto, A. G. (1998). Reinforcement learning: An introduction. Cambridge: MIT press.
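As an illustration of learning which action to apply in each state to maximize reward, here is a minimal tabular Q-learning sketch. The corridor environment, the function names (`q_learning`, `greedy_policy`) and all parameter values are assumptions made for this example; they do not come from the talk.

```python
import random


def q_learning(n_states=5, n_actions=2, episodes=500,
               alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning on a hypothetical 1-D corridor.

    Action 1 moves right, action 0 moves left. Reaching the last
    state gives reward 1 and ends the episode.
    """
    rng = random.Random(seed)
    q = [[0.0] * n_actions for _ in range(n_states)]
    goal = n_states - 1
    for _ in range(episodes):
        state = 0
        for _ in range(100):  # cap on episode length
            if rng.random() < epsilon:
                # Explore: random action.
                action = rng.randrange(n_actions)
            else:
                # Exploit: greedy action, ties broken at random.
                best = max(q[state])
                action = rng.choice(
                    [a for a in range(n_actions) if q[state][a] == best])
            if action == 1:
                next_state = min(state + 1, goal)
            else:
                next_state = max(state - 1, 0)
            done = next_state == goal
            reward = 1.0 if done else 0.0
            # Q-learning update toward reward + discounted best next value.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
            if done:
                break
    return q


def greedy_policy(q):
    """Return the highest-valued action for each state."""
    return [max(range(len(row)), key=row.__getitem__) for row in q]
```

Calling `greedy_policy(q_learning())` gives one action per state; in this toy corridor the learned policy should move right in every non-goal state. Note that the table is indexed by discrete states and actions, which is exactly what makes the choice of representation matter.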

2. Learning: continuous actions & states

Evaluation

Genotype

Fitness

Random generation

Selection

Variation

8.300110100111

Termination

Initial conditions

Evaluation

Genotype

Phenotype

Behavior

Environment

Fitness

Evolutionary Robotics

Mouret, J.-B., Bredeche, N., & Doncieux, S. (2015). La robotique évolutionniste. Pour la Science, n°87, avril-juin 2015.

Doncieux, S., Bredeche, N., Mouret, J.-B., & Eiben, A. E. (2015). Evolutionary Robotics: What, Why, and Where to. Frontiers in Robotics and AI. doi:10.3389/frobt.2015.00004
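The loop named on the slides (random generation, evaluation, selection, variation, termination) can be sketched as follows. This is a minimal example under stated assumptions: the bit-string genotype, the OneMax fitness, truncation selection and bit-flip mutation are stand-ins chosen for illustration. In evolutionary robotics the evaluation step would instead turn the genotype into a phenotype (for example a controller), run it in an environment, and score the resulting behavior.

```python
import random


def evaluate(genotype):
    """Hypothetical fitness: number of 1 bits (OneMax).

    Stand-in for running a robot or a simulation and scoring
    the observed behavior.
    """
    return sum(genotype)


def evolve(genome_length=12, pop_size=20, generations=200,
           mutation_rate=0.05, seed=0):
    rng = random.Random(seed)
    # Random generation of the initial population.
    population = [[rng.randint(0, 1) for _ in range(genome_length)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        # Evaluation: rank genotypes by fitness, best first.
        scored = sorted(population, key=evaluate, reverse=True)
        # Termination: stop once the maximum fitness is reached.
        if evaluate(scored[0]) == genome_length:
            break
        # Selection: keep the best half (truncation selection).
        parents = scored[:pop_size // 2]
        # Variation: children are mutated copies of random parents.
        children = [
            [1 - gene if rng.random() < mutation_rate else gene
             for gene in rng.choice(parents)]
            for _ in range(pop_size - len(parents))
        ]
        population = parents + children
    best = max(population, key=evaluate)
    return best, evaluate(best)
```

`evolve()` returns the best genotype found and its fitness. Because every genotype must be evaluated at each generation, the cost of this loop is dominated by the evaluation step, which is the reason the approach is slow when each evaluation is a robot experiment.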


Kober, J., Bagnell, J. A., & Peters, J. (2013). Reinforcement learning in robotics: A survey. The International Journal of Robotics Research, 32(11), 1238–1274. doi:10.1177/0278364913495721

2. Learning: The representation is critical!

???

• Reinforcement Learning: fast, but requires an efficient representation

• Evolutionary Robotics: low-level representation, but slow…

3. Development

Weng, J. (2004). Developmental robotics: Theory and experiments. International Journal of Humanoid Robotics, 1(2), 199–236.

Manual development


3. Development: Insights from psychology

The importance of redescribing knowledge representations

« A specifically human way to gain knowledge is for the mind to exploit internally the information that it has already stored (both innate and acquired), by redescribing its representations or, more precisely, by iteratively re-representing in different representational formats what its internal representations represent » [Karmiloff-Smith 1996]

When to restructure and consolidate knowledge?

« Sleep consolidates recent memories and, concomitantly, could allow insight by changing their representational structure. » [Wagner, 2004]

Kick-off meeting DREAM, Paris, 26/01/2015

Deferred Restructuring of Experience in Autonomous Machines

H2020 FET Proactive « Knowing, doing, being » 01/2015-12/2018

http://www.robotsthatdream.eu/ https://twitter.com/robotsthatdream

3. Development: Changing representations

Daytime experience

(large batch)

Daytime

Consolidated knowledge:
- task-relevant features
- task contexts
- abstract knowledge
- new motivations

No initial policy

No single task

Motivations:
- curiosity
- satisfying humans
- global mission

Behavior exploration

Knowledge improvement

Knowledge adaptation

Small batch

Skill Knowledge validation

Sequence of learning episodes driven by motivations

New situation:
- no reprogramming
- fast adaptation

Knowledge sharing between robots:
- better generalization
- faster learning

Nighttime

Dream

Collective scale

Individual scale

Knowledge restructuring

Transfer from STM to LTM

Learning 10 to 100 times faster

Generates examples of behaviours

Discrete actions and sensors to consider

Passive analysis

Representation redescription

2

1

Learning

Direct policy search (neuroevolution)

Task-agnostic representations

Slow learning

Limited generalization

3

Learning

Discrete reinforcement learning

Task-specific representations

Fast learning

Good generalization
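To make the middle step more concrete, here is one possible reading of "passive analysis" of behaviour examples, written as a hedged sketch: sensor dimensions that never vary in the recorded examples are dropped, and the remaining ones are binned into a few discrete levels that a discrete reinforcement learner could then use as states. The binning heuristic, the function name `redescribe` and its parameters are assumptions made for illustration only; they are not the method used in the DREAM project.

```python
def redescribe(trajectories, n_bins=3, min_range=1e-6):
    """Build a discrete state description from continuous examples.

    trajectories: list of trajectories, each a list of observations
    (tuples of floats) recorded while behaviours were executed.
    Returns (relevant_dims, discretize), where relevant_dims lists the
    sensor dimensions that vary across the examples and discretize maps
    one continuous observation to a tuple of bin indices.
    """
    observations = [obs for traj in trajectories for obs in traj]
    n_dims = len(observations[0])
    lows = [min(o[d] for o in observations) for d in range(n_dims)]
    highs = [max(o[d] for o in observations) for d in range(n_dims)]
    # Hypothetical relevance test: a sensor that never changes in the
    # examples is treated as irrelevant and ignored.
    relevant = [d for d in range(n_dims) if highs[d] - lows[d] > min_range]

    def discretize(obs):
        state = []
        for d in relevant:
            # Position of the value within the observed range.
            frac = (obs[d] - lows[d]) / (highs[d] - lows[d])
            # Clamp so out-of-range values fall in the first or last bin.
            state.append(min(n_bins - 1, max(0, int(frac * n_bins))))
        return tuple(state)

    return relevant, discretize
```

With such a mapping, a learner that works on discrete states (such as tabular reinforcement learning) only has to explore a small table rather than the full continuous sensor space, which is one way to picture why a task-specific representation speeds up learning.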

Development: bootstrapping simple manipulation skills

1. Day 1: sensori-motor babbling

2. «Night»: learning to manipulate an object in simulation

3. Day 2: back to reality

Thank you! Questions?

[email protected] https://twitter.com/SDoncieux http://people.isir.upmc.fr/doncieux
