Learning Anticipatory Motor Control

D. Bailly, P. Andry, P. Gaussier, A. Rengerve

ETIS UMR CNRS 8051, ENSEA - Université Cergy-Pontoise

[email protected]

GDR Robotique et Neuroscience, September 2012

ANR joint project INTERACT:

Understanding the link between low-level physical compliance and higher-level behaviors : human-robot (HR) cooperation

Development of a NN Model for motor Control : Cerebellum

Pluri-disciplinary : Understanding the process of decision making from motor theory :

“Decision making is based on the anticipation of the consequence of motor actions”

- disclaimer : this work is not mature (at all) -

Context

LISV-BIA, UVSQ : Hydraulic Devices

ETIS, UCP ENSEA : Neural Network Models

URECA, Lille III : Psychology

Context

Issues

“Decision making is based on the anticipation of the consequence of motor actions”

= circuits and action processing may differ according to the intention, the goal, the expected reward (i.e. the context)

Motor theory of social cognition : mirror neurons (MNs) [Rizzolatti, Gallese, Decety, Wolpert...]

Hypothesis : if the processing differs, then motor trajectories may differ.

Can we read the “social motivation” from motor trajectories ?

Issues

[Jacob & Jeannerod 2005] : NO

the motor trajectory is the trace of the motor intention.

we can’t read the agent’s social motivation

(Dr Jekyll and Mr Hyde thought experiment).

Issues BUT... [Becchio et al. 2008, Becchio et al. 2010, Ferri 2011] :

The “reach-to-grasp” has no direct social outcome, but is also affected (similar effect as the “place” action)

Kinematic measures : trajectory length, trajectory height, wrist max amplitude, time to peak velocity

Overview (ongoing work)

• Our NN model for motor control : cortical areas, hippocampus, cerebellum, striatum

• Ongoing experiment (simulation) : can our model account for Becchio’s results ?

NN model for motor control

References :

• seminal and functional : VITE model [Bullock] : “via points”

• developmental : [Gaussier, Andry, Quoy, Giovannangeli, Oudeyer] (imitation, navigation, intrinsic motivations)

• Neuro-anatomy : [Grossberg, Shadmehr & Krakauer, Doya, Guenther, Arleo]

• Robotics : HRI [Billard, Fukuyori]

NN model for motor control

Sensori-motor categories

• associative learning between different modalities (proprioception-vision)

• building visuo-motor “attractors” to reach parts of the workspace [Fukuyori SAB08]

• motor babbling : learning is self-supervised, with an ART-like NN [Grossberg] and a vigilance parameter

• emulates muscle control : stretch and force = position control by an activation signal
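A minimal sketch of how motor babbling with a vigilance threshold could recruit visuo-motor categories. It is an illustration under stated assumptions, not the authors' network: the planar 2-joint arm (`forward_kinematics`), the Gaussian activity, `sigma` and the class name `VisuoMotorCategories` are all hypothetical.

```python
import math
import random


def forward_kinematics(q, l1=0.3, l2=0.25):
    """Hand position of a hypothetical planar 2-joint arm (stands in for vision)."""
    x = l1 * math.cos(q[0]) + l2 * math.cos(q[0] + q[1])
    y = l1 * math.sin(q[0]) + l2 * math.sin(q[0] + q[1])
    return (x, y)


class VisuoMotorCategories:
    """ART-like recruitment: a new category is stored only when no existing
    prototype responds above the vigilance threshold."""

    def __init__(self, vigilance, sigma=0.2):
        self.vigilance = vigilance  # recognition threshold, in (0, 1)
        self.sigma = sigma          # assumed width of the Gaussian activity
        self.proprio = []           # joint-angle prototypes
        self.vision = []            # hand position associated with each prototype

    def activity(self, q, proto):
        d2 = sum((a - b) ** 2 for a, b in zip(q, proto))
        return math.exp(-d2 / (2.0 * self.sigma ** 2))

    def learn(self, q, seen):
        """Return the index of the recognized or newly recruited category."""
        if self.proprio:
            acts = [self.activity(q, p) for p in self.proprio]
            best = max(range(len(acts)), key=acts.__getitem__)
            if acts[best] >= self.vigilance:
                return best  # recognized: nothing new is stored
        self.proprio.append(tuple(q))
        self.vision.append(tuple(seen))
        return len(self.proprio) - 1

    def attractor_for(self, target):
        """Joint configuration whose associated hand position is closest to the target."""
        best = min(range(len(self.vision)),
                   key=lambda i: math.dist(self.vision[i], target))
        return self.proprio[best]


def babble(vigilance, n_samples=500, seed=0):
    """Self-supervised learning from random joint configurations."""
    rng = random.Random(seed)
    net = VisuoMotorCategories(vigilance)
    for _ in range(n_samples):
        q = (rng.uniform(0.0, math.pi / 2), rng.uniform(0.0, math.pi / 2))
        net.learn(q, forward_kinematics(q))
    return net


if __name__ == "__main__":
    for v in (0.5, 0.9):
        print("vigilance", v, "->", len(babble(v).proprio), "categories")
```

In this sketch a higher vigilance yields more, narrower categories, which is the property the later slides rely on.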

NN model for motor control

Sensori-motor categories

• Reaching a visual target

• not accurate, but functional

• basis for immediate imitation [Rengerve, Andry]

NN model for motor control

Transitions

• Learning a sequence of attractors

• hippocampus [Grossberg - Banquet - Gaussier]

• Timing and step by step prediction of the future attractor

• building elementary trajectories
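A minimal sketch of learning a timed sequence of attractors and replaying it step by step. The one-to-one transition table and the function names `learn_transitions` and `replay` are assumptions made for illustration; the slides do not specify the hippocampal model at this level of detail.

```python
def learn_transitions(demo):
    """demo: list of (time_s, category) events in order of recognition.
    Returns {category: (next_category, delay_s)}."""
    transitions = {}
    for (t0, c0), (t1, c1) in zip(demo, demo[1:]):
        transitions[c0] = (c1, t1 - t0)
    return transitions


def replay(transitions, start, max_steps=20):
    """Predict the next attractor and its timing, one step at a time."""
    t, state = 0.0, start
    out = [(t, state)]
    for _ in range(max_steps):
        if state not in transitions:
            break  # end of the learned sequence
        state, delay = transitions[state]
        t += delay
        out.append((t, state))
    return out


if __name__ == "__main__":
    demo = [(0.0, "rest"), (0.5, "via"), (1.25, "target")]
    print(replay(learn_transitions(demo), "rest"))
```

A table keyed on the current category alone cannot represent sequences that revisit a category with different successors; that limitation belongs to this sketch only.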

NN model for motor control

Cerebellum

• many projections [Doya : a simulator and internal models]

• reconstructs signals from one modality to another

• runs 20 times faster than the cortico-hippocampal loop

• smooth trajectories

• predicts at t+1 the proprioceptive (O) information

• granular cells :

• distal links : proprioception (O) - US -

• proximal links (mossy fibers) :

• output : proprioception at t+1
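A minimal sketch of a unit that learns to predict proprioception at t+1, with the observed proprioception acting as the teaching (US) signal. The LMS (Widrow-Hoff) rule, the 1-joint `plant` and the learning rate are assumptions chosen for illustration; the slides do not state which learning rule the model uses.

```python
import random


def plant(theta, target, gain=0.3):
    """Hypothetical 1-joint dynamics, used only to generate training data."""
    return theta + gain * (target - theta)


class NextStatePredictor:
    """Linear unit trained with the LMS rule to output proprioception at t+1."""

    def __init__(self, n_inputs, rate=0.1):
        self.w = [0.0] * n_inputs
        self.rate = rate

    def predict(self, x):
        return sum(wi * xi for wi, xi in zip(self.w, x))

    def train(self, x, observed_next):
        # The observed proprioception at t+1 plays the role of the US.
        error = observed_next - self.predict(x)
        self.w = [wi + self.rate * error * xi for wi, xi in zip(self.w, x)]
        return error


def train_predictor(n_steps=5000, seed=0):
    rng = random.Random(seed)
    model = NextStatePredictor(n_inputs=2)
    errors = []
    for _ in range(n_steps):
        theta = rng.uniform(-1.0, 1.0)   # current proprioception
        target = rng.uniform(-1.0, 1.0)  # current attractor
        errors.append(abs(model.train((theta, target), plant(theta, target))))
    return model, errors


if __name__ == "__main__":
    model, errors = train_predictor()
    print("weights:", model.w)
    print("mean error, first 100 steps:", sum(errors[:100]) / 100)
    print("mean error, last 100 steps:", sum(errors[-100:]) / 100)
```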

Overview (ongoing work)

• Our NN model for motor control : cortical areas, hippocampus, cerebellum, striatum

• Ongoing experiment (simulation) : toward Becchio’s experiments -> global modulation

Becchio’s exp.

How do we go higher (or lower ?)

the change in the kinematics is global

linked to an initial recognition of the task’s context

changing speed or neural time does not affect the trajectory

Hypothesis :

• the confidence in the task modulates the vigilance level = recognition level of the attractors

• low vigilance induces early recognition of the visuo-motor state

• early recognition induces a lower trajectory
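A minimal sketch of the hypothesis: with a lower vigilance, an intermediate attractor is recognized farther away, the system switches to the next attractor earlier, and the peak of the trajectory is lower. The Gaussian recognition, the constant-speed motion, `sigma` and the via-point coordinates are illustrative assumptions.

```python
import math


def recognition_radius(vigilance, sigma=0.15):
    """Distance at which a Gaussian unit's activity equals the vigilance (0 < vigilance < 1)."""
    return sigma * math.sqrt(-2.0 * math.log(vigilance))


def run_trajectory(vigilance, attractors, start=(0.0, 0.0), step=0.005, max_steps=10000):
    """Move at constant speed toward each attractor in turn.
    Intermediate attractors are left as soon as they are recognized."""
    radius = max(recognition_radius(vigilance), step)
    x, y = start
    path = [(x, y)]
    for i, (ax, ay) in enumerate(attractors):
        is_last = i == len(attractors) - 1
        threshold = step if is_last else radius
        for _ in range(max_steps):
            dx, dy = ax - x, ay - y
            dist = math.hypot(dx, dy)
            if dist <= threshold:
                break  # attractor recognized (or final target reached)
            x += step * dx / dist
            y += step * dy / dist
            path.append((x, y))
    return path


def peak_height(path):
    return max(y for _, y in path)


if __name__ == "__main__":
    via_points = [(0.5, 0.3), (1.0, 0.0)]  # one raised via point, then the target
    for v in (0.5, 0.99):
        print("vigilance", v, "-> peak height", round(peak_height(run_trajectory(v, via_points)), 3))
```

Here the lower vigilance also shortens the path, since the corner at the via point is cut earlier.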

Becchio’s exp.

Striatum

• evaluates the situation, estimates the reward, and changes the vigilance accordingly

• preliminary results (simulation)

Becchio’s exp.

With a developmentally plausible explanation :

during the first learning trials : the system follows and accurately learns the example

vigilance is high, the “basin of recognition” of the visuo-motor state is small, the trajectory is accurate

during the life of the agent :

if trials with a lower vigilance earn a reward (big objects, no obstacles, etc.), then a trajectory with a lower path should be learned : less energy is needed to obtain the same reward

• consistent with the experiment :

• individual condition :

• only object, easy, standard moves

= low vigilance, low trajectory

• social condition

• adds a social constraint

= higher vigilance, high trajectory
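A minimal sketch of reward-driven vigilance adaptation consistent with the explanation above: vigilance drifts down while trials stay rewarded and is pushed back up after a failure. The update rule, the step sizes and the two `tolerance` values standing for the individual and social conditions are assumptions made for illustration, not results from the model.

```python
import math


def recognition_radius(vigilance, sigma=0.15):
    """Distance at which a Gaussian unit's activity equals the vigilance (0 < vigilance < 1)."""
    return sigma * math.sqrt(-2.0 * math.log(vigilance))


def adapt_vigilance(tolerance, vigilance=0.99, n_trials=200, down=0.01, up=0.05):
    """Lower the vigilance after rewarded trials, raise it after failures.
    A trial is rewarded when the recognition radius fits the task tolerance."""
    for _ in range(n_trials):
        rewarded = recognition_radius(vigilance) <= tolerance
        if rewarded:
            vigilance = max(0.05, vigilance - down)  # same reward, less effort
        else:
            vigilance = min(0.99, vigilance + up)    # task failed: be more precise
    return vigilance


if __name__ == "__main__":
    # Hypothetical tolerances: a loose task (individual) and a tight one (social).
    print("individual condition:", round(adapt_vigilance(tolerance=0.15), 2))
    print("social condition:", round(adapt_vigilance(tolerance=0.03), 2))
```

Under these assumptions the loose task settles at a lower vigilance than the tight one, matching the low versus high trajectory contrast in the list above.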

Future works

Striatum : could initiate an exploration strategy

• evaluates the situation and changes the vigilance accordingly

• estimates the reward according to the visual distance (visuo-motor states in the visual space)

Conclusion

• NN model for motor control

• cortex - hippocampus - cerebellum - striatum

• Interesting solution for Becchio’s experiments, developmentally plausible (hyp)

• Next : robotic validation (electric - hydraulic)

• hippocampus-cerebellum link is not very plausible

• far from being optimal (cerebellum ?)

• striatum-PFC
