ON NEURAL CORRELATES OF REINFORCEMENT LEARNING
the role of dopamine in planning and action
Genela Morris, Dept. of Neurobiology, Haifa University
Suggested reading
• Dayan P and Abbott LF. Theoretical Neuroscience: Computational and Mathematical Modeling of Neural Systems. MIT Press, Cambridge MA (2001): Ch. 9
• Sutton RS and Barto AG. Reinforcement Learning: An Introduction. MIT Press, Cambridge MA (1998): Ch. 3, Ch. 6 + some of Ch. 2
• Schultz W, Dayan P, Montague PR (1997), A neural substrate of prediction and reward, Science 275: 1593-1599
• Figures from research papers are referenced throughout the presentation
June 12 Haifa University 3
Reinforcement learning: the basics
Supervised learning – all knowing teacher, detailed feedback
Reinforcement learning – scalar (correct/incorrect) feedback
Unsupervised learning – self organization
Reinforcement learning: The law of effect
“The Law of Effect is that: Of several responses made to the same situation, those which are accompanied or closely followed by satisfaction to the animal will, other things being equal, be more firmly connected with the situation, so that, when it recurs, they will be more likely to recur”
Edward Lee Thorndike (1911)
Early attempts at modeling
• By associative rules
• Classical conditioning
Properties of classical conditioning
(Pavlov 1927)
• Acquisition.
• Partial Reinforcement (probabilistic).
• Generalization.
• Interstimulus Interval (ISI) effects.
• Intertrial Interval (ITI) effects.
So far…
• A simple association (coincidence, Hebbian) model can explain these phenomena:
– Acquisition.
– Partial Reinforcement (probabilistic).
– Generalization.
– Interstimulus Interval (ISI) effects.
– Intertrial Interval (ITI) effects.
[Figure labels: CS, US, UR, CR]
• But…
Classical conditioning
The Elements:
US: Unconditioned stimulus
UR: Unconditioned response
NS: Neutral stimulus
CS: Conditioned stimulus
CS1: Conditioned stimulus 1
CS2: Conditioned stimulus 2
CR: Conditioned response
Properties of classical conditioning (cont’d)
• Conditioned inhibition
• Latent inhibition
• Relative validity (Wagner 1968).
• Blocking (Kamin 1968)
• …
CS must RELIABLY predict US
What simple association can’t explain
Learning occurs not because two events co-occur, but because that co-occurrence is otherwise UNPREDICTED
Rescorla-Wagner rule (1972)
Learning to predict reward R given stimulus U=1
Goal: Form a prediction V of the reward of the form:
V=ωU
And learn to change ω :
Δ ω =ε(R-V)U
After learning of consistent pairing: ω=R
Where:
U = CS availability (0, 1)
V = reward prediction
R = reward availability (0, 1)
ω = weight of the connection between U and V
ε = learning rate
R-V = prediction error
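The update rule above can be sketched in a few lines of Python. This is a minimal illustration rather than code from the lecture: the function name `rescorla_wagner`, the learning rate of 0.1 and the number of trials are all assumptions chosen for the example. It shows the weight rising from 0 towards R under consistent pairing.

```python
def rescorla_wagner(rewards, stimuli, epsilon=0.1, w0=0.0):
    """Return the weight after each trial under the rule Δω = ε (R − V) U, with V = ω U."""
    w = w0
    history = []
    for r, u in zip(rewards, stimuli):
        v = w * u                    # prediction V = ωU
        w += epsilon * (r - v) * u   # update driven by the prediction error R − V
        history.append(w)
    return history

# Consistent pairing: the CS is present (U = 1) and rewarded (R = 1) on every trial.
n_trials = 200
weights = rescorla_wagner([1.0] * n_trials, [1.0] * n_trials)
print(round(weights[0], 3), round(weights[-1], 3))
```

With these illustrative settings the first trial moves ω by ε, and later trials move it less and less as the prediction error shrinks.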
Blocking with Rescorla Wagner
• Given U1, U2 and R, after U1 has been learnt:
• ω1=R, while ω2 is still 0
• V= ω1U1+ ω2U2 = R when both stimuli are present
• Prediction error: R-V=0
So Δω2 = ε(R-V)U2 = 0, and no learning occurs for ω2
R 0
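The blocking argument can be checked numerically with a two-stimulus version of the Rescorla-Wagner rule. The sketch below is illustrative only: `rw_compound`, the learning rate and the trial counts are assumptions, and the control group (compound training without pre-training on U1) is added for comparison.

```python
def rw_compound(trials, epsilon=0.2, n_stimuli=2):
    """Rescorla-Wagner with several stimuli: V = Σ ωi Ui, Δωi = ε (R − V) Ui."""
    w = [0.0] * n_stimuli
    for u, r in trials:
        v = sum(wi * ui for wi, ui in zip(w, u))   # summed prediction
        delta = r - v                              # shared prediction error
        w = [wi + epsilon * delta * ui for wi, ui in zip(w, u)]
    return w

phase1 = [((1, 0), 1.0)] * 100   # U1 alone, rewarded
phase2 = [((1, 1), 1.0)] * 100   # U1 and U2 together, rewarded

blocked = rw_compound(phase1 + phase2)   # U1 pre-trained, then compound
control = rw_compound(phase2)            # compound only

print("blocked ω2:", blocked[1])
print("control ω2:", control[1])
```

In the pre-trained case ω2 stays near zero because the error is already gone when U2 appears. In the control case the two stimuli share the prediction.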
Critical problems, for control
1. Exploration/exploitation
Solutions, for control
1. Variability in response policy
1. Greedy Random (gambling)
2. Based on expected return
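Two common ways of introducing variability into a response policy are sketched below. These are standard textbook rules, not necessarily the ones shown on the slide: ε-greedy mixes the greedy choice with random choices, and softmax makes choice probability depend on expected return. The function names, the value estimates and the parameters `epsilon` and `beta` are illustrative assumptions.

```python
import math
import random

def epsilon_greedy(values, epsilon, rng):
    """With probability epsilon pick a random action, otherwise the best-valued one."""
    if rng.random() < epsilon:
        return rng.randrange(len(values))
    return max(range(len(values)), key=lambda a: values[a])

def softmax_probs(values, beta):
    """Choice probabilities that grow with expected return; beta sets how greedy they are."""
    top = max(values)                                    # subtract the max for numerical stability
    exps = [math.exp(beta * (v - top)) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

rng = random.Random(0)
values = [0.2, 0.8]   # hypothetical expected returns for two actions
choices = [epsilon_greedy(values, 0.1, rng) for _ in range(10000)]
frac_best = choices.count(1) / len(choices)
probs = softmax_probs(values, beta=3.0)
print(frac_best, probs)
```

A larger `epsilon` or a smaller `beta` gives more exploration. Setting `epsilon` to 0 or `beta` very high approaches pure exploitation.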
Decision behaviour, theory and practice
[Figure: fraction of choices to the right, Cright/(Cright+Cleft), against fraction of rewards on the right, Rright/(Rright+Rleft); both axes run from 0 to 1. Labels: maximizing, probability-matching, monkeys?]
R = reward, C = choice
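To see why the two strategies in the figure differ in payoff, here is a small worked example. It assumes a simplified task that is not taken from the slides: on each trial exactly one side is rewarded, the right side with probability p. The function name `expected_reward` is hypothetical.

```python
def expected_reward(p_choose_right, p_right_rewarded):
    """Expected reward per trial, assuming exactly one side is rewarded on each trial."""
    return (p_choose_right * p_right_rewarded
            + (1 - p_choose_right) * (1 - p_right_rewarded))

p = 0.75
matching = expected_reward(p, p)        # choose right as often as right is rewarded
maximizing = expected_reward(1.0, p)    # always choose the better side
print(matching, maximizing)             # 0.625 0.75
```

Under this assumption matching earns p² + (1-p)², which is below max(p, 1-p) whenever p is not 0, 0.5 or 1, so matching is suboptimal in such a task.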
Monkeys’ decisions: probability matching
… whether optimal or not
• Actions are related to their consequences
Critical problems in reinforcement learning (and in Rescorla-Wagner)
2. Temporal credit assignment
[Diagram: state 1 → state 2 → … → state N → R]
TD learning - solution for temporal credit assignment
1. Estimate value of current state: (discounted) sum of expected rewards, $V_t = r_t + \gamma r_{t+1} + \dots$
2. Measure ‘truer’ value of current state: reward at present state + estimated value of next state, $r_t + \gamma V_{t+1}$
3. TD error: $\delta_t = r_t + \gamma V_{t+1} - V_t$
4. Use TD error to improve 1: $V_t^{k+1} = V_t^k + \eta \delta_t$
where: $V_t^k$ = value of the state reached at time t in iteration k; $r_t$ = reward given at time t; η = learning rate; δ = prediction error; γ = discount factor
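The four steps can be put together in a short simulation. This is a sketch under stated assumptions, not the lecture's own code: a hypothetical chain of five states visited in order, a single reward of 1 in the last state, and illustrative values for γ and η. The name `td0_chain` is made up for the example.

```python
def td0_chain(n_states=5, reward=1.0, gamma=0.9, eta=0.1, n_episodes=2000):
    """TD learning of state values on a chain: state 0 -> 1 -> ... -> n-1, then the episode ends."""
    V = [0.0] * (n_states + 1)   # the extra last entry is the terminal state, fixed at 0
    for _ in range(n_episodes):
        for s in range(n_states):
            r = reward if s == n_states - 1 else 0.0   # reward only in the last state
            delta = r + gamma * V[s + 1] - V[s]        # step 3: TD error
            V[s] += eta * delta                        # step 4: improve the estimate
    return V[:n_states]

values = td0_chain()
print([round(v, 3) for v in values])
```

The learned values fall off by a factor of γ per step away from the reward, which is how credit for the final reward reaches the earlier states.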
TD error: $\delta_t = r_t + \gamma V_{t+1} - V_t$
[Figure: horizontal axis is time]
TD error: $\delta_t = r_t + \gamma V_{t+1} - V_t$
[Figure: horizontal axis is time]
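How the TD error is distributed over time within a trial, before and after learning, can be sketched as follows. The setup is an assumption made for illustration: a cue is followed by a fixed number of steps and then a reward of 1, and the prediction just before the cue is held at 0 because the cue arrives unpredictably. The name `td_error_profile` and all parameter values are hypothetical.

```python
def td_error_profile(n_steps=5, gamma=1.0, eta=0.2, n_episodes=500):
    """Return the within-trial TD errors for every episode (cue onset first, reward time last)."""
    V = [0.0] * (n_steps + 1)   # V[0..n_steps-1]: states from cue to reward; last entry: terminal, 0
    profiles = []
    for _ in range(n_episodes):
        # Cue onset: no reward, previous prediction assumed to be 0.
        deltas = [0.0 + gamma * V[0] - 0.0]
        for s in range(n_steps):
            r = 1.0 if s == n_steps - 1 else 0.0
            delta = r + gamma * V[s + 1] - V[s]
            V[s] += eta * delta
            deltas.append(delta)
        profiles.append(deltas)
    return profiles

profiles = td_error_profile()
first, last = profiles[0], profiles[-1]
print([round(d, 2) for d in first])
print([round(d, 2) for d in last])
```

In the first episode the only error is at the time of reward. After learning the error at reward time is close to 0 and the error at cue onset is close to 1.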
Berlin 2004 23
Basal ganglia - anatomy
11-Jun-12 medical neurosciences
Intracranial self stimulation
Activates reward circuits
The midbrain dopamine system
[Figure labels: DA, STR, Ctx, D1/5]
Dopamine and acetylcholine meet in the striatum
[Figure panels: Mouse, Monkey]
Facts to remember (1)
• Basal ganglia receive cortical input
• Basal ganglia project to frontal cortex
• Dopamine and acetylcholine localization
The midbrain dopamine system
Schultz et al., J. Neurosci 13: 900-913, 1993
Probabilistic instrumental conditioning task
$\delta_t = r_t + \gamma V_{t+1} - V_t$
Morris et al., Neuron 43(1): 133-143, 2004
DA response
Dopamine population response- cue
n=114
Dopamine population response-reward
Dopamine population response – reward omission
Instrumental conditioning - results
• Responses to visual cue are correlated with future reward probability
• Responses to reward are inversely correlated with reward probability
• Responses to reward omission are indifferent to reward probability
Dopamine neurons provide an accurate TD signal (but only in the positive domain)
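The three results line up with what a TD error predicts for a cue that signals a unit reward with probability p. The sketch below is an idealisation, not an analysis of the recorded data: it takes γ = 1, assumes the prediction before the cue is 0, and models the "only in the positive domain" remark with a hypothetical floor on negative errors. The function names, the floor value and the list of probabilities are assumptions.

```python
def td_responses(p):
    """TD errors at cue, at reward delivery and at reward omission, for reward probability p."""
    cue = p - 0.0        # cue arrives unpredictably, so the prior prediction is taken as 0
    reward = 1.0 - p     # reward delivered: r = 1, prediction was p
    omission = 0.0 - p   # reward withheld: r = 0, prediction was p
    return cue, reward, omission

def rectified(delta, floor=-0.1):
    """Hypothetical limit on how far the signal can go below baseline."""
    return max(delta, floor)

for p in (0.25, 0.5, 0.75, 1.0):
    cue, reward, omission = td_responses(p)
    print(p, cue, reward, rectified(omission))
```

The cue term grows with p and the reward term shrinks with p. The omission term would scale with p in an unrectified TD signal, but once clipped at the floor it is the same for every p in the list.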
… and it can cause long term plasticity of cortico-striatal synapses
Reynolds et al., A cellular mechanism of reward-related learning, Nature 413: 67-70 (2001)
[Figure labels: DA, STR, Ctx, D1/5]
… and it can cause long term plasticity of cortico-striatal synapses
Shen et al., Science 321:848-851 2008
Facts to remember (2)
• DA neurons provide a TD error signal
• To the cortico (state) striatal (action) synapses
• And DA modulates synaptic plasticity
Control - Adding action
[Diagram labels: Agent, Environment; State 1, State 2, State; Action 1; Action; + Reinforcement]
The agent has to:
– Learn to predict reinforcement → state value
– Know the state-action-state transitions → behavioural policy
Solution 1: actor/critic networks
[Diagram labels: Environment, State, Reward, Critic, TD, Actor, Action]
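A minimal actor/critic can be sketched on a two-armed bandit with a single state. This is an illustrative version and not the network in the diagram: the critic keeps one value estimate V, the actor keeps one preference per action, and both are updated by the same TD error. The softmax policy, the preference update, the function names and all parameter values are assumptions.

```python
import math
import random

def softmax(m):
    top = max(m)                               # subtract the max for numerical stability
    exps = [math.exp(x - top) for x in m]
    total = sum(exps)
    return [e / total for e in exps]

def actor_critic_bandit(reward_probs, n_trials=5000, eta=0.1, alpha=0.1, seed=0):
    rng = random.Random(seed)
    V = 0.0                                    # critic: predicted reward in the single state
    m = [0.0] * len(reward_probs)              # actor: one preference per action
    for _ in range(n_trials):
        probs = softmax(m)
        a = rng.choices(range(len(m)), weights=probs)[0]
        r = 1.0 if rng.random() < reward_probs[a] else 0.0
        delta = r - V                          # TD error; the trial ends, so the next value is 0
        V += eta * delta                       # critic update
        for b in range(len(m)):                # actor update, driven by the same error
            indicator = 1.0 if b == a else 0.0
            m[b] += alpha * delta * (indicator - probs[b])
    return V, softmax(m)

V, policy = actor_critic_bandit([0.2, 0.8])
print(round(V, 2), [round(p, 2) for p in policy])
```

The critic never chooses and the actor never predicts. The TD signal is the only thing that links them, which is the role proposed for dopamine later in the talk.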
How can the dopamine signal contribute to decision behaviour?
• Long term policy-shaping effect through synaptic plasticity
• Immediate effect on action
[Equation: $P_{action} = 1/(1 + e^{\ldots})$, a logistic function whose exponent involves m, t and b]
[Figure labels: DA, STR, Ctx, D1/5, CS]
[Diagram labels, shown twice: State, Action, Environment, Actor, Critic, Reward, TD]
Monkeys’ decisions: probability matching
The two armed bandit task
Morris et al., Nature Neurosci 9: 1057-1063, 2006
Monkeys’ decisions: probability matching
Lost in translation?
[Diagram labels: reward, dopamine response, plasticity in action circuits, behaviour]
Monkeys’ decisions: shaping by dopamine
Coding 2011 46
Dopamine neurons during decision
[Figure labels: stimulus, decision, action; DA, DA]
ICNC 2005 47
Are DA neurons aware of future choice?
[Figure panels: Hi (explore), Lo (exploit)]
The learning is of state-action values
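Learning state-action values, rather than state values plus a separate policy, can be sketched on the same kind of two-armed bandit. This is a generic illustration and not the model fitted to the recordings: each action keeps its own value Q, the prediction error is computed against the value of the action actually taken, and choices are ε-greedy. The function name and all parameter values are assumptions.

```python
import random

def learn_action_values(reward_probs, n_trials=5000, eta=0.1, epsilon=0.1, seed=0):
    rng = random.Random(seed)
    Q = [0.0] * len(reward_probs)   # one value per action (there is a single state)
    for _ in range(n_trials):
        if rng.random() < epsilon:
            a = rng.randrange(len(Q))                    # explore
        else:
            a = max(range(len(Q)), key=lambda i: Q[i])   # exploit
        r = 1.0 if rng.random() < reward_probs[a] else 0.0
        delta = r - Q[a]            # error relative to the chosen action's value
        Q[a] += eta * delta
    return Q

Q = learn_action_values([0.2, 0.8])
print([round(q, 2) for q in Q])
```

Because the error depends on which action was taken, a signal of this kind carries information about the choice itself, which is the sense in which DA neurons could be "aware" of the upcoming action.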