
Learning in an uncertain and changing world

Insight from neuroscience to decision science

Florent MEYNIEL

Neurospin, CEA, France

Example: the monthly results of your stock. Should you sell or keep it?

$ $ X $ $ X $ $ $ X X $ $ $ $ X $ $ $

Time

Starting with an example


● Can we assume a separation between inference and choice processes?

● What is being inferred when subjects predict future observations?

● How do people weight sequential observations in the face of change points?

● Do people entertain a hierarchical model when learning in an uncertain and changing world?

Questions addressed in this presentation

Separating inference and decision processes

Bayesian Decision Theory

Maloney & Zhang 2010 Vision research

Estimation (inference)

Selection (optimization)

Estimate the state of the world (θ), given the observations received (o) and my model of the world (m). E.g., what are the rewards associated with the different options? …

p(θ | o, m) ∝ p(o | θ, m) p(θ | m)

Given an objective function (F, a.k.a. cost function, utility function) and the estimated state of the world, select the best option.

argmax_a F(a, θ)
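The two stages can be illustrated on the stock example from the opening slide. This is a minimal sketch, not a method taken from the cited work: the uniform Beta(1, 1) prior, the payoff values `gain` and `loss`, and the function name `F` are all assumptions chosen for illustration. Because this hypothetical objective is linear in θ, using the posterior mean of θ is enough to rank the actions.

```python
from fractions import Fraction

# Monthly results from the opening example: "$" = gain, "X" = loss.
history = "$ $ X $ $ X $ $ $ X X $ $ $ $ X $ $ $".split()

# Estimation: with a uniform Beta(1, 1) prior (an assumption), the posterior
# over theta = p(gain) is Beta(1 + gains, 1 + losses); we keep its mean.
gains = history.count("$")
losses = history.count("X")
theta_hat = Fraction(1 + gains, 2 + gains + losses)


def F(action, theta, gain=1, loss=2):
    """Hypothetical objective function: expected payoff of an action."""
    if action == "sell":
        return Fraction(0)  # leaving the market: no gain, no loss
    return theta * gain - (1 - theta) * loss  # "keep"


# Selection: pick the action that maximizes the objective function.
actions = ["sell", "keep"]
best = max(actions, key=lambda a: F(a, theta_hat))
print(theta_hat, best)
```

Changing the hypothetical payoffs (for instance a larger `loss`) can flip the selected action while the estimate of θ stays the same, which is the point of treating estimation and selection as separate stages.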

“Actor-critic” model

Sutton & Barto 1998 MIT press

[Diagram labels: Critic, Actor, World; arrows: feedback (twice), actions, reinforcement signal]






Neuro-anatomy of “Actor-critic”

O’Doherty et al Science 2004

Ventral striatum (critic): compare expected reward and actual reward. Result: prediction error (dopamine system)

Dorsal striatum (actor): select action
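The division of labor between critic and actor can be sketched on a toy two-action problem. This is a simplified illustration under stated assumptions, not the model fitted in the cited studies: the deterministic `rewards`, the learning rates, the softmax policy and the simplified preference update are all hypothetical choices.

```python
import math
import random

random.seed(0)

rewards = [0.0, 1.0]      # hypothetical world: action 1 always pays, action 0 never
preferences = [0.0, 0.0]  # actor: one preference per action
value = 0.0               # critic: expected reward
alpha_critic, alpha_actor = 0.1, 0.1


def softmax(prefs):
    """Turn preferences into choice probabilities."""
    exps = [math.exp(p) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


for _ in range(500):
    probs = softmax(preferences)
    action = random.choices([0, 1], weights=probs)[0]  # actor selects an action
    reward = rewards[action]                           # feedback from the world
    delta = reward - value                             # critic: prediction error
    value += alpha_critic * delta                      # critic updates its expectation
    preferences[action] += alpha_actor * delta         # reinforcement signal trains the actor

print(softmax(preferences))
```

The same prediction error `delta` serves both components: it corrects the critic's expectation and tells the actor whether the chosen action was better or worse than expected.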





Bayesian inference provides a principled framework for inference and prediction

● Bayesian inference computes the posterior probability of statistics.

Learn the statistics θ of the observations, given assumptions (M) about the generative process: p(θ | o_1, …, o_N, M)

● Bayesian inference provides optimal predictions about future observations, given the previous observations and assumptions about the generative process. Such a predictor is called an ideal observer.

Turn this estimate into a prediction: p(θ | o_1, …, o_N, M) × p(o_{N+1} | θ, M)

● The inference can be iterated (Gelman, Bishop, Sutton & Barto). Tracking improbable (i.e. surprising) events allows the ideal observer to revise the estimates of statistics (Friston 2005).
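The iteration of estimate, predict and compare can be sketched for a binary sequence. This is a minimal sketch assuming a fixed Bernoulli generative process with a Beta prior; the function name `ideal_observer` and the uniform prior are illustrative assumptions. Under these assumptions the prediction for the next observation, averaged over θ, equals the posterior mean of θ.

```python
import math


def ideal_observer(sequence, prior_a=1.0, prior_b=1.0):
    """Iterated Beta-Bernoulli inference over a binary sequence (1s and 0s).

    Returns the surprise -log2 p(o_{N+1} | o_1..o_N) of each observation,
    computed before that observation updates the posterior, together with
    the prediction p(o = 1) for the observation that would come next.
    """
    a, b = prior_a, prior_b          # Beta(a, b) posterior over theta = p(o = 1)
    surprises = []
    for o in sequence:
        p_one = a / (a + b)          # prediction: posterior mean of theta
        p_obs = p_one if o == 1 else 1.0 - p_one
        surprises.append(-math.log2(p_obs))
        a, b = a + o, b + (1 - o)    # the posterior becomes the next prior
    return surprises, a / (a + b)


surprises, p_next = ideal_observer([1, 1, 1, 1, 0])
print([round(s, 2) for s in surprises], round(p_next, 2))
```

In this toy sequence the final 0 is the most surprising item, because the four preceding 1s had raised the predicted probability of observing a 1.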

Reverse-engineering surprise signals unravels the statistical model used by the brain

Compute statistics

Make predictions

Compare predictions to new observations

Oops!

Computational neuroscience

Evidence for an automatic tracking of statistical regularities in electro-encephalography (EEG) signals

EEG recorded during passive listening.

[Figure: sequence of two sounds, X and Y, differing in pitch, presented over time]

Meyniel, Maheu & Dehaene, PLoS Computational Biology 2016

Squires et al Science 1976

The P300 amplitude relates to the improbability of the current sound given the previous ones, suggesting a tracking of:
→ The global item frequency in the entire sequence, e.g. p(X) = 0.7 vs. p(X) = 0.3.
→ The local item frequency in the recent history, e.g. XXXXX vs. YYYYX.
→ The local alternation frequency, i.e. whether items were repeated in the recent history, e.g. YYXXX vs. YXYXX.

Replication and further computational refinements: Mars 2008, Kolossa 2013, Lieder 2013, Maheu 2017...

What is the simplest model that the brain must entertain to account for these effects?

What is being inferred: local transition probabilities

[Figure panels: EEG data; local transition probability model]

Surprise = -log(p(observation))

Implication: tracking transition probabilities may render the brain fit to detect serial correlations, and even causality.
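A local estimate of transition probabilities and the resulting surprise can be sketched as follows. This is an illustrative sketch, not the authors' implementation: the exponential decay of transition counts, the single pseudo-count per transition, the two-item alphabet `"XY"` and the function name `transition_surprise` are all assumptions.

```python
import math


def transition_surprise(sequence, omega=16.0, items="XY"):
    """Surprise of each item under leaky estimates of transition probabilities.

    counts[prev][cur] holds exponentially decayed counts of the transitions
    prev -> cur; a uniform prior adds one pseudo-count per transition.
    """
    decay = math.exp(-1.0 / omega)
    counts = {p: {c: 0.0 for c in items} for p in items}
    surprises = []
    for prev, cur in zip(sequence, sequence[1:]):
        row = counts[prev]
        p_cur = (row[cur] + 1.0) / (sum(row.values()) + len(items))
        surprises.append(-math.log(p_cur))   # Surprise = -log(p(observation))
        for p in items:                      # leak: all past transitions fade
            for c in items:
                counts[p][c] *= decay
        counts[prev][cur] += 1.0             # then count the new transition
    return surprises


s_repeat = transition_surprise("XXXXXX")[-1]     # X after a run of X
s_alternate = transition_surprise("XXXXXY")[-1]  # Y after a run of X
print(round(s_repeat, 2), round(s_alternate, 2))
```

Because the counts are kept separately for each preceding item, the same estimator is sensitive to item frequency and to alternation frequency, which are the two local effects listed for the P300.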

Similar sequential effects are found in very simple decisions

Since the same sequential effects are observed in brain responses that signal surprising events, the sequential effects observed in behavior derive from the subjects' expectations regarding the statistical properties underlying their observations.

[Figure: a simple reaction time task; y-axis: reaction times (ms), from 310 to 400. Redrawn from Cogn Affect Behav Neurosci. 2002; 2: 283–299]

The local transition probability model accounts for the asymmetric perception of randomness

Rating of the perceived randomness of binary sequences. (Falk, 1975)

O O X O X X O X O X O O O O X X O X O O O






→ here, p(alternate) = 12/20

→ Studies of perceived randomness show a bias for alternations, with a maximum of perceived randomness around p(alternate) = 0.6 (Falk, 1975; Falk & Konold, 1997; Bakan, 1960; Budescu, 1987; Rapoport & Budescu, 1992; Kareev, 1992).
→ The perceived randomness can be formalized as a posterior entropy.
→ Our model predicts an asymmetry of the perceived entropy (all the stronger when the integration is more local).
→ The asymmetry is specific to our model.
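The value p(alternate) = 12/20 can be checked by counting, among the 20 successive pairs of the 21-item sequence shown on this slide, how many pairs contain two different items. The variable names below are illustrative.

```python
sequence = "O O X O X X O X O X O O O O X X O X O O O".split()

# Each pair of successive items is either a repetition or an alternation.
pairs = list(zip(sequence, sequence[1:]))
alternations = sum(prev != cur for prev, cur in pairs)
p_alternate = alternations / len(pairs)
print(alternations, len(pairs), p_alternate)
```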

It is necessary to forget in the face of change points

[Figure: (A) Observed sequence. (B) Learning processes with different reliance on past observations. x-axis: observation number, from 0 to 250; y-axis: p(o), from 0 to 1; legend: all previous, 15, true]

● When faced with change points, one should rely more on recent observations in order to quickly update one's knowledge.

● Estimates are less accurate when one erroneously assumes that there is no change point than when one erroneously assumes that there are change points.

● Implication: in general, it is optimal to assume that there are change points, and hence it is rational to rely more on recent observations (Yu and Cohen NIPS 2009).

● Implication: the probability-matching behavior lawfully emerges from a very local inference (Yu and Huang Decision 2014).
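The cost of ignoring a change point can be sketched with a deterministic toy sequence. This is an illustration under stated assumptions rather than a reproduction of the figure: the sequence, the exponential weighting exp(-k/omega), the uniform prior and the function name `estimate` are all hypothetical choices.

```python
import math


def estimate(sequence, omega=None):
    """Estimate p(o = 1) after the whole sequence.

    omega=None weights all previous observations equally (no change point
    assumed); otherwise an observation k steps back gets weight exp(-k/omega).
    """
    ones = zeros = 0.0
    decay = 1.0 if omega is None else math.exp(-1.0 / omega)
    for o in sequence:
        ones, zeros = ones * decay, zeros * decay  # older observations fade
        if o == 1:
            ones += 1.0
        else:
            zeros += 1.0
    return (ones + 1.0) / (ones + zeros + 2.0)     # uniform prior pseudo-counts


# Deterministic toy sequence with one change point: p(1) = 0.8, then 0.2.
sequence = [1, 1, 1, 1, 0] * 20 + [0, 0, 0, 0, 1] * 20

p_all = estimate(sequence)               # relies on all previous observations
p_recent = estimate(sequence, omega=15)  # relies mostly on recent observations
print(round(p_all, 3), round(p_recent, 3))
```

With equal weights the final estimate averages over both regimes and sits far from the current value of 0.2, whereas the leaky estimate has largely forgotten the first regime.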

EEG surprise signals indicate that the inference is local

The amplitude of the P300, an EEG signature of surprise, indicates that subjects spontaneously predict the next observation using a local inference.

The best fit is obtained with a leak factor ω=16, i.e. a given observation retains half its weight after 16*ln(2)≈11 new observations.
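The half-life figure follows if the weight of an observation decays as exp(-k/ω) after k new observations, which is the form implied by the 16*ln(2) expression; the exact parameterization used for the fit is an assumption here.

```python
import math

omega = 16.0                             # leak factor reported above
half_life = omega * math.log(2)          # solves exp(-k / omega) = 1/2 for k
weight_after_11 = math.exp(-11 / omega)  # remaining weight after 11 observations
print(round(half_life, 2), round(weight_after_11, 3))
```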

Multiple time scales of integration co-exist in the brain

The local transition probability model can be fit to each time point of the evoked response recorded with MEEG.

Maheu, Dehaene & Meyniel, in prep

● Later brain responses are best explained by increasingly shorter integration windows.
● Late brain responses (>300 ms) typically correspond to conscious brain processes.
● The very short integration windows of late brain responses may correspond to a conscious search for “patterns” in the observed sequence of stimuli.



Do subjects entertain a hierarchical inference when learning in an uncertain and changing world?

Some neuroscientists propose that the brain computes a hierarchical model (Friston, Klaas, Mathys, Nassar, Gallistel, Meyniel, … ) while others propose that a leaky integration suffices (Yu, Fusi, Soltani, … )

Behavioral evidence in favor of hierarchical inference in the human brain

Learning rate increases when volatility increases

Behrens et al Nat Neuro 2007

Task: choose the best cue; the reward rate associated with each cue changes occasionally at “change points”.

[Figure: y-axis: learning rate]

Learning rate increases after a change point

Nassar et al Nat Neuro 2012

Task: predict the mean of a Gaussian distribution, whose mean (and SD) changes occasionally at “change points”.

[Figure: x-axis: trial number relative to change point; y-axis: learning rate]

Subjects detect change points

Meyniel et al PCB 2015

Task: estimate the (volatile) probabilities that generate a binary sequence, and detect the moment they change.

See also: Gallistel et al Psych Rev 2014

Neural evidence in favor of hierarchical inference in the human brain

Meyniel and Dehaene PNAS 2017

[Figure labels: IPS, OFC, TO]

● While subjects covertly estimated the probabilities underlying a sequence of stimuli, the (optimal) confidence level accompanying probability estimates correlated with the activity of brain-scale networks.

● Activity in this region predicted the accuracy of subjects' confidence reports.

● Those results indicate that the brain indeed tracks probabilities and even the reliability of its estimates, and computes a hierarchical probabilistic inference.

Summary

● Our brain is equipped with a powerful machinery for computing statistics from sequences of observations that may be used to guide decisions.

● The brain infers, at a minimum, the transition probabilities between successive event types.

– This can account for the tendency to perceive serial correlations and causal relations, and for a biased perception of randomness.

● This machinery is tuned to non-stationarity and favors recent observations to estimate current statistics.

– This can account for various behavioral effects: probability matching, sequential effects in choices and reaction times, a conscious search for patterns, recency effects, instability…

● The human learning algorithm is hierarchical and Bayesian: it takes into account the occurrence of change points and the confidence in its own inference:

– This can account for the flexibility of human learning.

– Unfit priors (e.g. regarding the probability of observations or the volatility) may account for sub-optimal behavior and apparent irrationalities.

Expectations emerge more rapidly from repetitions than alternations

A qualitative agreement with the P300 data by Squires et al. 1976



[Figure annotations: alternation frequency effect; local frequency effect; the order matters (order reversed); local effects even when no global bias; global item frequency effect; stronger expectations after repetitions than alternations]

A modified task to further test the computation of transition probabilities

A modified version of the original task by Squires et al: additional blocks with biased transition probabilities

MEG recorded during passive listening.

[Figure: sequence of two sounds, A and B, differing in pitch, presented over time. Block types: no bias; global frequency bias; global alternation bias; global repetition bias. Quantities shown: p(A|B), p(B|A), p(Alt.), p(A)]


The local transition probability model accounts for both local and global effects of statistics in all conditions

We collapsed data across time (by averaging between 500 and 730 ms) and space (by filtering). The spatial filters were estimated and applied using cross-validation on half of the data.


Can human subjects explicitly track time-varying transition probabilities?

Meyniel, Schlunegger & Dehaene, PLoS Computational Biology 2015

Bayesian inversion by the Ideal Observer (infer probabilities given the observations)
