utility-weighted sampling in decisions from experience · utility-weighted sampling in decisions...

Post on 23-Jun-2020

14 Views

Category:

Documents

0 Downloads

Preview:

Click to see full reader

TRANSCRIPT

Utility-weighted sampling in decisions from experience

Falk Lieder, Tom Griffiths, Ming HsuUC Berkeley

Extreme potential outcomes influence people as if they were far more likely than they really are.

Extreme potential outcomes influence people as if they were far more likely than they really are.

Extreme potential outcomes influence people as if they were far more likely than they really are.

Extreme potential outcomes influence people as if they were far more likely than they really are.

38% of Americans say they are less likely to travel overseas because of 9/11.

Extreme potential outcomes influence people as if they were far more likely than they really are.

38% of Americans say they are less likely to travel overseas because of 9/11.

Expected Utility Theory

Expected Utility Theory

Take action

utility ofoutcome O

expected value

Expected Utility Theory

Take action

utility ofoutcome O

expected value

Expected Utility Theory

Take action

utility ofoutcome O

expected value

Intractable!

Expected Utility Theory

Take action

utility ofoutcome O

expected value

Intractable!

Violated!

EU can be Approximated by Sampling

EU can be Approximated by Sampling

simulated outcomes

EU can be Approximated by Sampling

simulated outcomes

EU estimates

EU can be Approximated by Sampling

simulated outcomes

EU estimates decision

EU can be Approximated by Sampling

simulated outcomes

EU estimates decision

finite time finitely many simulated outcomes

Representative sampling is dangerous

Representative sampling is dangerous

Representative sampling is dangerous

variance

Representative sampling is dangerous

variance

Representative sampling is dangerous

biasvariance

Representative sampling is dangerous

vs.

biasvariance

Utility estimation by importance samplingfu

nde

ath

p

Utility estimation by importance sampling

simulatedoutcomes

fun

deat

h

pq

Utility estimation by importance sampling

simulatedoutcomes

fun

deat

h

o1, o2, o3 q~

pq

Utility estimation by importance sampling

simulatedoutcomes EU estimates

fun

deat

h

o1, o2, o3 q~

pq

Utility estimation by importance sampling

simulatedoutcomes EU estimates

fun

deat

h

o1, o2, o3 q~

w1=p(o1)/ q(o1)p

q

Utility estimation by importance sampling

simulatedoutcomes EU estimates

fun

deat

h

o1, o2, o3 q~

w1=p(o1)/ q(o1)

w3=p(o3)/ q(o3)…

pq

Utility estimation by importance sampling

simulatedoutcomes EU estimates decision

fun

deat

h

o1, o2, o3 q~

w1=p(o1)/ q(o1)

w3=p(o3)/ q(o3)…

pq

Utility estimation by importance sampling

Which distribution should the brain sample from?

simulatedoutcomes EU estimates decision

fun

deat

h

o1, o2, o3 q~

w1=p(o1)/ q(o1)

w3=p(o3)/ q(o3)…

pq

Answer: Utility-Weighted Sampling (UWS)

simulationfrequency

probability

extremity

Answer: Utility-Weighted Sampling (UWS)

simulationfrequency

probability

extremity

Lieder, Hsu, Griffiths (2014)

Decisions from Experience (Ludvig, et al., 2014)

Decisions from Experience (Ludvig, et al., 2014)

Decisions from Experience (Ludvig, et al., 2014)

Decisions from Experience (Ludvig, et al., 2008)

Decisions from Experience (Ludvig, et al., 2008)

Decisions from Experience (Ludvig, et al., 2008)

Inconsistent Risk Preferences Emerge from Learning

+40/0 vs. +20

-40/ 0 vs. -20

+45/5 vs. +25

-45/-5 vs. -25

UWS Can Emerge from Reward-Modulated Associative Learning

Actions:

UWS Can Emerge from Reward-Modulated Associative Learning

Actions:

at

UWS Can Emerge from Reward-Modulated Associative Learning

Actions:

Outcomes: -40 -20 +20 +40

at

ot

UWS Can Emerge from Reward-Modulated Associative Learning

Actions:

Outcomes: -40 -20 +20 +40

wt

at

ot

UWS Can Emerge from Reward-Modulated Associative Learning

Actions:

Outcomes: -40 -20 +20 +40

wt

at

ot

prediction error

UWS Can Emerge from Reward-Modulated Associative Learning

Actions:

Outcomes: -40 -20 +20 +40

wt

learning rate

at

ot

prediction error

UWS Can Emerge from Reward-Modulated Associative Learning

Actions:

Outcomes: -40 -20 +20 +40

wt

learning rate

at

ot

prediction error

UWS Can Emerge from Reward-Modulated Associative Learning

Actions:

Outcomes: -40 -20 +20 +40

wt

For all o≠ot:

forgetting rate

learning rate

at

ot

prediction error

Learning Rule Convergences to Utility-Weighted Sampling

Utility-weighted learning converges to

with

Learning Rule Convergences to Utility-Weighted Sampling

Utility-weighted learning converges to

with

with activation function the networklearns to perform utility-weighted sampling.

Efficient coding (Summerfield & Tsetsos, 2015)

Efficient coding (Summerfield & Tsetsos, 2015)

Efficient coding (Summerfield & Tsetsos, 2015)

Model fitting

Maximum-Likelihood-Estimation offrom block-by-block choice frequencies inExperiments 1-4 by Ludvig et al. (2014).

A single set of parameters fits all experiments.

UWS captures that people learn to overweight extreme outcomes

+40/0 vs. +20

-40/ 0 vs. -20

+45/5 vs. +25

-45/-5 vs. -25

UWS captures that people learn to overweight extreme outcomes

+40/0 vs. +20

-40/ 0 vs. -20

+45/5 vs. +25

-45/-5 vs. -25

Utility-Weighted Sampling Captures Memory Biases (Madan et al. 2014)

Which outcome comes to mind first?

Utility-Weighted Sampling Captures Memory Biases (Madan et al. 2014)

Which outcome comes to mind first?

Utility-Weighted Sampling Captures Memory Biases (Madan et al. 2014)

Which outcome comes to mind first?

Utility-Weighted Sampling Captures Frequency Estimation Bias (Madan et al. 2014)

How often did this door lead to each outcome?

+40: ___ %+20: ___ %

0: ___ %

Utility-Weighted Sampling Captures Frequency Estimation Bias (Madan et al. 2014)

How often did this door lead to each outcome?

+40: ___ %+20: ___ %

0: ___ %

Utility-Weighted Sampling Captures Frequency Estimation Bias (Madan et al. 2014)

How often did this door lead to each outcome?

+40: ___ %+20: ___ %

0: ___ %

Biased Beliefs Predict Risk Seeking

rpeople= +0.16; p<0.05

Judged Freq. of High Gain (%)

Biased Beliefs Predict Risk SeekingrUWS = +0.23rpeople= +0.16; p<0.05

Judged Freq. of High Gain (%)

Biased Beliefs Predict Risk SeekingrUWS = +0.23rpeople= +0.16; p<0.05 rpeople = -0.48; p< 0.05

Judged Freq. of High Gain (%) Judged Freq. of Large Loss (%)

Biased Beliefs Predict Risk SeekingrUWS = +0.23 rUWS = -0.44rpeople= +0.16; p<0.05 rpeople = -0.48; p< 0.05

Judged Freq. of High Gain (%) Judged Freq. of Large Loss (%)

Conclusions

|PE|

Conclusions1. Utility-weighted sampling provides a unifying explanation

for biases in memory, judgment, and decision making.

|PE|

Conclusions1. Utility-weighted sampling provides a unifying explanation

for biases in memory, judgment, and decision making.2. Utility-weighted sampling can emerge from reward-

modulated associative learning.

|PE|

Conclusions1. Utility-weighted sampling provides a unifying explanation

for biases in memory, judgment, and decision making.2. Utility-weighted sampling can emerge from reward-

modulated associative learning.3. People overweight extreme events, because it is rational

to focus on the most important eventualities.

|PE|

Conclusions1. Utility-weighted sampling provides a unifying explanation

for biases in memory, judgment, and decision making.2. Utility-weighted sampling can emerge from reward-

modulated associative learning.3. People overweight extreme events, because it is rational

to focus on the most important eventualities.4. Some cognitive biases may serve or reflect the rational

allocation of finite cognitive resources.

|PE|

Thank you!

This work was supported by grant number N00014-13-1-0341 from the Office of Naval Research to Thomas L. Griffiths, grant number RO1 MH098023 from the National Institutes of Health to Ming Hsu, and a Berkeley Graduate Fellowship to Falk Lieder.We thank Elliot Ludvig, Quentin Huys, and Mike Pacer for their suggestions and feedback.

Poster T28

top related