Wisdom of Crowds in Human Memory: Reconstructing Events by Aggregating Memories across Individuals
Mark Steyvers
Department of Cognitive Sciences
University of California, Irvine
Joint work with: Brent Miller, Pernille Hemmer, Mike Yi, Michael Lee, Bill Batchelder, Paolo Napoletano
Wisdom of crowds phenomenon
Group estimate often performs as well as or better than best individual in the group
Examples of wisdom of crowds phenomenon
  Who wants to be a millionaire?
  Galton’s Ox (1907): median of individual estimates comes close to true answer
Tasks studied in our research (problems involving permutations)
  Ordering/ranking problems
    declarative memory: order of US presidents, ranking cities by size
    episodic memory: order of events (i.e., serial recall)
    predictive rankings: fantasy football
  Matching problems
    assign N items to N responses
    e.g., match paintings to artists, or flags to countries
  Traveling Salesman problems
    find shortest route between cities
Recollecting order from declarative memory
  Place these presidents in the correct order (along a time line)
  Two example arrangements shown on the slide:
    Ulysses S. Grant, James Garfield, Rutherford B. Hayes, Abraham Lincoln, Andrew Johnson
    James Garfield, Ulysses S. Grant, Rutherford B. Hayes, Andrew Johnson, Abraham Lincoln
Recollecting order from episodic memory
  Video: http://www.youtube.com/watch?v=a6tSyDHXViM&feature=related
  Place scenes in correct order (serial recall)
  [Figure: scenes A, B, C, D arranged along a time line]
Goal: aggregating responses
  Individual responses: D A B C; A B D C; B A D C; A C B D; A D B C
  Aggregation algorithm produces the group answer: A B C D
  Does the group answer equal the ground truth (A B C D)?
Bayesian Approach
  Individual responses: D A B C; A B D C; B A D C; A C B D; A D B C
  Generative model
  Group answer (A B C D) = latent random variable
Task constraints
  No communication between individuals
  There is always a true answer (ground truth)
  Aggregation algorithm never has access to ground truth
    unsupervised methods
    ground truth only used for evaluation
Research Goals
  Aggregation of permutation data
    going beyond numerical estimates or multiple choice questions
    combinatorially complex
  Incorporate individual differences
    going beyond models that treat every vote equally
    assume some individuals might be “experts”
  Take cognitive processes into account
    going beyond mere statistical aggregation
  Hierarchical Bayesian models
Part I: Ordering Problems
Experiment 1
  Task: order all 44 US presidents
  Methods
    26 participants (college undergraduates)
    Names of presidents written on cards
    Cards could be shuffled on large table
Measuring performance
  Kendall’s Tau: the number of adjacent pairwise swaps needed to turn an individual's ordering into the true order
  Ordering by individual: A B E C D
  True order: A B C D E
  A B E C D -> A B C E D (1 swap) -> A B C D E (1 + 1 swaps), so tau = 2
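The swap count on this slide can be checked with a short sketch. It counts the item pairs that are in opposite relative order in the two lists, which equals the minimum number of adjacent swaps. The function name `kendall_tau_distance` is chosen here for illustration and is not from the talk.

```python
def kendall_tau_distance(ordering, truth):
    """Count item pairs whose relative order differs between the two lists.

    This pair count equals the minimum number of adjacent swaps needed to
    turn `ordering` into `truth`.
    """
    position = {item: index for index, item in enumerate(truth)}
    ranks = [position[item] for item in ordering]
    swaps = 0
    for i in range(len(ranks)):
        for j in range(i + 1, len(ranks)):
            if ranks[i] > ranks[j]:  # this pair is out of order
                swaps += 1
    return swaps


individual = list("ABECD")
truth = list("ABCDE")
print(kendall_tau_distance(individual, truth))  # prints 2
```

The quadratic loop is fine for lists of 10 or 44 items, the sizes used in these experiments.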
Empirical Results
  [Figure: distance to the true order for each individual, individuals ordered from best to worst, with the level of random guessing marked]
Many methods for analyzing rank data…
  Probabilistic models: Thurstone (1927), Mallows (1957), Plackett-Luce (1975), Lebanon-Mao (2008)
  Spectral methods: Diaconis (1989)
  Heuristic methods from voting theory: Borda count
  … however, many of these approaches were developed for preference rankings
Bayesian models constrained by human cognition
  Extension of Thurstone’s (1927) model
  Extension of Estes (1972) perturbation model
Bayesian Thurstonian Approach
  Each item (A, B, C) has a true coordinate on some dimension
  … but there is noise because of encoding and/or retrieval error
  Each person’s mental representation is based on (latent) samples of these distributions
  The observed ordering is based on the ordering of the samples
    Person 1, observed ordering: A < B < C
  People draw from distributions with common means but different variances
    Person 1, observed ordering: A < B < C
    Person 2, observed ordering: A < C < B
Graphical Model Notation
  [Figure: a plate over x_j, j = 1..3, drawn next to the separate nodes x_1, x_2, x_3]
  shaded = observed
  not shaded = latent
Graphical Model of Bayesian Thurstonian Model
  Latent ground truth: $\mu$
  Individual noise level: $\sigma_j \sim \mathrm{Gamma}(\lambda, 1/\lambda)$
  Mental representation: $x_{ij} \mid \mu_i, \sigma_j \sim N(\mu_i, \sigma_j)$
  Observed ordering: $y_j = \mathrm{rank}(x_j)$
  (plate over j individuals)
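The generative story on this slide (a Gamma-distributed noise level per person, normal samples around common item means, and an observed ordering given by sorting the samples) can be simulated in a few lines. This is a minimal sketch, not the inference procedure used in the talk. The function names, the item means, and the value `lam=2.0` are illustrative assumptions, and the Gamma distribution is assumed to be parameterized by shape and scale.

```python
import random


def simulate_person(mu, sigma, rng):
    """Sample one latent value per item and return items sorted by sample."""
    samples = {item: rng.gauss(mean, sigma) for item, mean in mu.items()}
    return sorted(samples, key=samples.get)


def simulate_crowd(mu, n_people, lam, rng):
    """Give each person a noise level from Gamma(shape=lam, scale=1/lam)."""
    crowd = []
    for _ in range(n_people):
        sigma = rng.gammavariate(lam, 1.0 / lam)  # assumed parameterization
        crowd.append((sigma, simulate_person(mu, sigma, rng)))
    return crowd


def share_correct(mu, sigma, n_trials, rng):
    """Fraction of simulated orderings that match the order of the means."""
    target = sorted(mu, key=mu.get)
    hits = sum(simulate_person(mu, sigma, rng) == target for _ in range(n_trials))
    return hits / n_trials


rng = random.Random(0)
mu = {"A": 1.0, "B": 2.0, "C": 3.0}  # illustrative "true coordinates"
for sigma, ordering in simulate_crowd(mu, n_people=5, lam=2.0, rng=rng):
    print(round(sigma, 2), ordering)
# A low-noise person should reproduce the true order more often than a noisy one.
print(share_correct(mu, 0.1, 500, rng), share_correct(mu, 2.0, 500, rng))
```

Inference runs in the other direction: given only the observed orderings, the model estimates the item means and each person's noise level.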
(weak) wisdom of crowds effect
  [Figure: distance to the true order for individuals (ordered from best to worst) and for the Thurstonian model]
  model’s ordering is as good as best individual (but not better)
Inferred Distributions for 44 US Presidents
  [Figure: inferred distribution for each president; legend: median and minimum sigma]
  George Washington (1); John Adams (2)
  Thomas Jefferson (3); James Madison (4); James Monroe (6)
  John Quincy Adams (5); Andrew Jackson (7)
  Martin Van Buren (8); William Henry Harrison (21)
  John Tyler (10); James Knox Polk (18)
  Zachary Taylor (16); Millard Fillmore (11); Franklin Pierce (19)
  James Buchanan (13); Abraham Lincoln (9)
  Andrew Johnson (12); Ulysses S. Grant (17)
  Rutherford B. Hayes (20); James Garfield (22); Chester Arthur (15)
  Grover Cleveland 1 (23); Benjamin Harrison (14)
  Grover Cleveland 2 (25); William McKinley (24)
  Theodore Roosevelt (29); William Howard Taft (27)
  Woodrow Wilson (30); Warren Harding (26); Calvin Coolidge (28); Herbert Hoover (31)
  Franklin D. Roosevelt (32); Harry S. Truman (33)
  Dwight Eisenhower (34); John F. Kennedy (37)
  Lyndon B. Johnson (36); Richard Nixon (39)
  Gerald Ford (35); James Carter (38)
  Ronald Reagan (40); George H.W. Bush (41)
  William Clinton (42); George W. Bush (43)
  Barack Obama (44)
Model can predict individual performance
  [Figure: scatter plot of inferred noise level for each individual (x-axis) against distance to ground truth (y-axis), one point per individual; R = 0.941]
Extension of Estes (1972) Perturbation Model
  Main idea: item order is perturbed locally
  Our extension: perturbation noise varies between individuals and items
  [Figure: true order A B C D E and a recalled order in which items have shifted locally]
Strong wisdom of crowds effect
  [Figure: distance to the true order for individuals, the Thurstonian model, and the perturbation model]
  Perturbation model’s ordering is better than best individual
Inferred Perturbation Matrix and Item Accuracy
  [Figure: inferred perturbation matrix with output position on the x-axis and true position on the y-axis, plus an item accuracy panel; Abraham Lincoln, Richard Nixon, and James Carter are labeled]
  1. George Washington (1); 2. John Adams (2)
  3. Thomas Jefferson (3); 4. James Madison (4); 5. James Monroe (6)
  6. John Quincy Adams (5); 7. Andrew Jackson (7)
  8. Martin Van Buren (8); 9. William Henry Harrison (21)
  10. John Tyler (11); 11. James Knox Polk (16)
  12. Zachary Taylor (18); 13. Millard Fillmore (9)
  14. Franklin Pierce (20); 15. James Buchanan (13); 16. Abraham Lincoln (15); 17. Andrew Johnson (10); 18. Ulysses S. Grant (17)
  19. Rutherford B. Hayes (19); 20. James Garfield (22); 21. Chester Arthur (14)
  22. Grover Cleveland 1 (23); 23. Benjamin Harrison (12)
  24. Grover Cleveland 2 (25); 25. William McKinley (24)
  26. Theodore Roosevelt (28); 27. William Howard Taft (26)
  28. Woodrow Wilson (30); 29. Warren Harding (27); 30. Calvin Coolidge (29); 31. Herbert Hoover (31)
  32. Franklin D. Roosevelt (32); 33. Harry S. Truman (33)
  34. Dwight Eisenhower (34); 35. John F. Kennedy (35)
  36. Lyndon B. Johnson (36); 37. Richard Nixon (38)
  38. Gerald Ford (37); 39. James Carter (39)
  40. Ronald Reagan (40); 41. George H.W. Bush (41)
  42. William Clinton (42); 43. George W. Bush (43)
  44. Barack Obama (44)
Alternative Heuristic Models
  Many heuristic methods from voting theory
  E.g., Borda count method
    Suppose we have 10 items
    assign a count of 10 to first item, 9 for second item, etc.
    add counts over individuals
    order items by the Borda count
    i.e., rank by average rank across people
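The Borda count steps listed on this slide can be sketched directly. The example applies them to the five four-item responses from the earlier aggregation slide (D A B C, A B D C, B A D C, A C B D, A D B C). The function name `borda_order` is illustrative, and ties are broken by the item order of the first response, which is an assumption made here.

```python
def borda_order(orderings):
    """Order items by total Borda count across individuals."""
    n = len(orderings[0])
    scores = {item: 0 for item in orderings[0]}
    for ordering in orderings:
        for position, item in enumerate(ordering):
            scores[item] += n - position  # first place gets n, last gets 1
    # sorted() is stable, so tied items keep the order of the first response
    return sorted(scores, key=lambda item: -scores[item])


responses = [list(r) for r in ["DABC", "ABDC", "BADC", "ACBD", "ADBC"]]
print(borda_order(responses))  # prints ['A', 'B', 'D', 'C']
```

Every vote counts equally here, which is the property the Bayesian models in this talk relax.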
Model Comparison
  [Figure: distance to the true order for individuals, the Thurstonian model, the perturbation model, and Borda count]
Experiment 2
  78 participants
  17 problems, each with 10 items
    Chronological events
    Physical measures
    Purely ordinal problems, e.g., Ten Amendments, Ten Commandments
Example results
  Panel labels: Perturbation Model, Thurstonian Model
  1. Oregon (1); 2. Utah (2); 3. Nebraska (3); 4. Iowa (4); 5. Alabama (6); 6. Ohio (5); 7. Virginia (7); 8. Delaware (8); 9. Connecticut (9); 10. Maine (10)
  1. Freedom of speech & relig... (1); 2. Right to bear arms (2); 3. No quartering of soldiers... (3); 4. No unreasonable searches (4); 5. Due process (5); 6. Trial by Jury (6); 7. Civil Trial by Jury (7); 8. No cruel punishment (8); 9. Right to non-specified ri... (10); 10. Power for the States & Pe... (9)
Average results over 17 Problems
  [Figure: mean distance to ground truth for individuals (1 to 78, ordered), the Thurstonian model, the perturbation model, and Borda count]
  Strong wisdom of crowds effect across problems
Predicting problem difficulty
  [Figure: scatter plot over the 17 problems, with dispersion (std) of noise levels across individuals on the x-axis and distance of group answer to ground truth on the y-axis; R = -0.752; labeled problems include ordering states geographically and city size rankings]
Effect of Group Composition
  How many individuals do we need to average over?
Effect of Group Size: random groups
  [Figure: performance against group size (0 to 80) for randomly formed groups; legend entries T=0, T=2, T=12]
Experts vs. Crowds
  Can we find experts in the crowd?
  Can we form small groups of experts?
  Approach
    Form a group for some particular task
    Select individuals with the smallest sigma (“experts”) based on previous tasks
    Vary the number of previous tasks
Group Composition based on prior performance
  [Figure: performance against group size (best individuals first, 0 to 80), with separate curves by number of previous tasks; curve labels T = 0, T = 2, T = 8; legend entries T=0, T=2, T=12]
Methods for Selecting Experts
  Endogenous: no feedback required
  Exogenous: selecting people based on actual performance
  [Figure: one panel per selection method]
Aggregating Episodic Memories
  Study this sequence of images
  Place the images in correct sequence (serial recall)
  [Figure: images labeled A through J]
Average results across 6 problems
  [Figure: mean distance to ground truth for individuals, the Thurstonian model, the perturbation model, and Borda count]
Example calibration result for individuals
  (pizza sequence; perturbation model)
  [Figure: scatter plot of inferred noise level (x-axis) against distance to ground truth (y-axis), one point per individual; R = 0.920]
Predictive Rankings: fantasy football
  South Australian Football League (32 people rank 9 teams)
  Australian Football League (29 people rank 16 teams)
  [Figure: one panel per league comparing individuals with the Thurstonian model, the perturbation model, and Borda count]
Part II: Matching Problems
Study these combinations
  [Figure: items 1 to 5 paired with items A to E]
Find all matching pairs
Experiment
  15 subjects
  8 problems
    4 problems with 5 items
    4 problems with 10 items
Mean accuracy across 8 problems
  [Figure: mean accuracy (0 to 1) for each of the 15 individuals]
Bayesian Matching Model
  Proposed process
    match “known” items
    guess between remaining ones
  Individual differences
    some items easier to know
    some participants know more
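Graphical Model
  Latent ground truth: $z$
  Observed matching: $y_j$
  Knowledge state: $x_{ij} \sim \mathrm{Bernoulli}(s_{ij})$
  Prob. of knowing: $\mathrm{logit}(s_{ij}) = d_i + a_j$, with $a_j$ = person ability and $d_i$ = item easiness
  $p(y_{ij} \mid z, x_{ij}) = 1$ if $x_{ij} = 1$, and $1/n!$ if $x_{ij} = 0$
  (plates over i items and j individuals)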
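The proposed process (match the known items, guess among the rest) can be simulated as a sketch. It assumes that the probability of knowing an item is the inverse logit of item easiness plus person ability; the additive form is a reading of the slide, and the function name and parameter values are illustrative.

```python
import math
import random


def simulate_matching(truth, item_easiness, ability, rng):
    """Return (known flags, response) for one simulated person."""
    known = []
    for d in item_easiness:
        p_know = 1.0 / (1.0 + math.exp(-(d + ability)))  # inverse logit
        known.append(rng.random() < p_know)
    response = list(truth)  # known items keep their correct match
    unknown = [i for i, k in enumerate(known) if not k]
    guesses = [truth[i] for i in unknown]
    rng.shuffle(guesses)  # random assignment among the unknown items only
    for i, guess in zip(unknown, guesses):
        response[i] = guess
    return known, response


rng = random.Random(0)
truth = list("ABCDE")            # correct response for items 1 to 5
easiness = [1.0, 0.5, 0.0, -0.5, -1.0]
for ability in (-1.0, 0.0, 2.0):
    known, response = simulate_matching(truth, easiness, ability, rng)
    print(ability, sum(known), response)
```

Because guesses are drawn only from the responses left over after the known items are placed, every simulated response is still a valid one-to-one matching.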
Modeling results across 8 problems
  [Figure: mean accuracy (0 to 1) for the 15 individuals, the Bayesian Matching model, and the Hungarian Algorithm]
Calibration at level of items and people
  [Figure: scatter plots of inferred against actual values for each of the 8 problems, with D in the ITEMS panels and A in the INDIVIDUALS panels; both axes run from 0 to 1]
  Correlation R between inferred and actual values:
    Problem (number of items)     ITEMS (D)    INDIVIDUALS (A)
    Clothing and faces (5)        0.318        0.955
    Clothing and faces (10)       0.722        0.994
    Animals and houses (5)        0.433        0.962
    Animals and houses (10)       0.854        0.971
    Weapons and faces (5)         0.969        0.943
    Weapons and faces (10)        0.893        0.957
    Sport and faces (5)           0.223        0.953
    Sport and faces (10)          0.898        0.984
  (for weapons and faces 10 items problem)
Varying number of individuals
  [Figure: mean accuracy against number of individuals (0 to 15) for Bayesian Matching and the Hungarian Algorithm]
How predictive are subject-provided confidence ratings?
  [Figure: two panels of accuracy (0 to 1) against number of guesses, in bins 0, 1-2, 3-4, 5+]
  # guesses estimated by individual: r = -.50
  # guesses estimated by model (based on variable A): r = -.81
Another matching problem
Dutch
Danish
Yiddish
Thai
Vietnamese
Chinese
Georgian
Russian
Japanese
A
B
C
D
E
F
G
H
I
godt nytår
gelukkig nieuwjaar
a gut yohr
С Новым Годом
สวัสดีปีใหม่
Chúc Mừng Nǎm Mới
გილოცავთ ახალწელს
Modeling Results – Declarative Tasks
  [Figure: mean accuracy (0 to 1) for individuals, the Bayesian Matching model, and the Hungarian Algorithm]
Part III: Traveling Salesman Problems
  [Figure: problem B30-21, 30 numbered cities]
Find the shortest route between cities
  [Figure: problem B30-21, four panels: the optimal tour and the tours of Individual 5, Individual 83, and Individual 60]
  Dataset: Vickers, Bovet, Lee, & Hughes (2003)
    83 participants
    7 problems of 30 cities
TSP Aggregation Problem
  Propose a good solution based on all individual solutions
  Task constraints
    Data consists of city order only
    No access to city locations
Approach
  Find tours with edges for which many individuals agree
  Calculate agreement matrix A
    A = n × n matrix, where n is the number of cities
    $a_{ij}$ indicates the number of participants that connect cities i and j
  Find tour that maximizes $\sum_{(i,j) \in \text{tour}} a_{ij}^{c}$
    (this itself is a non-Euclidean TSP problem)
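A small sketch of this approach: build the agreement matrix from individual tours, given as city orders only, and then search for the tour with the highest total agreement. Two simplifications are assumed here. A tour is scored by the plain sum of agreement counts along its edges, and the search is brute force over all tours, which is only feasible for a handful of cities and stands in for a proper TSP solver. The function names and the three example tours are illustrative.

```python
from itertools import permutations


def agreement_matrix(tours, n):
    """a[i][j] = number of tours that connect cities i and j directly."""
    a = [[0] * n for _ in range(n)]
    for tour in tours:
        for k in range(n):
            i, j = tour[k], tour[(k + 1) % n]  # tours are closed loops
            a[i][j] += 1
            a[j][i] += 1
    return a


def best_agreement_tour(a):
    """Brute-force the closed tour with the largest summed agreement."""
    n = len(a)
    best, best_score = None, -1
    for rest in permutations(range(1, n)):  # fix city 0 as the start
        tour = (0,) + rest
        score = sum(a[tour[k]][tour[(k + 1) % n]] for k in range(n))
        if score > best_score:
            best, best_score = tour, score
    return list(best), best_score


tours = [[0, 1, 2, 3, 4], [0, 1, 2, 3, 4], [0, 2, 1, 3, 4]]
a = agreement_matrix(tours, 5)
print(best_agreement_tour(a))  # prints ([0, 1, 2, 3, 4], 13)
```

Note that the code never looks at city coordinates, in line with the task constraint that only city order is available.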
Line thickness = agreement
  [Figure: problem B30-21 with edges between cities drawn at a thickness proportional to agreement]
Blue = Aggregate Tour
  [Figure: problem B30-21 with the aggregate tour drawn in blue]
Results averaged across 7 problems
  [Figure: percent over optimal (0 to 18), with the aggregate solution marked]
Part IV: Summary & Conclusions
When do we get wisdom of crowds effect?
  Independent errors
    different people knowing different things
  Some minimal number of individuals
    10-20 individuals often sufficient
What are methods for finding experts?
  1) Self-reported expertise
    unreliable
    has led to claims of “myth of expertise”
  2) Based on explicit scores
    by comparing to ground truth
    but ground truth might not be immediately available
  3) Endogenously discover experts
    Use the crowd to discover experts
    Small groups of experts can be effective
What to do about systematic biases?
  In some tasks, individuals systematically distort the ground truth
    spatial and temporal distortions
    memory distortions (e.g. false memory)
    decision-making distortions
  Does this diminish the wisdom of crowds effect?
    maybe… but a model that predicts these systematic distortions might be able to “undo” them
Conclusion
  Effective aggregation of human judgments requires cognitive models
  Psychology and cognitive science can inform aggregation models
Online Experiments
  Experiment 1 (Prior knowledge): http://madlab.ss.uci.edu/dem2/examples/
  Experiment 2a (Serial Recall), study sequence of still images: http://madlab.ss.uci.edu/memslides/
  Experiment 2b (Serial Recall), study video: http://madlab.ss.uci.edu/dem/
MDS solution of pairwise tau distances
  Problem: states west-east
  [Figure: two-dimensional MDS plot showing individuals, the truth, and the Thurstonian model; each point is labeled with its distance to truth]
MDS solution of pairwise tau distances
  Problem: ten commandments
  [Figure: two-dimensional MDS plot showing individuals, the truth, and the Thurstonian model; each point is labeled with its distance to truth]
Hierarchical Bayesian Models
  Generative models
    ordering information
    cognitively plausible
    individual differences
  Group response = probability distribution over all permutations of N items
    With N = 44 items, we have 44! > 10^53 combinations
    Approximate inference methods: MCMC
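The size of the permutation space quoted on this slide can be checked with exact integer arithmetic. This is only a sanity check on the count; the variable name is illustrative.

```python
import math

n_orderings = math.factorial(44)   # number of orderings of 44 presidents
print(n_orderings > 10**53)        # the bound quoted on the slide
print(len(str(n_orderings)))       # number of decimal digits in 44!
```

A space of this size rules out enumerating every ordering, which is why the slide turns to approximate inference with MCMC.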
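Model incorporating overall person ability
  Latent ground truth for task m: $\mu_m$
  Mental representation: $x_{ijm} \mid \mu_{im}, \sigma_{jm} \sim N(\mu_{im}, \sigma_{jm})$
  Observed ordering: $y_{jm} = \mathrm{rank}(x_{jm})$
  Task specific ability: $\sigma_{jm}$, Gamma distributed with parameters set by person j's overall ability
  Overall ability: one Gamma-distributed parameter per individual j
  (plates over j individuals and m tasks)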
Average results over 17 Problems
  [Figure: mean distance to ground truth for individuals (1 to 78, ordered), Thurstonian Model v1, Thurstonian Model v2 (new model), Borda count, and Mode]
Thurstonian Model – stereotyped event sequences
  Bus (Recall), R = 0.890
    event1 (1); event2 (2); event3 (3); event4 (4); event5 (5); event6 (7); event7 (6); event8 (8); event9 (9); event10 (10)
  Morning (Recall), R = 0.982
    event1 (1); event2 (2); event3 (3); event4 (4); event5 (5); event6 (6); event7 (7); event8 (8); event9 (9); event10 (10)
  Wedding (Recall), R = 0.973
    event1 (1); event2 (2); event3 (3); event4 (4); event5 (5); event6 (6); event7 (7); event8 (8); event9 (9); event10 (10)
Thurstonian Model – “random” videos
  Yogurt (Recall), R = 0.908
    event1 (1); event2 (2); event3 (3); event4 (5); event5 (7); event6 (6); event7 (4); event8 (8); event9 (9); event10 (10)
  Pizza (Recall), R = 0.851
    event1 (1); event2 (3); event3 (4); event4 (5); event5 (2); event6 (6); event7 (7); event8 (9); event9 (10); event10 (8)
  Clay (Recall), R = 0.928
    event1 (1); event2 (2); event3 (3); event4 (4); event5 (6); event6 (5); event7 (7); event8 (8); event9 (9); event10 (10)
Heuristic Aggregation Approach
  Combinatorial optimization problem
    maximizes agreement in assigning N items to N responses
  Hungarian algorithm
    construct a count matrix M
    M_ij = number of people that paired item i with response j
    find row and column permutations to maximize diagonal sum
    O(n^3)
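The objective on this slide can be illustrated with a small sketch that builds the count matrix and picks the one-to-one assignment with the largest total count. To stay dependency-free, the search below is brute force over all permutations rather than the Hungarian algorithm itself; it finds the same maximum but is only practical for small N. The function names and the toy responses are illustrative.

```python
from itertools import permutations


def count_matrix(responses, n):
    """m[i][j] = number of people that paired item i with response j."""
    m = [[0] * n for _ in range(n)]
    for response in responses:
        for item, choice in enumerate(response):
            m[item][choice] += 1
    return m


def best_matching(m):
    """Return the assignment (item i -> p[i]) with the largest summed count."""
    n = len(m)
    return max(permutations(range(n)),
               key=lambda p: sum(m[i][p[i]] for i in range(n)))


# Four people match three items; response[i] is the choice for item i.
responses = [[0, 1, 2], [0, 2, 1], [1, 0, 2], [0, 1, 2]]
m = count_matrix(responses, 3)
print(m)
print(best_matching(m))  # prints (0, 1, 2)
```

For problems with 10 or 15 items, a polynomial-time assignment solver such as the Hungarian algorithm named on the slide is the practical choice.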
Hungarian Algorithm Example
  Count matrix M: rows are greetings (items), columns are languages (responses)
  Slide legend: assigned pairs are marked as correct or incorrect
  Columns: Du = Dutch, Da = Danish, Fr = French, Ja = Japanese, Sp = Spanish, Ar = Arabic, Ch = Chinese, Ge = German, It = Italian, Ru = Russian, Th = Thai, Vi = Vietnamese, We = Welsh, Gg = Georgian, Yi = Yiddish

  Greeting               Du  Da  Fr  Ja  Sp  Ar  Ch  Ge  It  Ru  Th  Vi  We  Gg  Yi
  gelukkig Nieuwjaar      7   3   0   0   0   1   0   0   0   0   0   0   2   0   2
  godt nytår              2   3   0   0   0   0   0   2   0   2   0   0   1   3   2
  bonne année             0   0  14   0   1   0   0   0   0   0   0   0   0   0   0
  [Japanese greeting]     0   0   0   9   0   0   2   0   1   0   3   0   0   0   0
  feliz año nuevo         0   0   0   0  14   0   0   0   0   0   1   0   0   0   0
  عامسعيد                 0   1   0   0   0  14   0   0   0   0   0   0   0   0   0
  [Chinese greeting]      0   0   0   2   0   0  12   0   0   0   0   1   0   0   0
  ein gutes neues Jahr    3   1   0   0   0   0   0   9   0   0   0   0   1   0   1
  felice anno nuovo       0   0   0   0   0   0   0   0  14   1   0   0   0   0   0
  С Новым Годом           0   0   1   0   0   0   0   0   0  11   0   0   1   2   0
  สวัสดีปีใหม่              0   0   0   1   0   0   1   0   0   0   7   1   1   4   0
  Chúc Mừng Nǎm Mới       0   0   0   0   0   0   0   0   0   1   0  11   1   2   0
  Blwyddyn Newydd Dda     0   4   0   1   0   0   0   0   0   0   1   0   6   1   2
  გილოცავთ ახალ წელს      0   0   0   2   0   0   0   1   0   0   3   2   0   1   6
  a gut yohr              3   3   0   0   0   0   0   3   0   0   0   0   2   2   2