SFI 6 6 - 07
The Collective Intelligence of Diverse Agents:
Micro Foundations of Uncertainty
Lu Hong & Scott E. Page
Outline
• Aside on Theoretical Foundations
• The Wisdom of Crowds
• Standard Models
• Interpretation Framework
• Mathematical Results
• Diversity, Democracy, and Markets
Methodological Tradeoff
Logical |____________________________________________| Informal
Mathematical                                   Appreciative
Brittle |____________________________________________| Flexible
Mathematical                                   Appreciative
Agent Based Models
Logical |_____ABM___________________________________| Informal
Mathematical                                   Appreciative
Brittle |____________________________________ABM____| Flexible
Mathematical                                   Appreciative
Model Benchmarking
Real World
Math ABM
Model Validation
Real World
Math ABM
Methodological Translation
Real World
Math ABM
Models of Collective Wisdom
Von Hayek
...it is largely because civilization enables us constantly to profit from knowledge which we individually do not possess and because each individual's use of his particular knowledge may serve to assist others unknown to him in achieving their ends that men as members of civilized society can pursue their individual ends so much more successfully than they could alone.
Aristotle
“For each individual among the many has a share of excellence and practical wisdom, and when they meet together, just as they become in a manner one man, who has many feet, and hands, and senses, so too with regard to their character and thought.”
Aristotle
“Hence, the many are better judges than a single man of music and poetry, for some understand one part and some another, and among them they understand the whole.”
Politics, Book 3, Chapter 11
The Wisdom of Crowds: Galton’s Steer
1906 Fat Stock and Poultry Exhibition, 787 people guessed the weight of a steer. Their average guess: 1,197 lbs.
Actual Weight: 1,198 lbs.
Who Wants to Be a Millionaire
Experts or Crowds?
Experts: Correct 2/3 of the time
Audience: Correct over 90% of the time
Three Mathematical Models
Model 1: Known Information
Best Selling Cereal of All Time
a) Corn Flakes
b) Rice Krispies
c) Cheerios
d) Frosted Flakes
Answer:
c) Cheerios
How Errors Cancel
Consider a crowd of 100 people
10% Know the correct answer
10% Narrowed down to two answers
36% Narrowed down to three answers
44% No clue
# Votes for Correct Answer
10: 10% Know the correct answer
 5: 10% Narrowed down to two answers
12: 36% Narrowed down to three answers
11: 44% No clue
38 TOTAL
Why The Crowd’s Correct
The correct answer gets 38 votes. Assume that the other 62 votes are
spread across the other three. Each of those three receives around 20 votes.
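The vote count can be reproduced in a few lines. This is a minimal sketch, assuming each person guesses uniformly at random among the answers they have not ruled out; the names `groups` and `expected_correct_votes` are illustrative and not from the slides.

```python
from fractions import Fraction

# (number of people, number of answers each person is still choosing among)
groups = [(10, 1), (10, 2), (36, 3), (44, 4)]

def expected_correct_votes(groups):
    """Expected votes for the correct answer if everyone guesses uniformly
    among the answers they have not ruled out (an assumption)."""
    return sum(Fraction(n, k) for n, k in groups)

correct = expected_correct_votes(groups)
crowd = sum(n for n, _ in groups)
# Remaining votes spread evenly over the three wrong answers.
others = (crowd - correct) / 3

print(correct)        # expected votes for the correct answer
print(float(others))  # expected votes for each wrong answer
```

Under this assumption the correct answer expects 38 votes and each wrong answer a little under 21, which is the gap the slide relies on.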
N.B.
The crowd can be correct with very high probability even if no one in the crowd knows the correct answer.
The Math
40: Cheerios or Corn Flakes
30: Cheerios or Frosted Flakes
30: Cheerios or Rice Krispies
Cheerios gets 50 votes!
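A small simulation illustrates how often such a crowd gets it right even though nobody knows the answer. This is a sketch under the assumption that each person flips a fair coin between their two candidate cereals; `GROUPS`, `simulate_vote`, and `plurality_win_rate` are hypothetical names.

```python
import random

# Each group has narrowed the answer to two cereals; Cheerios is always one of them.
GROUPS = [
    (40, ("Cheerios", "Corn Flakes")),
    (30, ("Cheerios", "Frosted Flakes")),
    (30, ("Cheerios", "Rice Krispies")),
]

def simulate_vote(rng):
    """Return the vote tally for one crowd of 100 coin-flipping voters."""
    tally = {"Cheerios": 0, "Corn Flakes": 0, "Frosted Flakes": 0, "Rice Krispies": 0}
    for size, options in GROUPS:
        for _ in range(size):
            tally[rng.choice(options)] += 1
    return tally

def plurality_win_rate(trials=2000, seed=0):
    """Fraction of simulated crowds in which Cheerios gets strictly the most votes."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(trials):
        tally = simulate_vote(rng)
        cheerios = tally.pop("Cheerios")
        if cheerios > max(tally.values()):
            wins += 1
    return wins / trials

print(plurality_win_rate())
```

Cheerios expects 20 + 15 + 15 = 50 votes while no rival expects more than 20, so the simulated win rate should sit very close to one.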
Model 2: Correlated Signal
Suppose that we’re trying to discover whether or not a truck full of sour cream has gone bad due to a faulty refrigerator.
True State: G (good) or B (bad)
Signals
Suppose that we can test pints of sour cream and get signals (g and b) and that with probability 3/4, these signals are correct.
If the sour cream is bad, 3/4 of the time we’ll get the signal b.
Three People, True State: B
Correct Outcomes
P1  P2  P3  Probability
b   b   b   (3/4)(3/4)(3/4) = 27/64
b   b   g   (3/4)(3/4)(1/4) = 9/64
b   g   b   (3/4)(1/4)(3/4) = 9/64
g   b   b   (1/4)(3/4)(3/4) = 9/64
Total = 54/64
Three People, True State: B
Incorrect Outcomes
P1  P2  P3  Probability
g   g   g   (1/4)(1/4)(1/4) = 1/64
b   g   g   (3/4)(1/4)(1/4) = 3/64
g   b   g   (1/4)(3/4)(1/4) = 3/64
g   g   b   (1/4)(1/4)(3/4) = 3/64
Total = 10/64
General Model
With probability p > 0.5, each person gets the correct signal. Therefore, if N people get independent signals, on average pN of them get the correct signal.
As N gets large, the probability that the majority vote is correct goes to one.
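The three-person table and the large-N limit can both be checked with the binomial formula. This is a sketch assuming independent signals and an odd number of voters (so there are no ties); `majority_correct` is an illustrative name.

```python
from fractions import Fraction
from math import comb

def majority_correct(n, p):
    """Probability that a strict majority of n independent signals is correct,
    when each signal is correct with probability p (n odd avoids ties)."""
    return sum(
        comb(n, k) * p**k * (1 - p)**(n - k)
        for k in range(n // 2 + 1, n + 1)
    )

p = Fraction(3, 4)
print(majority_correct(3, p))  # matches the 54/64 total for three people
for n in (1, 3, 11, 51, 101):
    print(n, float(majority_correct(n, p)))
```

With p = 3/4 the printed probabilities rise toward one as n grows, which is the limiting claim on the slide.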
Model 3: Averaging of Noise
Suppose that we want to predict the luminosity of a star. Each of 100 people stationed around the globe takes out a light meter and takes a reading.
Call the reading for person k, r(k)
Noise/Interference
The signal that a person gets equals the true luminosity, L, plus or minus an error term, due to ambient light, humidity or who knows what.
r(k) = L + e(k)
e(k) is the error term
Noises Off
The average of the signals equals L plus the average of the error terms:
[r(1) + r(2) + ... + r(N)]/N = L + [e(1) + e(2) + ... + e(N)]/N
If the error terms average to zero, they cancel, and the prediction is accurate.
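The cancellation can be seen in a quick simulation. This is a sketch assuming independent, zero-mean Gaussian errors; the distribution, the value of L, and the name `simulate_readings` are assumptions made only for illustration.

```python
import random
import statistics

def simulate_readings(true_luminosity, n=100, noise_sd=1.0, seed=0):
    """Return n noisy readings r(k) = L + e(k) with independent, zero-mean
    Gaussian errors (the distribution is an assumption, not from the slides)."""
    rng = random.Random(seed)
    return [true_luminosity + rng.gauss(0.0, noise_sd) for _ in range(n)]

L = 50.0  # hypothetical true luminosity
readings = simulate_readings(L)

# Error of the averaged reading versus the typical error of a single reading.
crowd_error = abs(statistics.fmean(readings) - L)
typical_individual_error = statistics.fmean(abs(r - L) for r in readings)

print(crowd_error, typical_individual_error)
```

The averaged reading is typically much closer to L than a single reading is, but only because the simulated errors are independent and centered on zero, which is exactly what the next slide questions.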
Important Questions
Why should we assume that these error terms are, on average, equal to zero?
Why should we assume the signals are independent?
Is this how an ABM would capture collective wisdom?
Markets and Democracy
Model 1: Some people know the answer
Model 2: People get signals that are probabilistically correct
Model 3: People see the true state plus an error
NONE do.
Signal
[Diagram: Outcome → Signal, with noise added]
Generated Signals
• True state of the world: x
• Signal: s
• Joint probability distribution: f(s,x)
• Conditional probability distribution: f(s|x)
Generated Signals
True state generates something that is correlated with the state’s value
- luminosity of stars: s = L + e
- quality of a product {good, bad}: s = true quality with probability p
A Generated Signal
A chef of unknown quality produces batches of risotto. Each batch is a signal of the chef’s quality. The batches are separated enough in time to be considered independent revelations of quality.
Predictive Model: Lu Hong
[Diagram: Attributes → model → Prediction]
Interpretations
Reality consists of many variables or attributes. People cannot include them all.
Therefore, we consider only some attributes or lump things together into categories. (Reed 1972, Rosch 1978)
“Lump to Live”
If we did not lump various experiences, situations, and events into categories, we could not draw inferences, make generalities, or construct mental models.
Predictive Models
Edwards is a liberal; therefore he’ll raise taxes.
The stock’s price earnings ratio is high; therefore, the stock is a bad investment.
How Do We Predict?
We parse the world into categories and make predictions based on those interpretations.
Interpretations
Victorian Novel
Modern Architecture
Price Earnings Ratio
Modern Art
SKA
Predictive Models
I love SKA music!!
Model Interpreted Signals
• Situations/objects in the world have many attributes (x1, x2, x3, ..., xn)
• Outcome function maps situations to outcomes/states, F: X → S
• Agents have predictive models based on subsets of attributes.
People
We differ in how we categorize.
Thus, we differ in our predictions.
Pile Sort
Place the following food items in piles with at least two items per pile:
Broccoli        Canned Ham    Carrots
Fresh Salmon    Bananas       Apples
Spam            Ahi Tuna      NY Strip Steak
Rib Roast       Sea Bass      Canned Salmon
BOBO Sort
Veggies: Broccoli, Carrots, Arugula, Fennel
Fish & Meat: Fresh Salmon, Ahi Tuna, Niman Pork, Sea Bass
Canned Stuff: Canned Salmon, Spam, Canned Beets, Canned Posole
Airstream Sort
Veggies: Broccoli, Canned Beets, Carrots
Fish & Meat: Fresh Salmon, Canned Salmon, Spam, Niman Pork, Sea Bass
Weird Stuff: Ahi Tuna, Arugula, Fennel, Canned Posole
Agents
Differ in location in space or on network
Differ in type
Therefore, differ in pieces of information that they use
An Example
What follows is an example in which a crowd of three people makes a collective prediction.
Reality
[Grid: Charisma (H, MH, ML, L) across the top and Experience (H, MH, ML, L) down the side; each cell is marked G (good) or B (bad).]
Experience Interpretation
[Grid: Experience (H, MH, ML, L) down the side; predictions based on experience alone, shown against the G/B outcomes.]
75% Correct
Charisma Interpretation
[Grid: Charisma (H, MH, ML, L) across the top; predictions based on charisma alone, shown against the G/B outcomes.]
75% Correct
Balanced Interpretation
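[Grid: Charisma (H, MH, ML, L) across the top and Experience (H, MH, ML, L) down the side; predictions of the balanced interpretation, shown against the G/B outcomes.]
Extreme on one measure. Moderate on the other.
75% Correct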
Voting Outcome
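[Grid: Charisma (H, MH, ML, L) across the top and Experience (H, MH, ML, L) down the side; each cell lists the three predictions as a triple.]
GGB GGG GBG GGG GGB GBG BGG BGG BGB GBB BBG BBB BBB BBG BBG BGB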
Reality
[Grid: the true outcome, G or B, in each Charisma × Experience cell.]
Row and Column Correct
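[Grid: only the cells in which both the row and the column prediction are correct, each showing its triple of predictions.]
GGB GGG GGG GGB BBG BBB BBB BBG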
Row and Column Split
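[Grid: only the cells in which the row and the column prediction split, each showing its triple of predictions.]
GBG GBG BGG BGG BGB GBB BBG BGB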
Key Idea
Think of these predictions as signals. To differentiate them from our standard, generated signals, call them interpreted signals.
Independence of Interpreted Signals
Consider the interpreted signals based on charisma and on experience.
Each was correct with probability 0.75
Both Row and Column Correct
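[Grid: only the cells in which both the row and the column prediction are correct, each showing its triple of predictions.]
GGB GGG GGG GGB BBG BBB BBB BBG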
Negative Correlation
Probability Correct Prediction = 0.75
Probability Both Correct = 0.5
If Independent, Probability Both Correct = 0.75 × 0.75 ≈ 0.56
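The size of the negative correlation follows directly from these three numbers. This is a small sketch that treats "correct" as a 0/1 indicator for each signal; the variable names are illustrative.

```python
from fractions import Fraction

p_each = Fraction(3, 4)   # each interpreted signal is correct 75% of the time
p_both = Fraction(1, 2)   # both are correct in half of the cases

covariance = p_both - p_each * p_each
variance = p_each * (1 - p_each)   # variance of a 0/1 "correct" indicator
correlation = covariance / variance

print(covariance)    # -1/16
print(correlation)   # -1/3
```

The two signals are therefore correct together less often than independence would imply, with a correlation in correctness of -1/3.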
Conditional Independence?
• Probability each is correct conditional on the outcome G equals 0.75
• Probability both correct conditional on the outcome G equals 0.5
Correctness of the predictions is negatively correlated conditional on the outcome being good.
Binary Interpreted Signals
• Set of objects X, |X| = N
• Set of outcomes S = {G, B}
• Interpretation: I_j = {m_{j,1}, m_{j,2}, ..., m_{j,n_j}} is a partition of X
• P(m_{j,i}) = probability that m_{j,i} arises
Four Types of Independence
1. Independent Interpretations
2. Independent Interpreted Signals
3. Independently Correct Interpreted Signals
4. Conditionally Independent Interpreted Signals
Independent Interpretations
P(m_{j,i} and m_{k,l}) = P(m_{j,i}) P(m_{k,l})
The probability that j says “i” and k says “l” equals the product of the probability that j says “i” and the probability that k says “l”.
Why Independent Interpretations
We’re interested in independent interpretations because that’s the best people or agents could do in the binary setting. It’s the most diverse two predictions could be.
This captures a world in which agents or people look at distinct pieces of information.
Independent Interpretations
Claim: If two interpretations are independent, then X can be represented by a K-dimensional rectangle with the two interpretations looking at non-overlapping subsets of variables.
Independent Not Different
Independent interpretations must rely on the same fundamental representation and look at different parts of it. Thus, to say that two people have independent perspectives is to say that they look at the world the same way but look at different parts of the same representation.
Independent Interpreted Signals
Interpreted signal: s_j(m_{j,i}) is the prediction by j when the object lies in set m_{j,i} of interpretation I_j
Interpreted signals are independent iff s_j(m_{j,i}) and s_k(m_{k,l}) are independent random variables.
Claim: Independent interpretations imply independent interpreted signals
pf: if what we see is independent, what we predict has to be independent.
Claim: Independent interpreted signals need not imply independent interpretations.
pf: Outcomes {G1, G2, G3, B1, B2, B3}
Person 1: {G1, G2, B1: g} {G3, B2, B3: b}
Person 2: {G1, G2, G3, B3: g} {B1, B2: b}
Independent interpreted signals: P(g,b) = P(g,.)P(.,b)
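The independence condition in this example can be checked by enumeration. This is a sketch that assumes the six outcomes are equally likely, which the slide does not state; `prob` and `signals_independent` are illustrative helpers.

```python
from fractions import Fraction
from itertools import product

outcomes = ["G1", "G2", "G3", "B1", "B2", "B3"]  # assumed equally likely

# Each person's interpreted signal, keyed by outcome.
person1 = {"G1": "g", "G2": "g", "B1": "g", "G3": "b", "B2": "b", "B3": "b"}
person2 = {"G1": "g", "G2": "g", "G3": "g", "B3": "g", "B1": "b", "B2": "b"}

def prob(event):
    """Probability of the set of outcomes satisfying `event`."""
    return Fraction(sum(1 for o in outcomes if event(o)), len(outcomes))

def signals_independent(s1, s2):
    """Check P(a, b) = P(a, .) P(., b) for every pair of signal values."""
    return all(
        prob(lambda o: s1[o] == a and s2[o] == b)
        == prob(lambda o: s1[o] == a) * prob(lambda o: s2[o] == b)
        for a, b in product("gb", repeat=2)
    )

print(signals_independent(person1, person2))
```

For instance, person 1 says g and person 2 says b only at B1, which has probability 1/6 = (1/2)(1/3).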
Independently Correct Interpreted Signals
C(s_j(m_{j,i})) = 1 if the prediction is correct, 0 otherwise
Predictions are independently correct iff C(s_j(m_{j,i})) and C(s_k(m_{k,l})) are independent random variables.
Claim: Independent predictions need not be independently correct predictions.
Pf: recall our example. The predictions were independent but they were not independently correct.
A prediction is reasonable if it is correct at least half of the time.
A prediction is informative if it is correct more than half of the time.
Claim: Informative predictions need not be reasonable conditional on every state
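Each line is one category of the interpretation (three states), followed by the prediction for that category:
G G B   predict G
G G B   predict G
G G B   predict G
B B G   predict B
Overall, the prediction is correct 8/12 of the time, so it is informative.
Conditional on state B, the prediction is correct 2/5 of the time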
Claim: Independent, informative interpreted signals that predict good and bad outcomes with equal likelihood must be negatively correlated in their correctness.
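Proof
Each pair of signals occurs with probability 1/4. The probability of each outcome given the pair of signals:
               column says g      column says b
row says g     G = X, B = 1-X     G = Y, B = 1-Y
row says b     G = Z, B = 1-Z     G = W, B = 1-W
Prob row correct: (X+Y+2-(W+Z))/4
Prob column correct: (X+Z+2-(W+Y))/4
Prob both correct: (X+1-W)/4
(X+Y+2-(W+Z))(X+Z+2-(W+Y)) - 4(X+1-W) = (X-W)^2 - (Y-Z)^2 > 0
The left-hand side is 16 times [P(row correct) P(column correct) - P(both correct)]. It is positive because both signals are informative, which gives X - W > |Y - Z|.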
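The algebra in the proof can be spot-checked numerically. This is a sketch using exact fractions and randomly drawn values of X, Y, Z, W; the function name `gap` is illustrative.

```python
from fractions import Fraction
import random

def gap(X, Y, Z, W):
    """16 * [P(row correct) * P(column correct) - P(both correct)]."""
    row = X + Y + 2 - (W + Z)
    col = X + Z + 2 - (W + Y)
    return row * col - 4 * (X + 1 - W)

rng = random.Random(0)
for _ in range(1000):
    X, Y, Z, W = (Fraction(rng.randint(0, 100), 100) for _ in range(4))
    # The algebraic identity holds for any values.
    assert gap(X, Y, Z, W) == (X - W) ** 2 - (Y - Z) ** 2
    # Informative means correct more than half the time: row/4 > 1/2, col/4 > 1/2.
    row_informative = X + Y + 2 - (W + Z) > 2
    col_informative = X + Z + 2 - (W + Y) > 2
    if row_informative and col_informative:
        # Then the product of accuracies exceeds P(both correct).
        assert gap(X, Y, Z, W) > 0
print("identity and sign check passed")
```

A positive gap means the signals are correct together less often than the product of their accuracies, which is the negative correlation the claim asserts.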
Negative Result
We cannot assume independent signals and be consistent with independent interpretations.
What Does This All Mean?
The following assumptions, which are common in the literature, are inconsistent with independent interpreted signals:
States = {G,B} equally likely
Signals = {g,b} independent conditional on the state across agents.
However, much of the time, mathematical models do not assume unconditional independence, but independence conditional on the true outcome.
Negative Conditional Correlation
Claim: If interpreted signals are informative and independent, then they must be negatively correlated conditional on at least one outcome.
Independence conditional on the state is impossible
Positive Result
Claim: Independent, informative interpreted signals that predict good and bad outcomes with equal likelihood and are correct with probability p exhibit negative correlation equal to 1 - 1/(4(p - p^2))
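One way to see where this expression comes from is to reuse the X, Y, Z, W notation of the earlier proof. The following is a derivation sketch, assuming each of the four signal pairs has probability 1/4 and both signals have the same accuracy p.

```latex
% Equal accuracy p for row and column:
%   p = (X + Y + 2 - W - Z)/4 = (X + Z + 2 - W - Y)/4
%   => Y = Z  and  X - W = 4p - 2
\begin{align*}
P(\text{both correct}) &= \frac{X + 1 - W}{4} = \frac{4p - 1}{4} = p - \tfrac{1}{4},\\
\operatorname{Cov} &= \left(p - \tfrac{1}{4}\right) - p^{2},\\
\operatorname{Corr} &= \frac{p - p^{2} - \tfrac{1}{4}}{p(1 - p)}
                     = 1 - \frac{1}{4\,(p - p^{2})}.
\end{align*}
```

At p = 0.75 this gives P(both correct) = 0.5 and a correlation of -1/3, consistent with the charisma and experience example.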
Amazing Result
Claim: The complexity of the outcome function does not alter correlation other than through the accuracy of the interpreted signals
Resurrecting Independence
We can obtain independence if we relax the assumption that people use independent interpretations and if we make some incredibly heroic assumptions about the topology over states and how people construct categories.
Resurrecting Independence
K, r, m are positive integers, K > 1, 2r > m > r
A state consists of a binary value in {0,1} and a vector of K attributes (x1,...,xK); each xi takes a value from {1,...,m}; each state is equally likely
The outcome function F equals the binary value if an even number of xi’s have values greater than r, and one minus the binary value otherwise
Resurrecting Independence
Interpretation i considers every attribute except attribute xi
Interpreted signal si based on interpretation i equals the binary value if an even number of x attributes other than xi have values greater than r, and one minus the binary value otherwise
Claim: Any outcome function that produces conditionally independent interpreted signals is isomorphic to this example.
One Left Out
The only way to align conditionally independent signals with interpreted signals is to assume each person leaves out a different attribute.
This doesn’t make sense if seen from an incentive standpoint.
Diversity in Democracy & Markets
Diverse interpretations, meaning interpretations that use distinct attributes, create negatively correlated signals.
Collective Accuracy
If we take the collective prediction to be equal to the average of individuals’ predictions, then the following holds.
Collective Error = Average Error - Variance
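The identity can be verified on any set of numbers. This is a sketch assuming all three terms are measured as squared errors (the slide does not say how error is measured); the guesses and the true value below are hypothetical.

```python
from statistics import fmean

def decompose(predictions, truth):
    """Return (collective error, average individual error, prediction variance),
    all measured as squared errors."""
    crowd = fmean(predictions)
    collective_error = (crowd - truth) ** 2
    average_error = fmean((s - truth) ** 2 for s in predictions)
    variance = fmean((s - crowd) ** 2 for s in predictions)
    return collective_error, average_error, variance

# Hypothetical guesses, chosen only for illustration.
collective, average, variance = decompose([10, 14, 21], truth=16)
print(collective, average, variance)
```

Here the crowd's squared error equals the average individual squared error minus the variance of the guesses, so more diverse guesses lower the collective error for a given level of individual accuracy.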
Efficient Individual Signals
Suppose that agents evolve predictive models (interpreted signals) and that each new category has a cost. Then, there exist efficient (but not accurate) interpreted signals.
See Fryer and Jackson
Efficient Collective Signals
If we take the distribution of the accuracy of signals as given, then it is possible to determine which signals to include.
See N. Johnson
Evolved Interpreted Signals: Democracy
Suppose that we allow agents to evolve interpretations. Over time, the agents become more accurate, but the collection becomes less accurate due to the reduction in diversity (variance).
Kollman and Page
Evolved Interpreted Signals: Markets
Markets create incentives for people to look at different attributes. In an auction setting, it may be incentive compatible to look at distinct attributes, which provides micro foundations for both Aristotle and Hayek.
Small Groups
With endogenous information acquisition, members of small groups should be able to look at different attributes and do better than independence would predict.
Large Groups
Even with endogenous information acquisition and the incentives to think differently, people may not be able to generate enough encodings to avoid positive correlation.
Thus, those limiting results as N gets large may not hold.
Summary
• Conceptual Contribution
– Shown difference between ABM and mathematical models of signals
– Linked to psychology and shown how “diversity” might explain signals
– Shown chink in armor of independence assumption (maybe it’s too convenient)
Summary Contributions
– Collective Wisdom depends on either
• Smart people or
• Diversity
– Can expect large groups to find the best barbecue in NC (generated signals) but not to make the correct choice on a proxy vote
Summary
Extensions
– Explore complexity
• How does the mapping from attributes to outcomes affect signal correlation and accuracy for more than two people?
– Explore endogenous information