griffiths & tenenbaum - reconciling intuition and probability



Randomness and Coincidences: Reconciling Intuition and Probability Theory

Thomas L. Griffiths & Joshua B. Tenenbaum
Department of Psychology
Stanford University
Stanford, CA 94305-2130 USA
{gruffydd,jbt}@psych.stanford.edu

Abstract

We argue that the apparent inconsistency between people's intuitions about chance and the normative predictions of probability theory, as expressed in judgments about randomness and coincidences, can be resolved by focussing on the evidence observations provide about the processes that generated them, rather than their likelihood. This argument is supported by probabilistic modeling of sequence and number production, together with two experiments that examine people's judgments about coincidences.

People are notoriously inaccurate in their judgments about randomness, such as whether a sequence of heads and tails like HHTHTTTH is more random than the sequence HHHHHHHH. Intuitively, the former sequence seems more random, but both sequences are equally likely to be produced by a random generating process that chooses H or T with equal probability, such as a fair coin. This kind of question is often used to illustrate how our intuitions about chance deviate from the normative standards set by probability theory. Our intuitions about coincidental events, which seem to be defined by their apparent improbability, have faced similar criticism from statisticians (e.g., Diaconis & Mosteller, 1989).

The apparent inconsistency between our intuitions about chance and the formal structure of probability theory has provoked attention from philosophers and mathematicians, as well as psychologists. As a result, a number of definitions of randomness exist in both the mathematical (e.g., Chaitin, 2001; Kac, 1983; Li & Vitanyi, 1997) and the psychological (e.g., Falk, 1981; Lopes, 1982) literature. These definitions vary in how well they satisfy our intuitions, and can be hard to reconcile with probability theory. In this paper, we will argue that there is a natural relationship between people's intuitions about chance and the normative standards of probability theory. Traditional criticism of people's intuitions about chance has focused on the fact that people are poor estimators of the likelihood of events being produced by a particular generating process. The models we present turn this question around, asking how much more likely a set of events makes a particular generating process. This question is far more useful in everyday life, where it is often more important to reason diagnostically than predictively, attempting to infer the structure of our world from the data we observe.

Randomness

Reichenbach (1934/1949) is credited with having first suggested that mathematical novices will be unable to produce random sequences, instead showing a tendency to overestimate the frequency with which outcomes alternate. Subsequent research has provided support for this claim (reviewed in Bar-Hillel & Wagenaar, 1991; Tune, 1964; Wagenaar, 1972), with both sequences of numbers (e.g., Budescu, 1987; Rabinowitz, Dunlap, Grant, & Campione, 1989) and two-dimensional black and white grids (Falk, 1981). In producing binary sequences, people alternate with a probability of approximately 0.6, rather than the 0.5 that is seen in sequences produced by a random generating process. This preference for alternation results in subjectively random sequences containing fewer long runs (such as an uninterrupted series of heads in a set of coin flips) than might be expected by chance (Lopes, 1982).

Theories of subjective randomness

A number of theories have been proposed to account for the accuracy of Reichenbach's conjecture. These theories have included postulating that people develop a concept of randomness that differs from the true definition of the term (e.g., Budescu, 1987; Falk, 1981; Skinner, 1942), and that limited short-term memory might contribute to people's responses (Baddeley, 1966; Kareev, 1992; 1995; Wiegersma, 1982). Most recently, Falk and Konold (1997) suggested that the concept of randomness can be connected to the subjective complexity of a sequence, characterized by the difficulty of specifying a rule by which a sequence can be generated. This idea is related to a notion of complexity based on description length (Li & Vitanyi, 1997), and has been considered elsewhere in psychology (Chater, 1996).

The account of randomness that has had the strongest influence upon the wider literature of cognitive psychology is Kahneman and Tversky's (1972) suggestion that people may be attempting to produce sequences that are "representative" of the output of a random generating process. For sequences, this means that the number of elements of each type appearing in the sequence should correspond to the overall probability with which these elements occur. Random sequences should also maintain local representativeness, such that subsequences demonstrate the appropriate probabilities. This claim seems consistent with empirical results (e.g., Budescu, 1987).

Formalizing representativeness

A major challenge for a theory of randomness based upon representativeness is to express exactly what it means for an outcome to be representative of a random generating process. One interpretation of this statement is that the outcome provides evidence for having been produced by a random generating process. This interpretation has the advantage of submitting easily to formalization in the language of probability theory. In a companion paper (Tenenbaum & Griffiths, submitted), we present arguments and empirical support for this as a general formalization of representativeness. Here, we will summarize the theory and discuss in depth its implications for understanding people's intuitions about chance.

If we are considering two candidate processes by which an outcome could be generated, one random and one containing systematic regularities, the total evidence in favor of the random generating process can be assessed by the logarithm of the ratio of the probabilities of these processes,

$$\log \frac{P(\mathrm{random} \mid x)}{P(\mathrm{regular} \mid x)} \tag{1}$$

where $P(\mathrm{random} \mid x)$ and $P(\mathrm{regular} \mid x)$ are the probabilities of a random and a regular generating process respectively, given the outcome $x$.

This quantity can be computed using the odds-ratio form of Bayes' rule,

$$\frac{P(\mathrm{random} \mid x)}{P(\mathrm{regular} \mid x)} = \frac{P(x \mid \mathrm{random})}{P(x \mid \mathrm{regular})} \cdot \frac{P(\mathrm{random})}{P(\mathrm{regular})} \tag{2}$$

in which it is conventional to refer to the term on the left hand side of the equation as the posterior odds, and the first and second terms on the right hand side as the likelihood ratio and the prior odds respectively. Of the latter two terms, the specific outcome $x$ influences only the likelihood ratio. Thus the contribution of $x$ to the evidence in favor of a random generating process can be measured by the logarithm of the likelihood ratio,

$$\mathrm{random}(x) = \log \frac{P(x \mid \mathrm{random})}{P(x \mid \mathrm{regular})} \tag{3}$$

This method of assessing the weight of evidence for a particular hypothesis provided by an observation is often used in Bayesian statistics, where the likelihood ratio whose logarithm is given above is called a Bayes factor (Kass & Raftery, 1995). The log Bayes factor for a set of independent observations will be the sum of their individual log Bayes factors, and the expression has a clear information theoretic interpretation (Good, 1979). The above expression is also closely connected to the notion of minimum description length, connecting this approach to randomness with the ideas of Falk and Konold (1997) and Chater (1996).
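As a minimal sketch of equations (2) and (3), the following computes the log likelihood ratio for a coin-flip sequence and checks that it is additive over independent flips. The regular process used here (a coin with P(heads) = 0.9), the prior odds of 1, and the function name `log_likelihood_ratio` are illustrative assumptions, not values taken from the paper.

```python
from math import log

def log_likelihood_ratio(flips, p_heads_regular):
    """Equation (3): log P(x | random) - log P(x | regular) for a string of 'H'/'T'.

    'random' is a fair coin; 'regular' is an assumed biased coin.
    """
    p_random = 0.5 ** len(flips)
    p_regular = 1.0
    for flip in flips:
        p_regular *= p_heads_regular if flip == "H" else 1.0 - p_heads_regular
    return log(p_random / p_regular)

P_HEADS_REGULAR = 0.9  # hypothetical regular process, chosen for illustration
PRIOR_ODDS = 1.0       # assumed equal prior probability for both processes

sequence = "HHHHHHHH"
total = log_likelihood_ratio(sequence, P_HEADS_REGULAR)

# Additivity: the evidence from independent flips is the sum of per-flip terms.
per_flip_sum = sum(log_likelihood_ratio(f, P_HEADS_REGULAR) for f in sequence)

# Equation (2) in log form: posterior log odds = log likelihood ratio + log prior odds.
posterior_log_odds = total + log(PRIOR_ODDS)
```

Under these assumptions the all-heads sequence yields a negative value (evidence for the regular process), while a balanced sequence such as HHTHTTTH yields a positive one.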

Defining regularity

Evaluating the evidence that a particular outcome provides for a random generating process requires computing two probabilities: $P(x \mid \mathrm{random})$ and $P(x \mid \mathrm{regular})$. The first of these probabilities follows from the definition of the random generating process. For example, $P(\mathrm{HHTHTTTH} \mid \mathrm{random})$ is $(1/2)^8$, as it would be for any sequence of the same length. However, computing $P(x \mid \mathrm{regular})$ requires specifying the probability of the observed outcome resulting from a generating process that involves regularities. While this probability is hard to define, it is in general easy to compute $P(x \mid h_i)$, where $h_i$ might be some hypothesised regularity. In the case of sequences of heads and tails, for instance, $h_i$ might correspond to a particular probability of observing heads, $P(\mathrm{H}) = p$. In this case $P(\mathrm{HHTHTTTH} \mid h_i)$ is $p^4 (1-p)^4$. Using the calculus of probability, we can obtain $P(x \mid \mathrm{regular})$ by summing over the set of hypothesised regularities, $\mathcal{H}$,

$$P(x \mid \mathrm{regular}) = \sum_{h_i \in \mathcal{H}} P(x \mid h_i, \mathrm{regular}) \, P(h_i \mid \mathrm{regular}) = \sum_{h_i \in \mathcal{H}} P(x \mid h_i) \, P(h_i) \tag{4}$$

where $P(h_i)$ is a prior probability on $h_i$. In all applications discussed in this paper, we assume that $P(h_i)$ is uniform over all $h_i \in \mathcal{H}$.

Random sequences

For the case of binary sequences, such as those that might be produced by flipping a coin, possible regularities can be divided into two classes. One class assumes that flips are independent, and the regularities it contains are assertions about the value of $P(\mathrm{H})$. The second class includes regularities that make reference to properties of subsequences containing more than a single element, such as alternation, runs, and symmetries. Since this second class is less well defined, it is instructive to examine the account that can be obtained just by using the first class of regularities.

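Taking $\mathcal{H}$ to be all values of $p = P(\mathrm{H}) \in [0, 1]$, we have $P(H, T \mid \mathrm{random}) = (1/2)^{H+T}$ and $P(H, T \mid \mathrm{regular}) = \int_0^1 p^H (1-p)^T \, dp$, where $H, T$ denotes a sequence containing $H$ heads and $T$ tails. Completing the integral, it follows from (3) that, up to an additive term $\log(H + T + 1)$ that is constant for sequences of a given length,

$$\mathrm{random}(H, T) = \log \frac{\binom{H+T}{H}}{2^{H+T}} \tag{5}$$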

This result has a number of appealing properties. Firstly, it is maximized when $H = T$, which is consistent with Kahneman and Tversky's (1972) original description of the representativeness of random sequences. Secondly, the ratio involved essentially measures the size of the set of sequences sharing the same number of heads and tails. A sequence like HHHHHHHH is unique in its composition, whereas HHTHTTTH has a composition much more commonly obtained by flipping a coin eight times.
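A short sketch of equation (5) for sequences of eight flips is given below. It also evaluates the integral exactly, using the standard Beta-function identity $\int_0^1 p^H (1-p)^T \, dp = H! \, T! / (H+T+1)!$, which shows that the exact log likelihood ratio differs from (5) only by $\log(H+T+1)$, a constant for sequences of fixed length. The function names are illustrative, not from the paper.

```python
from math import comb, factorial, log

def random_score(heads, tails):
    """Equation (5): log of C(H+T, H) / 2^(H+T)."""
    n = heads + tails
    return log(comb(n, heads)) - n * log(2)

def random_score_exact(heads, tails):
    """Log likelihood ratio of equation (3) with the integral evaluated exactly."""
    n = heads + tails
    p_random = 0.5 ** n
    # Beta integral: H! T! / (H + T + 1)!
    p_regular = factorial(heads) * factorial(tails) / factorial(n + 1)
    return log(p_random / p_regular)

# Scores for every composition of an eight-flip sequence, indexed by number of heads.
scores = {h: random_score(h, 8 - h) for h in range(9)}
most_random_heads = max(scores, key=scores.get)
```

For eight flips the score peaks at four heads and four tails, and the all-heads composition receives the lowest score, matching the two properties described above.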


The Zenith radio data

Having defined a simple model for judgments of the subjective randomness of sequences, we have the opportunity to apply it to data. One classic data set concerning the production of random sequences is the Zenith radio data. These data were obtained as a result of an attempt by the Zenith corporation to test the hypothesis that people are sensitive to psychic transmissions. On several occasions in 1937, a radio program took place during which a group of psychics would transmit a randomly generated binary sequence to the receptive minds of their listeners. The listeners were asked to write down the sequence that they received, one element at a time. The binary choices included heads and tails, light and dark, black and white, and several symbols commonly used in tests of psychic abilities. Listeners then mailed in their responses, which were analysed. These responses demonstrated strong preferences for particular sequences, but there was no systematic effect of the actual sequence that was transmitted (Goodfellow, 1938). The data are thus a rich source of information about response preferences for random sequences. The relative frequencies of the different sequences, collapsed over choice of first symbol, are shown in Figure 1.

[Figure 1 plot: upper panel "Zenith Radio Data", lower panel "Randomness model"; vertical axis: probability (0 to 0.2); horizontal axis: the sixteen sequences 00000 through 01111.]

Figure 1: The upper panel shows the original Zenith radio data, representing the responses of 20,099 participants, from Goodfellow (1938). The lower panel shows the predictions of the randomness model. Sequences are collapsed over the initial choice, represented by 0.

Modeling random sequence production

One of the most important characteristics of the Zenith radio data is that people's responses were produced sequentially. In producing each element of the sequence, people had knowledge of the previous elements. Kahneman and Tversky (1972) suggested that in producing such sequences, people pay attention to the local representativeness of their choices, that is, the representativeness of each subsequence.

To capture this idea, we define $L_k$ to be the local representativeness of choosing H as the $k$th response: the extent to which H results in a more random outcome than T, assessed over the subsequences starting one step back, two steps back, and so forth,

$$L_k = \sum_{i=1}^{k-1} \left[ \mathrm{random}(H_i + 1, T_i) - \mathrm{random}(H_i, T_i + 1) \right] \tag{6}$$

where the $H_i, T_i$ are the tallies of heads and tails counting back $i$ steps in the sequence. We can then convert this quantity into a probability using a logistic, to give

$$P(R_k = \mathrm{H}) = \frac{1}{1 + e^{-\lambda L_k}} = \frac{1}{1 + \left( \prod_{i=1}^{k-1} \frac{T_i + 1}{H_i + 1} \right)^{-\lambda}} \tag{7}$$

where the $\lambda$ parameter scales the effect that $L_k$ has on the resulting probability. The probability of the sequence as a whole is then the product of the probabilities of the $R_k$, and the result defines a probability distribution over the set of binary sequences of length $k$. This distribution is shown in Figure 1 for $k = 5$.

This simple model provides a remarkably good account of the response preferences people demonstrated in the Zenith radio experiment. There is one clear discrepancy: the model predicts that the sequence 01010, equivalent to HTHTH or THTHT, should occur far more often than in the data. We can explain people's avoidance of this sequence by the fact that alternation itself forms a regularity, albeit one that is beyond the scope of this simple model. More striking is the account the model gives of the different frequencies of sequences with less apparent regularities, such as [...] and [...]. Excluding the discrepant data point, the model gives a parameter-free ordinal correlation $r_s = 0.97$, and with $\lambda = 0.6$ has a linear correlation $r = 0.95$. Interestingly, the model predicts alternation, for sequences that are otherwise equally representative, with a probability of $1 / (1 + 2^{-\lambda})$. With the value of $\lambda$ used in fitting the Zenith radio data, the resulting predicted probability of alternation is 0.6, consistent with previous findings (e.g., Falk, 1981).
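The sequential model of equations (6) and (7) can be sketched as follows. This is an illustrative implementation, not the authors' code: it encodes the first response as 0 (treated as "heads"), uses the product form of (7), and collapses over the first symbol by conditioning on the first response being 0. The function names and the dictionary `dist` are assumptions made for the example.

```python
from itertools import product

def p_next_is_zero(prefix, lam):
    """Equation (7): probability that the next response is 0 ('heads')."""
    ratio = 1.0
    heads = tails = 0
    # Walk back one step at a time, updating the tallies H_i and T_i.
    for symbol in reversed(prefix):
        if symbol == 0:
            heads += 1
        else:
            tails += 1
        ratio *= (tails + 1) / (heads + 1)
    return 1.0 / (1.0 + ratio ** (-lam))

def sequence_probability(seq, lam):
    """Product of the per-response probabilities over the whole sequence."""
    prob = 1.0
    for k, symbol in enumerate(seq):
        p_zero = p_next_is_zero(seq[:k], lam)
        prob *= p_zero if symbol == 0 else 1.0 - p_zero
    return prob

LAMBDA = 0.6  # the value reported for the fit to the Zenith radio data

# Distribution over length-5 sequences, collapsed over the first symbol:
# the first response has probability 1/2 either way, so multiply by 2.
dist = {}
for rest in product((0, 1), repeat=4):
    seq = (0,) + rest
    dist["".join(map(str, seq))] = 2.0 * sequence_probability(seq, LAMBDA)

most_probable = max(dist, key=dist.get)
p_alternate_after_one = 1.0 - p_next_is_zero((0,), LAMBDA)
```

In this sketch the strictly alternating sequence comes out as the most probable, which is the discrepancy with the data noted above, and the probability of alternating after a single response equals $1 / (1 + 2^{-\lambda})$, roughly 0.6 for $\lambda = 0.6$.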

Pick a number

Research on subjective randomness has focused almost exclusively on sequences, but sequences are not the only stimuli that excite our intuitions about chance. In particular, random numbers loom larger in life than in the literature, although there have been a few studies that have investigated response preferences for numbers between 0 and 9. Kubovy and Psotka (1976) reported the frequency with which people produce numbers between 0 and 9 when asked to pick a number, aggregated across several studies. These results are shown in Figure 2. People showed a clear preference for the number 7, which Kubovy and Psotka (1976, p. 294) explained with reference to the properties of the numbers involved: for example, 6 is even, and a multiple of 3, but it is harder to find properties of 7. This explanation is suggestive of the kinds of regular generating processes that could be involved in producing numbers. Shepard and Arabie (1979) found that people's similarity judgments about numbers could be captured by properties like those described by Kubovy and Psotka (1976), such as being even numbers, powers of 2, or multiples of 3.

Taking the arithmetic properties of numbers to constitute hypothetical regularities, we can specify the quantities necessary to compute $\mathrm{random}(x)$. Our $h_i$ are sets of numbers that share some property, such as the set of even numbers between 0 and 9. For any $h_i$, we define $P(x \mid h_i) = 1 / |h_i|$ for $x \in h_i$ and 0 otherwise, where $|h_i|$ is the size of the set. This means that observations generated from a regularity are uniformly sampled from that regularity. Setting $P(h_i)$ to give equal weight to all $h_i$, we can compute $P(x \mid \mathrm{regular})$.

This model can be applied to the data of Kubovy and Psotka (1976). Since there are ten possible responses, we have $P(x \mid \mathrm{random}) = 1/10$. Taking $\mathcal{H} = \{$even, multiples of 3, multiples of 5, powers of 2, endpoints, random$\}$, we obtain the values of $\mathrm{random}(x)$ shown in Figure 2. Randomness needs to be included in $\mathcal{H}$ so that $\mathrm{random}(x)$ is defined when $x$ is not in any other regularity. Its inclusion is analogous to the incorporation of a noise process, and is in fact formally identical in this case. The order of the model predictions is a parameter free result, and gives the ordinal correlation $r_s = 0.99$. Applying a single parameter power transformation to the predictions, $y' = (y - \min(y))^{0.98}$, gives $r = 0.95$.
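The number model can be sketched in a few lines. The paper names the regularities but does not list their members, so the set memberships below (for example, whether 0 counts as a multiple of 3 or 5, and whether 1 counts as a power of 2) are assumptions made for illustration; the resulting scores are therefore not guaranteed to match Figure 2.

```python
from math import log

DIGITS = range(10)

# Assumed memberships for each named regularity (not specified in the paper).
HYPOTHESES = {
    "even": {0, 2, 4, 6, 8},
    "multiples of 3": {0, 3, 6, 9},
    "multiples of 5": {0, 5},
    "powers of 2": {1, 2, 4, 8},
    "endpoints": {0, 9},
    "random": set(DIGITS),
}

def p_given_regular(x):
    """Equation (4) with a uniform prior over hypotheses and P(x | h) = 1/|h|."""
    prior = 1.0 / len(HYPOTHESES)
    return sum(prior / len(h) for h in HYPOTHESES.values() if x in h)

def random_score(x):
    """Equation (3) with P(x | random) = 1/10."""
    return log((1.0 / 10.0) / p_given_regular(x))

scores = {x: random_score(x) for x in DIGITS}
most_random_digit = max(scores, key=scores.get)
```

Under these assumed memberships, 7 belongs to no regularity other than "random" and so receives the highest score, $\log 6$, in line with the preference for 7 in the data.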

[Figure 2 plot: upper panel "Kubovy and Psotka (1976)", lower panel "Randomness model"; vertical axis: probability (0 to 0.4); horizontal axis: number (0 to 9).]

Figure 2: The upper panel shows number production data from Kubovy and Psotka (1976), taken from 1,770 participants choosing numbers between 0 and 9. The lower panel shows the transformed predictions of the randomness model.

Coincidences

The apparently incongruous frequency of unlikely events has drawn attention from a number of psychologists and statisticians. Diaconis and Mosteller (1989), in their analysis of why such coincidences are more likely than they seem, define a coincidence as "... a surprising concurrence of events, perceived as meaningfully related, with no apparent causal connection" (p. 853). They go on to suggest that the "surprising" frequency of these events is due to the flexibility that we allow in identifying meaningful relationships. Together with the fact that everyday life provides a vast number of opportunities for coincidences to occur, our willingness to tolerate near misses and to consider each of a number of possible concurrences meaningful contributes to explaining the frequency with which coincidences occur. Diaconis and Mosteller argued that the surprise that people show at the solution to the Birthday Problem (the fact that only 23 people are required to give a 50% chance of two people sharing the same birthday) indicates that similar neglect of combinatorial growth contributes to the underestimation of the likelihood of coincidences. Psychological research addressing coincidences seems consistent with this view, suggesting that selective memory (Hintzman, Asher, & Stern, 1978) and preferential weighting of first-hand experiences (Falk & MacGregor, 1983) might facilitate the under-estimation of the probability of events.
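The Birthday Problem figure quoted above can be checked directly. The sketch below assumes 365 equally likely birthdays and independence between people; the function name is illustrative.

```python
def p_shared_birthday(n, days=365):
    """Probability that at least two of n people share a birthday."""
    p_all_distinct = 1.0
    for i in range(n):
        p_all_distinct *= (days - i) / days
    return 1.0 - p_all_distinct

# Smallest group size for which a shared birthday is at least as likely as not.
smallest_group = next(n for n in range(1, 366) if p_shared_birthday(n) >= 0.5)
```

The number of pairs grows quadratically with group size (23 people form 253 pairs), which is the combinatorial growth that intuition tends to neglect.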

Not just likelihood...

The above analyses suffer from the same misconception that interfered with producing a probabilistic account of randomness: the notion that people's judgments result from the likelihood of particular outcomes. Subjectively, coincidences are events that seem unlikely, and are hence surprising when they occur. However, just as with random sequences, sets of events that are equally likely to be produced by a random generating process differ in the degree to which they seem to be coincidences. Following Diaconis and Mosteller's suggestion that the Birthday Problem provides a domain for the investigation of coincidences, consider the kinds of coincidences formed by sets of birthdays. If we meet four people and find out that their birthdays are October 4, October 4, October 4, and October 4, this is a much bigger coincidence than if the same people have birthdays May 14, July 8, August 21, and December 25, despite the fact that these sets of birthdays are equally likely to be observed by chance. The way that these sets of birthdays differ is that one of them contains an obvious regularity: all four birthdays occur on the same day.

Modeling coincidences

Just as sequences differ in the amount of evidence they provide for having been produced by a random generating process, sets of birthdays differ in how much evidence they provide for having been produced by a process that contains regularities. We argue that the amount of evidence that an event provides for a regular generating process will correspond to how big a coincidence it seems, and that this can be computed in the same way as for randomness,

$$\mathrm{coincidence}(x) = \log \frac{P(x \mid \mathrm{regular})}{P(x \mid \mathrm{random})} \tag{8}$$

To apply this model we have to define the regularities $\mathcal{H}$. For birthdays, these regularities should correspond to relationships that can exist among dates. Our model of coincidences used a set of regularities that reflected proximity in date (from 1 to 30 days), belonging to the same calendar month, and having the same calendar date (e.g., January 17, March 17, September 17, December 17). We also assumed that each year consists of 12 months of 30 days each. Thus, for a set of $n$ birthdays, $X = \{x_1, \ldots, x_n\}$, we have $P(X \mid \mathrm{random}) = (1/360)^n$. In defining $P(X \mid \mathrm{regular})$, we want to respect the fact that regularities among birthdays are still striking even when they are embedded in noise; for instance, February 2, March 26, April 3, June 12, June 12, June 12, June 12, November 22 still provides strong evidence for a regularity in the generating process. To allow the model to tolerate noisy regularities, we can introduce a noise term into $P(X \mid h_i)$, which we denote $\alpha$. The probability calculus lets us integrate out unwanted parameters, so the introduction of a noise process need not result in adding a free parameter to the model. In particular, $P(X \mid h_i) = \int_0^1 P(X \mid \alpha, h_i) \, P(\alpha \mid h_i) \, d\alpha$. Assuming that the dates we observe are independent, we have $P(X \mid \alpha, h_i) = \prod_{x_j \in X} P(x_j \mid \alpha, h_i)$, and, taking a uniform prior on $\alpha$, $P(X \mid h_i)$ is simply $\int_0^1 \prod_{x_j \in X} P(x_j \mid \alpha, h_i) \, d\alpha$, where

$$P(x_j \mid \alpha, h_i) = \begin{cases} \dfrac{\alpha}{360} + (1 - \alpha) \dfrac{1}{|h_i|} & x_j \in h_i \\[1ex] \dfrac{\alpha}{360} & x_j \notin h_i \end{cases} \tag{9}$$

This corresponds to dates being sampled uniformly from the entire year with probability $\alpha$, and uniformly from the regularity with probability $(1 - \alpha)$. The resulting $P(X \mid h_i)$ can then be substituted into (4), setting $P(h_i)$ to give equal weight to each $h_i$, resulting in $P(X \mid \mathrm{regular})$.

How big a coincidence?

The model outlined above makes strong predictions about the degree to which different sets of birthdays should be judged to constitute coincidences. We conducted a simple experiment to examine these predictions. The participants were 93 undergraduates from Stanford University, participating for partial course credit. Fourteen potential relationships between birthdays were examined, using two sets of dates. Each participant saw one set of dates, in a random order. The dates reflected: 2, 4, 6, and 8 apparently unrelated birthdays, 2 birthdays on the same day, 2 birthdays in 2 days across a month boundary, 4 birthdays on the same day, 4 birthdays in one week across a month boundary, 4 birthdays in the same calendar month, 4 birthdays with the same calendar dates, and 2 same day, 4 same day, and 4 same date with an additional 4 unrelated birthdays, as well as 4 same week with an additional 2 unrelated birthdays. These dates were delivered in a questionnaire. Each participant was instructed to rate how big a coincidence each set of dates was, using a scale in which 1 denoted no coincidence and 10 denoted a very big coincidence.
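The coincidence model of equations (8) and (9) can be sketched as below. This is a reduced illustration rather than the model as fitted: the hypothesis space here contains only same-day, same-month, and same-calendar-date regularities (the proximity windows of 2 to 30 days are omitted), the integral over $\alpha$ is approximated with a midpoint rule, and the helper names (`day`, `p_given_h`, `coincidence`) are hypothetical.

```python
from math import log

DAYS = 360  # the paper's simplified year: 12 months of 30 days

def day(month, date):
    """Map a (month, date) pair, both 1-based, to an index in 0..359."""
    return (month - 1) * 30 + (date - 1)

# Reduced, assumed hypothesis space: each hypothesis is a set of day indices.
HYPOTHESES = (
    [{d} for d in range(DAYS)]                               # same day
    + [set(range(m * 30, (m + 1) * 30)) for m in range(12)]  # same calendar month
    + [{m * 30 + d for m in range(12)} for d in range(30)]   # same calendar date
)

def p_given_h(dates, h, steps=400):
    """Integrate the product of equation (9) over alpha with a midpoint rule."""
    total = 0.0
    for s in range(steps):
        alpha = (s + 0.5) / steps
        prod = 1.0
        for x in dates:
            p = alpha / DAYS
            if x in h:
                p += (1.0 - alpha) / len(h)
            prod *= p
        total += prod
    return total / steps

def coincidence(dates):
    """Equation (8), with equation (4) under a uniform prior over hypotheses."""
    p_regular = sum(p_given_h(dates, h) for h in HYPOTHESES) / len(HYPOTHESES)
    p_random = (1.0 / DAYS) ** len(dates)
    return log(p_regular / p_random)

same_day = [day(10, 4)] * 4
scattered = [day(5, 14), day(7, 8), day(8, 21), day(12, 25)]
score_same_day = coincidence(same_day)
score_scattered = coincidence(scattered)
```

With this reduced hypothesis space, four birthdays on October 4 score far higher than the four scattered dates, which is the qualitative ordering the model is meant to capture.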

The results of the experiment, together with the model predictions, are shown in Figure 3. Again, the ordinal predictions of the model are parameter free, with $r_s = 0.94$. Applying the transformation $y' = (y - \min(y))^{0.48}$ gives $r = 0.95$. The main discrepancies between the model and the data are the four birthdays that occur in the same calendar month, and the ordering of the random dates. The former could be addressed by increasing the prior probability given to the regularity of being in the same calendar month; clearly this was given greater weight by the participants than by the model. Explaining the increase in the judged coincidence with larger sets of unrelated dates is more difficult, but may be a result of opportunistic coincidences: as more dates are provided, participants have more opportunities to identify complex regularities or find dates of personal relevance. This process can be incorporated into the model, at the cost of greater complexity.

[Figure 3 plot: three panels, "How big a coincidence?", "Coincidences model", and "How random?"; vertical axis: rating (0 to 10); horizontal axis: the fourteen sets of dates (2, 4, 6, and 8 random; 2 same day; 2 in 2 days; 4 same day; 4 same week; 4 same month; 4 same date; and, with random dates added, 2 same day, 4 same day, 4 same date, 4 same week).]

Figure 3: The top panel shows the judged extent of coincidence for each set of dates. The middle panel is the predictions of the coincidences model, subjected to a transformation described in the text. The bottom panel shows randomness judgments for the same stimuli.

Relating randomness and coincidences

Judgments of randomness and coincidences both reflect the evidence that a set of observations provides for having been produced by a particular generating process. Events that provide good evidence for a random generating process are viewed as random, while events that provide evidence for a generating process incorporating some regularity seem like coincidences. By examining (3) and (8), we see that these phenomena are formally identified as inversely related: coincidences are events that deviate from our notions of randomness.


We conducted a further experiment to see if this relationship was borne out in people's judgments. Participants were 120 undergraduates from Stanford University, participating for partial course credit. The dates were the same as those used previously, and delivered in similar format. Each participant was instructed to rate how random each set of dates was, using a scale in which 1 denoted not at all random and 10 denoted very random.

The results of this experiment are shown in Figure 3. The correlation between the randomness judgments and the coincidence judgments is $r = -0.94$, consistent with the hypothesis that randomness and coincidences are inversely linearly related. The main discrepancy between the two data sets is that the addition of unrelated dates seems to have a stronger effect on randomness judgments than on coincidence judgments.

Conclusion

The models we have discussed in this paper provide a connection between people's intuitions about chance, expressed in judgments about randomness and coincidences, and the formal structure of probability theory. This connection depends upon changing the way we model questions about probability. Rather than considering the likelihood of events being produced by a particular generating process, our models address the question of how much more likely a set of events makes a particular generating process. This question is a structural inference, drawing conclusions about the world from observed data. Framed in this way, people's judgments are revealed to accurately approximate the statistical evidence that observations provide for having been produced by a particular generating process. The apparent inaccuracy of our intuitions may thus be a result of considering normative theories based upon the likelihood of events rather than the evidence they provide for a structural inference.

References

Baddeley, A. D. (1966). The capacity of generating information by randomization. Quarterly Journal of Experimental Psychology, 18:119–129.

Bar-Hillel, M. and Wagenaar, W. A. (1991). The perception of randomness. Advances in Applied Mathematics, 12:428–454.

Budescu, D. V. (1987). A Markov model for generation of random binary sequences. Journal of Experimental Psychology: Human Perception and Performance, 12:25–39.

Chaitin, G. J. (2001). Exploring randomness. Springer Verlag, London.

Chater, N. (1996). Reconciling simplicity and likelihood principles in perceptual organization. Psychological Review, 103:566–581.

Diaconis, P. and Mosteller, F. (1989). Methods for studying coincidences. Journal of the American Statistical Association, 84:853–861.

Falk, R. (1981). The perception of randomness. In Proceedings of the fifth international conference for the psychology of mathematics education, volume 1, pages 222–229, Grenoble, France. Laboratoire IMAG.

Falk, R. and Konold, C. (1997). Making sense of randomness: Implicit encoding as a bias for judgment. Psychological Review, 104:301–318.

Falk, R. and MacGregor, D. (1983). The surprisingness of coincidences. In Humphreys, P., Svenson, O., and Vari, A., editors, Analysing and aiding decision processes, pages 489–502. North-Holland, New York.

Good, I. J. (1979). A. M. Turing's statistical work in World War II. Biometrika, 66:393–396.

Goodfellow, L. D. (1938). A psychological interpretation of the results of the Zenith radio experiments in telepathy. Journal of Experimental Psychology, 23:601–632.

Hintzman, D. L., Asher, S. J., and Stern, L. D. (1978). Incidental retrieval and memory for coincidences. In Gruneberg, M. M., Morris, P. E., and Sykes, R. N., editors, Practical aspects of memory, pages 61–68. Academic Press, New York.

Kac, M. (1983). What is random? American Scientist, 71:405–406.

Kahneman, D. and Tversky, A. (1972). Subjective probability: A judgment of representativeness. Cognitive Psychology, 3:430–454.

Kareev, Y. (1992). Not that bad after all: Generation of random sequences. Journal of Experimental Psychology: Human Perception and Performance, 18:1189–1194.

Kareev, Y. (1995). Through a narrow window: Working memory capacity and the detection of covariation. Cognition, 56:263–269.

Kass, R. E. and Raftery, A. E. (1995). Bayes factors. Journal of the American Statistical Association, 90:773–795.

Kubovy, M. and Psotka, J. (1976). The predominance of seven and the apparent spontaneity of numerical choices. Journal of Experimental Psychology: Human Perception and Performance, 2:291–294.

Li, M. and Vitanyi, P. (1997). An introduction to Kolmogorov complexity and its applications. Springer Verlag, London.

Lopes, L. (1982). Doing the impossible: A note on induction and the experience of randomness. Journal of Experimental Psychology, 8:626–636.

Rabinowitz, F. M., Dunlap, W. P., Grant, M. J., and Campione, J. C. (1989). The rules used by children and adults to generate random numbers. Journal of Mathematical Psychology, 33:227–287.

Reichenbach, H. (1934/1949). The theory of probability. University of California Press, Berkeley.

Shepard, R. and Arabie, P. (1979). Additive clustering: Representation of similarities as combinations of discrete overlapping properties. Psychological Review, 86:87–123.

Skinner, B. F. (1942). The processes involved in the repeated guessing of alternatives. Journal of Experimental Psychology, 39:322–326.

Tenenbaum, J. B. and Griffiths, T. L. (submitted). The rational basis of representativeness. Submitted to the 23rd Conference of the Cognitive Science Society, 2001.

Tune, G. S. (1964). Response preferences: A review of some relevant literature. Psychological Bulletin, 61:286–302.

Wagenaar, W. A. (1972). Generation of random sequences by human subjects: A critical survey of literature. Psychological Bulletin, 77:65–72.

Wiegersma, S. (1982). Can repetition avoidance in randomization be explained by randomness concepts? Psychological Research, 44:189–198.

Acknowledgements

This work was supported by a Hackett studentship to the first author and funds from Mitsubishi Electric Research Laboratories. The authors thank Persi Diaconis for inspiration and conversation, Tania Lombrozo for helpful comments, and Kevin Lortie for finding the leak.