psychological reviewnas.psych.uidaho.edu/~ad.uidaho.edu\bdyre/psyc562...sentativeness heuristic. in...

15
PSYCHOLOGICAL REVIEW Copyright © 1973 by the American Psychological Association, Inc. VOL. 80, No. 4 JULY 1973 ON THE PSYCHOLOGY OF PREDICTION 1 DANIEL KAHNEMAN 2 AND AMOS TVERSKY Hebrew University of Jerusalem, Israel, and Oregon Research Institute Intuitive predictions follow a judgmental heuristic—representativeness. By this heuristic, people predict the outcome that appears most representative of the evidence. Consequently, intuitive predictions are insensitive to the relia- bility of the evidence or to the prior probability of the outcome, in violation of the logic of statistical prediction. The hypothesis that people predict by representativeness is supported in a series of studies with both naive and so- phisticated subjects. It is shown that the ranking of outcomes by likelihood coincides with their ranking by representativeness and that people erroneously predict rare events and extreme values if these happen to be representative. The experience of unjustified confidence in predictions and the prevalence of fallacious intuitions concerning statistical regression are traced to the repre- sentativeness heuristic. In this paper, we explore the rules that determine intuitive predictions and judg- ments of confidence and contrast these rules to the normative principles of statis- tical prediction. Two classes of prediction are discussed: category prediction and numerical prediction. In a categorical case, the prediction is given in nominal form, for example, the winner in an election, 1 Research for this study was supported by the following grants: Grants MH 12972 and MH 21216 from the National Institute of Mental Health and Grant RR 05612 from the National Institute of Health, U. S. Public Health Service; Grant GS 3250 from the National Science Foundation. Computing assistance was obtained from the Health Services Computing Facility, University ot California at Los Angeles, sponsored by Grant MH 10822 from the U. S. Public Health Service. The authors thank Robyn Dawes, Lewis Gold- berg, and Paul Slovic for their comments. Sundra Gregory and Richard Kleinknecht assisted in the preparation of the test material and the collection of data. J Requests for reprints should be sent to Daniel Kahneman, Department of Psychology, Hebrew University, Jerusalem, Israel. the diagnosis of a patient, or a person's future occupation. In a numerical case, the prediction is given in numerical form, for example, the future value of a particular stock or of a student's grade point average. In making predictions and judgments under uncertainty, people do not appear to follow the calculus of chance or the statistical theory of prediction. Instead, they rely on a limited number of heuristics which sometimes yield reasonable judg- ments and sometimes lead to severe and systematic errors (Kahneman & Tversky, 1972; Tversky & Kahneman, 1971, 1973). The present paper is concerned with the role of one of these heuristics—representa- tiveness—in intuitive predictions. Given specific evidence (e.g., a person- ality sketch), the outcomes under consider- ation (e.g., occupations or levels of achieve- ment) can be ordered by the degree to which they are representative of that evi- dence. The thesis of this paper is that people predict by representativeness, that is, they select or order outcomes by the 237

Upload: others

Post on 26-Jun-2020

2 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: PSYCHOLOGICAL REVIEWnas.psych.uidaho.edu/~ad.uidaho.edu\bdyre/psyc562...sentativeness heuristic. In this paper, we explore the rules that determine intuitive predictions and judg-ments

PSYCHOLOGICALREVIEW

Copyright © 1973 by the American Psychological Association, Inc.

VOL. 80, No. 4 JULY 1973

ON THE PSYCHOLOGY OF PREDICTION 1

DANIEL K A H N E M A N 2 AND AMOS TVERSKY

Hebrew University of Jerusalem, Israel, and Oregon Research Institute

Intui t ive predictions follow a judgmental heuristic—representativeness. Bythis heuristic, people predict the outcome that appears most representative ofthe evidence. Consequently, intuitive predictions are insensitive to the relia-bili ty of the evidence or to the prior probability of the outcome, in violationof the logic of statistical prediction. The hypothesis that people predict byrepresentativeness is supported in a series of studies with both naive and so-phisticated subjects. It is shown that the ranking of outcomes by likelihoodcoincides with their ranking by representativeness and that people erroneouslypredict rare events and extreme values if these happen to be representative.The experience of unjust i f ied confidence in predictions and the prevalence offallacious in tui t ions concerning statistical regression are traced to the repre-sentativeness heuristic.

In this paper, we explore the rules thatdetermine intuit ive predictions and judg-ments of confidence and contrast theserules to the normative principles of statis-tical prediction. Two classes of predictionare discussed: category prediction andnumerical prediction. In a categoricalcase, the prediction is given in nominalform, for example, the winner in an election,

1 Research for this study was supported by thefollowing grants: Grants MH 12972 and MH 21216from the National Institute of Mental Health andGrant RR 05612 from the National Ins t i tu te ofHealth, U. S. Public Health Service; Grant GS 3250from the National Science Foundation. Computingassistance was obtained from the Health ServicesComputing Facility, University ot California atLos Angeles, sponsored by Grant MH 10822 fromthe U. S. Public Health Service.

The authors thank Robyn Dawes, Lewis Gold-berg, and Paul Slovic for their comments. SundraGregory and Richard Kleinknecht assisted in thepreparation of the test material and the collectionof data.

J Requests for reprints should be sent to DanielKahneman, Department of Psychology, HebrewUniversity, Jerusalem, Israel.

the diagnosis of a patient, or a person'sfuture occupation. In a numerical case,the prediction is given in numerical form,for example, the future value of a particularstock or of a student's grade point average.

In making predictions and judgmentsunder uncertainty, people do not appearto follow the calculus of chance or thestatistical theory of prediction. Instead,they rely on a limited number of heuristicswhich sometimes yield reasonable judg-ments and sometimes lead to severe andsystematic errors (Kahneman & Tversky,1972; Tversky & Kahneman, 1971, 1973).The present paper is concerned with therole of one of these heuristics—representa-tiveness—in intuitive predictions.

Given specific evidence (e.g., a person-ality sketch), the outcomes under consider-ation (e.g., occupations or levels of achieve-ment) can be ordered by the degree towhich they are representative of that evi-dence. The thesis of this paper is thatpeople predict by representativeness, thatis, they select or order outcomes by the

237

Page 2: PSYCHOLOGICAL REVIEWnas.psych.uidaho.edu/~ad.uidaho.edu\bdyre/psyc562...sentativeness heuristic. In this paper, we explore the rules that determine intuitive predictions and judg-ments

238 DANIEL K A H N E M A N AND AMOS TVERSKY

TABLE t

ESTIMATED BASIC RATES OF THE NINE AREAS OFGRADUATE SPECIALIZATION AND SUMMARY

OF SIMILARITY AND PREDICTIONDATA FOR TOM W.

Graduate specializationarea

BusinessAdministration

Computer ScienceEngineeringHumanities

and EducationLawLibrary Science1

MedicinePhysical and

Life SciencesSocial Science

and Social Work

Meanindued

base rate(in %)

IS79

20938

Meansimilarity

rank

,1.92.12.9

7.25.94.25.9

Meanlikelihood

rank

4.,<2.52.6

7.65.24.75.8

112 4.5 ! 4..!

17\

8.2 8.0

degree to which the outcomes representthe essential features of the evidence. Inmany situations, representative outcomesare indeed more likely than others. J low-ever, this is not always the case, becausethere are factors (e.g., the prior probabili-ties of outcomes and the reliabil i ty of theevidence) which affect the likelihood of out-comes but not their representativeness.Because these factors are ignored, intuitivepredictions violate the statistical rules ofprediction in systematic and fundamenta lways. To confirm th is hypothesis, weshow that the ordering of outcomes byperceived likelihood coincides with theirordering by representativeness and thatintuitive predictions are essentially un-affected by considerations of prior prob-ability and expected predictive accuracy.

In the first section, we investigate cate-gory predictions and show that they con-form to an independent assessment ofrepresentativeness and that they are essen-tially independent of the prior probabili-ties of outcomes. In the next section, weinvestigate numerical predictions and showthat they are not properly regressive andare essentially unaffected by considerationsof reliability. The following three sectionsdiscuss, in turn, methodological issues inthe study of prediction, the sources of un-

justified confidence in predictions, andsome fallacious intuitions concerning re-gression effects.

CATKGORICAL PREDICTION

Base Rate, Similarity, and Likelihood

The following experimental example il-lustrates prediction by representativenessand the fallacies associated wi th this modeof intuit ive prediction. A group of 69subjects3 (the base-rate group) was askedthe following question: "Consider all first-year graduate students in the U. S. today.Please write clown your best guesses aboutthe percentage of these students who arenow enrolled in each of the following ninefields of specialization." The nine fieldsare listed in Table 1. The first column ofthis table presents the mean estimates ofbase rate for the various fields.

A second group of 65 subjects (thesimilarity group) was presented with thefollowing personality sketch:

Tom \V. is of high intelligence, although lacking intrue creativity. I le has a need for order and clarity,and for neat and t idy systems in which every detailfinds its appropriate place. His writ ing is rather dulland mechanical, occasionally enlivened by somewhatcorny puns and by Hashes of imagination of thesci-fi type. He has a strong drive for competence.He seems to have l i t t le feel and l i t t le sympathy forother people and does not enjoy interacting withothers. Self-centered, he nonetheless has a deepmoral sense.

The subjects were asked to rank the nineareas in terms of "how similar is Tom W.to the typical graduate student in each ofthe following nine fields of graduate special-ization?" The second column in Table 1presents the mean similarity ranks assignedto the various fields.

Finally, a prediction group, consisting of114 graduate students in psychology atthree major universities in the UnitedStates, was given the personality sketch ofTom W., with the following additionalinformation:

s Unless otherwise specified, the subjects in thestudies reported in this paper were paid volunteersrecruited through a student paper at the Universityof Oregon. Data were collected in group settings.

Page 3: PSYCHOLOGICAL REVIEWnas.psych.uidaho.edu/~ad.uidaho.edu\bdyre/psyc562...sentativeness heuristic. In this paper, we explore the rules that determine intuitive predictions and judg-ments

ON THE PSYCHOLOGY OF PREDICTION 239

The preceding personality sketch of Tom \V. waswritten during Tom's senior year in high school bya psychologist, on the basis of projective tests.Tom W. is currently a graduate student. Pleaserank the following nine fields of graduate specializa-tion in order of the likelihood that Tom \V. is nowa graduate student in each of these fields.

The third column in Table 1 presentsthe means of the ranks assigned to the out-comes by the subjects in the predictiongroup.

The product-moment correlations be-tween the columns of Table 1 were com-puted. The correlation between judgedlikelihood and similarity is .97, while thecorrelation between judged likelihood andestimated base rate4 is —.65. Evidently,judgments of likelihood essentially coincidewith judgments of similarity and are quiteunlike the estimates of base rates. Thisresult provides a direct confirmation of thehypothesis that people predict by repre-sentativeness, or similarity.

The judgments of likelihood by the psy-chology graduate students drastically vio-late the normative rules of prediction.More than 95% of those respondents judgedthat Tom W. is more likely to study com-puter science than humanities or education,although they were surely aware of thefact that there arc many more graduatestudents in the latter field. According tothe base-rate estimates shown in Table 1,the prior odds for humanities or educationagainst computer science are about 3 to 1.(The actual odds are considerably higher.)

According to Bayes' rule, it is possible toovercome the prior odds against Tom W.being in computer science rather than inhumanities or education, if the descriptionof his personality is both accurate anddiagnostic. The graduate students in ourstudy, however, did not believe that theseconditions were met. Following the pre-diction task, the respondents were askedto estimate the percentage of hits (i.e.,correct first choices among the nine areas)which could be achieved with several typesof information. The median estimate ofhits was 23% for predictions based on

4 In computing this correlation, the ranks were in-verted so that a high judged likelihood was assigneda high value.

projective tests, which compares to 53%,for example, for predictions based on highschool seniors' reports of their interests andplans. Evidently, projective tests wereheld in low esteem. Nevertheless, thegraduate students relied on a descriptionderived from such tests and ignored thebase rates.

In general, three types of information arerelevant to statistical prediction: (a) prioror background information (e.g., base ratesof fields of graduate specialization); (b)specific evidence concerning the individualcase (e.g., the description of Tom W.); (c)the expected accuracy of prediction (e.g.,the estimated probability of hits). Afundamental rule of statistical predictionis that expected accuracy controls the rela-tive weights assigned to specific evidenceand to prior information. When expectedaccuracy decreases, predictions should be-come more regressive, that is, closer to theexpectations based on prior information.In the case of Tom W., expected accuracywas low, and prior probabilities should havebeen weighted heavily. Instead, our sub-jects predicted by representativeness, thatis, they ordered outcomes by their simi-larity to the specific evidence, with noregard for prior probabilities.

In their exclusive reliance on the person-ality sketch, the subjects in the predictiongroup apparently ignored the following con-siderations. Eirst, given the notorious in-validity of projective personality tests, it isvery likely that: Tom W. was never in factas compulsive and as aloof as his descrip-tion suggests. Second, even if the descrip-tion was valid when Tom W. was in highschool, it may no longer be valid now thathe is in graduate school. Finally, even ifthe description is still valid, there are prob-ably more people who fit that descriptionamong students of humanities and educa-tion than among students of computerscience, simply because there are so manymore students in the former than in thelatter field.

Manipulation of Expected Accuracy

An additional study tests the hypothesisthat, contrary to the statistical model, a

Page 4: PSYCHOLOGICAL REVIEWnas.psych.uidaho.edu/~ad.uidaho.edu\bdyre/psyc562...sentativeness heuristic. In this paper, we explore the rules that determine intuitive predictions and judg-ments

240 DANIEL KAHNEMAN AND AMOS TVERSKY

TABLE 2PRODUCT-MOMKNT CoKKKI.ATIONS OF M l C A N I . IKKI . IHOOD R A N K WITH MlsAN S I M I L A R I T Y

R A N K A N D WITH BASIC RATK

r

With mean similarity rankWith base rate

Modal first prediction

Law

.93

.33

ComputerScience

.06-.35

Medicine

.92

.27

LibraryScience

.88-.03

BusinessAdministration

.88

.62

manipulation of expected accuracy does no!affect the pattern of predictions. The ex-perimental material consisted of five thumb-nail personality sketches of ninth-gradeboys, allegedly written by a counselor onthe basis of an interview in the context ofa longitudinal study. The design was thesame as in the Tom VV. study. For eachdescription, subjects in one group (TV = 69)ranked the nine fields of graduate special-ization (see Table 1) in terms of the simi-larity of the boy described to their "imageof the typical first-year graduate studentin that field." Following the similar i tyjudgments, they estimated the base-ratefrequency of the nine areas of graduatespecialization. These estimates were shownin Table 1. The remaining subjects weretold that the five cases had been randomlyselected from among the participants inthe original study who are now first-yeargraduate students. One group, the high-accuracy group (N = 55), was told that"on the basis of such descriptions, studentslike yourself make correct predictions inabout 55% of the cases." The low-accuracy group (N = 50) was told thatstudents' predictions in this task arecorrect in about 27% of the cases. Foreach description, the subjects ranked thenine fields according to "the likelihood thatthe person described is now a graduatestudent in that field." For each descrip-t ion, they also estimated the probabilitythat their first choice was correct.

The manipulation of expected accuracyhad a significant effect on these probabilityjudgments. The mean estimates were .70and .56, respectively, for the high- and low-accuracy group (/ = 3.72, /> < .001). Mow-ever, the orderings of the n ine outcomes

produced under the low-accuracy instruc-tions were not significantly closer to thebase-rate distribution than the orderingsproduced under the high-accuracy instruc-tions. A product-moment correlation wascomputed for each judge, between theaverage rank he had assigned to each of thenine outcomes (over the five descriptions)and the base rate. This correlation is anoverall measure of the degree to which thesubject's predictions conform to the base-rate distribution. The averages of theseindividual correlations were .13 for subjectsin the high-accuracy group and .16 forsubjects in the low-accuracy group. Thedifference docs not approach significance(/ = .42, df = 103). This pattern of judg-ments violates the normative theory ofprediction, according to which any de-crease in expected accuracy should beaccompanied by a shif t of predictions to-ward the base rate.

Since the manipulation of expected ac-curacy had no effect on predictions, thetwo prediction groups were pooled. Sub-sequent analyses were the same as in theTom W. study. For each description, twocorrelations were computed: (a) betweenmean likelihood rank and mean similarityrank and (b) between mean likelihood rankand mean base rate. These correlationsare shown in Table 2, with the outcomejudged most likely for each description.The correlations between prediction andsimilarity are consistently high. In con-trast, there is no systematic relation be-tween prediction and base rate: the correla-tions vary widely depending on whetherthe most representative outcomes for eachdescription happen to be frequent orrare.

Page 5: PSYCHOLOGICAL REVIEWnas.psych.uidaho.edu/~ad.uidaho.edu\bdyre/psyc562...sentativeness heuristic. In this paper, we explore the rules that determine intuitive predictions and judg-ments

ON THE PSYCHOLOGY OF PREDICTION 241

Here again, considerations of base ratewere neglected. In the statistical theory,one is allowed to ignore the base rate onlywhen one expects to be infallible. In allother cases, an appropriate compromisemust be found between the ordering sug-gested by the description and the orderingof the base rates. It is hardly believablethat a cursory description of a fourteen-year-old child based on a single interviewcould justify the degree of infallibility im-plied by the predictions of our subjects.

Following the five personality descrip-tions, the subjects were given an additionalproblem:

About Don you will be told nothing except that heparticipated in the original study and is now a first-year graduate student. Please indicate your order-ing and report your confidence for this case as well.

For Don the correlation between meanlikelihood rank and estimated base ratewas .74. Thus, the knowledge of baserates, which was not applied when a de-scription was given, was utilized when nospecific evidence was available.

Prior versus Individuating Evidence

The next study provides a more stringenttest of the hypothesis that intuitive pre-dictions are dominated by representative-ness and are relatively insensitive to priorprobabilities. In this study, the priorprobabilities were made exceptionally sali-ent and compatible with the responsemode. Subjects were presented with thefollowing cover story:

A panel of psychologists have interviewed and ad-ministered personality tests to 30 engineers and 70lawyers, all successful in their respective fields. Onthe basis of this information, thumbnail descriptionsof the 30 engineers and 70 lawyers have been written.You will find on your forms five descriptions, chosenat random from the 100 available descriptions.For each description, please indicate your prob-ability that the person described is an engineer,on a scale from 0 to 100.The same task has been performed by a panel ofexperts, who were highly accurate in assigningprobabilities to the various descriptions. You willbe paid a bonus to the extent that your estimatescome close to those of the expert panel.

These instructions were given to a groupof 85 subjects (the low-engineer, or L

group). Subjects in another group (thehigh-engineer, H group; N = 86) weregiven identical instructions except for theprior probabilities: they were told that theset from which the descriptions had beendrawn consisted of 70 engineers and 30lawyers. All subjects were presented withthe same five descriptions. One of thedescriptions follows:Jack is a 45-year-old man. He is married and hasfour children. He is generally conservative, careful,and ambitious. He shows no interest in politicaland social issues and spends most of his free time onhis many hobbies which include home carpentry,sailing, and mathematical puzzles.The probability that Jack is one of the 30 engineersin the sample of 100 is %.

Following the five descriptions, the subjectsencountered the null description:Suppose now that you are given no informationwhatsoever about an individual chosen at randomfrom the sample.The probability that this man is one of the 30engineers in the sample of 100 is .%.

In both the high-engineer and low-engi-neer groups, half of the subjects were askedto evaluate, for each description, the prob-ability that the person described was anengineer (as in the example above), whilethe other subjects evaluated, for each de-scription, the probability that the persondescribed was a lawyer. This manipulationhad no effect. The median probabilitiesassigned to the outcomes engineer andlawyer in the two different forms added toabout 100% for each description. Con-sequently, the data for the two forms werepooled, and the results are presented interms of the outcome engineer.

The design of this experiment permitsthe calculation of the normatively appro-priate pattern of judgments. The deriva-tion relies on Bayes' formula, in odds form.Let 0 denote the odds that a particulardescription belongs to an engineer ratherthan to a lawyer. According to Bayes'rule, O = Q-R, where Q denotes the priorodds that a randomly selected descriptionbelongs to an engineer rather than to alawyer; and R is the likelihood ratio for aparticular description, that is, the ratio ofthe probability that a person randomlydrawn from a population of engineers will

Page 6: PSYCHOLOGICAL REVIEWnas.psych.uidaho.edu/~ad.uidaho.edu\bdyre/psyc562...sentativeness heuristic. In this paper, we explore the rules that determine intuitive predictions and judg-ments

242 DANIEL KAHNEMAN AND AMOS TVKRSKY

Probability (Eng ineer !Low Prior

FIG. 1. Median judged probability (engineer) forfive descriptions and for the null description (squaresymbol) under high and low prior probabilities.(The curved line displays the correct relation ac-cording to Bayes' rule.)

be so described to the probability that aperson randomly drawn from a populationof lawyers will be so described.

For the high-engineer group, who weretold that the sample consists of 70 engi-neers and 30 lawyers, the prior odds Qnequal 70/30. For the low-engineer group,the prior odds QL equal 30/70. Thus, foreach description, the ratio of the posteriorodds for the two groups is

OHQL-R

7/33/7

= 5.44.

Since the likelihood ratio is cancelled inthis formula, the same value of OH/OLshould obtain for all descriptions. In thepresent design, therefore, the correct effectof the manipulation of prior odds can becomputed without knowledge of the likeli-hood ratio.

Figure 1 presents the median probabilityestimates for each description, under thetwo conditions of prior odds. For eachdescription, the median estimate of prob-ability when the prior is high (Qn = 70/30)is plotted against the median estimate whenthe prior is low (@L = 30/70). Accordingto the normative equation developed in thepreceding paragraph, all points should lie

on the curved (Bayesian) line. In fact,only the empty square which correspondsto the null description falls on this line:when given no description, subjects judgedthe probability to be 70% under QH and30% under QL. In the other five cases, thepoints fall close to the identity line.

The effect of prior probability, althoughslight, is statistically significant. For eachsubject the mean probability estimate wascomputed over all cases except the null.The average of these values was 50% forthe low-engineer group and 55% for thehigh-engineer group (t = 3.23, df = 169,p < .01). Nevertheless, as can be seenfrom Figure 1, every point is closer to theidentity line than to the Bayesian line. Itis fair to conclude that explicit manipula-tion of the prior distribution had a minimaleffect on subjective probability. As inthe preceding experiment, subjects appliedtheir knowledge of the prior only when theywere given no specific evidence. As en-tailed by the representativeness hypothesis,prior probabilities were largely ignoredwhen individuating information was madeavailable.

The strength of this effect is demon-strated by the responses to the followingdescription:

Dick is a 30-year-old man. He is married with nochildren. A man of high ability and high motivation,he promises to be quite successful in his field. He iswell liked by his colleagues.

This description was constructed to betotally uninformative with regard to Dick'sprofession. Our subjects agreed: medianestimates were 50% in both the low- andhigh-engineer groups (see Figure 1). Thecontrast between the responses to this de-scription and to the null description isilluminating. Evidently, people responddifferently when given no specific evidenceand when given worthless evidence. Whenno specific evidence is given, the priorprobabilities are properly utilized; whenworthless specific evidence is given, priorprobabilities are ignored.

There are situations in which prior prob-abilities are likely to play a more substan-tial role. In all the examples discussed sofar, distinct stereotypes were associated

Page 7: PSYCHOLOGICAL REVIEWnas.psych.uidaho.edu/~ad.uidaho.edu\bdyre/psyc562...sentativeness heuristic. In this paper, we explore the rules that determine intuitive predictions and judg-ments

ON THE PSYCHOLOGY OF PREDICTION 243

with the alternative outcomes, and judg-ments were controlled, we suggest, by thedegree to which the descriptions appearedrepresentative of these stereotypes. Inother problems, the outcomes are morenaturally viewed as segments of a dimen-sion. Suppose, for example, that one isasked to judge the probability that each ofseveral students will receive a fellowship.In this problem, there are no well-delineatedstereotypes of recipients and nonrecipientsof fellowships. Rather, it is natural toregard the outcome (i.e., obtaining a. fellow-ship) as determined by a cutoff point alongthe dimension of academic achievement orability. Prior probabilities, that is, thepercentage of fellowships in the relevantgroup, could be used to define the outcomesby locating the cutoff point. Consequently,they are not likely to be ignored. In addi-tion, we would expect extreme prior prob-abilities to have some effect even in thepresence of clear stereotypes of the out-comes. A precise delineation of the condi-tions under which prior information is usedor discarded awaits further investigation.

One of the basic principles of statisticalprediction is that prior probability, whichsummarizes what we knew about the prob-lem before receiving independent specificevidence, remains relevant even after suchevidence is obtained. Bayes' rule trans-lates this qualitative principle into a multi-plicative relation between prior odds andthe likelihood ratio. Our subjects, how-ever, failed to integrate prior probabilitywith specific evidence. When exposed toa description, however scanty or suspect,of Tom W. or of Dick (the engineer/lawyer), they apparently felt that the dis-tribution of occupations in his group wasno longer relevant. The failure to appre-ciate the relevance of prior probability inthe presence of specific evidence is perhapsone of the most significant departures ofintuition from the normative theory ofprediction.

NUMERICAL PREDICTION

A fundamental rule of the normativetheory of prediction is that the variabilityof predictions, over a set of cases, should

reflect predictive accuracy. When pre-dictive accuracy is perfect, one predictsthe criterion value that will actually occur.When uncertainty is maximal, a fixed valueis predicted in all cases. (In category pre-diction, one predicts the most frequentcategory. In numerical prediction, onepredicts the mean, the mode, the median,or some other value depending on the lossfunction.) Thus, the variability of predic-tions is equal to the variability of thecriterion when predictive accuracy is per-fect, and the variability of predictions iszero when predictive accuracy is zero.With intermediate predictive accuracy, thevariability of predictions takes an inter-mediate value, that is, predictions are re-gressive with respect to the criterion.Thus, the greater the uncertainty, thesmaller the variability of predictions. Pre-dictions by representativeness do not followthis rule. It was shown in the previoussection that people did not regress towardmore frequent categories when expectedaccuracy of predictions was reduced. Thepresent section demonstrates an analo-gous failure in the context of numericalprediction.

Prediction of Outcomes versus Evaluation ofInputs

Suppose one is told that a college fresh-man has been described by a counselor asintelligent, self-confident, well-read, hard-working, and inquisitive. Consider twotypes of questions that might be askedabout this description:(a) Evaluation: How does this description impressyou with respect to academic ability? What per-centage of descriptions of freshmen do you believewould impress you more? (b) Prediction: What isyour estimate of the grade point average that thisstudent will obtain? What is the percentage offreshmen who obtain a higher grade point average?

There is an important difference betweenthe two questions. In the first, you evalu-ate the input; in the second, you predict anoutcome. Since there is surely greater un-certainty about the second question thanabout the first, your prediction should bemore regressive than your evaluation.That is, the percentage you give as a pre-

Page 8: PSYCHOLOGICAL REVIEWnas.psych.uidaho.edu/~ad.uidaho.edu\bdyre/psyc562...sentativeness heuristic. In this paper, we explore the rules that determine intuitive predictions and judg-ments

244 DANIEL K A H N E M A N AND AMOS TVKRSKY

ADJECTIVES

40 60

Evaluation

a.O

REPORTS

40 60

Evaluation

100

FIG. 2. Predicted percentile grade point averageas a function of percentile evaluation for adjectives(A) and reports (B).

diction should be closer to 50% than thepercentage you give as an evaluation. Tohighlight the difference between the twoquestions, consider the possibility that thedescription is inaccurate. This shouldhave no effect on your evaluation: theordering of descriptions with respect to theimpressions they make on you is inde-pendent of their accuracy. In predicting,on the other hand, you should be regressiveto the extent that you suspect the descrip-tion to be inaccurate or your prediction tobe invalid.

The representativeness hypothesis, how-ever, entails that prediction and evaluationshould coincide. In evaluating a given de-scription, people select a score which, pre-sumably, is most representative of the de-scription. If people predict by representa-tiveness, they will also select the most repre-sentative score as their prediction. Con-sequently, the evaluation and the predic-tion will be essentially identical. Severalstudies were conducted to test this hypoth-esis. In each of these studies the subjectswere given descriptive information con-cerning a set of cases. An evaluation groupevaluated the quality of each descriptionrelative to a stated population, and a pre-diction group predicted future performance.The judgments of the two groups werecompared to test whether predictions aremore regressive than evaluations.

In two studies, subjects were given de-scriptions of college freshmen, allegedlywritten by a counselor on the basis of aninterview administered to the entering class.In the first study, each description con-sisted of five adjectives, referring to in-tellectual qualities and to character, asin the example cited above. In the secondstudy, the descriptions were paragraph-length reports, including details of thestudent's background and of his currentadjustment to college. In both studies theevaluation groups were asked to evaluateeach one of the descriptions by estimating"the percentage of students in the entireclass whose descriptions indicate a higheracademic ability." The prediction groupswere given the same descriptions and wereasked to predict the grade point averageachieved by each student at the end of hisfreshman year and his class standing inpercen tiles.

The results of both studies are shown inFigure 2, which plots, for each description,the mean prediction of percentile gradepoint average against the mean evaluation.The only systematic discrepancy betweenpredictions and evaluations is observed inthe adjectives study where predictions wereconsistently higher than the correspondingevaluations. The standard deviation ofpredictions or evaluations was computed

Page 9: PSYCHOLOGICAL REVIEWnas.psych.uidaho.edu/~ad.uidaho.edu\bdyre/psyc562...sentativeness heuristic. In this paper, we explore the rules that determine intuitive predictions and judg-ments

ON THE PSYCHOLOGY OF PREDICTION 245

within the data of each subject. A com-parison of these values indicated no signifi-cant differences in variability between theevaluation and the prediction groups,within the range of values under study. Inthe adjectives study, the average standarddeviation was 25.7 for the evaluation group(N = 38) and 24.0 for the prediction group(N = 36) (/ = 1.25, df = 72, ns). In thereports study, the average standard devia-tion was 22.2 for the evaluation group(N = 37) and 21.4 for the prediction group(N = 63) (t = .75, df = 98, ns). In bothstudies the prediction and the evaluationgroups produced equally extreme judg-ments, although the former predicted aremote objective criterion on the basis ofsketchy interview information, while thelatter merely evaluated the impression ob-tained from each description. In the statis-tical theory of prediction, the observedequivalence between prediction and evalua-tion would be justified only if predictiveaccuracy were perfect, a condition whichcould not conceivably be met in thesestudies.

Further evidence for the equivalence ofevaluation and prediction was obtained ina master's thesis by Beyth (1972). Shepresented three groups of subjects withseven paragraphs, each describing the per-formance of a student-teacher during aparticular practice lesson. The subjectswere students in a statistics course at theHebrew University. They were told thatthe descriptions had been drawn fromamong the files of 100 elementary schoolteachers who, five years earlier, had com-pleted their teacher training program. Sub-jects in an evaluation group were askedto evaluate the quality of the lesson de-scribed in the paragraph, in percentilescores relative to the stated population.Subjects in a prediction group were askedto predict in percentile scores the currentstanding of each teacher, that is, his overallcompetence five years after the descriptionwas written. An evaluation-predictiongroup performed both tasks. As in thestudies described above, the differencesbetween evaluation and prediction were notsignificant. This result held in both the

between-subjects and within-subject com-parisons. Although the judges were un-doubtedly aware of the multitude of factorsthat intervene between a single trial lessonand teaching competence five years later,this knowledge did not cause their predic-tions to be more regressive than theirevaluations.

Prediction versus Translation

The previous studies showed that pre-dictions of a variable are not regressivewhen compared to evaluations of the inputsin terms of that variable. In the followingstudy, we show that there are situations inwhich predictions of a variable (academicachievement) are no more regressive thana mere translation of that variable fromone scale to another. The grade pointaverage was chosen as the outcome variable,because its correlates and distributionalproperties are well known to the subjectpopulation.

Three groups of subjects participated inthe experiment. Subjects in all groupspredicted the grade point average of 10hypothetical students on the basis of asingle percentile score obtained by each ofthese students. The same set of percentilescores was presented to all groups, but thethree groups received different interpreta-tions of the input variable as follows.

1. Percentile grade point average. Thesubjects in Group 1 (N = 32) were toldthat "for each of several students you willbe given a percentile score representinghis academic achievements in the freshmanyear, and you will be asked to give yourbest guess about his grade point averagefor that year." It was explained to thesubjects that "a percentile score of 65, forexample, means that the grade pointaverage achieved by this student is betterthan that achieved by 65% of his class, etc."

2. Mental concentration. The subjectsin Group 2 (N = 37) were told that "thetest of mental concentration measures one'sability to concentrate and to extract all theinformation conveyed by complex messages.It was found that students with high gradepoint averages tend to score high on themental concentration test and vice versa.

Page 10: PSYCHOLOGICAL REVIEWnas.psych.uidaho.edu/~ad.uidaho.edu\bdyre/psyc562...sentativeness heuristic. In this paper, we explore the rules that determine intuitive predictions and judg-ments

246 DANIEL KAHNEMAN AND AMOS TVERSKY

o

PERCENTILE GPA

-0 MENTAL CONCENTRATION

- -A SENSE OF HUMOR

SO 60 70 BO

Percentile score

FIG. 3. Predictions of grade point average frompercentile scores on three variables.

However, performance on the mental con-centration test was found to depend on themood and mental state of the person at thetime he took the test. Thus, when testedrepeatedly, the same person could obtainquite different scores, depending on theamount of sleep he had the night before orhow well he felt that day."

3. Sense of humor. The subjects inGroup 3 (N = 35) were told that "thetest of sense of humor measures the abilityof people to invent witty captions forcartoons and to appreciate humor in variousforms. It was found that students whoscore high on this test tend, by and large,to obtain a higher grade point average thanstudents who score low. However, it is notpossible to predict grade point averagefrom sense of humor with high accuracy."

In the present design, all subjects pre-dicted grade point average on the basis ofthe same set of percentile scores. Group 1merely translated values of percentile gradepoint average onto the grade point averagescale. Groups 2 and 3, on the other hand,predicted grade point average from moreremote inputs. Normative considerationstherefore dictate that the predictions ofthese groups should be more regressive,that is, less variable, than the judgmentsof Group 1. The representativeness hy-pothesis, however, suggests a differentpattern of results.

Group 2 predicted from a potentiallyvalid but unreliable test of mental concen-tration which was presented as a measure ofacademic ability. We hypothesized thatthe predictions of this group would be non-regressive when compared to the predic-tions of Group 1. In general, we conjecturethat the achievement score (e.g., gradepoint average) which best represents apercentile value on a measure of ability(e.g., mental concentration) is that whichcorresponds to the same precentile on thescale of achievement. Since representative-ness is not affected by unreliability, weexpected the predictions of grade pointaverage from the unreliable test of mentalconcentration to be essentially identical tothe predictions of grade point average frompercentile grade point average. The pre-dictions of Group 3, on the other hand, wereexpected to be regressive because sense ofhumor is not commonly viewed as a mea-sure of academic ability.

The mean predictions assigned to the10 percentile scores by the three groups areshown in Figure 3. It is evident in thefigure that the predictions of Group 2 areno more regressive than the predictions ofGroup 1, while the predictions of Group 3appear more regressive.

Four indices were computed within thedata of each individual subject: the meanof his predictions, the standard deviationof his predictions, the slope of the regressionof predicted grade point average on theinput scores, and the product-moment cor-relation between them. The means of thesevalues for the three groups are shown inTable 3.

It is apparent in the table that the sub-jects in all three groups produced orderlydata, as evinced by the high correlationsbetween inputs and predictions (the averagecorrelations were obtained by transformingindividual values to Fisher's 2). The resultsof planned comparisons between Groups1 and 2 and between Groups 2 and 3 con-firm the pattern observed in Figure 3.There are no significant differences betweenthe predictions from percentile grade pointaverage and from mental concentration.Thus, people fail to regress when predicting

Page 11: PSYCHOLOGICAL REVIEWnas.psych.uidaho.edu/~ad.uidaho.edu\bdyre/psyc562...sentativeness heuristic. In this paper, we explore the rules that determine intuitive predictions and judg-ments

ON THE PSYCHOLOGY OF PREDICTION 247

TABLE 3

AVERAGES OF INDIVIDUAL PREDICTION STATISTICS FOR THE THREE GROUPS AND RESULTS OFPLANNED COMPARISONS BETWEEN GROUPS 1 AND 2, AND BETWEEN GROUPS 2 AND 3

Statistic

Mean predictedgrade point average

SD of predictionsSlope of regressionr

Group

1. PercenLilegrade point

2.27.91.030.97

1 vs. 2

nsnsnsns

2. Mentalconcentration

2.3S.87.029.95

2 vs. 3

.05

.01

.01ns

3. Sense ofhumor

2.46.69.022.94

a measure of achievement from a measureof ability, however unreliable.

The predictions from sense of humor, onthe other hand, are regressive, although notenough. The correlation between gradepoint average and sense of humor inferredfrom a comparison of the regression linesis about .70. In addition, the predictionsfrom sense of humor are significantly higherthan the predictions from mental concentra-tion. There is also a tendency for predic-tions from mental concentration to behigher than predictions based on percentilegrade point average. We have observedthis finding in many studies. When pre-dicting the academic achievement of anindividual on the basis of imperfect informa-tion, subjects exhibit leniency (Guilford,1954). They respond to a reduction ofvalidity by raising the predicted level ofperformance.

Predictions are expected to be essentiallynonregressive whenever the input and out-come variables are viewed as manifesta-tions of the same trait. An example ofsuch predictions has been observed in areal-life setting, the Officer Selection Boardof the Israel Army. The highly experiencedofficers who participate in the assessmentteam normally evaluate candidates on a7-point scale at the completion of severaldays of testing and observation. For thepurposes of the study, they were requiredin addition to predict, for each successfulcandidate, the final grade that he wouldobtain in officer training school. In over200 cases, assessed by a substantial numberof different judges, the distribution of pre-

dicted grades was found to be virtuallyidentical to the actual distribution of finalgrades in officer training school, with oneobvious exception: predictions of failurewere less frequent than actual failures. Inparticular, the frequencies of predictionsin the two highest categories preciselymatched the actual frequencies. All judgeswere keenly aware of research indicating thattheir predictive validity was only moderate(on the order of .20 to .40). Nevertheless,their predictions were nonregressive.

METHODOLOGICAL CONSIDERATIONS

The representativeness hypothesis statesthat predictions do not differ from evalua-tions or assessments of similarity, althoughthe normative statistical theory entails thatpredictions should be less extreme thanthese judgments. The test of the repre-sentativeness hypothesis therefore requiresa design in which predictions are comparedto another type of judgment. Variants oftwo comparative designs were used in thestudies reported in this paper.

In one design, labeled A-XY, differentgroups of subjects judged two variables(X and Y) on the basis of the same inputinformation (A). In the case of Tom W.,for example, two different groups weregiven the same input information (A), thatis, a personality description. One groupranked the outcomes in terms of similarity(X), while the other ranked the outcomesin terms of likelihood (Y). Similarly, inseveral studies of numerical prediction,different groups were given the same in-formation (A), for example, a list of five

Page 12: PSYCHOLOGICAL REVIEWnas.psych.uidaho.edu/~ad.uidaho.edu\bdyre/psyc562...sentativeness heuristic. In this paper, we explore the rules that determine intuitive predictions and judg-ments

248 DANIEL KAHNEMAN AND AMOS TVERSKY

adjectives describing a student. One groupprovided an evaluation (X) and the othera prediction (Y).

In another design, labeled AB-X, twodifferent groups of subjects judged the sameoutcome variable (X) on the basis ofdifferent information inputs (A and B).In the engineer/lawyer study, for example,two different groups made the same judg-ment (X) of the likelihood that a particularindividual is an engineer. They were givena brief description of his personality anddifferent information (A and B) concerningthe base-rate frequencies of engineers andlawyers. In the context of numerical pre-diction, different groups predicted gradepoint average (X), from scores on differentvariables, percentile grade point average(A) and mental concentration (B).

The representativeness hypothesis wassupported in these comparative designs byshowing that contrary to the normativemodel, predictions are no more regressivethan evaluations or judgments of simi-larity. It is also possible to ask whetherintuitive predictions are regressive whencompared to the actual outcomes, or to theinputs when the inputs and outcomes aremeasured on the same scale. Even whenpredictions are no more regressive thantranslations, we expect them to be slightlyregressive when compared to the outcomes,because of the well-known central-tendencyerror (Johnson, 1972; Woodworth, 1938).In a wide variety of judgment tasks, in-cluding the mere translation of inputs fromone scale to another, subjects tend to avoidextreme responses and to constrict thevariability of their judgments (Stevens &Greenbaum, 1966). Because of this re-sponse bias, judgments will be regressive,when compared to inputs or to outcomes.The designs employed in the present paperneutralize this effect by comparing twojudgments, both of which are subject to thesame bias.

The present set of studies was concernedwith situations in which people make pre-dictions on the basis of information that isavailable to them prior to the experiment,in the form of stereotypes (e.g., of anengineer) and expectations concerning rela-

tionships between variables. Outcomefeedback was not provided, and the numberof judgments required of each subject wassmall. In contrast, most previous studies ofprediction have dealt with the learning offunctional or statistical relations amongvariables with which the subjects had noprior acquaintance. These studies typicallyinvolve a large number of trials and variousforms of outcome feedback. (Some of thisliterature has been reviewed in Slovic andLichtenstein, 1971.) In studies of repeti-tive predictions with feedback, subjectsgenerally predict by selecting outcomes sothat the entire sequence or pattern ofpredictions is highly representative of thedistribution of outcomes. For example,subjects in probability-learning studiesgenerate sequences of predictions whichroughly match the statistical character-istics of the sequence of outcomes. Simi-larly, subjects in numerical predictiontasks approximately reproduce the scatter-plot, that is, the joint distribution of inputsand outcomes (see, e.g., Gray, 1968). Todo so, subjects resort to a mixed strategy:for any given input they generate a dis-tribution of different predictions. Thesepredictions reflect the fact that any oneinput is followed by different outcomes ondifferent trials. Evidently, the rules ofprediction are different in the two para-digms, although representativeness is in-volved in both. In the feedback paradigm,subjects produce response sequences whichrepresent the entire pattern of associationbetween inputs and outcomes. In thesituations explored in the present paper,subjects select the prediction which bestrepresents their impressions of each indi-vidual case. The two approaches lead todifferent violations of the normative rule:the representation of uncertainty througha mixed strategy in the feedback paradigmand the discarding of uncertainty throughprediction by evaluation in the presentparadigm.

CONFIDENCE AND THE ILLUSIONOF VALIDITY

As demonstrated in the preceding sec-tions, one predicts by selecting the outcome

Page 13: PSYCHOLOGICAL REVIEWnas.psych.uidaho.edu/~ad.uidaho.edu\bdyre/psyc562...sentativeness heuristic. In this paper, we explore the rules that determine intuitive predictions and judg-ments

ON THE PSYCHOLOGY OF PREDICTION 249

that is most representative of the input.We propose that the degree of confidenceone has in a prediction reflects the degreeto which the selected outcome is more repre-sentative of the input than are other out-comes. A major determinant of represen-tativeness in the context of numerical pre-diction with multiattribute inputs (e.g.,score profiles) is the consistency, or co-herence, of the input. The more consistentthe input, the more representative thepredicted score will appear and the greaterthe confidence in that prediction. Forexample, people predict an overall B aver-age with more confidence on the basis of Bgrades in two separate introductory coursesthan on the basis of an A and a C. Indeed,internal variability or inconsistency of theinput has been found to decrease confidencein predictions (Slovic, 1966).

The intuition that consistent profilesallow greater predictability than incon-sistent profiles is compelling. It is worthnoting, however, that this belief is in-compatible with the commonly appliedmultivariate model of prediction (i.e., thenormal linear model) in which expectedpredictive accuracy is independent ofwithin-profile variability.

Consistent profiles will typically be en-countered when the judge predicts fromhighly correlated scores. Inconsistent pro-files, on the other hand, are more frequentwhen the intercorrelations are low. Be-cause confidence increases with consistency,confidence will generally be high when theinput variables are highly correlated. How-ever, given input variables of stated va-lidity, the multiple correlation with thecriterion is inversely related to the correla-tions among the inputs. Thus, a para-doxical situation arises where high inter-correlations among inputs increase con-fidence and decrease validity.

To demonstrate this effect, we requiredsubjects to predict grade point average onthe basis of two pairs of aptitude tests.Subjects were told that one pair of tests(creative thinking and symbolic ability)was highly correlated, while the other pairof tests (mental flexibility and systematicreasoning) was not correlated. The scores

they encountered conformed to these ex-pectations. (For half of the subjects thelabels of the correlated and the uncorre-lated pairs of tests were reversed). Sub-jects were told that "all tests were foundequally successful in predicting college per-formance." In this situation, of course, ahigher predictive accuracy can be achievedwith the uncorrelated than with the corre-lated pair of tests. As expected, however,subjects were more confident in predictingfrom the correlated tests, over the en-tire range of predicted scores (t = 4.80,df = 129, p < .001). That is, they weremore confident in a context of inferiorpredictive validity.

Another finding observed in many pre-diction studies, including our own, is thatconfidence is a J-shaped function of the pre-dicted level of performance (see Johnson,1972). Subjects predict outstandingly highachievement with very high confidence, andthey have more confidence in the predic-tion of utter failure than of mediocreperformance. As we saw earlier, intuitivepredictions are often insufficiently regres-sive. The discrepancies between predic-tions and outcomes, therefore, are largestat the extremes. The J-shaped confidencefunction entails that subjects are mostconfident in predictions that are most likelyto be off the mark.

The foregoing analysis shows that thefactors which enhance confidence, forexample, consistency and extremity, areoften negatively correlated with predictiveaccuracy. Thus, people are prone to ex-perience much confidence in highly falliblejudgments, a phenomenon that may betermed the illusion of validity. Like otherperceptual and judgmental errors, theillusion of validity often persists even whenits illusory character is recognized. Wheninterviewing a candidate, for example,many of us have experienced great con-fidence in our prediction of his futureperformance, despite our knowledge thatinterviews are notoriously fallible.

INTUITIONS ABOUT REGRESSION

Regression effects are all about us. Inour experience, most outstanding fathers

Page 14: PSYCHOLOGICAL REVIEWnas.psych.uidaho.edu/~ad.uidaho.edu\bdyre/psyc562...sentativeness heuristic. In this paper, we explore the rules that determine intuitive predictions and judg-ments

250 DANIEL KAHNEMAN AND AMOS TVERSKY

have somewhat disappointing sons, brilliantwives have duller husbands, the ill-adjustedtend to adjust and the fortunate are even-tually striken by ill luck. In spite of theseencounters, people do not acquire a propernotion of regression. First, they do notexpect regression in many situations whereit is bound to occur. Second, as any teacherof statistics will attest, a proper notion ofregression is extremely difficult to acquire.Third, when people observe regression,they typically invent spurious dynamicexplanations for it.

What is it that makes the concept ofregression counterintuitive and difficultto acquire and apply? We suggest that amajor source of difficulty is that regressioneffects typically violate the intuition thatthe predicted outcome should be maximallyrepresentative of the input information.5

To illustrate the persistence of nonre-gressive intuitions despite considerable ex-posure to statistics, we presented the fol-lowing problem to our sample of graduatestudents in psychology:

A problem of testing. A randomly selected indi-vidual has obtained a score of 140 on a standard IQtest. Suppose than an IQ score is the sum of a"true" score and a random error of measurementwhich is normally distributed.Please give your best guess about the 95% upperand lower confidence bounds for the true [Q of thisperson. That is, give a high estimate such that youare 95% sure that the true IQ score is, in fact,lower than that estimate, and a low estimate suchthat you are 95% sure that the true score is in facthigher.

In this problem, the respondents weretold to regard the observed score as thesum of a "true" score and an error com-ponent. Since the observed score is con-siderably higher than the population mean,it is more likely than not that the errorcomponent is positive and that this indi-vidual will obtain a somewhat lower scoreon subsequent tests. The majority of sub-jects (73 of 108), however, stated confidence

6 The expectation that every significant particle ofbehavior is highly representative of the actor'spersonality may explain why laymen and psychol-ogists alike are perenially surprised by the negligiblecorrelations among seemingly interchangeable mea-sures of honesty, of risk taking, of agressiou, and ofdependency (Mischel, 1968).

intervals that were symmetric around 140,failing to express any expectation of re-gression. Of the remaining 35 subjects, 24stated regressive confidence intervals and 11stated counterregressive intervals. Thus,most subjects ignored the effects of un-reliability in the input and predicted as ifthe value of 140 was a true score. Thetendency to predict as if the input informa-tion were error free has been observedrepeatedly in this paper.

The occurrence of regression is some-times recognized, either because we discoverregression effects in our own observationsor because we are explicitly told that re-gression has occurred. When recognized, aregression effect is typically regarded as asystematic change that requires substan-tive explanation. Indeed, many spuriousexplanations of regression effects havebeen offered in the social sciences.6 Dy-namic principles have been invoked to ex-plain why businesses which did excep-tionally well at one point in time tend todeteriorate subsequently and why trainingin interpreting facial expressions is beneficialto trainees who scored poorly on a pretestand detrimental to those who did best.Some of these explanations might not havebeen offered, had the authors realized thatgiven two variables of equal variances, thefollowing two statements arc logicallyequivalent: (a) Y is regressive with respectto X; (b) the correlation between Y andX is less than unity. Explaining regres-sion, therefore, is tantamount to explainingwhy a correlation is less than unity.

As a final illustration of how difficult itis to recognize and properly interpret re-gression, consider the following questionwhich was put to our sample of graduatestudents. The problem described actuallyarose in the experience of one of the authors.

A problem of training. The instructors in a flightschool adopted a policy of consistent positive rein-forcement recommended by psychologists. Theyverbally reinforced each successful execution of aflight maneuver. After some experience with thistraining approach, the instructors claimed thatcontrary to psychological doctrine, high praise for

6 For enlightening discussions of regression falla-cies in research, see, for example, Campbell (1969)and Wallis and Roberts (1956).

Page 15: PSYCHOLOGICAL REVIEWnas.psych.uidaho.edu/~ad.uidaho.edu\bdyre/psyc562...sentativeness heuristic. In this paper, we explore the rules that determine intuitive predictions and judg-ments

ON THE PSYCHOLOGY OF PREDICTION 251

good execution of complex maneuvers typically re-sults in a decrement of performance on the next try.What should the psychologist say in response?

Regression is inevitable in flight maneu-vers because performance is not perfectlyreliable and progress between successivemaneuvers is slow. Hence, pilots who didexceptionally well on one trial are likely todeteriorate on the next, regardless of theinstructors' reaction to the initial success.The experienced flight instructors actuallydiscovered the regression but attributed itto the detrimental effect of positive rein-forcement. This true story illustrates asaddening aspect of the human condition.We normally reinforce others when theirbehavior is good and punish them whentheir behavior is bad. By regression alone,therefore, they are most likely to improveafter being punished and most likely todeteriorate after being rewarded. Conse-quently, we are exposed to a lifetimeschedule in which we are most often re-warded for punishing others, and punishedfor rewarding.

Not one of the graduate students whoanswered this question suggested that re-gression could cause the problem. Instead,they proposed that verbal reinforcementsmight be ineffective for pilots or that theycould lead to overconfidence. Some stu-dents even doubted the validity of theinstructors' impressions and discussed pos-sible sources of bias in their perception ofthe situation. These respondents hadundoubtedly been exposed to a thoroughtreatment of statistical regression. Never-theless, they failed to recognize an instanceof regression when it was not couched inthe familiar terms of the height of fathers

and sons. Evidently, statistical trainingalone does not change fundamental intui-tions about uncertainty.

REFERENCES

BEYTH, R. Man as an intiutive statistician: Onerroneous intuitions concerning the descriptionand prediction of events. Unpublished master'sthesis (in Hebrew), The Hebrew University,Jerusalem, 1972.

CAMPBELL, D. T. Reforms as experiments. Ameri-can Psychologist, 1969, 24, 409-429.

GRAY, C. W. Predicting with intuitive correlations.Psychonomic Science, 1968, 11, 41-42.

GUILFORD, J. P. Psychometric methods. New York:McGraw-Hill, 1954.

JOHNSON, D. M. Systematic introduction to thepsychology of thinking. New York: Harper &Row, 1972.

KAHNEMAN, D., & TVERSKY, A. Subjective prob-ability: A judgment of representativeness. Cog-nitive Psychology, 1972, 3, 430-454.

MISCHEL, W. Personality and assessment. NewYork: Wiley, 1968.

Sl.ovic, P. Cue consistency and cue utilization injudgment. American Journal of Psychology, 1966,79, 427-434.

SLOVIC, P., & LICHTENSTEIN, S. Comparison ofBayesian and regression approaches to the studyof information processing in judgment. Organ-izational Behavior and Hitman Performance, 1971,6, 649-744.

STEVENS, S. S., & GREENBAOM, H. B. Regressioneffect in psychophysical judgment. Perception &Psychophysics, 1966, 1, 439-446.

TVEKSKY, A., & KAHNKMAN, D. Belief in the lawof small numbers. Psychological Bulletin, 1971,76, 105-110.

TVERSKY, A., & KAHNEMAN, D. Availability: Aheuristic for judging frequency and probability.Cognitive Psychology, 1973, in press.

WALLIS, W. A., & ROBERTS, H. V. Statistics: Anew approach. New York: The Free Press, 1956.

WOODWORTH, R. S. Experimental psychology. NewYork: Holt, 1938.

(Received November 8, 1972)