xlm-22-2-510



Journal of Experimental Psychology: Learning, Memory, and Cognition
1996, Vol. 22, No. 2, 510-524
Copyright 1996 by the American Psychological Association, Inc. 0278-7393/96/$3.00

The Learning Curve as a Metacognitive Tool

Robert A. Josephs, David H. Silvera, and R. Brian Giesler
University of Texas at Austin

A series of 4 experiments demonstrated that when practicing for a test of problem solving, recognition and selection of a particular stopping signal depended on the within-set variability (statistical noise) of a problem set. When noise was high, participants used a nondiagnostic stopping signal based on the initial size of the problem set. However, when noise was low, access to evidence of learning was possible, resulting in the operation of a stopping signal based on learning curve characteristics. By blocking access to local segments of the learning curve through the use of variability masking, it was demonstrated that participants were sensitive to changes throughout the entire curve. In spite of the long history associated with the psychological study of the learning curve, these results provide the first demonstration of the learning curve as a source of metacognitive judgment and regulation.

On the face of it, preparation for an exam appears to be a relatively simple task. The sensible strategy should involve little more than continuing in one's study efforts until the material is learned. Unfortunately, predictions of performance and judgments of learning are often weakly correlated with actual performance (e.g., Glenberg, Sanocki, Epstein, & Morris, 1987; Maki & Berry, 1984; Maki & Serra, 1992; Weaver, 1990). A number of explanations have been given to account for this poor relationship, including the use of general, undifferentiated domain familiarity in place of text-specific knowledge (e.g., Glenberg et al., 1987), the underutilization of normative item difficulty coupled with an overemphasis on individuating factors (e.g., Nelson, Leonesio, Landwehr, & Narens, 1986), the lack of sensitivity in the comprehension measure (e.g., Weaver, 1990), the substitution of valid preparation cues with computationally simpler but often misleading indicators of preparation (e.g., Josephs, Giesler, & Silvera, 1994; Josephs & Hahn, 1995), lack of prior knowledge (e.g., Maki & Serra, 1992; Josephs & Hahn, 1995), and the nature of the predictor (e.g., feeling of knowing; Costermans, Lories, & Ansay, 1992; Metcalfe, 1986; Metcalfe & Wiebe, 1987).

This disconnection between judgment and performance has obvious implications for education and human performance. Inadequate preparation can lead to poor performance; overpreparation is inefficient and can lead to poor performance on

    Robert A. Josephs, David H. Silvera, and R. Brian Giesler,Department of Psychology, University of Texas at Austin. David H.Silvera is now at the Department of Psychology, University of Tr


LEARNING CURVE AS STOPPING SIGNAL 511

individual level of aspiration and motivation), we propose that the variability in item difficulty within a problem set (in essence, the statistical noise within the set that is created by between-item difficulty) plays a central role in determining the nature of the stopping signal used to terminate study effort. When noise is low, individuals presumably are afforded the opportunity to access changes in their performance (i.e., they have access to their learning curves), making it possible to experimentally investigate the relationship between learning curve characteristics and amount of study effort. Effort should cease soon after the learning curve flattens out, signaling to the learner a lack of marginal gain associated with further study. This seems reasonably intuitive.¹ However, we hasten to point out that in spite of the long history associated with the psychological study of the learning curve (e.g., Mazur & Hastie, 1978), it appears that investigations of the learning curve (occasionally referred to as the practice curve) have focused exclusively on its function as a descriptive device used to illustrate the relationship between practice and performance. We are aware of no previous research that investigated the explicit metacognitive functions of the learning curve (in essence, a person's awareness of his or her learning curve, and the implications of this awareness for judgment and behavior), nor are we aware of research into the conditions under which access to the learning curve is promoted or obscured. Nevertheless, this lack of empirical and theoretical precedent is quite understandable, given the relatively brief history associated with the study of human metacognition.

A Stopping Signal Based on Problem Set Size

Under conditions in which learning curve access is blocked as a result of a high degree of within-set variability, an

individual presumably initiates a search for auxiliary stopping signals. An array of candidates quickly comes to mind. For example, a person can use social comparison information ("I've studied at least as hard as my classmates"), individual base-rate information ("I've studied as much as I usually study"), externally defined context information ("I've completed all of the homework and problem sets assigned by the professor"), information that defies categorization ("I studied until I fell asleep"), and so on. The less than ideal nature of many of these strategies is attested to by the large individual differences found in people's abilities to acquire and master the declarative and procedural knowledge necessary to perform well on such exams (e.g., Lehman, Lempert, & Nisbett, 1988; Staszewski, 1988). For example, for some people completion of an assigned problem set may be more than sufficient to allow for good performance on the exam, whereas for others, a great deal more practice may be needed (e.g., Nelson, 1993).

When a test situation becomes familiar and thus predictable (e.g., weekly quizzes), a study strategy that is initially nondiagnostic may become appropriate. For example, a social comparison strategy can be adjusted as the student becomes aware of his or her performance, relative to the rest of the class ("Based on the last exam, I probably need to study more than my classmates if I want to perform as well as they perform"). However, there are numerous and important test situations that do not yield this familiarity advantage (e.g., standardized

tests such as the Scholastic Aptitude Test or Graduate Record Examination, final exams, entrance exams, and so on). In these situations, we argue that learning curve information is the ideal information source for successful and efficient study effort. Yet, as we hope to demonstrate, access to the learning curve depends on an infrequently occurring set of problem set characteristics, thus opening the door to a host of strategies whose diagnostic value is questionable.

Although any one of a number of these strategies may be used when learning curve access is not possible, we propose that the nature of many study situations is such that one strategy in particular may enjoy disproportionate use. The fact that many study situations are characterized by a well-defined and discrete problem space (e.g., a problem set assigned prior to a final exam, a series of homework assignments assigned prior to a midterm exam, a study manual used in preparation for a standardized exam) suggests that the degree of completion of a particular set of information can provide a readily apparent and computationally simple indicator of preparation. To the extent that the initial information set is equated with optimal preparation, the "dent" placed in the size of the set can be used to judge progress in skill acquisition. Although this implies the use of a proportion heuristic in which preparation is gauged as a function of the proportion of the initial problem set completed, we do not believe that most individuals engaged in study behavior have an implicit rule that would generate an invariant proportion-to-judgment relationship. Not only is this unlikely because of the cognitive effort involved in such a calculation (e.g., Payne, Bettman, & Johnson, 1990), there is nothing in our experience to suggest the existence of such a rule. Rather, it is more likely that initial set size serves to anchor subsequent judgments of preparedness.
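The proportion heuristic described above can be made concrete with a toy sketch (the function name and numbers are ours, for illustration only): under a strict proportion rule, an identical amount of practice yields very different judged preparedness depending only on the initial size of the set.

```python
def judged_preparedness(solved, initial_set_size):
    """Toy proportion heuristic: preparedness is gauged by the 'dent'
    made in the initial problem set, not by actual learning."""
    return solved / initial_set_size

# The same effort (20 problems solved) under the two set sizes used
# in these experiments:
small = judged_preparedness(20, 25)   # 0.8 of the small set completed
large = judged_preparedness(20, 200)  # only 0.1 of the large set
```

The eightfold difference in judged preparedness for identical practice is exactly the kind of set-size anchoring the experiments test, although, as the text notes, real judges likely treat set size as a loose anchor rather than computing an invariant proportion.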

Why should set size serve to anchor preparedness judgments? Although admittedly speculative, it may be that the perception of privileged access attributed by the test taker to the problem set provider (e.g., the professor or the study guide publisher) bolsters the test taker's faith in anchoring onto problem set size as a metacognitive control cue ("The professor knows what will be on the test. Furthermore, I have no reason to distrust her. Therefore, I have confidence in the efficacy of preparing for the test by completing the study materials she has provided for me. If I discharge my responsibilities to her by studying hard using the materials she has provided, I will be rewarded with a high grade."). Indeed, this logic is quite similar to Lerner's (1980) notion of the belief in a just world, in which rewards and punishments are believed to

¹ The term learning curve as used throughout this article is defined not as the formal mathematical function that best captures an individual's practice-performance relationship, but rather in the broad sense as the progressive improvement and subsequent mastery (decrease in marginal improvement) that characterizes the typical practice-performance relationship. Thus, when we state that individuals demonstrate access to and use of their learning curves, we assume only an awareness of improvement and a subsequent leveling-off in performance, leading to a regulation of behavior based on this awareness. We further assume that most people lack the ability to perform the complex mathematical computations necessary to afford them access to the arithmetic functions that formally describe their learning processes.


be consistently and justly meted out to those who deserve them). The normative status of the use of set size is likely to vary on a case-by-case basis, determined by individual and situational factors. Thus, unlike the strong rhetoric regarding the nonnormative status of other anchoring effects (e.g., Poulton, 1968; Slovic, Fischhoff, & Lichtenstein, 1982; Tversky & Kahneman, 1974), we prefer to adopt a wait-and-see attitude, allowing objective measures of performance to be the ultimate arbiter of what is normative and what is not.

Overview of Experiments

In Experiment 1, we explored the effects that the within-set variability of a practice set has on the selection of a stopping rule. When variability is low, we suspected that problem solving effort would be influenced by learning curve characteristics. Specifically, we hypothesized that problem solving effort should continue until the participant's performance curve flattens out. When variability is high, we expected effort to be determined by the size of the problem set. The larger the initial

set of practice problems, the more problems participants should solve. In Experiment 2, we tested the hypothesis that within-set variability would influence judgments of exam preparedness when solving a predetermined quantity of practice problems. We predicted that judgments of preparedness and predictions of future performance would be determined by the size of the problem set under conditions of high within-set variability. No set-size effects were predicted under low variability conditions, presumably because of the dominance of learning curve information. In Experiment 3, we examined the contributions to metacognitive functioning of the two major components of an anagram-generated learning curve: the descending, or improvement, limb and the flat, or mastery, limb. By independently masking access to each limb, we were able to assess each limb's unique contribution to metacognitive regulation. In Experiment 4, we tested the hypothesis that the heuristic set-size strategy would persist in spite of the obvious lack of diagnostic value of the initial size of the study set.

Experiment 1

Given the lack of empirical precedent to support our primary hypotheses, this first experiment was necessarily exploratory in nature. Participants were told that an anagram solving session they were about to begin served as practice for an anagram test to be administered later in the experimental session. They were instructed to continue solving anagrams until they felt prepared enough to do well on the upcoming test. In a situation such as this, the appropriate stopping signal should be based on a participant's assessment of his or her performance as reflected in a leveling-off of problem solving performance, in essence, the type of information awareness of one's learning curve would provide. Unfortunately, awareness of the learning curve may require a series of formidable mental computations. Any variability in the difficulty of a study set has to be factored out to obtain a valid assessment of learning. So, for example, if several moderate-difficulty problems are followed by several easier problems, which are, in turn, followed by several very difficult problems, an improvement in problem

solving performance attributable to learning cannot be calculated without controlling for and factoring out the variability of difficulty of the problems contained within the set.

In Experiment 1, we sought to manipulate the ease with which participants would be made aware of learning curve information. Participants either solved a series of anagrams that were relatively uniform in difficulty or that varied considerably in difficulty. The uniform set was composed exclusively of five-letter anagrams, whereas the variable set was composed of an equivalent mixture of four-, five-, and six-letter anagrams. The logic behind this manipulation was straightforward: performance information (e.g., one's learning curve) should be relatively accessible in the uniform anagram set because of the relative lack of variability in the difficulty of the anagram set. The lack of between-anagram variability in difficulty should give participants a chance to access systematic changes in their performance and regulate their problem solving efforts as per these changes.

However, the between-anagram variability in the variable set should make access to the learning curve difficult. Changes in problem solving performance as a result of learning should be overwhelmed by the variability in problem difficulty. In essence, noise should surpass signal. In this case, we hypothesized that participants would be forced to use set-size information to regulate their problem solving effort. Thus, we predicted that the initial size of the problem set would determine time and effort spent in the skill acquisition process. To this end, the number of anagrams initially placed before participants was manipulated, such that some participants were given a set of 25 anagrams and others were given 200 anagrams.

We predicted that participants in the variable-set condition would be influenced by the initial size of the problem set, such that participants in the 200-anagram set condition would solve more anagrams than participants in the 25-anagram set condition, in spite of our efforts to make participants aware of the arbitrary methods used to determine the size of these anagram sets. However, we predicted that participants in the uniform-set condition would show a minimal influence of set size. Rather, we entertained the possibility that these participants would cease solving anagrams at about the point at which their learning curves began to flatten out.

Method

Participants. Ninety-five undergraduates (45 men and 50 women) at the University of Texas at Austin participated in exchange for course credit.

Design and procedure. The experimental design took the form of a 2 (set size) x 2 (uniformity of problem set) between-subjects factorial design. Participants were either given a 25-anagram set or a 200-anagram set. To emphasize the arbitrary nature of the size of the problem set, the experimenter left the laboratory at this point, explaining that "I have to go into the next room to grab a bunch of anagrams for you to work on." The experimenter returned to the laboratory with a stack of either 25 or 200 anagrams, depending on the experimental condition. Anagrams were handwritten on index cards, with 1 anagram per card. The experimenter kept track of problem solving performance by the use of a hand-held stopwatch. Participants were not told how many anagrams were contained in


each set, which indicated to participants the ostensibly capricious method by which the quantity of anagrams was selected. To add emphasis to the arbitrary nature of the size of the problem set, the experimenter stated that "We have a virtually unlimited supply of these anagrams, so if you finish the ones in front of you, let me know and I'll go grab another bunch."

Each problem set was composed of either all five-letter anagrams or a randomized set of an equal number of four-, five-, and six-letter anagrams. Participants were told that when they felt prepared to perform well on a test of their anagram solving ability, they were to stop work, at which point they would be given a short test of their problem solving ability. Actually, such a test was not administered. Rather, when participants indicated that they felt sufficiently prepared, they were debriefed and dismissed from the experiment.

Results

Performance was defined simply as the time required to solve each anagram. To test the efficacy of the anagram-variability manipulation, we conducted an analysis of variance (ANOVA) to determine whether the overall performance variation in the variable set was greater than that in the uniform set. For each participant, a least-squares regression line was fitted to his or her performance plot (the time required to solve each anagram was plotted against anagram order), with the ensuing residual indicating variability. These residual scores were averaged within condition, and, as we expected, problem solving performance was significantly more variable in the variable-difficulty set (M = 35.3 s) than in the uniform-difficulty set (M = 25.9 s), F(1, 91) = 5.15, p < .05, MSE = 122.62. The initial size of the anagram set had an overall main effect on number of anagrams solved, with an average of 19.2 anagrams solved in the 25-anagram set and an average of 25.7 anagrams solved in the 200-anagram set, F(1, 91) = 4.42, p
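The per-participant variability index just described (a least-squares line fitted to the solution-time-by-order plot, with residuals indicating variability) can be sketched as follows. This is our minimal reconstruction, not the authors' code, and we assume the residual summary is a mean absolute residual about the fitted line:

```python
def residual_variability(solve_times):
    """Fit a least-squares line to solution time (s) vs. anagram order
    (1, 2, ...) and return the mean absolute residual, a sketch of the
    per-participant variability index averaged within conditions."""
    n = len(solve_times)
    xs = range(1, n + 1)
    mx = sum(xs) / n
    my = sum(solve_times) / n
    # Ordinary least squares: slope = cov(x, y) / var(x)
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, solve_times))
             / sum((x - mx) ** 2 for x in xs))
    intercept = my - slope * mx
    return sum(abs(y - (slope * x + intercept))
               for x, y in zip(xs, solve_times)) / n

# Toy participants (invented data): one smooth learner, one noisy one
smooth = [90, 75, 62, 52, 44, 38, 34, 32]
noisy = [90, 30, 110, 25, 95, 20, 80, 15]
```

As the text leads one to expect, the noisy series yields a much larger index than the smooth one, mirroring the variable- versus uniform-difficulty contrast in the manipulation check.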


(those in the uniform-set condition) were expected to solve an equivalent number of anagrams after dy/dx = 1, and thus were predicted to show no effect of set size. However, participants in the variable-set condition were expected to show a relatively large effect of set size, such that those in the 25-anagram set condition were predicted to solve fewer anagrams than those in the 200-anagram set condition, consistent with their hypothesized lack of access to their learning curves. As Figure 1 shows, the data conformed quite well to our expectations. A 2 (set size) x 2 (intraset variability) ANOVA revealed a statistically significant interaction between set size and intraset variability, F(1, 85) = 6.44, p < .05, MSE = 208.93. A set of planned contrasts revealed that set size had no effect among uniform-set participants (Ms = 6.2 in the 25-anagram set, 7.2 in the 200-anagram set), but a large effect among participants in the variable-set condition. Among participants in the variable-set condition, those in the 25-anagram set condition solved, on average, 3.2 anagrams beyond the point at which dy/dx = 1, whereas those in the 200-anagram set condition solved, on average, 13.1 anagrams, F(1, 41) = 13.83, p < .05, MSE = 208.93.

Discussion

    In Experiment 1, we sought to explore the role of problemset variability in the selection of stopping signals, and byimplication, the importance of problem set construction foracademic performance. When the internal variability of a

[Figure 1. Number of anagrams solved post dy/dx = 1 as a function of set size and intraset variability.]

problem set was low, problem solving effort was unaffected by set size. Rather, participants appear to have been guided by learning curve characteristics. On average, participants who were working under low-variability conditions discontinued their problem solving efforts soon after the point at which their learning curves began to flatten out. When variability was high, problem solving effort was strongly influenced by a heuristic strategy based on the initial size of the problem set. Although the degree to which this heuristic permeates real world situations is not known, the conditions under which set-size effects operate suggest the more general question of the extent to which acquisition set characteristics prevent access to normative stopping signals, and thus open the door to a host of low quality heuristic stopping strategies, of which a set-size effect is but one of many.

The importance of problem set variability for successful skill acquisition seems self-evident, yet has not been a topic of investigation for reasons discussed earlier. The results of Experiment 1 suggest that the extent to which a heuristic and potentially misleading stopping strategy is employed depends on the amount of statistical noise created by a problem set. When noise was low, participants apparently were aware of changes in their performance and appeared to use this information appropriately. Although no poststudy performance measures were incorporated into the design of Experiment 1, the implications for performance are clear. Use of a misleading stopping signal can result in either premature termination of a study session or a needlessly long study session, thus robbing other tasks of the time they deserve.

Finally, although the learning curve as a descriptive function has been studied since the earliest days of modern experimental psychology (e.g., Bryan & Harter, 1897), the use of the learning curve as a metacognitive control cue has not, to the best of our knowledge, been the focus of psychological inquiry. Thus, the fact that Experiment 1 provides evidence suggestive of participants' ability to recognize and correctly apply the information communicated by the learning curve is both heartening and exciting.

Although the internal variability of the two problem sets used in Experiment 1 was demonstrably different, the fact remains that this difference in variability was produced at a cost. The sets differed in content, with the variable set containing anagrams of a type and difficulty that distinguished it from the five-letter anagrams exclusively composing the uniform set. Thus, it is conceivable that the behavioral differences that resulted from the two sets may have been due to a characteristic other than intraset variability. To address the possibility of a confounding variable that may have arisen as a result of this procedure, a second experiment was conducted in which intraset variability was manipulated in a manner that did not sacrifice across-set anagram equivalence.

Experiment 2

    In Experiment 2, all participants, regardless of condition,were supplied with the same anagrams. To achieve thisequivalence, a separate group of pretest participants was usedto determine the mean time to solution for each anagram. Wethen used this information to construct two anagram sets that


took on the following characteristics: In one set, the anagrams were ordered from most to least difficult, on the basis of their pretest means. In the other set, these same anagrams were randomly sorted.

    We hypothesized that as a result of the directional nature ofthe variance in the descending anagram set, participants in thiscondition would attribute improvements in performance tolearning and would thus ignore set size as a control cue. Theprogressive decrease in normative difficulty associated withthis set should mimic and amplify a participant's naturallearning effects. It is important to note that although thismanipulation was designed to result in an improvement inperformance due in part to factors other than learning, thephenomenal experience of learning was expected to remain.Participants were told that the anagrams selected for inclusionin the problem set were of equivalent difficulty. Thus, aware-ness of a marginal decrease in learning due to practice shouldcommunicate the same metacognitive awareness, regardless ofthe ersatz nature of the performance curve. Similar to thevariable-set condition used in Experiment 1, the greaterintraset variance of the random-sort set was predicted to blockaccess to learning curve information, thus promoting the use ofset-size information.

One other important difference existed between Experiments 1 and 2. In Experiment 1, recall that participants were allowed to continue working until they felt prepared enough to perform well on an upcoming exam. In Experiment 2, all participants were stopped after completing a predetermined quantity of problems (20 anagrams) and were then given a short questionnaire asking for judgments of preparedness and predictions of future performance. On the basis of the idea that preparedness judgments underlie metacognitive regulation, we expected that the patterns of data obtained in this experiment would mimic those obtained in Experiment 1. In the random-sort condition, judgments were predicted to depend on initial set size, with judgments of preparedness and predictions of upcoming performance being higher in the 25-anagram condition than in the 200-anagram condition. We expected no judgment differences in the descending-set condition, as the nature of the anagram sequence should allow participants to judge preparedness on the basis of the shape of their performance curves.

Method

Participants. Forty-nine undergraduates (22 men and 27 women) at the University of Texas at Austin participated in exchange for course credit.

Pretest to determine anagram difficulty. We pretested a set of 60 five- and six-letter anagrams on an independent sample of 38 undergraduates and obtained mean solution times for each anagram. For each pretest participant, within-set anagram order was randomized. Average solution times with associated standard deviations were plotted in order of descending magnitude of solution time, and from this plot a set of 20 anagrams was selected with the following characteristics: (a) The 20 anagrams selected for inclusion ranged in average solution time from a high of 106 s to a low of 43 s. (b) Anagrams 1-15 were selected such that each successive anagram had an associated mean solution time that was approximately 4 s faster than the previous anagram (as one would expect, there was some slight variation around this average.

For example, Anagram 1 had a mean solution time of 106.0 s, and Anagram 2 had a mean solution time of 101.8 s). (c) The last five (Anagrams 16-20) all had mean solution times that were within 1 s of each other and were, as a group, 5.5 s faster than the last anagram in the descending set (Anagram 15). This setup was designed to mimic a typical exponential decay curve (as in Experiment 1, we found that a decay function expressed as y = a·e^(−b(x − p)) + c resulted in a good fit to these data). (d) We sought to minimize variability within the problem set by excluding anagrams with large standard deviations whenever possible (e.g., if 3 anagrams from the initial set of 60 each had mean solution times of approximately 77 s, the one with the lowest standard deviation was selected).

It was the case that quite a few of the six-letter anagrams generated faster mean solution times than the five-letter anagrams. This was partly a function of the number of solutions for a given anagram as well as a function of the particular letter order composing a given anagram (letter order was fixed prior to the pretest). So, for example, the most difficult of the 20 anagrams chosen was a five-letter anagram, whereas among the easiest group of 5 anagrams (Anagrams 16-20), 2 were six-letter anagrams. This mixture gave the appearance of uniform difficulty and increased the likelihood that the attempt to deceive participants into believing that the anagram set was uniform in difficulty would meet with success.

Design and procedure. The experimental design took the form of a 2 (set size) x 2 (descending vs. variable set) factorial. Participants were randomly assigned to a set-size condition (either 25 or 200 anagrams) and a variability condition. Participants were told that they were participating in a study of learning, in which they would be given a set of practice problems and would subsequently be tested to ascertain how well they learned to solve these types of problems. Participants were told that in preparation for the upcoming problem solving test they were being given a set of practice problems to work on and were to continue working until they were told that time had expired. As in Experiment 1, participants were informed of the existence of an abundance of additional practice problems in an adjoining room, such that if they finished the set in front of them before time expired, the experimenter would supply them with additional anagrams.

Furthermore, participants were told that the anagrams composing the practice set were all equivalent in difficulty and were also equivalent in difficulty to the anagrams that composed the forthcoming exam. As in Experiment 1, the experimenter, using a stopwatch, timed each participant's anagram performance, supplying a one-letter hint for each 60-s period that the anagram remained unsolved.
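The pretest-based construction of the descending set can be sketched as a greedy walk over the pretest records: step the target mean downward in roughly 4-s decrements and, among anagrams near each target, keep the one with the smallest standard deviation (the minimum-variability criterion above). This is a hypothetical reconstruction; the article states the selection criteria but no procedure, and the records and tolerance below are invented:

```python
def pick_descending_set(pretest, start=106.0, step=4.0, n=15, tol=2.0):
    """Greedy sketch of the selection rule: walk target means downward
    in ~`step`-s decrements and, among pretest records (name, mean s,
    SD s) whose mean lies within `tol` s of the target, keep the one
    with the smallest standard deviation."""
    chosen, pool = [], list(pretest)
    target = start
    for _ in range(n):
        candidates = [r for r in pool if abs(r[1] - target) <= tol]
        if not candidates:
            break  # no anagram close enough to this target mean
        best = min(candidates, key=lambda r: r[2])  # lowest SD wins
        chosen.append(best)
        pool.remove(best)
        target -= step
    return chosen
```

Given two near-106-s candidates, the sketch keeps the steadier one, then descends in approximately 4-s steps, reproducing the shape of the published set without its actual items.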

Upon completing the 20th anagram, the experimenter informed the participant that time had expired. The experimenter then placed the judgment questionnaire in front of the participant and asked the participant to complete the questionnaire before taking the exam. The questionnaire consisted of two questions, each followed by a 15-point Likert scale. Question 1 was "How prepared are you for the upcoming exam?", with the rating scale anchored at not at all prepared and extremely well prepared. Question 2 was "How do you think you will perform on the upcoming exam?", with the rating scale anchored at poorly and extremely well. After completion of the questionnaire, the experimenter informed the participant that the experiment was over. The participant was then debriefed, probed for suspicion, and dismissed.

Results

Prior to debriefing, the experimenter asked participants to indicate which, if any, aspect of the experimental procedure may have produced feelings of suspicion or incredulity. One participant indicated that he was skeptical of the occurrence of


the upcoming exam. He was dropped from all forthcoming analyses. None of the remaining participants expressed skepticism or suspicion about any aspect of the experimental procedure.

Intraset variability. To assess the manipulation of intraset variability, an exponential decay function (y = a·e^(−b(x − p)) + c) was used to assess goodness of fit for each participant's performance function. Average variability was calculated for each participant by taking the average absolute difference between raw and fitted values and then calculating the average within each of the two anagram sets. In other words, variability scores reflected average variability about the fitted decay function. In testament to the effectiveness of the variability manipulation, average variability in the descending anagram set was under 19 s (18.3 s), but greater than 30 s (30.1 s) in the random-sort set, F(1, 46) = 5.49, p < .05, MSE = 313.51. In spite of the differences in variability, no differences in average overall problem solving time between the two conditions were found (F < 1), attesting to the equivalent difficulty of the two problem sets.
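Both the variability score just described and the dy/dx = 1 flattening point from Experiment 1 fall out of the decay model y = a·e^(−b(x − p)) + c: the score is the mean absolute deviation of raw times about the fitted curve, and setting the slope magnitude |dy/dx| = a·b·e^(−b(x − p)) to 1 gives x = p + ln(a·b)/b. The sketch below is our illustration (parameter values invented; the article fitted this model but reports no code):

```python
import math

def decay(x, a, b, p, c):
    """Exponential decay model y = a*exp(-b*(x - p)) + c from the text."""
    return a * math.exp(-b * (x - p)) + c

def variability_score(times, a, b, p, c):
    """Average absolute difference between raw solution times and the
    fitted decay values: the per-participant variability measure."""
    return sum(abs(t - decay(i + 1, a, b, p, c))
               for i, t in enumerate(times)) / len(times)

def flattening_trial(a, b, p):
    """Trial at which the slope magnitude falls to 1 s per anagram:
    a*b*exp(-b*(x - p)) = 1  =>  x = p + ln(a*b)/b.
    Requires a*b > 1 (an initially steep curve); c drops out."""
    return p + math.log(a * b) / b
```

For instance, with a = 100, b = 0.5, p = 1, the curve flattens (in this dy/dx = 1 sense) near trial 8.8, and solution times generated exactly by the model produce a variability score of zero.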

Judgment and prediction. We predicted that the influence of set size on judgments of preparedness and predictions of upcoming exam performance would interact with intraset variability. Specifically, set-size effects were predicted to show up only in the random-sort condition. For the question "How prepared are you for the upcoming exam?", this predicted Set Size × Set Variability interaction was supported, F(1, 44) = 6.15, p < .05, MSE = 89.19. As Figure 2 shows, set size had no

Figure 2. Judgments of preparedness as a function of set size and intraset variability.

Figure 3. Predictions of performance as a function of set size and intraset variability.

influence on judgment in the descending size condition, but a large effect in the variable-set condition. A planned comparison revealed that within the random-sort set, participants in the 25-anagram set condition judged themselves to be better prepared, relative to participants in the 200-anagram set condition, F(1, 22) = 6.76, p < .05, MSE = 89.19 (Ms = 12.9 vs. 8.3, respectively).

As Figure 3 shows, the same pattern of results was observed for the prediction question. For the question "How do you think you will perform on the upcoming exam?", the predicted Set Size × Set Variability interaction was supported, F(1, 44) = 4.99, p < .05, MSE = 117.37. Set size only had an influence on prediction in the variable-set condition. A planned comparison revealed that participants' predictions of their upcoming performance were higher in the 25-anagram set condition than in the 200-anagram set condition, F(1, 22) = 5.38, p < .05, MSE = 117.37 (Ms = 10.3 vs. 7.6, respectively).

Discussion

Serving as a conceptual replication of Experiment 1, the internal variability of the problem set was shown to be crucial in determining the operation of learning-curve-based versus set-size-based preparedness judgments. When variability was high, access to information that would have allowed attributions to learning was masked, forcing participants to resort to the size of the problem set as the primary determinant of preparedness judgments and performance predictions. This set-size effect was not evident under conditions of low variability. Rather, when variability was low, there was modest, albeit statistically nonsignificant, support for the idea that participants used the location of the dy/dx = -1 stopping signal to judge preparedness and predict performance.

It should be noted that in spite of the success we had in tricking participants into attributing learning to a bogus learning curve, the ecological validity of this type of situation is probably quite low. It seems highly unlikely that a problem set engineered to create the experience of learning would be coupled with a deceptive instruction claiming within-set problem equivalence. Obviously, the implementation of such a highly contrived and deceitful situation was deemed necessary in order to defeat the possibility of problems introduced as a result of the variability manipulation used in Experiment 1. In the majority of cases in which a learning curve is generated, it is most likely true that the learning curve does reflect legitimate learning, and thus has the potential to be a high-quality source of metacognitive awareness.
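In discrete terms, the dy/dx = -1 stopping rule picks out the first problem at which one more problem no longer buys at least one unit (here, one second) of improvement. A minimal sketch, with illustrative curve parameters of our own choosing rather than values taken from the experiments:

```python
import math

def decay(x, a, b, c):
    # Idealized solution-time curve: a * e^(-b*x) + c seconds on problem x.
    return a * math.exp(-b * x) + c

def stopping_point(a=80.0, b=0.2, c=40.0, max_n=200):
    """First problem index at which the discrete slope reaches dy/dx >= -1,
    i.e., the marginal gain from one more practice problem falls below 1 s."""
    for x in range(max_n):
        slope = decay(x + 1, a, b, c) - decay(x, a, b, c)  # discrete dy/dx
        if slope >= -1.0:
            return x
    return max_n

print(stopping_point())  # 14 with these illustrative parameters
```

A steeper or shallower curve shifts the stopping point accordingly, which is the sense in which the curve itself, not the number of problems remaining, carries the signal.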

This ability to recognize and use diagnostic performance information while resisting the influence of a computationally simple judgment heuristic is important for at least two reasons. First, this finding has important and clear-cut practical significance. It offers a clear prescription for effective and efficient learning. Specifically, it suggests that the challenge for students and others who depend on the acquisition of a particular skill or body of knowledge for performance success lies in the identification and/or construction of conditions under which appropriate signaling information can be accessed, rather than in the use and identification of the signaling information itself. The findings from Experiments 1 and 2 suggest that participants have little problem in recognizing and using learning curve information. Rather, the problem lies in the accessibility of such information. It can be argued further that the responsibility for solving this problem lies not with the problem solver, but rather with those who select and construct the practice sets that are used in the service of skill acquisition.

A second argument for these findings' importance can be made by viewing them against the context provided by the judgment literature. Compared to the large number of findings that are classified as part of the heuristics and biases research tradition, the demonstration of individual competence in the recognition and use of what is apparently high-quality judgment information, while simultaneously resisting a seductively simple but potentially misleading judgmental shortcut (i.e., set size), is indeed unusual. We would suggest that this demonstration provides a counter to the view of human performance that this literature supports (see, e.g., Koehler, in press). This point is discussed in somewhat greater detail later in the article.

In spite of the evidence marshaled thus far demonstrating metacognitive regulation, the strongest case can be made not for use of the learning curve, but rather for use of set size. We can state rather confidently that relative to participants in the variable conditions, participants in the uniform conditions of Experiments 1 and 2 were not influenced by set size. However, the source of their metacognitive regulation remains unclear. Participants clearly were not using set size, but whether or not they defaulted to the use of their learning curves has not yet

been verified. A stronger test of learning curve regulation would require an examination of the behavioral effects of experimentally manipulating various segments of the learning curve. Thus, the primary purpose of the following experiment was to provide a more direct test of the regulatory effects of learning curve characteristics by blocking participants' access to different segments of an engineered learning curve of the type used in the previous experiment.

Because of the shape of the average learning curve as revealed in Experiment 1 (see footnote 3), it appears reasonable, at least for an initial attempt, to conceptually decompose the anagram-generated learning curve into two relatively discrete components: improvement and mastery. The improvement, or descending, limb corresponds to the initial stage, in which problem solving effort results in a more-or-less monotonic decrease in solution time. The mastery, or flat, limb follows the improvement limb and is characterized by solution times that are relatively unaffected by problem solving effort. Recall that we have argued repeatedly that learning curve awareness should result in problem solving effort continuing throughout the descending limb and stopping soon after the flat limb is encountered. By generating a noise mask using anagram variability, it becomes possible to examine the independent effects on behavioral regulation of access to these two limbs of the learning curve. In other words, by masking access to one limb, the effects of the other limb on metacognitive regulation can be isolated. We sought to accomplish this by adopting and modifying the paradigm used in Experiment 2. More specifically, we manipulated intra-anagram variability within each limb, resulting in a 2 × 2 + 1 factorial design, with the presence or absence of the variability mask crossed with each of the two limbs of the learning curve.
A fifth condition was added to test the hypothesis that the perception of mastery generated by the flat limb is sensitive to awareness of improvement, but not necessarily sensitive to magnitude of improvement.

Experiment 3

In this experiment, all participants were presented with a set of 200 anagrams. Participants were told that the anagrams were pretested to ensure that all were roughly equivalent in difficulty. In actuality, each of the five anagram sets was composed of anagrams sequenced to mimic (or mask) a descending and/or flat performance curve. In the no-mask-long condition, both limbs of the learning curve were hypothesized to be accessible. This condition was equivalent in structure to

the descending condition used in Experiment 2, in which a series of progressively easier anagrams was followed by a series of anagrams that were roughly equivalent in difficulty. The no-mask-short condition was identical to the no-mask-long condition, except that the descending limb was 50% longer in the no-mask-long condition. The opposite of the no-mask conditions was the full-mask condition, consisting of an equivalent mixture of four-, five-, and six-letter anagrams presented in random order. The full-mask condition was identical to the variable condition used in Experiment 1. The mastery-mask condition consisted of the same series of progressively easier anagrams used in the no-mask-long condition, followed by the


variable-difficulty anagrams used in the full-mask condition. In this condition, access to the descending limb should be possible, whereas access to the flat limb should be blocked. In the improvement-mask condition, a series of variable-difficulty anagrams (four-, five-, and six-letter) was followed by the same series of equivalent-difficulty anagrams used in both no-mask conditions. In this condition, access to the descending limb should be blocked, whereas access to the flat limb should be possible. In all but the full-mask and no-mask-short conditions, the juncture between improvement and mastery was located at the 16th anagram in the sequence. In the no-mask-short condition, the flat limb began at the 11th anagram (see Figure 4 for a schematic of the five experimental conditions).
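The layout of the five sets can be summarized in code. This is a schematic reconstruction under stated assumptions (the `descending`, `flat`, and `mask_pool` inputs stand in for the normed solution-time orderings described in the Method section; all names are ours), not the materials themselves:

```python
import random

def build_sequence(condition, descending, flat, mask_pool, total=200, seed=0):
    """Schematic layout of the five Experiment 3 anagram sequences.

    descending -- 15 items ordered from hardest to easiest (improvement limb)
    flat       -- 15 items of roughly equivalent difficulty (mastery limb)
    mask_pool  -- variable-difficulty (four- to six-letter) masking items
    """
    rng = random.Random(seed)

    def noise(n):
        return [rng.choice(mask_pool) for _ in range(n)]

    if condition == "no-mask-long":          # both limbs accessible
        head = list(descending) + list(flat)
    elif condition == "no-mask-short":       # improvement limb 50% shorter
        head = list(descending[5:]) + list(flat)
    elif condition == "full-mask":           # noise throughout
        head = []
    elif condition == "mastery-mask":        # flat limb hidden behind noise
        head = list(descending)
    elif condition == "improvement-mask":    # descending limb hidden behind noise
        head = noise(15) + list(flat)
    else:
        raise ValueError(condition)

    # Beyond the structured portion, every condition continues with the
    # variable-difficulty masking anagrams, up to the full set of 200.
    return (head + noise(total - len(head)))[:total]
```

With the 15-item descending limb, the flat limb begins at the 16th anagram; with the shortened limb, at the 11th, matching the junctures described above.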

As in Experiment 1, all participants were instructed to continue to solve anagrams until they felt prepared to do well on an upcoming test of their problem solving ability. In actuality, no such test was given. As soon as participants indicated that they were ready to begin the exam, they were debriefed and dismissed.

Predictions

The experimental design generated the following predictions concerning problem solving effort.

No-mask. Relative to other conditions, both no-mask conditions were predicted to result in the fewest anagrams solved. The relative lack of noise in these conditions should allow participants maximal awareness of improvement and mastery.

The awareness of a lack of marginal improvement associated with the flat limb was predicted to result in participants discontinuing work on the anagram set soon after this limb is encountered. Of course, we expected participants in the no-mask-long condition to solve more anagrams relative to participants in the no-mask-short condition, because of the longer improvement limb associated with the no-mask-long condition. However, subsequent to the juncture between limbs, the point at which participants stop work was predicted not to differ between the no-mask-short and no-mask-long conditions, in spite of the relative difference in the size of the improvement limbs. This prediction squares well with our intuition regarding the phenomenology of learning curve awareness. Whether it takes a person 10 min, 20 min, or 2 hr to reach the point at which practice ceases to yield gains in improvement, the feeling of "finally getting it" should result in a rapid termination of effort. It has been our experience that the hectic pace of modern life (especially college life) fosters little patience in most people for efforts that do not continue to yield tangible gains.

Full mask. The high intra-anagram variability in the full-mask condition was hypothesized to block access to the learning curve signal and thus promote the use of set-size information. The intentionally large set size (n = 200), coupled with the structure of the learning curve as represented in the other experimental conditions (recall that the mastery limb began at Anagram 16 in all but the no-mask-short condition, in

Figure 4. Diagram of experimental conditions, with normed solution times on the ordinate and anagram sequence on the abscissa.


    LEARNING CURVE AS STOPPING SIGNAL 519which th e m astery limb began at Anagram 11) was predicted toresult in average problem solving effort in the full-maskcondition to exceed average problem solving effort in all otherconditions.Mastery mask. We hypothesized that people are sensitiveto both limbs of the learning curve. However, if the predictionsassociated with the no-mask conditions are confirmed, theresults may be explained by the possibility that participants aresensitive only to the flat limb, and are not necessarily display-ing any unique metacognitive effects of the descending limb.The mastery-mask condition was one of two conditions createdto test this possibility. If learning curve awareness is identifiedexclusively with awareness of lack of marginal performancegain due to practice (the flat limb), then the average number ofanagrams solved in the mastery-mask condition should notdiffer from the average number solved in the full-maskcondition. However if the descending limb does serve a usefulmetacognitive function by communicating to p articipants evi-dence of their improvement, we predicted that individualdifferences in participants' reactions upon encountering thenoise mask would result in an average number of anagramssolved that would fall somewhere between the averages yieldedby the no-mask and full-mask conditions. Although Experi-ment 3 was not designed to elucidate the phenomenal experi-ence gene rated by each experimental condition, several likelyreactions to the mastery-mask are believed to underlie thisprediction.

We expected that a certain percentage of participants, as a result of the presumed confidence-shaking effects of the onset of a significant increase in intra-anagram variability, would switch to a set-size signal, thus behaving like participants in the full-mask condition. It also seemed likely that many participants would behave like no-mask participants, terminating effort soon after the noise mask was encountered, resigned to the knowledge that they no longer had access to their learning curves, in essence "cutting bait" once their practice efforts ceased yielding systematic and noticeable benefits. The combined effect of these individual differences would be to yield an average number of anagrams solved that would fall somewhere between the averages yielded by the no-mask and full-mask conditions. We hasten to point out that in spite of our descriptions of possible phenomenological reactions, Experiment 3 was not designed to offer a direct glimpse into the black box. For now, the best we can hope for are proxies, namely, conditional differences in problem solving effort and problem solving variability.

Improvement mask. Problem solving effort in the improvement-mask condition was also predicted to fall somewhere between that observed in the full-mask and no-mask conditions. The metacognitive ambiguity created by masking the descending limb was hypothesized to disappear soon after the flat limb was encountered. However, the lack of a clear picture of one's preparatory progress during the masked descending limb was hypothesized to result in a lack of confidence, relative to the clear and steady illusion of progress that is generated by an accessible descending limb. This difference should encourage some participants to solve a somewhat greater number of problems while on the flat limb, relative to the majority of participants in the no-mask condition. Combined with the predicted lack of difference between the two no-mask conditions, this result would add further support to the idea that selection of a stopping value along the flat limb of the learning curve is sensitive to awareness of improvement, but not magnitude of improvement.

A secondary set of predictions targeted within-cell variability. We predicted that the two no-mask conditions would generate the lowest variability in choice of stopping value (stopping value is defined as the number of anagrams solved), on the basis of the hypothesis that individual differences in decisions regarding stopping values are minimized when access to the learning curve is unobstructed. As we have argued repeatedly throughout this article, the signal sent out by the learning curve is believed to enjoy widespread consensus. When practice ceases to affect performance, and when awareness of this relationship exists, most individuals should quickly terminate their efforts. Thus, relative to other conditions, we predicted that choice of stopping values in the no-mask conditions would be characterized by low variability among participants. However, variability among participants in terms of choice of stopping value was predicted to be highest in the full-mask condition. Participants in this condition were hypothesized to rely primarily on set size, a stopping signal that does not yield an obvious stopping value. As a result, individual differences in the interpretation of the set-size signal should result in substantially greater variability in choice of stopping value, relative to other conditions.

One cautionary note before proceeding: Because the conditions predicted to generate the lowest variability were also predicted to generate the lowest mean number of anagrams solved, the statistical relationship commonly observed between variance and the mean has the potential to serve as a trivializing explanation for our predicted results concerning within-cell variability. This problem is revisited shortly in the Results section.

Method

Participants. Seventy-eight undergraduates (30 men and 48 women) at the University of Texas at Austin participated in exchange for course credit.

Design and procedure. The experiment took the form of a between-subjects design, with intraset location of the noise mask as the experimental factor and with a fifth experimental condition created by shortening the descending limb of the no-mask-long condition. Participants were randomly assigned to one of the following five conditions: no-mask-long, no-mask-short, full-mask, mastery-mask, and improvement-mask. The masking anagrams consisted of a randomly determined sequence of four-, five-, and six-letter anagrams (these were the same

anagrams that were used in the variable-set condition of Experiment 1). The descending limb used in the no-mask-long and mastery-mask conditions was constructed of the same 15 anagrams used in Experiment 2, in order of descending magnitude of average solution time (recall that this ordering yielded an anagram sequence beginning with an anagram that generated an average solution time of 106.7 s and ending with one having an average solution time of 43.0 s). In the no-mask-long and improvement-mask conditions, Anagrams 16-30 ranged in solution time from 29 to 55 s (these anagrams were obtained from the normed anagram set created in the pretest used in Experiment 2). In the no-mask-short condition, the descending limb was composed of Anagrams 6-15 from the no-mask-long condition, with Anagrams 11-25 in the no-mask-short condition composed of the same anagrams as in the flat limbs of the no-mask-long and improvement-mask conditions. The


remaining anagrams in these three conditions were the same four-, five-, and six-letter anagrams that were used in the full-mask and mastery-mask conditions. Unfortunately, the increased variability that would be encountered by participants continuing beyond Anagram 30 (Anagram 25 in the no-mask-short condition) would seriously compromise any interpretation of results in either no-mask or improvement-mask condition. However, we assumed that the majority of participants in these conditions would terminate problem solving efforts prior to reaching Anagram 31 (Anagram 26 in the no-mask-short condition). As it turned out, only 1 of 49 participants in any of the conditions in which the mastery limb was not masked solved more than 15 anagrams past the juncture between the improvement and mastery limbs. This participant (from the improvement-mask condition) was included in all of the statistical analyses.

Although they differed in structure, we did not expect the five anagram sets to differ in overall difficulty. To test for differences in difficulty, we compared the pretested solution times for Anagrams 1-30 (beyond Anagram 30, all conditions were identical) across the five experimental conditions. A one-way ANOVA revealed that no condition was, on average, easier or more difficult than any other (F < 1).

In all other aspects, the procedure used in Experiment 3 was identical to that used in Experiment 1. To recap briefly, participants were told that the anagrams they were about to solve served as practice for an upcoming test of their anagram-solving ability. After leaving and then quickly returning to the laboratory cubicle, the experimenter handed participants a stack of 200 anagrams, each presented individually on an index card. Participants were told to stop working when they felt prepared enough to perform well on such a test, at which point they would be given the test.
In actuality, when participants stopped work, they were debriefed and dismissed from the experiment.

Results

Problem solving effort. As we had predicted, presence and location of the noise mask on the learning curve had a large

effect on the number of anagrams participants solved, F(4, 73) = 12.45, p < .05, MSE = 219.02. As a Tukey's honestly significant difference (HSD) test revealed, participants in the no-mask-short condition solved, on average, significantly fewer anagrams relative to all other groups, with the average stopping point occurring 5.7 anagrams past the point at which the flat limb began.5 Participants in the no-mask-long condition solved significantly fewer anagrams than all but those in the no-mask-short condition, with the average stopping point occurring 4.9 anagrams past the descending-flat limb juncture. Further confirmation of predictions was the finding of no difference in postjuncture problem solving efforts between the two no-mask conditions (d = 0.8), in spite of the sizable percentage difference (50%) in the magnitude of the descending limbs between the two conditions (|t| < 1; see Figure 5). As we hypothesized, participants did not appear to take the magnitude of the descending limb into account when deciding on a stopping point.

In spite of the results observed in the no-mask conditions, it may be that participants were merely demonstrating a sensitivity to the flat limb, rather than a sensitivity to both limbs of the learning curve. If this is true, then problem solving effort in the improvement-mask condition should be equivalent to effort in the no-mask condition, because the flat limb in both conditions was equally accessible. However, a Tukey's HSD test revealed that average problem solving effort (in terms of number of anagrams solved) was greater in the improvement-mask condition than it was in either no-mask condition, confirming that sensitivity to the descending limb of the learning curve existed (see Figure 5).

A second test of descending limb sensitivity was accomplished by comparing problem solving effort between the full-mask and mastery-mask conditions. If sensitivity to the descending limb does not exist, no problem solving differences should have been apparent between these two conditions. However, this was not the case. As predicted, a Tukey's HSD test revealed that problem solving effort was significantly greater in the full-mask condition than it was in the mastery-mask condition (see Figure 5), demonstrating that an accessible descending limb does serve a metacognitive function. Finally, as predicted, a Tukey's HSD test revealed that participants in the full-mask condition solved significantly more anagrams, relative to all other groups.

Variability in choice of stopping value. We predicted that individual differences in choice of stopping value should be minimized when access to the learning curve is unobstructed. Thus, we predicted that among all experimental conditions, within-cell variance would be lowest in the no-mask conditions. We also hypothesized that in the full-mask condition, individual differences in the interpretation of the set-size signal should yield the greatest within-cell variance. Both predictions were strongly confirmed by means of a series of pairwise error variance comparisons (Howell, 1992, p. 187). These comparisons were analyzed by dividing the larger variance estimate by the smaller, with n - 1 degrees of freedom for each estimate.

As anticipated, within-cell variance in the no-mask-long condition was significantly lower than in the full-mask condition, F(14, 14) = 7.01, p < .05; the mastery-mask condition, F(13, 14) = 2.88, p < .05; and marginally significantly lower than in the improvement-mask condition, F(13, 14) = 2.34, p < .08. Also as expected, we found that the full-mask condition generated more noise than any other experimental condition. As reported above, not only was within-cell variance in the full-mask condition greater than in either of the no-mask conditions, it was also significantly greater than in the improvement-mask condition, F(14, 13) = 3.03, p < .06, and marginally greater than in the mastery-mask condition, F(14, 13) = 2.43, p < .06.

The larger cell size and equivalent within-cell variance in the no-mask-short condition, relative to the no-mask-long condition, permitted the inference that all statistically significant differences observed in the no-mask-long comparisons were shared or exceeded by the no-mask-short comparisons. More important, however, within-cell variances across the two no-mask conditions were found to be nearly identical, F(15, 19) = 1.08, ns. This is important because this lack of difference in variance between the two no-mask conditions, in light of the large difference in mean problem solving effort between these two conditions, strongly argues against an artifactual statistical explanation for these variance data. Thus, the low error variances observed in the two no-mask conditions appear to have been due to the presence of a well-understood metacognitive signal generated by the learning curve, rather than the result of a statistical artifact.
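The pairwise error-variance comparison reduces to a simple ratio; a minimal sketch of that computation (our code and made-up samples, not Howell's presentation):

```python
def sample_variance(xs):
    # Unbiased sample variance (denominator n - 1).
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

def variance_ratio(group_a, group_b):
    """Pairwise error-variance comparison: F is the larger sample variance
    over the smaller, with n - 1 degrees of freedom for each estimate."""
    va, vb = sample_variance(group_a), sample_variance(group_b)
    (num, dfn), (den, dfd) = sorted(
        [(va, len(group_a) - 1), (vb, len(group_b) - 1)], reverse=True)
    return num / den, dfn, dfd

# Hypothetical stopping values from a low- versus a high-variability condition:
f, dfn, dfd = variance_ratio([18, 20, 22, 19, 21], [5, 30, 12, 45, 22])
print(f, dfn, dfd)
```

The reported F(14, 14), F(13, 14), and F(14, 13) statistics above follow this form, with the degrees of freedom reflecting the cell sizes of the two conditions being compared.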

5 The value of Tukey's HSD in the experiment was 4.49.


Figure 5. Number of anagrams solved subsequent to the improvement-mastery juncture, by experimental condition.

Discussion

By independently masking each of the two limbs of an engineered learning curve, the metacognitive functioning of the learning curve was clearly evident. Behavioral regulation

was influenced quite strongly by both the descending limb and the flat limb of the curve. When neither limb was masked, thus maximizing access to both limbs of the curve, problem solving efforts quickly tailed off once the flat limb was encountered. Although participants clearly demonstrated a sensitivity to the descending limb, metacognitive regulation was unaffected by the length of the descending limb. Once they had encountered the flat limb, participants' problem solving efforts were terminated at roughly the same point, regardless of the length of the descending limb. In addition, the low within-cell variances in the two no-mask conditions confirmed the hypothesis that individual differences in choice of stopping signal are minimized when learning curve access is possible. Apparently, the

signal that is sent out by an unobstructed learning curve is responded to in a remarkably uniform manner, yielding relatively little in the way of individual variation in choice of stopping signal and stopping value. When both limbs were masked, as was the case in the full-mask condition, set size was presumed to be the guiding stopping signal, although the significantly greater within-cell variance in this condition suggests that as a control cue, set size is quite susceptible to individual interpretation.

Experiment 4

A final experiment was conducted as a test of an alternative explanation for participants' use of set-size information under conditions of high intraset variability. Although the techniques used in the previous studies were designed to emphasize the arbitrary nature of the size of the problem set, it remains the case that the experimenter was the individual who selected the


number of anagrams ultimately presented to participants. Therefore, in spite of our attempts to downplay the role of the experimenter as the agent bearing primary responsibility for the size of the initial anagram set, it may be that participants were operating under the belief that the experimenter had privileged knowledge regarding the level of preparedness required for the ostensible upcoming exam and was communicating this knowledge through selection of the initial size of the study set. Thus, the observed set-size effect may have been produced, at least in part, by the participant's belief in the guidance and wisdom of the experimenter.

We should make it clear that the determinants of the belief in set size as a metacognitive aid are far from obvious, and it may be that one's belief in the proximal agent (e.g., teacher, professor, experimenter) as privileged knower plays a significant role in the belief in the diagnostic nature of the initial size of the problem set. However, we believe that the diagnostic power ascribed to problem set size is multiply determined and is likely to occur without the presence of such a proximal agent.

In the following experiment, the size of the initial anagram set was generated by means of a random procedure controlled and initiated by the participant. The experimenter played no role in determining the size of the anagram set. Participants determined the number of anagrams they were to work on by removing a slip of paper from a cardboard box. Participants were told that the box contained 100 slips numbered sequentially from 1 through 100. In actuality, half of the slips were numbered 25, and the other half were numbered 100.

After "randomly" selecting the size of the problem set, participants were told to continue to practice solving anagrams until they felt that their practice efforts resulted in their being prepared to take a test of anagram problem solving. Unlike in the previous experiments, intraset variability was not manipulated. All participants were given a set of four-, five-, and six-letter anagrams (the same set that was used in the variable condition in Experiment 1). Only set size was manipulated, with half of the participants in a 25-anagram set and the other half in a 100-anagram set. We predicted a data pattern resembling that of the Experiment 1 high-variability condition, that is, participants in the 100-anagram set solving more anagrams than participants in the 25-anagram set.
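The rigged draw can be stated compactly. A sketch (our code and names, for illustration) of the box's ostensible contents versus its actual contents:

```python
import random

OSTENSIBLE_SLIPS = list(range(1, 101))   # what participants were told: 1-100
ACTUAL_SLIPS = [25] * 50 + [100] * 50    # what the box really held

def draw_set_size(rng=None):
    """One participant's 'random' selection of an initial problem-set size."""
    rng = rng or random.Random()
    return rng.choice(ACTUAL_SLIPS)
```

Every draw necessarily yields 25 or 100, so the apparently random procedure silently assigns each participant to one of the two set-size conditions.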

Method

Participants. Nineteen undergraduates (10 men and 9 women) at the University of Texas at Austin participated in exchange for course credit.

Procedure. The procedures used in this experiment were identical to those used in Experiment 1, with the following exceptions. Participants were told that in order to simulate the arbitrary nature of many of the situations people find themselves in on a daily basis, "we are conducting an experiment in which the initial quantity of problems you will be given is either determined by the experimenter or by yourself using a random procedure. You are in the self-selection condition, and thus, you will determine the starting quantity of anagrams by randomly selecting a number from a box." Participants then drew a slip of paper from a small cardboard box. Unbeknownst to participants, half of the paper slips had the number 100 written on them and half of the paper slips had the number 25 written on them. Participants were informed at this point that the number they drew would be the number of

    anagrams that would be placed before them. Participants were theninformed that there was no specific number of anagrams that theyshould do, nor was there a time limit on their task. Rather, they weretold that they should work until they felt sufficiently prepared to dowell on an upcoming exam testing anagram solving performance. As inExperiment 1, participants were told that there was a virtuallyunlimited quantity of supplemental anagrams in the next room, and ifthey finished the anagrams sitting in front of them, more would begladly provided.Results and Discussion

As predicted, participants in the 100-anagram set condition solved more anagrams than did participants in the 25-anagram set condition (28.80 vs. 20.22, respectively, or roughly 42% more). A one-way (100-anagram set vs. 25-anagram set) between-subjects ANOVA revealed this difference to be statistically significant, F(1, 17) = 16.05, p < .05, MSE = 6.19. This difference was not due to a ceiling effect, as attested to by the finding that only 1 out of 9 participants in the 25-anagram set condition solved all 25 anagrams.

Even with the experimenter removed from the problem-set size selection procedure, and with participants using a pseudorandom procedure to determine problem-set size, set-size information continued to act as a guidepost for metacognitive judgments of exam preparedness. This finding serves to argue against experimental demand as the sole explanation of the set-size effect. In addition, the number of anagrams solved in the 100-anagram set (M = 28.8) was roughly equivalent to the number solved in the 200-anagram set used in Experiment 1 (M = 30.2), supporting our earlier notion that participants are not computing proportions with any amount of precision, but rather are coarsely generating judgments of preparedness based on a rough "dent in a pile" estimate. The question of sensitivity to set size clearly requires the type of parametric exploration that, given the lack of difference between the 100- and 200-anagram set size conditions, may not be justified (however, this comparison is between experiments, and thus cannot be taken too seriously).

General Discussion

Access to, and subsequent use of, the learning curve in regulating preparation was shown to depend on the within-set variability (i.e., statistical noise) of the problem set. When the arrangement of problems within a problem set generated low noise, participants were clearly able to recognize and use their learning curves to regulate effort and judge preparedness. As a metacognitive cue, the learning curve generated a clear and uniform response in participants. As exemplified by the results from Experiment 3, when both the descending and flat limbs of the learning curve were accessible, there was remarkable uniformity in participants' regulatory behavior. Problem solving efforts ceased soon after the flat limb of the learning curve was encountered. In addition, the point at which participants terminated their efforts was far less variable relative to the stopping point chosen under conditions in which access to one or both limbs of the learning curve was blocked. When intraset noise was high, learning curve access was presumably blocked. In this case, problem solving efforts and preparedness judgments were noticeably influenced by the size of the problem set.

As mentioned at the beginning of this article, these metacognitive effects were shown to depend on the overall characteristics of the problem set and were not noticeably affected by the individual problems composing the set. Thus, although the problem sets used in Experiments 2 and 3 differed markedly in composition from those used in Experiments 1 and 4, highly similar effects on judgment and behavior were observed across studies. Additionally, although measures of poststudy performance were not taken, the implications for performance are quite clear. If progress along the learning curve is used as a proxy for performance, then the utility of a given stopping signal or study strategy can be gauged. For example, the judgments of individuals working on a high-variability problem set would generally be expected to be less well calibrated to their performance, compared with participants working on a low-variability problem set. Although this extension to performance is clear, it is obvious that the validity of such assertions must await empirical verification.
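The stopping rule implied by this account (practice while the learning curve is still descending; stop soon after its flat limb is reached) can be sketched in a few lines. The sketch below is purely illustrative, not a procedure drawn from these experiments; the function name, the window size, and the improvement threshold are all our assumptions. It treats per-problem solution times as the performance function and signals mastery when average improvement across two adjacent windows of trials falls below a noise threshold.

```python
def plateau_reached(solution_times, window=3, threshold=0.05):
    """Return True once recent trials show negligible improvement.

    solution_times: per-problem solution times, in practice order.
    window: number of trials per comparison window (assumed value).
    threshold: smallest proportional improvement that still counts
        as progress along the descending limb (assumed value).
    """
    if len(solution_times) < 2 * window:
        return False  # too little practice to estimate the curve
    earlier = sum(solution_times[-2 * window:-window]) / window
    recent = sum(solution_times[-window:]) / window
    # Proportional improvement between adjacent windows; a value
    # near zero means the flat limb has been reached.
    return (earlier - recent) / earlier < threshold

# A smooth, negatively accelerated curve: a clear descending limb
# followed by a flat limb.
times = [60, 40, 28, 20, 15, 12, 11, 10.6, 10.3, 10.1, 10.0, 10.0, 10.0]
stop_at = next(i for i in range(1, len(times) + 1)
               if plateau_reached(times[:i]))
print(stop_at)  # the rule fires shortly after the curve flattens
```

On a high-variability set, solution times for successive problems bounce between difficulty levels, so the two window means rarely stabilize; this is one way to think about how intraset noise masks the flat limb from the learner.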

The notion of a descriptive mathematical performance function commonly referred to as the learning curve dates back at least to the turn of the century (e.g., Bryan & Harter, 1897). Furthermore, the processes by which learning occurs as described by the shape of the learning curve have been the subject of much discussion (e.g., Mazur & Hastie, 1978). However, to the best of our knowledge, there has been no research into the explicit metacognitive effects of the learning curve (cf. Brown, Campione, & Barclay, 1979; Kluwe & Friedrichsen, 1987; Paris, Wasik, & Turner, 1991; Pressley, Borkowski, & O'Sullivan, 1985). The series of experiments reported in this article provides the first evidence of the explicit use of the learning curve as a metacognitive tool, and also documents the conditions that promote or inhibit access to the learning curve. As a caveat to this rather sweeping claim, we realize that the nature and complexity of a given performance function (e.g., Newell & Rosenbloom, 1981) likely also play a significant role in determining the success that an individual has in the recognition and proper use of the function. Although smooth, monotonic, negatively and positively accelerated functions are quite common, task differences (e.g., insight vs. analytic problems), as well as differences in participants' strategies and motivation levels, have generated a wide variety of learning curve shapes (e.g., Hull, 1952; Maier, 1931). The effect of learning curve characteristics on skill acquisition, however, is a separate topic, well beyond the scope of this article. Our demonstrations using simple, two-part curves signaling improvement and mastery clearly have just scratched the surface of a potentially rich area of research.

The importance of these findings may also be gauged against the context provided by the literatures associated with metacognitive skill acquisition and human judgment and decision processes.
With the exception of ill-defined problems such as writing (e.g., Bryson, Bereiter, Scardamalia, & Joram, 1991), and highly domain-specific complex problems such as chess (e.g., Chase & Simon, 1973) or computer programming (e.g., Kay, 1991), an overview of the literature on metacognition leaves one with the general impression of humans as reasonably adept at abandoning poor learning strategies in favor of superior ones as a function of age-related and experiential increases in general competence. This trend has been documented across most problem types commonly encountered within academia, including reading (e.g., Daneman, 1991; Paris et al., 1991), writing (e.g., Bryson et al., 1991; also Tierney & Shanahan, 1991), memory strategies (e.g., Brown et al., 1979; Pressley et al., 1985), and general problem solving strategies (e.g., Flavell, 1976; Kluwe, 1987; Simon & Simon, 1978). In general, executive control and regulation strategies have been shown to become more sophisticated, more appropriate, and hence more effective as children grow older (e.g., Kluwe, 1987) and as adults gain task familiarity (e.g., Josephs & Hahn, 1995; Maki & Serra, 1992; Weaver, 1990).

In contrast to this picture of human competence is the body of research documenting the multiple and various errors, biases, and shortcomings associated with human decision making (see, e.g., Kahneman, Slovic, & Tversky, 1982, for a review). This literature questions the neoclassical economic view of humans as generally rational, competent, and thoughtful information processors (for counter-challenges to the heuristics and biases approach, see, e.g., Cohen, 1981; Gigerenzer, 1991; Koehler, in press).

We believe that the experimental paradigm used in this article does not fit neatly within either literature, but rather can be seen as representing a bridge between the two. On the one hand, the problem under scrutiny (use of an appropriate metacognitive stopping strategy in skill acquisition) clearly falls within the boundaries defined by the literature on metacognitive control and regulation. On the other hand, the documentation of a computationally simple judgmental shortcut (viz., the anchoring effects of set size) is not typical of metacognition research but rather represents a good fit with the literature on judgmental heuristics. Thus, we would argue that this research represents a contribution to the judgment literature by demonstrating conditions under which a judgmental shortcut of dubious diagnostic value is used in the service of skill acquisition, as well as representing an important contribution to the literature on metacognition by documenting a pair of metacognitive stopping strategies (learning curve and set-size anchoring effects) that heretofore have not been topics of empirical scrutiny.

References

Brown, A. L., Campione, J. C., & Barclay, C. R. (1979). Training self-checking routines for estimating test readiness: Generalization from list learning to prose recall. Child Development, 50, 501-512.

Bryan, W. L., & Harter, N. (1897). Studies in the physiology and psychology of the telegraphic language. Psychological Review, 4, 27-53.

Bryson, M., Bereiter, C., Scardamalia, M., & Joram, E. (1991). Going beyond the problem as given: Problem solving in expert and novice writers. In R. J. Sternberg & P. A. Frensch (Eds.), Complex problem solving: Principles and mechanisms (pp. 61-84). Hillsdale, NJ: Erlbaum.

Chase, W. G., & Simon, H. A. (1973). Perception in chess. Cognitive Psychology, 4, 55-81.

Cohen, L. J. (1981). Can human irrationality be experimentally demonstrated? Behavioral and Brain Sciences, 4, 317-370.

Costermans, J., Lories, G., & Ansay, C. (1992). Confidence level and feeling of knowing in question answering: The weight of inferential processes. Journal of Experimental Psychology: Learning, Memory, and Cognition, 18, 142-150.

Daneman, M. (1991). Individual differences in reading skills. In R. Barr, M. L. Kamil, P. B. Mosenthal, & P. D. Pearson (Eds.), The handbook of reading research (Vol. 2, pp. 512-538). White Plains, NY: Longman.

Flavell, J. H. (1976). Metacognitive aspects of problem solving. In L. Resnick (Ed.), The nature of intelligence (pp. 231-235). Hillsdale, NJ: Erlbaum.

Gigerenzer, G. (1991). How to make cognitive illusions disappear: Beyond heuristics and biases. European Review of Social Psychology, 2, 83-115.

Glenberg, A. M., Sanocki, T., Epstein, W., & Morris, C. (1987). Enhancing calibration of comprehension. Journal of Experimental Psychology: General, 116, 119-136.

Howell, D. C. (1992). Statistical methods for psychology. Belmont, CA: Wadsworth.

Hull, C. L. (1952). A behavior system. New Haven, CT: Yale University Press.

Josephs, R. A., Giesler, R. B., & Silvera, D. H. (1994). Judgment by quantity. Journal of Experimental Psychology: General, 123, 21-32.

Josephs, R. A., & Hahn, E. D. (1995). Bias and accuracy in estimates of task duration. Organizational Behavior and Human Decision Processes, 61, 202-213.

Kahneman, D., Slovic, P., & Tversky, A. (1982). Judgment under uncertainty: Heuristics and biases. Cambridge, England: Cambridge University Press.

Kay, D. S. (1991). Computer interaction: Debugging the problems. In R. J. Sternberg & P. A. Frensch (Eds.), Complex problem solving: Principles and mechanisms (pp. 317-340). Hillsdale, NJ: Erlbaum.

Kluwe, R. H. (1987). Executive decisions and regulation of problem solving behavior. In F. E. Weinert & R. H. Kluwe (Eds.), Metacognition, motivation, and understanding (pp. 31-64). Hillsdale, NJ: Erlbaum.

Kluwe, R. H., & Friedrichsen, G. (1985). Mechanisms of control and regulation in problem solving. In J. Kuhl & J. Beckmann (Eds.), Action control (pp. 183-218). Berlin, Germany: Springer.

Koehler, J. J. (in press). The base rate fallacy reconsidered: Descriptive, normative, and methodological challenges. Behavioral and Brain Sciences.

Lehman, D. R., Lempert, R. O., & Nisbett, R. E. (1988). The effects of graduate training on reasoning: Formal discipline and thinking about everyday-life events. American Psychologist, 43, 431-442.

Lerner, M. J. (1980). The belief in a just world: A fundamental delusion. New York: Plenum.

Maier, N. R. F. (1931). Reasoning in humans II: The solution of a problem and its appearance in consciousness. Journal of Comparative Psychology, 12, 181-194.

Maki, R. H., & Berry, S. L. (1984). Metacomprehension of text material. Journal of Experimental Psychology: Learning, Memory, and Cognition, 10, 663-679.

Maki, R. H., & Serra, M. (1992). The basis of test predictions for text material. Journal of Experimental Psychology: Learning, Memory, and Cognition, 18, 116-126.

Mazur, J. E., & Hastie, R. (1978). Learning as accumulation: A reexamination of the learning curve. Psychological Bulletin, 85, 1256-1274.

Mazzoni, G., & Cornoldi, C. (1993). Strategies in study time allocation: Why is study time sometimes not effective? Journal of Experimental Psychology: General, 122, 47-60.

Metcalfe, J. (1986). Feeling of knowing in memory and problem solving. Journal of Experimental Psychology: Learning, Memory, and Cognition, 12, 288-294.

Metcalfe, J., & Wiebe, D. (1987). Intuition in insight and noninsight problem solving. Memory & Cognition, 15, 238-246.

Nelson, T. O. (1993). Judgments of learning and the allocation of study time. Journal of Experimental Psychology: General, 122, 269-273.

Nelson, T. O., Leonesio, R. J., Landwehr, R. S., & Narens, L. (1986). A comparison of three predictors of an individual's memory performance: The individual's feeling of knowing versus the normative feeling of knowing versus base-rate item difficulty. Journal of Experimental Psychology: Learning, Memory, and Cognition, 12, 279-287.

Newell, A., & Rosenbloom, P. S. (1981). Mechanisms of skill acquisition and the law of practice. In J. R. Anderson (Ed.), Cognitive skills and their acquisition (pp. 1-55). Hillsdale, NJ: Erlbaum.

Paris, S. G., Wasik, B. A., & Turner, J. C. (1991). The development of strategic readers. In R. Barr, M. L. Kamil, P. B. Mosenthal, & P. D. Pearson (Eds.), The handbook of reading research (Vol. 2, pp. 609-640). White Plains, NY: Longman.

Payne, J. W., Bettman, J. R., & Johnson, E. J. (1990). The adaptive decision maker: Effort and accuracy in choice. In R. M. Hogarth (Ed.), Insights in decision making: A tribute to Hillel J. Einhorn (pp. 129-153). Chicago: University of Chicago Press.

Poulton, E. C. (1968). The new psychophysics: Six models for magnitude estimation. Psychological Bulletin, 69, 1-19.

Pressley, M., Borkowski, J. G., & O'Sullivan, J. (1985). Children's metamemory and the teaching of memory strategies. In D. L. Forrest-Pressley, G. E. MacKinnon, & T. G. Waller (Eds.), Metacognition, cognition, and human performance (Vol. 1, pp. 111-153). New York: Academic Press.

Simon, D. P., & Simon, H. A. (1978). Individual differences in solving physics problems. In R. S. Siegler (Ed.), Children's thinking: What develops? (pp. 325-348). Hillsdale, NJ: Erlbaum.

Slovic, P., Fischhoff, B., & Lichtenstein, S. (1982). Facts versus fears: Understanding perceived risk. In D. Kahneman, P. Slovic, & A. Tversky (Eds.), Judgment under uncertainty: Heuristics and biases (pp. 463-489). Cambridge, England: Cambridge University Press.

Staszewski, J. (1988). Skilled memory and expert mental calculation. In M. T. H. Chi, R. Glaser, & M. J. Farr (Eds.), The nature of expertise (pp. 71-128). Hillsdale, NJ: Erlbaum.

Tierney, R. J., & Shanahan, T. (1991). Research on the reading-writing relationship: Interactions, transactions, and outcomes. In R. Barr, M. L. Kamil, P. B. Mosenthal, & P. D. Pearson (Eds.), The handbook of reading research (Vol. 2, pp. 246-280). White Plains, NY: Longman.

Tversky, A., & Kahneman, D. (1974). Judgment under uncertainty: Heuristics and biases. Science, 185, 1124-1131.

Weaver, C. A. (1990). Constraining factors in calibration of comprehension. Journal of Experimental Psychology: Learning, Memory, and Cognition, 16, 214-222.

    Received August 16,1993Revision received March 2,1995Accepted March 6,1995