
In search of the reliability of a Flemish version of the Knowledge Monitoring Assessment Test

Geraldine Clarebout & Jan Elen & Patrick Onghena

Received: 3 June 2005 / Revised: 26 April 2006 / Accepted: 22 May 2006 / Published online: 19 August 2006
© Springer Science + Business Media, LLC 2006
Metacognition Learning (2006) 1: 137–147, DOI 10.1007/s11409-006-9582-0

G. Clarebout (*) · J. Elen
Center for Instructional Psychology and Technology, Katholieke Universiteit Leuven, Vesaliusstraat 2, B-3000 Leuven, Belgium
e-mail: [email protected]

P. Onghena
Center for Methodology of Educational Research, Katholieke Universiteit Leuven, Vesaliusstraat 2, B-3000 Leuven, Belgium

Abstract Metacognitive skills are widely recognized as an important moderating variable for learning. Many studies have shown that these skills affect students’ learning results. Tobias and Everson (2000) argue that metacognitive skills cannot be effectively applied in the absence of accurate knowledge monitoring. Consequently, they constructed the Knowledge Monitoring Assessment Test (KMA), which is claimed to be a valid test of students’ knowledge monitoring capacity. In this contribution the reliability of a Flemish version of the KMA test is studied. Two studies are reported, one with secondary education students and one with freshman university students. In both studies the split half method and Kuder Richardson Formula 20 were used to calculate internal consistency as a measure of reliability. Because none of the results showed good reliability, it is suggested that additional efforts are needed to elaborate a reliable instrument.

Keywords metacognition · knowledge monitoring · methodology · reliability

Introduction

Flavell (1979) already made a distinction between metacognitive knowledge, experiences and metacognitive skills. Metacognitive knowledge refers to declarative knowledge about what and how factors act and interact to affect learning processes. Metacognitive experiences have to do with where one stands in a specific process and what progress one is making. These experiences may activate metacognitive strategies that monitor cognitive processes.


Metacognitive skills are part of procedural knowledge. Vermunt (1992) refers to them as activities students undertake to regulate, monitor and control their own cognitive processes. They comprise the application of cognitive and environmental resources as demanded by the task (Newman, 1998). Metacognitive skills are widely recognized as an important moderating variable for learning. Learners’ metacognitive skills of monitoring and regulating learning affect learning results (Dorner & Wearing, 1995; Frensch & Funke, 1995). Metacognitive skills, for instance, guide and improve the efficiency of the problem-solving process (Davidson, Deuser, & Sternberg, 1994). Davidson et al. (1994) claim that metacognitive skills help in identifying the problem, in mentally representing it, in planning how to proceed and in evaluating what one knows about one’s performance. These skills seem to have an overall positive effect on learning results.

Measuring students’ metacognitive skills is a recurrent topic of debate (e.g., Pintrich, Wolters, & Baxter, 2000; Schraw & Impara, 2000; Veenman, 2005). Questionnaires and interviews have been used to measure students’ own descriptions of how they control or monitor their learning, but the answers do not necessarily reflect what students actually do. Questionnaires are arguably appropriate for measuring metacognitive declarative knowledge, but not for metacognitive procedural knowledge; they are a poor, indirect measure of students’ metacognitive skills. Pintrich et al. (2000), however, argue that metacognition, including metacognitive skills, is similar to other kinds of knowledge stored in long-term memory and that it can be accessed when properly cued. In this line of reasoning they argue that self-reports are appropriate, as an easy and efficient measurement.

Veenman (2005) argues in favor of a different evaluation method, namely the thinking aloud method. Evaluating students’ metacognitive activities ‘on the spot’, rather than pro- or re-actively, allows access to particular metacognitive actions; information that otherwise would not be retrieved. However, Ericsson and Simon (1993), for instance, point out that problems might occur when thinking aloud is used for extremely difficult or highly routine tasks. They state that making one’s reasoning process explicit may interfere with the actual behavior. Additionally, this method may give only partial information about the actual cognitive processes, and its reliability can be problematic given that the wording does not always completely correspond with the actual processes (De Corte, Verschaffel, & Lowyck, 1986). Karat (1997), on the other hand, points out that the actual behavior is slowed down but not changed. Veenman, Elshout, and Groen (1993) found no differences in learning results between a group of students who worked while thinking aloud and a group who worked without thinking aloud. Although there seems to be no clear consensus on the influence of this method, it is clear that the thinking aloud method is very time- and resource-intensive.

Tobias and Everson (2000) have made a specific contribution in this respect. Given the automated nature of metacognitive skills, they question whether students are aware of their metacognitive processes and whether they are able to describe or report adequately on the processes used.

Tobias and Everson (1996, 2002), in their approach, focus on the knowledge monitoring component of metacognition, because learners cannot effectively control their learning in the absence of accurate knowledge monitoring. Knowledge monitoring, in their view, forms the basis for the application of metacognitive skills (see also Williams, 1996). Knowledge monitoring in learning and problem solving relates to knowing what you know and what you do not know (Tobias & Everson, 1996). Tobias and Everson assume that learners cannot adequately invoke metacognitive activities unless they can accurately monitor their knowledge. Specifically in situations where a great deal of information has to be mastered, adequate knowledge monitoring will lead to a more accurate time investment; it will help students to focus on those information bits not yet sufficiently mastered. Given these theoretical assumptions, Tobias and Everson elaborated the Knowledge Monitoring Assessment instrument (KMA) as a research tool to measure students’ knowledge monitoring, and hence (part of) their metacognitive skills. The KMA is an interesting instrument since it provides a behavioral measurement of metacognitive skills (compared to questionnaires and interviews), without interfering with or slowing down the problem-solving process (as compared to the thinking aloud method).

Given the importance of metacognitive skills, the need for an adequate and reliable instrument, and the promising nature of the KMA, this contribution discusses the KMA in more detail and reports on two studies that used the KMA as an instrument to measure students’ knowledge monitoring skills. The results presented relate to the reliability of this instrument. The instrument’s internal consistency is used as an indicator of reliability (Onwuegbuzie & Daniel, 2002).

The Knowledge Monitoring Assessment Test

The KMA is a technique that “simultaneously evaluated students’ self-reports of their declarative word knowledge, or their procedural problem-solving ability in math, and their demonstrated knowledge and ability” (Tobias & Everson, 1996, p. 2). The KMA measures students’ monitoring capabilities. To this end, the instrument consists of two parts. In the first part, students indicate whether they think they can solve a particular mathematical problem, or whether they know the meaning of a specific word. In the second part, students actually solve the mathematical problem or give the meaning of the word. The two parts are compared and scored in such a way that the scores reflect the relationship between students’ estimates of their knowledge and their performance (Table 1).

Looking at Table 1, two scores can be said to reflect accurate knowledge monitoring, namely the ‘+ +’ and the ‘− −’ score. The two other scores reflect an over- or underestimation of one’s knowledge.
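To make the scoring rule of Table 1 (below) concrete, a minimal sketch in Python follows. It is our own illustration, not part of the original instrument: the function names and the boolean encoding of the two test parts are assumptions.

def kma_item_codes(estimates, answers):
    """Code each item as in Table 1: 1 for an accurate estimate
    ('+ +' or '- -'), 0 for an over- or underestimation ('+ -' or '- +').

    estimates: per item, True if the student thought he or she knew the answer
    answers:   per item, True if the item was actually answered correctly
    """
    return [1 if est == ans else 0 for est, ans in zip(estimates, answers)]

def kma_total(estimates, answers):
    """Total KMA score: the number of accurate estimations."""
    return sum(kma_item_codes(estimates, answers))

For a 20-item part, the total score thus ranges from 0 (no accurate estimations) to 20 (perfect knowledge monitoring).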

Table 1 Scoring of the KMA

Score  Code  Meaning
+ +    1     Answered the question correctly, when they also thought they could answer it correctly
+ −    0     Answered the question incorrectly, while they thought they could answer it correctly
− +    0     Answered the question correctly, while they thought they could not answer it correctly
− −    1     Answered the question incorrectly, while they thought they could not answer it correctly

Tobias and Everson (1996, 2002) report multiple studies in which they analyzed the construct and criterion validity of the instrument. They investigated correlations between the score on the KMA and, amongst others, college achievement, reading comprehension, and general ability. They also studied the predictive value of the KMA scores, for instance for college achievement (see Everson & Tobias, 1998).

In their 1996 report, a review of 12 studies provides some support for the construct validity of the KMA procedure. Similar or comparable results were found for samples from different student populations. For instance, the finding that capable students estimated that they knew more words while less capable students indicated that they knew fewer words was obtained with both freshman university students and vocational high school students. Also, the mathematics KMA (KMA_math) yielded correlations with other variables similar to those found with the vocabulary version (KMA_voc). Mixed results were found with respect to the relation between the KMA and external criteria. Significant correlations were found with a standardized achievement test (correlations of 0.67 and 0.76). The lowest relationships were found between the KMA scores and college grades; Tobias and Everson (1996) suggest that in this case the low reliability of such grades probably accounts for the modest correlation. In the 2002 report, a review of 11 additional studies confirms the validity of the KMA. The KMA was extended to the secondary and post-secondary school level as well as to verbal analogies and sciences. The studies reveal that the KMA results are positively related to general ability and scholastic aptitude.

In the 1996 and 2002 reports various indicators of the validity of the KMA test are presented. Unfortunately, reliability data are not presented. Nevertheless, such data would provide additional corroborative evidence of the validity of the KMA test.

Troonen (2000) made, in close collaboration with Tobias and Everson, a Flemish version of the KMA, adapted to the Flemish school context and taking into account what students learn in school. Similar to Tobias and Everson, Troonen reports positive correlations with other instruments, more specifically with a survey in which teachers indicated for each of their pupils whether the pupil was a good performer, was able to judge his or her own knowledge, and possessed good study methods. High, significant correlations were found with teachers’ judgments of performance (0.77), of the pupil’s ability to judge his or her own knowledge (0.84), and of study method (0.80); hence Troonen concludes that the Flemish KMA is valid. Although not specified by the author, this statement more precisely pertains to criterion-related evidence of validity (Fraenkel & Wallen, 2003). Also in this case, reliability data are missing.

In the next part, two studies are reported that try to gain insight into the reliability of this Flemish version of the KMA. The Flemish version was opted for since the participants in the studies were all students attending, or having attended, Flemish schools.

Materials and methods

Participants

Participants were 38 secondary education students (age 14–15) from a general secondary education school in the first study and 109 freshman university students (age 18–20) from the humanities (social sciences, psychology, and educational sciences) in the second. Both studies were part of a larger research project. All students participated voluntarily; a movie ticket was provided as an incentive.

Material

A Flemish version of the KMA was used (Troonen, 2000). The test (see Appendix) consists of 20 mathematical problems (KMA_math) and 20 words (KMA_voc). The test was based on a verbal and numeric intelligence test (Stinissen, 1975) used with 18-year-olds in Flemish schools. This test was preferred over an exact translation of the Tobias and Everson version, given the difficulties involved in translating research instruments (Harkness & Schoua-Glusberg, 1998). In the first part, students had to indicate whether they could solve the problem, or whether they thought they knew the meaning of the word. To put it differently, students were asked whether they believed they possessed the knowledge to solve the problems. In the second part, students actually solved the problems and indicated the meaning of the word by choosing among five alternatives.

Procedure

The session started with the first part of the KMA. The 40 items were projected using a PowerPoint presentation, with 4 s between items. This time interval was pre-tested with 20 secondary education students prior to the actual studies, to ensure that it was long enough for students to make an estimate but short enough that they could not already start solving the problems. The trial was done with time intervals of 3, 4, and 5 s. Three seconds proved too short; students complained that they did not have enough time. Five seconds allowed some students to already start solving the problems. Students were warned that the items would be presented at a fixed pace, and the experimenter always indicated when a new item was shown. Students indicated on paper whether they thought they knew the solution. After the projection, the estimations were collected and the second part was distributed. Students were instructed to solve the exercises and to indicate the meaning of the words. The exercises and words were ordered differently in this second part, to prevent students from remembering their responses to the first part. Students could take as long as they wanted to fill out the second part.

To process the results, the KMA was scored as indicated in Table 1. Students received a total score for their correct estimations (the sum of the ‘+ +’ and the ‘− −’ codes).

In a first step the mean scores and standard deviations for each part of the KMA were calculated, to make sure that the exercises offered were not too easy or too difficult.

Next, and given the main interest of the studies, two estimation techniques were used to assess the internal consistency of the KMA. First, the split half method (comparing even and odd items) with a Spearman–Brown prophecy correction for test length was calculated. Second, and to avoid the problem that the split half method may render the best or the worst case, the Kuder Richardson Formula 20 (KR20) was calculated (Fraenkel & Wallen, 2003).

For the split half method, the results for the even items were compared with the results for the odd items, for both the KMA_math and the KMA_voc. This meant that both halves included easier as well as more difficult items. The Pearson product-moment coefficient (r) was used as an indication of internal consistency. After calculating this coefficient, the Spearman–Brown prophecy formula was applied to the correlations; this formula corrects for test length (Fraenkel & Wallen, 2003).
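As an illustration of this procedure, the sketch below computes the half-test correlation and applies the Spearman–Brown correction. It assumes a students × items matrix of the dichotomous KMA codes introduced above; this is our own illustrative code, not the analysis script used in the studies.

import numpy as np

def split_half_reliability(item_codes):
    """Split-half reliability with Spearman-Brown correction for test length.

    item_codes: 2-D array (students x items) of dichotomous KMA codes (0/1).
    Returns the half-test Pearson r and the Spearman-Brown corrected value.
    """
    codes = np.asarray(item_codes, dtype=float)
    odd_half = codes[:, 0::2].sum(axis=1)        # items 1, 3, 5, ...
    even_half = codes[:, 1::2].sum(axis=1)       # items 2, 4, 6, ...
    r = np.corrcoef(odd_half, even_half)[0, 1]   # Pearson r between the halves
    return r, 2 * r / (1 + r)                    # Spearman-Brown prophecy

A half-test correlation of 0.41, for instance, is stepped up to 2 × 0.41 / (1 + 0.41) ≈ 0.58, of the same order as the corrected values reported in Table 2 below.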

The KR20 is an alternative way to calculate how consistent responses are across the items of an instrument. Scores on the items must be dichotomous. In the split half method, one half of the items is compared with the other half; this approach assumes that the two halves are homogeneous in content and difficulty. In the KR20 all items are compared with one another: it is in effect the mean of all split-half coefficients resulting from the different possible half-splits of a test. In contrast to the KR21, the KR20 does not assume equal difficulty of the items.
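The KR20 itself is a closed-form expression, KR20 = k/(k − 1) · (1 − Σ pᵢqᵢ / σ²), with k the number of items, pᵢ the proportion of 1-codes on item i, qᵢ = 1 − pᵢ, and σ² the variance of the total scores. A sketch under the same assumptions as above:

def kr20(item_codes):
    """Kuder-Richardson Formula 20 for dichotomous item codes."""
    codes = np.asarray(item_codes, dtype=float)
    k = codes.shape[1]                          # number of items
    p = codes.mean(axis=0)                      # proportion of 1-codes per item
    q = 1.0 - p
    # Sample variance of the total scores; some texts use the
    # population variance (ddof=0) instead.
    total_var = codes.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - (p * q).sum() / total_var)

The same function can be applied to the combined codes of the two parts, as is done here, or to each part separately, as in the part-wise reliabilities mentioned in the discussion.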

A value of 0.70 is considered an indication of good internal consistency (Fraenkel & Wallen, 2003; Henson, 2001).

Additionally, it was tested whether deleting items would result in better reliability scores.

Results

Means and standard deviations

The results for the estimation part of the KMA test revealed that students did not think they would be able to answer all questions correctly. For the mathematical version, the means were 16.41 (SD = 2.70) and 15.60 (SD = 2.85) for study 1 and study 2, respectively. For the vocabulary version these were 13.72 (SD = 2.90) and 15.31 (SD = 1.92), respectively. For the second part of the KMA, the results show averages of 11.74 (SD = 2.38) and 13.16 (SD = 2.83) for the mathematical part, and 10.66 (SD = 2.29) and 14.48 (SD = 2.17) for the vocabulary part.

Reliabilities

The split half method indicates that a statistically significant correlation is found between the two halves of the KMA_math in study 1 and between the two halves of the KMA_voc in study 2 (Table 2). For the other tests (KMA_voc in the first study and KMA_math in the second), no significant correlations could be found between the two halves. Logically, the Spearman–Brown prophecy does not reveal very high reliabilities either. The maximum is 0.60, for the mathematical version in study 1, which, given the limited number of subjects involved in that study, may be questionable.

Table 2 Split half reliabilities for the Knowledge Monitoring Assessment Test

                            KMA_Math   KMA_Voc
Study 1 (n = 38)
  Split half reliabilities    0.41*      0.16
  Spearman–Brown prophecy     0.60       0.30
Study 2 (n = 108)
  Split half reliabilities    0.11       0.23*
  Spearman–Brown prophecy     0.20       0.37


The KR20 results (Table 3) confirm the results of the split half method: no good reliabilities are found. None of the results exceeds 0.50. Furthermore, deleting items did not result in higher reliabilities.

Table 3 KR20 for the Knowledge Monitoring Assessment Test

                    KMA_Math   KMA_Voc
Study 1 (n = 38)      0.42       0.20
Study 2 (n = 108)     0.37       0.37

Discussion and conclusion

Tobias and Everson constructed a theoretically sound and interesting method for measuring metacognitive skills. By developing a behavioral measurement instrument, they countered a criticism associated with self-report instruments, namely that such instruments fail to provide insight into students’ actual metacognitive skills. The KMA is based on a clear theoretical framework, stating that knowledge monitoring is the basic process underlying regulation and control activities. Furthermore, it is less time consuming and more easily applied than, for instance, the thinking aloud method. Tobias and Everson (1996, 2002) have reported ample studies with this instrument (see their reports for overviews) that indicate its validity. Because the KMA generates similar findings across different studies and with a variety of students, the KMA test was said to be a valid instrument. A similar conclusion was reached by Troonen (2000). The studies presented in this contribution raise doubt about the reliability of the KMA test, and hence it may be questioned whether validity can indeed be assumed. The studies reported on the use of the KMA with secondary and university students and addressed the internal consistency of the KMA. The results show that only low internal consistency values could be found. Different reasons might explain this disappointing result.

First, students might have been prone to socially desirable answering, leading them to indicate that they knew almost all problems or words (Furnham, 1986; Kalton & Schuman, 1982). However, the results do not confirm this. The averages of the estimations (part 1) for the mathematical and the vocabulary version show that, although on average students overestimated themselves, they did not claim to know all problems or words. As such, it does not seem that they mainly answered in a socially desirable way.

Second, estimating the reliability by a measure of internal consistency might itself provide an explanation. In the studies presented here, reliabilities were calculated on the integrated score of the two parts, even though such reliabilities are difficult to estimate: two administrations are involved, the estimate and the test (Tobias, personal communication, 2004). Even when the reliabilities of the two parts of the instrument are separately high, low overall reliability might be found. In the studies reported here, the reliabilities of the separate parts vary between 0.41 and 0.78 when using the KR20 as a reliability measure. The estimation parts give systematically higher reliabilities (0.65–0.78) than the second parts (0.41–0.65).



The approach adopted here, combining the two parts, was chosen because calculating the reliabilities of the parts separately does not reflect the internal consistency of the instrument as a whole. Moreover, to measure knowledge monitoring the two parts are essential: it is precisely the combination of the two parts, not the separate parts, that constitutes the instrument for knowledge monitoring.

While the KMA seems a very promising instrument, unique of its kind, further research is needed to test its reliability, combining different methods of reliability testing. In this contribution, internal consistency was used as an indicator of reliability. Other methods, such as the test–retest method, could be considered as well. However, this would raise questions about the right time interval, within which it can be assumed that no learning effect has occurred with respect to the exercises and vocabulary used in the test. Alternatively, one could consider the equivalent-forms method, which can be administered within the same time period. Research combining different methods of reliability testing may help to gain better insight not only into the internal consistency of the KMA test (Tobias & Everson, 1996, 2002) but also into its reliability more broadly, which is an essential feature of a broadly usable research instrument (Fraenkel & Wallen, 2003). If the reliability of the instrument can be established, the issue of validity can again be raised and studied.

Acknowledgment The authors are grateful to Sigmund Tobias and Howard Everson for their help with the operationalization of the test and the processing of the results.

Appendix

Part 2 of the KMA¹

Rekenopgaven (Mathematics test)

Los volgende oefeningen op. Schrijf je antwoord op de stippellijn naast elke oefening. De oefening bestaat in het berekenen van het getal dat hoort op de plaats van het vraagteken. (Het getal kan ook een breuk zijn.)

(Solve the following exercises. Write your answer on the dotted line next to each exercise. The exercise consists of calculating the number that belongs in the place of the question mark. The number can also be a fraction.)

1. −6 + 3 + (−4) − 8 = ?   ? = ...
2. (−27) : (−9) = ?   ? = ...
3. −16 = ?   ? = ...
4. 8 − 32 + 3 × 6 = ?   ? = ...
5. 15% van 50 = ?   ? = ...
6. 120 is 30% van ?   ? = ...
7. 5/2 + 4/5 = ?   ? = ...
8. 3/6 × 7/4 = ?   ? = ...
9. 1/5 × 2/17 = ?   ? = ...
10. 0.16 = 32/?   ? = ...
11. 10¹⁵ : 10³ = ?   ? = ...
12. (2/3)⁻³ : 4⁻² = ?   ? = ...
13. ∛512 = ?   ? = ...
14. (5x³)² = ?   ? = ...
15. (a − 2 + b) · (−3) = ?   ? = ...
16. (3a − 4b) · (2a + 5b) = ?   ? = ...
17. (4x² + 2x) : (−x + 3) = ?   ? = ...
18. (3/2 x − 5y³)² = ?   ? = ...
19. (x − 1/2 y + 2/3 x²) · (2x − 3y) = ?   ? = ...
20. (2x + 5y)⁵ = ?   ? = ...

¹ Part 1 consisted of identical exercises and words, only in a different order; in the right column students had to circle ‘I do know how to solve this’ or ‘I do not know how to solve this.’


Woordenschat (Vocabulary)

Duid het woord aan met dezelfde of bijna dezelfde betekenis als het vetgedrukte woord door de letter voor het woord te omcirkelen.

(Mark the word with the same or almost the same meaning as the word in bold by circling the letter in front of the word.)

1. Gratis
   A gratievol  B waardevol  C goedkoop  D kosteloos  E voordelig
2. Verdrag
   A verband  B vergiffenis  C overeenkomst  D som geld  E geduld
3. Flater
   A opschepperij  B storing  C gemeen  D misleiding  E vergissing
4. Plunderen
   A moorden  B vernielen  C binnenvallen  D doorzoeken  E buitmaken
5. Berucht
   A fanatiek  B beroemd  C bekend  D slecht befaamd  E misdadig
6. Radicaal
   A communist  B geheel en al  C ongelovige  D buigzaam  E opwindend
7. Beslissen
   A uitsluiten  B uitmaken  C spreken  D getuigen  E ondertekenen
8. Loof
   A gebladerte  B lofzang  C feest  D groente  E boom
9. Afwijzen
   A beschuldigen  B verstoten  C verwerpen  D wantrouwen  E ontslaan
10. Ergeren
    A aanmatigen  B opjagen  C vervelen  D aanstoot geven  E machtigen
11. Chronisch
    A lafhartig  B acuut  C zwak  D aanhoudend  E mopperend
12. Stuwen
    A voortbewegen  B aanvoeren  C ritmisch bewegen  D stijgen  E verplaatsen
13. Heterogeen
    A ongelijksoortig  B onregelmatig  C geniaal  D onbegrijpelijk  E twijfelachtig
14. Humaan
    A gedegen  B menslievend  C goed  D geestig  E gevoelig
15. Repliek
    A betwist standpunt  B kort verslag  C tegenantwoord  D overreding  E gedachtegang
16. Aanmatigend
    A arrogant  B trots  C vervelend  D bescheiden  E nerveus
17. Precair
    A precieus  B ongewoon  C vlegelachtig  D onzeker  E kostbaar
18. Krakeel
    A gebak  B herrie  C krentenbrood  D schaaldier  E vis
19. Houwitser
    A vuurtoren  B kanon  C soldaat  D verrekijker  E uniform
20. Soterisch
    A stemmingsvol  B zaligmakend  C kunstmatig  D geweldig  E opgetogen



References

Davidson, J. E., Deuser, R., & Sternberg, R. J. (1994). The role of metacognition in problem solving. In J. Metcalfe, & A. P. Shimamura (Eds.), Metacognition: Knowing about knowing (pp. 207–226). Cambridge, Massachusetts: MIT Press.

De Corte, E., Verschaffel, L., & Lowyck, J. (1986). Zelfrapportering als techniek bij de studie van onderwijsleerprocessen: Een poging tot verheldering [Self-reporting as a technique to study learning processes]. Pedagogische Studieen, 63, 506–514.

Dorner, D., & Wearing, A. J. (1995). Complex problem solving: Toward a (computersimulated) theory. In P. A. Frensch, & J. Funke (Eds.), Complex problem solving: The European perspective (pp. 65–99). Hillsdale, New Jersey: Erlbaum.

Ericsson, K. A., & Simon, H. A. (1993). Protocol analysis: Verbal reports as data (rev. ed.). Cambridge, Massachusetts: MIT Press.

Everson, H. T., & Tobias, S. (1998). The ability to estimate knowledge and performance in college: A metacognitive analysis. Instructional Science, 26, 65–79.

Flavell, J. H. (1979). Metacognition and cognitive monitoring: A new area of cognitive development inquiry. American Psychologist, 34, 906–911.

Fraenkel, J. R., & Wallen, N. E. (2003). How to design and evaluate research in education (5th ed.). New York: McGraw-Hill.

Frensch, P. A., & Funke, J. (1995). Definitions, traditions and a general framework for understanding complex problem solving. In P. A. Frensch, & J. Funke (Eds.), Complex problem solving: The European perspective (pp. 3–25). Hillsdale, New Jersey: Erlbaum.

Furnham, A. (1986). Response bias, social desirability and dissimulation. Personality and Individual Differences, 7(3), 385–400.

Harkness, J. A., & Schoua-Glusberg, A. (1998). Questionnaires in translation. In J. A. Harkness (Ed.), Cross-cultural survey equivalence (pp. 7–128). Mannheim: Schnelldurk Bongers.

Henson, R. K. (2001). Understanding internal consistency reliability estimates: A conceptual primer on coefficient alpha. Measurement and Evaluation in Counseling and Development, 34, 177–189.

Kalton, G., & Schuman, H. (1982). The effect of the question on survey responses: A review. Journal of the Royal Statistical Society, 145(1), 42–73.

Karat, J. (1997). User-centered software evaluation methodologies. In M. Helander, T. K. Landauer, & P. Prabhu (Eds.), Handbook of human-computer interaction (2nd ed.; pp. 689–704). Amsterdam: Elsevier.

Newman, R. S. (1998). Adaptive help seeking: A role of social interaction in self-regulated learning. In S. A. Karabenick (Ed.), Help seeking strategies: Implications for learning and teaching (pp. 13–37). Mahwah, New Jersey: Erlbaum.

Onwuegbuzie, A. J., & Daniel, L. G. (2002). A framework for reporting and interpreting internal consistency reliability estimates. Measurement and Evaluation in Counseling and Development, 35, 89–103.

Pintrich, P. R., Wolters, C. A., & Baxter, G. P. (2000). Assessing metacognition and self-regulated learning. In G. Schraw, & J. C. Impara (Eds.), Issues in the measurement of metacognition (pp. 43–97). Lincoln, Nebraska: Buros Institute of Mental Measurements.

Schraw, G., & Impara, J. C. (2000). Issues in the measurement of metacognition. Lincoln, Nebraska: Buros Institute of Mental Measurements.

Stinissen, J. (1975). De verbale en numerieke intelligentietest. Leuven: Afdeling Psychodiagnostiek, K.U.Leuven.



Tobias, S., & Everson, H. T. (1996). Assessing metacognitive knowledge monitoring [College Board Report No. 96-01]. New York: College Entrance Examination Board.

Tobias, S., & Everson, H. T. (2000). Assessing metacognitive knowledge monitoring. In G. Schraw, & J. C. Impara (Eds.), Issues in the measurement of metacognition (pp. 147–222). Lincoln, Nebraska: Buros Institute of Mental Measurements.

Tobias, S., & Everson, H. T. (2002). Knowing what you know and what you don’t: Further research on metacognitive knowledge monitoring [College Board Report No. 2002-3]. New York: College Entrance Examination Board.

Tobias, S. ([email protected]). (2004, May 25). Re: IUME Web Site communication. E-mail to G. Clarebout ([email protected]).

Troonen, R. (2000). Het meten van metacognitie: een valideringsonderzoek van de metacognitive knowledge monitoring assessment methode van Sigmund Tobias en Howard Everson [The measurement of metacognition: a validation study of the metacognitive knowledge monitoring assessment method of Sigmund Tobias and Howard Everson]. Thesis, University of Leuven, Leuven, Belgium.

Veenman, M. V. (2005). The assessment of metacognitive skills: What can be learned from multi-method designs? In B. Moschner, & C. Artelt (Eds.), Lernstrategien und Metakognition: Implikationen für Forschung und Praxis (pp. 75–97). Berlin: Waxmann.

Veenman, M. V., Elshout, J. J., & Groen, M. G. (1993). Thinking aloud: Does it affect regulatory processes in learning? Tijdschrift voor Onderwijsresearch, 18, 322–330.

Vermunt, J. (1992). Leerstijlen en sturen van leerprocessen in het hoger onderwijs: Naar procesgerichte instructie en zelfstandig denken [Learning styles and coaching of learning processes in higher education]. Lisse, The Netherlands: Swets & Zeitlinger.

Williams, M. D. (1996). Learner-control and instructional technology. In D. H. Jonassen (Ed.), Handbook of research for educational communications and technology (pp. 957–983). New York: Macmillan.
