The Context of Educational Measurement Instruction for Preservice Teachers: Professor Perspectives

Educational Measurement: Issues and Practice, Fall 1987



Arlen R. Gullickson, University of South Dakota, and Kenneth D. Hopkins, University of Colorado

Those who have assessed teachers' knowledge of evaluation practices have consistently voiced concern that teacher knowledge and skills are inadequate. For example, Goslin (1967) noted weaknesses in teachers' knowledge of standardized tests, and Mayo (1967) found that teachers have an inadequate understanding in several measurement areas, most notably statistics. More recent studies have served to reinforce such findings (see Rudman et al., 1980). One implication of these studies is that preservice instruction in measurement is inadequate, particularly in statistics and standardized testing.

Some investigators have argued that previous research efforts, and perhaps educational measurement courses as well, have focused on the wrong topics. In particular, elementary and secondary teachers do not attach high importance to standardized tests and statistics (Gullickson, 1986; Mayo, 1964, 1967). In fact, research suggests (Gullickson & Ellwein, 1985) that teachers do not use statistics in their evaluation of students. Research (Gullickson, 1982, 1985; Salmon-Cox, 1982; Stiggins & Bridgeford, 1982) also suggests that teachers spend a great deal of their evaluative energies in nontest activities. A direct conclusion of this research is that greater attention needs to be given to nontest evaluation practices, with lesser attention to statistics and standardized tests.

In sum, the previous research appears to yield two major conclusions. First, there are strong differences of opinion regarding what measurement instruction preservice teachers should receive. Second, teachers have not learned, and rarely apply, those concepts that apparently receive major emphasis in measurement instruction. Despite these conclusions, little has been done to directly assess undergraduate measurement instruction. The literature review yielded only two studies that focused directly on these issues (Noll, 1955; Roeder, 1972). Neither is current, and neither study investigated either the instructional design of measurement courses or the characteristics of the instructors and students in such courses. Such knowledge is necessary to make appropriate recommendations for change.

Arlen R. Gullickson is Professor of Education in the School of Education, University of South Dakota, Vermillion, SD 57069. He specializes in educational research and measurement and evaluation.

Kenneth D. Hopkins is Director of the Laboratory of Educational Research, University of Colorado, Campus Box 249, Boulder, CO 80309. He specializes in educational measurement.

This study was supported in part by the University of South Dakota General Research Fund.

This study focused on measurement instruction provided at the undergraduate level in teacher education. Instructors of educational measurement courses were surveyed to obtain the desired information: (a) course and student characteristics, (b) instructors' educational preparation and experience, and (c) learning/instructional activities and course content and emphases.

Method

Sample

To obtain a sample of instructors, a systematic sample of one third of the 99 colleges (n = 33) in South Dakota and its six contiguous states was drawn from a roster of colleges and universities that grant baccalaureate degrees in elementary and secondary education. This sample included 11 public colleges and universities and 22 private colleges. Names of the specific instructors at each institution were obtained through telephone calls to each institution.
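The 1-in-3 systematic draw described above can be sketched as follows. This is an illustrative sketch only, not the authors' actual procedure; the roster contents and function name are hypothetical:

```python
import random

def systematic_sample(roster, step):
    """Systematic sample: pick a random starting point, then take
    every `step`-th institution from the ordered roster."""
    start = random.randrange(step)
    return roster[start::step]

# A 1-in-3 draw from 99 institutions yields n = 33.
colleges = [f"college_{i}" for i in range(99)]
sample = systematic_sample(colleges, 3)
```

Because 99 is divisible by 3, every possible starting point yields exactly 33 institutions.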

Instrument

The content and questions (k = 67) contained in the inventory were based on the review of literature provided by Rudman et al. (1980), on an analysis of textbooks used in educational measurement courses, and on the authors' personal experiences and discussions with instructors of undergraduate measurement courses for preservice teachers.¹

To assess instructors' content emphases, item scores were combined to provide eight scales (see "Key," Figure 1). Cronbach alpha reliability estimates for the scales were quite high, ranging from .77 to .97.

In order to facilitate data collection and improve the response rate, items were presented in a fixed-response format. Before being placed in final form, the instrument was critiqued by teacher education colleagues and subsequently reviewed by three professors who taught educational measurement courses at nearby colleges.
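The Cronbach alpha reliabilities reported for the scales follow the standard internal-consistency formula. A minimal sketch, assuming item scores arranged as respondents-by-items (the function name and data layout are illustrative, not the authors' analysis code):

```python
from statistics import variance

def cronbach_alpha(item_scores):
    """Cronbach's alpha for item_scores given as a list of respondents,
    each a list of k item scores:
    alpha = (k / (k - 1)) * (1 - sum of item variances / variance of totals)."""
    k = len(item_scores[0])
    columns = list(zip(*item_scores))  # transpose to per-item columns
    item_var_sum = sum(variance(col) for col in columns)
    total_var = variance([sum(row) for row in item_scores])
    return (k / (k - 1)) * (1 - item_var_sum / total_var)
```

With perfectly parallel items the statistic reaches its maximum of 1.0; values such as the .77 to .97 range reported here indicate scales whose items covary strongly.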

Procedures

Questionnaires, with cover letters, were mailed to the identified professors at each of the colleges. Subsequent mailings of a postcard reminder and a new questionnaire were made at approximately 10-day intervals to nonrespondents. Finally, the written reminders were followed by a telephone call to remaining nonrespondents.

Responses were received from 28 colleges (17 private, 11 public), 85% of the colleges sampled. Where two or more responses were received from a given institution, only the responses of the instructor primarily responsible for the instruction of educational measurement were included for analysis purposes. In the few cases where this matter was unclear, the choice was made based on the completeness of responses to the questionnaire. This distilling process yielded 24 questionnaires that were essentially complete and four questionnaires in which the section on educational measurement and evaluation content was not completed.

Results

Course Context

The results indicate that roughly half the students receive measurement and evaluation as a separate course and half receive this instruction within the context of another course. Educational measurement and evaluation information is provided to undergraduate students as a separate course in 71% of the colleges, yet only three fourths of the colleges that offer a separate course require it for preservice teachers. When it is optional, professors generally report that few students (25% or fewer) take the course. (No professor indicated that more than 50% take the optional course.) In colleges where no course is provided or where the course is optional, professors' responses suggest that educational measurement information is taught within another required course, typically educational psychology or a methods course.

When offered as a separate course, it is typically offered for 2 to 3 semester hours (56% and 28%, respectively). Most students take the course as juniors (56%) or seniors (28%), and 90% take it prior to student teaching. The remainder either take the course subsequent to student teaching or during a student teaching block (but not concurrent with student teaching).

Instructors

Who teaches the measurement course provides a perspective on the importance attached to the course and on the quality of its instruction. Ordinarily, one would expect lower quality instruction if the course is "passed around" and taught by many faculty or by junior faculty members. In this regard the professors' responses were positive: (a) No professor reported that graduate students are allowed to teach the course, (b) in most colleges the course appears to be taught by a small group of experienced professors (41% reported that they alone teach the course and 41% reported that one person in addition to themselves teaches it), and (c) only 8% reported that another faculty member was better prepared to teach the course than they.

The instructors had diverse educational backgrounds. All reported having at least a master's degree and 82% the doctorate. Although the nature of their specialization varied greatly, 46% reported having majored in a measurement-related area, such as psychology, educational psychology, educational research, or statistics; an additional 28% reported a minor in this area. Forty-six percent reported a major in some other area of education, and only 7% reported their major as being in some discipline not directly related to either education or measurement. When these categories are combined, the net result is that 74% reported either a major or a minor in a measurement-related field, and the remaining 26% have either a major or a minor in education.

Eighty percent of the instructors reported having taken an undergraduate-level measurement/evaluation course. Only one person reported not having taken a graduate-level measurement course: The large majority (89%) reported having taken at least 6 semester hours. The large number of measurement credit hours reported, and the fact that the number of credit hours taken was not related to major, suggest that these professors did not report just the credit hours that they have received in measurement courses, but instead reported credit hours from courses that they deemed related to educational measurement.

FIGURE 1. Relative emphasis reported for various evaluation topics by instructors in preservice teacher education programs. [Bar graph of mean emphasis for each scale; image not reproduced.]

Key: Scales and number of items in each scale: STAT = computing and interpreting statistical data (7); PE = preparing exams (13); AS = administering and scoring tests (7); FE = using test results for instructional planning and formative evaluation (7); CA = general assessment issues (11); SE = using test results for summative evaluation purposes (8); NT = employing nontest evaluative devices (8); LI = legal issues in testing (6).

All but two (93%) of the professors reported having taught at the elementary or secondary school level, and all had taught at least 1 year at the college level. The medians were 7 years of precollege teaching experience and 14 years of college teaching experience. All reported having taught the undergraduate measurement course at least once, and 82% reported having taught the course at least three times. Only a third, however, have taught a graduate-level educational measurement course.

Approximately one fourth of the professors who teach the educational measurement course are not considered to be members of the department or school/college of education. In at least a half-dozen instances, the person who taught the course was either in a different department (typically, psychology) or was an adjunct faculty member.

Characteristics of the Course

Five facets of course instruction were explored: (a) a theoretical versus practical focus of the course, (b) instructional strategies employed to teach the course, (c) the breakdown of time devoted to various activities, (d) use of the computer as an instructional tool, and (e) the relative emphasis given to each of eight areas of educational measurement and evaluation content.

None of the professors reported a totally theoretical focus for the course. Eight percent reported a practical focus, 36% reported giving greatest emphasis to practice with some emphasis on theory, and the majority (56%) reported an even split between theory and practice.

Most professors (89%) reported that they determine the content of the course themselves, with little or no direct student involvement.

The instructors tend to use a lecture/discussion format and complement it with student activities. A relatively small proportion (25%) occasionally or frequently make use of self-study modules. Similarly, 34% occasionally or frequently use student-directed tutorials. Lecture/discussion takes approximately 50% of class time for the typical (median) teacher, and student activities take another 40% of the time. The remaining 10% of the time is used for testing purposes; generally two to four examinations are given.

Because student activities is a broad term, professors were asked to further subdivide their respective time allocation for this category. In doing so, they were offered three main options. The typical professor reported spending about half of this time on problems and exercises, with most of the remaining time divided between item and test development and individual student reading or programmed instruction.

Fewer than one third (29%) reported any computer use for instructional purposes. Of those who did, the most frequent purpose was computation of test statistics. Half the computer users reported use of programmed instruction as well. The use of a computer for test scoring, item analysis, or test development was rarely mentioned.

Figure 1 displays the professors' content emphases for preservice instruction in educational measurement and evaluation. Note that they reported giving great emphasis to two areas: statistics and the preparation of exams. Compared to the other six topics, these areas had effect sizes (f) of approximately .56 or more. (The variability within the eight topics was very similar; the standard deviations varied from .95 for statistics to 1.29 for legal issues.) Least emphasized were legal issues in testing and nontest (qualitative or naturalistic) evaluation procedures; these areas were more than .56 below all except one of the other areas. This is especially noteworthy in that teachers have very different perceptions: They view the need for nontest evaluation procedures as far more important than statistics (Gullickson, 1986).

Interrelationships Among Variables

A primary goal for this study was to identify course and instructor variables related to the nature of instruction that preservice teachers receive. A few significant correlations were obtained. When the measurement instruction is given as a separate course, much greater emphasis is given to statistics (r_pb = .55, p < .01) and general assessment (r_pb = .41, p < .05); expressed as effect sizes, the differences are approximately 1.6 or more. It is also interesting to note that a separate course is much more often required in public institutions than in private colleges (90% vs. 29%, r_pb = .64, p < .001).

Certain professor variables were related to content and instructional emphases. Those with a major in measurement reported giving much greater emphasis (.8-1.6) to the administration and scoring of tests (r_pb = .49, p < .02) and to the application of qualitative evaluation procedures (r_pb = .40, p < .05). One additional relationship was noteworthy: Instructors with more college teaching experience gave greater emphasis to test development (r_pb = .40, p < .05).
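Correlations of this kind, between a dichotomous variable (separate course: yes/no; measurement major: yes/no) and a continuous emphasis scale, are point-biserial coefficients. A minimal sketch with hypothetical names, assuming the population standard deviation in the denominator:

```python
from math import sqrt

def point_biserial(group, scores):
    """Point-biserial correlation between a 0/1 grouping variable and a
    continuous score; algebraically identical to Pearson's r computed
    with the dichotomous variable coded 0/1."""
    n = len(scores)
    mean = sum(scores) / n
    sd = sqrt(sum((x - mean) ** 2 for x in scores) / n)  # population SD
    ones = [s for g, s in zip(group, scores) if g == 1]
    zeros = [s for g, s in zip(group, scores) if g == 0]
    p = len(ones) / n
    mean_diff = sum(ones) / len(ones) - sum(zeros) / len(zeros)
    return mean_diff / sd * sqrt(p * (1 - p))
```

Because the coefficient is just Pearson's r in disguise, the same r-to-effect-size translations apply as for any correlation.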

Discussion

Stiggins, Conklin, and Bridgeford (1986) have documented "the tremendous complexity of the classroom assessment task" (p. 13). The available research suggests that preservice instruction/curriculum in educational assessment is not adequate to develop the desired skills. The current study suggests that even when taught as a separate course, instruction is overly preoccupied with the teaching of statistics. Clearly, certain statistical concepts are needed, but these can be taught with minimal emphasis on computational details. Teachers can interpret standard scores and correlation coefficients even if they cannot compute s or r.

Three issues concern the appropriate instructional design for teacher preparation: (a) the amount of time devoted to measurement instruction, (b) instructional emphases and sequencing, and (c) the professors' preparation and orientation.

Time/Timing

Virtually all students now receive some form of instruction in educational measurement. Unfortunately, the large majority of this instruction comes prior to student teaching. Consequently, most do not have a clear understanding of, or appreciation for, the complexity of the evaluation tasks facing them. The context is somewhat artificial: there is no "lab" for applying what is being taught. The relevance is much more apparent when one is faced with the realities of test development, interpretation, and grading.

When the measurement instruction is nested within another course, it can be very superficial, depending on the orientation, interests, and expertise of the instructor. Experienced faculty know that even a full course (without an overdose of statistical computation) is minimal preparation for providing the range of quantitative and qualitative evaluation methods needed to grasp and apply the proper role of measurement/evaluation in instructional design.

Instructional Emphasis

The patterns of instructional emphases substantiate the concerns of those who favor an emphasis on qualitative evaluation techniques. When professors do have more time available, rather than giving more attention to nontest techniques or to all content areas, they instead tend to emphasize statistics and standardized testing issues, both of which clearly should have lower priorities than areas, such as test development and grading, that have a more direct bearing on student learning and good instructional design. This suggests the need for a shift toward greater emphasis on qualitative techniques, which may require both an increase in instructional time and a change in professors' attitudes toward qualitative evaluation techniques. Perhaps because instructors are inclined to emphasize topics about which they feel more secure, they may be slow in including the emphasis on qualitative methods that has become common in current educational research.

Professor Characteristics

The faculty who teach educational measurement are not a homogeneous group. Most have not majored in educational measurement, and many are not even members of the respective department, school, or college of education. This diversity of faculty seems to pose two problems. First, the research reported by Stiggins et al. (1986) makes clear the need for specific understanding of measurement as it applies to classroom practices. Almost all measurement professors have substantial teaching experience, some have substantial measurement credentials, and most have specialties in education. But probably few are specifically trained in the application of measurement to day-to-day classroom evaluation concerns.

Second, attempts to change the content of educational measurement courses in ways that increase the instructional time allocated to educational measurement will necessarily impinge on other undergraduate instruction required of the students. Such changes in curriculum require substantial interfaculty cooperation, understanding, and trust. Because many measurement professors are not an integral part of the teacher education faculty, they may be less committed and/or less able to lobby effectively for curricular revisions that better serve the measurement and evaluation needs of teachers. This situation suggests that efforts to modify or reform programs for preparation of teachers cannot be directed solely toward professors of educational measurement courses. Instead, proposed changes must be brought to the attention of teacher educators in general.

In sum, it seems apparent that many students, given the substantial constraints imposed on the educational measurement course, will continue to be inadequately prepared for classroom evaluation tasks. It is time to rethink the appropriate place of measurement and evaluation in the teacher training curriculum. Which topics should be included, and what should be the relative emphasis of each? The authors are of the opinion that the decreased measurement emphasis in preservice training programs evident over the past 3 decades results in part from an inordinate emphasis on computational statistics and related areas. With an orientation toward the evaluation needs of classroom teachers, perhaps a systematic study of evaluation procedures can regain its rightful place as an essential ingredient in any sound system of instructional design.

Note

¹A description of the items is available from the first author upon request.

References

Goslin, D. A. (1967). Teachers and testing. New York: Russell Sage Foundation.

Gullickson, A. R. (1982). The practice of testing in elementary and secondary schools. Paper presented at the Rural Education Conference at Kansas State University, Manhattan. (ERIC Document Reproduction Service No. ED 229 391)

Gullickson, A. R. (1985). Student evaluation techniques and their relationship to grade and curriculum. Journal of Educational Research, 79(2), 96-100.

Gullickson, A. R. (1986). Teacher education and teacher-perceived needs in educational measurement and evaluation. Journal of Educational Measurement, 23, 355-368.

Gullickson, A. R., & Ellwein, M. C. (1985). The goodness-of-fit between prescription and practice: Post hoc analysis of teacher-made tests. Educational Measurement: Issues and Practice, 4(1), 15-18.

Mayo, S. T. (1964). What experts think teachers ought to know about educational measurement. Journal of Educational Measurement, 1, 79-86.

Mayo, S. T. (1967). Pre-service preparation of teachers in educational measurement (Contract No. OE 4-10-011). Chicago: Loyola University.

Noll, V. H. (1955, September). Requirements in educational measurement for prospective teachers. School and Society, 88-90.

Roeder, H. H. (1972). Are today's teachers prepared to use tests? Peabody Journal of Education, 49(3), 239-240.

Rudman, H. C., Kelly, J. L., Wanous, D. S., Mehrens, W. A., Clark, C. M., & Porter, A. C. (1980). Integrating assessment with instruction: A review (1922-1980). East Lansing: Michigan State University, College of Education, Institute for Research on Teaching.

Salmon-Cox, L. (1982, May). Teachers and standardized achievement tests: What's really happening? Phi Delta Kappan, 631-634.

Stiggins, R. J., & Bridgeford, N. J. (1982). Final research report on the role, nature and quality of classroom performance assessment (Contract No. 400-80-0105). Portland, OR: Northwest Regional Educational Laboratory, Center for Performance Assessment.

Stiggins, R. J., Conklin, N. F., & Bridgeford, N. J. (1986). Classroom assessment: Key to effective education. Educational Measurement: Issues and Practice, 5(2), 5-17.
