Quality Control in Survey Design: Evaluating a Rating Scale of Educators’ Attitudes Toward Differentiated Compensation
Shannon O. Sampson
Kelly D. Bradley
University of Kentucky
Background
Surveys are the most common example of self-reported data collection and one of the most popular research methodologies for graduate studies and published papers in education (Aiken, 1988; Babbie, 1992; Gay, 1981).
Even so, the efficiency and effectiveness of the instrument as a measurement tool are often overlooked or underemphasized.
“Operationalizing and then measuring variables are two of the necessary first steps in the empirical research process. Statistical analysis, as a tool for investigating relations among the measures, then follows. Thus, the interpretation of analyses can only be as good as the quality of the measures.” (Bond and Fox, 2001)
Objectives of Study
Utilize Rasch analysis to evaluate the quality of a survey instrument designed to measure educators’ attitudes about differentiated compensation
Employ a data-driven model for improving survey instrumentation
Assumptions with traditional rating scale data analysis
Each item contributes equally to the measure of the construct
Each item is measured on the same interval scale
Respondents have appropriately interpreted the directions
All items are written clearly such that only one interpretation is possible
However…
Items generally represent different amounts of a variable
Scales are ordinal, so categories are not necessarily spaced equally
Respondents often misinterpret directions
Items are often open to multiple interpretations
Furthermore…
Estimates for items depend on severity of respondents in sample
Estimates of item ratings cannot be compared across groups
Complete records required
Single standard error of measurement is produced for the composite of ratings
Rasch model
Probabilistic version of the scalogram
Parameters are neither sample dependent nor test dependent; missing data are not problematic
Standard error estimates produced for each discrete raw score
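A minimal sketch of the dichotomous Rasch model may make the "probabilistic scalogram" idea concrete: the probability of endorsing an item depends only on the difference between the person measure and the item measure, both expressed in logits. The function name and the example values are illustrative assumptions, not taken from the study, and the survey's four-category ratings would need a polytomous extension of this form.

```python
import math

def rasch_probability(person_measure: float, item_measure: float) -> float:
    """Probability of endorsement under the dichotomous Rasch model.

    Both measures are in logits; only their difference matters.
    """
    return 1.0 / (1.0 + math.exp(-(person_measure - item_measure)))

# Hypothetical values for illustration only (not from the study).
p_equal = rasch_probability(1.0, 1.0)    # person and item at the same point
p_easier = rasch_probability(1.0, 0.0)   # item one logit easier to endorse
p_shifted = rasch_probability(2.0, 1.0)  # same one-logit difference, shifted
```

When the person and item sit at the same point on the scale the probability is one half, and shifting both measures by the same amount leaves the probability unchanged.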
Attitudes about differentiated compensation
Differentiated compensation: Range of incentives added to present compensation
Salary bonuses for teaching in critical shortage areas
Financial support for seeking advanced degrees
Participation in voluntary career advancement opportunities
Instrumentation and Sample
10 KY school districts involved in differentiated compensation program pilot
University of Kentucky faculty constructed a pencil and paper survey instrument
Survey administered to four groups:
Teachers (n = 438)
Mentor teacher-achievement coaches (n = 60)
Principals (n = 63)
Superintendents (n = 10)
“Prior to analysis, our preliminary ideas about items and persons we choose to study obligates us to form specific hypotheses about both items and persons.” (Wright & Stone, 2004)
Evaluating the quality of the instrument
1. Evaluate the coherence of the data (does a yardstick exist?)
2. Evaluate the rating scale structure (how accurate is the yardstick?)
3. Evaluate the individual items (can the yardstick be refined?)
1. Evaluate the coherence of the data (do I have a yardstick?)
Have items been keyed as intended?
Are there problems with the data coding?
Are the items measuring only one variable?
Are all items pointing in the same direction?
Kentucky Department of Education Differentiated Compensation Survey
MEASURE  INFIT MNSQ  INFIT ZSTD  OUTFIT MNSQ  OUTFIT ZSTD  PTMEA CORR.  ITEM
   1.34        1.48         7.5         1.6            9        -0.19  DC would not enhance the positive…
   1.76        1.47         6.8        1.73          9.5        -0.15  DC will have a negative impact on…
   1.77        1.34         5.1        1.62          8.3        -0.14  Relations b/n admin and inst staff negatively…
   2.26        1.22         2.9        1.52          5.9        -0.05  Teachers receiving DC will be less…
  -0.08        1.54         6.9        1.69            8        -0.02  linking teacher salary to student…
    2.2        1.15           2        1.27          3.4         0.01  There is too much peer pressure…
   0.85        1.36         5.9        1.39          6.2         0.08  Non-certified teachers should pay…
  -1.15        1.25         2.3        1.21          1.8         0.18  All teachers should be required to be…
   0.32        1.15         2.4        1.21          3.1         0.23  When non-cert tchrs hired, districts should…
  -1.59        0.97        -0.2        0.85         -1.1         0.26  school districts should support teachers…
  -0.26         1.2         2.7        1.17          2.1         0.28  School districts should pay for university…
   0.07        1.03         0.5        1.01          0.2          0.3  Certified teachers are more effective…
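One way to act on the keying questions above is to flag items whose point-measure correlation is negative, since these may be worded in the opposite direction and need reverse scoring. The sketch below uses the correlations from the table; the cutoff of zero, the function name, and the 1-to-4 coding are assumptions for illustration rather than the authors' documented procedure.

```python
# (item label as shown in the table, point-measure correlation)
items = [
    ("DC would not enhance the positive…", -0.19),
    ("DC will have a negative impact on…", -0.15),
    ("Relations b/n admin and inst staff negatively…", -0.14),
    ("Teachers receiving DC will be less…", -0.05),
    ("linking teacher salary to student…", -0.02),
    ("There is too much peer pressure…", 0.01),
    ("Non-certified teachers should pay…", 0.08),
    ("All teachers should be required to be…", 0.18),
    ("When non-cert tchrs hired, districts should…", 0.23),
    ("school districts should support teachers…", 0.26),
    ("School districts should pay for university…", 0.28),
    ("Certified teachers are more effective…", 0.3),
]

def reverse_score(response: int, lowest: int = 1, highest: int = 4) -> int:
    """Flip a rating so that a negatively worded item points the same way
    as the rest of the scale (assumes categories coded lowest..highest)."""
    return lowest + highest - response

# Assumed cutoff: any negative correlation marks an item for review.
flagged = [label for label, corr in items if corr < 0]
```

The flagged items are all negatively worded statements, which suggests checking their keying before questioning the items themselves.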
2. Evaluate the rating scale structure (how accurate is my yardstick?)
Do the mean measures for the responses in each category increase as the categories step up the scale in the direction defined as “more”?
Do the categories fit the expectations of the model?
How are the respondents using the rating scale?
Empirical item-category measures for administrators
[Plot: observed category labels (1-4) for each item along a logit scale from -2 to 5, with the person distribution beneath; the left-to-right order of the labels for each item is listed below]
NUM  CATEGORY ORDER  ITEM
17   1 2 3 4         Non-cert Ts should pay all...
7    1 2 3 4         linking teacher salary to...
11   1 2 3 4         Students' standardized tes...
2    1 2 3 4         DC would not enhance the p...
18   1 2 3 4         When non-cert Ts are hired...
24   1 2 3 4         If Ts receive a DC bonus...deserve it
33   3 2 4           There is too much peer pre...
25   1 2 3 4         The size of the salary bon...
19   3 4 2           cert Ts are more effective...
8    1 2 3 4         Ts receiving differentiate...
1    1 2 3 4         DC will attract better qua...
32   3 2 4           Ts believe their school stresses excellence...
31   3 4             Ts feel a sense of ownership in student lrng
28   3 4             Ts identify with their sch...
29   3 2 4           Ts take pride in being a p...
30   3 4             Ts feel a sense of ownership in school
27   3 4             Improving knowledge and sk...
34   3 4             Ts are encouraged to make suggestions
16   1 4 3           All Ts should be required...
Do the mean measures for the responses in each category increase as the categories step up the scale in the direction defined as “more”?
Differentiated Compensation Survey - Administrators
[Figure: category probabilities (modes), structure measures at intersections. Vertical axis: probability of response, .0 to 1.0. Horizontal axis: person [minus] item measure, -3 to 3. Curves for categories 1-4, labeled Strongly disagree, Disagree, Agree, Strongly agree.]
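Category probability curves of this kind can be sketched from the structure measures alone. The example below assumes the Andrich rating scale model and treats the structure measures reported in the administrators' category statistics (-1.12, -.67, 1.79) as the thresholds between adjacent categories; the function name is hypothetical.

```python
import math

def category_probabilities(person_minus_item: float, thresholds: list) -> list:
    """Probabilities of each rating category under the rating scale model.

    thresholds[j] is the point at which categories j+1 and j+2 are
    equally probable (the "structure measure" between them).
    """
    # Cumulative sums of (person - item - threshold); lowest category is 0.
    logits = [0.0]
    for tau in thresholds:
        logits.append(logits[-1] + (person_minus_item - tau))
    peak = max(logits)  # subtract the maximum for numerical stability
    weights = [math.exp(value - peak) for value in logits]
    total = sum(weights)
    return [w / total for w in weights]

# Structure measures for categories 2, 3, and 4 (administrators).
thresholds = [-1.12, -0.67, 1.79]

probs_low = category_probabilities(-3.0, thresholds)
probs_high = category_probabilities(3.0, thresholds)
probs_at_top_threshold = category_probabilities(1.79, thresholds)
```

At a person-minus-item value equal to a threshold, the two adjacent categories are equally likely, which is what "structure measures at intersections" describes.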
How are the respondents using the rating scale?
CATEGORY       OBSERVED     OBSVD   SAMPLE   INFIT   OUTFIT   STRUCTURE   CATEGORY
LABEL  SCORE   COUNT   %    AVRGE   EXPECT   MNSQ    MNSQ     MEASURE     MEASURE
1      1          93   4     -.76     -.85   1.12    1.39     NONE        ( -2.54)
2      2         220   9      .32      .29    .97     .92     -1.12          -.96
3      3         982  40     1.31     1.36   1.00     .96      -.67           .70
4      4        1145  47     2.60     2.57   1.01    1.00      1.79       (  2.95)
MISSING            8   0      .78
Do the categories fit the expectations of the model?
administrators
Respondent use of scale is almost dichotomous
Category fit is acceptable
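The three rating scale questions can be checked directly against the administrators' category statistics. The sketch below copies the counts, observed averages, structure measures, and outfit mean squares from that table; the outfit cutoff of 2.0 is an assumed guideline, not a value stated in the slides.

```python
# Rows from the administrators' category table:
# (category, count, observed average, structure measure, outfit MNSQ)
categories = [
    (1, 93, -0.76, None, 1.39),
    (2, 220, 0.32, -1.12, 0.92),
    (3, 982, 1.31, -0.67, 0.96),
    (4, 1145, 2.60, 1.79, 1.00),
]

def strictly_increasing(values):
    return all(a < b for a, b in zip(values, values[1:]))

# 1. Do average measures rise as the categories step up the scale?
averages_increase = strictly_increasing([row[2] for row in categories])

# 2. Are the structure measures (thresholds) in order?
thresholds_ordered = strictly_increasing(
    [row[3] for row in categories if row[3] is not None]
)

# 3. How are respondents using the scale?
total = sum(row[1] for row in categories)
shares = [row[1] / total for row in categories]
top_two_share = shares[2] + shares[3]

# Assumed guideline: category outfit mean square below 2.0.
category_fit_ok = all(row[4] < 2.0 for row in categories)
```

Roughly 87 percent of the non-missing responses fall in the top two categories, consistent with the observation that use of the scale is almost dichotomous.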
3. Evaluate the individual items (how can I refine my yardstick?)
Do the items fall into the hypothesized hierarchy?
Do the items spread evenly across the intended range of the instrument?
Do any items clump at a point on the scale?
Which items are misfitting? Why might they be misfitting?
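Misfit is usually judged from infit and outfit mean squares, which summarize how far observed responses fall from model expectations. As a minimal sketch, assuming the expected score and model variance of each response are already available, outfit is the mean of squared standardized residuals and infit is the information-weighted version. The data below are hypothetical, and the conversion of mean squares to ZSTD values is not shown.

```python
def fit_mean_squares(observed, expected, variance):
    """Return (infit, outfit) mean squares for one item.

    observed, expected, variance: one entry per respondent.
    """
    squared_residuals = [(x - e) ** 2 for x, e in zip(observed, expected)]
    # Outfit: unweighted mean of squared standardized residuals.
    outfit = sum(r / v for r, v in zip(squared_residuals, variance)) / len(observed)
    # Infit: squared residuals weighted by the model variance.
    infit = sum(squared_residuals) / sum(variance)
    return infit, outfit

# Hypothetical dichotomous responses with model probabilities p;
# the variance of each response is p * (1 - p).
observed = [1, 1, 0]
expected = [0.9, 0.5, 0.1]
variance = [p * (1 - p) for p in expected]

infit, outfit = fit_mean_squares(observed, expected, variance)
```

Values near 1.0 indicate responses about as variable as the model predicts; larger values indicate more noise, and smaller values indicate less variability than expected.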
Do the items fall into the hypothesized hierarchy?
(Teacher hierarchy)
Difficult to endorse
Students’ standardized test scores would improve if a differentiated compensation program were adopted
Differentiated compensation would not enhance the positive relationship among teachers and administrators
A differentiated compensation program will positively affect teacher morale
Relations between administrative and instructional staff will be negatively affected if a differentiated compensation program is adopted
Differentiated compensation will have a negative impact on the morale of the teachers in the system
Differentiated compensation programs recognize teacher contributions to student learning
Differentiated compensation programs help recruit teachers who can improve student learning
A differentiated compensation program will result in a higher teacher retention rate
It is appropriate for teachers to receive bonuses to serve in rural schools classified as a “difficult assignment” or “hard-to-fill” position
The size of the salary bonus I could receive to become certified in a critical shortage area or “difficult assignments” must be large enough to motivate me.
A differentiated compensation program helps recruit better-qualified people to the teaching profession
A differentiated compensation program helps retain teachers in critical shortage areas
I identify with this school
All teachers should be required to be certified to teach
I take pride in being a part of this school
Improving teachers’ knowledge and skills enhances student learning
School districts should support teachers who are voluntarily advancing their careers
I feel a sense of ownership in student learning
Easy to endorse
Do the items spread evenly across the intended range of the instrument?
MAP OF PERSONS AND ITEMS
[Figure: persons (left) and items (right) plotted on a common measure scale from -4 to 5. Persons run from <more> at the top to <less> at the bottom; items run from <rare> at the top to <frequent> at the bottom. M, S, and T markers appear on each distribution.]
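Spread and clumping can also be inspected numerically by sorting the item measures and looking at the gaps between neighbors. The sketch below uses only the twelve item measures listed in the earlier fit table, which are part of the instrument rather than the full item set; the 0.10 logit clump cutoff and the function name are assumptions for illustration.

```python
# Item measures (logits) from the twelve rows of the earlier fit table.
item_measures = [1.34, 1.76, 1.77, 2.26, -0.08, 2.2,
                 0.85, -1.15, 0.32, -1.59, -0.26, 0.07]

def adjacent_gaps(measures):
    """Return (lower, upper, gap) for each pair of neighboring measures."""
    ordered = sorted(measures)
    return [(low, high, high - low) for low, high in zip(ordered, ordered[1:])]

CLUMP_BELOW = 0.10  # assumed cutoff: neighbors closer than this "clump"

gaps = adjacent_gaps(item_measures)
clumps = [(low, high) for low, high, gap in gaps if gap < CLUMP_BELOW]
widest = max(gaps, key=lambda g: g[2])  # largest stretch with no item
```

Items that clump measure nearly the same level of the attitude, while wide gaps mark regions of the scale where persons are measured less precisely.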
Which items are misfitting? Why might they be misfitting?
Item Misfits for Teachers
Infit ZSTD  Outfit ZSTD  Item
7.2         9.6          There is too much peer pressure here to do a good job
4.9         5.8          All teachers should be required to be certified to teach
5.9         7.3          Certified teachers are more effective than non-certified teachers
7.3         8.6          Non-certified teachers should pay all the costs of becoming certified
3.1         3.9          This school stresses excellence
3.7         3.4          I’m encouraged to make suggestions about how we can be more effective
Do these items tap the construct?
Item Misfits for Teachers
Infit ZSTD  Outfit ZSTD  Item
5.3         5.6          School districts should pay for university coursework in content areas
4.8         5.4          Linking teacher salary to student achievement on standardized tests has no place in education
2.9         4.2          When non-certified teachers are hired, school districts should pay all the costs of becoming certified
3.0         3.4          The size of the salary bonus I could receive to become certified in a critical shortage area or “difficult assignments” must be large enough to motivate me
These items may have multiple interpretations:
In which content areas?
On which standardized tests?
For all teachers and all content areas?
What is “large enough to motivate”?
Item Misfits for Teachers
Infit ZSTD  Outfit ZSTD  Item
-5.2        -5.1         Differentiated compensation will positively affect teacher morale
-4.7        -5.3         Differentiated compensation provides incentives for growth
-7.6        -7.3         Students’ standardized test scores would improve in the district if a differentiated compensation program were adopted
-6.6        -6.4         Differentiated compensation programs help recruit teachers who can improve student learning
Less variability than the model would predict
Redundancy or high agreement among respondents
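The sign of a standardized fit statistic separates the two kinds of misfit shown in the teacher tables. As a rough sketch, assuming a conventional cutoff of 2.0 in absolute value (an assumption, not a threshold stated in the slides), large positive values point to more variability than the model predicts and large negative values to less. The rows below are a subset of the teacher items, using their infit ZSTD values.

```python
def classify_fit(zstd: float, cutoff: float = 2.0) -> str:
    """Label a standardized fit value; the cutoff is an assumed convention."""
    if zstd > cutoff:
        return "underfit"    # noisier than the model predicts
    if zstd < -cutoff:
        return "overfit"     # less variability than the model predicts
    return "acceptable"

# (infit ZSTD, item) for a subset of the teacher misfit rows.
teacher_items = [
    (7.2, "There is too much peer pressure here to do a good job"),
    (4.9, "All teachers should be required to be certified to teach"),
    (5.3, "School districts should pay for university coursework in content areas"),
    (4.8, "Linking teacher salary to student achievement on standardized tests has no place in education"),
    (-5.2, "Differentiated compensation will positively affect teacher morale"),
    (-4.7, "Differentiated compensation provides incentives for growth"),
]

labels = [classify_fit(z) for z, _ in teacher_items]
underfit_count = labels.count("underfit")
overfit_count = labels.count("overfit")
```

Underfitting items are candidates for rewording or removal because they may not tap the construct or may be read in more than one way, while overfitting items are candidates for review as redundant.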
“The problem of measurement, and especially of attaining interval scales, is an extremely serious one for the social and behavioral sciences. It is unfortunate that in their search for quantitative methods, researchers sometimes overlook the question of level of measurement and tend to read quite unjustified meanings into their results… However, the core problem of level of measurement lies outside the province of mathematics and statistics.” (Hays, 1988)
Educational Importance
The education community benefits from better-informed results when data are collected with a more valid and reliable instrument.
Rasch analysis offers a sound methodology for evaluating the quality of the measurement instrument.