The Inter-temporal Stability of Teacher Effect Estimates

J. R. Lockwood (The RAND Corporation)
Daniel F. McCaffrey (The RAND Corporation)
Tim R. Sass (Florida State University)

National Conference on Value-Added Modeling, April 2008

This presentation has not been formally reviewed and should not be cited or distributed without the authors’ permission.



Introduction

- Several school districts and states have begun using measures of teachers’ contributions to student achievement to assess and reward teachers
    - Denver, Houston, Florida
- For “value added” measures to provide correct incentives and be acceptable to stakeholders, they must:
    - be relatively accurate measures of productivity (i.e., unbiased)
    - be relatively stable over time
- Most observers (implicitly) assume that a given teacher’s productivity doesn’t vary much from year to year

Research Questions

- How stable are estimated teacher effects?
- What factors affect the stability of estimated teacher effects?
- Are there methods to enhance the stability of estimated teacher effects?

Previous Literature

- Ballou (2005)
    - Elementary and middle school teachers in a “moderately large” Tennessee school district in two consecutive years
    - Nearly 50 percent of math teachers in top quartile in one year stay in the top quartile the next year
    - Precision increases with number of student observations per teacher
- Aaronson et al. (2007)
    - High school teachers in Chicago over two years
    - 57 percent of teachers in the top quartile in one year remain there in the next year

Previous Literature

- Koedel and Betts (2007)
    - Teachers in San Diego
    - Within-school estimates of teacher quality
    - Models with student and school fixed effects
    - 35 percent of teachers ranked in the top quintile remain there in the next year
    - Omission of student and school fixed effects increases stability of estimated teacher effects

Models of Teacher Effects

- General Value-added Model

$$A_{it} - A_{it-1} = \beta_1 X_{it} + \beta_2 P_{-ijmt} + \beta_3 T_{kt} + \beta_4 S_{mt} + \gamma_i + \delta_{kt} + \nu_{it}$$

    - Student i, classroom j, teacher k, school m
    - X = time-varying student characteristics
    - P = time-varying classroom peer characteristics
    - T = time-varying teacher characteristics
    - S = time-varying school characteristics
    - \gamma_i = student fixed effect, \delta_{kt} = teacher-by-year effect, \nu_{it} = error term

- Teacher Classroom Average Effect

$$A_{it} - A_{it-1} = \gamma_i + \delta_{kt} + \nu_{it}$$

Data

- Seven Large Countywide School Districts in Florida
    - Two among the 10 largest in the U.S. (Dade, Broward)
    - Remainder among the 25 largest in the U.S. (Hillsborough, Palm Beach, Orange, Duval and Pinellas)
- Testing in Grades 3-10
    - FCAT-NRT (Stanford Achievement Test) 1999/2000-2004/05 (SAT-9 1999/2000, SAT-10 2004/05)
    - FCAT-SSS (criterion-referenced exam) 2000/2001-2004/05
- Focus on Middle School Math Teachers
    - Teacher effects greater in math
    - More students per teacher in middle school

How Stable are Estimated Teacher Effects?

- Year-to-Year Correlations
- Proportion of Top-Quintile Teachers Remaining in the Top Quintile the Next Year
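The two stability measures named on this slide can be illustrated with a minimal Python sketch. It assumes the two input lists hold estimated effects for the same teachers in the same order in consecutive years, and that quintile 5 is the top quintile; the function names are hypothetical and the tie-breaking rule (by list position) is an assumption, not the authors' procedure.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length lists of estimates."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def quintiles(values):
    """Assign quintile 1 (lowest) to 5 (highest) by rank within the list."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    q = [0] * len(values)
    for rank, i in enumerate(order):
        q[i] = rank * 5 // len(values) + 1
    return q


def top_quintile_persistence(year1, year2):
    """Share of year-1 top-quintile teachers still in the top quintile in year 2."""
    q1, q2 = quintiles(year1), quintiles(year2)
    top = [i for i, q in enumerate(q1) if q == 5]
    return sum(q2[i] == 5 for i in top) / len(top)
```

Calling `pearson(year1, year2)` gives the cross-year correlation and `top_quintile_persistence(year1, year2)` gives the retention share as a fraction between 0 and 1.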

Inter-temporal Correlation in Estimated Teacher Classroom Average Effects
Varying Teachers Across 2-Year Periods

                Correlation Between
County          2000/01 and   2001/02 and   2002/03 and   2003/04 and
                2001/02       2002/03       2003/04       2004/05
Broward         0.48          0.55          0.47          0.35
Dade            0.44          0.42          0.31          0.38
Duval           0.41          0.45          0.34          0.23
Hillsborough    0.35          0.33          0.36          0.23
Orange          0.23          0.18          0.21          0.32
Palm Beach      0.28          0.35          0.36          0.13
Pinellas        0.45          0.38          0.49          0.34

Quintile Ranking of Estimated Teacher Classroom Average Effect in 2001/02 by Quintile Ranking in 2000/01 (in Percent)
Broward County [Cross-Year Correlation = 0.48]

Quintile in     Quintile in 2001/02
2000/01         1       2       3       4       5
1               41.4    20.7    25.9    8.6     3.5
2               21.1    29.8    28.07   14.04   7.02
3               25.0    17.2    20.3    20.3    17.2
4               20.3    15.9    20.3    26.1    17.4
5               4.7     9.4     15.6    15.6    54.7
Total           22.1    18.3    21.8    17.3    20.5
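A quintile-by-quintile table of this kind can be built from two years of estimates. The following is a minimal Python sketch, assuming both lists cover the same teachers in the same order and that each row is expressed as a percentage of the teachers in that year-1 quintile (consistent with the rows above summing to roughly 100). The function names are hypothetical.

```python
def quintile_ranks(values):
    """Quintile 1 (lowest) to 5 (highest), assigned by rank within the list."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0] * len(values)
    for pos, i in enumerate(order):
        ranks[i] = pos * 5 // len(values) + 1
    return ranks


def transition_table(year1, year2):
    """Entry [r][c]: percent of teachers in year-1 quintile r+1 found in year-2 quintile c+1."""
    q1, q2 = quintile_ranks(year1), quintile_ranks(year2)
    counts = [[0] * 5 for _ in range(5)]
    for a, b in zip(q1, q2):
        counts[a - 1][b - 1] += 1
    table = []
    for row in counts:
        total = sum(row)
        # Guard against an empty year-1 quintile in very small samples.
        table.append([100.0 * c / total if total else 0.0 for c in row])
    return table
```

The bottom-right entry, `transition_table(y1, y2)[4][4]`, corresponds to the share of top-quintile teachers who remain in the top quintile.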

Quintile Ranking of Estimated Teacher Classroom Average Effect in 2001/02 by Quintile Ranking in 2000/01 (in Percent)
Orange County [Cross-Year Correlation = 0.23]

Quintile in     Quintile in 2001/02
2000/01         1       2       3       4       5
1               27.6    17.2    27.6    17.2    10.3
2               32.4    11.8    29.4    11.8    14.7
3               13.6    18.2    15.9    25.0    27.3
4               9.4     28.1    15.6    25.0    21.9
5               6.8     18.2    18.2    15.9    40.9
Total           16.9    18.6    20.8    19.1    24.6

Percentage of Teachers Who Remain in Top Quintile from One Year to the Next

County          2000/01 and   2001/02 and   2002/03 and   2003/04 and
                2001/02       2002/03       2003/04       2004/05
Broward         54.7          55.2          48.3          46.2
Dade            45.7          40.4          38.9          39.3
Duval           35.9          41.2          34.3          33.3
Hillsborough    39.7          33.3          37.3          30.6
Orange          40.9          40.4          26.0          39.5
Palm Beach      34.0          40.5          30.6          23.1
Pinellas        39.4          41.0          52.5          39.4

What Factors Affect the Stability of Estimated Teacher Effects?
Changes in the Measurement of Achievement

- Test Scale
    - If scaling changes over time, could decrease stability
    - Norming by grade and year should reduce fluctuations due to scaling changes
    - Could increase stability if distribution changes over time
- Test Content
    - If teacher ability varies across content, changes in test could contribute to instability in measured teacher effectiveness
    - Compare FCAT-SSS and FCAT-NRT
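Norming by grade and year can be sketched as a z-score transformation within each grade-by-year cell. This is an illustrative Python sketch only: the record layout (`grade`, `year`, `score` keys), the function name, and the use of the population standard deviation are assumptions, not the authors' documented procedure.

```python
from collections import defaultdict
from statistics import mean, pstdev


def norm_by_grade_year(records):
    """Return z-scores, in input order, computed within each (grade, year) cell."""
    cells = defaultdict(list)
    for r in records:
        cells[(r["grade"], r["year"])].append(r["score"])
    stats = {key: (mean(v), pstdev(v)) for key, v in cells.items()}
    normed = []
    for r in records:
        m, s = stats[(r["grade"], r["year"])]
        # A cell with no score variation carries no ranking information.
        normed.append((r["score"] - m) / s if s > 0 else 0.0)
    return normed
```

A gain on the normed scale is then the difference between a student's normed score in one year and the normed score in the previous year.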

Inter-Temporal Correlation of Estimated Teacher Classroom Average Effects Under Alternative Test Score Measures

Correlation Between 2000/01 and 2001/02 Estimates

Outcome                       Student Controls       Student Min    Broward  Dade  Duval  Hillsborough  Orange  Palm Beach  Pinellas
Gain on Normed FCAT-NRT       Student Fixed Effects  10 per class   0.48     0.44  0.41   0.35          0.23    0.28        0.45
Gain on FCAT-NRT Scale Score  Student Fixed Effects  10 per class   0.41     0.47  0.47   0.32          0.22    0.38        0.45

Inter-Temporal Correlation of Estimated Teacher Classroom Average Effects Under Alternative Achievement Tests

Correlation Between 2001/02 and 2002/03 Estimates

Outcome                       Student Controls       Student Min    Broward  Dade  Duval  Hillsborough  Orange  Palm Beach  Pinellas
Gain on Normed FCAT-NRT       Student Fixed Effects  10 per class   0.55     0.42  0.45   0.33          0.18    0.35        0.38
Gain on Normed FCAT-SSS       Student Fixed Effects  10 per class   0.54     0.35  0.50   0.25          0.34    0.49        0.56

What Factors Affect the Stability of Estimated Teacher Effects?
Changes in Reference Point (Stratification)

- Individual Teacher Effectiveness Must be Measured Relative to Some Reference Point
    - “Holdout” teacher or Average teacher
    - If reference teacher changes, measured effectiveness changes
- Comparisons Can Only be Made to Other Teachers Who Are Interconnected by Common Students
    - Different strata will have different reference points
    - Within-school vs. between-school measures
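The idea of teachers being interconnected by common students can be illustrated as a connected-components problem. The sketch below is a hypothetical Python illustration using union-find: teachers who share a student, directly or through a chain of shared students, land in the same group. The input format and function name are assumptions; the authors' actual grouping software is not described in the slides.

```python
def connected_teacher_groups(assignments):
    """Group teachers linked by shared students.

    assignments: iterable of (student_id, teacher_id) pairs.
    Returns a list of sets of teacher ids, largest group first.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps trees shallow
            x = parent[x]
        return x

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[ra] = rb

    first_teacher = {}
    for student, teacher in assignments:
        find(teacher)  # register the teacher even if no student links them
        if student in first_teacher:
            union(first_teacher[student], teacher)
        else:
            first_teacher[student] = teacher

    groups = {}
    for teacher in list(parent):
        groups.setdefault(find(teacher), set()).add(teacher)
    return sorted(groups.values(), key=len, reverse=True)
```

Under this reading, the largest set plays the role of a primary group, and a teacher whose students never appear with another teacher forms a singleton set. Estimates are only directly comparable within a set.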

Number of Teacher-Years in Interconnected Groups

Group Type                    Broward   Dade      Duval     Hillsborough  Orange    Palm Beach  Pinellas
No Movers                     69        793       354       430           521       455         36
Primary Group                 5,939     16,034    6,837     9,583         8,928     8,964       5,020
All Others [No. of Groups]    134 [59]  97 [46]   26 [13]   61 [28]       50 [25]   48 [23]     59 [26]

What Factors Affect the Stability of Estimated Teacher Effects?
Omitted Variable Bias

- If Measured Teacher Effects Reflect Omitted Variables, Stability of Measured Teacher Effects Will Depend on Stability of Omitted Variables and Extent of Selection
    - Past educational inputs (persistence)
        - Achievement levels vs. achievement gains
    - Student heterogeneity
        - No controls vs. student covariates vs. student fixed effects
    - Peer heterogeneity
        - No controls vs. controls for peer characteristics

Inter-Temporal Correlation of Estimated Teacher Classroom Average Effects Under Alternative Persistence Assumptions

Correlation Between 2000/01 and 2001/02 Estimates

Outcome                       Student Controls       Student Min    Broward  Dade  Duval  Hillsborough  Orange  Palm Beach  Pinellas
Gain on Normed FCAT-NRT       Student Fixed Effects  10 per class   0.48     0.44  0.41   0.35          0.23    0.28        0.45
Level of Normed FCAT-NRT      Student Fixed Effects  10 per class   0.56     0.55  0.38   0.32          0.50    0.41        0.50

Inter-Temporal Correlation of Estimated Teacher Classroom Average Effects Under Alternative Controls for Time-Invariant Student Heterogeneity

Correlation Between 2000/01 and 2001/02 Estimates

Outcome                       Student Controls       Student Min    Broward  Dade  Duval  Hillsborough  Orange  Palm Beach  Pinellas
Gain on Normed FCAT-NRT       Student Fixed Effects  10 per class   0.48     0.44  0.41   0.35          0.23    0.28        0.45
Gain on Normed FCAT-NRT       Student Covariates     10 per class   0.54     0.42  0.38   0.39          0.30    0.30        0.53

Inter-Temporal Correlation of Estimated Teacher Classroom Average Effects Under Alternative Controls for Time-Varying Factors
(Baseline Model: Gain on Normed FCAT-NRT, Student Fixed Effects, Minimum 10 Students per Class)

Correlation Between 2000/01 and 2001/02 Estimates

Controls for Time-Varying Covariates  Broward  Dade  Duval  Hillsborough  Orange  Palm Beach  Pinellas
None                                  0.48     0.44  0.41   0.35          0.23    0.28        0.45
Student, Peer and Teacher             0.49     0.46  0.36   0.37          0.19    0.27        0.44

What Factors Affect the Stability of Estimated Teacher Effects?
Measurement Error in Student Achievement

- If measurement error is uncorrelated across students within a classroom, then precision should be higher for teachers with larger classes
    - Minimum class size
    - Minimum number of “movers” per teacher
- If measurement error is correlated across students within a classroom, but not across classrooms, precision should increase with the number of classes per teacher
    - Using middle school teachers who generally teach multiple classes per term
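The two cases on this slide can be written out as a short worked formula. The notation is an assumption introduced for illustration: a teacher has c classes of n students each, student-level errors have variance \sigma^2, and errors within a classroom share a common correlation \rho (zero across classrooms). The variance of the teacher's average error is then

$$\operatorname{Var}(\bar{e}_k) = \frac{\sigma^2}{c}\left[\rho + \frac{1-\rho}{n}\right]$$

With \rho = 0 this reduces to \sigma^2/(cn), so precision rises with the total number of students, whether they come from larger classes or more classes. With \rho > 0 the variance approaches \rho\sigma^2/c as n grows, so beyond some point only additional classes improve precision.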

Inter-Temporal Correlation of Estimated Teacher Classroom Average Effects Under Alternative Class Size Restrictions

Correlation Between 2000/01 and 2001/02 Estimates

Outcome                       Student Controls       Student Min    Broward  Dade  Duval  Hillsborough  Orange  Palm Beach  Pinellas
Gain on Normed FCAT-NRT       Student Fixed Effects  10 per class   0.48     0.44  0.41   0.35          0.23    0.28        0.45
Gain on Normed FCAT-NRT       Student Fixed Effects  2 per class    0.38     0.33  0.48   0.29          0.33    0.25        0.14
Gain on Normed FCAT-NRT       Student Fixed Effects  20 per class   0.49     0.52  0.41   0.39          0.31    0.27        0.58

Inter-Temporal Correlation of Estimated Teacher Classroom Average Effects Under Alternative “Student Mover” Restrictions
(Minimum 10 students per class restriction)

Correlation Between 2000/01 and 2001/02 Estimates

Outcome                       Student Controls       Student Min            Broward  Dade  Duval  Hillsborough  Orange  Palm Beach  Pinellas
Gain on Normed FCAT-NRT       Student Fixed Effects  1 mover per teacher    0.48     0.44  0.41   0.35          0.23    0.28        0.45
Gain on Normed FCAT-NRT       Student Fixed Effects  10 movers per teacher  0.48     0.44  0.41   0.35          0.23    0.27        0.45
Gain on Normed FCAT-NRT       Student Fixed Effects  20 movers per teacher  0.46     0.50  0.45   0.36          0.25    0.29        0.50

What Factors Affect the Stability of Estimated Teacher Effects?
True Variation in Teacher Quality Over Time

- Instability in Estimated Effects Could Reflect Changes in True Teacher Quality Over Time
    - Add time-varying teacher covariates to model
    - Regress estimated teacher-by-year effects on teacher fixed effect and time-varying teacher covariates
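The second approach, regressing estimated teacher-by-year effects on a teacher fixed effect and time-varying covariates, can be sketched for the simplest case of a single covariate. In this hypothetical Python illustration, removing each teacher's own mean from both variables is equivalent to including teacher fixed effects; the covariate (for example, years of experience) and the function name are assumptions, and the slides do not specify which covariates the authors used.

```python
def within_teacher_slope(rows):
    """OLS slope of estimated effect on one covariate, with teacher fixed effects.

    rows: list of (teacher_id, covariate, estimated_effect) tuples,
    one per teacher-year.
    """
    sums = {}
    for teacher, x, y in rows:
        sx, sy, n = sums.get(teacher, (0.0, 0.0, 0))
        sums[teacher] = (sx + x, sy + y, n + 1)

    sxy = sxx = 0.0
    for teacher, x, y in rows:
        sx, sy, n = sums[teacher]
        dx, dy = x - sx / n, y - sy / n  # deviations from the teacher's own mean
        sxy += dx * dy
        sxx += dx * dx
    return sxy / sxx
```

The residual variation left after this regression is the part of year-to-year instability that the covariate does not account for.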

Inter-Temporal Correlation of Estimated Teacher Classroom Average Effects Under Alternative Controls for Time-Varying Factors
(Baseline Model: Gain on Normed FCAT-NRT, Student Fixed Effects, Minimum 10 Students per Class)

Correlation Between 2000/01 and 2001/02 Estimates

Controls for Time-Varying Covariates  Broward  Dade  Duval  Hillsborough  Orange  Palm Beach  Pinellas
None                                  0.48     0.44  0.41   0.35          0.23    0.28        0.45
Teacher Only                          0.46     0.45  0.39   0.35          0.21    0.26        0.44

Are There Methods to Enhance the Stability of Estimated Teacher Effects?

- 3-Year Running Averages
    - Reduces noise by averaging sampling errors
    - Could add bias if true performance is changing across years
- Empirical Bayes or “Shrinkage” Estimators
    - Place greater weight on more reliable estimates and push less reliable estimates toward population mean
    - If a significant minimum class size restriction is already in place, “simple” EB adjustments that account for differences in the number of classes per teacher or students per teacher are not likely to yield large improvements in stability
    - Accounting for variability at the individual teacher level requires computation of standard errors on individual teacher effects, which can be problematic
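Both approaches can be illustrated with a minimal Python sketch. It assumes the inputs are already comparable across years and that a valid standard error is available for each estimate, which the slides note is not guaranteed in practice. The function names and the textbook shrinkage weight (signal variance over signal plus sampling variance) are illustrative assumptions, not the authors' estimator.

```python
def running_average(effects_by_year, window=3):
    """Moving average of one teacher's yearly estimates, in chronological order."""
    averages = []
    for end in range(window, len(effects_by_year) + 1):
        chunk = effects_by_year[end - window:end]
        averages.append(sum(chunk) / window)
    return averages


def shrink(estimate, std_error, pop_mean, pop_var):
    """Pull a noisy estimate toward the population mean.

    The weight on the raw estimate is its reliability:
    pop_var / (pop_var + std_error ** 2).
    """
    reliability = pop_var / (pop_var + std_error ** 2)
    return pop_mean + reliability * (estimate - pop_mean)
```

An estimate with a large standard error receives a small weight and moves most of the way to the population mean, while a precisely estimated effect is left nearly unchanged.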

Problems in Computing Shrinkage Estimators

- Default estimates from most software packages are contrasts between every teacher and a holdout teacher
    - Such estimates support within-year comparisons and cross-year correlations
- We cannot shrink these estimates directly and we cannot use the resulting standard errors for shrinkage
- We cannot average these estimates without removing yearly means because changes to the holdout create year-to-year fluctuations
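The last point can be illustrated with a small Python sketch: each year's estimates are contrasts with that year's holdout teacher, so the yearly mean is removed before averaging across years. The data layout (a dict of year to a dict of teacher to estimate) and the function names are hypothetical, and centering on the yearly mean implicitly assumes the average teacher is comparable from year to year.

```python
def center_by_year(estimates):
    """Subtract each year's mean so estimates no longer depend on the holdout teacher.

    estimates: dict mapping year -> dict mapping teacher -> estimated contrast.
    """
    centered = {}
    for year, by_teacher in estimates.items():
        year_mean = sum(by_teacher.values()) / len(by_teacher)
        centered[year] = {t: v - year_mean for t, v in by_teacher.items()}
    return centered


def average_across_years(centered):
    """Average each teacher's centered estimates over the years in which they appear."""
    totals, counts = {}, {}
    for by_teacher in centered.values():
        for teacher, value in by_teacher.items():
            totals[teacher] = totals.get(teacher, 0.0) + value
            counts[teacher] = counts.get(teacher, 0) + 1
    return {t: totals[t] / counts[t] for t in totals}
```

In the toy case where a change of holdout shifts every estimate in the second year by the same constant, centering removes the shift and the averaged effects reflect only the teachers' relative positions.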

Summary

- Moderate Stability in Teacher Effects
    - Cross-year correlations in the range of 0.2-0.5
    - About 40-50 percent of teachers in top quintile remain in the top quintile the following year
    - Stability increases with number of students per teacher and when persistence is assumed to equal 0
    - Nothing else has a consistent appreciable effect on stability
    - Variation across districts appears substantial
- Shrinkage estimators or running averages could improve stability of teacher-by-year effects, but need to get appropriate estimates and standard errors
- Findings suggest caution in using value-added measures for high-stakes personnel decisions

Next Steps

- Determine sources of year-to-year variability
- Improve estimation techniques to obtain comparable estimates across years with accurate measures of within-year standard errors
    - Current software does not support such estimation
- Separate noise from “true” short-term variation
    - Model sources of short-term variation
    - If true year-to-year variation exists, more efficient estimation than three-year averages might be possible via smoothing or filtering