National Superintendent's Dialogue

Current issues in assessment and accountability John Cronin, Ph.D. – Senior Director of Education Rese Northwest Evaluation Associa

Upload: nwea

Post on 30-Nov-2014

Category:

Education


DESCRIPTION

Discussion about trends in assessment and accountability for National Superintendent's Dialogue

TRANSCRIPT

  • 1. Current issues in assessment and accountability
John Cronin, Ph.D., Senior Director of Education Research, Northwest Evaluation Association

2. Are students over-tested?

3. Most teachers (59%) and the vast majority of district administrators (89%) say the ideal focus of assessments should be frequently tracking student performance and providing daily or weekly feedback in the classroom.
Make Assessment Matter: Students and Educators Want Tests that Support Learning (2014). Portland, OR. NWEA and Grunwald Associates LLC.

4. Percent of students who say they do not receive their state accountability test results: 37%
Make Assessment Matter: Students and Educators Want Tests that Support Learning (2014). Portland, OR. NWEA and Grunwald Associates LLC.

5. A majority of teachers believe that students are over-tested

6. Teachers: How much time do you feel is spent preparing for and taking assessments?
    • Too much: 59%
    • Just the right amount: 28%
    • Too little: 13%
Make Assessment Matter: Students and Educators Want Tests that Support Learning (2014). Portland, OR. NWEA and Grunwald Associates LLC.

7. Number of annual instructional hours required in California schools for students in grades 6-8: 900

8. Ways in which educators use assessment data
[Bar chart comparing administrators and teachers, on a 0% to 100% scale, for each of the following uses]
    • Inform instruction
    • Evaluate program effectiveness
    • Evaluate teacher/principal performance
    • Measure growth in student learning
    • Evaluate school performance
    • Measure student achievement
Make Assessment Matter: Students and Educators Want Tests that Support Learning (2014). Portland, OR. NWEA and Grunwald Associates LLC.

9. Number of hours required to administer the Smarter Balanced Assessment summative assessment in grades 6-8: 7.5

10. Estimated Time Devoted to Testing in Third Grade: 12 Urban School Systems
[Bar chart: classroom hours (0 to 30) of state-mandated and district testing in Cleveland, Houston, Atlanta, Indianapolis, Denver, Los Angeles, Boston, Washington DC, Anchorage, Baltimore, Shelby Cty, TN, and Chicago]
Source: Teoh, M., Coggins, C., Guan, C. and Hiller, H. (2014, Winter). The Student and the Stopwatch. How much time do American students spend on testing?
Teach Plus.

11. What drives the perception of over-testing is the amount of time invested in test preparation or drilling

12. Estimated time devoted to test preparation in one mid-western school system
[Chart: classroom hours (0 to 100) by grade in school (K to 12); series: State Mandated, Benchmark, Local]
Nelson, H. (2013). Testing More Teaching Less. What America's Obsession with Student Testing Costs in Terms of Money and Time. Washington, D.C. American Federation of Teachers

13. And the constraints imposed by testing students on a limited number of computers or tablets.

14. Days required to complete SBAC summative testing in a school with 200 students in grades 3-5.

    Number of Available      Number of Instructional Days
    Computers in Labs        Needed to Complete Testing
    20                       20
    25                       16
    30                       13
    35                       11.3
    40                       10

Estimates based on results from SBAC technology needs calculator.

15. Days required to complete SBAC summative testing in a school with 800 students in grades 6-8.

    Number of Available      Number of
    Computers in Labs        Instructional Days
    50                       32
    60                       26.67
    70                       22.86
    80                       20
    90                       17.78
    100                      16
    110                      14.55
    120                      13.33

Estimates based on results from SBAC technology needs calculator.

16. The evolving evaluation landscape: value-added

17. A simple framework for teacher evaluation
Effective teaching and professional job performance
    • Evidence of professional practice: The evaluation of teaching by classroom observation and use of artifacts
    • Evidence of professional responsibilities: The evaluation of the teacher's effectiveness in making progress toward their goals and fulfilling the responsibilities of a professional educator.
    • Evidence of student learning: The evaluation of a teacher's contribution to student learning and growth

18. What teacher effectiveness infers
    • Evidence of learning: A claim that the improvement in learning (or lack of it) reflected on one or more tests is caused by the teacher.
    • Evidence of good practice: That the observer's ratings or conclusions are reliable and associated with behaviors that cause improved learning in the classroom.

19. Three ways tests are used in evaluation and their claims: Value-Added
    • Produces rankings of teachers relative to each other based on assessment results.
    • Introduces controls to account for factors that may influence growth that are outside the teacher's influence.
    • Advances a claim of causation: that the teacher's ranking reflects learning the teacher caused.
    • Can be applied to as few as 20% of the teachers in a school system (Whitehurst, 2013).
Whitehurst, G. J. (2013). Teacher value-added: Do we want a ten percent solution? The Brown Center Chalkboard, April 24. Washington, DC: Brookings Institution. Retrieved October 2, 2014, from www.brookings.edu/blogs/brown-center-chalkboard/posts/2013/04/24-merit-pay-whitehurst

20. Student Growth Percentiles
    • Produce rankings of teachers relative to each other based on assessment results.
    • Use prior results to predict student growth.
    • Do not introduce controls to account for factors that may influence growth that are outside the teacher's influence.
    • Do not advance a claim of causation: results describe the growth of the classroom but are not intended to identify the cause.

21. Student Learning Objectives
    • Are a contract negotiated between the principal and teacher around student results.
    • Do not produce rankings that compare teacher results across settings.
    • Do not introduce controls to account for factors that may influence growth that are outside the teacher's influence.
    • Do not advance a claim of causation: teacher competence is demonstrated by fulfillment of the contract.
    • Can be applied to as few as 20% of the teachers in a school system

22. Three ways tests are used in evaluation and their issues: Value-Added
    • High measurement error
    • Reliability of the results
    • Insufficient controls
    • Measurement quality of assessments used
    • Fairness of application

23. Three ways tests are used in evaluation and their issues: Student Growth Percentiles
    • Reliability of the results
    • Use of past performance as the only control
    • Measurement quality of assessments used
    • Fairness of application
    • The developers claim that the percentiles are descriptive and not evidence of causation

24. Three ways tests are used in evaluation and their issues: Student Learning Objectives
    • Do not provide evidence of teacher effectiveness.
    • Teachers using SLOs may be evaluated against less rigorous criteria than teachers evaluated by value-added or student growth percentiles.
    • Goals are not consistent in difficulty.
    • Goals are not consistent across teachers.

25. The evolving evaluation landscape: principal observation

26. What NWEA supports
    • The evaluation process should focus on helping teachers improve.
    • The principal or designated evaluator should control the evaluation.
    • Tests should not be the deciding factor in an evaluation.
    • Multiple measures should be used.

27. Distinguishing teacher effectiveness from teacher evaluation
    • Teacher effectiveness: The judgment of a teacher's ability to positively impact learning in the classroom.
    • Teacher evaluation: The judgment of a teacher's overall performance, including:
        • Teacher effectiveness
        • Common standards of job performance
        • Participation in the school community
        • Adherence to professional standards

28. Purposes of summative evaluation
    • Make an accurate and defensible judgment of an educator's job performance.
    • Provide ratings of performance that provide meaningful differentiation across educators.
Goals of evaluation
    • Help educators focus on their students and their practice
    • Retain your top educators
    • Dismiss ineffective educators

29. Learn from others' mistakes.

30. The greatest tragedy of this century in education, so far, was the number of young, talented teachers who lost their positions in the last recession.

31. Employment of Elementary Teachers 2007-2012

    Year    Number of teachers
    2007    1,538,000
    2008    1,544,270
    2009    1,544,300
    2010    1,485,600
    2011    1,415,000
    2012    1,360,380

The elementary school teacher workforce shrank by 178,000 teachers (11%) between May 2007 and May 2012.
Source: (2012, May) Bureau of Labor Statistics Occupational Employment Statistics. Numbers exclude special education and kindergarten teachers.

32. The impact of seniority-based layoffs on school quality
In a simulation study of implementation of a layoff of 5% of teachers using New York City data, reliance on seniority-based layoffs:
    • Resulted in 25% more teachers laid off.
    • Teachers laid off were .31 standard deviations more effective (using a value-added criterion) than those lost using an effectiveness criterion.
    • 84% of teachers with unsatisfactory ratings were retained.
Source: Boyd, L., Lankford, H., Loeb, S., and Wycoff, J. (2011). Center for Education Policy. Stanford University.

33. Ultimately the principal should decide
    • Evaluation inherently involves judgment, which is not a bad thing.
    • Evidence should inform and not direct their judgment.
    • The implemented system should differentiate performance.
    • Courts respect the judgment of school administrators relative to personnel decisions.

34. If evaluators do not differentiate their ratings, then all differentiation comes from the test.

35. Why differentiating ratings is important
[Chart comparing principal rating and value-added rating; axes run from 60 to 100 and from 0 to 100]

36. The (Race to the Top teacher evaluation) changes, already under way in some cities and states, are intended to provide meaningful feedback and, critically, to weed out weak performers. And here are some of the early results: In Florida, 97 percent of teachers were deemed effective or highly effective in the most recent evaluations. In Tennessee, 98 percent of teachers were judged to be at expectations. In Michigan, 98 percent of teachers were rated effective or better.
Source: New York Times (2013, March 30).
Curious Grade for Teachers: Nearly all Pass. Retrieved from: http://www.nytimes.com/2013/03/31/education/curious-grade-for-teachers-nearly-all-pass.html?pagewanted=all&_r=0

37. Results of Georgia Teacher Evaluation Pilot: Evaluator Rating
    • Ineffective: 1%
    • Minimally Effective: 2%
    • Effective: 75%
    • Highly Effective: 23%

38. Teacher Evaluation Ratings in Eleven Florida Schools - 2013
[The same ratings table as slide 39, shown with the VA Score, Florida Ranking, and Ranking columns left blank]

39. Teacher Evaluation Ratings in Eleven Florida Schools - 2013

    Florida    Highly                 Needs                                       VA       Florida
    District   Effective  Effective   Improvement  Developing  Unsatisfactory    Score    Ranking   Ranking
    1          44.4%      55.6%       0.0%         0.0%        0.0%               0.39     109       1
    2          25.0%      75.0%       0.0%         0.0%        0.0%               0.37     121       2
    3          90.9%      9.1%        0.0%         0.0%        0.0%              -0.14     2802      9
    4          60.7%      39.3%       0.0%         0.0%        0.0%              -0.14     2797      8
    5          81.2%      18.8%       0.0%         0.0%        0.0%              -0.16     2831      10
    6          37.3%      54.2%       1.7%         0.0%        6.8%               0.12     880       5
    7          81.3%      18.8%       0.0%         0.0%        0.0%               0.22     402       3
    8          41.7%      55.6%       1.4%         1.4%        0.0%              -0.34     3274      11
    9          52.2%      47.8%       0.0%         0.0%        0.0%               0.16     664       4
    10         27.0%      66.2%       1.4%         0.0%        5.4%               0        1764      6
    11         7.1%       72.6%       9.5%         10.7%       0.0%              -0.08     2445      7

40. Teacher observation as a part of teacher evaluation
Systematic observation of teacher performance is a central part of every state's teacher evaluation plan.

41. New York Teacher Ratings
[Two charts: the number of teachers rated Ineffective, Developing, Effective, and Highly Effective under the state test value-added measure and under the principal rating]

42. How should tests, observations, and surveys be weighted?

43. The importance of non-cognitive factors in teacher evaluation

44. Non-cognitive factors
In education, value-added measurement has focused policy-makers on the teacher's contribution to academic success, as reflected in test scores.
Jackson (2012) argues that teachers may have more impact on non-cognitive factors that are essential to student success, like attendance, grades, and suspensions.
These are not the only measures that matter, however.

45. Non-cognitive factors
Employing value-added methodologies, Jackson found that teachers had a substantive effect on non-cognitive outcomes that was independent of their effect on test scores:
    • Lowered the average student absenteeism by 7.4 days.
    • Improved the probability that students would enroll in the next grade by 5 percentage points.
    • Reduced the likelihood of suspension by 2.8%
    • Improved the average GPA by .09 (Algebra) or .05 (English)
Source: Jackson, K. (2013). Non-Cognitive Ability, Test Scores and Teacher Quality: Evidence from 9th Grade Teachers in North Carolina. Northwestern University and NBER

46. Solving one problem can sometimes create another.

47. Suggested reading
Baker, B., Oluwole, J., Green, P. (2013). The legal consequences of mandating high stakes decisions based on low quality information: Teacher evaluation in the Race to the Top Era. Education Policy Analysis Archives. Vol 21. No 5.
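
A note on the arithmetic behind slides 14 and 15: the day counts reported there are consistent with a simple inverse relationship, days = 2 × students / computers (for example, 2 × 800 / 50 = 32). The factor of 2 is inferred from the slide values and is an assumption here, not a documented parameter of the SBAC technology needs calculator. The function and parameter names below are hypothetical. The slide 15 values match this relationship to two decimals; the slide 14 values match to within rounding, except the 35-computer estimate (11.3 on the slide against roughly 11.4 from the formula). A minimal Python sketch under that assumption:

```python
def days_to_complete(students: int, computers: int,
                     computer_days_per_student: float = 2.0) -> float:
    """Estimate instructional days needed to test every student.

    computer_days_per_student=2.0 is an ASSUMPTION inferred from the
    figures on slides 14 and 15, not a value taken from the SBAC
    technology needs calculator.
    """
    if computers <= 0:
        raise ValueError("computers must be positive")
    return students * computer_days_per_student / computers


if __name__ == "__main__":
    # Reproduce the slide 15 scenario: 800 students, 50 to 120 computers.
    for computers in range(50, 130, 10):
        print(computers, round(days_to_complete(800, computers), 2))
```

Under this assumption, doubling the number of available computers halves the testing window, which is the constraint slide 13 points to.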