MAKING SENSE OF THE NEW ACCOUNTABILITY INDEX AND STUDENT GROWTH PERCENTILES Dr. Pete Bylsma Director, Assessment/Student Information Services Renton School District (Past President, Washington Educational Research Association - WERA) Dr. Glenn Malone Executive Director of Assessment, Accountability & Student Success Puyallup School District (WERA President-Elect) NCCE Conference March 12, 2014

Post on 22-Oct-2014


Page 1: NCCE Bylsma

MAKING SENSE OF THE NEW ACCOUNTABILITY INDEX AND STUDENT GROWTH PERCENTILES

Dr. Pete Bylsma
Director, Assessment/Student Information Services
Renton School District
(Past President, Washington Educational Research Association - WERA)

Dr. Glenn Malone
Executive Director of Assessment, Accountability & Student Success
Puyallup School District
(WERA President-Elect)

NCCE Conference
March 12, 2014

Page 2: NCCE Bylsma

SESSION OBJECTIVES

• Describe the changes in federal accountability that prompted changes to the old Index and required student growth measures

• Describe the old and new Achievement Index that rates schools (assigns labels, identifies high and low performers, serves as the basis for State Board of Education/OSPI recognition)

• Describe and critique the new student growth percentile (SGP) measure used in the new Index (and potentially used in staff evaluations)

Page 3: NCCE Bylsma

Why Change Accountability System?

AYP under NCLB started in 2002, and the state discarded its existing accountability system

• AYP used 9 student groups, reading/math proficiency and participation, and graduation rate

• 37 “cells” possible for a school, 111 for a district

• Gradually increasing goal; all groups must meet standard by 2014

• “Conjunctive” model: not making it in one area means not making AYP

• Escalating negative sanctions when not making AYP, but only for Title I schools

Page 4: NCCE Bylsma

Problems with AYP System

• System is too complicated, invalid, and unrealistic
– Different “rules” than those used by the state
   • Larger minimum N, margin of error, excludes some students
– Negative label applied when missing one goal; ELLs must take the test despite not knowing English
– Under the conjunctive model, all will eventually “fail”

• Resulted in unintended side effects
– Focus on “bubble kids,” narrowing curriculum, some states lowered standards so all can pass by 2014

Page 5: NCCE Bylsma

New Federal Accountability Rules

AYP waiver approved in 2012; some rules no longer apply
• Do not need to have all students meet standard by 2014
• Do not need to set aside Title I funds
• School choice or supplemental services not required
• Still looks at reading & math percent meeting standard, 95% participation rate, graduation rates

Annual Measurable Objectives (AMOs) are the new measure
• Each subgroup in each school has its own annual targets
• Targets use a 2011 baseline and must cut in half the “proficiency gap” (difference between baseline and 100% meeting standard) by 2017

Page 6: NCCE Bylsma

Example of AMOs

[Figure: line chart of annual AMO targets, 2011 to 2017 (vertical axis 0 to 100), with one line per group: Asian, White, Two or More Races, All, Low Income, Black, Pacific Islander, Hispanic, American Indian, Special Education, Limited English]
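The AMO rule (2011 baseline, proficiency gap cut in half by 2017) can be worked through in a few lines. This is a minimal sketch, not OSPI's calculation: the function name `amo_targets` is made up for illustration, and the equal yearly steps between 2011 and 2017 are an assumption, since the slides only state the baseline year and the 2017 end point.

```python
def amo_targets(baseline_pct, start_year=2011, end_year=2017):
    """Return {year: target % meeting standard} for one subgroup in one school.

    The end-point rule comes from the slides: by end_year the gap between the
    baseline and 100% must be cut in half. Equal annual increments are an
    ASSUMPTION made for this illustration.
    """
    gap = 100.0 - baseline_pct                # the "proficiency gap"
    final_target = baseline_pct + gap / 2.0   # gap cut in half
    steps = end_year - start_year
    annual_step = (final_target - baseline_pct) / steps
    return {
        start_year + i: round(baseline_pct + annual_step * i, 1)
        for i in range(steps + 1)
    }


# A subgroup with 40% meeting standard in 2011 has a 60-point gap,
# so its 2017 target is 40 + 30 = 70.
targets = amo_targets(40.0)
for year, target in targets.items():
    print(year, target)
```

Because each subgroup starts from its own baseline, lower-performing groups get lower targets in every year but must improve faster in absolute terms.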

Page 7: NCCE Bylsma

Revised Federal Accountability “Sanctions”

Instead of “not making AYP,” lowest performing schools are now identified for more support

3 types of “Persistently Low Achieving” schools
• Priority: Bottom 5% in “all students” category
• Focus: Bottom 10% of all subgroups (Asian, black, Hispanic, white, low income, ELL, special education)
• Emerging: Schools close to becoming Priority or Focus (next lowest 5%/10%)

No grade-band distinctions (elementary, middle, high, comprehensive, alternative are all in the same rankings)

Page 8: NCCE Bylsma

Revised Federal Accountability “Sanctions”

System to identify low performing schools is badly flawed
• Applies only to Title I schools, must have N > 30 for three years
• To identify Focus and Emerging schools, all subgroups are combined and ranked together
• In 2012, every Focus and Emerging school (186 total) was identified based on ELL or SpEd subgroups (or both)*

If a school has a large ELL and/or SpEd population and is Title I, the odds of identification are very high

* A few alternative schools were also identified for low graduation rates

Page 9: NCCE Bylsma

Accountability Systems

Educational accountability systems require:
(1) measures of effectiveness
(2) goals to guide improvement efforts
(3) reports that provide useful information to policymakers, educators, and parents
(4) a set of consequences that recognize exemplary performance and support those needing more help

In response to the flawed AYP system, the State Board of Education created an Accountability Index in 2009 to provide a better measure of school effectiveness

Page 10: NCCE Bylsma

Original Accountability Index*

Five Outcomes
Results from 4 assessments (reading, writing, math, science) aggregated together from all grades and all students, extended graduation rate for all students, minimum N = 10

Four Indicators
1. Achievement by non-low income students (% meeting standard/ext. grad rate)
2. Achievement by low income students (eligible for FRL)
3. Achievement vs. Peers (make “apples to apples” comparisons by controlling for percent ELL, low-income, special ed, gifted, mobility)
4. Improvement (change in Learning Index from previous year)

Creates a 5x4 matrix with 20 outcomes, each rated on a scale of 1-7

* Required by Legislature in 2009 (ESHB 2261)

Page 11: NCCE Bylsma

Original Accountability Index Matrix
(multiple measures using available state data)

Outcomes (columns): Reading, Writing, Math, Science, Ext. G.R., Avg.
Indicators (rows): Non-low inc. achievement, Low inc. ach., Ach. vs. peers, Improvement, Average
Bottom-right cell: Index *

* Simple average of all rated cells (compensatory model)

Page 12: NCCE Bylsma

Index Benchmarks and Ratings

Achievement of non-low income and low income students

Reading, Writing, Math, Science
% MET STANDARD    RATING
90 – 100%         7
80 – 89.9%        6
70 – 79.9%        5
60 – 69.9%        4
50 – 59.9%        3
40 – 49.9%        2
< 40%             1

Ext. grad rate
RATE              RATING
> 95              7
90 – 95%          6
85 – 89.9%        5
80 – 84.9%        4
75 – 79.9%        3
70 – 75%          2
< 70%             1

Achievement vs. Peers

Reading, Writing, Math, Science (Learning Index)
DIFFERENCE IN LEARNING INDEX    RATING
> .20             7
.151 to .20       6
.051 to .15       5
-.05 to .05       4
-.051 to -.15     3
-.151 to -.20     2
< -.20            1

Ext. grad rate
DIFFERENCE IN RATE    RATING
> 12              7
6.1 to 12         6
3.1 to 6          5
-3 to 3           4
-3.1 to -6        3
-6.1 to -12       2
< -12             1

Page 13: NCCE Bylsma

Index Benchmarks and Ratings

Improvement

Reading, Writing, Math, Science (Learning Index)
CHANGE IN LEARNING INDEX    RATING
> .15             7
.101 to .15       6
.051 to .10       5
-.05 to .05       4
-.051 to -.10     3
-.101 to -.15     2
< -.15            1

Ext. grad rate
CHANGE IN RATE    RATING
> 6               7
4.1 to 6          6
2.1 to 4          5
-2 to 2           4
-2.1 to -4        3
-4.1 to -6        2
< -6              1

• No Improvement rating given when performing at a very high level (sensitive to “ceiling” effect)

• Index excluded ELL results in the first 3 years of enrollment (ELLs must still take tests, most exit in 3 years)
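The benchmark tables amount to simple lookups, and a short sketch can show how a raw result becomes a 1-7 rating. The function names here are invented for illustration, and the treatment of values that fall between two printed bands (for example .0505) is an assumption; the slides only print the bands themselves. The graduation-rate achievement table is left out because its printed bands overlap at 75%.

```python
import bisect

# Lower bounds of ratings 2 through 7 for "% met standard", from the table.
PCT_MET_CUTS = [40.0, 50.0, 60.0, 70.0, 80.0, 90.0]


def rate_pct_met(pct):
    """Rating 1-7 for a percent-meeting-standard value."""
    return 1 + bisect.bisect_right(PCT_MET_CUTS, pct)


def rate_symmetric(value, inner, middle, outer):
    """Rating 1-7 for the symmetric difference/change tables.

    inner, middle and outer are the upper edges of the 4, 5/3 and 6/2 bands.
    ASSUMPTION: each band includes its outer edge (e.g. exactly .10 rates 5).
    """
    magnitude = abs(value)
    if magnitude <= inner:
        steps = 0
    elif magnitude <= middle:
        steps = 1
    elif magnitude <= outer:
        steps = 2
    else:
        steps = 3
    return 4 + steps if value > 0 else 4 - steps


# Band edges as printed on the slides.
PEERS_LEARNING_INDEX = (0.05, 0.15, 0.20)
PEERS_GRAD_RATE = (3.0, 6.0, 12.0)
IMPROVEMENT_LEARNING_INDEX = (0.05, 0.10, 0.15)
IMPROVEMENT_GRAD_RATE = (2.0, 4.0, 6.0)

print(rate_pct_met(92.5))                                  # 7
print(rate_symmetric(-0.14, *IMPROVEMENT_LEARNING_INDEX))  # 2
print(rate_symmetric(10.3, *PEERS_GRAD_RATE))              # 6
```

The three printed values match ratings shown for the example high school later in the deck.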

Page 14: NCCE Bylsma

Achievement vs. Peers

• Recognizes context affects outcomes

• Makes “apples to apples” comparisons (“statistical neighbors”) to control for 5 student variables (percent ELL, low-income, special education, mobile, gifted)

• Separate analysis for each type of school (e.g., elementary, middle, high, multiple grades)

• Non-regular schools do not receive a “peer” rating

Page 15: NCCE Bylsma

Illustration of Achievement vs. Peers (1 of 5 variables)

[Figure: scatter plot with a linear regression line. Horizontal axis: Pct low income (0 to 100). Vertical axis: Math Learning Index, 2007 (0 to 4). Fitted line: Math Learning Index, 2007 = 3.26 + -0.01 * PctLowInc; R-Square = 0.70. Annotations on the plot: A, B, 7, 4, 1.]
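The peer comparison can be sketched with the one-variable regression line from the figure: a school's expected Learning Index follows from its percent low income, and the rating depends on how far the actual result sits above or below that expectation. This is only an illustration. The Index controlled for five variables, not one, the function names are invented, and the three schools are hypothetical.

```python
def predicted_math_li(pct_low_income):
    """Expected Math Learning Index from the fitted line printed on the slide."""
    return 3.26 + -0.01 * pct_low_income


def peer_difference(actual_li, pct_low_income):
    """Actual minus expected Learning Index (the 'vs. peers' difference)."""
    return actual_li - predicted_math_li(pct_low_income)


def peer_rating(difference):
    """Map a Learning Index difference onto the 1-7 Achievement vs. Peers scale.

    Band edges (.05, .15, .20) are from the benchmark slide; including each
    band's outer edge is an ASSUMPTION.
    """
    magnitude = abs(difference)
    if magnitude <= 0.05:
        steps = 0
    elif magnitude <= 0.15:
        steps = 1
    elif magnitude <= 0.20:
        steps = 2
    else:
        steps = 3
    return 4 + steps if difference > 0 else 4 - steps


# Hypothetical schools as (actual Learning Index, percent low income).
for name, actual, pct in [("X", 3.00, 50), ("Y", 2.50, 50), ("Z", 2.46, 80)]:
    diff = peer_difference(actual, pct)
    print(name, round(diff, 2), peer_rating(diff))
```

Schools X and Y have the same demographics, so the same expectation (2.76), but land at opposite ends of the scale; school Z scores lower than both in absolute terms yet rates a 4 because it performs as expected for its population.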

Page 16: NCCE Bylsma

Five Tier Names and Ranges

Schools/districts assigned to a “tier” based on index score
(but some applied A-F labels to these tiers)

Tier          Index Range
Exemplary     5.50 – 7.00
Very Good     5.00 – 5.49
Good          4.00 – 4.99
Fair          2.50 – 3.99
Struggling    1.00 – 2.49
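Assigning a tier is a threshold lookup on the index score. A minimal sketch, with an invented function name, assuming each tier starts at the lower bound printed in the table:

```python
# (lower bound, tier name), highest tier first, from the table above.
TIERS = [
    (5.50, "Exemplary"),
    (5.00, "Very Good"),
    (4.00, "Good"),
    (2.50, "Fair"),
    (1.00, "Struggling"),
]


def tier_for(index_score):
    """Return the tier name for an index score on the 1.00-7.00 scale."""
    if not 1.0 <= index_score <= 7.0:
        raise ValueError("index score must be between 1.00 and 7.00")
    for lower_bound, name in TIERS:
        if index_score >= lower_bound:
            return name


print(tier_for(4.37))  # Good
```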

Page 17: NCCE Bylsma

Example - XXX High School

Index (Good)

Indicator            Reading  Writing  Math   Science  Grad Rate  Average
Non-low inc. ach.    7        7        3      3        6          5.20
Low-inc. ach.        6        7        2      2        6          4.60
Ach. vs. peers       4        4        4      4        6          4.40
Improvement          5        2        1      4        3          3.00
Average              5.50     5.00     2.50   3.25     6.00       4.37

Indicator            Reading  Writing  Math   Science  Grad Rate
Non-low inc. ach.*   92.5     93.7     58.7   56.5     94.9
Low-inc. ach.*       87.2     91.8     44.8   40.8     94.2
Ach. vs. peers**     +.05     +.01     +.03   +.05     +10.3
Improvement**        +.09     -.14     -.26   -.04     -2.5

* Percent meeting standard for content areas, extended graduation rate
** All students, content areas measured using the Learning Index
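The compensatory model is a plain average of the rated cells, so the example school's averages can be rebuilt from its ratings. One caveat: the slide prints a 3 in the graduation-rate Improvement cell, but the printed Grad Rate average (6.00) and Index (4.37) only come out if that cell is treated as unrated; with it included they would be 5.25 and 4.30. The sketch below leaves the cell out. That is an inference from the arithmetic (possibly the “no Improvement rating at a very high level” rule), not something the slide states.

```python
from statistics import mean

OUTCOMES = ["Reading", "Writing", "Math", "Science", "Grad Rate"]

# Ratings from the example slide. None marks a cell treated as unrated
# (ASSUMPTION: see the note above about the Grad Rate Improvement cell).
RATINGS = {
    "Non-low inc. ach.": [7, 7, 3, 3, 6],
    "Low-inc. ach.":     [6, 7, 2, 2, 6],
    "Ach. vs. peers":    [4, 4, 4, 4, 6],
    "Improvement":       [5, 2, 1, 4, None],
}


def rated(values):
    """Keep only the cells that carry a rating."""
    return [v for v in values if v is not None]


row_averages = {name: mean(rated(row)) for name, row in RATINGS.items()}
column_averages = {
    outcome: mean(rated([row[i] for row in RATINGS.values()]))
    for i, outcome in enumerate(OUTCOMES)
}
# The Index is the simple average of every rated cell, not an average of averages.
index = mean(rated([v for row in RATINGS.values() for v in row]))

print({k: round(v, 2) for k, v in column_averages.items()})
print(round(index, 2))
```

Because the model is compensatory, the school's very low math ratings are offset by strong reading, writing, and graduation results, which is how it ends up in the Good tier.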

Page 18: NCCE Bylsma

2012 Index Results

[Figure: histogram of 2012 index scores, N=2,081; vertical axis 0 to 500]

Index Range    Count
1.00-1.49      41
1.50-1.99      38
2.00-2.49      75
2.50-2.99      119
3.00-3.49      212
3.50-3.99      268
4.00-4.49      400
4.50-4.99      377
5.00-5.49      320
5.50-5.99      162
6.00-6.49      51
6.50-7.00      18

Struggling 7.4%
Fair 28.8%
Good 37.3%
Very Good 15.4%
Exemplary 11.1%
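As a check on the chart, the tier percentages can be recomputed from the bar counts. The pairing of counts with index ranges below is read off the histogram labels in order, which is an assumption about the figure; the totals it produces match the printed N and percentages.

```python
# Bar counts keyed by the lower bound of each 0.5-wide index range.
BIN_COUNTS = {
    1.00: 41, 1.50: 38, 2.00: 75, 2.50: 119, 3.00: 212, 3.50: 268,
    4.00: 400, 4.50: 377, 5.00: 320, 5.50: 162, 6.00: 51, 6.50: 18,
}

# (lower bound, tier name), highest tier first, from the tier slide.
TIER_FLOORS = [
    (5.50, "Exemplary"),
    (5.00, "Very Good"),
    (4.00, "Good"),
    (2.50, "Fair"),
    (1.00, "Struggling"),
]


def tier_of_bin(lower_bound):
    """Tier that a histogram bin falls into, based on its lower bound."""
    for floor, name in TIER_FLOORS:
        if lower_bound >= floor:
            return name


total = sum(BIN_COUNTS.values())
tier_counts = {}
for lower_bound, count in BIN_COUNTS.items():
    tier = tier_of_bin(lower_bound)
    tier_counts[tier] = tier_counts.get(tier, 0) + count

tier_shares = {tier: round(100 * n / total, 1) for tier, n in tier_counts.items()}
print(total)
print(tier_shares)
```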

Page 19: NCCE Bylsma

Washington Achievement Awards

OSPI/SBE used 2-year averages from Accountability Index
• Overall Excellence Award uses the Index score (top 5% by grade band)
• Special Recognition given “on the edges” when 2-year average is > 6.00: language arts, math, science, graduation rate, Improvement

[Figure: Index matrix with outcomes Reading, Writing, Math, Science, G.R., Average as columns and indicators Non-low inc. achievement, Low inc. ach., Ach. vs. peers, Improvement, Average as rows. The two income achievement rows are marked “Compare1”. The Improvement row shows 6.00 in the Average column; the Average row shows 6.00 four times across the outcome columns and “Top 5%1” in the Average column.]

1 Overall Excellence is granted only if the average difference in the income gap and the race/ethnicity gap (using a separate matrix) is < 2.5

Page 20: NCCE Bylsma

New Accountability Index

• Federal NCLB waiver required a change to the current Index: it must include subgroups and a growth measure

• Merges two different accountability systems (state and federal) into one system

• Has no relationship with AMOs!

• New index is much more complicated, has different rules compared to previous index

Page 21: NCCE Bylsma

New Accountability Index

• Included in waiver proposal to U.S. Dept. of Education (waiver still not approved)
• Includes all subgroups (race/ethnicity, programs)
• N > 20 across grade band (not grade)
• New rating scales (1-10) and more “labels”
• No Peer rating
• Growth based on SGPs, not grade band improvement in Levels
• Includes all ELL results (including results of students who exited program)
• Basis for identifying low-performing schools (federal acct.)
• Sanctions also apply to non-Title I schools
• Preliminary analyses show high correlation with school % FRL: -.53 (elementary), -.45 (middle), -.60 (high)

Page 22: NCCE Bylsma
Page 23: NCCE Bylsma
Page 24: NCCE Bylsma
Page 25: NCCE Bylsma
Page 26: NCCE Bylsma

6 Labels, Norm-referenced

• Exemplary: Top 5% of schools using overall index, must have 60% of students proficient in all tested subjects (given recognition)

• Very Good: Next 15% of schools

• Good: Next 30% of schools

• Fair: Next 30% of schools

• Underperforming: Next 5% of schools + 10% with large achievement gaps

• Priority: Lowest 5% of index

Page 27: NCCE Bylsma

Proposed Priority, Focus, Emerging
• Includes all schools, not just Title I
• Uses Index to identify schools rather than stacked rankings

Priority system uses the overall index value
– Bottom 5% are Priority (“Struggling”)
– Next 5% from the bottom are Emerging Priority

Focus system uses index value for each subgroup in each school
– Bottom 10% are Focus
– Next 10% from the bottom are Emerging Focus

Page 28: NCCE Bylsma

Getting Off the Priority / Focus List*

• For 3 consecutive years in Math and Reading:
– Meet or exceed AMOs for all subgroups
– Have at least 95% participation for all subgroups
– Not be in the bottom 5% (or 10% for Focus)
– Decrease % of students in all groups scoring Level 1 or 2 in reading and math. Improvement % must be comparable to top 30% of Title I schools

• OSPI determines sufficient progress has been made

* Unclear how Emerging schools get off list

Page 29: NCCE Bylsma

New Emphasis on Student Growth

• Federal waiver submitted in 2011 requires a student growth measure for the Index and for teacher and principal evaluations

• Index has growth measure but “weak legislation” regarding use of state test results in growth measure puts waiver in jeopardy

• OSPI amended waiver in July 2013 and requires student growth to be a “substantial factor” in 3 of 8 teacher and principal criteria – brinksmanship occurring right now

• There are many ways to measure growth, but the State Board only considered the Student Growth Percentile (SGP)

Page 30: NCCE Bylsma

Achievement vs Growth What’s the Difference?

Achievement

Growth

Page 31: NCCE Bylsma

Measuring Student Growth

• Growth, in its simplest form, is a comparison of the assessment results of a student or group of students between two points in time where a positive difference would imply growth.

Page 32: NCCE Bylsma

Student Growth Percentiles

• Problem: Current state assessment system was not designed to measure student growth
– Only selected grades and subjects are tested
– Difficulty in passing the test varies from one year to the next (high school reading and writing HSPE is easy to pass because the bar was lowered due to the graduation requirement)

• State’s Solution: Use a norm-referenced system that ranks the rate of student growth

Page 33: NCCE Bylsma

Student Growth Percentiles

• SGPs compare the growth rates of students who were at the same scale score level the previous year (their “academic peers”)
Example: A student earning an SGP of 80 performed as well as or better than 80 percent of the students who had the same score the previous year

• Does not compare the growth rate of all students to each other or compare the achievement to all students (the usual way to give percentiles)

Page 34: NCCE Bylsma

Student Growth Percentiles

• SGP trajectory predicts where students will perform in the future, based on their previous growth rate and students who were at the same scale score level the previous year

• OSPI groups students into three categories
High Growth       Top 1/3       67th to 99th percentile
Typical Growth    Middle 1/3    34th to 66th percentile
Low Growth        Bottom 1/3    1st to 33rd percentile

• The median SGP for a class, grade, school or district is the “score” (school median SGP is used in the new Index)
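The SGP idea can be sketched with a toy calculation: group students by their prior-year score, rank each student's current score within that group, then label the result and take a median. This mirrors only the definition given on these slides. The state's operational calculation is more involved, and the function names, the exact-match grouping, the 1-99 clamp, and the five students below are all assumptions made for illustration.

```python
from collections import defaultdict
from statistics import median


def simple_sgp(students):
    """students: list of (student_id, prior_score, current_score).

    Returns {student_id: percentile}, where the percentile is the share of
    "academic peers" (same prior score) scoring at or below the student.
    """
    peers = defaultdict(list)
    for _, prior, current in students:
        peers[prior].append(current)

    result = {}
    for student_id, prior, current in students:
        group = peers[prior]
        at_or_below = sum(1 for score in group if score <= current)
        percentile = round(100 * at_or_below / len(group))
        result[student_id] = min(99, max(1, percentile))  # keep within 1-99
    return result


def growth_category(sgp):
    """OSPI's three growth categories, using the cut points from the slide."""
    if sgp >= 67:
        return "High Growth"
    if sgp >= 34:
        return "Typical Growth"
    return "Low Growth"


# Five hypothetical students who all scored 400 last year.
students = [
    ("a", 400, 380), ("b", 400, 395), ("c", 400, 400),
    ("d", 400, 410), ("e", 400, 430),
]
sgps = simple_sgp(students)
for student_id, sgp in sgps.items():
    print(student_id, sgp, growth_category(sgp))
print("median SGP:", median(sgps.values()))
```

In this toy group, student b's scale score dropped from 400 to 395 and student c's did not change, yet both are labeled Typical Growth, because the ranking is relative to peers rather than to any fixed amount of gain.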

Page 35: NCCE Bylsma

SGP Student Data

Student Growth Percentile (SGP) results are available to the public on the OSPI State Longitudinal Data System (SLDS) website 1

• From OSPI homepage, select “K-12 Data & Reports” on right side
• Select “Static Data Files”
• Select “Assessment” menu item, scroll down to find the SGP files and reports

1 http://data.k12.wa.us/PublicDWP/Web/WashingtonWeb/Home.aspx

Page 36: NCCE Bylsma

SGP School Data

Available to the public on the OSPI State Longitudinal Data System (SLDS) website http://data.k12.wa.us/PublicDWP/Web/WashingtonWeb/Home.aspx

• From OSPI’s homepage, click on the “K-12 Data & Reports” button on the right-hand side, then click on “Static Data Files”

• Under the “Assessment” menu item, you can scroll down to find the SGP files and reports

Page 37: NCCE Bylsma
Page 38: NCCE Bylsma
Page 39: NCCE Bylsma

Takes you to a list of district links

Page 40: NCCE Bylsma

SGPs on OSPI’s Web site

Three types of SGP files available to public

• Bubble chart with all schools, with district’s schools identified (hover over bubble for results)

• Individual school results by subgroup (compared to district and state for three years)

• Excel file with all results for all schools and district (Renton’s file has > 5000 rows and 20 columns)

Page 41: NCCE Bylsma
Page 42: NCCE Bylsma
Page 43: NCCE Bylsma

Problems with SGP

1. Results can be misleading
Percentile rank is not based on all students, so the 50th percentile is not the middle of the entire distribution, just those who had the same scale score the previous year

2. SGPs do not provide a measure of adequate (enough) growth or a year’s worth of growth
A student can be at the 50th percentile and not make a year’s worth of growth or enough growth to meet expectations upon graduation; another student can be at the 50th percentile and make more than a year’s worth of growth

Page 44: NCCE Bylsma

Student Report: No growth is “typical”

Page 45: NCCE Bylsma

Student Report: Decline is “high growth”

Page 46: NCCE Bylsma

Problems with SGP

3. Results may not reflect an accurate measure of student growth or educator effectiveness
• SGPs are “highly unstable” and “problematic” for students with very high and low scores because there are relatively few students with those scores to obtain stable rankings1
• No standard errors reported
• Does not control for differences in the student population

4. Results are not available in a timely manner

5. SGPs are new and harder to understand than current metrics

1 Castellano, K. and Ho, A. (2013). A Practitioner’s Guide to Growth Models. Washington, DC: Council of Chief State School Officers

Page 47: NCCE Bylsma

Alternative Measure of Student Growth

• Criterion-referenced approach

• Students are compared to their own growth, not the growth rate of others

• Encourages cooperation because score doesn’t depend on how other students perform

• Can be computed quickly and easily – doesn’t require a minimum number of students and doesn’t depend on how other students perform

• Uses familiar data and concepts, makes it easy to understand

Page 48: NCCE Bylsma

Measuring Achievement and Growth

[Figure: quadrant chart with quadrants labeled Leading, Slipping, Gaining, Lagging. Horizontal axis: Change in Scale Score from Grade 4 (2012), -100 to 100. Vertical axis: 2013 Grade 5 Math Scale Score, divided into four levels:
Above 439    Level 4 (Exceeds standard)
400-439      Level 3 (Meets standard)
375-399      Level 2 (Below standard)
Below 375    Level 1 (Far below standard)]

Page 49: NCCE Bylsma

2013 Achievement and Growth from 2012
(Math, Grade 4 and Change from Grade 3)

[Figure: scatter plot with quadrants labeled Leading, Slipping, Gaining, Lagging. Horizontal axis: Change in Scale Score from Grade 3 (2012), -100 to 100. Vertical axis: 2013 Grade 4 Math Scale Score, 300 to 500, with level boundaries at 375, 400, and 440:
Above 439    Level 4 (Exceeds standard)
400-439      Level 3 (Meets standard)
375-399      Level 2 (Below standard)
Below 375    Level 1 (Far below standard)
Quadrant shares printed on the chart: 15.6% (N=142), 50.4% (N=460), 5.9% (N=54), 28.1% (N=257)]

Average change in scale score: +6.5 (413.1 to 419.6), N = 913, R2 = .58
56.3% of the students made at least one year gain (change in scale score > 0)
Each dot represents a student who was enrolled in the district in both 2012 and 2013 (scores below 300 were marked as 300, scores above 500 were marked as 500)
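The criterion-referenced alternative needs only each student's own two scores. The sketch below places a student in a quadrant and computes the share with a gain. The 400 cut for meeting standard and the “change in scale score > 0” definition of a gain are from the slides; the way the four quadrant names are mapped to conditions is a reading of the labels, not something the slides spell out, and the students are hypothetical.

```python
MEETS_STANDARD = 400  # Level 3 starts at a scale score of 400


def quadrant(current_score, prior_score):
    """Classify one student by current achievement and change from last year.

    ASSUMED mapping: at/above standard with a gain = Leading, at/above
    standard without a gain = Slipping, below standard with a gain = Gaining,
    below standard without a gain = Lagging.
    """
    gained = (current_score - prior_score) > 0
    meets = current_score >= MEETS_STANDARD
    if meets:
        return "Leading" if gained else "Slipping"
    return "Gaining" if gained else "Lagging"


def share_with_gain(pairs):
    """Percent of (current, prior) pairs with a change in scale score > 0."""
    gains = sum(1 for current, prior in pairs if current - prior > 0)
    return 100 * gains / len(pairs)


# Hypothetical students as (2013 score, 2012 score).
pairs = [(430, 410), (405, 420), (390, 370), (360, 380)]
for current, prior in pairs:
    print(current, prior, quadrant(current, prior))
print(share_with_gain(pairs))
```

Unlike an SGP, each result here depends only on that student's scores, so it can be computed for a single student and does not shift when other students do better or worse.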

Page 50: NCCE Bylsma

Change in Math Scale Scores, 2011 to 2012

[Figure: two panels, Non-Low Income and Low Income (FRL), annotated “43% made 1+ years gain” and “60% made 1+ years gain”]

Page 51: NCCE Bylsma

Limitations to Alternative Measure

• Proficiency cut scores vary slightly from grade to grade
It’s harder to meet standard in some grades compared to others (like having an easy teacher one year and a hard teacher the next)

• No “vertical scale” to measure absolute growth
Smarter Balanced assessments will have a vertical scale and cut scores that align with college/career readiness

For more details, see WERA Educational Journal, Winter 2014 article, “Using SGPs to Measure Student Growth: Context, Characteristics, and Cautions” www.wera-web.org

Page 52: NCCE Bylsma

Questions, comments, reactions