Jamal Abedi National Center for Research on Evaluation, Standards, and Student Testing UCLA Graduate School of Education & Information Studies November 18, 2004 Psychometric Issues in the ELL Assessment and Special Education Eligibility English Language Learners Struggling to Learn: Emergent Research on Linguistic Differences and Learning Disabilities


Page 1

Jamal Abedi

National Center for Research on Evaluation, Standards, and Student Testing

UCLA Graduate School of Education & Information Studies

November 18, 2004

Psychometric Issues in the ELL Assessment and Special Education Eligibility

English Language Learners Struggling to Learn: Emergent Research on Linguistic Differences and Learning Disabilities

Page 2

Why Should English Language Learners be Assessed?

Goals 2000

Titles I and VII of the Improving America’s Schools Act of 1994 (IASA)

No Child Left Behind Act

Page 3

Should Schools Test English Language Learners?

Yes

General Problems

English language learners (ELLs) can be placed at a disadvantage because:

• Assessment outcomes may not be valid because their low level of English proficiency interferes with content knowledge performance

• Test results affect decisions regarding promotion or graduation

• They may be inappropriately placed into special education programs where they receive inappropriate instruction

• ELL students may not have received the curriculum that the test assumes

Page 4

Should Schools Test English Language Learners?

Yes

Problems in Large-Scale Assessment:

Standardized assessment

• Assessment tools in large-scale assessments are usually constructed based on norms that exclude ELL populations

• Research shows major differences between the performance of ELL and non-ELL students on the results of standardized large-scale assessments

• The tests may be biased in favor of non-ELL populations

Performance/alternative assessment

• Such assessments require more language production; thus students with lower language capabilities are at a greater disadvantage

• Scorers may not be familiar with rating ELL performance

Page 5

Should Schools Test English Language Learners?

No

Problems

• Due to the powerful impact of assessment on instruction, the quality of instruction for ELL students and students with disabilities (SWD) may be affected

• If excluded, they will be dropped from the accountability picture

• Institutions will not be held responsible for their performance in school

• They will not be included in state or federal policy decisions

• Their academic progress, skills, and needs may not be appropriately assessed

Page 6

States with the Highest Proportion of ELL Students

Percentage of Total Student Population:

California 27.0

New Mexico 19.0

Arizona 15.4

Alaska 15.0

Texas 14.0

Nevada 11.8

Florida 10.7

Page 7

Problems in AYP Reporting: Focus on LEP Students

1. Problems in classification/reclassification of LEP students (moving target subgroup)

2. Measurement quality

3. Low baseline

4. Instability of the LEP subgroup

5. Sparse LEP population

6. LEP cutoff points (Conjunctive vs. Compensatory model)

Page 8

Site 2 Stanford 9 Sub-scale Reliabilities (1998), Grade 9 Alphas

                      |-------------- Non-LEP Students --------------|
Sub-scale (Items)     Hi SES    Low SES   English Only  FEP       RFEP      LEP

Reading, N=           205,092   35,855    181,202       37,876    21,869    52,720
-Vocabulary (30)      .828      .781      .835          .814      .759      .666
-Reading Comp (54)    .912      .893      .916          .903      .877      .833
Average Reliability   .870      .837      .876          .859      .818      .750

Math, N=              207,155   36,588    183,262       38,329    22,152    54,815
-Total (48)           .899      .853      .898          .898      .876      .802

Language, N=          204,571   35,866    180,743       37,862    21,852    52,863
-Mechanics (24)       .801      .759      .803          .802      .755      .686
-Expression (24)      .818      .779      .812          .804      .757      .680
Average Reliability   .810      .769      .813          .803      .756      .683

Science, N=           163,960   28,377    144,821       29,946    17,570    40,255
-Total (40)           .800      .723      .805          .778      .716      .597

Social Science, N=    204,965   36,132    181,078       38,052    21,967    53,925
-Total (40)           .803      .702      .805          .784      .722      .530
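The alphas in this table are internal-consistency (Cronbach's alpha) coefficients. As a minimal sketch of how such a coefficient is computed from item-level scores, the following uses a small hypothetical 0/1 score matrix; the function name `cronbach_alpha` and the data are illustrative assumptions, not the presenter's code or the Stanford 9 data.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a score matrix (rows = examinees, columns = items)."""
    k = len(scores[0])  # number of items
    # Sample variance of each item across examinees
    item_vars = [variance([row[j] for row in scores]) for j in range(k)]
    # Sample variance of the total (summed) score
    total_var = variance([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Hypothetical item scores: six examinees, four dichotomous items
demo_scores = [
    [1, 1, 1, 0],
    [1, 0, 1, 1],
    [0, 0, 1, 0],
    [1, 1, 1, 1],
    [0, 0, 0, 0],
    [1, 1, 0, 1],
]

alpha = cronbach_alpha(demo_scores)
print(f"alpha = {alpha:.3f}")
```

Alpha falls as the summed item variances grow relative to the total-score variance, which is one way to read the lower LEP coefficients in the table.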

Page 9

Classical Test Theory: Reliability

$\sigma^2_X = \sigma^2_T + \sigma^2_E$

X: Observed Score

T: True Score

E: Error Score

$\rho_{XX'} = \sigma^2_T / \sigma^2_X$

$\rho_{XX'} = 1 - \sigma^2_E / \sigma^2_X$

Textbook examples of possible sources that contribute to the measurement error:

Rater

Occasion

Item

Test Form

Page 10

Classical Test Theory: Reliability

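$\sigma^2_X = \sigma^2_T + \sigma^2_E$

$\sigma^2_X = \sigma^2_T + \sigma^2_E + \sigma^2_S + \sigma_{ES}$

$\rho_{XX'} = 1 - \left( (\sigma^2_E + \sigma^2_S + \sigma_{ES}) / \sigma^2_X \right)$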
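A small numeric sketch may help show how the extra error terms lower the reliability coefficient. It follows the slide's decomposition term by term; the slide does not define S, so it is assumed here to be an additional source of error with covariance term ES, and all variance values are hypothetical.

```python
def reliability(var_true, var_error, var_s=0.0, cov_es=0.0):
    """Reliability as 1 - (error-related variance / observed variance).

    Observed variance is taken, as on the slide, to be the sum
    var_true + var_error + var_s + cov_es.
    """
    error_part = var_error + var_s + cov_es
    var_observed = var_true + error_part
    return 1 - error_part / var_observed


# Hypothetical variance components (not taken from the presentation)
basic = reliability(var_true=80.0, var_error=20.0)
with_extra_source = reliability(var_true=80.0, var_error=20.0, var_s=15.0, cov_es=5.0)

print(f"random error only: {basic:.3f}")
print(f"with extra source: {with_extra_source:.3f}")
```

With the true-score variance held fixed, any additional error source enlarges the numerator and the observed variance together, so the coefficient can only go down.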

Page 11

Generalizability Theory: Partitioning Error Variance into Its Components

$\sigma^2(X_{pro}) = \sigma^2_p + \sigma^2_r + \sigma^2_o + \sigma^2_{pr} + \sigma^2_{po} + \sigma^2_{ro} + \sigma^2_{pro,e}$

p: Person

r: Rater

o: Occasion

Are there any sources of measurement error that may specifically influence ELL performance?
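To illustrate how the variance components above are typically used, here is a minimal sketch of the standard relative generalizability coefficient for a fully crossed person × rater × occasion design. The formula is the textbook one rather than something stated on the slide, and the component values and the name `g_coefficient` are hypothetical.

```python
def g_coefficient(var_p, var_pr, var_po, var_pro_e, n_raters, n_occasions):
    """Relative G coefficient for a crossed p x r x o design.

    Only components that interact with persons count as relative error;
    each is divided by the number of conditions it is averaged over.
    """
    relative_error = (
        var_pr / n_raters
        + var_po / n_occasions
        + var_pro_e / (n_raters * n_occasions)
    )
    return var_p / (var_p + relative_error)


# Hypothetical variance components (not from the presentation)
components = dict(var_p=0.50, var_pr=0.10, var_po=0.10, var_pro_e=0.20)

one_by_one = g_coefficient(**components, n_raters=1, n_occasions=1)
two_by_two = g_coefficient(**components, n_raters=2, n_occasions=2)

print(f"1 rater, 1 occasion:   {one_by_one:.3f}")
print(f"2 raters, 2 occasions: {two_by_two:.3f}")
```

An ELL-specific source of error, such as linguistic complexity, would enter this framework as an additional facet with its own variance components.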

Page 12

Grade 11 Stanford 9 Reading and Science Structural Modeling Results (DF=24), Site 3

                      All Cases    Even Cases   Odd Cases    Non-LEP      LEP
                      (N=7,176)    (N=3,588)    (N=3,588)    (N=6,932)    (N=244)

Goodness of Fit
  Chi Square          1786         943          870          1675         81
  NFI                 .931         .926         .934         .932         .877
  NNFI                .898         .891         .904         .900         .862
  CFI                 .932         .928         .936         .933         .908

Factor Loadings

Reading Variables
  Composite 1         .733         .720         .745         .723         .761
  Composite 2         .735         .730         .741         .727         .713
  Composite 3         .784         .779         .789         .778         .782
  Composite 4         .817         .722         .712         .716         .730
  Composite 5         .633         .622         .644         .636         .435

Math Variables
  Composite 1         .712         .719         .705         .709         .660
  Composite 2         .695         .696         .695         .701         .581
  Composite 3         .641         .628         .654         .644         .492
  Composite 4         .450         .428         .470         .455         .257

Factor Correlation
  Reading vs. Math    .796         .796         .795         .797         .791

Note. NFI = Normed Fit Index. NNFI = Non-Normed Fit Index. CFI = Comparative Fit Index.

Page 13

Normal Curve Equivalent Means & Standard Deviations for Students in Grades 10 and 11, Site 3 School District

                  Reading         Science         Math
                  M      SD       M      SD       M      SD

Grade 10
  SWD only        16.4   12.7     25.5   13.3     22.5   11.7
  LEP only        24.0   16.4     32.9   15.3     36.8   16.0
  LEP & SWD       16.3   11.2     24.8    9.3     23.6    9.8
  Non-LEP/SWD     38.0   16.0     42.6   17.2     39.6   16.9
  All students    36.0   16.9     41.3   17.5     38.5   17.0

Grade 11
  SWD only        14.9   13.2     21.5   12.3     24.3   13.2
  LEP only        22.5   16.1     28.4   14.4     45.5   18.2
  LEP & SWD       15.5   12.7     26.1   20.1     25.1   13.0
  Non-LEP/SWD     38.4   18.3     39.6   18.8     45.2   21.1
  All students    36.2   19.0     38.2   18.9     44.0   21.2

Page 14

Site 2 Grade 7 SAT 9 Subsection Scores

Subgroup          Reading    Math       Language   Spelling

LEP Status

LEP
  Mean            26.3       34.6       32.3       28.5
  SD              15.2       15.2       16.6       16.7
  N               62,273     64,153     62,559     64,359

Non-LEP
  Mean            51.7       52.0       55.2       51.6
  SD              19.5       20.7       20.9       20.0
  N               244,847    245,838    243,199    246,818

SES

Low SES
  Mean            34.3       38.1       38.9       36.3
  SD              18.9       17.1       19.8       20.0
  N               92,302     94,054     92,221     94,505

Higher SES
  Mean            48.2       49.4       51.7       47.6
  SD              21.8       21.6       22.6       22.0
  N               307,931    310,684    306,176    312,321
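One way to compare the LEP/non-LEP gaps across subsections is a standardized mean difference. The sketch below computes Cohen's d with a pooled standard deviation from the Site 2 Grade 7 means, SDs, and Ns for Reading and Math; the choice of Cohen's d and the function name `cohens_d` are illustrative assumptions, not an analysis reported in the presentation.

```python
from math import sqrt


def cohens_d(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Standardized mean difference (group a minus group b) using the pooled SD."""
    pooled_var = ((n_a - 1) * sd_a**2 + (n_b - 1) * sd_b**2) / (n_a + n_b - 2)
    return (mean_a - mean_b) / sqrt(pooled_var)


# Non-LEP vs. LEP values from the Site 2 Grade 7 SAT 9 table
d_reading = cohens_d(51.7, 19.5, 244_847, 26.3, 15.2, 62_273)
d_math = cohens_d(52.0, 20.7, 245_838, 34.6, 15.2, 64_153)

print(f"Reading gap (d): {d_reading:.2f}")
print(f"Math gap (d):    {d_math:.2f}")
```

Under these assumptions the gap is larger in Reading, the more language-dependent subsection, than in Math.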

Page 15

Site 4 Grade 8 Descriptive Statistics for the SAT 9 Test Scores by Strands

                  Reading    Math       Math Calculation   Math Analytical

Non-LEP/Non-SWD
  Mean            45.63      49.30      49.09              48.75
  SD              21.10      20.47      20.78              19.61
  N               9217       9118       9846               9250

LEP only
  Mean            20.26      36.00      39.20              33.86
  SD              16.39      18.48      21.25              16.88
  N               692        687        696                699

SWD only
  Mean            18.86      27.82      28.42              29.10
  SD              19.70      14.10      15.76              15.14
  N               872        843        883                873

LEP/SWD
  Mean            9.78       21.37      22.75              22.87
  SD              11.50      10.75      12.94              12.06
  N               93         92         97                 94

Page 16

Accommodations for SWD/LEP

Accommodations that are appropriate for the particular subgroup should be used

Page 17

Why Should English Language Learners be Accommodated?

Their possible English language deficiency may interfere with their content knowledge performance.

Assessment tools may be culturally and linguistically biased for these students.

Linguistic complexity of the assessment tools may be a source of measurement error.

Language factors may be a source of construct irrelevant variance.

Page 18

SY 2000-2001 Accommodations Designated for ELLs Cited in States’ Policies

There are 73 accommodations listed:

N: Not Related

R: Remotely Related

M: Moderately Related

H: Highly Related

From: Rivera (2003) State assessment policies for English language learners. Presented at the 2003 Large-Scale Assessment Conference

Page 19

SY 2000-2001 Accommodations Designated for ELLs Cited in States’ Policies

I. Timing/Scheduling (N = 5)

N 1. Test time increased

N 2. Breaks provided

N 3. Test schedule extended

N 4. Subtests flexibly scheduled

N 5. Test administered at time of day most beneficial to test-taker

N = not related; R = remotely related; M = moderately related; H = highly related

Page 20

There are 73 Accommodations Listed

47 or 64% are not related

7 or 10% are remotely related

8 or 11% are moderately related

11 or 15% are highly related
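The percentages above follow from the four counts. As a quick arithmetic check, this short sketch recomputes them from the counts on the slide; the dictionary name `relatedness_counts` is an illustrative assumption.

```python
# Counts of accommodations by rated relevance to ELLs, as listed on the slide
relatedness_counts = {
    "not related": 47,
    "remotely related": 7,
    "moderately related": 8,
    "highly related": 11,
}

total = sum(relatedness_counts.values())

# Percentage of all listed accommodations in each category, rounded to whole numbers
percentages = {
    label: round(100 * count / total) for label, count in relatedness_counts.items()
}

for label, pct in percentages.items():
    print(f"{relatedness_counts[label]:>2} of {total} ({pct}%) are {label}")
```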

Page 21

A Clear Language of Instruction and Assessment Works for ELLs, SWDs, and Everyone

What is language modification of test items?

Page 22

Examining Complex Linguistic Features in Content-Based Test Items

Feature   Feature Description             Categories Combined

1         Item length                     1, 2, 4, 45
2         Vocabulary                      3, 26, 27
3         Nominal heaviness               5, 6, 29, 30, 31, 32
4         Verb voice                      7, 33
5         Modal                           8, 34
6         Relative clause                 9, 10, 11, 35, 36, 37
7         Adverbial modification          12, 13, 14, 15, 16, 17, 38, 39, 40, 41
8         Conditional clause              18, 19
9         Complement clause               20, 44
10        Sentence structure              28, 42, 43, 46
11        Preferred argument structure    22, 23, 47, 48
12        Question form                   21
13        Global difficulty               24
14        Content interest                25

Page 23

Linguistic Modification Concerns

Familiarity/frequency of non-math vocabulary: unfamiliar or infrequent words changed

• census > video game

• A certain reference file > Mack’s company

Length of nominals: long nominals shortened

• last year’s class vice president > vice president

• the pattern of puppy’s weight gain > the pattern above

Question phrases: complex question phrases changed to simple question words

• At which of the following times > When

• which is best approximation of the number > approximately how many

Page 24

Linguistic Modification cont.

Conditional clauses: conditionals either replaced with separate sentences or order of conditional and main clause changed

• If Lee delivers x newspapers > Lee delivers x newspapers

• If two batteries in the sample were found to be dead > he found three broken pencils in the sample

Relative clauses: relative clauses either removed or re-cast

• A report that contains 64 sheets of paper > He needs 64 sheets of paper for each report

Voice of verb phrase: passive verb forms changed to active

• The weights of 3 objects were compared > Sandra compared the weights of 3 rabbits

• If a marble is taken from the bag > if you take a marble from the bag

Page 25

Original:

2. The census showed that three hundred fifty-six thousand, ninety-seven people lived in Middletown. Written as a number, that is:

A. 350,697

B. 356,097

C. 356,907

D. 356,970

Modified:

2. Janet played a video game. Her score was three hundred fifty-six thousand, ninety-seven. Written as a number, that is:

A. 350,697

B. 356,097

C. 356,907

D. 356,970

Page 26

Interview Study

Table 1. Student Perceptions Study: First Set (N=19)

Item #   Original item chosen   Revised item chosen
1        3                      16
2        4                      15
3        10                     9
4        11                     8

Table 2. Student Perceptions Study: Second Set (N=17)

Item #   Original item chosen   Revised item chosen
5        3                      14
6        4.5a                   12.5
7        2                      15
8        2                      15

Page 27

Many students indicated that the language in the revised item was easier:

“Well, it makes more sense.”

“It explains better.”

“Because that one’s more confusing.”

“It seems simpler. You get a clear idea of what they want you to do.”

Page 28

Issues in the ELL Special Education Eligibility

Issues concerning authenticity of English language proficiency tests

Issues and problems in identifying students with learning disabilities in general

Distribution of English language proficiency across ELL/non-ELL student categories

Page 29

Issues concerning authenticity of English language proficiency tests

Issues in theoretical bases (discrete point approach, holistic approach, pragmatic approach)

Issues in content coverage (language proficiency standards)

Issues concerning psychometrics of the assessment

Low relationship between ELL classification categories and English proficiency scores

Page 30

Issues and problems in identifying students with learning disabilities in general

A large majority of students with disabilities fall in the learning disability category

The validity of identifying students with learning disabilities is questionable

Page 31

Distribution of English language proficiency across ELL/non-ELL student categories

Most of the existing tests of English proficiency lack sufficient discrimination power

A large number of ELL students perform higher than non-ELL students

The line between ELL and non-ELL students in English proficiency is not clear

Page 32

Reducing the Language Load of Test Items

Reducing unnecessary language complexity of test items helps ELL students (and to some extent SWDs) present a more valid picture of their content knowledge.

The language clarification of test items may be used as a form of accommodation for English language learners.

The results of our research suggest that linguistic complexity of test items may be a significant source of measurement error for ELL students.

Page 33

Conclusions and Recommendation

1. Classification Issues

Classifications of ELLs and SWDs:

• Must be based on multiple criteria that have predictive power for such classifications

• These criteria must be objectively defined

• Must have sound theoretical and practical bases

• Must be easily and objectively measurable

Page 34

Conclusions and Recommendation

2. Assessment Issues

Assessment for ELLs and SWDs:

• Must be based on sound psychometric principles

• Must be controlled for all sources of nuisance or confounding variables

• Must be free of unnecessary linguistic complexities

• Must include a sufficient number of ELLs and SWDs in its development process (field testing, standard setting, etc.)

• Must be free of biases, such as cultural biases

• Must be sensitive to students’ linguistic and cultural needs

Page 35

3. Issues concerning special education eligibility, particularly in placing ELL students at lower levels of English language proficiency in the learning/reading disability category

There are psychometric issues with the English language proficiency tests

Standardized achievement tests may not provide reliable and valid assessment of ELL students

Reliable and valid measures are needed to distinguish between learning disability and low level of English proficiency

Page 36

Conclusions and Recommendation

4. Accommodation Issues

Accommodations:

• Must be relevant to the subgroups of students

• Must be effective in reducing the performance gap between accommodated and non-accommodated students

• Must be valid, that is, accommodations should not alter the construct being measured

• The results could be combined with the assessments under standard conditions

• Must be feasible in the national and state assessments

Page 37

Now for a visual art representation of invalid accommodations…
