Struggling for meaning in standards-based assessment
Mark Wilson, UC Berkeley
Outline
• What do we mean by “standards-based” assessments?
• Some current solutions to the problem of assessing standards
• An alternative
  – Learning performances
  – Learning progressions
  – Progress variables
What do we mean by “standards-based” assessments?
• What people often think they are getting:
  – A useful result for each standard
    • the “ideal approach”
  – The illusion of “standards-based” assessments
• What they are usually getting:
  – A single result that is somehow related to all, or a subset of, the standards
  – The reality of “standards-based” assessments
How standards-based is “standards-based”?
• “Fidelity”: how well do the assessments match the standards?
• High Fidelity: each standard has its own useable result
• Moderate Fidelity: each standard is represented by at least one item in the assessments
• Low Fidelity: the items match some of the standards
Why can’t each standard be assessed?
[Figure: Fidelity versus cost when total cost is fixed. Axes: number of items, fidelity, $ per item. As the cost per item rises, the number of items we can afford falls, and fidelity falls with it.]
i.e., in the “ideal approach” we need so many items per standard that we can’t afford it.
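The cost squeeze can be made concrete with a back-of-the-envelope sketch. All numbers below (budget, cost per item, number of standards, single-item reliability) are hypothetical, chosen only to illustrate the argument; the Spearman-Brown prophecy formula projects how the reliability of a per-standard score grows with the number of items devoted to that standard.

```python
def spearman_brown(rho_one_item, k):
    """Projected reliability of a k-item score, given the reliability
    of a single item (Spearman-Brown prophecy formula)."""
    return k * rho_one_item / (1 + (k - 1) * rho_one_item)

# Hypothetical numbers, for illustration only:
BUDGET = 5000.0        # total testing budget
COST_PER_ITEM = 25.0   # cost to develop, administer, and score one item
N_STANDARDS = 50       # number of standards to report on
RHO_ONE = 0.2          # assumed reliability of a single item

affordable_items = int(BUDGET // COST_PER_ITEM)       # 200 items in total
items_per_standard = affordable_items // N_STANDARDS  # 4 items per standard
per_standard_reliability = spearman_brown(RHO_ONE, items_per_standard)

# To report a usable result per standard (reliability ~0.8) we would need
# about 16 items per standard, i.e. 800 items: four times the budget.
print(round(per_standard_reliability, 2))
print(16 * N_STANDARDS * COST_PER_ITEM)
```

With these assumed numbers, the four affordable items per standard yield a per-standard reliability around 0.5, too noisy to report, which is exactly the sense in which the “ideal approach” is unaffordable.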
Common Solutions: “Standards-based”
• One (more or less) item per standard
  – not enough for actual assessment of standards
  – also used to provide emphasis among standards (i.e., “gold standards”)
• Sample standards over time
• Assess only a certain subset of the standards
• Validate through “alignment review”
• Decide to have a much smaller set of standards
  – Popham’s “instructionally-sensitive assessments”
E.g. #1
E.g. #2
“Standards-based” assessments
• Do not have high fidelity to standards
• Are what can be afforded
• Still maintain a “threat” effect
  – although the low density of items per standard means that the “threat” on any one standard is low
Thinking about an Alternative
• “A mile wide and an inch deep”
  – the now-classic criticism of US curricula in mathematics and science
• Need for standards to be interpretable by educators, policy-makers, etc.
• Need to enable a long-term view of student growth
• Need to find a more efficient way to use item information than in the “ideal approach”
Learning Performances
• Learning performances: a way of elaborating on content standards by specifying what students should be able to do when they achieve a standard
  – e.g., students should be able to describe phenomena, use models to explain patterns in data, construct scientific explanations, or test hypotheses
  – Reiser (2002), Perkins (1998)
Learning performance example
• Benchmark (AAAS, 1993):
  – [The student will understand that] Individual organisms with certain traits are more likely than others to survive and have offspring
• LP expansion (Reiser et al., 2003):
  – Students identify and represent mathematically the variation on a trait in a population.
  – Students hypothesize the function a trait may serve and explain how some variations of the trait are advantageous in the environment.
  – Students predict, supported with evidence, how the variation on the trait will affect the likelihood that individuals in the population will survive an environmental stress.
Learning progressions
• Learning progressions: descriptions of the successively more sophisticated ways of thinking about an idea that follow one another as students learn
  – also known as learning trajectories, progressions of developmental competence, and profile strands
• More than one path leads to competence
• Need to engage in curriculum debate about which learning progressions are most important
  – Try to choose them so that we end up with fewer standards per grade level
Learning progression examples
• Evolutionary Biology
  – Catley, K., Reiser, B., and Lehrer, R. (2005). Tracing a prospective learning progression for developing understanding of evolution.
• Atomic-Molecular Theory
  – Smith, C., Wiser, M., Anderson, C.W., Krajcik, J., and Coppola, B. (2004). Implications of research on children’s learning for assessment: matter and atomic molecular theory.
• Both available at:
  – http://www7.nationalacademies.org/bota/Test_Design_K-12_Science.html
Progress Variables
• Progress variable: the assessment expression of a learning progression
• Aim is to use what we know about meaningful differences in item difficulty to make the interpretation of the results more efficient
  – Borrow interpretative and psychometric strength from easier and more difficult items, so that we don’t need as many items as the “ideal approach” does.
• Progress variables are a principal component of the BEAR Assessment System (Wilson, 2005; Wilson & Sloane, 2000):
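This “borrowed strength” can be sketched with the simplest item response model, the Rasch model, in which success depends only on the gap between a student’s location and an item’s difficulty. The difficulty values and the student location below are hypothetical, chosen to show items spread along a progress variable.

```python
import math

def rasch_p(theta, b):
    """Rasch model: probability that a student at location theta
    succeeds on an item of difficulty b."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

# Hypothetical item difficulties, easiest to hardest along the variable:
difficulties = [-2.0, -1.0, 0.0, 1.0, 2.0]

theta = 0.5  # one student's (assumed) location on the variable
probs = [rasch_p(theta, b) for b in difficulties]

# Success probability falls monotonically as items get harder, so even a
# few items that straddle the student's location pin that location down:
# easier and harder items both contribute information.
print([round(p, 2) for p in probs])
```

Because every calibrated item carries information about where a student sits on the whole variable, fewer items are needed than when each standard must be estimated in isolation.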
The BEAR Assessment System
4 principles: 4 building blocks
Principle 1: Developmental Perspective
Building Block 1: Construct Map
• Developmental perspective
  – the assessment system should be based on a developmental perspective of student learning
• Progress variable
  – a visual metaphor for how the students develop and for how we think their item responses might change
Example: Why things sink and float
[Construct map figure, relating levels of understanding to lessons and assessment activities.
Levels of understanding, from most to least sophisticated:
  – Buoyancy depends on the density of the object relative to the density of the medium.
  – Buoyancy depends on the density of the object.
  – Buoyancy depends on the mass and volume of the object.
  – Buoyancy depends on the volume of the object.
  – Buoyancy depends on the mass of the object.
Lessons: 1: Introduction; 4: Mass; 6: Volume; 7: Mass and Volume; 10: Density of Object; 11: Density of Medium; 12: Relative Density.
Assessment activities: Pretest; Reflective Lessons @4, @6, @7, @10, @11; Post test.]
Principle 2: Match between curriculum and assessment
Building Block 2: Items design
• Instruction & assessment match
  – there must be a match between what is taught and what is assessed
• Items design
  – a set of principles that allows one to observe the students under a set of standard conditions that span the intended range of the item contexts
Example: Why things sink and float
Please answer the following question. Write as much information as you need to explain your answer. Use evidence, examples, and what you have learned to support your explanations.

Why do things sink and float?
Principle 3: Interpretable by teachers
Building Block 3: Outcome space
• Management by teachers
  – teachers must be the managers of the system, and hence must have the tools to use it efficiently and to use the assessment data effectively and appropriately
• Outcome space
  – categories of student responses must make sense to teachers
Example: Why things sink and float
Level  What the Student Knows
RD     Relative Density
D      Density
MV     Mass and Volume
M      Mass
V      Volume
PM     Productive Misconception
UF     Unconventional Feature
OT     Off Target
NR     No Response
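One way such an outcome space feeds into scoring is as an ordered category map. The category codes follow the slide; the numeric scores, and the tie between the single-feature Mass and Volume answers, are assumptions made only for illustration.

```python
# Hypothetical scoring guide for the "Why do things sink and float?"
# question: map each outcome-space category to an ordered score.
OUTCOME_SPACE = {
    "NR": 0,  # No Response
    "OT": 0,  # Off Target
    "UF": 1,  # Unconventional Feature
    "PM": 2,  # Productive Misconception
    "M":  3,  # Mass (single-feature answer)
    "V":  3,  # Volume (single-feature answer)
    "MV": 4,  # Mass and Volume
    "D":  5,  # Density
    "RD": 6,  # Relative Density
}

def score_response(category):
    """Map a teacher's category judgment to an ordered score."""
    return OUTCOME_SPACE[category]

# A class set of categorized responses, scored for the measurement model:
responses = ["M", "PM", "D", "MV", "RD", "V"]
scores = [score_response(c) for c in responses]
print(scores)  # [3, 2, 5, 4, 6, 3]
```

The point of the design is that teachers assign the categories, and the categories, not opaque numbers, carry the meaning; the numeric scores exist only so the measurement model can order them.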
Principle 4: Evidence of quality
Building Block 4: Measurement model
• Evidence of quality– reliability and validity evidence, evidence for fairness
• Measurement model– multidimensional item response models, to provide links over time both longitudinally within cohorts and across cohorts
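A much-simplified sketch of the measurement-model step: locating a student on the progress variable from dichotomously scored responses, using a Rasch model and a grid-search maximum likelihood estimate. The item difficulties and response patterns are hypothetical; an operational system would use multidimensional models and linking designs, as the slide notes.

```python
import math

def rasch_p(theta, b):
    """Rasch probability of success on an item of difficulty b."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def log_likelihood(theta, difficulties, responses):
    """Log-likelihood of a 0/1 response pattern at location theta."""
    ll = 0.0
    for b, x in zip(difficulties, responses):
        p = rasch_p(theta, b)
        ll += math.log(p if x else 1.0 - p)
    return ll

def estimate_location(difficulties, responses):
    """Grid-search MLE for the student's location on the variable."""
    grid = [i / 100.0 for i in range(-400, 401)]  # theta in [-4, 4]
    return max(grid, key=lambda t: log_likelihood(t, difficulties, responses))

difficulties = [-2.0, -1.0, 0.0, 1.0, 2.0]  # items along the variable (assumed)
fall = estimate_location(difficulties, [1, 1, 0, 0, 0])    # 2 of 5 correct
spring = estimate_location(difficulties, [1, 1, 1, 1, 0])  # 4 of 5 correct

# Growth is expressed as movement along the same variable, which is what
# lets results be linked over time, within and across cohorts.
print(fall < spring)  # True
```

Because both testing occasions are reported on the same scale, the difference between the two estimates is directly interpretable as progress along the construct map.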
Example: Evaluate progress of a group
[Figure: group results displayed along the progress variable, with response categories ordered OT, UF, PM, M, V, MV, D, RD]
Example: Evaluate a student’s locations over time
Embedded Assessments
BEAR Assessment System: Principles
• Developmental Perspective: need a framework for communicating meaning
• Match between Instruction and Assessment: need methods of gathering data that are acceptable and useful to all participants
• Interpretable by Teachers: need a way to value what we see in student work
• Evidence of Quality: need a technique for interpreting data that allows meaningful reporting to multiple audiences
In conclusion…
• Achieving meaningful measures is tough under any circumstances,
• but especially so in an accountability situation,
  – where the requirements for accountability and the scale of the evaluation make it very expensive.
• Strategies like learning performances, learning progressions, and progress variables are needed to make meaning possible, and affordable.
References
• American Association for the Advancement of Science (1993). Benchmarks for science literacy. New York: Oxford University Press.
• Catley, K., Reiser, B., and Lehrer, R. (2005). Tracing a prospective learning progression for developing understanding of evolution. Commissioned paper prepared for the National Research Council’s Committee on Test Design for K-12 Science Achievement, Washington, DC. (http://www7.nationalacademies.org/bota/Test_Design_K-12_Science.html)
• Reiser, B.J., Krajcik, J., Moje, E., and Marx, R. (2003). Design strategies for developing science instructional materials. Paper presented at the National Association for Research in Science Teaching Annual Meeting, March, Philadelphia, PA.
• Smith, C., Wiser, M., Anderson, C.W., Krajcik, J., and Coppola, B. (2004). Implications of research on children’s learning for assessment: Matter and atomic molecular theory. Commissioned paper prepared for the National Research Council’s Committee on Test Design for K-12 Science Achievement, Washington, DC. (http://www7.nationalacademies.org/bota/Test_Design_K-12_Science.html)
• Wilson, M. (2005). Constructing measures: An item response modeling approach. Mahwah, NJ: Lawrence Erlbaum Associates.
• Wilson, M., & Bertenthal, M. (Eds.). (2005). Systems for state science assessment. Report of the Committee on Test Design for K-12 Science Achievement. Washington, DC: National Academy Press. (http://books.nap.edu/catalog/11312.html)
• Wilson, M., and Sloane, K. (2000). From principles to practice: An embedded assessment system. Applied Measurement in Education, 13(2), 181–208. (http://www.leaonline.com/doi/pdfplus/10.1207/S15324818AME1302_4)