Post on 27-Mar-2015

Test Development

Nicole Williams, MSN, RN-BC

Content Manager

Sarah Hagge, PhD

Psychometrician

Objectives

Identify the negative impact of examination bias

Discuss the impact of enemy items on test validity

Differential Item Functioning (DIF)

Because of the high-stakes nature of the NCLEX®, numerous processes are in place to ensure that the exam is psychometrically sound, valid and legally defensible

One such process includes regular review of the NCLEX for potential biases

Detecting Bias

Bias exists when the test construct measured in one group differs from the construct measured in another group taking the same exam

For example, bias would exist in the NCLEX if it measured nursing knowledge in one group of candidates and another construct, such as reading comprehension, in another

Consequence of Bias

The goal of the NCLEX is to classify candidates into two groups:
  Those who have adequate knowledge, skills and ability to practice entry-level nursing safely
  Those who do not

If bias occurs, the construct of entry-level nursing knowledge may not be measured accurately for some groups of candidates

Methods to Detect and Minimize Bias

Item Development
  Writing
  Review
    Editorial
    SME
    Sensitivity

Analyses
  Differential Item Functioning (DIF)
  Readability

What is DIF?

Investigates bias at the individual item level

Exists when two groups of candidates with similar ability perform differently on an item

In short, one may consider whether the candidate’s response to the item depends on the group to which he or she belongs

DIF Analyses

Statistical analyses are conducted on a focal vs. reference group
  Focal: group of interest (generally the minority)
  Reference: group with whom the focal group is compared (generally the majority)

Method
  Rasch Separate Calibration t-test
  Compares the difference in difficulty of an item for the focal and reference groups
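
The comparison described on this slide can be sketched in a few lines. The slides do not give the formula; the sketch assumes a common formulation of the separate calibration t-test, in which the difference between the two group-specific Rasch difficulty estimates is divided by their combined standard error. The function name and the sample values are hypothetical and are not NCLEX data.

```python
import math

def separate_calibration_t(d_focal, se_focal, d_ref, se_ref):
    """t statistic for the gap between two Rasch difficulty estimates (in logits).

    Assumed formulation: difference in difficulty divided by the
    joint standard error of the two separately calibrated estimates.
    """
    return (d_focal - d_ref) / math.sqrt(se_focal ** 2 + se_ref ** 2)

# Illustrative values only: the item looks harder for the focal group.
t = separate_calibration_t(d_focal=0.85, se_focal=0.20, d_ref=0.25, se_ref=0.05)
print(round(t, 2))
```

A large absolute value of t suggests that the item's difficulty differs between groups by more than estimation error alone would explain. The cutoff used to flag an item is a policy decision that the slides do not state.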

NCLEX DIF Procedure

Routine DIF analyses are conducted semi-annually

Data include all U.S.-educated candidates

Focal and Reference Groups

Gender
  Reference: Female
  Focal: Male

Ethnicity
  Reference: Caucasian
  Focal: African American, Hispanic, Asian Other, Asian Indian, Native American and Pacific Islander

2010 U.S.-Educated NCLEX Candidates

78,222 PN candidates reported gender
164,175 RN candidates reported gender

74,147 PN candidates reported ethnicity
152,069 RN candidates reported ethnicity

[1] 22,008 candidates did not provide information regarding ethnicity; 5,827 candidates did not provide information on gender.

NCLEX DIF Procedure Continued

Analyses are conducted on all pretest and operational items

Minimum sample size requirements
  50 focal group candidates
  400 reference group candidates

Item difficulty is estimated for the two separate groups of candidates
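
As a minimal sketch, the sample size requirements above can be applied as an eligibility filter before any group-specific difficulty is estimated. The two thresholds come from the slide; the item identifiers and counts are hypothetical.

```python
MIN_FOCAL = 50        # minimum focal group candidates (from the slide)
MIN_REFERENCE = 400   # minimum reference group candidates (from the slide)

def meets_sample_size(n_focal, n_reference):
    """True when both groups are large enough for a DIF comparison."""
    return n_focal >= MIN_FOCAL and n_reference >= MIN_REFERENCE

# Hypothetical (focal, reference) response counts per item.
counts = {"item_A": (62, 1180), "item_B": (41, 950), "item_C": (75, 390)}

eligible = [item for item, (n_f, n_r) in counts.items()
            if meets_sample_size(n_f, n_r)]
print(eligible)  # ['item_A']
```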

Content Review

Items with large differences in difficulty are flagged for content review

Items displaying statistical DIF may still be content appropriate and valid
  Item content may be within the scope of entry-level nurse practice
    Obstetrics and gynecology
    Operating medical equipment

Content Review Panel

Panel of subject matter experts (SMEs) convened to review items displaying statistical DIF

Panel composition must contain at least
  Five members
  Three ethnic focal groups
  One male
  One member with a background in linguistics
  One licensed RN
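
The composition rules above can be written as a simple check. This is a sketch under stated assumptions: "three ethnic focal groups" is read here as at least three distinct ethnic focal groups being represented, which is one possible interpretation of the slide, and the `Panelist` fields and sample panel are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Panelist:
    name: str
    gender: str
    focal_ethnic_group: Optional[str] = None  # None if not in an ethnic focal group
    linguistics_background: bool = False
    licensed_rn: bool = False

def panel_is_valid(panel):
    """Check the minimum composition rules listed on the slide."""
    focal_groups = {p.focal_ethnic_group for p in panel if p.focal_ethnic_group}
    return (
        len(panel) >= 5                                   # five members
        and len(focal_groups) >= 3                        # assumed reading, see lead-in
        and any(p.gender == "male" for p in panel)        # one male
        and any(p.linguistics_background for p in panel)  # one linguist
        and any(p.licensed_rn for p in panel)             # one licensed RN
    )

# Hypothetical panel for illustration.
panel = [
    Panelist("A", "female", "African American", licensed_rn=True),
    Panelist("B", "male", "Hispanic"),
    Panelist("C", "female", "Asian Indian"),
    Panelist("D", "female", linguistics_background=True),
    Panelist("E", "female", licensed_rn=True),
]
print(panel_is_valid(panel))      # True
print(panel_is_valid(panel[:4]))  # False: fewer than five members
```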

Content Review Panel Continued

Panel reviews all items flagged for statistical DIF in the past six months for
  Potential bias
  Content relevance for entry-level nursing

Items identified for bias are forwarded to the NCLEX Examination Committee

Content-irrelevant items are removed from operational use

Sample Item #1

The nursing care plan for a 74-year-old resident of a long-term care facility includes actions to promote the quality and duration of the client’s nighttime sleep. Which of the following behaviors, if exhibited by the client, would indicate an appropriate action?

1. The client does mild calisthenics 1 hour before bedtime.

2. The client takes walks in the halls primarily in the afternoon.

3. The client takes naps from mid- to late afternoon.

4. The client drinks warm tea before bedtime.

Sample Item #2

The nurse is caring for a 9-year-old client with bronchial asthma who was admitted with pneumonia. The client is on bed rest. Which of the following would be most appropriate to offer the client?

1. Coloring book and crayons

2. A toy stethoscope and syringe with needle

3. Beads and thread for making jewelry

4. A radio and telephone

Conclusion

Goal of NCLEX is to ensure public safety by classifying candidates based on whether they can practice entry-level nursing safely and effectively

Analyses such as DIF are conducted to ensure that all candidates receive an examination that accurately measures their entry-level nursing knowledge

Impact of Enemy Items

Effective item sampling from a specified test plan is essential to ensure that the exam is psychometrically sound, valid and legally defensible

One such process which assists in this endeavor is assessing and eliminating item duplication or enemy item pairs

How are Enemy Pairs Developed?

Random
  Occurs coincidentally in the normal process of item development

Direct Intent
  Items similar in nature are purposefully developed

What is an Enemy Item Pair?

Two or more items with content so similar that they are not placed on the same exam, because doing so would impair:
  Content validity
  Face validity
  Measurement precision

Content Validity

The consistency with which the content is represented on the exam may be impacted

The content domain may be considered “oversampled”

Large impact on a standardized exam, because a specific number of items is allocated to that content domain

Face Validity

Item duplication may cause the candidate to question exam validity

Candidate response may be altered due to the perception that the item is redundant

Candidate may become distracted, believing that the item is a “trick”

Measurement Precision

Item duplication may result in what is called Conditional Dependence

The two or more items are most likely correlated

Sampling two dependent areas may lead to errors in ability estimates

Types of Enemy Item Pairs

Duplicate
  Items
  Stems
  Options
  Stimuli

Overlapping Content

Duplicate Items

All item components are virtually identical
  True duplicates: the same item except for punctuation or other small differences

Duplicate Stems

Identical item stem and varying options
  May occur as a result of developing items used as “variants”
  Less likely to occur when developing authentic items from “scratch”

Duplicate Options

Similar stem and near-identical item options
  Is considered a cost-effective strategy used by test developers to increase item development productivity

With response options so similar, candidates may become confused

Duplicate Stimuli

Identical exam stimulus, such as
  Graphics
  Exhibits
  Case scenarios

Using same stimuli across exam items may create candidate confusion

Candidate exposure to the same stimuli multiple times may introduce fatigue

Overlapping Content

Similar content exists in the items (stem or options); however, the verbiage is different
  Same concept, phrased differently
  Difficult to detect; a deliberate effort is needed to seek these pairs out
  Can occur in differing item formats, e.g. multiple-choice and multiple response

Management of Enemy Item Pairs

Item Development Process

Test Publishing Efforts

Post Exam Administration

Item Development Enemy Management

Efforts are made at the beginning of item development to identify and label enemy pairs

Automated software is now available that can isolate potential enemy pairs

Subject Matter Experts (SMEs) then review potential enemy item pairs, making identification more precise
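
The slides do not say how the automated software works. As a rough illustration of the screen-then-review workflow, the sketch below flags item pairs whose wording overlaps heavily, using Jaccard similarity on word sets. The similarity measure, the 0.6 threshold, the function names and the sample item texts are all assumptions made for this example; flagged pairs would still go to SMEs for review.

```python
import re
from itertools import combinations

def tokens(text):
    """Lowercased set of alphanumeric word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def jaccard(a, b):
    """Share of distinct words that two texts have in common."""
    ta, tb = tokens(a), tokens(b)
    if not ta and not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)

def candidate_enemy_pairs(items, threshold=0.6):
    """Return (id_a, id_b, score) for pairs at or above the threshold."""
    pairs = []
    for (id_a, text_a), (id_b, text_b) in combinations(items.items(), 2):
        score = jaccard(text_a, text_b)
        if score >= threshold:
            pairs.append((id_a, id_b, round(score, 2)))
    return pairs

# Hypothetical item stems, for illustration only.
items = {
    "Q1": "Which action should the nurse take first for a client with chest pain?",
    "Q2": "Which action should the nurse take first for a client reporting chest pain?",
    "Q3": "Which food should the nurse recommend for a client on a low-sodium diet?",
}
pairs = candidate_enemy_pairs(items)
print(pairs)  # [('Q1', 'Q2', 0.86)]
```

A word-overlap measure like this would catch near-duplicate stems but would likely miss overlapping content that is phrased differently, which is one reason expert review follows the automated step.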

Test Publishing Enemy Management

Once one or more enemy items are labeled, test developers can activate test driver specifications to prohibit the inclusion of an enemy item once one item in the enemy set has been selected
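
A minimal sketch of this exclusion rule follows. It assumes enemy sets are stored as sets of item identifiers and walks a fixed pool order as a stand-in for whatever selection algorithm the test driver actually uses; the function name and the data are hypothetical.

```python
def select_items(pool, enemy_sets, test_length):
    """Pick items in pool order, skipping enemies of items already selected."""
    # Map each item to every other item that shares an enemy set with it.
    enemies_of = {}
    for group in enemy_sets:
        for item in group:
            enemies_of.setdefault(item, set()).update(group - {item})

    selected, blocked = [], set()
    for item in pool:
        if len(selected) == test_length:
            break
        if item in blocked:
            continue  # an enemy of this item is already on the exam
        selected.append(item)
        blocked |= enemies_of.get(item, set())
    return selected

# Hypothetical pool and enemy sets.
pool = ["Q1", "Q2", "Q3", "Q4", "Q5"]
enemy_sets = [{"Q1", "Q2"}, {"Q3", "Q5"}]
exam = select_items(pool, enemy_sets, test_length=3)
print(exam)  # ['Q1', 'Q3', 'Q4']
```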

Post Administration Enemy Management

Test developers may analyze item intercorrelations

High intercorrelations may indicate potential enemy pairs

This method may capture the most obscure enemy pairs: those that are not immediately identifiable and are least likely to impact test validity and measurement
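
The intercorrelation screen can be sketched as follows, assuming scored responses (1 = correct, 0 = incorrect) from the same candidates on each item. The Pearson correlation on 0/1 data, the 0.7 threshold, the function names and the response vectors are assumptions for illustration; the slides do not specify a statistic or cutoff.

```python
import math
from itertools import combinations

def pearson(x, y):
    """Pearson correlation of two equal-length score vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # no variation, correlation undefined; treat as 0
    return cov / math.sqrt(vx * vy)

def flag_high_intercorrelations(responses, threshold=0.7):
    """Return (item_a, item_b, r) for pairs correlated at or above threshold."""
    flagged = []
    for a, b in combinations(sorted(responses), 2):
        r = pearson(responses[a], responses[b])
        if r >= threshold:
            flagged.append((a, b, round(r, 2)))
    return flagged

# Hypothetical scored responses from eight candidates.
responses = {
    "Q1": [1, 1, 0, 1, 0, 0, 1, 1],
    "Q2": [1, 1, 0, 1, 0, 0, 1, 0],
    "Q3": [0, 1, 1, 0, 1, 0, 1, 0],
}
flagged = flag_high_intercorrelations(responses)
print(flagged)  # [('Q1', 'Q2', 0.77)]
```

Raw correlations also reflect candidate ability, so in practice an analyst would probably want to look at dependence that remains after ability is taken into account before treating a pair as enemies.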

Future Research

DIF
  Investigate DIF using different reference/focal groups

Enemy Item Management
  Impact of various enemy pairs on test validity: does one type of enemy pair have a stronger/lesser impact on test validity and measurement?

References

Exam Publications

Ensuring Validity of NCLEX® With Differential Item Functioning Analysis

Understanding the Impact of Enemy Items on Test Validity and Measurement Precision