test validity

Post on 01-Nov-2014


Item Analysis, Test Validity and Reliability

Prepared by: Rovel A. Aparicio, Mathematics Teacher

THERE IS ALWAYS A BETTER WAY

Stages in Test Construction

I. Planning the Test

A. Determining the Objectives
B. Preparing the Table of Specifications
C. Selecting the Appropriate Item Format
D. Writing the Test Items
E. Editing the Test Items

Stages in Test Construction

II. Trying Out the Test

A. Administering the Test
B. Item Analysis
C. Preparing the Final Form of the Test

Stages in Test Construction

III. Establishing Test Validity
IV. Establishing Test Reliability
V. Interpreting the Test Scores

DISCRIMINATION INDEX

refers to the degree to which success or failure on an item indicates possession of the achievement being measured.

DIFFICULTY INDEX

the percentage of pupils who got the item right; interpreted as how easy or how difficult an item is.

Item Analysis

GOAL: Improve the test.

IMPORTANCE: Measures the effectiveness of individual test items.

ACTIVITY NO. 1

COMPUTE THE DIFFICULTY INDEX AND DISCRIMINATION INDEX OF A PERIODICAL TEST.

U-L INDEX METHOD (STEPS)

1. Score the papers and rank them from highest to lowest according to the total score.
2. Separate the top 27% and the bottom 27% of the papers.
3. Tally the responses made to each test item by each individual in the upper 27% group.
4. Tally the responses made to each test item by each individual in the lower 27% group.

U-L INDEX METHOD (STEPS)

5. Compute the difficulty index: d = (U + L) / (nU + nL)
6. Compute the discrimination index: D = (U - L) / nU, or D = (U - L) / nL (with nU = nL)
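In code, steps 5 and 6 reduce to two one-line formulas. A minimal Python sketch (the function names are illustrative, not from the slides):

```python
def difficulty_index(u, l, n_upper, n_lower):
    # d = (U + L) / (nU + nL): proportion of pupils in the two
    # groups who answered the item correctly.
    return (u + l) / (n_upper + n_lower)

def discrimination_index(u, l, n_group):
    # D = (U - L) / nU (with nU = nL): how much better the upper
    # group did on the item than the lower group.
    return (u - l) / n_group

# With 60 pupils, each 27% group has about 16 members. Suppose 14
# upper-group and 12 lower-group pupils answer an item correctly:
print(difficulty_index(14, 12, 16, 16))   # proportion correct across both groups
print(discrimination_index(14, 12, 16))   # upper-minus-lower difference rate
```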

No. of pupils tested: 60 (upper and lower 27% groups: 16 pupils each)

Item no. | Upper 27% | Lower 27% | Difficulty Index | Discrimination Index | Remarks
1 | 14 | 12 | 0.81 | 0.13 | Revised
2 | 10 | 6 | 0.50 | 0.25 | Revised
3 | 11 | 7 | 0.56 | 0.25 | Revised
4 | 9 | 2 | 0.34 | 0.44 | Retained
5 | 5 | 13 | 0.56 | -0.50 | Rejected
6 | 13 | 7 | 0.63 | 0.38 | Revised
7 | 13 | 4 | 0.53 | 0.56 | Retained
8 | 3 | 10 | 0.41 | -0.44 | Rejected
9 | 13 | 12 | 0.78 | 0.06 | Rejected
10 | 8 | 6 | 0.44 | 0.13 | Revised

DISCRIMINATION INDEX

< .09 Poor items (Reject)

.10-.39 Reasonably Good (Revise)

.40-1.00 Very Good items (Retain)

DIFFICULTY INDEX

.00-.20 Very Difficult

.21-.80 Moderately Difficult

.81-1.00 Very Easy
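The two rating scales above can be applied mechanically. A small sketch using the slide's thresholds (the boundary between "reject" and "revise" is taken as .10; function names are illustrative):

```python
def discrimination_remark(D):
    # < .10: poor item (reject); .10-.39: reasonably good (revise);
    # .40-1.00: very good item (retain).
    if D < 0.10:
        return "Rejected"
    if D < 0.40:
        return "Revised"
    return "Retained"

def difficulty_label(d):
    # .00-.20: very difficult; .21-.80: moderately difficult;
    # .81-1.00: very easy.
    if d <= 0.20:
        return "Very Difficult"
    if d <= 0.80:
        return "Moderately Difficult"
    return "Very Easy"

print(discrimination_remark(0.44), difficulty_label(0.34))
```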


Types of Test Validity

Content Validity
Construct Validity
Criterion-related Validity

Establishing Test Validity

Types of Validity | Meaning | Procedure

1. Content Validity | How well the sample of test tasks represents the domain of tasks to be measured. | Compare the test tasks with the test specifications describing the task domain under consideration (non-statistical).

Establishing Test Validity

Types of Validity | Meaning | Procedure

2. Construct Validity | How test performance can be described psychologically. | Experimentally determine what factors influence scores on the test. The procedure may be logical and statistical, using correlations and other statistical methods.

Establishing Test Validity

Types of Validity | Meaning | Procedure

3. Criterion-related Validity | How well test performance predicts future performance or estimates current performance on some valued measure other than the test itself. | Compare test scores with a measure of performance (grades) obtained at a later date (for prediction), or with another measure of performance obtained concurrently (for estimating present status). Primarily statistical: correlate test results with an outside criterion.

Establishing Test Reliability

Types of Reliability Measures

Measure of Stability
Measure of Equivalence
Measure of Stability and Equivalence
Measure of Internal Consistency

Establishing Test Reliability

Types of Reliability Measures | Methods of Estimating Reliability | Procedure

1. Measure of Stability | Test-retest method | Give a test twice to the same group, with any time interval between tests from several minutes to several years. (Pearson r)

Establishing Test Reliability

Types of Reliability Measures | Methods of Estimating Reliability | Procedure

2. Measure of Equivalence | Equivalent-forms method | Give two forms of a test to the same group in close succession. (Pearson r)

Establishing Test Reliability

Types of Reliability Measures | Methods of Estimating Reliability | Procedure

3. Measure of Stability and Equivalence | Test-retest with equivalent forms | Give two forms of a test to the same group, with increased time intervals between forms. (Pearson r)

Establishing Test Reliability

Types of Reliability Measures | Methods of Estimating Reliability | Procedure

4. Measure of Internal Consistency | Kuder-Richardson method | Give a test once. Score the total test and apply the Kuder-Richardson formula.
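The slides do not reproduce the Kuder-Richardson formula itself; the sketch below assumes the widely used KR-20 variant for items scored right/wrong (0/1):

```python
def kr20(score_matrix):
    # KR-20 reliability: r = (k / (k - 1)) * (1 - sum(pq) / var_total),
    # where k is the number of items, p is the proportion of pupils
    # answering an item correctly, q = 1 - p, and var_total is the
    # variance of the pupils' total scores.
    # score_matrix: one list of 0/1 item scores per pupil.
    n = len(score_matrix)
    k = len(score_matrix[0])
    totals = [sum(row) for row in score_matrix]
    mean = sum(totals) / n
    var_total = sum((t - mean) ** 2 for t in totals) / n
    sum_pq = 0.0
    for item in range(k):
        p = sum(row[item] for row in score_matrix) / n
        sum_pq += p * (1 - p)
    return (k / (k - 1)) * (1 - sum_pq / var_total)
```

When every item agrees perfectly with every other item, the formula returns 1.0, its maximum.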

Establishing Test Reliability

Types of Reliability Measures | Methods of Estimating Reliability | Procedure

5. Measure of Internal Consistency | Split-half method | Give a test once. Score equivalent halves of the test (e.g., odd- and even-numbered items). (Pearson r and Spearman-Brown formula)
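A sketch of the split-half computation: correlate the two half-test scores with Pearson r, then step the result up to full test length with the Spearman-Brown formula (function names are illustrative):

```python
def pearson_r(xs, ys):
    # Product-moment correlation between two lists of scores.
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

def split_half_reliability(odd_half, even_half):
    # Spearman-Brown step-up: estimate the reliability of the
    # full-length test from the half-test correlation r:
    #   r_full = 2r / (1 + r)
    r = pearson_r(odd_half, even_half)
    return 2 * r / (1 + r)
```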

ACTIVITY NO. 2

TEST THE RELIABILITY OF A PERIODICAL TEST.

Pearson r Standard Scores (Directions)

1. Begin by writing the pairs of scores to be studied in two columns. Be sure that the pair of scores for each pupil is in the same row. Label one set of scores X, the other Y.
2. Get the sum (∑) of the scores in each column. Divide each sum by the number of scores (N) to get the mean.
3. Subtract the mean of X from each score in column X. Write the difference in column x, with its algebraic sign.

Pearson r Standard Scores (Directions)

4. Subtract the mean of Y from each score in column Y. Write the difference in column y; don't forget the sign.
5. Square each entry in column x. Enter each result under x².
6. Square each entry in column y. Enter each result under y².
7. Compute the standard deviations of X and Y and enter the results under SDx and SDy respectively.

Pearson r Standard Scores (Directions)

8. Divide each entry in columns x and y by SDx and SDy respectively, and enter the results under Zx and Zy.
9. Multiply Zx by Zy and enter each result under ZxZy.
10. Get the sum ∑ZxZy.
11. Apply the formula r = ∑ZxZy / N.
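The eleven steps above translate almost line for line into code. A sketch (using the population standard deviation, since the directions divide by N; the function name is illustrative):

```python
def pearson_r_standard_scores(X, Y):
    N = len(X)
    mean_x, mean_y = sum(X) / N, sum(Y) / N          # steps 1-2
    x = [xi - mean_x for xi in X]                    # step 3: column x
    y = [yi - mean_y for yi in Y]                    # step 4: column y
    SDx = (sum(d * d for d in x) / N) ** 0.5         # steps 5-7
    SDy = (sum(d * d for d in y) / N) ** 0.5
    Zx = [d / SDx for d in x]                        # step 8
    Zy = [d / SDy for d in y]
    products = [zx * zy for zx, zy in zip(Zx, Zy)]   # step 9
    return sum(products) / N                         # steps 10-11: r = sum(ZxZy) / N
```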

Interpretation of Coefficient of Correlation

Correlation is a measure of the relationship between two variables.

Direction of Relationship
A negative coefficient means that as one variable increases, the other decreases.
A positive coefficient means that as one variable increases, the other also increases.

Magnitude or Size of Relationship
0.8 and above: high correlation
0.5: moderate correlation
0.3 and below: low correlation

Interpretation of Coefficient of Variation

The coefficient of variation is the ratio of the standard deviation to the mean, usually expressed in percent.

c.v. = (s.d. / mean) x 100

Criteria:
less than 10%: homogeneous
greater than 10%: heterogeneous
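A sketch of the computation together with the 10% criterion (function names are illustrative):

```python
def coefficient_of_variation(scores):
    # c.v. = (s.d. / mean) x 100, using the population s.d.
    n = len(scores)
    mean = sum(scores) / n
    sd = (sum((s - mean) ** 2 for s in scores) / n) ** 0.5
    return sd / mean * 100

def group_label(cv):
    # Criteria from the slide: less than 10% homogeneous,
    # greater than 10% heterogeneous.
    return "Homogeneous" if cv < 10 else "Heterogeneous"

# Example: scores clustered near the mean give a small c.v.
cv = coefficient_of_variation([48, 50, 52, 50])
print(round(cv, 1), group_label(cv))
```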

REMEMBER:

1. Use item analysis procedures to check the quality of the test. Item analysis results should be interpreted with care and caution.
2. A test is valid when it measures what it is supposed to measure.
3. A test is reliable when it is consistent.
