test validity

Post on 01-Nov-2014


Item Analysis, Test Validity and Reliability

Prepared by: Rovel A. Aparicio, Mathematics Teacher

THERE IS ALWAYS A BETTER WAY

Stages in Test Construction

I. Planning the Test

A. Determining the Objectives
B. Preparing the Table of Specifications
C. Selecting the Appropriate Item Format
D. Writing the Test Items
E. Editing the Test Items

Stages in Test Construction

II. Trying Out the Test

A. Administering the Test
B. Item Analysis
C. Preparing the Final Form of the Test

Stages in Test Construction

III. Establishing Test Validity
IV. Establishing Test Reliability
V. Interpreting the Test Scores

DISCRIMINATION INDEX

refers to the degree to which success or failure on an item indicates possession of the achievement being measured.

DIFFICULTY INDEX

the percentage of pupils who got the item right; interpreted as how easy or how difficult an item is.

Item Analysis

GOAL: Improve the test.

IMPORTANCE: Measures the effectiveness of individual test items.

ACTIVITY NO. 1

COMPUTE THE DIFFICULTY INDEX AND DISCRIMINATION INDEX OF A PERIODICAL TEST.

U-L INDEX METHOD (STEPS)

1. Score the papers and rank them from highest to lowest according to the total score.
2. Separate the top 27% and the bottom 27% of the papers.
3. Tally the responses made to each test item by each individual in the upper 27% group.
4. Tally the responses made to each test item by each individual in the lower 27% group.

U-L INDEX METHOD (STEPS)

5. Compute the difficulty index: d = (U + L) / (nU + nL)
6. Compute the discrimination index: D = (U - L) / nU, or D = (U - L) / nL (with nU = nL)
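In code, steps 5 and 6 reduce to two one-line formulas. A minimal Python sketch (the function names are illustrative, not from the slides):

```python
def difficulty_index(u, l, n_upper, n_lower):
    # d = (U + L) / (nU + nL): proportion of pupils in the two
    # groups who answered the item correctly.
    return (u + l) / (n_upper + n_lower)

def discrimination_index(u, l, n_group):
    # D = (U - L) / nU (with nU = nL): how much better the upper
    # group did on the item than the lower group.
    return (u - l) / n_group

# With 60 pupils, each 27% group has about 16 members. Suppose 14
# upper-group and 12 lower-group pupils answer an item correctly:
print(difficulty_index(14, 12, 16, 16))   # proportion correct across both groups
print(discrimination_index(14, 12, 16))   # upper-minus-lower difference rate
```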

No. of pupils tested: 60 (upper and lower 27% groups: 16 pupils each)

Item no. | Upper 27% | Lower 27% | Difficulty Index | Discrimination Index | Remarks
1 | 14 | 12 | 0.81 | 0.13 | Revised
2 | 10 | 6 | 0.50 | 0.25 | Revised
3 | 11 | 7 | 0.56 | 0.25 | Revised
4 | 9 | 2 | 0.34 | 0.44 | Retained
5 | 5 | 13 | 0.56 | -0.50 | Rejected
6 | 13 | 7 | 0.63 | 0.38 | Revised
7 | 13 | 4 | 0.53 | 0.56 | Retained
8 | 3 | 10 | 0.41 | -0.44 | Rejected
9 | 13 | 12 | 0.78 | 0.06 | Rejected
10 | 8 | 6 | 0.44 | 0.13 | Revised

DISCRIMINATION INDEX

< .09 Poor items (Reject)

.10-.39 Reasonably Good (Revise)

.40-1.00 Very Good items (Retain)

DIFFICULTY INDEX

.00-.20 Very Difficult

.21-.80 Moderately Difficult

.81-1.00 Very Easy
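The two rating scales above can be applied mechanically. A small sketch using the slide's thresholds (the boundary between "reject" and "revise" is taken as .10; function names are illustrative):

```python
def discrimination_remark(D):
    # < .10: poor item (reject); .10-.39: reasonably good (revise);
    # .40-1.00: very good item (retain).
    if D < 0.10:
        return "Rejected"
    if D < 0.40:
        return "Revised"
    return "Retained"

def difficulty_label(d):
    # .00-.20: very difficult; .21-.80: moderately difficult;
    # .81-1.00: very easy.
    if d <= 0.20:
        return "Very Difficult"
    if d <= 0.80:
        return "Moderately Difficult"
    return "Very Easy"

print(discrimination_remark(0.44), difficulty_label(0.34))
```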


Types of Test Validity

Content Validity
Construct Validity
Criterion-related Validity

Establishing Test Validity

Types of Validity | Meaning | Procedure

1. Content Validity | How well the sample of test tasks represents the domain of tasks to be measured. | Compare the test tasks with the test specifications describing the task domain under consideration (non-statistical).

Establishing Test Validity

Types of Validity | Meaning | Procedure

2. Construct Validity | How test performance can be described psychologically. | Experimentally determine what factors influence scores on the test. The procedure may be logical and statistical, using correlations and other statistical methods.

Establishing Test Validity

Types of Validity | Meaning | Procedure

3. Criterion-related Validity | How well test performance predicts future performance or estimates current performance on some valued measure other than the test itself. | Compare test scores with a measure of performance (grades) obtained at a later date (for prediction), or with another measure of performance obtained concurrently (for estimating present status). Primarily statistical: correlate test results with an outside criterion.

Establishing Test Reliability

Types of Reliability Measures

Measure of Stability
Measure of Equivalence
Measure of Stability and Equivalence
Measure of Internal Consistency

Establishing Test Reliability

Types of Reliability Measures | Methods of Estimating Reliability | Procedure

1. Measure of Stability | Test-retest method | Give a test twice to the same group, with any time interval between tests from several minutes to several years. (Pearson r)

Establishing Test Reliability

Types of Reliability Measures | Methods of Estimating Reliability | Procedure

2. Measure of Equivalence | Equivalent-forms method | Give two forms of a test to the same group in close succession. (Pearson r)

Establishing Test Reliability

Types of Reliability Measures | Methods of Estimating Reliability | Procedure

3. Measure of Stability and Equivalence | Test-retest with equivalent forms | Give two forms of a test to the same group, with increased time intervals between forms. (Pearson r)

Establishing Test Reliability

Types of Reliability Measures | Methods of Estimating Reliability | Procedure

4. Measure of Internal Consistency | Kuder-Richardson method | Give a test once. Score the total test and apply the Kuder-Richardson formula.
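The slides do not reproduce the Kuder-Richardson formula itself; the sketch below assumes the widely used KR-20 variant for items scored right/wrong (0/1):

```python
def kr20(score_matrix):
    # KR-20 reliability: r = (k / (k - 1)) * (1 - sum(pq) / var_total),
    # where k is the number of items, p is the proportion of pupils
    # answering an item correctly, q = 1 - p, and var_total is the
    # variance of the pupils' total scores.
    # score_matrix: one list of 0/1 item scores per pupil.
    n = len(score_matrix)
    k = len(score_matrix[0])
    totals = [sum(row) for row in score_matrix]
    mean = sum(totals) / n
    var_total = sum((t - mean) ** 2 for t in totals) / n
    sum_pq = 0.0
    for item in range(k):
        p = sum(row[item] for row in score_matrix) / n
        sum_pq += p * (1 - p)
    return (k / (k - 1)) * (1 - sum_pq / var_total)
```

When every item agrees perfectly with every other item, the formula returns 1.0, its maximum.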

Establishing Test Reliability

Types of Reliability Measures | Methods of Estimating Reliability | Procedure

5. Measure of Internal Consistency | Split-half method | Give a test once. Score equivalent halves of the test (e.g., odd- and even-numbered items). (Pearson r and Spearman-Brown formula)
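A sketch of the split-half computation: correlate the two half-test scores with Pearson r, then step the result up to full test length with the Spearman-Brown formula (function names are illustrative):

```python
def pearson_r(xs, ys):
    # Product-moment correlation between two lists of scores.
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

def split_half_reliability(odd_half, even_half):
    # Spearman-Brown step-up: estimate the reliability of the
    # full-length test from the half-test correlation r:
    #   r_full = 2r / (1 + r)
    r = pearson_r(odd_half, even_half)
    return 2 * r / (1 + r)
```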

ACTIVITY NO. 2

TEST THE RELIABILITY OF A PERIODICAL TEST.

Pearson r Standard Scores (Directions)

1. Begin by writing the pairs of scores to be studied in two columns. Be sure that the pair of scores for each pupil is in the same row. Label one set of scores X, the other Y.
2. Get the sum (∑) of the scores in each column. Divide each sum by the number of scores (N) to get the mean.
3. Subtract the mean of X from each score in column X. Write the difference in column x, with its algebraic sign.

Pearson r Standard Scores (Directions)

4. Subtract the mean of Y from each score in column Y. Write the difference in column y; don't forget the sign.
5. Square each entry in column x. Enter each result under x².
6. Square each entry in column y. Enter each result under y².
7. Compute the standard deviations of X and Y and enter the results under SDx and SDy respectively.

Pearson r Standard Scores (Directions)

8. Divide each entry in columns x and y by SDx and SDy respectively, and enter the results under Zx and Zy.
9. Multiply Zx by Zy and enter each result under ZxZy.
10. Get the sum ∑ZxZy.
11. Apply the formula r = ∑ZxZy / N.
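The eleven steps above translate almost line for line into code. A sketch (using the population standard deviation, since the directions divide by N; the function name is illustrative):

```python
def pearson_r_standard_scores(X, Y):
    N = len(X)
    mean_x, mean_y = sum(X) / N, sum(Y) / N          # steps 1-2
    x = [xi - mean_x for xi in X]                    # step 3: column x
    y = [yi - mean_y for yi in Y]                    # step 4: column y
    SDx = (sum(d * d for d in x) / N) ** 0.5         # steps 5-7
    SDy = (sum(d * d for d in y) / N) ** 0.5
    Zx = [d / SDx for d in x]                        # step 8
    Zy = [d / SDy for d in y]
    products = [zx * zy for zx, zy in zip(Zx, Zy)]   # step 9
    return sum(products) / N                         # steps 10-11: r = sum(ZxZy) / N
```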

Interpretation of Coefficient of Correlation

Correlation is a measure of the relationship between two variables.

Direction of Relationship
A negative coefficient means that as one variable increases, the other decreases.
A positive coefficient means that as one variable increases, the other also increases.

Magnitude or Size of Relationship
0.8 and above: high correlation
0.5: moderate correlation
0.3 and below: low correlation

Interpretation of Coefficient of Variation

The coefficient of variation is the ratio of the standard deviation to the mean, usually expressed in percent.

c.v. = (s.d. / mean) x 100

Criteria:
less than 10%: homogeneous
greater than 10%: heterogeneous
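A sketch of the computation together with the 10% criterion (function names are illustrative):

```python
def coefficient_of_variation(scores):
    # c.v. = (s.d. / mean) x 100, using the population s.d.
    n = len(scores)
    mean = sum(scores) / n
    sd = (sum((s - mean) ** 2 for s in scores) / n) ** 0.5
    return sd / mean * 100

def group_label(cv):
    # Criteria from the slide: less than 10% homogeneous,
    # greater than 10% heterogeneous.
    return "Homogeneous" if cv < 10 else "Heterogeneous"

# Example: scores clustered near the mean give a small c.v.
cv = coefficient_of_variation([48, 50, 52, 50])
print(round(cv, 1), group_label(cv))
```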

REMEMBER:

1. Use item analysis procedures to check the quality of the test. Item analysis results should be interpreted with care and caution.
2. A test is valid when it measures what it is supposed to measure.
3. A test is reliable when it is consistent.
