How to Design Effective Multiple-Choice Tests that Assess Student Learning

March 22, 2010

Debra Dunlap Runshe
Instructional Development Specialist
University Information Technology Services - Learning Technologies
Indiana University – Purdue University Indianapolis


Webinar Objectives

By the end of this webinar, participants will be able to:
• describe strengths and limitations of multiple-choice tests.
• evaluate appropriate uses of multiple-choice tests.
• explain guidelines for constructing multiple-choice items.
• learn how to create questions that address the different levels of Bloom’s Taxonomy.
• review examples of effective and ineffective multiple-choice tests.
• write multiple-choice questions at different cognitive levels.


About Multiple-Choice Tests

Students select the correct answer from alternative responses.

Each item has:
• item stem
• correct or keyed option
• several distractor options

Format:
• complete question
• incomplete question

(Clegg & Cashin, 1986)

Multiple-choice Test Construction

“… the greater your experience in their construction, the longer it takes per [multiple-choice] item to construct a reasonably fair, accurate, and inclusive question.”

- Wilbert J. McKeachie

Bloom’s Cognitive Domain

Knowledge

Comprehension

Application

Analysis

Synthesis

Evaluation

A Resource for Question Verbs: http://tep.uoregon.edu/resources/assessment/multiplechoicequestions/blooms.html

Advantages

Multiple-choice items can provide:
• versatility in measuring all levels of cognitive ability,
• highly reliable test scores,
• scoring efficiency and accuracy,
• objective measurement of achievement or ability,
• a wide sampling of content or objectives,
• a reduced guessing factor compared with true-false items, and
• different response alternatives which can provide diagnostic feedback.

(Ory & Ryan, 1993)
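The reduced guessing factor can be made concrete with a little arithmetic. The sketch below is an illustration and not part of the webinar: it assumes every item has the same number of options and that a student guesses blindly and independently on each item. The function names and the 20-item test with a passing score of 12 are hypothetical.

```python
from math import comb


def expected_chance_score(num_items, num_options):
    """Expected number of correct answers from blind guessing."""
    return num_items / num_options


def prob_at_least(num_items, min_correct, num_options):
    """Probability of guessing at least `min_correct` items correctly.

    Binomial model: independent guesses, each option equally likely.
    """
    p = 1 / num_options
    return sum(
        comb(num_items, k) * p**k * (1 - p) ** (num_items - k)
        for k in range(min_correct, num_items + 1)
    )


# Hypothetical 20-item test where 12 correct answers is a passing score.
for label, options in [("true-false", 2), ("four-option", 4)]:
    print(
        label,
        expected_chance_score(20, options),
        round(prob_at_least(20, 12, options), 4),
    )
```

Under these assumptions the expected chance score drops from half of the items on a true-false test to a quarter of the items on a four-option test.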

Limitations

Multiple-choice items:
• are difficult and time-consuming to construct,
• lead an instructor to favor simple recall of facts,
• place a high degree of dependence on the student’s reading ability and instructor’s writing ability, and
• are particularly subject to clueing. (Students can often deduce the correct response by elimination.)

(Ory & Ryan, 1993)

When to Use

• To assess breadth of learning
• To test a variety of levels of learning
• When you have a large number of individuals taking the test
• When you have time to construct the test items
• When time is limited for scoring
• When it is not important to determine how well individuals can formulate their own answer
• When you want to prepare individuals for future assessments that use a similar format

(Clegg & Cashin, 1986)

Planning a Test

General Tips for Writing Tests

• Compose test items over time.

• Test what you really want individuals to learn.

• Check borrowed items carefully.

• Create a test bank.

• Start easy to build confidence.

• Get feedback on items.

(Nilson, 2010)

Planning a Test

• Use a test matrix or blueprint.

• Identify major ideas and skills rather than specific details.

• Use Bloom’s cognitive taxonomy or something appropriate for your context.

(Nilson, 2010)

Test Matrix

Additional Techniques for Writing Multiple-Choice Items: http://tep.uoregon.edu/resources/assessment/multiplechoicequestions/sometechniques.html
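The matrix shown on this slide did not survive in the transcript. As a rough illustration of the idea only, a test matrix (blueprint) can be kept as a table of planned item counts, with content areas as rows and cognitive levels as columns. In the sketch below the topics and counts are hypothetical and are not taken from the webinar; only the six level names come from Bloom's cognitive domain as listed earlier.

```python
BLOOM_LEVELS = [
    "Knowledge", "Comprehension", "Application",
    "Analysis", "Synthesis", "Evaluation",
]

# Hypothetical blueprint: topic -> {cognitive level: planned number of items}.
blueprint = {
    "Radiation dose basics": {"Knowledge": 3, "Comprehension": 2},
    "Comparing examinations": {"Comprehension": 2, "Application": 3},
    "Dose reduction factors": {"Application": 2, "Analysis": 3},
}


def items_per_level(matrix):
    """Column sums: total planned items at each cognitive level."""
    totals = {level: 0 for level in BLOOM_LEVELS}
    for counts in matrix.values():
        for level, n in counts.items():
            totals[level] += n
    return totals


def items_per_topic(matrix):
    """Row sums: total planned items for each topic."""
    return {topic: sum(counts.values()) for topic, counts in matrix.items()}


level_totals = items_per_level(blueprint)
topic_totals = items_per_topic(blueprint)
total_items = sum(topic_totals.values())

print(level_totals)
print(topic_totals)
print("total items:", total_items)
```

Row and column totals make it easy to see whether the planned test over-samples one topic or leans too heavily on recall-level items.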

Objectives at Different Levels

Level: Knowledge

Objective: State the average effective radiation dose from chest CT.

What is the average effective radiation dose from chest CT?

A. 1 mSv

B. 8 mSv

C. 16 mSv

D. 24 mSv

Objectives at Different Levels

Level: Comprehension and application

Objective: Compare the radiation exposures from different radiologic examinations.

Which of the following imaging examinations is associated with the highest effective radiation dose?

A. Abdominal and pelvic multidetector CT

B. Coronary artery multidetector CT

C. Conventional pulmonary angiography

D. Digital pulmonary angiography

Objectives at Different Levels

Level: Problem solving

Objective: Explain the effects that various factors have on radiation dose from chest CT.

Which of the following actions would decrease the radiation dose from chest CT the least?

A. Decreasing mA from 250 to 125

B. Decreasing kVp from 140 to 120

C. Decreasing the pitch from 2 to 1

D. Decreasing scan time from 1 to 0.5

Constructing Test Items

Writing Items

• Write items on significant concepts, not trivial facts.

• Write items that have a definite answer.

• Communicate clearly.

• Don’t give away the answer by including irrelevant cues in the item.

• Don’t write items that require skills or knowledge irrelevant to what you are trying to measure.

• Have items reviewed by knowledgeable persons other than the composer of the question if possible.

(Clegg & Cashin, 1986)

Components

Stem: presents the problem

Correct or keyed option: the correct (or best) answer

Distractor options: the incorrect alternatives

(Clegg & Cashin, 1986)

Developing an Item

1. Choose an important concept

2. Write the stem

3. Write the correct answer (key)

4. Develop distractors
• common misconceptions
• errors that could be made
• plausible, yet less important information
• similar in style, length to the key
• every distractor should be reasonable

(Clegg & Cashin, 1986)

Issues Related to Testwiseness

• grammatical cues

• logical cues

• absolute terms

• long correct answer

• word repeats

• convergence strategy

(Clegg & Cashin, 1986)

Issues Related to Irrelevant Difficulty

• options long

• numeric data not stated consistently

• vague terms

• language not parallel

• options in no logical order

• “none of the above” is used

• stems tricky or unnecessarily complicated

• answer to an item is “hinged” to the answer of a related item

(Clegg & Cashin, 1986)

Writing Stems

• Ensure that the directions in the stem are very clear.
• Include the central idea in the stem instead of the choices.
• Avoid window dressing (excessive verbiage).
• Word the stem positively; avoid negatives such as NOT or EXCEPT. If a negative word is used, use it cautiously and always ensure that it appears capitalized and boldface.

(Haladyna, Downing & Rodriguez, 2002)

Writing Stems

Avoid statements that fail to present a complete thought or question.

Schizophrenia

A. is caused by excessive role playing in childhood.

B. causes hallucinations.

C. is a tendency toward ritualistic behavior.

D. is a psychosocial disorder.

Better:

Schizophrenia

A. an alternation between two or more personalities.

B. a tendency toward ritualistic behavior.

C. a fragmentation of psychological functioning.

D. an inability to inhibit emotional outbursts.

(Ory & Ryan, 1993)

Writing Stems

Avoid stems that ask for a series of multiple true-false responses.

Which of the following is true about the middle adult years?

A. It encompasses ages 19 to 30.

B. It is the most conflict-free period of life.

C. It is characterized by dramatic changes in our sense of values.

D. It is marked by a conflict between intimacy and isolation.

Better:

According to Erikson, the middle adult years are characterized by the conflict between ___ and ___ .

A. intimacy; isolation

B. generativity; stagnation

C. integrity; despair

D. industry; despondency

(Ory & Ryan, 1993)

Writing Stems

Eliminate excessive wording and irrelevant information.

Sheldon developed a highly controversial theory of personality based on body type and temperament of the individual. Which of the following is a criticism of Sheldon’s theory?

A. He was influenced too much by Freudian psychoanalysis.

B. His ratings of physique and temperament were not independent.

C. He failed to use an empirical approach.

D. His research sample was improperly selected.

Better:

Which of the following is a criticism of Sheldon's theory of personality?

(Ory & Ryan, 1993)

Writing Stems

Include in the stem any word(s) that might otherwise be repeated in each alternative.

The receptors for the vestibular sense are located

A. in the fovea.

B. in the brain.

C. in the middle ear.

D. in the inner ear.

Better:

The receptors for the vestibular sense are located in the

(Ory & Ryan, 1993)

Writing Stems

Use negatively stated stems sparingly. When used, underline and/or capitalize the negative word.

Which is not a major technique for studying brain function?

A. accident and injury

B. cutting and removing

C. electrical stimulation

D. direct phrenology

Better:

Which is NOT a major technique for studying brain function?

(Ory & Ryan, 1993)

Writing Stems

When using incomplete statements, avoid beginning with the blank space.

___ is the least severe form of behavior disorder.

A. Psychosis

B. Panic disorder

C. Neurasthenia

D. Neurosis

Better:

The least severe form of behavior disorder is ___ .

(Ory & Ryan, 1993)

Writing Stems

Use familiar language.

According to Freud the raison d’être for hysteria was

A. sexual conflicts.

B. unresolved feelings of guilt.

C. latent tendencies.

D. repressed fear.

Better:

According to Freud hysteria was caused by …

(Ory & Ryan, 1993)

Writing Stems

Provide sufficient information in the stem to allow students to respond to the question.

How many interrelated stages to creative problem solving are there?

A. Three

B. Four

C. Seven

D. Ten

Better:

The textbook indicates that there are ___ interrelated stages to creative problem solving.

(Ory & Ryan, 1993)

Writing Item Alternatives

• Develop as many effective choices as you can, but research suggests three is adequate.
• Make sure that only one of these choices is the right answer.
• Vary the location of the right answer according to the number of choices.
• Place choices in logical or numerical order.
• Keep choices independent; choices should not be overlapping.

(Haladyna, Downing & Rodriguez, 2002)

Writing Item Alternatives

• Keep choices homogeneous in content and grammatical structure.
• Keep the length of the choices about equal.
• None-of-the-above should be used carefully.
• Avoid All-of-the-above.
• Make all distractors plausible.
• Use typical errors of students to write your distractors.
• Use humor if it is compatible with the teacher and the learning environment.

(Haladyna, Downing & Rodriguez, 2002)

Writing Item Alternatives

• Phrase choices positively; avoid negatives such as NOT.
• Avoid giving clues to the right answer, such as:
  o specific determiners including always, never, completely, and absolutely.
  o clang associations, choices identical to or resembling words in the stem.
  o conspicuous correct choice.
  o pairs or triplets of options that clue the test-taker to the correct choice.
  o blatantly absurd, ridiculous options.

(Haladyna, Downing & Rodriguez, 2002)
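Some of these clues are mechanical enough to screen for automatically before a colleague reviews the items. The following is a minimal sketch, not a tool from the webinar or from Haladyna et al.: the function name `find_cues`, the 1.5 length threshold, and the first-letter test for "a"/"an" agreement are all assumptions made for illustration. It checks only three clues (specific determiners, a conspicuously long key, and an article at the end of the stem that fits only the key).

```python
import re

# Specific determiners named in the list above.
ABSOLUTE_TERMS = {"always", "never", "completely", "absolutely"}


def find_cues(stem, options, key_index, length_ratio=1.5):
    """Return warnings about possible clues to the keyed option.

    `length_ratio` is an arbitrary threshold chosen for this sketch.
    """
    warnings = []

    # 1. Specific determiners in any option.
    for i, option in enumerate(options):
        hits = ABSOLUTE_TERMS & set(re.findall(r"[a-z]+", option.lower()))
        if hits:
            warnings.append(f"option {i}: specific determiner {sorted(hits)}")

    # 2. Conspicuous key: noticeably longer than the average distractor.
    distractor_lengths = [len(o) for i, o in enumerate(options) if i != key_index]
    if distractor_lengths:
        mean_len = sum(distractor_lengths) / len(distractor_lengths)
        if len(options[key_index]) > length_ratio * mean_len:
            warnings.append("key is much longer than the distractors")

    # 3. Grammatical cue: the stem ends in "a" or "an" and only the key fits.
    #    The first-letter vowel test is a rough heuristic ("hour" would fail).
    tokens = stem.split()
    article = tokens[-1].lower() if tokens else ""
    if article in ("a", "an"):
        fits = [
            i for i, o in enumerate(options)
            if o and (o[0].lower() in "aeiou") == (article == "an")
        ]
        if fits == [key_index]:
            warnings.append(f'only the key fits the article "{article}"')

    return warnings


# An item critiqued later in this webinar; only option A (index 0) matches "a".
stem = (
    "The indicator found by correlating students' scores on a classroom "
    "math test with their scores on a standardized math test is called a"
)
options = [
    "validity coefficient.",
    "index of reliability.",
    "equivalence coefficient.",
    "internal consistency coefficient.",
]
print(find_cues(stem, options, key_index=0))
```

A screen like this can only flag candidates; clues such as clang associations or absurd options still need a human reader.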

Writing Item Alternatives

Make sure there is one correct or best response.

Which of the following does not belong with the others?

A. Wundt

B. Structuralism

C. James

D. Titchener

(Ory & Ryan, 1993)

Writing Item Alternatives

Make all alternatives plausible and equally attractive to both less-knowledgeable and skillful students.

The number of photoreceptors in the retina of each human eye is about

A. 1000,000.

B. 2 million.

C. 115 million.

D. 2.37 billion.

Better:

A. 5 million.

B. 35 million.

C. 65 million.

D. 115 million.

(Ory & Ryan, 1993)

Writing Item Alternatives

Minimize the use of the all-of-the-above and none-of-the-above alternatives.

Problem representation involves

A. determining which factors matter and which do not.

B. the initial state of problem solving.

C. both a and b.

D. neither a nor b.

Better:

A. determining which factors matter and which do not.

B. the initial state of problem solving.

C. reducing the problem to manageable segments.

D. all of the above.

(Ory & Ryan, 1993)

Writing Item Alternatives

Use between three and five alternatives for each item.

What function is performed by the sensory neurons?

A. Receive information from the environment.

B. Carry information from the central nervous system to the muscles.

C. Connect one neuron to another.

D. Are only found inside the brain.

Better:

A. Receive information from the environment.

B. Carry information from the central nervous system to the muscles.

C. Connect one neuron to another.

(Ory & Ryan, 1993)

Writing Item Alternatives

All alternatives should be approximately equal in length.

Latané and Darley’s smoke-filled room experiment suggested that people are less likely to help in groups than alone, because people

A. in groups talk to one another.

B. who are alone are more attentive.

C. in groups do not display pluralistic ignorance.

D. in groups allow others to define the situation as a non-emergency.

Better:

Latané and Darley’s smoke-filled room experiment suggested that people are less likely to help in groups than alone, because people

A. talk to one another.

B. are less attentive than people who are alone.

C. do not display pluralistic ignorance.

D. allow others to define the situation as a non-emergency.

(Ory & Ryan, 1993)

Writing Item Alternatives

Make alternatives parallel in construction and consistent with the stem.

Which of the following is NOT a defense mechanism?

A. Conflict.

B. Repression.

C. Reaction formation.

D. Rationalization.

Better:

A. Rationalization.

B. Repression.

C. Reaction formation.

D. Regression.

(Ory & Ryan, 1993)

Writing Item Alternatives

When possible, present alternatives in some logical order (e.g., most to least, chronological).

In the course of dark adaptation, the eye’s best sensitivity to wavelength shifts to

A. 580 millimicrons.

B. 477 millimicrons.

C. 505 millimicrons.

D. 600 millimicrons.

Better:

A. 600 millimicrons.

B. 580 millimicrons.

C. 505 millimicrons.

D. 477 millimicrons.

(Ory & Ryan, 1993)

Writing Item Alternatives

Make the alternatives mutually exclusive.

Rods are found in the

A. blind spot.

B. fovea.

C. periphery of the retina.

D. back of the eye.

Better:

A. blind spot.

B. periphery of the fovea.

C. periphery of the retina.

D. cornea.

(Ory & Ryan, 1993)

Writing Item Alternatives

Avoid overly wordy alternatives that become confusing and difficult to read.

Flooding differs from systematic desensitization in that

A. the former is based on classical conditioning and the latter on operant conditioning.

B. systematic desensitization requires insight and the flooding does not.

C. flooding has you start at the top of your fear hierarchy and systematic desensitization has you start at the bottom and work up gradually.

D. flooding emphasizes the use of cognitions to a much greater extent than does systematic desensitization.

Better:

Flooding differs from systematic desensitization in that flooding

A. is based on classical conditioning rather than operant conditioning.

B. doesn’t require insight.

C. starts at the top of the fear hierarchy.

D. places greater emphasis on the use of cognitions.

(Ory & Ryan, 1993)

Writing Item Alternatives

Avoid irrelevant cues such as grammatical structure, well-known word associations, or connections between the stem and the correct answer.

School psychologists who examine and place children in special education settings often apply the research done by

A. biopsychologists.

B. educational psychologists.

C. clinical psychologists.

D. counseling psychologists.

Better:

School psychologists often apply the research done by

(Ory & Ryan, 1993)

Writing Item Alternatives

Avoid language that may offend or exclude a particular group of individuals.

Which of the following is a characteristic of persons with Down’s syndrome?

A. Larger than normal head

B. Obesity

C. Oriental-like skin folds over the eyes

D. Above average height.

Better:

A. Larger than normal head

B. Obesity

C. Downward sloping skin fold over the eyes

D. Above average height.

(Ory & Ryan, 1993)


Critiquing Test Items

Twenty Thousand Leagues Under the Sea is considered to be:

A. an adventure story.

B. a science-fiction story.

C. an historical novel.

D. an autobiography.

Could be either A or B; should have one best answer.

 

Critiquing Test Items

When a court possesses appellate jurisdiction this means that it

A. must have a jury.

B. has the power or authority to review and decide appeals.

C. can conduct the original trial.

D. can declare laws unconstitutional.

The term “appeal” in B is too close to “appellate” in the stem.

Critiquing Test Items

Which of the following men invented the telephone?

A. Bell

B. Morse

C. Pasteur

D. Salk

C & D are not plausible distractors and the answer (A) is too obvious.

Critiquing Test Items

The indicator found by correlating students’ scores on a classroom math test with their scores on a standardized math test is called a

A. validity coefficient.

B. index of reliability.

C. equivalence coefficient.

D. internal consistency coefficient.

The end of the stem is “a” which only matches answer (A).

Critiquing Test Items

In order to determine the criterion-related validity of a test, one would

A. correlate the test scores with an appropriate criterion.

B. correlate the scores from the odd and even items.

C. correlate the scores from forms a & b of the test.

D. correlate the scores from two administrations of the same test.

“Correlate the” should be included in the stem. Also both (A) and the stem have the same word, “criterion.”

Critiquing Test Items

The state that is not south of the Mason-Dixon line is

A. Mississippi.

B. Florida.

C. Kentucky.

D. Vermont.

“Not south” could trip up students and should be replaced by “north” OR the negative should be underlined or highlighted (e.g. “NOT South”). Again, answer (D) is too easy.

Critiquing Test Items

Which one of the following is the best source of heat for home use?

A. Gas

B. Electricity

C. Oil

D. Geo-thermal

“Best” is too vague. Why not use “cheapest,” “most efficient,” etc.? The answer is also geographically dependent.

 

Critiquing Test Items

Important early theorists in the psychology of learning included

A. Ebbinghaus.

B. Thorndike.

C. Pavlov.

D. None of the above.

E. All of the above.

The stem says “theorists” so there must be more than one. (E) is the right answer. Another problem is the answer tends to be “all of the above” in this type of question. If the student can see 2 that are correct, it must be “all of the above.”

 

Critiquing Test Items

In a normal distribution, the mean and the median are

A. always the same point.

B. never the same point.

C. usually very close to one another.

(A) and (B) are absolutes, which are usually incorrect. (C) is also longer.

 

Item Analysis

• Review items for accuracy and formatting

• Have a colleague read and give feedback

• Item difficulty (percentage of students who answered each item correctly)

• Item discrimination
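Both statistics can be computed from a table of scored responses. The sketch below is illustrative only. Item difficulty follows the definition on the slide (the share of students answering correctly). The slide does not define item discrimination, so the code assumes one common convention: the upper-lower index, which is the difference in proportion correct between the highest-scoring and lowest-scoring 27% of students. The function names and response data are hypothetical.

```python
def item_difficulty(responses, item):
    """Proportion of students who answered `item` correctly (0.0 to 1.0)."""
    return sum(row[item] for row in responses) / len(responses)


def item_discrimination(responses, item, fraction=0.27):
    """Upper-lower index: p(correct | top group) - p(correct | bottom group).

    Students are ranked by total score; `fraction` sets the group size
    (27% is a common convention, assumed here).
    """
    ranked = sorted(responses, key=sum, reverse=True)  # highest total first
    n = max(1, round(len(ranked) * fraction))
    upper, lower = ranked[:n], ranked[-n:]
    p_upper = sum(row[item] for row in upper) / n
    p_lower = sum(row[item] for row in lower) / n
    return p_upper - p_lower


# Hypothetical scored responses: one row per student, 1 = correct, 0 = incorrect.
responses = [
    [1, 1, 1],
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 1],
    [1, 1, 0],
    [1, 0, 0],
    [1, 0, 0],
    [0, 0, 1],
    [1, 0, 0],
    [0, 0, 0],
]

for item in range(3):
    print(
        item,
        item_difficulty(responses, item),
        round(item_discrimination(responses, item), 2),
    )
```

An index near zero or below zero suggests that strong and weak students do about equally well on the item, which is a signal to review its wording or key.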

Summary

• Multiple-choice tests can be useful measures of learning.

• Write questions to assess the cognitive level of interest.

• Follow guidelines for writing effective multiple-choice questions.

• Review student performance on items and revise exams as needed.

Questions?

Thank You for Your Participation!

Debra Dunlap Runshe, Instructional Development Specialist
University Information Technology Services – Learning Technologies
Indiana University-Purdue University Indianapolis
Information Technology and Communications Complex (IT 342H)
535 West Michigan Street, Indianapolis, IN 46202

Phone: 317-278-0589  Email: [email protected]

Resources

Clegg, V. L., & Cashin, W. E. (1986). Improving multiple-choice tests. Idea Paper No. 16, Center for Faculty Evaluation and Development, Kansas State University. http://www.idea.ksu.edu/papers/Idea_Paper_16.pdf

Davis, B. G. (2009). Tools for teaching (2nd ed.). San Francisco, CA: Jossey-Bass.

Haladyna, T. M., Downing, S. M., & Rodriguez, M. C. (2002). A review of multiple-choice item-writing guidelines for classroom assessment. Applied Measurement in Education, 15(3), 309-334.

Nilson, L. B. (2010). Teaching at its best: A research-based resource for college instructors (3rd ed.). San Francisco, CA: Jossey-Bass.

Ory, J. C., & Ryan, K. E. (1993). Tips for improving testing and grading. Vol. 4. Newbury Park: Sage Publications.

Svinicki, M., & McKeachie, W. J. (2011). McKeachie's teaching tips: Strategies, research, and theory for college and university teachers. Belmont, CA: Wadsworth, Cengage Learning.

University of Oregon, Teaching Effectiveness Program. Writing Multiple Choice Questions that Demand Critical Thinking. Web site: http://tep.uoregon.edu/resources/assessment/multiplechoicequestions/mc4critthink.html

University of Minnesota, Office of Measurement Services. Writing Multiple Choice Items. Web site: http://oms.umn.edu/fce/how_to_write/multiplechoice.php

University of Texas at Austin, Instructional Assessment Resources. Writing Multiple Choice Items. Web site: http://www.utexas.edu/academic/ctl/assessment/iar/students/plan/method/exams-mchoice-write.php