Why Is There a Need to Study Test Construction?


Upload: michael-p-celis

Post on 19-Jul-2016


Why is there a need to study test construction? 

For 13% of students who got low grades in exams, the cause was faulty test questions.

(WORLDWATCH, The Philadelphia Trumpet)

It is estimated that 90% of all test questions asked are “low level”: knowledge and comprehension.

(Wilen, W.W.)

“Low level” doesn’t mean easy:

Write an essay explaining the decline and fall of the Roman Empire, incorporating at least five of the seven causes discussed in class from the writings of Gibbon and Toynbee.

“High level” doesn’t mean hard:

Which movie did you like more, Maleficent or She’s Dating the Gangster? Why?

Old belief about teaching:

Education is a process in which the teacher’s notes or the contents of the books are transferred to the students’ notebooks without being understood.

What are tests for?

Inform learners and teachers of the strengths and weaknesses of the process

Motivate learners to review or consolidate specific material

Guide the planning/development of the ongoing teaching process

Create a sense of accomplishment

Determine if the objectives have been achieved

Encourage improvement

BARRIERS IN TEST CONSTRUCTION

Ms. Alanganin – confusing statements

Mr. Highfalutin – difficult vocabulary

Ms. Madaldal – excessive wordiness

Ms. Magulo – complex sentence structure

Ms. Malabo – unclear instructions

Mr. Pulpol – unclear illustrative materials

Ms. Foringer – linguistically bound words

Ms. Colonial Mentality – culturally bound words

“To be able to prepare a good test, one has to have a mastery of the subject matter, knowledge of the pupils to be tested, skill in verbal expression and the use of the different test format”

Evaluating Educational Outcomes (Oriondo & Antonio)

The test objectives guide the kind of objective tests that will be designed and constructed by the teacher.

PREPARING A TABLE OF SPECIFICATIONS

A Table of Specifications (TOS) is a test map that guides the teacher in constructing the test. It ensures that there is a balance between items that test lower-level thinking skills and items that test higher-level thinking skills.

Tips in Preparing the Table of Specifications (TOS)

Don’t make it overly detailed. It's best to identify major ideas and skills rather than specific details.

Use a cognitive taxonomy that is most appropriate to your discipline, including nonspecific skills like communication skills or graphic skills or computational skills if such are important to your evaluation of the answer.

CONSTRUCTING THE TEST ITEMS

The actual construction of the test follows the TOS. As a general rule, it is advised that the actual number of items to be constructed in the draft should be double the desired number of test items.
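As a rough illustration of how a TOS can drive item counts, the Python sketch below splits a desired number of items across topics and thinking levels, then doubles each cell for the draft, following the general rule above. The topic names, hour weights, level shares, and function names (allocate_items, build_tos, draft_counts) are all hypothetical. Weighting topics by teaching hours is only one possible assumption, not something the source prescribes.

```python
from math import floor


def allocate_items(weights, total_items):
    """Split total_items across keys in proportion to their weights.

    Uses the largest-remainder method so the counts always add up
    to total_items even when the proportional shares are fractional.
    """
    total_weight = sum(weights.values())
    exact = {key: total_items * w / total_weight for key, w in weights.items()}
    counts = {key: floor(share) for key, share in exact.items()}
    leftover = total_items - sum(counts.values())
    # Give the remaining items to the keys with the largest fractional parts.
    by_remainder = sorted(exact, key=lambda k: exact[k] - counts[k], reverse=True)
    for key in by_remainder[:leftover]:
        counts[key] += 1
    return counts


def build_tos(topic_weights, level_weights, total_items):
    """Return {topic: {level: item_count}} for the final test."""
    per_topic = allocate_items(topic_weights, total_items)
    return {topic: allocate_items(level_weights, n) for topic, n in per_topic.items()}


def draft_counts(tos):
    """Double every cell: the draft holds twice the desired number of items."""
    return {topic: {lvl: 2 * n for lvl, n in row.items()} for topic, row in tos.items()}


# Hypothetical inputs: teaching hours per topic and relative shares per level.
topic_hours = {"Fractions": 6, "Decimals": 3, "Percent": 3}
level_shares = {"knowledge": 3, "comprehension": 3, "application": 2, "analysis": 2}

tos = build_tos(topic_hours, level_shares, total_items=40)
for topic, row in draft_counts(tos).items():
    print(topic, row)
```

Keeping the TOS as a small topic-by-level grid makes it easy to see at a glance whether the lower-level cells are crowding out the higher-level ones before any item is written.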

ITEM ANALYSIS AND TRY-OUT

The test draft is tried out on a group of pupils/students. The purpose of this try-out is to determine the:

1. item characteristics through item analysis;

2. characteristics of the test itself – validity, reliability and practicality. 
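The source does not say which item statistics to compute. A minimal sketch, assuming two commonly used ones: a difficulty index (the proportion of students who answered the item correctly) and a discrimination index (the difference in correct answers between the top-scoring and bottom-scoring groups, divided by the group size). The function name item_analysis, the 27% group fraction, and the sample scores are assumptions for illustration only.

```python
def item_analysis(responses, group_fraction=0.27):
    """Compute per-item difficulty and discrimination indices.

    responses: one list per student, holding 1 (correct) or 0 (wrong)
    for each item. group_fraction is the share of students placed in
    the upper and lower groups (0.27 is a conventional choice, assumed here).
    """
    n_students = len(responses)
    n_items = len(responses[0])
    # Rank students by total score, highest first.
    ranked = sorted(responses, key=sum, reverse=True)
    k = max(1, round(n_students * group_fraction))
    upper, lower = ranked[:k], ranked[-k:]

    results = []
    for i in range(n_items):
        difficulty = sum(r[i] for r in responses) / n_students
        discrimination = (sum(r[i] for r in upper) - sum(r[i] for r in lower)) / k
        results.append(
            {"item": i + 1, "difficulty": difficulty, "discrimination": discrimination}
        )
    return results


# Hypothetical try-out data: 10 students, 3 items.
scores = [
    [1, 1, 1], [1, 1, 1], [1, 1, 0], [1, 1, 0], [1, 0, 1],
    [1, 0, 0], [1, 0, 0], [1, 0, 0], [0, 0, 1], [0, 0, 0],
]
for row in item_analysis(scores):
    print(row)
```

Under these assumptions, an item that nearly everyone answers correctly, or one that the lower group answers as well as the upper group, is a candidate for revision or removal from the draft.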

General Rules in Writing Test Questions

Number test questions continuously.

Keep the test questions in each test group uniform.

Make your layout presentable.

Do not put too many test questions in one test group.

T or F: 10 – 15 questions

Multiple Choice: max. of 30 questions

Matching type: 5 questions per test group

Others: 5 – 10 questions

Some additional guidelines to consider when writing items are described below:

1. Avoid humorous items. Classroom testing is very important, and humorous items may cause students not to take the exam seriously or to become confused or anxious.

2. Items should measure students’ knowledge of the item context, not their level of interest.

3. Write items to measure what students know, not what they do not know. (Cohen & Wallack)

CONSTRUCTING A TRUE-FALSE TEST

Binomial-choice tests are tests that have only two (2) options, such as True or False, or Right or Wrong. A student who knows nothing of the content of the examination would have a 50% chance of getting the correct answer by sheer guesswork.

It is best that the teacher ensures that a true-false item is able to discriminate properly between those who know the answer and those who are just guessing. A modified true-false test can offset the effect of guessing by requiring students to explain their answers and by disregarding a correct answer if the explanation is incorrect.
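To make the effect of guessing concrete, the sketch below computes the chance of reaching a given score on a true-false test by guessing alone, and scores a modified true-false test in which an answer earns credit only when its explanation is also correct. The function names (prob_at_least, modified_tf_score) and the sample inputs are hypothetical, and the calculation assumes each guess is independent with a 50% chance of being right.

```python
from math import comb


def prob_at_least(n_items, min_correct, p_guess=0.5):
    """Probability of getting at least min_correct of n_items right
    when every answer is an independent guess (binomial model)."""
    return sum(
        comb(n_items, k) * p_guess**k * (1 - p_guess) ** (n_items - k)
        for k in range(min_correct, n_items + 1)
    )


def modified_tf_score(answers):
    """Score a modified true-false test.

    answers: list of (answer_correct, explanation_correct) boolean pairs.
    A correct answer is disregarded when its explanation is incorrect.
    """
    return sum(1 for answer_ok, explanation_ok in answers if answer_ok and explanation_ok)


# Chance of scoring 6 or more out of 10 by guessing alone.
print(prob_at_least(10, 6))

# Hypothetical student: three correct answers, only two with valid explanations.
print(modified_tf_score([(True, True), (True, False), (True, True), (False, False)]))
```

On a 10-item test, a pure guesser reaches 6 or more correct with probability 386/1024, or about 38%, which is why short true-false tests discriminate poorly unless something like the explanation requirement is added.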

RULE # 1 – DO NOT GIVE A HINT IN THE BODY OF THE QUESTION.

Example:

The Philippines gained its independence in 1898 and therefore celebrated its centennial year in 2000.

RULE # 2 – AVOID USING THE WORDS “ALWAYS”, “NEVER”, “OFTEN”, AND OTHER ADVERBS THAT TEND TO BE EITHER ALWAYS TRUE OR ALWAYS FALSE.

Example:

Christmas always falls on a Sunday because it is a Sabbath Day.

RULE # 3 – AVOID LONG SENTENCES AS THESE TEND TO BE TRUE. KEEP SENTENCES SHORT.

Example:

Tests need to be valid, reliable and useful, although, it would require a great amount of time and effort to ensure that tests possess these characteristics.

Notice that the statement is true. However, we cannot be sure which part of the sentence the student deems true. The following example illustrates what can go wrong in long sentences.

Example:

Tests need to be valid, reliable and useful since it takes very little amount of time, money and effort to construct tests with these characteristics.

The first part of the sentence is true, but the second part is debatable and may, in fact, be false. Thus, a “TRUE” response can be defended as correct, and so can a “FALSE” response.

RULE # 4 – AVOID TRICK STATEMENTS WITH SOME MINOR MISLEADING WORD OR SPELLING ANOMALY, MISPLACED PHRASES, ETC. A WISE STUDENT WHO DOES NOT KNOW THE SUBJECT MATTER MAY DETECT THIS STRATEGY AND THUS GET THE ANSWER CORRECT.

RULE # 5 – Avoid quoting verbatim from reference materials or textbooks. This practice sends the wrong signal to the students that it is necessary to memorize the textbook word for word and thus, acquisition of higher level thinking skills is not given due importance.

Example:

According to Webster, Biology is a natural science concerned with the study of life and living organisms, including their structure, function, growth, evolution, distribution, and taxonomy.

RULE # 6 – AVOID SPECIFIC DETERMINERS OR GIVE-AWAY QUALIFIERS. STUDENTS QUICKLY LEARN THAT STRONGLY WORDED STATEMENTS, FOR EXAMPLE STATEMENTS WITH NEVER, NO, ALL OR ALWAYS, ARE MORE LIKELY TO BE FALSE THAN TRUE, WHILE MODERATELY WORDED STATEMENTS ARE MORE LIKELY TO BE TRUE THAN FALSE. STATEMENTS WITH MANY, OFTEN, SOMETIMES, GENERALLY, FREQUENTLY, OR SOME SHOULD ALSO BE AVOIDED.

Examples:

Thailand is called “the land of the free” because it is the only country in Southeast Asia that was never colonized.

Generally, a tsunami follows after the earthquake.

RULE # 7 – WITH TRUE-FALSE QUESTIONS, AVOID A GROSSLY DISPROPORTIONATE NUMBER OF EITHER TRUE OR FALSE STATEMENTS OR EVEN PATTERNS IN THE OCCURRENCE OF TRUE OR FALSE STATEMENTS.

2. DO NOT USE MODIFIERS THAT ARE VAGUE AND WHOSE MEANINGS CAN DIFFER FROM ONE PERSON TO THE NEXT SUCH AS: MUCH, OFTEN, USUALLY, ETC.

3. AVOID COMPLEX OR AWKWARD WORD ARRANGEMENTS. ALSO, AVOID USE OF NEGATIVES IN THE STEM AS THIS MAY ADD UNNECESSARY COMPREHENSION DIFFICULTIES.

4. DO NOT USE NEGATIVES OR DOUBLE NEGATIVES AS SUCH STATEMENTS TEND TO BE CONFUSING. IT IS BEST TO USE SIMPLER SENTENCES RATHER THAN SENTENCES THAT WOULD REQUIRE EXPERTISE IN GRAMMATICAL CONSTRUCTION.

5. EACH ITEM STEM SHOULD BE AS SHORT AS POSSIBLE, OTHERWISE YOU RISK TESTING MORE FOR READING AND COMPREHENSION SKILLS.

6. DISTRACTERS SHOULD BE EQUALLY PLAUSIBLE AND ATTRACTIVE.

7. ALL MULTIPLE-CHOICE OPTIONS SHOULD BE GRAMMATICALLY CONSISTENT WITH THE STEM.

8. THE LENGTH, EXPLICITNESS, OR DEGREE OF TECHNICALITY OF ALTERNATIVES SHOULD NOT BE THE DETERMINANTS OF THE CORRECTNESS OF THE ANSWER.

9. AVOID STEMS THAT REVEAL THE ANSWER TO ANOTHER ITEM.

10. AVOID ALTERNATIVES THAT ARE SYNONYMOUS WITH OTHERS OR THOSE THAT INCLUDE OR OVERLAP OTHERS.

11. AVOID PRESENTING SEQUENCED ITEMS IN THE SAME ORDER AS IN THE TEXT.

12. AVOID USE OF ASSUMED QUALIFIERS THAT MANY EXAMINEES MAY NOT BE AWARE OF.

13. AVOID USE OF UNNECESSARY WORDS OR PHRASES, WHICH ARE NOT RELEVANT TO THE PROBLEM AT HAND (unless such discriminating ability is the primary intent of the evaluation). THE ITEM’S VALUE IS PARTICULARLY DAMAGED IF THE UNNECESSARY MATERIAL IS DESIGNED TO DISTRACT OR MISLEAD. SUCH ITEMS TEST THE STUDENT’S READING COMPREHENSION RATHER THAN KNOWLEDGE OF THE SUBJECT MATTER.

14) Avoid use of non-relevant sources of difficulty such as requiring a complex calculation when only knowledge of a principle is being tested.

Note in the previous example, knowledge of the sine of the 30-degree angle would have led some students to use the sine formula for calculation even if a simpler approach would have sufficed.

15) Avoid extreme specificity requirements in responses.

16) Include as much of the item as possible in the stem. This allows less repetition and shorter choice options.

17) Use the “None of the above” option only when the keyed answer is totally correct. When choice of the “best” response is intended, “none of the above” is not appropriate, since the implication has already been made that the correct response may be partially inaccurate.

19) Note that use of “all of the above” may allow credit for partial knowledge. In a multiple-option item that allows only one option choice, a student who knew only that two (2) options were correct could deduce the correctness of “all of the above”.

21) Having compound response choices may purposefully increase the difficulty of an item.

23) The difficulty of a multiple-choice item may be controlled by varying the homogeneity or degree of similarity of responses.

The more homogeneous, the more difficult the item.

Example:

(Less homogeneous) Thailand is located in:

a. Southeast Asia
b. Eastern Europe
c. South America
d. East Africa
e. Central America

(More homogeneous) Thailand is located next to:

a. Laos and Kampuchea
b. India and China
c. China and Malaya
d. Laos and China
e. India and Malaya

Essay / Short Answer Test

Essays, classified as non-objective tests, allow for the assessment of higher-order thinking skills. Such tests require students to organize their thoughts on a subject matter in coherent sentences in order to inform an audience. In essay tests, students are required to write one or more paragraphs on a specific topic.

Type of Essay Items:

Restricted response type: The test limits the examinee’s response in terms of length, content, style or organization.

Example: Give and explain three reasons why the government should or should not allow teachers to work abroad as domestic helpers.

14 TYPES OF ABILITIES THAT CAN BE MEASURED BY ESSAY ITEMS

1. Comparisons between two or more things

2. The development and defense of an opinion

3. Questions of cause and effect

4. Explanations of meanings

5. Summarizing of information in a designated area

6. Analysis

7. Knowledge of relationships

8. Illustrations of rules, principles, procedures and applications

9. Applications of rules, laws, and principles to new situations

10. Criticisms of the adequacy, relevance or correctness of a concept, idea or information

11. Formulation of new questions and problems

12. Reorganization of facts

13. Discriminations between objects, concepts or events

14. Inferential thinking

Note that all of these involve the higher-level skills mentioned in Bloom’s Taxonomy.

THE FOLLOWING ARE RULES OF THUMB WHICH FACILITATE THE SCORING OF ESSAYS:

1. Phrase the direction in such a way that students are guided on the key concepts to be included.

2. Inform the students on the criteria to be used for grading their essays. This rule allows the students to focus on relevant and substantive materials rather than on peripheral and unnecessary facts and bits of information.

3. Put a time limit on the essay test.

4. Decide on your essay grading system prior to getting the essays of your students.

5. Evaluate all the students’ answers to one question before proceeding to the next question.

6. Evaluate answers to essay questions without knowing the identity of the writer.

7. Whenever possible, have two or more persons grade each answer.

The task is clearly defined.

The students are given an idea of the scope and direction you intend for the answer to take.

The question starts with a description of the required behavior, e.g. “Compare” or “Analyze”, to put students in the correct frame of mind.

Questions regarding a student’s opinion on a certain issue should focus not on the opinion but on the way it is presented and argued.

A larger number of shorter, more specific questions is better than one or two longer questions.

Example:

What is wrong with this question? Describe asthma?

Better: (Clearly explain what is expected of the student.)

Describe asthma. Include in your answer:

a. the pathophysiologic features of asthma
b. the clinical manifestations associated with an asthma episode
c. the management of an asthma episode. (10 points)

Example:

What is wrong with this question? Who is better, Rizal or Bonifacio?

Better: (The students are given an idea of the scope and direction you intend for the answer to take.)

Compare and contrast the methods used by Rizal and Bonifacio in promoting nationalism. (5 points)

MATCHING TYPE

Matching Type items may be considered modified multiple-choice items in which the remaining choices progressively decrease as one successfully matches the items on the left with the items on the right.

A premise or premiss is a statement that an argument claims will induce or justify a conclusion.

A response is a written or verbal answer to a question in a test or questionnaire.

Matching Type Tests

The list of responses should be relatively short.

Response options should be arranged alphabetically or numerically.

Directions should clearly indicate the basis for matching.

Can responses be used more than once?

Where will you place your answer?

Can students infer relationships or are they based on real-world logic?

Position of matches should be varied. Avoid using patterns.

The choices of each matching set should be on one page.

There should be more responses than premises in a single set if responses cannot be used more than once.

The premises, as well as the responses, should be homogeneous and grouped as one item.

Example:

Set A: Provinces in Region I

Set B: Provinces in CAR

If responses can be used more than once, their number should be proportional to the number of premises (3:5 or 4:10).

Examples:

Better: (Use homogeneous material in matching items, and if responses are not to be used more than once, include more responses than premises.)

Match the theories in Column A with their proponents in Column B. Write the letter of the correct answer.

Column A                          Column B

___ 1. Psychodynamic Theory       A. Albert Bandura
___ 2. Trait Theory               B. B.F. Skinner
___ 3. Behaviorism                C. Carl Rogers
___ 4. Humanism                   D. Gordon Allport
___ 5. Social Learning Theory     E. Karen Horney
                                  F. Raymond Cattell
                                  G. Sigmund Freud

Things to Remember:

Making a good test takes time

Teachers have the obligation to provide their students with the best evaluation

Tests play an essential role in the life of the students, parents, teachers and other educators.

Break any of the rules when you have a good reason for doing so!

(Mehrens, 1973)

POINTS TO PONDER…

A good lesson makes a good question

A good question makes a good content

A good content makes a good test

A good test makes a good grade

A good grade makes a good student

A good student makes a good COMMUNITY