ga2143 teslchina 7 april

Measurement and Evaluation of Thinking Skills (HOTS) ASSOC. PROF DR KAMISAH OSMAN 7 APRIL 2008


Post on 21-Dec-2014


Page 1: Ga2143 Teslchina 7 April

Measurement and Evaluation of Thinking Skills (HOTS)

ASSOC. PROF DR KAMISAH OSMAN

7 APRIL 2008

Page 2: Ga2143 Teslchina 7 April

Measurement and Evaluation

What may happen in the measurement and evaluation of science teaching when it comes to measuring higher order thinking?

“We have never studied the content of these questions.”

“This test is too easy. I can answer them by closing my eyes.”

“I scored 65% on the test, but I still failed the test, which doesn’t make sense to me.”

Page 3: Ga2143 Teslchina 7 April

What is it all about?

[Figure: students (x marks) mapped against measurement objectives 1.1 to 4.1]

Page 4: Ga2143 Teslchina 7 April
Page 5: Ga2143 Teslchina 7 April

Key to Good Assessment of HOTS

Alignment among:

• Assessment (Measurement + Evaluation)
• Curriculum
• Instruction

Page 6: Ga2143 Teslchina 7 April

Necessary Conditions for Good Assessment

• Planning
• Developing
• Administering
• Scoring
• Analyzing
• Grading

Page 7: Ga2143 Teslchina 7 April

Test Planning: Test Grid

Number of items (points)

                  Remember   Understand   Apply
Topic             M-C        C-R          Performance   Sub-total
Substances        5 (5)      2 (6)
Mixture           8 (8)      1 (5)
Conservation      4 (4)      1 (3)
Physical change   5 (5)
Chemical change   4 (4)                   2
Sub-total         26 (26)    4 (14)       2 (10)        32 (50)
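A test grid is easy to check mechanically. The sketch below re-adds the column sub-totals of the grid above. The dictionary layout and function names are illustrative, and attaching the two performance tasks (10 points) to the "Chemical change" row is an assumption: the slide only gives their column sub-total.

```python
# Test grid from the slide: topic -> {question format: (items, points)}.
# Assumption: the two performance tasks (10 points in total) belong to the
# "Chemical change" row; the slide only states their column sub-total.
grid = {
    "Substances":      {"M-C": (5, 5), "C-R": (2, 6)},
    "Mixture":         {"M-C": (8, 8), "C-R": (1, 5)},
    "Conservation":    {"M-C": (4, 4), "C-R": (1, 3)},
    "Physical change": {"M-C": (5, 5)},
    "Chemical change": {"M-C": (4, 4), "Performance": (2, 10)},
}


def column_totals(grid):
    """Sum items and points per question format across all topics."""
    totals = {}
    for cells in grid.values():
        for fmt, (items, points) in cells.items():
            n, p = totals.get(fmt, (0, 0))
            totals[fmt] = (n + items, p + points)
    return totals


def grand_total(grid):
    """Total number of items and total points for the whole test."""
    cols = list(column_totals(grid).values())
    return sum(n for n, _ in cols), sum(p for _, p in cols)


totals = column_totals(grid)
overall = grand_total(grid)
for fmt, (items, points) in totals.items():
    print(f"{fmt}: {items} items, {points} points")
print(f"Whole test: {overall[0]} items, {overall[1]} points")
```

Running such a check while planning makes it harder for the written test to drift away from the intended balance of topics and formats.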

Page 8: Ga2143 Teslchina 7 April

Developing multiple-choice (MC) questions

Guideline 1: The stem of an item should be meaningful by itself and should present a definite problem

Poor:

A scientist…

a. Consults the writing of Aristotle.

b. Debates with fellow scientists.

c. Makes a careful observation during experiments.

d. Thinks about the probability.

Better:

How does a scientist discover new facts?

a. Consulting the writing of Aristotle.

b. Debating with fellow scientists.

c. Making careful observations during experiments.*

d. Thinking about the probability.

Page 9: Ga2143 Teslchina 7 April

Developing MC questions

Guideline 2: Make all choices plausible to uninformed students

Poor:

What are electrons?

a. Negative particles*

b. Positive particles

c. Neutral particles

d. Mechanical tools

Better:

What are electrons?

a. Negative particles*

b. Positive particles

c. Neutral particles

d. Nuclei of atoms

Page 10: Ga2143 Teslchina 7 April

Ways to make choices plausible

1. Use students’ common misconceptions or errors

2. Use the textbook language or other phraseology that has the “appearance of truth”

3. Use distracters that are parallel in form and grammatically consistent with the item’s stem

4. Make the distracters similar to the correct answer in length, vocabulary, sentence structure, and complexity of thought

Page 11: Ga2143 Teslchina 7 April

Developing MC questions

Guideline 3: Arrange the responses in a logical order if it exists

Poor:

How many bonding electrons does an oxygen atom have?

a. 3

b. 5

c. 6

d. 2*

e. 4

Better:

How many bonding electrons does an oxygen atom have?

a. 2*

b. 3

c. 4

d. 5

e. 6

Page 12: Ga2143 Teslchina 7 April

Developing MC questions

Guideline 4: Avoid extraneous clues to the correct and incorrect choices

Poor:

Increasing the temperature will increase the pressure of a gas in a sealed container because

a. No container expansion

b. Gas particles constantly move

c. Gas particles collide with each other

d. More gas particles collide with each other and with the container walls*

Better:

Increasing the temperature will increase the pressure of a gas in a sealed container because

a. Gas particles move more rapidly

b. Gas particles expand bigger

c. Gas particles collide more with each other

d. Gas particles collide more with the container*

Page 13: Ga2143 Teslchina 7 April

Developing MC questions

Guideline 5: Avoid using the “none of the above”, “all of the above” and “I don’t know” alternatives

Poor:

According to Boyle’s law, which of the following changes will occur to the pressure of a gas at a given temperature when the volume of the gas is increased?

a. increase

b. decrease*

c. no change

d. none of the above

Better:

According to Boyle’s law, which of the following changes will occur to the pressure of a gas at a given temperature when the volume of the gas is increased?

a. increase

b. decrease*

c. increase first, then decrease

d. no change

Page 14: Ga2143 Teslchina 7 April

Summary: Checklist for good MC questions

• The stem presents a clear problem
• The stem is stated as a question
• The choices are equally plausible
• The choices are in alpha-numerical or other logical order
• The choices are consistent in length and contain no extraneous clues
• The choices contain only one best or correct answer
• “None of the above” or “all of the above” choices are avoided

Page 15: Ga2143 Teslchina 7 April

Advantages and Limitations of M-C Questions

Advantages

• Easy to score
• Objective to score
• Large coverage
• Good at assessing specific knowledge and understanding or lower order thinking skills (LOTS)
• Incorrect answers provide valuable information on students’ learning difficulties

Limitations

• Limited in assessing higher order thinking skills (HOTS)
• Guessing
• Reading comprehension
• Time consuming to write good M-C questions

Page 16: Ga2143 Teslchina 7 April

How to measure higher order thinking skills (HOTS)

First of all: You don’t have to use MC to assess HOTS; there are many other question formats that assess HOTS better than MC does

Understand the differences among the cognitive levels

Use combinations of question formats

Develop appropriate multiple-choice questions

Page 17: Ga2143 Teslchina 7 April

Lower Order Thinking Skills (LOTS)

Remember: recognize (identify), recall (retrieve)

Understand: interpret (clarify, paraphrase, represent, translate), exemplify (illustrate, instantiate), classify (categorize, subsume), summarize (abstract, generalize), infer (conclude, extrapolate, interpolate, predict), compare (contrast, map, match), explain (construct models)

Apply: execute (carry out), implement (use)

Page 18: Ga2143 Teslchina 7 April

Higher Order Thinking Skills (HOTS)

Analyze: differentiate (discriminate, distinguish, focus, select), organize (find coherence, integrate, outline, parse, structure), attribute (deconstruct)

Evaluate: check (coordinate, detect, monitor, test), critique (judge)

Create: generate (hypothesize), plan (design), produce (construct)

Page 19: Ga2143 Teslchina 7 April

Original Terms     New Terms

Evaluation         • Creating
Synthesis          • Evaluating
Analysis           • Analysing
Application        • Applying
Comprehension      • Understanding
Knowledge          • Remembering

Page 20: Ga2143 Teslchina 7 April

Using combinations of question formats: M-C + M-C

After a large ice cube has melted in a beaker of water, how will the water level change?

a. higher
b. lower
c. the same*

Why do you think so? Choose all that apply.

a. The mass of water displaced is equal to the mass of the ice.*
b. Ice has more volume than water.
c. Water is denser than ice.
d. Ice cube decreases the temperature of water.
e. Water molecules in water occupy more space than in ice.

Page 21: Ga2143 Teslchina 7 April

Using combinations of question formats: M-C + Constructed Response

After a large ice cube has melted in a beaker of water, how will the water level change?

a. higher
b. lower
c. the same*

Why do you think so? Please justify your choice:

Page 22: Ga2143 Teslchina 7 April

Using combinations of question formats: Performance + M-C

Using the materials provided at your table, create a model of the human heart. You should use the blue and red Play-Doh to represent de-oxygenated and oxygenated blood. Be sure to create and label the following:

Left Atrium (2 pts.)
Right Atrium (2 pts.)
Left Ventricle (2 pts.)
Right Ventricle (2 pts.)
Aorta (2 pts.)
Pulmonary Vein (2 pts.)
Pulmonary Artery (2 pts.)

In the heart, the mixing of oxygen-rich and oxygen-poor blood is prevented by the

a. mitral valve
b. tricuspid valve
c. septum*
d. pericardium

Page 23: Ga2143 Teslchina 7 April

Developing appropriate M-C questions for HOTS

1. Providing a factual statement, ask students to analyze.

The Sun is the only body in our solar system that gives off large amounts of light and heat. Why can we see the Moon?

A. It is nearer the Earth than the Sun
B. It is reflecting light from the Sun*
C. It is the biggest object in the solar system
D. It is without an atmosphere

Page 24: Ga2143 Teslchina 7 April

Developing appropriate M-C questions for HOTS

2. Providing a diagram, ask students to identify elements:

In the cell on the right, what letter correctly identifies the portion that first receives a signal?

a. A*
b. B
c. C
d. D
e. E

Page 25: Ga2143 Teslchina 7 April

Developing appropriate M-C questions for HOTS

3. Providing data, ask students to develop a hypothesis:

Amounts of oxygen produced in a pond at different depths are shown below:

Location        Oxygen
Top meter       4 g/m³
Second meter    3 g/m³
Third meter     1 g/m³
Bottom meter    0 g/m³

Which statement is a reasonable hypothesis based on the data in the table?

A. More oxygen production occurs near the surface because there is more light there.*
B. More oxygen production occurs near the bottom because there are more plants there.
C. The greater the water pressure, the more oxygen production occurs.
D. The rate of oxygen production is not related to depth.

Page 26: Ga2143 Teslchina 7 April

Developing appropriate M-C questions for HOTS

4. Providing a statement, ask students to evaluate its validity:

The crews of two boats at sea can communicate by shouting to each other, and so can the crews of two nearby spaceships in space. How valid is this statement?

A. Valid
B. Partially valid*
C. Invalid
D. Not enough information to make a judgment

Page 27: Ga2143 Teslchina 7 April
Page 28: Ga2143 Teslchina 7 April
Page 29: Ga2143 Teslchina 7 April

Developing constructed response questions for assessing HOTS

Short constructed response (SCR) questions require answers ranging from one word to a few sentences.

Extended constructed response (ECR) questions require students to write a few sentences or a short paragraph.

Essay (E) questions require students to write a few paragraphs to a few pages.

Page 30: Ga2143 Teslchina 7 April

General guidelines for writing constructed-response questions

1. Define the task completely and specifically.

Poor: State whether you think pesticides should be used on farms.

Better: State the environmental effects of pesticide use on farms.

Page 31: Ga2143 Teslchina 7 April

Avoid ambiguous words

Possible student interpretations of the word “discuss”:

• Explain in my own words, maybe with an introduction, something in the middle and a conclusion
• Analyze at length
• Present analogies and comparisons
• Tell all I know as much as possible
• Put down facts
• …

Page 32: Ga2143 Teslchina 7 April

General guidelines for writing constructed-response questions

2. Give explicit directions such as the length, grading guideline, and time to complete.

Poor: State whether you think pesticides should be used on farms.

Better: State whether you think pesticides should be used on farms. Defend your position as follows:

a. Identify any benefits associated with pesticide use.
b. Identify any negative effects associated with pesticide use.
c. Compare the benefits against the negative effects.
d. Suggest whether better alternatives to pesticides are available.

Your essay should be no more than 2 double-spaced pages. Two of the points will be used to evaluate the sentence structure, punctuation, and spelling. (10 points)

Page 33: Ga2143 Teslchina 7 April

General guidelines for writing constructed-response questions

3. Do not provide optional questions for students to choose from

Different questions may measure completely different constructs, which makes comparisons among students difficult

Page 34: Ga2143 Teslchina 7 April

General guidelines for writing constructed-response questions

4. Define scoring clearly and appropriately: scoring rubric

Analytic vs Holistic

[Figure: an analytic rubric scores each trait (Trait 1, Trait 2, Trait 3) separately on a 4-to-1 scale; a holistic rubric gives one overall score from 5 to 1, with a description for each level]

Page 35: Ga2143 Teslchina 7 April

Holistic Scoring Rubric

Quality Category (Score): Characteristics

Distinguished (4): Student can apply all the skills needed to investigate a self-selected issue or problem.

Experienced (3): Student can use appropriate skills needed to investigate a self-selected issue or problem.

Average (2): Student can use some investigative skills to investigate a problem identified by another person.

Novice (1): Student has difficulty in demonstrating skills needed to study a provided problem.

Page 36: Ga2143 Teslchina 7 April

Analytic Scoring Rubric

Each attribute is scored 3, 2 or 1.

Source
3: Lists more than three sources; includes title, page, date; includes author
2: Lists three sources; includes title, page, date; includes author
1: Lists two sources; includes title, page, and date

Report Title
3: Relates to main topic; capitalizes correctly; captures a reader’s interest
2: Relates to main topic; capitalizes correctly
1: Relates to main topic

Content
3: Many related topics; informative; well organized
2: Relates to topic; informative
1: Relates to topic

Paragraphs
3: Complete sentences; correct spelling; correct punctuation; correct capitalization; neatly done
2: Complete sentences; correct spelling; correct punctuation; correct capitalization
1: Complete sentences

Illustrations
3: More than one; titled; labeled correctly; neat appearance
2: Titled; labeled correctly; neat appearance
1: Titled; labeled

Activities
3: Relates to issue question; informative; written conclusion
2: Relates to issue question; informative; written conclusion
1: Relates to issue question

Cover
3: Title; name; date; illustration
2: Title; name; date
1: Title; name

Page 37: Ga2143 Teslchina 7 April

Holistic vs Analytic

Holistic

• Easy to construct
• Efficient to score
• Clear implication
• Vague feedback
• Less informative for students to answer the question

Analytic

• Time consuming to construct
• Time consuming to score
• Unclear implication
• Specific feedback
• Informative for students to answer the question

Page 38: Ga2143 Teslchina 7 April

Guidelines for scoring essay questions

• Essays are scored anonymously
• Essays are scored question by question across students
• Each essay is graded twice independently to ensure consistency/reliability
• Appropriate scoring rubrics are developed and applied consistently
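Grading each essay twice only helps if the two sets of scores are then compared. The sketch below shows two common agreement measures, the exact-agreement rate and Cohen's kappa. Kappa is our choice of statistic here, not one named on the slide, and the rubric scores are made-up illustration data.

```python
from collections import Counter


def percent_agreement(rater1, rater2):
    """Share of essays that received the same score from both raters."""
    matches = sum(a == b for a, b in zip(rater1, rater2))
    return matches / len(rater1)


def cohens_kappa(rater1, rater2):
    """Agreement corrected for the agreement expected by chance."""
    n = len(rater1)
    observed = percent_agreement(rater1, rater2)
    counts1, counts2 = Counter(rater1), Counter(rater2)
    categories = set(rater1) | set(rater2)
    expected = sum(counts1[c] * counts2[c] for c in categories) / n ** 2
    if expected == 1:
        return 1.0  # both raters used a single, identical score throughout
    return (observed - expected) / (1 - expected)


# Hypothetical rubric scores (1 to 4) given to eight essays by two raters.
first_reading = [4, 3, 3, 2, 1, 4, 2, 3]
second_reading = [4, 3, 2, 2, 1, 4, 3, 3]

agreement = percent_agreement(first_reading, second_reading)
kappa = cohens_kappa(first_reading, second_reading)
print(f"Exact agreement: {agreement:.2f}, Cohen's kappa: {kappa:.2f}")
```

Low agreement usually points back to the rubric: the level descriptions may be too vague for two readers to apply in the same way.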

Page 39: Ga2143 Teslchina 7 April

Multiple Faculty, TAs

• Common curriculum
• Common learning opportunities
• Develop and agree on a common test grid
• Same scoring rubrics
• Consider item banking

Page 40: Ga2143 Teslchina 7 April

Developing vs. Adopting

There are many standardized tests or item banks (e.g. http://www.flaguide.org/)

Standardized tests have established validity, reliability, and absence of bias

The key is the match between the test coverage and the curriculum/instruction

Page 41: Ga2143 Teslchina 7 April

Necessary conditions for good assessment

• Planning
• Developing
• Administering
• Scoring
• Analyzing
• Grading

Page 42: Ga2143 Teslchina 7 April

Administering tests

Order questions from easy to difficult: selected-response questions (e.g. M-C) first, SCR questions next, and ECR questions last

Give complete instructions before students begin: test purpose, time allowance, basis for responding, methods of recording, appropriateness of guessing

Use equivalent forms or different item orders and recording sheets to avoid cheating

Ensure adequate physical setting

Avoid unnecessary interaction with students

Start and end the test at the same time

Page 43: Ga2143 Teslchina 7 April

Scoring tests

Hand scoring vs. optical scanning

Need to establish inter-rater reliability for constructed response questions

Correct for guessing when appropriate (e.g. speeded tests)

Corrected Score = R – W/(n-1)

R is the number of right answers, W is the number of wrong answers, n is the number of choices per question

Correct for cheating

Harpp-Hogan index (H‑H) = EEIC/D

EEIC is exact errors in common

D is number of different responses
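Both corrections on this slide are simple arithmetic, so a short sketch may help. It assumes R is the number of right answers, W the number of wrong answers (omitted items not counted) and n the number of choices per question. For the Harpp-Hogan index it assumes EEIC counts items where two students chose the same wrong option and D counts items where their responses differ. The function names and answer sheets are illustrative only.

```python
def corrected_score(right, wrong, n_choices):
    """Correction for guessing: R - W / (n - 1)."""
    return right - wrong / (n_choices - 1)


def harpp_hogan(answers_a, answers_b, key):
    """H-H = EEIC / D for one pair of answer sheets.

    EEIC: items where both students gave the same wrong answer.
    D: items where the two students gave different answers.
    Returns None when D is 0, because the ratio is then undefined.
    """
    eeic = sum(1 for a, b, k in zip(answers_a, answers_b, key)
               if a == b and a != k)
    differences = sum(1 for a, b in zip(answers_a, answers_b) if a != b)
    if differences == 0:
        return None
    return eeic / differences


# 30 right and 8 wrong on a five-choice test.
print(corrected_score(30, 8, 5))

# Hypothetical ten-item test and two answer sheets.
key = list("ABCDABCDAB")
student_a = list("ABCDBBCACB")
student_b = list("ABCDBBCACD")
print(harpp_hogan(student_a, student_b, key))
```

A high index only flags a pair of sheets for a closer look; it is not proof of copying.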

Page 44: Ga2143 Teslchina 7 April

Item and test analysis

Item analysis: item response patterns, item difficulty, item discrimination, etc.

Test analysis: reliability (α), criterion-related validity, bias, predictive validity, etc.
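As a rough illustration of item analysis, the sketch below computes item difficulty as the proportion of students answering correctly, and item discrimination as the difference between the top and bottom scoring groups. The upper-lower method and the 27% group size are a common convention chosen for this sketch, not something the slide specifies, and the 0/1 response matrix is made up.

```python
def item_difficulty(responses, item):
    """Proportion of students who answered the item correctly."""
    return sum(row[item] for row in responses) / len(responses)


def item_discrimination(responses, item, fraction=0.27):
    """Upper-lower index: p(correct) in the top group minus the bottom group."""
    ranked = sorted(responses, key=sum, reverse=True)  # best total score first
    k = max(1, round(len(ranked) * fraction))
    upper, lower = ranked[:k], ranked[-k:]
    return (sum(r[item] for r in upper) - sum(r[item] for r in lower)) / k


# Hypothetical scored responses: one row per student, one column per item,
# 1 = correct and 0 = incorrect.
responses = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 1],
    [1, 0, 1, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 1],
]

for item in range(4):
    p = item_difficulty(responses, item)
    d = item_discrimination(responses, item)
    print(f"Item {item + 1}: difficulty {p:.2f}, discrimination {d:.2f}")
```

An item with discrimination near zero, like the last one in this toy data, does not separate stronger from weaker students and is a candidate for revision.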

Page 45: Ga2143 Teslchina 7 April

Grading

Lake Wobegon Effect:

In 1988 it was reported that 70% of the students, 90% of the 15,000 school districts, and all 50 states in the US were scoring above the national norms on norm-referenced achievement tests in elementary schools (Cannell, 1988).

Page 46: Ga2143 Teslchina 7 April

Criterion-referenced grading based on standards

Commonly used standards: pass/fail, A/B/C/F, …

The essential part of standard setting is deciding on a cut-off score

Page 47: Ga2143 Teslchina 7 April

Deciding the cut-off score: M-C test

If the test has more than 20 questions and $\theta_0$ is within the range of .50 to .80, the approximate cut-off score $X$ can be calculated as follows:

$$X = \frac{n - \alpha}{\alpha}\,\theta_0 + \frac{\alpha - 1}{\alpha}\,M + 0.5$$

$M$ is the mean score on the test, $\alpha$ is the test reliability, and $n$ is the total number of questions.

$\theta_0$ is the true cut-off score (as a proportion of questions answered correctly) if measurement quality is perfect.

$X$ is the approximate cut-off score given the measurement error ($\alpha$).

Page 48: Ga2143 Teslchina 7 April

Example ($n = 28$, $\theta_0 = .75$, $X_0 = 21$)

Approximate cut-off score $X$ by average score on the test ($M$) and reliability ($\alpha$):

Reliability (α)    M = 17    M = 21    M = 25
.6                 23.43     20.76     18.10
.8                 21.75     20.75     19.75
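The example values can be recomputed with a few lines of code. This sketch assumes the slide's formula reads X = ((n − α)/α)·θ0 + ((α − 1)/α)·M + 0.5, where α is the test reliability and θ0 the true cut-off as a proportion. The Greek symbols did not survive in the transcript, so those names are our reading, and the function name is illustrative.

```python
def approximate_cutoff(n_items, theta0, reliability, mean_score):
    """Approximate cut-off score X adjusted for measurement error.

    Assumed formula (symbols reconstructed from the slide):
    X = (n - alpha) / alpha * theta0 + (alpha - 1) / alpha * M + 0.5
    """
    alpha = reliability
    return ((n_items - alpha) / alpha * theta0
            + (alpha - 1) / alpha * mean_score
            + 0.5)


# Recompute the slide's example: n = 28 questions, theta0 = .75.
for alpha in (0.6, 0.8):
    row = [approximate_cutoff(28, 0.75, alpha, m) for m in (17, 21, 25)]
    print(alpha, [round(x, 2) for x in row])
```

Under this reading the results agree with the slide's table to within a few hundredths, which suggests the slide rounded an intermediate coefficient. The pattern is the useful part: the lower the reliability, the further the cut-off moves away from the class mean.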

Page 49: Ga2143 Teslchina 7 April

Norm-referenced grading, or grading on a curve

Z = (X-μ)/σ
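A minimal sketch of the standard score above, assuming the class is treated as the whole population, so μ and σ are the class mean and the population standard deviation. The scores are made-up illustration data.

```python
from statistics import mean, pstdev


def z_scores(scores):
    """Convert raw scores to standard scores: Z = (X - mu) / sigma."""
    mu = mean(scores)
    sigma = pstdev(scores)  # population standard deviation of the class
    if sigma == 0:
        raise ValueError("all scores are identical; Z is undefined")
    return [(x - mu) / sigma for x in scores]


# Hypothetical raw scores for a class of eight students.
raw = [2, 4, 4, 4, 5, 5, 7, 9]
standardized = z_scores(raw)
print([round(z, 2) for z in standardized])
```

A Z of 0 means the class average, and each unit is one standard deviation above or below it, so a grade built on Z depends entirely on how the rest of the group performed.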

Page 50: Ga2143 Teslchina 7 April

Norm-referenced vs. criterion-referenced

• Number of students
• Characteristics of students
• Purpose of testing
• Use of testing results
• Quality of tests

Page 51: Ga2143 Teslchina 7 April

Other grading issues

1. Components of Grades (achievement, effort, attitude)

2. Combining Scores for the Final Grade (equate first, then weight, then aggregate)

3. Translating Final Grades to Letter Grades (pre-determined scheme)

4. Reporting Grades (clear definition)
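The order in item 2 matters because components with a wider spread of scores would otherwise dominate the total. The sketch below shows one way to follow it. It takes "equate" to mean putting every component on the same scale with Z scores, which is our reading rather than the slide's definition, and the component names, weights and scores are hypothetical.

```python
from statistics import mean, pstdev


def standardize(scores):
    """Put one component on a common scale (Z scores)."""
    mu, sigma = mean(scores), pstdev(scores)
    return [(x - mu) / sigma for x in scores]


def combine(components, weights):
    """Equate each component, apply its weight, then add up per student."""
    equated = {name: standardize(scores) for name, scores in components.items()}
    n_students = len(next(iter(components.values())))
    return [
        sum(weights[name] * equated[name][i] for name in components)
        for i in range(n_students)
    ]


# Hypothetical class of three students; lists are in the same student order.
components = {
    "quiz": [8, 6, 4],     # scored out of 10
    "exam": [50, 70, 90],  # scored out of 100
}
weights = {"quiz": 0.4, "exam": 0.6}

final = combine(components, weights)
print([round(score, 3) for score in final])
```

With raw scores simply added, the exam would carry almost all of the variation; after equating, the 40/60 weights mean what they say.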

Page 52: Ga2143 Teslchina 7 April

Putting all things together: VRA

validity

reliability

absence of bias

Page 53: Ga2143 Teslchina 7 April

Congratulations! You have survived two hours of preaching; now you can preach to others.

Questions?