Assessment
Do we need assessment?
Alfie Kohn
Research shows three reliable effects when students are graded:
They tend to think less deeply, avoid taking risks, and lose interest in the learning itself.
“I tell my classes that if they just do what they are supposed to do and meet the standard requirements, that they will earn a C,” he said. “That is the default grade. They see the default grade as an A.”
New York Times
Types of assessment
Performance (practical) tests
Assessment through writing (papers, essays)
Supply or select questions
Portfolio
Performance test
With given tools, find out which of the 3 paper towels is most absorbent
Create a blog using Blogger.com
Make a surface with slope = .45 and friction = .15
Short-answer questions (supply type)
What is the quadratic formula?
List 5 causes of the Civil War.
America was discovered by -----------
(Matching Question)
Which invention was created by each of the following inventors?
A- Gutenberg    1- Steamboat
B- Edison       2- Phonograph
C- Fulton       3- Movable-type press
Supply Question
Disadvantage
Spelling errors = losing points
Mostly based on memory
Neglects the ability to see the whole picture
True-False Questions
The speed of light is 186,000 miles a second.
True False
Herbert Hoover invented the vacuum cleaner.
True False
True-False
Advantage?
Good for intro or start
Disadvantage
Could be misused (easy to create, so it gets used a lot)
People are often in doubt (186,000)
Not appropriate for complex or subtle kinds of knowledge and comprehension.
Multiple Choice
A sonnet is a verse form consisting of
A. six lines
B. eight lines
C. ten lines
D. twelve lines
E. fourteen lines
Multiple Choice
Advantage
Of all the select-type forms, the multiple choice question is the most flexible and most reliable.
Easy to score
Disadvantage
Hard to design & time consuming
Portfolios
Performances in art, writing, mathematics, science projects, teaching, computers can be stored in portfolios
Portfolio
Advantages and disadvantages?
Essay Questions
Compare and contrast the American & French Revolutions
What makes Beethoven’s music unique?
Assessment through writing
Any problems?
Factors affecting a student’s score
Handwriting or penmanship
Neatness
Gender
Physical beauty or sexual attractiveness
First/last name (ethnicity & religion)
Missing from research
Bias due to open structure (letting the writer impress the teacher with good writing even if she/he does not answer the question)
Reliability
Another experiment
A teacher re-marked the same 10 essays after two months; the correlation between the two sets of marks was only about .45.
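A re-marking check like this can be run on any set of double-marked essays. The sketch below computes the Pearson correlation between two marking passes. The marks are made-up illustrative numbers, not data from the experiment, and `pearson_r` is a helper name introduced here.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equally long lists of marks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of the same length (at least 2 marks)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of co-deviations and sums of squared deviations from the means.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when all marks are identical")
    return cov / sqrt(var_x * var_y)


# Hypothetical marks (out of 20) for the same 10 essays, marked twice.
first_pass = [12, 15, 9, 17, 11, 14, 8, 16, 13, 10]
second_pass = [14, 12, 11, 15, 9, 16, 10, 13, 12, 13]

print(round(pearson_r(first_pass, second_pass), 2))
```

A value near 1 would mean the teacher ranked the essays almost the same way both times; a value like .45 means the second marking only loosely agrees with the first.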
What about rubrics?
Holistic or Analytic?
Holistic Rubric
If writing were scored in a truly holistic manner, there would be less pre-specification of criteria and more flexibility in scoring.
Students would be evaluated more on what they actually write than on how well their writing matched the scoring criteria.
Analytic
Example: an essay about the First Amendment
Essay rubrics
1 point for pointing out that civil liberties are provided in the Bill of Rights
1 point for pointing out that the First Amendment provides for freedom of speech and press
1 point for pointing out a recent news story that involved the First Amendment
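An analytic rubric of this kind is just a checklist with points attached, which is why it is easy to turn into feedback. The sketch below is one possible representation, not a prescribed tool: `Criterion`, `analytic_score`, and `feedback` are names invented for illustration, and the decision about whether a criterion is met is still made by a human grader.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Criterion:
    description: str
    points: int = 1


# The three criteria of the First Amendment example rubric.
FIRST_AMENDMENT_RUBRIC = [
    Criterion("Points out that civil liberties are provided in the Bill of Rights"),
    Criterion("Points out that the First Amendment provides for freedom of speech and press"),
    Criterion("Points out a recent news story that involved the First Amendment"),
]


def analytic_score(rubric, met):
    """Sum the points of every criterion the grader marked as met.

    `met` holds one boolean per criterion, in rubric order.
    """
    if len(met) != len(rubric):
        raise ValueError("need exactly one judgment per criterion")
    return sum(c.points for c, ok in zip(rubric, met) if ok)


def feedback(rubric, met):
    """List the criteria that were missed, as feedback for the student."""
    return [c.description for c, ok in zip(rubric, met) if not ok]


# Hypothetical grading of one essay: first and third criteria met.
judgments = [True, False, True]
print(analytic_score(FIRST_AMENDMENT_RUBRIC, judgments))
print(feedback(FIRST_AMENDMENT_RUBRIC, judgments))
```

Note what the sketch cannot see: an essay that earns all three points may still be poorly written, which is the validity question raised for analytic scoring.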
Analytic scoring
Is much more useful for providing feedback to students
However, the question is how valid & reliable analytic scoring is.
Why Rubrics?
Your writing is poor.
This is my opinion.
Assumption: rubrics help in developing more valid, reliable, and fair assessment
Problems with writing rubrics?
Standardization
Rubrics promote reliability in performance assessments by standardizing scoring, but they also standardize writing.
The standardization of a skill that is fundamentally self-expressive and individualistic obstructs its assessment.
Writing to the Rubrics
Rubrics have the power to undermine assessment.
Spandel (2006): the quality of voice is omitted from many rubrics because it is thought too difficult to define.
Kohn (2006): quality is more than the sum of its rubricized parts.
My Research
Reliability and Validity of Rubrics for Assessment through Writing
Rezaei, Ali Reza; Lovorn, Michael
Assessing Writing, v15 n1 p18-39 2010
http://eric.ed.gov/?id=EJ881105
Technology Assisted Assessment
Problem with online assessment?
Plagiarism
How do we know who is answering?
Wrong assumptions
Most online assignments involve writing essays
We don’t have cheating in face-to-face (f2f) classes
There is no way to control cheating
We don’t have other alternatives
Ways to control cheating?
???
Using technology
Advantage
Frequent assessment (because it is easy)
Using technology
With the integration of technology, assessment can be an integral part of teaching and both can happen together.
Immediate feedback
If you like this activity, ring the bell
Was this activity useful?
Write your reflection
Peer assessment
Advantages & Disadvantages
Send to the teacher
http://wps.prenhall.com/chet_creswell_educational_2/24/6311/1615692.cw/index.html
Khan Academy
IXL
Using Images
Fig. 1. Students are asked to predict the number, name, magnitude, and direction of the forces being applied on the athlete jumping on a trampoline. Later they have to predict at which of the above positions the velocity/acceleration is maximum, minimum, zero, increasing, or decreasing, and why.
Simple animations
http://pbskids.org/curiousgeorge/games/feed_gnocchi/feed_gnocchi.html
Innovative teaching
Convince me
Advanced simulations
http://phet.colorado.edu/en/simulation/forces-and-motion
http://phet.colorado.edu/en/simulation/acid-base-solutions
Speed Test
http://www.oswego.org/ocsd-web/games/Mathmagician/cathymath.html
Lumosity (and other games): fast thinking makes us happy, energized, and self-confident
http://www.eurekalert.org/pub_releases/2006-09/afps-se092606.php
https://www.lumosity.com/personal-training-plan/sign-up
Simple simulations (Excel, ppt)
Tips for a better assessment
Tips for essay questions
Use them only when multiple choice cannot be used (application & synthesis & evaluation)
Don’t use them to evaluate simple kinds of knowledge and comprehension
General or Specific?
Broader topics are harder to evaluate, but they require students to provide their own structure and focus, and they can call for greater complexity.
Example
How has your thinking about ----------- changed as a result of taking this course?
Broad subjects (China, environment)
More specifics (air quality in LA)
Explain the verbs
Wrong: Write an essay about global warming
Correct: Compare and contrast Clinton’s and Obama’s environmental policies
Allowing choice among essay questions?
Writing a model answer
To guide yourself in grading
And for students to follow the format
Not for students to copy
Question
Do students study more efficiently for essay examinations than for multiple-choice tests?
How to reduce biases in scoring essays?
Use codes or ask them to write their names on the back of the page
Score all answers to a given question before going to the next question
Oral Exam
Essay tests allow you to get into your students’ minds in a way otherwise impossible
However, what they say orally is one thing, and what they say in writing is another.
You should note that students who are effective in one medium are not necessarily good in the other.
E.g., class discussion
Problems with multiple choice exams
Have not seen good ones
Hard to make good ones
MC exams are not good for evaluating HOTH and CTH
Force Concept Inventory
This is an example of a multiple choice test that, unlike many multiple choice tests:
1- Does not measure memorization
2- Measures higher order thinking
3- Measures students' misconceptions
4- Is professionally designed
5- Has all options based on research on students' learning
6- Has had its validity and reliability confirmed in several published journal articles
Tips for multiple choice questions
Table of specifications
My PowerPoint on Bloom
New Version
Old Version
Link
Stem & alternatives
Multiple-choice items have two main parts
1- the stem, either a question or an incomplete statement that comes first
2- the alternatives, from which students choose the best or the correct one
Stem should state a problem
Rather than simply lead into a collection of true-false statements
The United States………..
A. has less than 300 million people
B. grows large amounts of rubber
C. has few good harbors
D. produces most of the world’s automobiles
Better
The population of the United States could be characterized as:
A. stable birthrate
B. people of varied national background
C. its even distribution over the area of the country
D. an increasing movement from suburbs into cities
Still better
The population of the United States during the 1990s could be characterized as
Rising
Staying fairly stable
Falling
Use Plausible Distractors
The distractors should be plausible so that students who don’t know the correct answer will tend to select them.
How many alternatives
Use as many distractors, up to six, as can be logically created.
Don’t hesitate to change the number of distractors from item to item.
Direct questions
Use direct questions rather than incomplete statements in the stem when appropriate.
Poor Example
In analyzing the cost of living in the United States, we find that its largest component is…….
A. food
B. housing
C. clothing
D. health care
E. recreation
Better
Which of the following is the largest component of the cost of living in the United States?
A. food
B. housing
C. clothing
D. health care
E. recreation
Avoid repeating words
Which of the following is the best brief description of this novelist’s writing?
A. a flowery approach to characterization
B. a psychiatric approach to characterization
C. an inner-feeling approach to characterization
D. an overt-action approach to characterization
Which of the following is the best brief description of this novelist’s approach to characterization?
A. flowery
B. psychiatric
C. inner-feeling
D. overt-action
Length & Precision
The length and precision of the choices should not vary systematically with their correctness.
Careless item writers tend to make the correct choice the longest one.
B is always the correct answer
The correct response alternative should vary from item to item. Why?
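One way to keep the key from falling into a pattern (such as B always being correct) is to shuffle the alternatives for each item. The sketch below assumes an item is stored as a list of options plus the index of the correct one; `shuffle_options` and the placeholder item are hypothetical. Shuffling is not suitable for options with a natural order, such as numbers or dates.

```python
import random


def shuffle_options(options, correct_index, rng):
    """Return the options in random order plus the new index of the key.

    `rng` is a random.Random instance so that a test form can be
    reproduced from its seed.
    """
    order = list(range(len(options)))
    rng.shuffle(order)
    shuffled = [options[i] for i in order]
    # The correct option now sits wherever its old index landed.
    new_correct_index = order.index(correct_index)
    return shuffled, new_correct_index


# Placeholder item; the correct answer was written first by the item writer.
options = ["correct answer", "distractor 1", "distractor 2", "distractor 3"]
rng = random.Random(2024)  # arbitrary seed chosen for this illustration

shuffled, key = shuffle_options(options, 0, rng)
for letter, text in zip("ABCD", shuffled):
    print(f"{letter}. {text}")
print("Key:", "ABCD"[key])
```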
What is wrong with this?
The judiciary committee’s impeachment deliberations resulted in a resolution in favor of
A. no impeachment
B. the majority voted for three articles
C. a sharp division between the two parties
D. one article cited obstruction of justice.
Consistent grammar
All choices should be grammatically consistent with the stem and one another.
Better
The judiciary committee’s impeachment deliberations resulted in resolutions in favor of how many articles of impeachment?
A. none
B. one
C. two
D. three
E. four
Check the scoring key
Have other teachers check your test and scoring key.
There should be one and only one choice that experts would consider best
None of the above
Should be used ONLY where answers can be absolutely right or wrong.
For example, in math and science but not in social sciences where there are only best answers.
Power vs Speed
Don’t make tests so long that they become speed rather than power tests.
Some very able students are very slow test takers.
For performance assessment, a speed test is sometimes OK.
Paraphrase
Comprehension is defined as the students’ ability to answer a question based on a paraphrase of a statement that appeared in learning materials.
Example
In the text they read:
“The private single-handedly blew up the ammo dump”
The question:
“The soldier who ignited the enemy’s supply of ammo was”
A. on patrol
B. working alone
C. with a team of frogmen
Example 2
“Rising air cools and releases water”
When ocean winds go over the coastal mountains, they are likely to
A. pick up moisture
B. increase velocity
C. bring rain
Diagnostic items
The distractors students choose in multiple choice items can be rich sources of information if the test items were designed with diagnosis in mind, not just as ways to assess knowledge.
327 - 48 = ?
(a) 389 (borrowed but forgot to reduce the number in the column he borrowed from)
(b) 321 (subtracted 27 from 48)
(c) 279
(d) 189 (borrowed twice from the left)
Diagnosis
If the student chose (a), we would hypothesize that he correctly “borrowed” 10 from the appropriate number on top but forgot to reduce the number in the column he borrowed from by 1, a common and easily corrected error.
If the student chose (b), we would infer that the student subtracted the smaller number from the larger number regardless of whether it was on top or on the bottom.
If the student chose (d), we would assume that the student borrowed twice from the left-most digit on top, and never from the adjoining column.
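The three hypotheses can be checked by writing each faulty procedure out column by column and confirming that it produces the matching distractor for 327 - 48. This is a minimal sketch for illustration: the function names are invented here, and it assumes positive whole numbers with the larger number on top.

```python
def digits(n, width):
    """Decimal digits of n, least significant first, zero-padded to width."""
    return [(n // 10**i) % 10 for i in range(width)]


def from_columns(cols):
    """Rebuild a number from per-column results, least significant first."""
    return sum(d * 10**i for i, d in enumerate(cols))


def forgot_to_decrement(top, bottom):
    """Borrows 10 when needed but never reduces the column borrowed from."""
    width = len(str(top))
    cols = []
    for t, b in zip(digits(top, width), digits(bottom, width)):
        if t < b:
            t += 10
        cols.append(t - b)
    return from_columns(cols)


def smaller_from_larger(top, bottom):
    """Subtracts the smaller digit from the larger one in every column."""
    width = len(str(top))
    pairs = zip(digits(top, width), digits(bottom, width))
    return from_columns([abs(t - b) for t, b in pairs])


def borrow_from_leftmost(top, bottom):
    """Takes every borrow from the left-most digit, never the next column."""
    width = len(str(top))
    t, b = digits(top, width), digits(bottom, width)
    cols = []
    for i in range(width):
        if t[i] < b[i]:
            t[i] += 10
            t[-1] -= 1  # the borrow always comes out of the left-most digit
        cols.append(t[i] - b[i])
    return from_columns(cols)


BUGGY_PROCEDURES = {
    "forgot to reduce the column borrowed from": forgot_to_decrement,
    "subtracted smaller digit from larger": smaller_from_larger,
    "borrowed only from the left-most digit": borrow_from_leftmost,
}


def diagnose(top, bottom, answer):
    """Name the faulty procedure (if any) that reproduces a student's answer."""
    if answer == top - bottom:
        return "correct"
    for label, procedure in BUGGY_PROCEDURES.items():
        if procedure(top, bottom) == answer:
            return label
    return "unrecognized error"


for choice in (389, 321, 279, 189):
    print(choice, "->", diagnose(327, 48, choice))
```

Because each distractor is generated by one specific faulty procedure, a student's wrong choice points to the error to correct rather than only showing that the answer was wrong.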
327 - 48 = ?
(a) 389
(b) 321
(c) 279
(d) 189