TRANSCRIPT
Shelley Wylie, 28 October 2012

Designing Quality Assessments
Designing Quality Assessments
• Assessment as a design process
• Designing quality assessments
• Principles of assessment: Validity, Reliability and Relevance
• Writing quality assessments
• Constructing multiple choice items
• Review processes
• Assessment and teaching as inquiry
Ref: NZ Curriculum (2007) New Zealand Ministry of Education
The design process and assessment
Reference: http://www.curriculumsupport.education.nsw.gov.au/designproduce/tech_process.htm
1. Know the characteristics of the test population.
2. Determine assessment content & format.
3. Develop specifications.
4. Write the questions.
5. Review and check the questions.
6. Compile the paper.
7. Develop the marking criteria and guide.
8. Include justifications.
9. Administer the assessment task.
10. Mark the assessment.
11. Examine the data.
12. Provide student feedback.
13. Report results.
Allocation of workload
Assessment design process:
• Agreement on task specifications (Why? Who? What? How? When?)
• Writing / developing
• Review
• Assemble task and review as a whole (production)
• Administer task
• Finalise marking guide
• Marking and feedback
Agreement on task specifications: fit for purpose
• Class-based assessment
• Teacher observation
• Standardised assessment
Agreement on task specifications (Why? Who? What? How? When?)
Judgement
Agreement on task specifications: learning-oriented assessment

‘Learning-oriented assessment holds that for all assessments, whether predominantly summative or formative in function, a key aim is for them to promote productive student learning.’
Ref: Carless, D. (2009). Learning-oriented Assessment: Principles, Practice and a Project. In L.H. Meyer, S. Davidson, H. Anderson, R. Fletcher, P.M. Johnston & M. Rees (Eds.), Tertiary Assessment and Higher Education Student Outcomes: Policy, Practice & Research (pp. 79-90). Wellington, New Zealand: Ako Aotearoa.
Agreement on task specifications (Why? Who? What? How? When?)
What makes learning-oriented assessment difficult?
Believing that assessment is designed to ‘trick’ or ‘trap’ students and so find out what they don’t know.
Not knowing the curriculum documents and the difference between achievement objectives and learning outcomes.
Lack of understanding of the principles and theory of assessment makes it difficult to decide what and how to assess.
Assessing behaviour rather than quality of work.
Assuming students will understand how to self-assess without teaching them.
Reference: Hawk and Hill (2001), quoted in UNSWIL presentation by Dr Hanan Khalifa, Australia, September 2012
Agreement on task specifications: Principles of quality assessment & feedback practice
1. Help clarify what good performance is (goals, criteria, standards).
2. Encourage ‘time and effort’ on challenging learning tasks.
3. Deliver high quality feedback information that helps learners self-correct.
4. Encourage positive motivational beliefs and self-esteem.
5. Encourage interaction and dialogue around learning (peer and teacher-student).
6. Facilitate the development of self-assessment and reflection in learning.
7. Give learners choice in assessment – content and processes.
8. Involve students in decision-making about assessment policy and practice.
9. Support the development of learning communities.
10. Help teachers adapt teaching to student needs.
Reference: David Nicol (2009), REAP Conference keynote paper, Principles of good assessment and feedback: Theory and practice
Judging validity
‘Validity can be considered as the key issue in assessment. If an assessment is going to have any use at all, it is crucial that the inferences we make on the basis of assessment results are well-founded.’
‘Recent research proposes that validity is better understood as: how well the inferences we make or actions we take on the basis of an assessment result can be justified.’
Reference: Charles Darr (2005), A Hitchhiker’s Guide to Validity. Assessment News, Set 2, No. 3, 2005
Validity considerations

1. Content considerations: a fair sample of the learning area we are interested in; clear links between assessment tasks and learning intentions.
2. Construct considerations: the extent to which the assessment result can be used to make inferences about the existence of a particular trait or characteristic. Is the attribute essential for success in the assessment task?
3. Criterion considerations: we should ask questions when assessment results from different assessment tools that are meant to be testing the same construct lead us to very different interpretations of achievement.
4. Consequential considerations: we should question the validity of an assessment when there is evidence that the consequences of using the assessment results to make decisions or inform students of progress are detrimental to our overall educational goals.

Reference: Charles Darr (2005), A Hitchhiker’s Guide to Validity. Assessment News, Set 2, No. 3, 2005
Validity checklist
• Do the tasks match the learning intentions we are interested in?
• Does the test cover a wide enough range of content?
• Are there enough items or tasks to cover the scope of what is being assessed?
• Do the tasks require use of the desired skills and reasoning processes?
• Is there an emphasis on deep rather than surface knowledge?
• Are the directions for the assessment task clear?
• Are the questions unambiguous?
• Are the time limits sufficient?
• Do the tasks avoid favouring groups of students more likely to have useful background knowledge?
• Is the language used suitable?
• Are the reading demands fair?
Reference: Charles Darr (2005), A Hitchhiker’s Guide to Validity. Assessment News, Set 2, No. 3, 2005
Factors affecting reliability
• The number of tasks in the assessment
• The suitability of the questions or tasks for the students being assessed
• The spread of scores produced by the assessment
• The training of the assessors
• The clarity of marking guides and the checking of marking procedures
• The wording of the rubric
• How closely standardised procedures and conditions for assessment are followed
• How well questions and tasks are phrased
• The anxiety or readiness of the students for assessment
Reference: Charles Darr (2005), A Hitchhiker’s Guide to Reliability. Assessment News, Set 2, No. 3, 2005
Assessment results and reliability
Unreliable results will not lead to valid inferences about student achievement. However, just because assessment results are reliable does not mean that we are assessing what counts. Reliability is therefore a necessary but not sufficient condition for validity.
Reference: Charles Darr (2005), A Hitchhiker’s Guide to Reliability. Assessment News, Set 2, No. 3, 2005
Assessing deep versus surface knowledge
Ref: John Biggs (2011), http://www.johnbiggs.com.au/solo_graph.html
Ways to increase the reliability of marking
• Clear, agreed marking guidelines
• Mark promptly but not hurriedly
• Allocate manageable sections to markers
• Mark across the whole cohort
• Encourage discussion of unexpected responses
• Check mark
• Use MC tests
Multiple choice tests
• Can be used to objectively assess a wide variety of outcomes (recalling, applying information, higher order skills)
• (Usually) expert agreement on the correct answer(s)
• Allows distractor analysis (see the sketch below)
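The distractor analysis mentioned above can be done with a very small script. The sketch below is illustrative only, and the data layout is an assumption rather than part of any particular testing package: it tallies how often each option was chosen, overall and within the top and bottom halves of the cohort by total score, and flags distractors that nobody chose or that stronger students chose more often than weaker ones.

```python
# Minimal distractor analysis sketch; the data layout here is an assumption,
# not taken from any particular testing package.
from collections import Counter

def distractor_analysis(responses, option_labels, key):
    """responses: list of (total_test_score, chosen_option) pairs for one item."""
    ranked = sorted(responses, key=lambda r: r[0], reverse=True)
    half = len(ranked) // 2
    top = Counter(choice for _, choice in ranked[:half])      # stronger students
    bottom = Counter(choice for _, choice in ranked[-half:])  # weaker students
    overall = Counter(choice for _, choice in responses)

    for option in option_labels:
        note = ""
        if option != key and overall[option] == 0:
            note = "  <- never chosen: distractor is not working"
        elif option != key and top[option] > bottom[option]:
            note = "  <- chosen more by stronger students: review the wording"
        print(f"{option}: {overall[option]} overall "
              f"(top half {top[option]}, bottom half {bottom[option]}){note}")

# Example: eight students' total test scores and the option each chose on this item
distractor_analysis(
    [(28, "A"), (25, "A"), (22, "C"), (20, "A"),
     (15, "B"), (12, "C"), (10, "D"), (8, "B")],
    option_labels=["A", "B", "C", "D"],
    key="A",
)
```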
Writing
Multiple choice item structure
• Stimulus: the material the item refers to
• Stem: the question or incomplete statement
• Options: the full set of answer choices
• Key: the correct option
• Distractors: the incorrect options
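To make the anatomy concrete, it can be sketched as a simple data structure. The class below is purely illustrative; the names follow the slide, but the class itself is an assumption rather than part of any real test system. The key is one of the options, and every other option is by definition a distractor.

```python
# Illustrative model of the item anatomy above; names follow the slide,
# but the class itself is an assumption, not part of any real test system.
from dataclasses import dataclass

@dataclass
class MultipleChoiceItem:
    stimulus: str   # the material the item refers to
    stem: str       # the question or incomplete statement
    options: dict   # label -> text, e.g. {"A": "...", "B": "...", ...}
    key: str        # label of the single correct option

    @property
    def distractors(self) -> dict:
        # every option that is not the key is a distractor
        return {label: text for label, text in self.options.items() if label != self.key}

item = MultipleChoiceItem(
    stimulus="(reading passage)",
    stem="How did Joan react when she saw the boat?",
    options={"A": "angrily", "B": "sadly", "C": "calmly", "D": "joyfully"},
    key="A",
)
print(item.distractors)   # {'B': 'sadly', 'C': 'calmly', 'D': 'joyfully'}
```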
Choosing stimulus material

Stimulus material should be:
• Cognitively appropriate
• Not overly familiar (e.g. drawn from textbooks)
• Culturally ‘balanced’ and sensitive/unbiased
• Varied and balanced
• Interesting and stimulating
• Self-contained
Tips for writing MC items
Write items that test a range of skills.
The answer and distractors must be plausible and drawn from the stimulus.
Check that spelling is consistent between the stimulus material and the items.
Features of an ideal item
• Engaging and interesting to student and teacher
• Tests an important learning outcome
• Uses a realistic context that aids the question
• Clear, correct and unambiguous
• Original
• Visually interesting
Good distractors
• Hide the key
• Measure student errors in logic/knowledge
• Consider the likely errors that a less able student could make, but do not aim to trap able students
• Must be plausible
The options must:
• contain ONE and only ONE key that melds together with the other options
• ‘appear’ homogeneous or work in pairs
• contain distractors that ‘work’: resemble the key in length and in linguistic form
• contain distractors that are plausible but incorrect
• contain distractors that do not overlap with, and are not too similar in content/idea to, the key
• NOT contain distractors that are directly opposite to the key (this reduces the number of options and increases the chance score)
• avoid frequency adverbs, e.g. always, only, never
• avoid ‘all of the above’ and ‘none of the above’ options
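Some of these rules are mechanical enough to check automatically. The sketch below is a rough, assumed set of checks rather than an established procedure: it verifies that there is exactly one key, that ‘all/none of the above’ and frequency adverbs are avoided, and that no option is dramatically longer than the rest (the length threshold is an arbitrary choice).

```python
# Rough, illustrative checks against some of the option-writing rules above;
# the thresholds and field names are assumptions.
FREQUENCY_ADVERBS = {"always", "only", "never"}

def check_options(options, keys):
    """options: dict of label -> text; keys: set of labels marked as correct."""
    problems = []
    if len(keys) != 1:
        problems.append("there must be one and only one key")
    texts = [text.lower() for text in options.values()]
    if any("all of the above" in t or "none of the above" in t for t in texts):
        problems.append("avoid 'all of the above' / 'none of the above'")
    if any(word in t.split() for t in texts for word in FREQUENCY_ADVERBS):
        problems.append("avoid frequency adverbs (always, only, never)")
    lengths = [len(t) for t in texts]
    if max(lengths) > 2 * min(lengths):
        problems.append("options differ greatly in length; the key may stand out")
    return problems

# A deliberately faulty option set, to show the checks firing
print(check_options(
    {"A": "a water-filled canal encircling the castle",
     "B": "a gate",
     "C": "none of the above",
     "D": "always the stable"},
    keys={"A"},
))
```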
A problem should be clearly stated in the stem.
On Friday the children went
(A) to the shops. (B) swimming. (C) all silly. (D) with their mother.
Each option must be grammatically consistent with the stem.
How did Joan react when she saw the boat?
(A) angrily (B) sad (C) unfriendly (D) joyful
Avoid ambiguity in the stem and options.
Phil was not able to catch up to Joe because
(A) he was a fast runner. (B) his friend had broken the gate.
(C) he was not up to the challenge. (D) he was able to take longer strides.
The stem must not give away the answer.
In which rooms did the cat search for the cake? (A) the children’s bedroom (B) the kitchen and study (C) the bathroom (D) the laundry
Make sure that each item is independent of other items in the set.
According to the story, what is a “moat”?
(A) a water-filled canal encircling the castle (B) the gateway to the castle (C) the watchtower where the guard sat (D) a stable for the horses
What kind of fish lived in the moat? (A) salmon (B) tuna (C) carp (D) whiting
Make options the same length wherever possible.
Which aspect of the mirror is described in the second paragraph?
(A) how much it had been loved (B) its beauty (C) its size (D) its value
Options should not overlap in their content.
What was displayed in the shop window? (A) jewellery (B) necklaces (C) books (D) videos
Present options in a logical order.
How old was Yurong when she moved to the city? (A) seven years old (B) seventeen years old (C) eight years old (D) two years old
Avoid similarity in wording in the stem and the correct answer.
Cross-country skiing is a healthy pastime because
(A) it gives you a big appetite. (B) there are interesting things to see. (C) it keeps you fit and healthy. (D) it allows you to meet new people.
Review processes
• Internal review: review individual items in hard copy
• External review: items revised based on feedback from external review
• Yellow form review: may include an external reviewer, includes a subject expert
• Blue form review: includes a wider range of reviewers and a subject specialist
• Test development sign-off: test developers and manager

In a school …
• Consider an objective reviewer / critical reviewer who was not involved in the assessment construction.
• Between-teacher and between-school reviews.