Multiple Response Questions
Allowing for chance in authentic assessments
Mhairi McAlpine, Robert Clark Centre, University of Glasgow
Ian Hesketh, TOIA Project, University of Strathclyde
Introduction
• Multiple response questions (MRQs) are a popular method of computer assisted assessment; however, questions are being raised about their reliability.
• This paper looks at how MRQs are implemented in practice, and how this may impact on assessment in Higher Education.
Methodology
• 637 MRQ questions from 65 tests were reviewed.
• We were interested in:
– How big the question guess factors were, particularly compared with other objective formats such as true/false (T/F) and multiple choice (MCQ)
– How this impacted on the test guess factor
– What effect that had on the weightings of the MRQ questions within the test
Results – questions and options
• The examined items had between 1 and 7 keys and between 3 and 18 response options.
• The majority of items clustered between 2 and 5 keys and between 5 and 9 response options.
• The most popular combination of keys and options was 3 from 6. This was the default setting on the software and accounted for almost 40% of the questions examined.
Question chance factors
• Question chance factors peaked at 0.5, which accounted for 57.8% of the questions.
• There was also a smaller peak at 0.4.
• Fewer than 15% of questions had a chance factor lower than that of a standard MCQ (1 key; 4 options).
• Nearly 15% of questions had a chance factor greater than that of a true/false question.
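The slides do not state how the question chance factor was calculated. A minimal sketch, assuming a candidate blindly selects as many options as there are keys and earns one mark per key selected, gives an expected score of keys divided by options. That assumption is at least consistent with the figures above: 3 from 6 gives 0.5, 2 from 5 gives 0.4, and a standard MCQ (1 from 4) gives 0.25. The function name and the sample items are illustrative, not taken from the study.

```python
from fractions import Fraction


def chance_factor(keys: int, options: int) -> Fraction:
    """Expected fraction of marks from blind guessing.

    ASSUMED model (not stated in the source): the candidate picks
    `keys` options at random and earns one mark per key picked,
    so the expected fraction of marks is keys / options.
    """
    if not 0 < keys <= options:
        raise ValueError("need 0 < keys <= options")
    return Fraction(keys, options)


# Reference formats used for comparison in the slides.
MCQ = chance_factor(1, 4)  # standard MCQ: 1 key, 4 options
TF = chance_factor(1, 2)   # true/false: 1 key, 2 options

# Illustrative key/option combinations (hypothetical items).
for keys, options in [(3, 6), (2, 5), (1, 5), (4, 6)]:
    cf = chance_factor(keys, options)
    if cf < MCQ:
        label = "lower than MCQ"
    elif cf > TF:
        label = "greater than T/F"
    else:
        label = "between MCQ and T/F"
    print(f"{keys} from {options}: {float(cf):.2f} ({label})")
```

Under this model the software default of 3 from 6 sits exactly at the true/false level, which would explain why the peak at 0.5 is so pronounced.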
Tests chance factor
• The test chance factors ranged from 0.25 to 0.75 with the majority of the data clustered between 0.34 and 0.6.
• In only one test was the chance factor comparable to that of an MCQ test.
• In over a quarter of tests, the chance factor was greater than in a T/F test.
Impact on Weightings
• In all of the tests examined, each response within an MRQ carried at least one mark, leading to this question type being heavily weighted.
• A high guess factor depresses the discrimination of a question; this in turn depresses the weighting of the question, meaning that its intended weight is not achieved.
Effect of intended weighting on test chance factor
• The chance factors of the questions were weighted by the number of marks that each of them carried.
• Only in one test did this reduce the overall test chance factor; in some cases it increased the test chance factor by 0.05.
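The weighting step described above can be sketched as a marks-weighted mean of the question chance factors. This is an illustration only: it assumes the keys/options model of chance (the slides do not give the formula), and the four-item test below is hypothetical rather than drawn from the 65 tests reviewed.

```python
def overall_chance_factor(items, weighted=True):
    """Mean question chance factor across a test.

    `items` is a list of (keys, options, marks) tuples.
    ASSUMPTION: a question's chance factor is keys / options.
    With weighted=True each question counts in proportion to its marks;
    with weighted=False every question counts equally.
    """
    total = 0.0
    total_weight = 0.0
    for keys, options, marks in items:
        weight = marks if weighted else 1
        total += weight * keys / options
        total_weight += weight
    return total / total_weight


# Hypothetical test: two MCQs worth 1 mark each and
# two MRQs carrying one mark per key.
example_test = [(1, 4, 1), (1, 4, 1), (3, 6, 3), (4, 6, 4)]

unweighted = overall_chance_factor(example_test, weighted=False)
by_marks = overall_chance_factor(example_test, weighted=True)
print(f"unweighted: {unweighted:.3f}, weighted by marks: {by_marks:.3f}")
```

Because the MRQs in this example carry both the most marks and the highest chance factors, weighting by marks raises the test chance factor rather than lowering it.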
Discussion
• The use of MRQs in formative assessment has been demonstrated; however, adjustments may have to be made for them to be a valid form of summative assessment.
• McCabe and Barrett have suggested a formula for calculating the chance factor of an MRQ; this would make explicit how much random variation an author may be introducing.
• This issue becomes more pressing when randomised or adaptive tests are given, where the overall test chance factor may vary from student to student.
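To illustrate how the test chance factor can vary from student to student in a randomised test, the sketch below draws repeated tests from a small item bank and reports the spread. The item bank, the test length, the one-mark-per-key scoring and the keys/options chance model are all assumptions made for the example; none of them comes from the study.

```python
import random


def drawn_test_chance_factor(bank, n_items, rng):
    """Marks-weighted chance factor of one randomly drawn test.

    `bank` is a list of (keys, options) tuples.
    ASSUMPTIONS: chance factor = keys / options, one mark per key.
    """
    items = rng.sample(bank, n_items)
    total_marks = sum(keys for keys, _ in items)
    expected_marks = sum(keys * keys / options for keys, options in items)
    return expected_marks / total_marks


# Hypothetical item bank of (keys, options) pairs.
bank = [(1, 4), (2, 5), (3, 6), (3, 6), (3, 6),
        (4, 6), (2, 8), (5, 7), (1, 5), (3, 9)]

rng = random.Random(0)  # fixed seed so the run is repeatable
factors = [drawn_test_chance_factor(bank, 5, rng) for _ in range(1000)]
print(f"lowest: {min(factors):.2f}, highest: {max(factors):.2f}")
```

Any gap between the lowest and highest values means that two students sitting "the same" randomised test could face different odds of scoring by guesswork alone.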
Recommendations
• It is clear that it is time for a community-based approach to identifying and resolving issues of analysis.
• Further work should be carried out on how the outcomes of computer based questions and tests are handled.
• Statistical approaches to chance calculation and guess correction in new question formats should be developed.
• More effort should be focused on the analysis of tests and items that exploit the advances made in authoring complexity.
Conclusions
• Matrix questions may offer a partial solution to the chance factor problem in MRQs.
• The default software setting has influenced practice; care must be taken that good practice prevails.
• Dissemination of item analysis techniques and test construction methodology must be prioritised within the CAA community.