multiple response questions - allowing for chance in authentic assessments

Multiple Response Questions

Allowing for chance in authentic assessments

Mhairi McAlpine, Robert Clark Centre, University of Glasgow

Ian Hesketh, TOIA Project, University of Strathclyde

Introduction

• Multiple response questions (MRQs) are a popular method of computer-assisted assessment; however, questions are being raised about their reliability.

• This paper looks at how MRQs are implemented in practice, and how this may impact on assessment in Higher Education.

Methodology

• 637 MRQ questions from 65 tests were reviewed.

• We were interested in:

– How big the question guess factors were, particularly compared with other objective formats such as T/F and MCQ

– How this impacted on the test guess factor

– What effect that had on the weightings of the MRQ questions within the test

Results – questions and options

• The examined items had between 1 and 7 keys and between 3 and 18 response options.

• The majority of items had 2 to 5 keys and 5 to 9 response options.

• The most popular combination of keys and options was 3 from 6. This was the default setting on the software and accounted for almost 40% of the questions examined.

Question chance factors

• Question chance factors peaked at 0.5, which accounted for 57.8% of the questions.

• There was also a smaller peak at 0.4

• Fewer than 15% of questions had a chance factor lower than a standard MCQ (1 key; 4 options).

• Nearly 15% of questions had a chance factor greater than a True/False question
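
The slides do not state how a question chance factor is calculated, so the sketch below uses one simple model as an assumption: the student ticks exactly as many options as there are keys, chooses them at random, and earns one mark per key ticked with no penalty. The function name `chance_factor` is hypothetical. Under this assumed model a 1-from-4 MCQ scores 0.25 and a T/F item 0.5, which are the reference points used above.

```python
from fractions import Fraction
from itertools import combinations


def chance_factor(keys: int, options: int) -> Fraction:
    """Expected share of marks from blind guessing on one item.

    Assumed model (not taken from the source): the guesser ticks exactly
    `keys` options at random and earns one mark per key ticked, no penalty.
    """
    if not 0 < keys <= options:
        raise ValueError("need 0 < keys <= options")
    # By symmetry it does not matter which options are the keys.
    key_set = set(range(keys))
    guesses = list(combinations(range(options), keys))
    hits = sum(len(key_set.intersection(guess)) for guess in guesses)
    # Average hits per guess, as a fraction of the maximum score (`keys`).
    return Fraction(hits, keys * len(guesses))


mcq = chance_factor(1, 4)          # standard MCQ: 1 key, 4 options -> 1/4
true_false = chance_factor(1, 2)   # T/F treated as 1 key, 2 options -> 1/2
default_mrq = chance_factor(3, 6)  # the "3 from 6" combination -> 1/2
```

Under this model the enumeration always reduces to keys divided by options; other marking schemes (negative marking, all-or-nothing scoring) would give different values.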

Tests chance factor

• The test chance factors ranged from 0.25 to 0.75, with the majority of the data clustered between 0.34 and 0.6.

• In only one test was the chance factor comparable to that of an MCQ test.

• In over a quarter of tests, the chance factor was greater than in a T/F test.

Impact on Weightings

• In all of the tests examined, each response within an MRQ carried at least one mark, leading to this question type being heavily weighted.

• A high guess factor depresses the discrimination of a question

– This in turn depresses the weighting of the question, meaning that its intended weight is not achieved

Effect of intended weighting on test chance factor

• The chance factors of the questions were weighted by the number of marks that each of them carried.

• Only in one test did this reduce the overall test chance factor

– In some cases it increased the test chance factor by 0.05
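
The mark weighting described above can be sketched as a weighted mean. The function name `weighted_chance_factor` and the three-item test are hypothetical illustrations, not data from the study; the item chance factors are assumed values of the kind discussed earlier.

```python
from fractions import Fraction


def weighted_chance_factor(items):
    """Mark-weighted mean of question chance factors.

    `items` is a list of (chance_factor, marks) pairs.
    """
    total_marks = sum(marks for _, marks in items)
    if total_marks <= 0:
        raise ValueError("the test must carry at least one mark")
    return sum(cf * marks for cf, marks in items) / total_marks


# Hypothetical test: one MCQ, one T/F item, and one MRQ carrying 6 marks.
items = [
    (Fraction(1, 4), 1),  # MCQ, 1 mark
    (Fraction(1, 2), 1),  # T/F, 1 mark
    (Fraction(1, 2), 6),  # MRQ, one mark per response option (assumed)
]

unweighted = sum(cf for cf, _ in items) / len(items)  # 5/12, about 0.417
weighted = weighted_chance_factor(items)              # 15/32, about 0.469
```

In this made-up example the heavily weighted MRQ pulls the test chance factor upwards, because most of the available marks sit on the item that is easiest to guess.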

Discussion

• The use of MRQs in formative assessment has been demonstrated; however, adjustments may have to be made for them to be a valid form of summative assessment.

• McCabe and Barrett have suggested a formula for calculating the chance factor of an MRQ; this would make explicit how much random variation an author may be introducing.

• This issue becomes more pressing when randomised or adaptive tests are given

– The overall test chance factor may then vary from student to student
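
To illustrate how the test chance factor can differ between students in a randomised test, the sketch below enumerates every two-item test that could be drawn from a small item bank and reports the lowest and highest mark-weighted chance factor. The bank, its chance factors and marks, and the function names are all hypothetical.

```python
from fractions import Fraction
from itertools import combinations


def weighted_chance_factor(items):
    """Mark-weighted mean of (chance_factor, marks) pairs."""
    total_marks = sum(marks for _, marks in items)
    return sum(cf * marks for cf, marks in items) / total_marks


def chance_factor_range(bank, test_length):
    """Lowest and highest test chance factor over all possible draws."""
    factors = [
        weighted_chance_factor(draw)
        for draw in combinations(bank, test_length)
    ]
    return min(factors), max(factors)


# Hypothetical item bank: (assumed chance factor, marks).
bank = [
    (Fraction(1, 4), 1),  # MCQ
    (Fraction(1, 2), 1),  # T/F
    (Fraction(1, 2), 6),  # MRQ, 6 marks
    (Fraction(2, 5), 5),  # MRQ, 5 marks
]

low, high = chance_factor_range(bank, test_length=2)  # 3/8 and 1/2
```

With this made-up bank, one student could sit a test with a chance factor of 0.375 and another a test with 0.5, even though both tests are drawn from the same pool.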

Recommendations

• It is clear that it is time for a community-based approach to identifying and resolving issues of analysis.

• Further work should be carried out on how the outcomes of computer based questions and tests are handled.

• Statistical approaches to chance calculation and guess correction in new question formats should be developed.

• More effort should be focused on the analysis of tests and items that exploit the advances made in authoring complexity.

Conclusions

• Matrix questions may offer a partial solution to the chance factor problem in MRQs.

• The default software setting has influenced practice; care must be taken that good practice prevails.

• Dissemination of item analysis techniques and test construction methodology must be prioritised within the CAA community.