Better Judgement-2: Improving assessors’ management of factors affecting their judgement

Final Report 2016

Lead Institution: Flinders University

Partner Institution: The University of Adelaide

Project Leader: Lisa Schmidt

Team Members: Lambert Schuwirth, Maree O’Keefe, Svetlana King

www.flinders.edu.au/better-judgement/

Support for the production of this report has been provided by the Australian Government Office for Learning and Teaching. The views expressed in this report do not necessarily reflect the views of the Australian Government Office for Learning and Teaching.

With the exception of the Commonwealth Coat of Arms, and where otherwise noted, all material presented in this document is provided under a Creative Commons Attribution-ShareAlike 4.0 International License (http://creativecommons.org/licenses/by-sa/4.0/). The details of the relevant licence conditions are available on the Creative Commons website (accessible using the links provided), as is the full legal code for the Creative Commons Attribution-ShareAlike 4.0 International License (http://creativecommons.org/licenses/by-sa/4.0/legalcode).

Requests and inquiries concerning these rights should be addressed to:
Learning and Teaching Support
Student Information and Learning Branch
Higher Education Group
Department of Education and Training
GPO Box 9880, Location code C50MA7
CANBERRA ACT 2601
[email protected]

2016

ISBN 978-1-76028-801-3 [PDF]
ISBN 978-1-76028-802-0 [DOCX]
ISBN 978-1-76028-800-6 [PRINT]

Acknowledgements

The project team would like to acknowledge and thank the following people for their assistance and support of this project:

• Mrs Rebekah Parton and Ms Jennie McCulloch for providing administrative support during the project;

• Tom Young and the Flinders Creations team for producing and directing the videos;

• Academic colleagues, drama students and graduates at Flinders University who contributed their acting talents to the production of the videos;

• Ms Anna Smith, Ms Helen Stephenson, and Professor Andrew Parkin for providing institutional support within Flinders University;

• Our expert reference group for their insight and guidance along the way: Professor Simon Pyke, Professor Kevin Eva, Professor Denise Chalmers and Professor Sally Kift.

List of acronyms used

AMEE: Association for Medical Education in Europe

OLT: Office for Learning and Teaching

Executive summary

Following on from an initial seed project, the Better Judgement-2 project extends and deepens the range of resources available across the sector to assist the management of judgement biases in assessment. Examiner biases in assessment are often viewed as flaws that must be minimised or avoided in order to ensure reliable and valid outcomes. In our Better Judgement project, however, we argue for an opposing perspective: given the limitations of human short-term memory, it is unlikely that biases can be avoided. Instead, biases need to be enriched and developed into judgement ‘scripts’. This position is supported by extensive consideration of the contemporary expertise and decision-making literature.

The outcomes of the Better Judgement-2 project have delivered a broad and versatile suite of staff development resources, available on the project website (http://www.flinders.edu.au/medicine/sites/better-judgement/). Examiners across many disciplines can use these resources to enhance their recognition and management of biases, especially in workplace-based assessment. These biases include: primacy effect, memory errors of commission, recency effect, memory errors of omission, selective perception, confirmation bias, cognitive dissonance, contrast effect and halo effect.

The resources include:

• an introductory videoed presentation

• filmed vignettes

• a listing of YouTube™ resources

• Self-directed Learning Guides, a Facilitator’s Guide and an Academic Developer’s Guide

The introductory videoed presentation on judgement biases and their possible role in assessment was developed to provide users with the necessary foundational knowledge and understanding. This presentation is accompanied by a series of specific short film introductions explaining each of the biases.

The second element of the suite consists of short films depicting scripted fragments of oral examinations in which one of the biases is demonstrated. In these films, actors portray ‘hyper-real’ situations. The purpose of these filmed vignettes is to assist users to recognise the bias in overt, well-defined situations. The enacted vignettes were produced in two versions: one in which thought bubbles and explanatory captions were subsequently added to highlight the actors’ thinking and make the bias explicit, and one without those thought bubbles and captions.

The suite also includes a list of YouTube™ videos that were identified following a comprehensive search. These videos demonstrate examples of the various biases in more realistic situations. This supplemental material is included to stimulate users’ ability to transfer their learning from the filmed vignettes to more natural settings by recognising possible biases in a variety of situations.

A workshop was developed in which the resources described above were embedded. In this activity, participants explored links between the video materials, the YouTube™ examples and their own practical situations. Participants also had the opportunity to exchange bias-management strategies with one another and to consider the likely effectiveness of each strategy. Self-directed Learning Guides were developed to support independent use of the resources. Finally, a Facilitator’s Guide and an Academic Developer’s Guide were developed to support academics in implementing Better Judgement independently in their own institutions.

In the course of the project, 13 workshops were conducted nationally and internationally with a total of 392 participants. Participant feedback after each workshop contributed to ongoing fine-tuning of the resources and of the workshop content and approach. Two hundred and sixty workshop participants (66 per cent) completed the workshop evaluation. In their responses, participants indicated strong support for the value of the workshop and the resource materials. Participants reported the following important learning outcomes from the workshop:

• an awareness that subjectivity is not inherently bad;

• a better understanding of biases and how they might rightly or wrongly influence judgements;

• a feeling of empowerment as an assessor, especially through having gained a ‘language’ to explain their judgements;

• a recognition of the complexity of oral and workplace-based assessment;

• a better understanding of the individual biases and approaches to their management.

Participants further appreciated the applicability and relevance of the workshop content to their own assessment practice. They particularly valued the opportunity to discuss difficulties and issues in groups, and the realisation that they were not alone in recognising these challenges.

Better Judgement was a highly successful project in terms of the comprehensive suite of resources developed, which effectively met a clear need across the sector, the broad interdisciplinary engagement, and the ongoing use of the project resources. At the time of writing this report, the website at www.flinders.edu.au/better-judgement had registered 5,589 views with a total video viewing time of 11,866 minutes.

Table of contents

Acknowledgements
List of acronyms used
Executive summary
Tables and figures
Chapter 1: Why the Better Judgement project was needed and what it delivered
   Aim
   Context
   Approach and methodology
   Project outputs and findings
   How the project used and advanced existing knowledge
   Success factors
   Implementing Better Judgement in your institution
   Linkages
Chapter 2: Impact, dissemination and evaluation
   Impact
   Dissemination
      Dissemination during the production of the resources
      Workshops
   Project evaluation
      Workshop evaluation
      Follow-up evaluation
Appendix A: References
Appendix B: External evaluator’s report
Appendix C: Certification by Deputy Vice-Chancellor

Tables and figures

Tables

Table 1: Effects and biases explored in the Better Judgement project
Table 2: Categories of biases
Table 3: Number of Better Judgement project video views (as at 26/01/16)
Table 4: Better Judgement workshops
Table 5: Better Judgement presentations
Table 6: Breakdown of workshops and number of feedback forms received
Table 7: Responses to the five-item follow-up questionnaire

Figures

No figures were included in the main body of this report.

Chapter 1: Why the Better Judgement project was needed and what it delivered

Aim

The aim of the Better Judgement project was to develop and disseminate a suite of training materials to facilitate improved understanding, recognition and management of judgement biases in oral and practice-based assessment. The intention was not to teach examiners to become neutral, objective and bias-free but, rather, to assist them to become more aware of the effects and biases that may influence their judgements, and to develop strategies to manage these influences. In this way, the effects and biases continue to exist, but the extent to which the examiner allows these factors to influence their judgement is more conscious, adding credibility and defensibility to their judgements. Consequently, the Better Judgement project was designed to improve the quality of assessment practices.

Context

Two major drivers have shaped the current climate of assessment in the Australian context: (i) a greater emphasis on standards associated with higher education regulatory requirements and professional accreditation (e.g., the Higher Education Standards Framework, the Tertiary Education Quality and Standards Agency, and the Australian Health Practitioner Regulation Agency); and (ii) a greater emphasis on graduate attributes such as career and leadership readiness, and communication. These drivers are influenced by the modern discourse in education, in which ideas about standards, outcomes and competence-based education converge with the integration of knowledge, skills, professional attitudes and metacognition (self-regulation, reflection, etc.) as a prerequisite for the successful management of diverse professional situations 1. Integration (as opposed to reductionism) is, therefore, a central concept in this discourse and in the ensuing national drivers in education.

From this development in thinking, it has become apparent that the exclusive use of so-called ‘objective’ assessment methods is misaligned with the need for a more integrative approach to assessment and learning 1,2. Despite the traditionally strong push towards assessing student learning through standardised testing, there is an equally important recognition that assessing students’ application of knowledge and capabilities in practice settings requires more complex, integrative approaches. Furthermore, when education is integrative and seeks to avoid reductionist designs, assessment should be aligned with its goals and principles 3,4. Integrative approaches to assessment typically require some form of oral or human interaction-based assessment (e.g., observation of real-life client or patient consultations in law and health professions education programs; coaching, mentoring and personal feedback situations; and vivas). Consequently, oral and practice-based assessments are increasingly forming part of comprehensive assessment programs.

In an attempt to make such assessments ‘objective’, much attention has been paid to the design and validation of forms, rubrics and checklists in order to decrease human subjectivity. Yet the role of the human assessor remains paramount: rubrics and checklists only support expert opinion rather than replace it.

High-quality or expert human judgement is indispensable in assessment for a number of reasons. Firstly, performance-orientated assessments, such as work sampling 5, practice-based assessment 6, and 360-degree and portfolio assessments 7,8, require human observation, and observation without interpretation (i.e., total objectivity) is not possible. Secondly, high-quality, expert human judgement is needed to integrate information from various parts of assessment tasks to form meaningful conclusions. This judgement is particularly important where assessors need to make decisions regarding a student’s progress, or when they are advising students about activities to further their learning 9.

Oral and practice-based assessment methods invariably involve the observation and evaluation of performance, and therefore rely heavily on the assessor’s assessment expertise. Optimal assessment thus requires that assessors have expertise not only in the discipline in which they are working, but also in assessment itself.

Despite the essential role of the expert assessor in such assessment tasks, these approaches may be labelled as ‘subjective’ in nature and are sometimes regarded as a threat to reliability (and, consequently, to validity). However, there is strong evidence to suggest that human subjectivity is not the most important factor contributing to unreliability in examinations 10. Rather, lack of reliability is mostly attributable to item sampling 11.

In modern views, subjectivity is considered innate to the processes of assessment and evaluation. Indeed, it puts the ‘value’ in evaluation. Yet, at the same time, assessments can still be influenced by judgement biases 12. These biases are not prejudices but, rather, misrepresentations in the assessor’s mind of what occurred during the assessment. Biases arise from the way humans represent information in their limited working memory and can be considered cognitive strategies to store and manipulate information in working memory effectively and efficiently. While these cognitive strategies are useful (and, indeed, unavoidable), they may lead to biased representations and decisions, especially if the examiner is unaware of them and their possible influence. In such cases, they may lead the assessor to involuntarily draw conclusions that s/he might not have drawn otherwise.

At this point, we should emphasise that biases are not necessarily errors, as the term ‘errors’ assumes that there is an external, objective, true student competence. Instead, biases are embedded in all events requiring human judgement and may distract the examiner from his/her original, subjective judgement. For example, if a law student is dressed very poorly for a viva, this may influence the assessor’s judgement of the student’s mooting competence, which may or may not be justifiable. But in no case does this mean that mooting competence is objectively measurable. Unlike unreliability, however, judgement biases cannot be counteracted by improved item sampling or merely by the use of rubrics and checklists. Unchecked, such biases pose a threat to assessment validity and the inferences made in the assessment process 13. Although such cognitive strategies occur naturally and, therefore, cannot be avoided, assessors can be trained to name, recognise and consciously manage possible biases that impact upon their assessment of student performance, especially in practice-based settings.

The management of human biases cannot be achieved using a simple, ‘one-size-fits-all’ approach, as it requires examiners to develop ‘self-diagnostic’ skills to recognise biases in diverse contexts and situations. Training, therefore, requires the integration of basic knowledge with worked examples, well-defined problems, ill-defined problems, and applications to the examiners’ own situations. Such training allows examiners to develop a repertoire of solutions for dealing with possible bias situations. The development of a training suite using such an approach, with a focus on the most frequent biases, was the aim of the Better Judgement project.

The Better Judgement project is a direct extension of the successful Better Judgement-1 project, which was funded by an Office for Learning and Teaching (OLT) Seed Grant (SD12-2276: Better judgement: improving assessors’ management of factors affecting their judgement). The aim of the Better Judgement-1 project was to design, develop and test the assessor training program for two biases (i.e., the primacy effect and memory errors of commission, described in Table 1), with the intent to learn from the experience for the redesign and optimisation of the current project. One salient, encouraging finding from the Better Judgement-1 project was the almost unanimous relief expressed by participants that the issue of bias in assessment was being named and addressed. Participants appreciated learning that they were not expected to be, or become, objective observers in assessment, and recognised that the training was intended to assist them to make more conscious and informed decisions. Workshop participant feedback also indicated a need to develop further resources to address a broader range of judgement biases. The current project was intended to further develop and build on the work of the Better Judgement-1 project to cover a more comprehensive set of biases (see Table 1). In so doing, the current project was also designed to facilitate national and international uptake and use of resources.

Table 1: Effects and biases explored in the Better Judgement project.

Primacy effect: This effect relates to the impact of the first impression in shaping the overall judgement.

Recency effect: This effect relates to the influence of the last impression. It is based on the premise that people best remember the last elements of an experience.

Memory errors of commission: This effect is associated with memory failure and the creation of ‘false’ memories, i.e. thinking that you have done or observed something when in fact it has not happened.

Memory errors of omission: This effect relates to the failure of memory and the forgetting of elements of an experience. Both memory errors (i.e., omission and commission) are based on the premise that memories are not gelled representations of an event but are cognitive re-activations of that event, and are therefore susceptible to change.

Confirmation bias: This bias assumes that people notice information that confirms their thinking rather than information that refutes their opinions, making it difficult for them to change their mind.

Halo effect: This effect relates to the inability of an assessor to separately evaluate different aspects of a person (e.g., attitudes, behaviours and appearances).

Cognitive dissonance: This effect assumes that people sometimes have to entertain two opinions or thoughts that contradict each other, for example when they perform actions that contradict their personal self-perceptions; this must result in a change either in their behaviour or in their thinking.

Contrast effect: Here, the idea is that a recent observation (e.g., a previous candidate’s performance) influences a current observation (e.g., perceptions of the current candidate).

Selective perception: This relates to the impact and influence of assessors’ ‘hobby horses’ or ‘pet interests’ in shaping their overall assessment.

Approach and methodology

The pedagogy used in the Better Judgement project was based on current insights from the cognitive psychology literature on human learning and the development of expertise 14,15. We argue that being able to recognise (diagnose) and manage effects and biases is similar to other diagnostic classification or categorisation tasks, and that such problems are so-called ill-defined problems (i.e., problems with no single correct solution). We therefore consulted the diagnostic expertise literature. This literature indicates that a well-organised knowledge base is critical in enabling people to diagnose and solve problems and to deal with complex, problematic situations 16-18. The training program therefore utilised various instructional approaches, including:

• knowledge-building methods;

• knowledge transfer methods (transfer being the ability to recognise similarities between two seemingly different but related problems) 17-19;

• worked examples with gradually decreased scaffolding;

• exercises to support the application of knowledge to participants’ own contexts and thereby improve storage, retention and retrieval of new information 20.

A series of videos was developed, comprising short introductory lectures and hyper-real scenarios. These scenarios were designed to portray overt, clear demonstrations of the effects and biases. One version of each scenario included thought bubbles to indicate the actors’ thoughts and explanations of the situations. In addition, a series of real-life scenarios was sourced from YouTube™ to assist participants to recognise and identify biases in more realistic, less obvious situations. Both the hyper-real and the real-life scenarios were developed and selected, respectively, to cater to different disciplines in (tertiary) education and diverse contexts, with a view to facilitating sector-wide impact. To further support participant learning, activities were developed to promote application to participants’ own disciplines and assessment contexts.

The approach used in the Better Judgement training program was designed to:

• Increase understanding of the nature of human biases and the difference between subjectivity and bias (knowledge and insight);

• Improve recognition of a range of judgement biases in action (understanding on the basis of video exemplars);

• Improve recognition of biases in various situations and support the development of strategies to manage their impact (exercising knowledge in various situations);

• Apply knowledge and strategies to a personal context (exercising knowledge in active learning situations);

• Facilitate action planning to improve assessment practices in participants’ own institutions.

In order to broaden the target audience, increase involvement, and ensure longevity of the project outcomes beyond the life of the project, a second set of materials was developed. A series of Self-directed Learning Guides was developed to enable assessors to learn about, and manage, effects and biases without having to rely on the availability of facilitated workshops. These are self-paced online modules. The guides were developed, evaluated, and redesigned with assistance from nine workshop participants who volunteered to trial them and provide feedback. A Facilitator’s Guide was also developed to enable assessors to develop training programs that are tailored to their own educational contexts and disciplines. See the website for the Self-directed Learning Guides.

Finally, two groups were formed in two states (Western Australia and South Australia) from workshop attendees who volunteered to help the project team trial materials and provide feedback, and to become state-based leads in further developing and promoting Better Judgement training in their own states, with a view to further increasing commitment and ownership beyond the project team.

Project evaluation procedures involved:

• Feedback forms including open-ended questions to evaluate participants’ learning, uptake and usefulness of the training;

• Number of website visits to evaluate the spontaneous use of the project resources;

• Follow-up email and telephone surveys to evaluate the impact of the project after workshop attendance.

Project outputs and findings

The following deliverables and the associated resources have been developed (as specified in the original grant application):

• An introductory video overview (initially developed during the Better Judgement-1 project) which describes the role of biases in assessment and establishes the context for the training program (www.flinders.edu.au/medicine/sites/better-judgement/biases/overview.cfm). The video provides basic information about assessment and judgement, and assists assessors to become aware of the need to manage their biases. Workshop participants were encouraged to watch this video before the workshop by way of preparation.

• Modules for each of the nine biases, including:

o A short video to explain the bias

o Two video vignettes depicting a hyper-real situation (with and without a commentary) to enable assessors to recognise the bias in an overt prototypical situation (well-defined problem-solving exercise)

o A collection of YouTube™ videos depicting various real-life situations in which assessors learn to recognise the bias in less obvious situations (ill-defined problem-solving exercise)

o Written assessment scenarios, produced by workshop participants, that identify the bias occurring in a given context (linking with real situations in the Australian context)

o A compendium of strategies that can be used to manage the biases

This modular format was deliberately chosen to optimally enable end users to adapt the training materials to suit the needs of their local context. NB: resources for the primacy effect and memory errors of commission were developed during the Better Judgement-1 project.

• Workshops facilitated by the project team for academics and potential adopters, based both nationally and internationally

• A Facilitator’s Guide for adopters to run workshops utilising the training materials in their own settings. This guide was designed to stimulate ongoing use of the Better Judgement materials beyond the life of the project and was developed with input from the reference group and workshop participants.

• A series of Self-directed Learning Guides for academics who wish to work through the materials independently. These guides navigate users to the resources on the Better Judgement website by outlining the project, providing an overview of biases in assessment, describing the training materials and how to use them, and including a series of reflective questions that prompt users to understand and apply what they have learned to their own context.

• An Academic Developer’s Guide to assist implementation of the Better Judgement training in other institutions (see the website). This resource complements the Facilitator’s Guide by providing more background for those looking to lead the implementation of Better Judgement within their institution. It covers alternative modes of delivery and provides guidance on how to develop further resources.

• An enhanced Better Judgement website which includes materials that have been developed during the Better Judgement-2 project.

Including a wiki-type structure as a platform for ongoing collaboration between individuals and groups that use and produce Better Judgement materials and resources was briefly considered but not pursued further. Following initial scoping of interest, it became clear that it would be difficult to sustain a wiki with sufficient momentum to warrant the time and resource investment. Instead, participants were invited to share any self-constructed resources with the project team, who would subsequently host these resources on the Better Judgement website and thus make them widely available.

The entire Better Judgement training package and its resources are now publicly available on the project’s Flinders University-hosted website (www.flinders.edu.au/better-judgement), and all videos are available on YouTube™. The website includes not only materials for those who attended workshops, but also the downloadable Self-directed Learning Guides. These resources have been designed with transferability in mind, to facilitate uptake by assessors from various disciplines and institutions. The targeted potential adopters are coordinators of degree programs with oral and practice-based assessment components.

In designing the resources, every effort was made to ensure wide applicability across different contexts over time. The scenarios portrayed in the videos were deliberately taken from different disciplinary contexts: history, English literature, geography, accounting, psychometrics, law, cognitive psychology, music, and English grammar. The actors who featured in these vignettes were also chosen to represent diversity in terms of, for example, age, culture, and gender. Dress and locations were chosen to be relatively time-independent so that the videos would not date quickly.

How the project used and advanced existing knowledge

The Better Judgement project utilised three important areas of existing knowledge: workplace-based assessment; human decision making and cognitive psychology; and expertise development. The project was based on the premise that observation-based and workplace-based assessments make an important contribution to comprehensive assessment programs in many professional education contexts 3,21. Historically, the pursuit of objectivity dominated the thinking of assessment developers: the notion that competence is an objectively measurable trait which is independent of the observer. This led to the design of different assessment instruments, including highly structured checklists and rating scales. In various disciplines, however, and to varying degrees, it is understood that competence cannot – and perhaps should not – be seen as an objective concept but, rather, always involves subjective judgement 22. From this perspective, subjective human judgement is inescapable in assessment, as any form of assessment is an evaluative activity; subjective assessments based on human judgement can therefore certainly be reliable and valid 10.

The literatures on heuristics and biases and on naturalistic decision making present different perspectives on human judgement 12. From a heuristics and biases perspective, the focus of the judgement is considered an objectively measurable truth, subjectivity is to be abhorred and avoided, and judgement biases are, by definition, bad and serve to distort the assessment process. Conversely, naturalistic decision-making theories seek to explain how and why human judgement can prove beneficial in time- and information-poor situations, rather than focusing on explaining the imperfection of human judgement 23,24. Cognitive load theory 25 represents another perspective on human judgement. Central to this theory is the notion that short-term memory capacity is extremely limited, both in terms of the quantity of information and retention time. In order to overcome these limitations, humans employ strategies to optimise the way they work within these constraints, reducing their cognitive load. In the Better Judgement project, judgement biases are considered cognitive load reduction strategies. In summary, the project team took the view that biases are ever-present and unavoidable, and are needed to overcome human limitations such as limited short-term memory. No examiner is able to store an entire oral examination in their short-term memory for processing and, therefore, no examiner can avoid utilising biases or be trained to observe without them.

A dominant view in the expertise literature is that expertise develops through the formation of scripts as solutions to complex problems 26,27. Here, people begin with isolated facts that form simple semantic networks, which subsequently aggregate into full problem scripts and ‘instance’ scripts, in which a problem and its solution are almost unconsciously recognised. From the literature, it is also clear that the development of expertise in ‘diagnosing incompetence’ is quite similar to the development of other diagnostic expertise (e.g., in medicine) 28,29. Expertise plays an important role in the reliability and validity of human judgements. After all, a single well-founded expert diagnosis is generally not deemed inferior to a random sample of opinions. In healthcare, for example, it is common to accept a single, and sometimes a second, expert opinion rather than a combined random guess of a hundred less-informed people. The central question, then, is how to help examiners develop this diagnostic expertise. One element of this is to support examiners to transform their biases (as unripe scripts) into rich ‘diagnostic’ scripts.

Diagnosing intangible concepts (e.g., competence) is a complex task and a ‘one-size-fits-all’ approach is rarely effective. The project team, therefore, sought to design a training package informed by cognitive psychological and educational notions of developing such forms of complex expertise. The following educational design principles were used in developing the Better Judgement project:

• Ensure that the basic knowledge for understanding is present 17 and design interventions so that the requisite basic knowledge is conveyed simply and at a suitable pace. In the Better Judgement project, this was achieved by the presentation of videoed mini-lectures and explanations.

• Present worked examples (i.e., well-defined problems) in which concepts are demonstrated and strategies for identification (‘diagnosis’) are simple 30. In the Better Judgement project, this was achieved by portraying particular biases in the video vignettes depicting oral exam situations (with and without explanations).

• Present real-life examples with reduced scaffolding to enable the learner to recognise or identify the concept in a less well-defined context, stimulating near transfer 30. In the Better Judgement project, users were presented with a series of YouTube™ videos portraying real-life situations. Minimal scaffolding was provided in the form of brief explanations indicating the type of bias that was likely being demonstrated.

• Link learned material to personal experience to support learners to understand deep structures of the problem. This facilitates active learning which enhances storage and retrieval and further stimulates far transfer 15. In the Better Judgement project, this was addressed in both the workshops and self-directed guides, by inviting participants to identify and describe a personal situation in which they believed that bias may have played a role in shaping their judgement.

Several experiences during this project have advanced our understanding of, and insights into, judgement and biases, and ways to help the examiner manage them. Firstly, the literature tends to describe biases as discrete and separate (for an overview cf. 12). From the development of the project materials and through interactions with workshop participants, it became clear that these distinctions are much less evident in reality. The bias categories shown in Table 2 were identified and used. These insights informed the further development of the workshops and resource materials, which aided in better articulating the project and its goals. Nonetheless, more research is warranted in order to better understand this clustering.

Table 2: Categories of biases

Storage information errors: Here, the examiner encounters difficulties in remembering information, either through lack of access to all information or through ‘over-complete’ information about an event. This category includes memory errors of omission (i.e., forgetting) and memory errors of commission (i.e., introducing false memories).

Errors in weighting information: In these situations, information may be remembered, but the elements of the information are weighted differently. Examples include the primacy effect, recency effect, and selective perception.

Errors in interpretation: Here, all information may be available and appropriately weighted, but it is interpreted from a particular perspective. This category includes the halo effect, cognitive dissonance, and the contrast effect.

Errors caused by emotional responses: While emotions do not constitute a ‘bias’ in the original sense, emotions such as boredom, irritation, and fatigue were recognised as playing an important role in judgement.

Secondly, it appears that the suggested bias management strategies are linked with the categories of biases described above. Generally, recording methods (e.g., transcription, note-taking, and video/audio recording) were most intuitively connected with memory failures; rubrics and checklists were best linked with weighting biases; cognitive measures were best aligned with interpretation errors; and emotional responses generally required organisational approaches. Having said this, the bias categories and management strategies by no means represent a simple 1:1 relationship. While these findings were helpful in explaining the biases and management approaches to participants, additional research is required to support further educational development.

Thirdly, it was evident that participants felt empowered by learning the language of biases and by constructing a narrative to describe and understand them, with a view to enabling more conscious decision making in observation-based assessment. Most participants came from fields or disciplines in which assessment typically uses an objective measurement approach and in which subjectivity is seen as unreliable or as lacking validity. Participants appreciated that the Better Judgement training package was not about training them to be more ‘objective’ observers but, rather, about helping them to better understand and manage their own subjectivities.

Success factors

The Better Judgement project has been successful in delivering the outputs described in the original project proposal. The project has generated interest and uptake by users, not only in Australia but also internationally. The successful delivery of the project’s outcomes can be attributed to several process- and content-related factors.

Process-related factors

The Better Judgement project utilised an educational approach to training that is based on cognitive approaches to the development of expertise 15,18. According to the literature, training programs are essential to facilitate the rapid transformation from novice to expert assessor. The components of such training programs were described in the previous section.

We identified ongoing active stakeholder involvement throughout the development of the project materials and resources as a key process-related success factor. Involvement of stakeholders occurred at various levels:

• The videos developed during the project featured academics from various disciplines in the examiner roles. Students and academics assisted in the development phase by ‘workshopping’ the video scripts to ensure that the narratives were both authentic and credible. This was done for two reasons: (i) to improve the quality of the material; and (ii) to increase the involvement of the academic community beyond the project team.

• Workshop participants made invaluable contributions to the ongoing development of the project by offering ideas and suggestions that informed the development of the resources, and by providing feedback that was used to refine and improve workshop delivery.

• The ongoing involvement of an expert reference group aided in refining the project’s outcomes, contributed to the development of Better Judgement resources, and supported the dissemination of project materials.

• Institutional support provided by OLT staff played a critical role in facilitating high-level organisational matters such as contracts and financial management. Open and supportive communication with the OLT, and the ability to exercise flexibility regarding the project’s boundaries, have also been extremely important factors contributing to the success of this project.

This extensive and rigorous stakeholder involvement and feedback, together with structured feedback from all participants, was used in an ongoing cycle of improvement and redesign. It also served an important function in the final evaluation of the project materials.

Content-related factors

The theoretical foundations of the Better Judgement project have been described throughout this report. Yet a firm theoretical foundation alone is not enough to make an education-focused project effective; a careful translation-to-practice process must also be undertaken. An effective education-focused project requires:

• An educational design team with a good working knowledge of relevant theories and an understanding of the meaningfulness of the project outcomes;

• An educational design team that is able to translate ‘in-vitro’ theories into ‘in-vivo’ notions;

• An educational design team with experience in educational design and a good working knowledge of the practicalities of education;

• An ongoing process of quality assurance and quality improvement cycles;

• Ideally, a research program vis-à-vis the project’s implementation in order to generate new knowledge which can subsequently be used to further (re)design the educational program.

All of these criteria were met in the Better Judgement project. The project team had a good working knowledge of the theories that were used in the project. The project team and the expert reference group include scholars who have published extensively in the fields of assessment, workplace-based assessment, expertise development, diagnostic reasoning, and decision making. In addition, the team had extensive experience as educational developers, designers and teachers, facilitating the translation from theory to practice.

The data collected for evaluation purposes improved our understanding of the difference between the occurrence of effects (e.g., a primacy or recency effect) and the extent to which such an effect constitutes either a judgement bias or a cognitive strategy. The differences between effects and biases remain unclear, but research is currently being conducted by the project team to better understand this. While this research was not part of the project proposal and can be seen as an entirely separate process from the Better Judgement project, there are links with the project. Firstly, the publication of research findings is a means by which to promote further uptake of the project by alerting users to the resources. Secondly, the research outcomes will have implications for future redesign of the materials, including the Facilitator’s and Academic Developer’s Guides.

The integrity of the processes involved in integrating knowledge and understanding of theories with practical knowledge and experience of educational design, together with the approach to quality assurance and quality improvement vis-à-vis an applied research program, constitutes a significant strength of this project.

Factors that impeded success

The Better Judgement project has met and exceeded the project team’s expectations in relation to outputs, engagement and impact. However, some factors that affected progress should be noted. Firstly, and most importantly, the project team sought to develop a project that would be broadly applicable. Although it can be assumed that observation- and workplace-based assessment problems occur across many disciplines, it is impossible to build a project team with members from all possible disciplines, all with a deep working knowledge of assessment issues. The project was, therefore, based on the assumption that the identified assessment issues, and the approach to remedying them, are generalisable.

Although workshop attendance was based on self-selection and participants’ backgrounds were reasonably diverse, attendance was dominated by participants from healthcare-related disciplines. Despite attempts to be broad (e.g., by developing video examples from different disciplines), the workshops did not capture a similar breadth. For example, there were no workshop participants from geography, history, economics or accounting, and psychology, law and language teachers represented a minority. Consequently, the project had limited strategies for incorporating involvement by examiners from additional disciplines, and a reliance on the established networks of the project team members remained an important factor in engaging participants.

Implementing Better Judgement in your institution

The Self-directed Learning Guides and the Facilitator’s Guide have been described above. They were developed specifically to support academics who want to use or further develop the Better Judgement materials in their own institution. The Academic Developer’s Guide explicitly includes guidance on how to produce new training materials.

Adding resources to the website is open to all those who choose to be involved in the Better Judgement training and who produce additional material. We have made it explicit to all involved that we are happy to host on our website any good-quality material that users want to share.

Linkages

Various approaches were used to ensure broad linkage of the project, both between and across disciplines and with other projects.

Disciplinary and interdisciplinary linkages

Firstly, because workshop participants were overwhelmingly from the health professions, the project team decided that the best strategy would be to focus on this core group of early adopters and to involve them as leaders in their institutions and/or states for further development and dissemination of the project. As previously discussed, two groups were established: the first involved users from Western Australia (from the University of Western Australia and Curtin University), while the second, from the University of Adelaide in South Australia, was involved in dentistry education. Secondly, the videos produced during this project were deliberately situated in a range of disciplines in order to facilitate uptake across disciplines. In addition (as previously discussed), teaching academics from various disciplines acted as the examiners in the videos, including staff from law, engineering, chemistry, and medicine. Finally, workshops were regularly promoted using university-wide mailing lists, the Better Judgement blog, and the ‘Events’ page of the Better Judgement website.

Linkages with other projects

The project builds on the experiences of the Better Judgement-1 project, which can be seen as a pilot for this Better Judgement-2 project. An important concept underpinning the Better Judgement projects is the notion of building teacher/examiner expertise, as already flagged in Professor Orrell’s Good Practice Report on work-integrated learning and assessment. The Better Judgement projects sought to develop resources that would enable faster development of the expertise needed, especially in workplace-based or work-integrated assessment contexts 31.

In doing this, we also built on the concepts set out in the Assessment Futures report 32. Not only did we acknowledge that assessment is a human judgement activity with all its advantages and disadvantages, but we also recognised that it requires a specific expertise with the appropriate narrative. Such a narrative is needed to support assessors in forming their own decisions and in explaining and defending those decisions to other stakeholders. In addition, assessors need to be able to explain their decisions to students as part of a conversation in an assessment-for-learning context. So, this narrative not only assists by empowering the assessor to make defensible and credible certifying decisions but, more importantly, is also needed to optimise an educational role for assessment (assessment-for-learning).

Chapter 2: Impact, dissemination and evaluation

Impact

As discussed, the videos that formed part of the project training materials are available on YouTube™. One measure of the project’s impact was the number of views each video received (see Table 3).

Table 3: Number of Better Judgement project video views (as at 26/01/16).

Video | Background | Exemplar | Exemplar (with explanation) | Review
Overview | 681 | - | - | -
Primacy effect | 1,062 | 2,459 | 227 | 0
Memory errors of commission | 152 | 100 | 258 | 0
Recency effect | 25 | 21 | 112 | 8
Contrast effect | 40 | 26 | 31 | 3
Memory errors of omission | 11 | 9 | 5 | -
Confirmation bias | 22 | 22 | 19 | 3
Selective perception | 26 | 23 | 27 | 6
Halo effect | 31 | 28 | 27 | 14
Cognitive dissonance | 26 | 23 | 27 | 6

The total number of views was 5,589, with a total video watching time of 11,866 minutes. Of these views, 85 per cent were accessed directly via YouTube™ and 15 per cent through the Better Judgement website. The top geographical regions of origin of the viewers (by watching time) were: (i) Australia (32 per cent); (ii) United States (21 per cent); (iii) United Kingdom (7 per cent); (iv) Canada (3.7 per cent); and (v) the Netherlands (3.5 per cent). In addition, there were views from: India, Russia, Lithuania, Norway, Sweden, Indonesia, Thailand, Cambodia, Vietnam, New Zealand, Fiji, Philippines, South Africa, Zimbabwe, Mozambique, Namibia, Tanzania, Kenya, Nigeria, Ghana, Algeria, Morocco, Egypt, Saudi Arabia, Oman, Israel, Iraq, Afghanistan, Pakistan, South Korea, Japan, Kazakhstan, Ukraine, Estonia, Finland, Denmark, Ireland, Poland, Romania, Slovenia, Croatia, Italy, Bosnia, Albania, Greece, Turkey, Azerbaijan, Georgia, Bulgaria, Serbia, Hungary, France, Spain, Iceland, Mexico, and Panama.

Dissemination

Dissemination during the production of the resources

The original dissemination plan for the Better Judgement project was developed according to the D-Cubed guidelines 33. The timeline was refined throughout the project as new opportunities for dissemination arose. An example of this is the involvement of users such as the group in Western Australia. Members of this group assisted in trialling the Self-directed Learning Guides and providing feedback on their content, clarity, usefulness, and time investment. These members were then equipped to promote the project within their own professional networks, thus extending the reach of the core project team.

Training materials were developed in consultation with the reference group during the Better Judgement-1 project and up until the first workshop. A core activity during this establishment and development phase involved writing, trialling, revising and filming the video vignettes. Scripts were written by the project team and trialled with two students, who provided feedback on the appropriateness of the language relative to the student voice. The scripts were then workshopped with the actors who played the roles of the students. This resulted in further modification of the scripts and also provided an opportunity for the actors to rehearse their roles.

The project workshops involved simultaneous and ongoing education, evaluation and dissemination, with feedback from each workshop informing redesign of the materials. This redesign did not involve major changes but, rather, minor improvements. One example was the suggestion by participants to better clarify the difference between the effects themselves (e.g., the primacy effect and the contrast effect) and the extent to which these effects can be seen as supporting or hindering a good judgement in assessment; that is, to better distinguish between an effect as an unavoidable aspect of human judgement and the influence it has on the outcome of the judgement.

Workshops

A total of 13 workshops were conducted during the Better Judgement project. The details of these workshops, including the number of attendees, are provided in Table 4.

The project team has also confirmed future commitments to present at the Ottawa/ANZAHPE Conference (March 2016) and the Higher Education Research and Development Society of Australasia Conference (July 2016).

The mailing list established during the Better Judgement-1 project was extended to support the communication and dissemination strategy. Individuals on this mailing list received regular updates on the outcomes of each phase of the project. In addition, a Better Judgement blog was established to enable interested individuals to subscribe and receive news about the project.

Where possible, information about the Better Judgement project was disseminated in conjunction with key conferences, which were often cross-disciplinary (e.g., the Higher Education Research Group of Adelaide (HERGA) Conference). This was designed to maximise the size and diversity of the audience, with a view to increasing uptake by end users. Table 5 provides an overview of the presentations given as part of conferences and other events in addition to the workshops listed in Table 4.


Table 4: Better Judgement workshops.

Date | Location | Bias(es) covered | Number of Participants
August 2014 | Association for Medical Education in Europe (AMEE) Conference, Italy | Primacy effect; Memory errors of commission | 30
December 2014 | University of Western Australia | Cognitive dissonance; Contrast effect | 32
January 2015 | Tabor Christian College | Halo effect | 8
March 2015 | The University of Adelaide | Recency effect; Selective perception | 30
March 2015 | Australia and New Zealand Association for Health Professional Education (ANZAHPE) Conference, Newcastle | Halo effect | 80
June 2015 | Tabor Christian College | Confirmation bias | 14
June 2015 | The University of Adelaide | Selective perception | 29
August 2015 | Monash University | Confirmation bias; Halo effect | 9
September 2015 | University of Queensland | Halo effect; Memory errors of omission | 23
November 2015 | Charles Darwin University | Halo effect; Recency effect | 37
January 2016 | Asia Pacific Medical Education Conference, Singapore | Confirmation bias | 70
February 2016 | Australian National University | Contrast effect; Cognitive dissonance | 30
March 2016 | University of Technology, Sydney | Confirmation bias; Selective perception | 14

Table 5: Better Judgement presentations.

Date | Event | Location
April 2014 | Postgraduate Education Topic Presentation: Assessment & Evaluation in the Graduate Certificate in Education (Higher Education) | Flinders University
September 2015 | Higher Education Research Group of Adelaide (HERGA) Conference (presentation) | The University of Adelaide
November 2015 | Seminar | Maastricht University, the Netherlands
November 2015 | Society for Research in Higher Education (SRHE) Conference (poster presentation) | Newport, South Wales
March 2016 | Ottawa conference (presentation) | Perth


Sustainability has been supported by the commitment of Flinders University to hosting the webpage for a minimum of five years beyond the project (as specified in the original grant application). In addition, Dr Schmidt and Professor Schuwirth will continue to give presentations after the project's conclusion, with travel funded by the requesting institution.

Project evaluation

The Better Judgement project was evaluated from a number of different perspectives. Firstly, participants provided feedback following attendance at workshops. Secondly, workshop participants were invited to participate in a follow-up evaluation. Finally, the project was evaluated by an external evaluator (refer to Appendix B). Evaluation was completed in February 2016 based on data collected up to that time.

Workshop evaluation

In total, 260 of 392 participants (66.3 per cent) provided feedback on the workshops via the feedback form (see the Facilitator's Guide on the website for an example of the form). Table 6 provides a breakdown of the total number of forms by effect/bias.

Table 6: Breakdown of workshops and number of feedback forms received.

Effect/Bias | Number of workshops | Number of participant feedback forms
Contrast effect | 1 | 23
Halo effect | 2 | 48
Selective perception | 2 | 35
Confirmation bias | 2 | 16
Cognitive dissonance | 1 | 23
Recency effect | 2 | 38
Memory errors of omission | 1 | 19
Memory errors of commission | 1 | 29
Primacy effect | 1 | 29

Feedback forms largely contained open-ended questions, enabling participants to describe their perceptions of the value of the workshop and to suggest further improvements to workshop delivery. A qualitative thematic analysis of the responses was conducted. A summary of the feedback is presented below.

Theme 1: Learning Gained

Participants indicated an overwhelmingly positive response to the workshops. All respondents reported the development of new insights following their attendance, even where they were previously aware of the concept of human bias. For example: Although I was aware of most of these ‘psychological’ concepts, I hadn’t really pulled them together into a framework of bias in assessment practice.

Learning gained: Heightened Awareness

Participants indicated that the knowledge of the concepts gained from workshop attendance had heightened their awareness of biases. Many explained that in future assessment situations, they would be more attuned to the impact of their biases. For example, one participant stated: I will need to think about being more upfront about my biases and check that I am being fair and reasonable.

Learning gained: Empowerment with Language

A key area of learning identified in workshop feedback was the empowerment experienced by participants after learning the language of biases and understanding that subjectivity is not necessarily a bad thing in assessment. As one participant explained: It’s helped me to name and put a framework around something that I know I have struggled with in the past, and also helped me to better articulate and be aware of the implications on bias and assessment.

Another participant stated: As a novice assessor, I now have a term on which to anchor this bias and an opportunity to reflect upon my contribution to this interaction.

Learning the language of human bias was not only perceived to be of benefit to the assessment of student performance, but also to support professional development in broader assessment contexts, as this participant explained: …gives me some more language around assessor biases to use when supporting educators and educating them about quality clinical supervision.

Learning gained: Recognition of Complexity

Feedback from workshop participants indicated an acknowledgement of the complexity of human biases, and the multiple factors that can influence judgement. As one participant explained, in relation to the contrast effect: Many factors like fatigue, bias, use of prompts, rubrics to guide questions and expected responses – can affect judgement.

This complexity was also noted by participants who recognised that biases, although categorised into clusters, are not easily distinguishable from one another in the assessment context. Rather, there is a dynamic interaction between the various biases. For example, one participant explained that the halo effect and primacy effect are “closely related”, making it “tricky to differentiate”. Similarly, another respondent noted that the recency effect can be shaped by issues of memory failure: … the information in memory can cause us to weight and perceive behaviours differently.


Learning gained: Acknowledgement and Management of Biases

Central to the Better Judgement project was the notion that human biases are not necessarily beneficial or detrimental, and that subjectivity is an accepted component of any assessment. This was reflected in the workshop feedback:

I like to use my intuitive judgement and am delighted to learn that it is not such a bad thing!

I had always thought of the halo effect as having a positive influence but now understand that it can also be negative …

This is not necessarily a bad thing – it requires close scrutiny and caution around assumptions.

Rather than seeing biases as good or bad, it is the acknowledgement and management of biases that is important:

It happens! It may not be something we can change but we need to be aware of it.

Bias is always there – one needs to recognise it, be mindful and know how to deal with it.

Many participants recognised the need to incorporate strategies into their assessment practices to ensure assessment outcomes are equitable and fair for all students. Strategies that were cited include: video and audio recording and note-taking to aid with recall, incorporating breaks from marking, and reducing marking loads. Another strategy identified by participants was the need to assess the student’s performance in its entirety to mitigate the impact of biases.

A key learning that emerged from the workshop feedback was the importance of engaging in self-reflection as part of the assessment process. As one respondent explained, you need to: Trust your instincts to make judgements but … have a metacognitive dialogue to check for biases and correct where required.

Another participant indicated this as an area for change in personal assessment practices: I will be far more self-reflective about how I may be experiencing the recency effect and/or selective perception in my interviews, and try not to let them affect my final judgement.

Theme 2: Applicability and Relevance

Workshop feedback was extremely positive in relation to the value, applicability and relevance of the training. As one participant explained: This is the first time I have been formally speaking about that issue which is somewhat fundamental.

The fundamental nature of the content was also demonstrated through a recommendation made by respondents that the training should become a component of all assessor training programs. As one participant explained: Assessors require orientation to the role in order to aim to provide consistency in approach.

The notion that bias is a central concept was also reflected in participants’ acknowledgement of its versatile application, both across different types of assessment (e.g., written and oral-based assessments) and different contexts (e.g., disciplines and teaching settings). For example, one participant noted: I can use it in any class/student interaction when discussing this learning.

It was further recognised that the concepts presented were also beneficial beyond the teaching and learning context, taking on an important role in research: As a qualitative researcher, I always practice examining my subjectivities and how it shapes my research and its outcomes. I apply this to my T&L [teaching and learning] practice. I like the fact that this workshop has allowed me to apply this understanding to the assessment context in a “conscious” manner.

Some participants reported that the training validated concerns they had encountered in the assessment context. For some, workshop attendance also prompted a renewed desire for professional development. As one participant explained: It re-enthused me in this space. Often we, as academics, get busy doing and lose focus on developing better practices.

Theme 3: Workshop Discussions

A particularly valued aspect of the workshop for participants was the opportunity for interaction, discussion and knowledge sharing. This was seen as integral to the workshop’s effective organisation and structure. As one participant explained: It was affirming to hear other educators’ experience and understand [that] bias exists everywhere.

The video vignettes supported participants’ understanding by providing a framework for the development of rigorous discussion around assessment. Discussions were also supported by the knowledge and facilitation of the workshop presenters, which was cited as a key strength by participants.

Central to workshop discussions was an acknowledgement of the importance of drawing upon participants’ own knowledge and experience. Through these discussions, participants developed a greater appreciation of how issues of human bias are reflected across multiple disciplines. This was also reflected in the interdisciplinary approach taken to workshop delivery, which participants themselves identified.

The applied nature of Better Judgement workshops was also recognised by participants. In particular, participants appreciated the integration of concepts with personal practical application.


Theme 4: Anticipated Changes to Assessment Practices

Participant feedback revealed a number of significant areas of change as a result of attending the training. These changes related to developing assessment practices at both an individual and a collegial level.

At an individual level, participants cited personal changes they would make to their existing assessment practices. This included: visiting the Better Judgement website to further develop their knowledge and understanding of biases (n = 23); reviewing current assessment practices such as assessment criteria, weightings, tasks and frameworks; reflecting upon approaches to feedback (n = 48); and implementing management strategies (n = 19).

From a collegial perspective, participants recognised the importance of collaborating with other assessors. This included engaging in discussions with colleagues about biases (n = 27) and incorporating the Better Judgement training into existing professional development programs (n = 15).

Theme 5: Effect/bias-specific perspectives

In general, the findings above applied regardless of the specific effect/bias covered in the workshops. However, some participants’ perspectives were specific to the effect/bias covered in their workshop. The most salient of these views were:

Contrast Effect

Most participants (78.3 per cent) acknowledged that the workshop heightened their awareness of the contrast effect’s impact on judgement in assessment situations. As one participant explained: I learnt about the contrast effect and its role when assessing many students back to back. To be aware of internal dialogue surrounding comparison of students and the role that fatigue may play in assessment.

Nearly half of respondents (47.8 per cent) recognised the need to employ strategies to ensure that the contrast effect is appropriately managed in the assessment context. Participants cited organisational strategies such as scheduling breaks between assessments, and ensuring that there are not too many students scheduled in any given assessment block.

Cognitive Dissonance

All respondents reported an increased understanding and a need to be aware of cognitive dissonance in assessment practices. Some participants provided their own definition of the bias. For example, one participant stated that cognitive dissonance occurs when: Competing opinions have an effect and there is a need to consider this in giving a mark or grade. (Being aware of my biasness)


Recency Effect

The majority of respondents (89.5 per cent) found that the workshop assisted them to better understand the potential impact that the recency effect can have on overall judgement in assessment contexts. As one participant explained: Some people start off strong and may have a few weak areas. Depending on the conclusion, it may affect judgement.

Participants also realised the important role of the examiner in managing the recency effect, whereby encouragement and prompting can influence the strength of students’ responses. As one respondent suggested, the examiner plays an important role “…to assist the student to pull knowledge together.”

Selective Perception

The concept of selective perception was already familiar to some participants, yet many gained new insights as a result of workshop attendance. As one respondent explained: I have seen it before but it is a stronger consideration in my thinking.

Most agreed that there is a need to manage this bias, for example by self-reflection. This was seen to involve asking questions such as “Am I being fair? What are my preconceived ideas?”

Halo Effect

Five respondents identified this concept as revision, but all participants demonstrated an increased understanding of the halo effect and its potential role in shaping judgement in assessment practices. As one respondent explained, it is: The importance of not just staying with the judgement initially made; to name the issues to stop the potential for things to get even worse in terms of my assessment and the student.

In particular, the workshop was seen as providing validation for previous concerns in assessment: It has explained why I feel disappointed with student performance in some contexts – it is perhaps because I inappropriately elevated them due to halo effect.

Memory Errors of Omission

Respondents recognised the difficulties associated with memory failures, and the potential impacts on their judgements of student performance. As one participant explained: Because we are human, we forget things and in an assessment situation, this may disadvantage a student, or give them undue advantage depending on the scenario.

Another stated that: Sometimes, you don’t even know what you have missed – makes assessment/judgement a scary space!


Primacy Effect

Just over 40 per cent of participants (41.3 per cent) noted that they knew little or nothing about the primacy effect prior to attending the workshop. Seventeen participants had an awareness of the concept, of whom seven were unfamiliar with the terminology and one was unaware of approaches to managing this bias.

Memory Errors of Commission

Twelve participants (41 per cent) had either little or no knowledge of this human bias, while 10 respondents were aware of the concept, but not the terminology or the theory behind human biases. Workshop feedback indicated that participants developed an increased awareness and understanding of memory errors of commission.

Confirmation Bias

Whilst one participant reported that the concept was not new, all participants described an increased understanding of the impact that confirmation bias can have on judgement. One participant explained that confirmation bias occurs: When you already know something about a student and use this to make a judgement.

Another stated that you run “…the risk of manipulating the assessment process in line with our expectations.”

Half of the respondents (n = 8) recognised the ease with which confirmation bias can “sway the results.” For example, one participant explained that: It is very tempting to coach the student who you believe knows their stuff but is not performing at their best.

Follow-up evaluation

As a follow-up evaluation, a short five-item email feedback form was sent to all participants who had consented to be approached for this purpose. In total, 117 invitations were sent and 20 people responded (17 per cent), a rather disappointing response rate. Although most respondents indicated that they had used the insights gained from the workshop in their own practice and could still concretely explain what they had learned, only half had used the online materials. Table 7 summarises the responses to the five questions.


Table 7: Responses to the five-item follow-up questionnaire.

Question | Yes | No
1. Have you applied the concepts presented in the workshop in your work? | 17 | 3
2. Please give a brief example of how you have used something from the workshop in any aspect of your own practice. | 18 | 2
3. Have you used any of the BJ website resources since attending the workshop? | 10 | 10
4. Are you aware of anyone else who has? | 5 | 15
5. Would you be willing to participate in a follow-up phone interview? If so, please provide your preferred contact number. | 12 | 8

The follow-up telephone interviews (n = 6) revealed reasons and incentives for participants and their colleagues to make further use of the resources. The main reasons for using the website and its resources were:

• Involvement in further development activities, either institution-specific (e.g., staff development responsibilities and tutor training) or Better Judgement project-related (e.g., involvement in one of the state working groups);

• Personal interest in the workshop content and a desire to explore the ideas further;

• Usefulness of the website materials for incorporation into participants’ own work as assessors;

• Usefulness of the resources for inclusion in participants’ own staff-development material.

Workshop feedback has been highly positive, and enrolment and participation have been high. Respondents also praised the design and usefulness of the resources on the website. Unfortunately, limited feedback was received as to why people did not access the resources after the workshop. Our view is that end-users were probably waiting for the entire package of resources to become available before deciding what to embed in their local context, and how. The team realised that it is in the nature of projects like this, which produce a body of resources, that even though dissemination happens along the way, a final package is not available to show people until the end. The Project Leader has been in contact with coordinators of a couple of Graduate Certificates in Higher Education who have expressed an interest in embedding these resources in their modules and who have been waiting for the entire package to become available, especially the Self-directed Learning Guides. The team therefore anticipates that uptake will improve in the near future, will provide support for this to occur, and expects that this will facilitate improved understanding, recognition and management of judgement biases in oral and practice-based assessment.


Appendix A: References

1. Albanese MA, Mejicano G, Mullan P, Kokotailo P, Gruppen L. Defining characteristics of educational competencies. Medical Education 2008;42:248-55.
2. Schuwirth LWT, Van der Vleuten CPM. Changing education, changing assessment, changing research? Medical Education 2004;38:805-12.
3. Boud D. Assessment and the promotion of academic values. Studies in Higher Education 1990;15:101-11.
4. Schuwirth LWT, Van der Vleuten CPM. A plea for new psychometrical models in educational assessment. Medical Education 2006;40:296-300.
5. Schuwirth LWT, Van der Vleuten CPM. Programmatic assessment: from assessment of learning to assessment for learning. Medical Teacher 2011;33:478-85.
6. Norcini JJ, Blank LL, Duffy FD, Fortna GS. The mini-CEX: a method for assessing clinical skills. Annals of Internal Medicine 2003;138:476-81.
7. Epstein RM, Hundert EM. Defining and assessing professional competence. The Journal of the American Medical Association 2002;287:226-35.
8. Driessen EW, Van der Vleuten CPM. Matching student assessment to problem-based learning: lessons from experience in a law faculty. Studies in Continuing Education 2000;22:235-48.
9. Driessen E, Van der Vleuten CPM, Schuwirth LWT, Van Tartwijk J, Vermunt J. The use of qualitative research criteria for portfolio assessment as an alternative to reliability evaluation: a case study. Medical Education 2005;39:214-20.
10. Van der Vleuten CPM, Norman GR, De Graaf E. Pitfalls in the pursuit of objectivity: issues of reliability. Medical Education 1991;25:110-8.
11. Swanson DB. A measurement framework for performance-based tests. In: Hart I, Harden R, eds. Further Developments in Assessing Clinical Competence. Montreal: Can-Heal Publications; 1987:13-45.
12. Plous S. The Psychology of Judgment and Decision Making. New Jersey: McGraw-Hill Inc.; 1993.
13. Kane MT. Validation. In: Brennan RL, ed. Educational Measurement. Westport: ACE/Praeger; 2006:17-64.
14. Eva KW. What every teacher needs to know about clinical reasoning. Medical Education 2005;39:98-106.
15. Regehr G, Norman GR. Issues in cognitive psychology: implications for professional education. Academic Medicine 1996;71:988-1001.
16. Chase W, Simon H. Perception in chess. Cognitive Psychology 1973;4:55-81.
17. Chi MTH, Glaser R, Rees E. Expertise in problem solving. In: Sternberg RJ, ed. Advances in the Psychology of Human Intelligence. Hillsdale, NJ: Lawrence Erlbaum Associates; 1982:7-76.
18. Posner MI. What is it to be an expert? In: Chi MTH, Glaser R, Farr MJ, eds. The Nature of Expertise. Hillsdale, NJ: Lawrence Erlbaum Associates; 1988:xxix-xxxvi.
19. Eva K. On the generality of specificity. Medical Education 2003;37:587-8.
20. Schmidt HG. Foundations of problem-based learning: some explanatory notes. Medical Education 1993;27:422-32.
21. Miller GE. The assessment of clinical skills/competence/performance. Academic Medicine 1990;65:S63-7.


22. Delandshere G, Petrosky AR. Assessment of complex performances: limitations of key measurement assumptions. Educational Researcher 1998;27:14-24.
23. Klein G. Naturalistic decision making. Human Factors 2008;50:456-60.
24. Gigerenzer G, Goldstein DG. Reasoning the fast and frugal way: models of bounded rationality. Psychological Review 1996;103:650-69.
25. Van Merrienboer J, Sweller J. Cognitive load theory and complex learning: recent developments and future directions. Educational Psychology Review 2005;17:147-77.
26. Custers EJFM, Boshuizen H, Schmidt HG. The role of illness scripts in the development of medical diagnostic expertise: results from an interview study. Cognition and Instruction 1998;16:367-98.
27. Schmidt HG, Boshuizen HP. On acquiring expertise in medicine. Special issue: European educational psychology. Educational Psychology Review 1993;5:205-21.
28. Govaerts MJB, Van de Wiel MWJ, Schuwirth LWT, Van der Vleuten CPM, Muijtjens AMM. Workplace-based assessment: raters’ performance theories and constructs. Advances in Health Sciences Education 2012:1-22.
29. Govaerts MJB, Schuwirth LWT, Van der Vleuten CPM, Muijtjens AMM. Workplace-based assessment: effects of rater expertise. Advances in Health Sciences Education 2011;16:151-65.
30. Van Merrienboer JJ, Sweller J. Cognitive load theory in health professional education: design principles and strategies. Medical Education 2010;44:85-93.
31. Orrell J. Good Practice Report: Work-integrated Learning. Office for Learning and Teaching; 2011.
32. Assessment Futures. 2010. http://www.uts.edu.au/research-and-teaching/teaching-and-learning/assessment-futures/overview (last accessed 5 February 2016).
33. Hinton T, Gannaway D, Berry B, Moore K. The D-Cubed Guide: Planning for Effective Dissemination. Sydney: Australian Teaching and Learning Council; 2011.


Appendix B: External evaluator’s report

Professor David Boud

Introduction

The Better Judgement project was funded by the Office for Learning and Teaching under the Innovation and Development Program, a competitive grants scheme. It followed a prior Seed Project, Better Judgement-1, which established the approach adopted. The project focused on improving the quality of assessment practices by assisting assessors and teachers in higher education to understand and manage their biases in oral and practice-based assessment. It did this through developing and disseminating sets of training materials.

Context for the evaluation

The project has already been subjected to various forms of evaluation. The Seed Project that established the approach taken was evaluated, and led to the current project, which was judged against a set of criteria in the Guidelines for the Innovation and Development Program. It has also been judged for its contribution to the mission and objectives of the Office for Learning and Teaching and for its congruence with the OLT values and principles for action. The OLT requires an independent evaluation with a focus on the quality of the project and the extent to which it meets its stated aims, outcomes, outputs and deliverables. It is this evaluation that is contained in this Appendix.

The OLT has identified a particular view about the evaluation process and the role of the evaluator: the evaluation is both formative and summative. For the former, the evaluator acts as a critical friend to the project and provides feedback and commentary during the project on the directions taken and the ways in which the proposal is enacted. The latter aspect is contained in a final report which has three functions: a quality assurance and auditing function for the OLT, recommendations on processes and policy for the funding agency, and feedback to the project team and those considering further projects in a similar area.

Evaluation approach

The project proposal identifies the specific brief for the evaluator:


[He] will have input into and provide feedback on the method being used to assess the effectiveness of the program material in raising the awareness of the influences of biases on assessors. This will ensure agreement on the evaluation method prior to the evaluation starting. He will also conduct annual formal evaluations focusing on specific issues. End of Year 1 – progress against timeline, progress of product development, and dissemination planning. He will provide alerts and recommendations at the end of year 1. End of Year 2 – resources developed, production management, dissemination and uptake.

[He] will assess the project outcomes against the declared deliverables in this application. Regular reviews of the course material with the possibility to incorporate/amend the project are factored in as identified in the timeline.

The first of these has been conducted and the results incorporated by the project team into Year 2 planning. This report will focus on: resources developed, production management, dissemination and uptake, and an assessment of the project outcomes against the declared deliverables.

Evaluation process

Evaluative activities were built into each stage of the project. These were enacted by the project team and monitored by the external evaluator. They included regular consultations with the Reference Group and monitoring of responses to the changes suggested, workshop observation, collection of data from participants about workshop activities and the design of materials, and reviews with the evaluator about progress at key stages of the project. As the team was careful in implementing evaluative activities throughout, the evaluator was able to rely on the data collected, together with his own analysis of the materials generated, in drawing conclusions.

Resources developed and production management

The resources developed in the Better Judgement project focused on enabling assessors to understand and respond to nine common biases exhibited in assessment judgements. These were embedded in a training program designed to raise participants’ awareness of these biases, enable them to recognise the biases in a range of situations, develop strategies to manage their impact, and help them translate these strategies into their own context. Videos were produced that introduced the ideas and presented scenarios of situations in which biases are encountered.

Identification of the biases was soundly based in the literature. The training design was carefully scaffolded to move participants from knowledge of biases, to observation of them in others, to noticing them within their own practice, ending in planning to implement bias amelioration within the assessment practice for which participants were responsible.


The videos used covered a wide range of discipline areas and portrayed each form of bias first in exemplary form and then in a number of ‘real’ contexts. A particular feature was the high degree of contextualisation used, so that the biases had to be discerned by viewers.

For each bias, an introductory video was produced, followed by an exemplar giving an exaggerated illustration of the bias, first without and then with an explanation of the effect. There were then real-life videos assembled from a variety of disciplines, each of which exhibited the particular bias being examined:

Form of bias | No. of real-life videos
Primacy effect | 10
Memory errors of commission | 23
Recency effect | 9
Memory failure | 8
Selective perception | 4
Confirmation bias | 6
Cognitive dissonance | 6
Contrast effect | 6
Halo effect | 6

Subsequently, self-directed training guides were developed and posted on the project website (http://www.flinders.edu.au/medicine/sites/better-judgement/) to enable training in biases to continue after the life of the project.

These training guides were soundly structured and easy to use. They were based on the materials used in the face-to-face training sessions. The self-directed training guides represent a robust strategy for ensuring continuing take-up of the resources once workshops are no longer available. They may be used for individual access, or by staff development personnel in designing face-to-face sessions for training assessors. The latter is particularly important, as a full appreciation of the issues can often be gained only in dialogue with others who can notice different features of the situation in which judgement is being exercised.

Evaluation data reported in the main body of the Final Report show that participants greatly appreciated the sessions and the materials used, and were able to apply them in their own situations.


A particular feature of the project was an emphasis on the articulation and naming of biases in order to provide a vocabulary for discussing the issues involved in making judgements. Problems that are only partly understood and not clearly identified cannot be addressed, and the biases that are the focus of the resources are important, pervasive and commonly unaddressed. The naming of different kinds of bias and their exemplification in common situations renders them visible and able to be articulated by assessors and examiners in ways that would otherwise be difficult.

A Facilitator’s Guide was also produced to aid use of these materials, as well as a guide for academic developers. In response to earlier evaluative comments, these were designed to accommodate those who had not experienced one of the training workshops, and can thus provide an entry point into the project’s resources following its completion.

Dissemination and uptake

The first means of dissemination was through workshops conducted in a variety of institutions held in the following capital cities: Perth, Adelaide, Melbourne, Brisbane, Darwin, Canberra and Sydney. This was complemented by the building of the website which was incorporated as part of the core website of Medicine at Flinders University to ensure continuity of access.

The second means of dissemination was through other forms of presentation and media representation to draw attention to the ideas and the resources available. These direct those interested to the Self-directed Learning Guides, which will remain available well after the project ends. As the biases examined are soundly based in research and pervasive in practice, there is little likelihood that these resources will date for some considerable time.

Uptake of these workshops and resources is reported in the main body of the report. In summary, they were well attended by members of the target groups: those involved in oral and practice-based assessments from a variety of disciplines. Evaluation sheets were completed for each workshop and the form of the workshop adapted in the light of these comments. Discussion of the findings of the qualitative responses can also be found in the main body of the report. In short, reactions were very positive about the value, applicability and relevance of the sessions.

While it was possible to follow a few project participants through to implementation, the timescale of the project did not allow a focus on issues of implementation in assessment contexts with teams of multiple assessors who may not have had the benefit of the full training available.

As part of this extension, a Better Judgement blog was established to which participants were subscribed. This feature was underutilised and mainly provided information about the availability of training sessions.


Assessment of project outcomes against deliverables

A set of deliverables was outlined in the proposal. The status of these at the end of the project funding period is indicated:

Deliverable | Status
A series of short videos explaining each bias | Completed
A series of exemplar video vignettes for each type of bias | Completed
A collection of links to YouTube clips showing biases in real-life assessment settings | Completed
Workshops run by the Project Team for end-users and potential adopters nationally | Completed
A series of written scenarios illustrating biases in action in various discipline contexts. These will be written by participants in the workshops who give their consent to share their scenarios | Completed
A compendium of practical strategies for how to counteract the impact of biases on assessment | Completed
A facilitator guide for adopters running workshops in their own setting using the programme material | Completed
A Self-directed Learning Guide for people wanting to use the material independently | Completed for each bias, with an overview in preparation
An academic developers’ guide on how to implement the Better Judgement programme in their institution | Completed
Enhancing the Better Judgement-1 webpage by adding the materials developed in Better Judgement-2 | Completed
Dissemination activities as specified | Completed


Recommendations

The project team is to be commended on producing all that they set out to do and somewhat more than was necessary. The quality of the training was high, materials were well received and there is an ongoing legacy from the project that will last well into the future. Following discussions between the team and the evaluator in the light of the data assembled and the experiences of each party with the project, the following recommendations were identified:

1. The project resulted in the development of resources that support the enhancement of quality in assessment that would not have been produced without a scheme such as the OLT Innovation and Development Program. It is recommended that any subsequent funding agency establish programs with similar intentions to the Innovation and Development Program as a core part of their suite of programs.

2. The focus of the project was on assessment in oral and practice-based settings. However, the considerations of bias are applicable in all forms of assessment.

It is recommended that teaching and learning funding agencies give priority to the development of resources addressing judgemental bias in the wider assessment context.

3. Although the training sessions and materials were well received and the resources on the website accessed frequently, addressing the real issues of biased judgement in assessment needs to occur within programs and in teaching and assessing teams. While this final stage of applicability was beyond the scope of the project, substantial impact on students will only occur when such considerations are taken seriously and enacted by course coordinators in situ. It is recommended that the body replacing the OLT give consideration, in future program funding strategies, to the final stages of embedding projects: following innovative approaches through to full contextualisation and identifying what enables this to occur.

4. While bias in judgements can be identified through the work of the project, there is a further stage involved in the management of bias in assessment, in addition to the one identified in recommendation 3: the design of assessment practices. An obvious extension of the present project would be into the domain of assessment design at both the program and unit levels. This would involve identifying, and developing strategies to minimise, the likelihood of assessment bias being introduced through the nature of assessment tasks and how they are built into a program.

It is recommended that, in any extension to this project, a focus be given to assessment design that minimises the opportunities for bias to occur while maintaining the validity of the approaches used.

5. It has become apparent in the project, and indeed in many others supported by the OLT, that the assessment literacy of academics is quite low and few have received adequate preparation for their assessment roles. This limited sophistication about assessment makes it difficult for them to fully utilise assessment resources. It is recommended to higher education institutions that consideration be given to the enhancement of programs to foster assessment literacy as a normal part of academic development provision. This would be aided by resources such as examples of model programs and high-quality resources to be used within them. A first step might be the development of a set of core assessment standards which academics are expected to meet (a) for completion of probation, and (b) for promotion on the basis of teaching.

Evaluator

David Boud is Emeritus Professor in the Faculty of Arts and Social Sciences at the University of Technology Sydney, Professor and Director of the Centre for Research on Assessment and Digital Learning, Deakin University and Research Professor at the Institute for Work-Based Learning, Middlesex University, London. He has published widely on teaching and learning in higher education, particularly in the area of student assessment.


Appendix C: Certification by Deputy Vice-Chancellor

I certify that all parts of the final report for this OLT grant provide an accurate representation of the implementation, impact and findings of the project, and that the report is of publishable quality.

Professor Andrew Parkin

2 March 2016