[HCI] Week 15. Evaluation Reporting


Page 1: [HCI] Week 15. Evaluation Reporting

Lecture 15

Evaluation Reporting

Human Computer Interaction/COG3103, 2016 Fall. Class hours: Monday 1-3 pm / Wednesday 2-3 pm. Lecture room: Widang Hall 209. 12th December

Page 2: [HCI] Week 15. Evaluation Reporting

UX EVALUATION INTRODUCTION (Textbook Chapter 12)


Page 3: [HCI] Week 15. Evaluation Reporting

INTRODUCTION


Figure 12-1 You are here, at the evaluation activity in the context of the overall Wheel lifecycle template.

Page 4: [HCI] Week 15. Evaluation Reporting

INTRODUCTION

• Evaluate with a Prototype on Your Own Terms

• Measurability of User Experience

– But can you evaluate usability or user experience? This may come as a surprise, but neither usability nor user experience is directly measurable.

– We resort to measuring things we can measure and use those measurements as indicators of our more abstract and less measurable notions.

• User Testing? No!

– UX evaluation must be an ego-free process; you are improving designs, not judging users, designers, or developers.


Page 5: [HCI] Week 15. Evaluation Reporting

FORMATIVE VS. SUMMATIVE EVALUATION

• “When the cook tastes the soup, that’s formative; when the guests taste the soup, that’s summative” (Stake, 2004, p. 17).

– Formative evaluation is primarily diagnostic; it is about collecting qualitative data to identify and fix UX problems and their causes in the design.

– Summative evaluation is about collecting quantitative data for assessing a level of quality due to a design, especially for assessing improvement in the user experience due to formative evaluation.


Page 6: [HCI] Week 15. Evaluation Reporting

FORMATIVE VS. SUMMATIVE EVALUATION

• Qualitative Data

– Qualitative data are nonnumeric and descriptive data, usually describing a UX problem or issue observed or experienced during usage.

• Quantitative Data

– Quantitative data are numeric data, such as user performance metrics or opinion ratings.


Page 7: [HCI] Week 15. Evaluation Reporting

FORMATIVE VS. SUMMATIVE EVALUATION

• Formal summative evaluation

– is typified by an empirical competitive benchmark study based on formal, rigorous experimental design aimed at comparing design hypothesis factors. Formal summative evaluation is a kind of controlled hypothesis testing with an m by n factorial design with y independent variables, the results of which are subjected to statistical tests for significance. Formal summative evaluation is an important HCI skill, but we do not cover it in this book.

• Informal summative evaluation

– is used, as a partner of formative evaluation, for quantitatively summing up or assessing UX levels using metrics for user performance (such as time on task), for example, as indicators of progress in UX improvement, usually in comparison with pre-established UX target levels (a minimal numeric sketch of such a comparison follows below).
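As a minimal illustration of the informal summative idea, a team might compare an observed user-performance metric against a pre-established UX target level. All names and numbers below are hypothetical, not taken from the textbook:

```python
# Minimal sketch: compare observed time-on-task values against a UX target level.
# The target and the observed times are hypothetical, for illustration only.
from statistics import mean

ux_target_seconds = 120                      # e.g., "average time to complete the task <= 2 minutes"
observed_times = [95, 140, 110, 130, 105]    # seconds, one value per participant

average = mean(observed_times)
print(f"Mean time on task: {average:.1f} s (target: {ux_target_seconds} s)")
print("UX target met" if average <= ux_target_seconds else "UX target not yet met")
```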


Page 8: [HCI] Week 15. Evaluation Reporting

FORMATIVE VS. SUMMATIVE EVALUATION

• Engineering Evaluation of UX: Formative Plus Informal Summative


Figure 12-2 UX evaluation is a combination of formative and informal summative evaluation.

Page 9: [HCI] Week 15. Evaluation Reporting

TYPES OF FORMATIVE AND INFORMAL SUMMATIVE EVALUATION METHODS

• Dimensions for Classifying Formative UX Evaluation Methods

– empirical method vs. analytic method

– rigorous method vs. rapid method

• Rigorous Method vs. Rapid Method

– Choose a rigorous empirical method such as lab-based testing when you need effectiveness and thoroughness, but expect it to be more expensive and time-consuming.

– Choose the lab-based method to assess quantitative UX measures and metrics, such as time-on-task and error rates, as indications of how well the user does in a performance-oriented context.

– Choose lab-based testing if you need a controlled environment to limit distractions.

– Choose empirical testing in the field if you need more realistic usage conditions for ecological validity than you can establish in a lab.


Page 10: [HCI] Week 15. Evaluation Reporting

TYPES OF FORMATIVE AND INFORMAL SUMMATIVE EVALUATION METHODS

• UX evaluation methods can be faster and less expensive.

– Choose a rapid evaluation method for speed and cost savings, but expect it to be (possibly acceptably) less effective.

– Choose a rapid UX evaluation method for early stages of progress, when things are changing a lot anyway and investing in detailed evaluation is not warranted.

– Choose a rapid method, such as a design walkthrough or an informal demonstration of design concepts, as a platform for getting initial reactions and early feedback from the rest of the design team, customers, and potential users.


Page 11: [HCI] Week 15. Evaluation Reporting

TYPES OF FORMATIVE AND INFORMAL SUMMATIVE EVALUATION METHODS

• Where the Dimensions Intersect


Figure 12-3 Sample UX evaluation methods at intersections between the dimensions of UX evaluation method types.

Page 12: [HCI] Week 15. Evaluation Reporting

TYPES OF EVALUATION DATA

• Objective Data vs. Subjective Data

– Objective UX data are data observed directly by either the evaluator or the participant. Subjective UX data represent opinions, judgments, and other subjective feedback, usually from the user, concerning the user experience and satisfaction with the interaction design.

• Quantitative Data vs. Qualitative Data

– Quantitative data are numeric data, such as data obtained by user performance metrics or opinion ratings. Quantitative data are the basis of an informal summative evaluation component and help the team assess UX achievements and monitor convergence toward UX targets, usually in comparison with the specified levels set in the UX targets.


Page 13: [HCI] Week 15. Evaluation Reporting

SOME DATA COLLECTION TECHNIQUES

• Critical Incident Identification

– Critical incidents are typically described in terms of (see the record sketch after this list):

• the user’s general activity or task

• objects or artifacts involved

• the specific user intention and action that led immediately to the critical incident

• expectations of the user about what the system was supposed to do when the critical incident occurred

• what happened instead

• as much as possible about the mental and emotional state of the user

• an indication of whether the user could recover from the critical incident and, if so, a description of how the user did so

• additional comments or suggested solutions to the problem
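As a concrete sketch, the items above could be captured in a simple record structure. The field names and the example values below are illustrative assumptions, not a schema prescribed by the textbook:

```python
# Illustrative sketch of a critical incident record covering the items listed above.
from dataclasses import dataclass
from typing import Optional

@dataclass
class CriticalIncident:
    general_activity: str                 # the user's general activity or task
    artifacts_involved: str               # objects or artifacts involved
    intention_and_action: str             # what the user intended and did just before the incident
    user_expectation: str                 # what the user expected the system to do
    what_happened_instead: str
    user_state: str                       # mental/emotional state of the user, as far as observable
    recovered: bool                       # could the user recover from the incident?
    recovery_description: Optional[str] = None
    comments_or_solutions: Optional[str] = None

# Hypothetical example, loosely based on the Ticket Kiosk problems discussed later
incident = CriticalIncident(
    general_activity="Buying two tickets at the Ticket Kiosk",
    artifacts_involved="Quantity 'counter' control on the ticket screen",
    intention_and_action="Tapped 'Submit', expecting to set the ticket quantity first",
    user_expectation="System would ask how many tickets were needed",
    what_happened_instead="Proceeded to payment with the default quantity of one",
    user_state="Confused, mildly frustrated",
    recovered=True,
    recovery_description="Used 'Back' and found the counter after prompting",
)
```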


Page 14: [HCI] Week 15. Evaluation Reporting

SOME DATA COLLECTION TECHNIQUES

– Relevance of critical incident data

– History of critical incident data

– Mostly used as a variation

– Critical incident reporting tools

– Who identifies critical incidents?

– Timing of critical incident data capture: The evaluator’s awareness zone


Page 15: [HCI] Week 15. Evaluation Reporting

SOME DATA COLLECTION TECHNIQUES


Figure 12-4 Critical incident description detail vs. time after critical incident.

Page 16: [HCI] Week 15. Evaluation Reporting

SOME DATA COLLECTION TECHNIQUES

• The Think-Aloud Technique

– Why use the think-aloud technique?

– What kind of participant works best?

– How to manage the think-aloud protocol?

– Retrospective think-aloud techniques

– Co-discovery think-aloud techniques

– Does thinking aloud affect quantitative task performance metrics in lab-based evaluation?


Page 17: [HCI] Week 15. Evaluation Reporting

SOME DATA COLLECTION TECHNIQUES

• Questionnaires

– Semantic differential scales

– The Questionnaire for User Interface Satisfaction (QUIS)

– The System Usability Scale (SUS)

– The Usefulness, Satisfaction, and Ease of Use (USE) Questionnaire


Page 18: [HCI] Week 15. Evaluation Reporting

SOME DATA COLLECTION TECHNIQUES


Table 12-1 An excerpt from QUIS, with permission

Page 19: [HCI] Week 15. Evaluation Reporting

SOME DATA COLLECTION TECHNIQUES

• The System Usability Scale (SUS)

– The questions are presented as simple declarative statements, each with a five-point Likert scale anchored with “strongly disagree” and “strongly agree” and with values of 1 through 5. These 10 statements are (a scoring sketch follows the list):

– I think that I would like to use this system frequently

– I found the system unnecessarily complex

– I thought the system was easy to use

– I think that I would need the support of a technical person to be able to use this system

– I found the various functions in this system were well integrated

– I thought there was too much inconsistency in this system

– I would imagine that most people would learn to use this system very quickly

– I found the system very cumbersome to use

– I felt very confident using the system

– I needed to learn a lot of things before I could get going with this system
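For reference, standard SUS scoring converts each 1-5 response to a 0-4 contribution: (response - 1) for the odd-numbered, positively worded statements and (5 - response) for the even-numbered, negatively worded ones; the sum of the ten contributions is then multiplied by 2.5 to give a score from 0 to 100. A minimal sketch, with hypothetical sample ratings:

```python
# Standard SUS scoring: odd items contribute (response - 1), even items (5 - response);
# the sum of the ten contributions is multiplied by 2.5 to yield a 0-100 score.
def sus_score(responses):
    """responses: list of 10 integers, each 1-5, in the order of the statements above."""
    if len(responses) != 10 or not all(1 <= r <= 5 for r in responses):
        raise ValueError("SUS needs ten responses on a 1-5 scale")
    contributions = [
        (r - 1) if i % 2 == 0 else (5 - r)   # i is 0-based, so even index = odd-numbered item
        for i, r in enumerate(responses)
    ]
    return sum(contributions) * 2.5

# Hypothetical ratings from one participant for the ten statements listed above
print(sus_score([4, 2, 4, 1, 4, 2, 5, 2, 4, 2]))   # -> 80.0
```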


Page 20: [HCI] Week 15. Evaluation Reporting

SOME DATA COLLECTION TECHNIQUES


Table 12-2 Questions in USE questionnaire

Page 21: [HCI] Week 15. Evaluation Reporting

SOME DATA COLLECTION TECHNIQUES

– Other questionnaires

• General-purpose usability questionnaires:

– Computer System Usability Questionnaire (CSUQ), developed by Jim Lewis (1995, 2002) at IBM, is well regarded and available in the public domain.

– Software Usability Measurement Inventory (SUMI) is “a rigorously tested and proven method of measuring software quality from the end user’s point of view” (Human Factor Research Group, 1990). According to Usability Net, SUMI is “a mature questionnaire whose standardization base and manual have been regularly updated.” It is applicable to a range of application types, from desktop applications to large, domain-complex applications.

– After Scenario Questionnaire (ASQ), developed by IBM, is available in the public domain (Bangor, Kortum, & Miller, 2008, p. 575).

– Post-Study System Usability Questionnaire (PSSUQ), developed by IBM, is available in the public domain (Bangor, Kortum, & Miller, 2008, p. 575).


Page 22: [HCI] Week 15. Evaluation Reporting

SOME DATA COLLECTION TECHNIQUES

• Web evaluation questionnaires:

– Website Analysis and MeasureMent Inventory (WAMMI) is “a short but very reliable questionnaire that tells you what your visitors think about your web site” (Human Factor Research Group, 1996b).

• Multimedia system evaluation questionnaires:

– Measuring the Usability of Multi-Media Systems (MUMMS) is a questionnaire “designed for evaluating quality of use of multimedia software products” (Human Factor Research Group, 1996a).

• Hedonic quality evaluation questionnaires:

– The Lavie and Tractinsky (2004) questionnaire

– The Kim and Moon (1998) questionnaire with differential emotions scale


Page 23: [HCI] Week 15. Evaluation Reporting

SOME DATA COLLECTION TECHNIQUES

– Modifying questionnaires for your evaluation

• choosing a subset of the questions

• changing the wording in some of the questions

• adding questions of your own to address specific areas of concern

• using different scale values


Page 24: [HCI] Week 15. Evaluation Reporting

SOME DATA COLLECTION TECHNIQUES

• Data Collection Techniques Especially for Evaluating Emotional Impact

– Self-reported indicators of emotional impact

– Questionnaires as a verbal self-reporting technique for collecting emotional impact data (AttrakDiff and others; a small scoring sketch follows this list)

– Observing physiological responses as indicators of emotional impact

– Bio-metrics to detect physiological responses to emotional impact
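The sketch below illustrates how ratings from a semantic-differential questionnaire of the AttrakDiff kind can be summarized as per-dimension means. The word pairs, the grouping into dimensions, and the 1-7 scale shown here are illustrative assumptions, not the published instrument:

```python
# Illustrative sketch: summarize semantic-differential ratings as per-dimension means.
from statistics import mean

# Each item: (negative anchor, positive anchor, dimension it is assumed to belong to)
items = [
    ("confusing", "clear", "pragmatic quality"),
    ("complicated", "simple", "pragmatic quality"),
    ("dull", "captivating", "hedonic quality"),
    ("cheap", "premium", "hedonic quality"),
]

# One participant's ratings, aligned with `items`, on a 1 (negative) to 7 (positive) scale
ratings = [6, 5, 4, 5]

scores = {}
for (_, _, dimension), rating in zip(items, ratings):
    scores.setdefault(dimension, []).append(rating)

for dimension, values in scores.items():
    print(f"{dimension}: {mean(values):.1f}")
```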


Page 25: [HCI] Week 15. Evaluation Reporting

SOME DATA COLLECTION TECHNIQUES


Table 12-3 AttrakDiff emotional impact questionnaire as listed by Hassenzahl, Schöbel, and Trautman (2008), with permission

Page 26: [HCI] Week 15. Evaluation Reporting

SOME DATA COLLECTION TECHNIQUES


Table 12-4 A variation of the AttrakDiff emotional impact questionnaire, as listed in Appendix A1 of Schrepp, Held, and Laugwitz (2006), reordered to group related items together, with permission

Page 27: [HCI] Week 15. Evaluation Reporting

SOME DATA COLLECTION TECHNIQUES

• Data Collection Techniques to Evaluate Phenomenological Aspects of Interaction

– Long-term studies required for phenomenological evaluation

– Goals of phenomenological data collection techniques

– Diaries in situated longitudinal studies

– Evaluator triggered reporting for more representative data

– Periodic questionnaires over time

– Direct observation and interviews in simulated real usage situations


Page 28: [HCI] Week 15. Evaluation Reporting

VARIATIONS IN FORMATIVE EVALUATION RESULTS

• Reasons given by Hertzum and Jacobsen (2003) for the wide variation

– vague goals (varying evaluation focus)

– vague evaluation procedures (the methods do not pin down the procedures, so each application is a variation and an adaptation)

– vague problem criteria (it is not clear how to decide when an issue represents a real problem)


Page 29: [HCI] Week 15. Evaluation Reporting

EVALUATION REPORTING (Textbook Chapter 17)


Page 30: [HCI] Week 15. Evaluation Reporting

INTRODUCTION


Figure 17-1 You are here; at reporting, within the evaluation activity in the context of the overall Wheel lifecycle template.

Page 31: [HCI] Week 15. Evaluation Reporting

INTRODUCTION

• Importance of Quality Communication and Reporting

– Hornbæk and Frokjær (2005) show the need for usability evaluation reports that summarize and convey usability information, not just lists of problem descriptions by themselves.

– What should be contained in the report:

• inform the team and project management about the UX problems in the current design

• persuade them of the need to invest even more in fixing those problems.


Page 32: [HCI] Week 15. Evaluation Reporting

REPORTING INFORMAL SUMMATIVE RESULTS

• What if You Are Required to Produce a Formative Evaluation Report for Consumption Beyond the Team?

– The informal summative UX results are intended to be used only as a project management tool and should not be used in public claims.

• What if You Need a Report to Convince the Team to Fix the Problems?

– The need for the rest of your team, including management, to be convinced of the need to fix UX problems that you have identified in your UX engineering process could be considered a kind of litmus test for a lack of teamwork and trust within your organization.


Page 33: [HCI] Week 15. Evaluation Reporting

REPORTING QUALITATIVE FORMATIVE RESULTS

• Common Industry Format (CIF) for Reporting Formal Summative UX Evaluation Results

– A description of the product

– Goals of the testing

– A description of the number and types of participants

– Tasks used in evaluation

– The experimental design of the test (very important for formal summative studies because of the need to eliminate any biases and to ensure the results do not suffer from external, internal, and other validity concerns)

– Evaluation methods used

– Usability measures and data collection methods employed

– Numerical results, including graphical methods of presentation


Page 34: [HCI] Week 15. Evaluation Reporting

REPORTING QUALITATIVE FORMATIVE RESULTS

• Common Industry Format (CIF) for Reporting Qualitative Formative Results

– Mary Theofanos, Whitney Quesenbery, and others organized two workshops in 2005 (Theofanos et al., 2005), aimed at a CIF for formative reports (Quesenbery, 2005; Theofanos & Quesenbery, 2005).

– They concluded that requirements for content, format, presentation style, and level of detail depended heavily on the audience, the business context, and the evaluation techniques used.


Page 35: [HCI] Week 15. Evaluation Reporting

FORMATIVE REPORTING CONTENT

• Individual Problem Reporting Content

– the problem description

– a best judgment of the causes of the problem in the design

– an estimate of its severity or impact

– suggested solutions


Page 36: [HCI] Week 15. Evaluation Reporting

FORMATIVE REPORTING CONTENT

• Include Video Clips Where Appropriate

– Show video clips of users, made anonymous, encountering critical incidents if you are giving an oral report, or include links in a written report.

• Pay Special Attention to Reporting on Emotional Impact Problems

– Provide a holistic summary of the overall emotional impact on participants. Report specific positive and negative highlights with examples from particular episodes or incidents. If possible, try to inspire by comparing with products and systems having high emotional impact ratings.

• Including Cost-Importance Data


Page 37: [HCI] Week 15. Evaluation Reporting

FORMATIVE REPORTING CONTENT


Figure 16-4 The relationship of importance and cost in prioritizing which problems to fix first.

Page 38: [HCI] Week 15. Evaluation Reporting

FORMATIVE REPORTING CONTENT


| Problem | Imp. | Solution | Cost | Prio. Ratio | Prio. Rank | Cuml. Cost | Resolution |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Did not recognize the “counter” as being for the number of tickets. As a result, the user failed to even think about how many tickets he needed. | M | Move quantity information and label it | 2 | M | 1 | 2 | |
| User confused by the button label “Submit” to conclude the ticket-purchasing transaction | 4 | Change the label wording to “Proceed to Payment” | 1 | 4000 | 2 | 3 | |
| User confused about “Theatre” on the “Choose a domain” screen; thought it meant choosing a physical theater (as a venue) rather than the category of theatre arts | 3 | Improve the wording to “Theatre Arts” | 1 | 3000 | 3 | 4 | |
| Unsure of current date and what date he was purchasing tickets for | 5 | Add current date field and label all dates precisely | 2 | 2500 | 4 | 6 | |
| Did not like having a “Back” button on the second screen since the first screen was only a “Welcome” | 2 | Remove it | 1 | 2000 | 5 | 7 | |
| Users were concerned about their work being left for others to see | 5 | Add a timeout feature that clears the screens | 3 | 1667 | 6 | 10 | |
| Transaction flow for purchasing tickets (group problem; see Table 16-8) | 3 | Establish a comprehensive and more flexible model of transaction flow and add labeling to explain it | 5 | 600 | 7 | 15 | |
| Line of affordability (16 person-hours, i.e., 2 work days) | | | | | | | |
| Did not recognize what geographical area theater information was being displayed for | 4 | Redesign graphical representation to show search radius | 12 | 333 | 8 | 27 | |
| Ability to find events hampered by lack of a search capability | 4 | Design and implement a search function | 40 | 100 | 9 | 67 | |

Table 16-11 The Ticket Kiosk System cost-importance table, sorted by priority ratio, with cumulative cost values entered, and the “line of affordability” showing the cutoff for this round of problem fixing
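A small sketch of the cost-importance bookkeeping behind Table 16-11: the priority ratio is (importance / cost) x 1000, which is consistent with the values shown in the table (e.g., 4/1 x 1000 = 4000), must-fix (“M”) problems sort to the top, and problems are taken in priority order until the cumulative cost crosses the line of affordability. The data structure and function names below are illustrative:

```python
# Minimal sketch of cost-importance prioritization, assuming
# priority ratio = (importance / cost) * 1000, consistent with Table 16-11.
from dataclasses import dataclass

@dataclass
class ProblemEntry:
    problem: str
    importance: object      # 1 (low) to 5 (high), or "M" for must-fix
    solution: str
    cost: int               # estimated fix cost in person-hours

def priority_ratio(entry: ProblemEntry):
    if entry.importance == "M":          # must-fix problems go to the top
        return float("inf")
    return entry.importance / entry.cost * 1000

def prioritize(entries, line_of_affordability: int):
    """Sort by priority ratio and mark what fits under the line of affordability."""
    ordered = sorted(entries, key=priority_ratio, reverse=True)
    cumulative, plan = 0, []
    for e in ordered:
        cumulative += e.cost
        plan.append((e, priority_ratio(e), cumulative, cumulative <= line_of_affordability))
    return plan

# Example using two rows from the table and a 16 person-hour line of affordability
entries = [
    ProblemEntry("Button label 'Submit' confusing", 4, "Change label to 'Proceed to Payment'", 1),
    ProblemEntry("No search capability", 4, "Design and implement a search function", 40),
]
for e, ratio, cuml, affordable in prioritize(entries, line_of_affordability=16):
    print(f"{e.problem}: ratio={ratio:.0f}, cumulative cost={cuml}, fix this round={affordable}")
```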

Page 39: [HCI] Week 15. Evaluation Reporting

FORMATIVE REPORTING AUDIENCE, NEEDS, GOALS, AND CONTEXT OF USE

• The 2005 UPA Workshop Report on formative evaluation reporting (Theofanos et al., 2005)

– Documenting the process: The author is usually part of the team, and the goal is to document team process and decision making. The scope of the “team” is left undefined and could be just the evaluation team or the whole project team, or perhaps even the development organization as a whole.

– Feeding the process: This is the primary context in our perspective, an integral part of feedback for iterative redesign. The goal is to inform the team about evaluation results, problems, and suggested solutions. In this report the author is considered to be related to the team but not necessarily a member of the team. However, it would seem that the person most suited as the author would usually be a UX practitioner who is part of the evaluator team and, if UX practitioners are considered part of the project team, a member of that team, too.

– Informing and persuading: The audience depends on working relationships within the organization. It could be the evaluator team informing and persuading the designers and developers (i.e., software implementers), or it could be the whole project team (or part thereof) informing and persuading management, marketing, and/or the customer or client.


Page 40: [HCI] Week 15. Evaluation Reporting

FORMATIVE REPORTING AUDIENCE, NEEDS, GOALS, AND CONTEXT OF USE

• Introducing UX Engineering to Your Audience

– Engender awareness and appreciation

– Teach concepts

– Sell buy-in

– Present results

• Reporting to Inform Your Project Team

• Reporting to Inform and/or Influence Your Management

• Reporting to Your Customer or Client


Page 41: [HCI] Week 15. Evaluation Reporting

FORMATIVE REPORTING AUDIENCE, NEEDS, GOALS, AND CONTEXT OF USE

• Formative Evaluation Reporting Format and Vocabulary

– Jargon

– Precision and specificity

• Formative Reporting Tone

– Respect feelings: Bridle your acrimony

– Accentuate the positive and avoid blaming

• Formative Reporting over Time


Page 42: [HCI] Week 15. Evaluation Reporting

FORMATIVE REPORTING AUDIENCE, NEEDS, GOALS, AND CONTEXT OF USE

• Problem Report Effectiveness: The Need to Convince and Get Action with Formative Results

– problem severity: more severe problems are more salient and carry more weight with designers

– problem frequency: more frequently occurring problems are more likely to be perceived as “real”

– perceived relevance of problems: when designers disagreed with usability practitioners about the relevance (similar to “realness”) of problems, they did not fix the problems that practitioners recommended be fixed


Page 43: [HCI] Week 15. Evaluation Reporting

FORMATIVE REPORTING AUDIENCE, NEEDS, GOALS, AND CONTEXT OF USE

• Reporting on Large Amounts of Qualitative Data

– One possible approach is to use a highly abridged version of the affinity diagram technique (Chapter 4).

• Your Personal Presence in Formative Reporting

– Do not just write up a report and send it out, hoping that will do the job.


Page 44: [HCI] Week 15. Evaluation Reporting

Final Submission Content Table

• Part I. Design

– System Concept Statement

– Flow Model I/Social Model/Flow Model II

– Persona

– Sketches & Wireframes

– Kickstarter Page + Concept Test Results and Analysis (Form & Survey Results Links)

• Part II. UX Evaluation

– A description of the product

– Goals of the testing

– A description of the number and types of participants

– Tasks used in evaluation [Prototype Links or Captures]

– The experimental design of the test (very important for formal summative studies because of the need to eliminate any biases and to ensure the results do not suffer from external, internal, and other validity concerns)

– Evaluation methods used

– Usability measures and data collection methods employed

– Numerical results, including graphical methods of presentation


Page 45: [HCI] Week 15. Evaluation Reporting

Guideline


R1: Report the Evaluation Result. Submission due: 11:59 pm, Friday 9th December.

R2: Final Presentation (Part I Design + Part II Evaluation). Due: 12th & 14th December, the week of independent study.

R3: Final Submission: compile Part I and Part II into your complete Design and Analysis Report. Due: 21st December, the end of the semester (yay).