Kursusgang 5


Page 1: Kursusgang 5


Overview:

• Last session

• Presentation

• Usability evaluation: techniques for usability testing; heuristic inspection; think-aloud versus heuristic inspection; Learning to Find Usability Problems in Internet Time

Page 2: Kursusgang 5


Last session

• Interaction design: paradigms; principles

• Usability evaluation: conducting tests; interpreting results

Page 3: Kursusgang 5


Presentation. Each group (approx. 5 minutes):

• Briefly describe the product/system and the test (the test procedure)

• Show excerpts from the test (VHS)

• Assess the usability of the product/system (was the system usable, and which usability problems were found)

• Assess the test method: what was easy and what was difficult in planning and conducting the test.

Page 4: Kursusgang 5


Experiences

Easy:

• Working in the control room is easy once the test is running

Difficult:

• What to do when a test subject has gotten stuck or is heading in the wrong direction

• Finding a suitable user

• The user does not think aloud, but just says what he/she is doing

• The equipment caused a bit of trouble

• Difficult to see the screen of a mobile system (placement of cameras)

• Keeping track of the roles (host, logger, operator)

Page 5: Kursusgang 5


Techniques for usability testing

                        User controls               Developer controls
Laboratory              Think-aloud                 Heuristic inspection
                        Constructive interaction    Cognitive walkthrough
User organization       Focus group                 Feedback
(field)                 Observation                 Interview
                        Usage statistics            Questionnaires

Other dimensions: rigor (a planned and controlled process) versus realism; qualitative versus quantitative

Page 6: Kursusgang 5


Heuristic inspection: laboratory + developer control

• The participants review the system against a checklist

• A scenario with relevant tasks can structure the process

• The system is reviewed twice: 1. Focus on the whole and on first impressions. 2. Focus on details, such as functions in relation to tasks

• The participants work individually and note problems

• A combined list is then produced jointly

• The problems may be categorized (critical, serious, cosmetic)

• Proposed fixes are drawn up, prioritized, and handed over to the developers (see the sketch after the checklist)

Example checklist:

Simple and natural dialogue

Speak the user's language

Minimize the user's memory load

Be consistent

Provide feedback

Provide clearly marked exits

Provide shortcuts

Provide good error messages

Prevent errors

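The merge and categorization steps in the procedure above amount to simple bookkeeping. A minimal sketch, assuming invented names (Problem, Severity, and merge_lists are illustrative, not from the course material): each inspector's individual notes are combined, duplicates are dropped, and the joint list is sorted so the most severe problems come first, ready for prioritization.

```python
from dataclasses import dataclass
from enum import IntEnum

class Severity(IntEnum):
    COSMETIC = 1   # categories named on the slide
    SERIOUS = 2
    CRITICAL = 3

@dataclass(frozen=True)        # frozen makes instances hashable, so a
class Problem:                 # set can drop duplicate notes
    description: str           # what the inspector observed
    heuristic: str             # which checklist item it violates
    severity: Severity

def merge_lists(*individual_lists: list[Problem]) -> list[Problem]:
    """Combine all inspectors' notes into one deduplicated list,
    most severe problems first."""
    merged: set[Problem] = set()
    for notes in individual_lists:
        merged.update(notes)
    return sorted(merged, key=lambda p: p.severity, reverse=True)

# Example: two inspectors noted the same feedback problem independently.
a = [Problem("No feedback after Save", "Provide feedback", Severity.SERIOUS)]
b = [Problem("No feedback after Save", "Provide feedback", Severity.SERIOUS),
     Problem("Jargon in error text", "Speak the user's language", Severity.COSMETIC)]
for p in merge_lists(a, b):
    print(p.severity.name, p.description)
```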

Page 7: Kursusgang 5


Exercise (1)

Number of inspections, according to Molich & Nielsen's results:

1 inspector: approx. 35% of all problems are found
3-5 inspectors: approx. 70% of all problems are found

This claim is much debated (a sketch of the underlying model follows below).
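These figures match a simple probabilistic model often attributed to Nielsen: if a single inspector finds a fixed share λ of all problems, and inspectors find problems independently, then i inspectors are expected to find 1 - (1 - λ)^i of them. A minimal sketch, assuming λ = 0.35 (the slide's single-inspector figure; Nielsen's own estimate is around 0.31):

```python
# Expected share of all problems found by i inspectors, assuming each
# independently finds a fixed share lam of the problems (lam = 0.35 is
# taken from the one-inspector figure on the slide).
def found(i: int, lam: float = 0.35) -> float:
    return 1.0 - (1.0 - lam) ** i

for i in (1, 3, 5):
    print(f"{i} inspector(s): {found(i):.0%}")
# prints 35%, 73%, 88%
```

Three inspectors land near the quoted 70%, while five overshoot it; the independence assumption behind this extrapolation is one reason the claim is debated.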

Øvelse (fra Molich & Nielsen):Functionality: A service from Manhattan Telephone (MANTEL) to home users. Typical users have little knowledge of data processing. They can dial into the system, which will provide the name and address of a telephone subscriber in the United States, given the telephone number of the subscriber.  

Assumptions: For each telephone number there is at most one subscriber. All telephone numbers consist of exactly ten digits (3 digit area code and 7 digit local number). The user's computer has a traditional alphanumeric, monochrone display with 24 lines of 80 characters each and a typewriter-like keyboard with the usual extra keys found on most keyboards, including 10 function keys marked PF1-PF10.  

Page 8: Kursusgang 5


Exercise (2)

Dialogue: Enter by selecting "Computer Telephone Index" from the main MANTEL menu. The system then issues the prompt:

ENTER DESIRED TELEPHONE NO. AND RETURN

If the user enters anything other than exactly 10 digits at this prompt, the system answers:

ILLEGAL NUMBER: TRY AGAIN

If the user enters a telephone number which is not in use, the system answers:

UNKNOWN TELEPHONE NUMBER

If the area code is 212 (Manhattan), the system will normally display the screen shown within 5 seconds. For other area codes, the system must retrieve the necessary information from external databases. This may take up to 30 seconds.
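The dialogue specification above is concrete enough to prototype directly. A minimal sketch, assuming a hypothetical in-memory subscriber table (SUBSCRIBERS, lookup, and the sample entry are invented for illustration):

```python
# Sketch of the MANTEL dialogue rules: exactly 10 digits are required,
# unknown numbers get their own message, and area code 212 is served
# locally while other area codes would need an external database
# (which is why those lookups may take up to 30 seconds).
SUBSCRIBERS = {  # hypothetical sample data: number -> (name, address)
    "2125551234": ("J. Smith", "350 Fifth Ave, New York, NY"),
}

def lookup(number: str) -> str:
    if len(number) != 10 or not number.isdigit():
        return "ILLEGAL NUMBER: TRY AGAIN"
    if number not in SUBSCRIBERS:
        return "UNKNOWN TELEPHONE NUMBER"
    name, address = SUBSCRIBERS[number]
    external = number[:3] != "212"   # non-Manhattan: external database
    return f"{name}, {address}" + ("  [external lookup]" if external else "")

print(lookup("212555123"))   # ILLEGAL NUMBER: TRY AGAIN (only 9 digits)
print(lookup("2125550000"))  # UNKNOWN TELEPHONE NUMBER
print(lookup("2125551234"))  # J. Smith, 350 Fifth Ave, New York, NY
```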

Pages 9-13: Kursusgang 5

Exercise (3) through Exercise (7): these slides contain only images; no transcribable text.

Page 14: Kursusgang 5


Think-aloud versus heuristic inspection

• www.hotmail.com

• 8 laboratories tested the web site: professional companies, research groups, students

• The test had to cover a number of specified functions

• The actual conduct of the test could be organized freely

• The purpose was to examine the quality of usability tests

• 1 of the laboratories did not take the study seriously

• 6 of the laboratories based their evaluation on tests with users; they found between 17 and 75 problems of various categories

• 1 of the laboratories based its evaluation on a combination of heuristic inspection and tests with users: they found 150 problems, often described with the phrasing "might be a problem"; 107 of their problems were not found by any of the others; they found 19 out of 26 "core problems", but it is unclear how

Page 15: Kursusgang 5


RESULTS

Page 16: Kursusgang 5


Think-aloud versus inspection

                                        Individual   Group        Think-aloud
                                        inspection   inspection   test
Total time (hours)                      94           118          160
Problem types (count)                   49           68           159
Time per problem (hours)                1.9          1.7          1.0
Significant problem areas (count)       18           23           40
Time per significant problem area       5.2          5.1          4.0
Unique problems for each method (count) 0            1            13

Problem categories (of the significant problem areas):

                     Individual   Group        Think-aloud
                     inspection   inspection   test
Category 1 (count)   8            9            19
Category 2 (count)   9            13           18
Category 3 (count)   1            1            3
No action (count)    29           24           7

(Time per problem is total time divided by problem types, e.g. 94/49 ≈ 1.9; time per significant problem area is total time divided by significant problem areas, e.g. 160/40 = 4.0. Each method's category counts plus "no action" sum to the same 47 problem areas.)

(Karat, Campbell and Fiegel, Comparison of Empirical Testing and Walkthrough Methods in User Interface Evaluation, 1992)

Page 17: Kursusgang 5


Learning to Find Usability Problems in Internet Time

Mikael B. Skov & Jan Stage

Page 18: Kursusgang 5


Motivation

• Information technology: available to anyone, anywhere, anytime

• Strength: WWW is a significant move in that direction

• Weakness: Many web sites are designed and implemented in fast-paced projects by multidisciplinary teams

• Teams involve such diverse professions as information architects, Web developers, graphic designers, brand and content strategists, etc.

• Teams are usually not familiar with established knowledge on human-computer interaction

• The strong limitation in terms of price and development time effectively prohibits usability testing in the classical sense, conducted by experienced testers in sophisticated laboratories

• Methods tend to focus on analysis, design, and implementation

• The implied lack of focus on usability issues and practical skills with usability testing reflects a potential barrier for universal access of information on the Web

Page 19: Kursusgang 5


Empirical study (1)

Research questions:

• What is the potential for supporting universal access through dissemination of fundamental usability engineering skills?

• Can we teach a simple approach to usability testing to people with an interest in information technology but without formal education in software development or usability engineering, and do it in less than a week?

Overall design:

• A course for first-year students at Aalborg University, Denmark

• Subject: fundamentals of computerized systems, with particular emphasis on usability issues

• Ten class meetings, each with two hours of lecture and two hours of exercises in smaller teams

• Two primary techniques: the think-aloud protocol (Nielsen 1993) and questionnaires filled in after each task and after the entire test (Spool et al.)

• The exercises after the first four class meetings had the students conduct small usability pilot tests in order to train their practical skills

• The last six exercises were devoted to conducting a more realistic usability test of a web site: Hotmail.com

Page 20: Kursusgang 5


Empirical study (2)

• 36 teams of first-year university students used the simple approach to conduct a usability evaluation of the email services at Hotmail.com

• The 36 teams consisted of 234 students in total, of whom 129 acted as test subjects

• Degree programs: architecture and design, informatics, planning and environment, and chartered surveying

• All were part of a natural science or engineering program at Aalborg University

• Each team had to apply at least one of the two primary techniques and could supplement it with other techniques

• Each team chose among themselves a test monitor and a number of loggers; the rest of the team acted as test subjects

• Each team was given a very specific two-page scenario stating that they should conduct a usability test of the Hotmail web site (www.hotmail.com)

• The entire team worked together on the analysis and identification of usability problems and produced the usability report

                          Average   Min / Max
Team size                 6.5       4 / 8
Number of test subjects   3.6       2 / 5
Age of test subjects      21.2      19 / 30

Page 21: Kursusgang 5


Data collection and analysis

• The usability reports were the primary source of data for our empirical study

• All reports were analyzed, evaluated, and marked by both authors:

1. We worked individually and marked each report in terms of 16 different factors

2. The markings were compared, a new factor was added, and the characteristics of each factor were specified explicitly

3. We worked individually and re-marked all reports according to the 17 factors

4. All reports and evaluations were compared, and a final evaluation on each factor was negotiated

The markings were made on a scale of 1 to 5, with 5 being the best (a toy sketch of the marking steps follows the factor list below).

Five of the 17 factors:

1. The planning and conduct of the evaluation

2. The quality of the task assignments

3. The clarity and quality of the problems listed in the report

4. The practical relevance of these problems

5. The number and relevance of the usability problems identified
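The four marking steps are simple to mechanize. A toy sketch (the data, factor names, and tie-breaking rule are invented for illustration, not taken from the paper): both raters mark each report per factor on the 1-5 scale, equal marks stand, and differing marks are flagged for negotiation.

```python
# Toy sketch of the two-rater marking procedure: equal marks are kept,
# disagreements are flagged so a final mark can be negotiated.
author1 = {"task quality": 4, "problem clarity": 3}  # factor -> mark (1-5)
author2 = {"task quality": 4, "problem clarity": 2}

final: dict[str, object] = {}
for factor, mark1 in author1.items():
    mark2 = author2[factor]
    final[factor] = mark1 if mark1 == mark2 else ("negotiate", mark1, mark2)

print(final)
# {'task quality': 4, 'problem clarity': ('negotiate', 3, 2)}
```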


• Comparison with usability reports produced by eight professional laboratories (Molich 1999)

• They evaluated the same web site according to the scenario used by the students

• Their reports were analyzed, evaluated, and marked through the same procedure as the student reports

Page 22: Kursusgang 5


[Chart: Tasks. Percentage of evaluations at each mark (1-5); series: Professional and Student]

[Chart: Clarity of Problem List. Percentage of evaluations at each mark (1-5); series: Professional and Student]

Similar distribution

Tasks:

• The relevance of the tasks, the number of tasks, and the extent to which they cover the areas specified in the scenario

• The student teams cover all five marks of the scale, with an average of 3.3

• The professional laboratories score almost the same, with an average of 3.5

• This is by no means impressive for the professionals; it indicates a generally low quality of the tasks

Clarity of problem list:

• How well each problem is described, explained, and illustrated, and how easy it is to gain an overview of the complete list of problems

• The student teams are distributed mainly around the middle of the scale, with an average of 2.9

• The professional laboratories are distributed from 2 to 5, with an average of 3.5

• Again, not impressive for the professional laboratories.

Page 23: Kursusgang 5


Different distribution (1)

Test conduct:

• How well the tests were planned, organized, and carried out

• The student teams have an average of 3.7, and the majority score 4, indicating well-conducted tests with a couple of problematic characteristics

• The professional laboratories score an average of 4.6 on this factor, and 6 out of 8 score the top mark

• This is to be expected, because experience tends to raise this factor

Practical relevance of the problem list:

• The student teams are almost evenly distributed across the five marks of the scale, and their average is 3.2

• The professional laboratories score an average of 4.6, where 6 out of 8 laboratories score the top mark

• The reason may be the professionals' experience in expressing problems in a way that makes them relevant to their customers

• The course has focused too little on discussing the nature of a problem

[Chart: Test Conduction. Percentage of evaluations at each mark (1-5); series: Professional and Student]

[Chart: Practical Relevance of Problem List. Percentage of evaluations at each mark (1-5); series: Professional and Student]

Page 24: Kursusgang 5


Different distribution (2)

• A key aim in usability testing: to uncover and identify usability problems

• The student teams were on average able to find 7.9 problems; they found between 1 and 19 problems, with half of the teams finding between 6 and 10

• Thus the distribution seems reasonable

• The average for the professional laboratories is 23.0 problems identified

• Only one of them scores in the same group as a considerable number of student teams, that is, between 11 and 15 problems

• Only one student team identified a number of problems comparable to the professional laboratories

[Chart: Number of Problems. Percentage of evaluations per problem-count interval (1-3, 4-7, 8-12, 13-17, >17); series: Professional and Student]

[Chart: Number of Problems. Percentage of evaluations per problem-count interval (1-5 through 41-45)]