Simplifying Heuristic Evaluation for Older Children

Kishan Salian

UX Consultant

Mumbai 400092,

Maharashtra, India.

[email protected]

Gavin Sim

University of Central

Lancashire

Preston, UK.

[email protected]

ABSTRACT

This paper aims to identify whether children can perform a

heuristic evaluation based on a new modified method. In

total 12 children placed in groups of 4 participated in the

study, evaluating a music making game on a laptop. The

results showed that children could perform a heuristic

evaluation, identifying genuine usability problems using the

simplified method. The effectiveness of the method was

measured against ease of use for aspects such as mapping

problem to game rules (heuristics), using the ‘Bad Scale’

(severity) and the number of observed errors reported. The

children struggled to map all the problems to the game rules

and had difficulty rating some problems to the new ‘Bad

Scale’. Further research will be performed to refine the

process in an attempt to eliminate reported issues thus

improving the method for children.

Author Keywords

Heuristic Evaluation; Child Computer Interaction;

Evaluation Methods

ACM Classification Keywords

H5.2. Information interfaces and presentation (e.g., HCI):

User Interfaces - Evaluation/Methodology.

General Terms

Human Factors

INTRODUCTION

Over the past decade there has been a significant amount of

research undertaken to understand child behaviour while

interacting with technology, and this is partly down to the

rise in technology for children. Although the value of

involving children in the design process has been recognised,

most current evaluation methods still

require further research to ensure their validity within a global

market. Children have their own perception of technology

and extensive insight into child behaviour is needed before

conducting any evaluation studies with children. It becomes

more complicated when researchers adopt inspection

methods to evaluate products targeted for children, resulting

in a high risk of wrong assumptions being made about the

users’ behaviour.

There has been considerable research in Child Computer

Interaction (CCI) in establishing the suitability and

effectiveness of many traditional evaluation methods for

use with children [1-3]. These studies have highlighted the

modifications that are required to traditional methods and

demonstrated the effectiveness of the methods in a given

context. However, the majority of the studies are user based

and there are few studies that have examined inspection

based approaches, especially the heuristic evaluation

method [4].

Researchers have recognised the need to involve children as

“active participants” in designing new technology for them

[8]. Various studies have identified different roles for

children in the design processes, e.g. stakeholders [8],

design partners [10], informants [11], testers [12] or expert

evaluators [16]. Given the understanding that children could

be domain experts in the use of technology it would appear

feasible that, given the correct training, they might be able

to act as expert evaluators within the context of a heuristic

evaluation.

Heuristic evaluation is the most popular [5] and the most cost- and

time-effective Usability Inspection Method (UIM) [21], and

arguably the easiest to learn [5]. It has been used

successfully to evaluate applications for both adults and

children in different contexts, but adults have always acted

as the evaluators. The experience of evaluators can affect

the quality of the results and this issue is referred to as the

evaluator effect [22].

The classic heuristic evaluation, by Nielsen [5], has been

modified to suit children. Following the same procedure as

the heuristic evaluation, a small number of expert evaluators

independently inspect a piece of software to identify

usability problems based on compliance with a number of

usability principles. After this initial stage the evaluators’

individual lists of problems are aggregated to form a single

list of known usability problems within the system under

investigation. One of the key requirements for evaluators to

successfully perform a heuristic evaluation is that the

evaluators are experts in usability and the domain under

investigation [6, 7]. Without this expert knowledge there is

a risk that a large number of problems reported by the

evaluators would be false positives (not real problems).

http://dx.doi.org/10.1145/2676702.2676704

It has been suggested that children are experts in the way they

interact with their world and with the technology around

them and, within CCI, capturing this expertise is believed to

be key to designing meaningful artifacts for children [8]. If

children can be considered experts then it may be feasible

that they could successfully perform a heuristic evaluation,

thus providing the CCI community and organizations

developing technology for children a cost effective

evaluation method.

The main objective of this research is to “test the modified

version of heuristic evaluation with children and measure

the acceptability of the new method”. The study seeks

answers to the following research questions:

• Will children be able to follow the new procedure and evaluate a product?
• Will children be able to understand and map the problems to the set of game rules (heuristics)?
• Are they able to assign severity ratings to an identified problem using the ‘Bad Scale’?
• Will the new participant comment forms simplify the process of recording problems?

METHOD

The aim of the study was to observe the simplified heuristic

evaluation technique without influencing the child

evaluators. The study adopted a mixed-method design, which

does not restrict researchers to planning the study within one

approach. Accordingly, this research relied on two different

research techniques (heuristic evaluation and direct

observation) to elicit data and draw conclusions. The

data was collected concurrently from participants

(quantitative) and observers (qualitative)

using the two techniques. The direct observation technique was

chosen over interviews and surveys as it would allow the

children to proceed uninterrupted and would not affect the

heuristic process. Figure 1 is a graphical depiction of

conducting two evaluation methods at the same time. At the

analysis stage both data sets were merged using a concurrent

triangulation strategy to interpret the findings.

Figure 1. Evaluation process with different techniques, pen

and paper shows data being collected during the activity.

Participants

Twelve children aged 10-11 years participated in this study.

They were all from a single primary school within the UK

and the teacher selected the participants from the class (this

was a convenience sample). On selection, the children were

placed in three different groups; each group contained four

participants (numbered 1 to 4).

For the direct observation, three researchers were recruited,

one acting as facilitator and two as observers. The observers

were experienced researchers specialized in HCI and CCI.

The facilitator initiated and coordinated the evaluation

process, whilst the observers documented any problems the

children were experiencing.

Apparatus

In order to carry out the heuristic evaluation all the child

participants used identical touch screen laptops which each

had the music making game called “JamMo” installed on

them. All laptops had the same configuration with the

ability to use a stylus for touch or the track pad. The

participants were allowed to choose any mode of interaction

while using the application. Figure 2 is a screenshot of the

JamMo application used within this study.

Figure 2. Screenshot of singing and composition activity in

JamMo application.

Modifying Heuristics

The process of creating simplified heuristics involved an

analysis of the frequently used heuristics from a previous

study [16] and a detailed literature review into heuristics

developed for game usability [4, 13-15]. Throughout this

process careful consideration was placed on the language

and terminology used within the heuristic set. The most

frequently used heuristics were rephrased to remove jargon

and, in order not to overwhelm the

evaluators, similar heuristics were merged to create a

smaller list of heuristics (see Table 1). It was anticipated

that the children would struggle to understand the meaning

of the term heuristic, thus the activity of removing jargon

also involved changing the title of the list from “list of

heuristics” to “game rules”, making it easier for children to

connect with the context of the evaluation. It is questionable

whether game rules is an appropriate term as the five rules

are more aligned to heuristics than game rules, but the

authors felt that this may be easier to explain to the child

evaluators.

Game rules (Select the rule broken)

1. Sound and visual images support the game
2. The player understands the messages in the game
3. Navigations are simple and easy to use
4. The game helps players to avoid making errors
5. The game provides help to the player

Table 1. Simplified set of heuristics for child evaluators.

Creating the ‘Bad Scale’

It has been previously reported that children faced difficulty

using Nielsen’s severity scale in the evaluation [16]. The

child evaluators struggled to comprehend the severity scale

and attach the problems to the numeric scale [16]. A

modified analogue scale was adapted from the ‘Smileyometer’

and the ‘Likert scale’ to capture severity ratings [17]. In a

heuristic evaluation the evaluators are predicting problems

that the users will encounter with the software. Therefore it

is anticipated that all the problems reported will be negative

and a scale needed to be designed to reflect this. Therefore

a new scale was developed based upon negative

terminology the children would understand and a visual aid.

The ‘Bad Scale’ contains three facial expressions along

with supportive texts representing feelings such as bad,

very bad and awful, see Table 2.

Bad Scale (Rate your problem)

Bad Very bad Awful

Table 2. A new visual analogue scale “Bad Scale”.

The title was changed from “Severity Ratings” to “Bad

Scale” to match the context and possibly make it easier

for children to interpret.

Integrated Comment Form

Salian, Sim and Read [16] identified that children struggled

to use multiple sheets within the evaluation process, in

particular child evaluators found it difficult moving

between the sheets containing the information to perform

the evaluation and the data capture forms. A new

participant comment form was designed (see Figure 3) to

reduce the activity of referring between the lists (heuristic

and severity). This form integrated both the sub-tasks of

selecting the heuristic which was violated and severity into

one form. The first column was for writing down the

problems, followed by the “Bad Scale” and “Game Rules”.

This form was perceived to be easier than having the

children transfer data from one form to another in terms of

severity and the heuristic. It was feasible for an evaluator to

select more than one game rule if they perceived the

problem violated more than one rule.
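As an illustration only, the integrated comment form described above can be modelled as one record per problem. The class and field names below (`CommentEntry`, `bad_scale`, `game_rules`) are hypothetical and are not part of the study materials; the rule wording follows Table 1 and the scale labels follow Table 2.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Game rules as worded in Table 1 and scale labels as in Table 2.
GAME_RULES = {
    1: "Sound and visual images support the game",
    2: "The player understands the messages in the game",
    3: "Navigations are simple and easy to use",
    4: "The game helps players to avoid making errors",
    5: "The game provides help to the player",
}
BAD_SCALE = ("Bad", "Very bad", "Awful")


@dataclass
class CommentEntry:
    """One row of the integrated comment form (hypothetical model)."""

    problem: str
    bad_scale: Optional[str] = None  # None means no rating was given
    game_rules: List[int] = field(default_factory=list)  # empty means no mapping

    def __post_init__(self) -> None:
        # Reject values that do not appear on the form.
        if self.bad_scale is not None and self.bad_scale not in BAD_SCALE:
            raise ValueError(f"unknown Bad Scale rating: {self.bad_scale!r}")
        unknown = [rule for rule in self.game_rules if rule not in GAME_RULES]
        if unknown:
            raise ValueError(f"unknown game rule(s): {unknown}")

    @property
    def double_mapped(self) -> bool:
        """True when the evaluator selected more than one game rule."""
        return len(self.game_rules) > 1


# Illustrative entry: the problem text is a comment reported in the results,
# but the rating and the rule selection are invented for this example.
entry = CommentEntry("If you do something wrong it does not tell you", "Bad", [4, 5])
```

Keeping the rating and the rule selection on the same record mirrors the single-sheet design: nothing has to be copied from one form to another.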

Figure 3. Usage of new individual comment form by

participants during heuristic evaluation.

Adding Interactivity

In a previous study the children seemed to have a short

attention span while performing the heuristic evaluation and

needed regular prompts during the session [16]. Therefore in

this study most of the instructions were planned to be given

in the form of small milestones within the tasks.

For the individual evaluation

• Tasks were written on the board and read out to help children understand the activity. They could also refer to them whenever they needed.
• After a task was read out, the children were encouraged to indicate their progress to the facilitator after achieving these milestones, giving a sense of focus in the evaluation, e.g. “Raise your hand as soon as you find the singing game” or “Inform the facilitator as soon as you find the singing game”.

For group discussion

• The final part of the heuristic evaluation requires the aggregation of the individual data into a single list of problems.
• To aid the evaluators the problems were written on a white board by the facilitator whilst the children read out their individual problems.
• The children then had the opportunity to say whether they had also found that problem.
• The aim was to make the process of merging all their problems into one comprehensive list more interactive.

Data collection

Data was collected from two sources, the heuristic

evaluations and from the observations by the researchers.

During the heuristic evaluation, the children were provided

with a newly designed individual comment sheet to record

the problems they found in the game. To assist the children

in completing the forms a simple example of a problem was

shown on a completed sheet. The researchers who were

watching the study used specially prepared forms to record

their observations; these were used both during the stage

when children were carrying out their individual

evaluations and later during the aggregation process.

Procedure

This study was a field-based evaluation using a computer

lab within a UK primary school. All laptops were placed at

a certain distance from each other so all participants had

enough space to perform their activities. This also

minimized the possibility of the children influencing each

other during the individual component of the evaluation.

Prior to the children commencing the evaluation an

experienced researcher informed them about the study and

about the extent of, and purpose of, their participation in the

subsequent activity. The children were briefed (for about 10

minutes) about the steps involved in performing a heuristic

evaluation:

• Play the game (one task at a time) for a few minutes
• Try to find some problems that they believed could hinder them or their friends from finishing the tasks
• Write the problems down on the sheet
• Indicate how bad they think the problem is using the ‘Bad Scale’ and which game rules it broke
• Finally the problems need to be merged into a single list
• Individually read out your problem and if anyone else has the same problem on their sheet raise their hand
• Agree what the ‘Bad Scale’ rating should be

Explanations were kept to a minimum in order to keep the

children engaged. The teacher was present during this

explanation and the children were given the opportunity to

ask any questions.

As each child came to the study, he or she was again

informed about the objective of the study. The roles of the

observers were explained and they were informed that they

could opt out of the proceedings at any time if they did not

want to continue with the evaluation.

The study was in two parts; the first part (taking around 15

– 20 minutes) involved the children performing an

individual evaluation of the game to identify possible

problems. This session was quite structured with two main

game related tasks to perform and these tasks were verbally

communicated and written on the white board. The

participants were first required to find a particular task and

raise their hand or inform one of the team members; once

found, they were instructed to play the section for a few

minutes, e.g., Activity 1 - Find “singing game” in the

software and raise your hand once you have found it. The

children were then asked to play the game for a few

minutes by recording songs in their own voice. During this

process they were reminded to keep writing any problems

they encountered on the forms provided. The facilitator

constantly provided assistance with any problems related to

using the comment forms, the heuristic, and the severity

ratings, making sure participants were only guided to

perform the reporting tasks (i.e. the practicalities) rather

than hinting at problems for them. During this time the

observers recorded any issues the children encountered in

completing the forms or understanding what was required

of them.

In the second part of the study, each group of child

participants were asked to share their findings with each

other and merge all the problems into one aggregated list.

One by one, each child read out their list of found problems

to the group whilst the other children sought to find similar

problems in their own lists. Once a similar problem was

found, the children were asked to tick a box beside that

specific problem in their own comment forms to eliminate

duplicate problems. At the same time the facilitator (rather

than the children, who would have taken a long time) wrote

the problems onto a white board placed in front of the

participants. This activity continued until all the children

within the group had shared all their problems and resulted

in there being three sets of data (one set from each group).
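The aggregation step described above can be sketched in code, purely as an illustration. In the study the children themselves judged whether two problems were the same; here that judgement is passed in as a hypothetical `same_problem` function, and the function name `aggregate` and the example comments are likewise assumptions.

```python
from typing import Callable, Dict, List


def aggregate(individual: Dict[str, List[str]],
              same_problem: Callable[[str, str], bool]) -> List[dict]:
    """Merge each child's list into one group list without duplicates."""
    board: List[dict] = []  # stands in for the facilitator's white board
    for child, problems in individual.items():
        for problem in problems:
            for entry in board:
                if same_problem(entry["problem"], problem):
                    # Another child already reported it: record who else found it.
                    if child not in entry["found_by"]:
                        entry["found_by"].append(child)
                    break
            else:
                # Nobody has reported this problem yet: add it to the board.
                board.append({"problem": problem, "found_by": [child]})
    return board


# Hypothetical comments from two children in one group.
group = {
    "A": ["needs text under buttons"],
    "B": ["Needs text under buttons", "dragging is hard"],
}
board = aggregate(group, lambda a, b: a.lower() == b.lower())
```

A case-insensitive string match is used only to keep the example runnable; it is a far cruder test of similarity than the group discussion it stands in for.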

Throughout the two parts of the study, the facilitator and

observers occasionally intervened in order to keep the

children focused on the evaluation. For example some of

the children were found playing the game and losing track

of time. Some children had to be given a little more time to

write their problems down after resolving their confusion

with the facilitator.

Analysis

Two sets of data were derived from this study, the first

being the usability problems the child evaluators

documented and the second the observational data relating

to problems the evaluators encountered.

For the observational data, this data was analyzed in two

stages to confirm anticipated issues related to children’s

ability to perform an evaluation. The first step included a

“Data reduction” process that aimed to remove duplicate

comments made by the observers and just capture the

frequency. The next stage involved the data being coded

into themes using an open card sort which was carried out

by the authors of this paper. Figure 4 shows a graphical

representation of the data merging process. For the usability

problems, the data from each group was analyzed to

identify issues such as incomplete forms and inaccurate

classifications to game rules. By reading through each

comment, the problems identified by the children and the

mappings of those problems to heuristics and severity

ratings were analyzed by the research team to examine any

critical game play problems and to identify the

effectiveness of the children in following the procedure.

Figure 4. Data merging process, 1) Merging observer problems

within group or individual participant comments and 2)

Merging problems between groups for both data sources.

The three aggregated lists of usability problems were

merged by the researchers into a single list of problems.
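The “data reduction” step applied to the observer notes (removing duplicate comments while keeping their frequency) could be sketched as follows. This is a minimal illustration, not the procedure the authors used: the function name and the example notes are hypothetical, and the subsequent card sort into themes was a manual activity that the code does not attempt to reproduce.

```python
from collections import Counter
from typing import Iterable


def reduce_observations(notes: Iterable[str]) -> Counter:
    """Collapse duplicate observer comments, keeping how often each occurred."""
    # Normalise case and whitespace so trivially different notes count as one.
    normalised = (" ".join(note.lower().split()) for note in notes)
    return Counter(normalised)


# Hypothetical observer notes, invented for this example.
notes = [
    "Child unsure which rule to tick",
    "child unsure  which rule to tick",
    "Wrote two problems in one box",
]
frequencies = reduce_observations(notes)
```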

RESULTS

The first research question was whether children can conduct a

heuristic evaluation using the new procedure to evaluate a

product.

All the children were able to follow the instructions and to

some degree successfully conduct a heuristic evaluation.

Overall 18 usability problems were found with an average

of 1.50 problems per participant across all the groups (see

Table 3).

Group   Unique problems (by participant)   Merged     Total problems   Mean problems
        A    B    C    D                   problems   (by group)       (by group)
1       1    0    0    1                   1          3                0.75
2       1    1    1    2                   1          6                1.50
3       4    1    1    1                   2          9                2.25
All                                                   18               1.50

Table 3. Number of problems found during heuristic

evaluation across all the groups.
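The totals and means in Table 3 can be checked with a few lines of arithmetic. The sketch below assumes the table is read as four unique-problem counts per group (children A to D) plus one merged-problem count; the variable and function names are illustrative.

```python
# Unique problems per child (A-D) and merged problems per group, read from Table 3.
table3 = {
    1: {"unique": [1, 0, 0, 1], "merged": 1},
    2: {"unique": [1, 1, 1, 2], "merged": 1},
    3: {"unique": [4, 1, 1, 1], "merged": 2},
}


def group_total(row: dict) -> int:
    """Unique problems found by each child plus the problems they shared."""
    return sum(row["unique"]) + row["merged"]


def group_mean(row: dict) -> float:
    """Problems per participant within one group."""
    return group_total(row) / len(row["unique"])


totals = {group: group_total(row) for group, row in table3.items()}
means = {group: group_mean(row) for group, row in table3.items()}

overall_total = sum(totals.values())
participants = sum(len(row["unique"]) for row in table3.values())
overall_mean = overall_total / participants
```

Under this reading the group totals are 3, 6 and 9, giving the 18 problems and the mean of 1.50 problems per participant reported in the text.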

The adult observers did not document any problems related

to the child applying game rules (heuristics) and ‘Bad

Scale’ (severity rating). It appeared that the majority of

child evaluators were able to understand the new simplified

version of game rules with some assistance (clarification

from facilitator) and mapped at least one of their problems

to each of the game rules.

Even though the children seemed to understand the game rules,

50% of the unique and 22% of the merged (common) problems

were left unmapped. Further analysis of the mapped

problems, confirmed with academics, indicated that all of the

mapped problems were correctly mapped to the

game rules.

Game Rules (NM = No Mapping, * = Double Mapping)

Unique Problems

GR1 GR2 GR3 GR4 GR5 NM Total

2 3 2 1 * 6 14

Table 4. Tagging of game rules (heuristics) to unique problems

by children during evaluation.

There were two instances of “double mapping” where

participants tagged their found problem to two game rules.

For example, an evaluator mapped one problem to Game

Rules 4 and 5 (see Table 4). The final mapping was decided

by the participants during the group discussion stage. Tables

4 and 5 show the mapping of unique and merged (common)

problems to the game rules respectively (GR indicates game

rule). The number and letter (indicating the child) in the

tables marked with red color indicate the double mapping

of the problems. For example, two children (C and D)

from Group 1 found the same problem and mapped it to one

game rule (GR2), see Table 5.

Game Rules (NM = No Mapping, * = Double Mapping)

Merged Problems

Groups GR1 GR2 GR3 NM Total

1 (C+D) 1

2 (A+B) 1

3 (A) (D) 1

* (A+C) (D) 1

Table 5. Tagging of game rules (heuristics) to merged problems

by children during evaluation.

Similarly, the participants had no issues relating the

analogue ‘Bad Scale’ to severity. All 18 problems

were mapped to the ‘Bad Scale’. The child participants rated 10 of the 14 unique

problems as “bad” and the other four as

“very bad” (see Table 6). The total number of problems

found (18) and the number of ‘Bad Scale’ ratings do not match because

each participant’s individual rating is shown in the merged

problems section, i.e., one merged (common) problem

found in Group 3 by three child evaluators (A+C+D) is

displayed individually (A = bad, C = very bad and D =

bad). The four merged problems were rated nine times in total

by the children, see Table 6.

Groups                    Problems   Bad Scale (NR = No rating)
                          Found      Bad   Very Bad   Awful   NR
Unique Problems
1                         2          1     1          0       0
2                         5          5     0          0       0
3                         7          4     3          0       0
Total - Unique            14         10    4          0       0
Merged Problems
Total - All Merged        4          8     1          0       0
Total - Unique + Merged   18         18    5          0       0

Table 6. Usage of ‘Bad Scale’ by child evaluators during

heuristic evaluation.
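The relationship between problems and ratings in Table 6 can be tallied as a quick check. The sketch below simply re-adds the counts from the table; the variable names are illustrative.

```python
from collections import Counter

# 'Bad Scale' ratings of unique problems per group, read from Table 6.
unique_ratings = {
    1: Counter({"Bad": 1, "Very bad": 1}),
    2: Counter({"Bad": 5}),
    3: Counter({"Bad": 4, "Very bad": 3}),
}
# Each merged problem is rated once by every child who found it.
merged_ratings = Counter({"Bad": 8, "Very bad": 1})

unique_total = sum(unique_ratings.values(), Counter())
all_ratings = unique_total + merged_ratings

problems_found = 14 + 4  # unique problems + merged problems
ratings_given = sum(all_ratings.values())
```

The tally gives 23 ratings for 18 problems: the 14 unique problems carry one rating each, while the four merged problems carry nine ratings between them. No rating falls under “Awful”.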

No problems were rated as “Awful” by any of the

participants. This could indicate some problem in the usage of the

analogue scale, as there was confusion amongst some children

while using the newly designed comment form and they

needed some assistance from the facilitator to guide them.

Some of the participants were noticed to be writing all the

problems within one column of the comment form, having

interpreted the number beside the problem column as the

task number.

The other research aim was to determine whether children can

inspect interfaces.

Some genuine usability problems were reported by the

participants during the evaluation. Examples of these

problems are reported below, presented in the children’s

own vocabulary:

• The drag and drop functionality was viewed as a problem for children in reception class, with comments such as “Dragging items might be a bit of a problem for the little once” and “They wouldn’t know how hard to click or drag, they might press hard because they think it’s not working”.
• Participants also reported that the game needs to be more interactive and provide help to the user, e.g., “They (reception class children) may not know what a track is so there should be an arrow pointing at it”, “They won’t know that you can move the track to put more sounds in”, “They won’t know that when you play the sounds that you can add more sounds and take away the sounds” and “If you do something wrong it does not tell you”.
• The words used in the game were considered difficult for younger children to understand, e.g., “The word composition would be difficult for them to understand” and “Different words then children know”.
• Some participants felt the images or buttons in the game were not self-explanatory and needed supportive text to avoid confusing children, e.g., “Write under the buttons” and “Reception children would not know what all the buttons are for”.
• Similar to the comments above, others indicated that some images do not represent the sounds attached to them, saying “Cannot find a piece of music” and “The noises don’t represent the items of music at times”.
• During the recording session in the singing game, one participant noted that “It would be good to have writing underneath, so they know what to sing”.

To aid future work it was also necessary to understand:

What was difficult?

The adult observers reported some major concerns in both

parts of the sessions. A total of 23 problems were observed

and recorded in the first part of the evaluation (individual

session). In the second part (group aggregation) 19

problems were recorded. These identified problems were

then categorized into different themes and scored with the

frequency of their occurrence during the evaluation e.g., 12

participants x 2 observers = 24 frequencies per problem, see

Table 7.

Most of the problems arose from the children having

difficulty identifying problems; this was observed 10

times, making it the most detected category. It was noted

that participants found problems but did not record them on

their forms. For example, one participant faced a problem

and overcame it but did not record it. Both observers found

that all groups needed prompts to keep the children focused on the

evaluation. On some occasions participants were not sure

what to look for in the game, and a participant from Group 2

did not find any problem as they thought it could be easily

operated. Another participant was identified focusing on

mechanical problems rather than interface problems.

Indicators                               Total Problems
Understanding Tasks                      4
Understanding Game Rules (Heuristics)    3
Bad Scale (Severity Ratings)             2
Identification of Problems               10
Interpret Findings                       0
Understanding Forms                      2
Others                                   2

Table 7. Frequency of problems found during individual

evaluation within and across the groups.

Despite the facilitator writing the task on the board and then

verbally communicating to the participants, they had

problems understanding the tasks (4 times), for example

find the singing game. It was observed that some of the

participants were not concentrating on the given tasks and

were exploring the application instead. It was also noted that participants

in some groups found it very difficult to identify which

section signified the “singing game”.

It was observed that participants failed to map game rules to

all the recorded issues; despite the initial training, they

needed some assistance understanding the rules (3 times).

Even though participants understood the ‘Bad Scale’ as

severity, the observers noted that children were unsure

while mapping the severity through ‘Bad Scale’ (2 times).

Two problems were noted by observers while participants

used the newly designed comment forms. The children seemed to

write multiple problems in one single box and found it

difficult to map game rules and the ‘Bad Scale’ to them. The

theme “others” included problems such as participants writing

down the same problem that friends had said out loud. One of the

participants in the first group had played the game before

and was familiar with the given tasks.

Table 8 below, identifies the problems the children

encountered during the data aggregation phase.

Indicators Total Problems

Communication Problems 7

Identification Problems 0

Final Ratings 5

Others 7

Table 8. Frequency of problems found during group

evaluation within and across the groups.
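The theme frequencies in Tables 7 and 8 can be summed to confirm the totals reported by the observers (23 problems in the individual session and 19 in the group session). The dictionary names below are illustrative.

```python
# Observer-recorded problems per theme, read from Table 7 (individual session).
individual_session = {
    "Understanding Tasks": 4,
    "Understanding Game Rules (Heuristics)": 3,
    "Bad Scale (Severity Ratings)": 2,
    "Identification of Problems": 10,
    "Interpret Findings": 0,
    "Understanding Forms": 2,
    "Others": 2,
}
# Observer-recorded problems per theme, read from Table 8 (group session).
group_session = {
    "Communication Problems": 7,
    "Identification Problems": 0,
    "Final Ratings": 5,
    "Others": 7,
}

individual_total = sum(individual_session.values())
group_total = sum(group_session.values())
most_frequent = max(individual_session, key=individual_session.get)

# Upper bound on how often one problem could be logged, as stated in the text.
max_frequency = 12 * 2  # 12 participants x 2 observers
```

With a ceiling of 24 observations per problem, the most frequent theme (identification of problems, observed 10 times) was still recorded in fewer than half of the possible instances.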

During the group discussion phase children were also seen

to be having some problems. In the “others” theme, the

adult observers had registered problems related to

proceedings of the group discussion session. The facilitator

failed to write the exact same words read by the participants

on the board during the problem merging process e.g.,

Group 3 – Participant A read a problem as “It would be

good to have writing underneath, so they know what to

sing” and facilitator wrote it on the board as “Need text

underneath”. This could have led to confusion between the

participants trying to identify similar problems in their

comment sheets.

The child evaluators were observed to be confused while

reading out problems from their comment sheets - for

example not reading the problems, ratings and selected

game rules in a correct order to the group. During the

debriefing session it was pointed out that participants were not

able to differentiate between the ‘Bad Scale’ and game rule

sections on the participant comment form and needed to

have some assistance during the process. The observers

reported that most of the problems were rated “bad” while

confirming the final ratings and supposed that this could be based

on opinion rather than on a metric scale.

DISCUSSION AND FURTHER RESEARCH

The study reconfirmed the ability of children to critically

evaluate a product and identify genuine problems [16, 17].

The new procedure involved more interaction between the

team and participants resulting in a more focused

evaluation. It gave a sense of direction to the product

evaluation by working on the specific task given to the

participants. There were instances when participants were

more engaged in exploring the game and not concentrating

on the given tasks. Timely prompts were required to keep

the children focused on the evaluation e.g. prompting

participants to play the specific activity for a few minutes or

recording problems before starting the next tasks. However,

the majority of children were able to understand the

procedures and accomplish the given tasks. The children

were also able to identify problems and interpret them

comfortably. This was confirmed by observers not

recording any issues under the “interpret findings” theme.

Some issues related to mapping of newly modified game

rules to the found problems were observed across the

groups. While observers reported three problems, the

participant data indicated about 50 percent of unique and

22.22 percent of merged problems were found unmapped. It

confirmed that children to some extent were able to

understand the game rules (heuristics). It showed that reducing

the number of game rules and simplifying them by

removing complicated words from the traditional heuristics

had a good impact on participants and their understanding of

the problems. However, given the fact that 50% of problems

were unmapped the game rules might not have offered

sufficient coverage of the problems the children identified.

A new form of visual analogue scale ‘Bad Scale’ was tested

in this evaluation to gather the opinions of the children about

their found problems. The participant data revealed that

children were comfortable rating the severity of the

problem using a new analogue scale of ratings with visual

pointers (faces) along with respective labels, which directly

indicated the feeling of the found problems (bad, very bad

and awful). However, the effectiveness of the scale was not

very convincing as most of the problems were consistently

rated as “bad” and “very bad”. No problems were rated as

“Awful” by any of the participants. This might have been a

result of the design of the software and the fact that there

were no real major problems inherent within the game and

the children only identified minor issues.

The new comment forms for the participant were easily

completed by most of the children to capture their

problems. The steps of referring to heuristics and assigning severity

ratings were reduced by incorporating both activities into

one participant comment form. The participants were able

to differentiate between the sections within the participant

comment form: writing down the problems, rating

each problem by ticking one of the ‘Bad Scale’ indicators,

and selecting the closest game rule, all on one form.

However, there was one issue of misinterpretation that was

observed. The participants associated the number on the

form as one section to write all the problems related to one

task. They were able to quickly recover from their mistake

and managed to record the issues in the individual columns

after a number of prompts. It was noticed that this problem

could easily be avoided by removing the number beside

the “problems found” columns.

During the data aggregation stage, the observers noticed

participants getting confused while reading out the

problems. A prominent partition between the two sections of

the participant comment form is therefore recommended.

The overall feasibility of using the simplified heuristic

evaluation with children was encouraging. At the

same time, there were a few limitations that were identified

throughout the process of this research. The children were

able to find genuine problems and report their findings to

the group. However, due to their lack of experience in the area of

usability, the children may not recognise the impact of the

problems in the game.

As mentioned earlier, it was observed that some of the

participants were not concentrating on the given tasks and

needed prompts to stay focused on the evaluation. This

confirms that the process requires considerable assistance

and an experienced facilitator to coordinate the entire

evaluation. The participants in some groups found it very

difficult to identify the tasks. There is a need to collaborate

with teachers and parents when framing the questions or

tasks for children.

Falsification testing could not be performed; therefore,

comparison of the problem data sets from user studies was

not possible. A user study would be needed to compare the

results to investigate the effectiveness of the method. To

further validate the data, a focus group or paired interviews

[18] with children could be conducted. Secondly,

conducting a similar heuristic evaluation with adult

participants could possibly strengthen the data by

comparing evaluation results between children and adults.

Further studies are needed to compare the effectiveness of

the new modified version of heuristic evaluation and more

improvements are needed to refine each component in the

evaluation. Components such as the simplified heuristic set,

‘Bad Scale’, comment form, tasks and instructions are

equally important to successfully perform the heuristic

evaluation. These different components can be tested and

refined individually, e.g. a study could be conducted to

investigate how children understand words. The number

of points within the ‘Bad Scale’ needs to be investigated to

determine whether a three-point scale is sufficient within

this context. The observer comment form appeared

effective at capturing issues in the evaluation and could

be used in future research.

ACKNOWLEDGMENTS

I am heartily thankful to my supervisors, Gavin Sim and

Janet Read, whose encouragement and guidance at every step

enabled me to develop and complete my research.

I’m also very thankful to everyone from the department of

Computing, Engineering and Physical Sciences at

University of Central Lancashire and especially all the staff

from ChiCI group for their co-operation and help, which

has made it possible to carry out the study.

REFERENCES

1. Read, J.C. Validating the Fun Toolkit: an instrument for

measuring children's opinion of technology. Cognition,

Technology and Work, 10, 2 (2008), 119-128.

2. Baauw, E., and Markopoulos, P. A comparison of

think-aloud and post-task interview for usability testing with

children. In IDC 2004, ACM (2004), 115-116.

3. Zaman, B., and Abeele, V.V. Laddering with Young

Children in User Experience Evaluations: Theoretical

Groundings and a Practical Case. In IDC 2010, ACM

(2010), 156-165.

4. MacFarlane, S., and Pasiali, A. Adapting the heuristic

evaluation method for use with children. In Workshop

on child computer interaction: methodological research

2005, Interact (2005).

5. Nielsen, J. Finding usability problems through heuristic

evaluation. In Proceedings of the SIGCHI conference on

Human factors in computing systems 1992, ACM

(1992), 373-380.

6. Law, E.L.-C., and Hvannberg, E.T. Consolidating

Usability Problems with Novice Evaluators.

In Proceedings of the 5th Nordic conference on Human-

computer interaction: building bridges 2008, ACM

(2008), 495-498.

7. Nielsen, J. Enhancing the Explanatory Power of

Usability Heuristics. In Proceedings of the SIGCHI

conference on Human factors in computing systems:

celebrating interdependence 1994, ACM (1994), 152-

158.

8. Iversen, O.S., and Brodersen, C. Building a BRIDGE

between children and users: a socio-cultural approach to

child–computer interaction. Cognition, Technology &

Work, 10, 2 (2008), 83-93.

9. Read, J.C., et al. Child computer interaction. In CHI '08

Extended Abstracts on Human Factors in Computing

Systems 2008, ACM (2008), 2419-2422.

10. Druin, A., et al. Children as our technology design

partners. Morgan Kaufmann Publishers Inc, San

Francisco, CA, USA, 1998.

11. Scaife, M., and Rogers, Y. Kids as informants: Telling

us what we didn’t know or confirming what we knew

already. Morgan Kaufmann Publishers Inc, San

Francisco, CA, USA, 1999.

12. Usability of websites for children: 70 design guidelines.

http://www.nngroup.com/reports/kids/.

13. Desurvire, H., Caplan, M., and Toth, J.A. Using

heuristics to Evaluate the Playability of Games. In CHI

2004, ACM (2004), 1509-1512.

14. Korhonen, H., and Koivisto, E.M.I. Playability

heuristics for mobile multi-player games. In

Proceedings of the 2nd international conference on

Digital interactive media in entertainment and arts.

2007, ACM (2007), 28-35.

15. Pinelle, D., Wong, J., and Stach, T. Heuristic Evaluation

for Games: Usability Principles for Video Game Design.

In Proceedings of the SIGCHI Conference on Human

Factors in Computing Systems 2008, ACM (2008),

1453-1462.

16. Salian, K., Sim, G., and Read, J.C. Can children perform

a heuristic evaluation? In Proceedings of the 11th Asia

Pacific Conference on Computer Human Interaction

2013, ACM (2013), 137-141.

17. Read, J.C., and MacFarlane, S. Using the Fun Toolkit

and Other Survey Methods to Gather Opinions in Child

Computer Interaction. In Proceedings of the 2006

conference on Interaction design and children 2006,

ACM (2006), 81-88.

18. Children's Websites: Usability Issues in Designing for

Kids.

http://www.nngroup.com/articles/childrens-websites-usability-issues/.

19. Woolrych, A., and Cockton, G. Testing a Conjecture

based on the DR-AR Model of UIM Effectiveness. In

Proceedings of HCI 2002, 30 – 33.

20. Cockton, G., and Woolrych, A. Understanding

inspection methods: lessons from an assessment of

heuristic evaluation. Springer-Verlag, London, 2001.

21. Nielsen, J. Usability inspection methods. In CHI 1994,

ACM Press (1994), 413–414.

22. Ling, C., and Salvendy, G. Effect of evaluators'

cognitive style on heuristic evaluation: Field dependent

and field independent evaluators. Int. J. Hum.-Comput.

Stud. 67, 4 (2009), 382-393.
