Simplifying Heuristic Evaluation for Older Children
Kishan Salian
UX Consultant
Mumbai 400092,
Maharashtra, India.
Gavin Sim
University of Central
Lancashire
Preston, UK.
ABSTRACT
This paper aims to identify whether children can perform a
heuristic evaluation based on a new modified method. In
total 12 children placed in groups of 4 participated in the
study, evaluating a music making game on a laptop. The
results showed that children could perform a heuristic
evaluation, identifying genuine usability problems using the
simplified method. The effectiveness of the method was
measured against ease of use for aspects such as mapping
problem to game rules (heuristics), using the ‘Bad Scale’
(severity) and the number of observed errors reported. The
children struggled to map all the problems to the game rules
and had difficulty rating some problems to the new ‘Bad
Scale’. Further research will be performed to refine the
process in an attempt to eliminate reported issues thus
improving the method for children.
Author Keywords
Heuristic Evaluation; Child Computer Interaction;
Evaluation Methods
ACM Classification Keywords
H5.2. Information interfaces and presentation (e.g., HCI):
User Interfaces - Evaluation/Methodology.
General Terms
Human Factors
INTRODUCTION
Over the past decade there has been a significant amount of
research undertaken to understand child behaviour while
interacting with technology, and this is partly down to the
rise in technology for children. Although the value of
involving children in the design process has been recognised,
most current evaluation methods still
require further research to ensure their validity within a global
market. Children have their own perception of technology
and extensive insight in child behaviour is needed before
conducting any evaluation studies with children. It becomes
more complicated when researchers adopt inspection
methods to evaluate products targeted for children, resulting
in a high risk of wrong assumptions being made about the
users’ behaviour.
There has been considerable research in Child Computer
Interaction (CCI) in establishing the suitability and
effectiveness of many traditional evaluation methods for
use with children [1-3]. These studies have highlighted the
modifications that are required to traditional methods and
demonstrated the effectiveness of the methods in a given
context. However, the majority of the studies are user based
and there are few studies that have examined inspection
based approaches, especially the heuristic evaluation
method [4].
Researchers have recognised the need to involve children as
“active participants” in designing new technology for them
[8]. Various studies have identified different roles for
children in the design processes, e.g. stakeholders [8],
design partners [10], informants [11], testers [12] or expert
evaluators [16]. Given the understanding that children could
be domain experts in the use of technology it would appear
feasible that, given the correct training, they might be able
to act as expert evaluators within the context of a heuristic
evaluation.
The heuristic evaluation method is the most popular [5] and the
most cost- and time-effective Usability Inspection Method (UIM) [21],
and arguably the easiest to learn [5]. It has been used
successfully to evaluate applications for both adults and
children in different contexts, but adults have always acted
as the evaluators. The experience of evaluators can affect
the quality of the results and this issue is referred to as the
evaluator effect [22].
The classic heuristic evaluation by Nielsen [5] has been
modified to suit children. Following the same procedure as
the classic heuristic evaluation, a small number of expert evaluators
independently inspect a piece of software to identify
usability problems based on compliance to a number of
usability principles. After this initial stage the evaluators’
individual lists of problems are aggregated to form a single
list of known usability problems within the system under
investigation. One of the key requirements for evaluators to
successfully perform a heuristic evaluation is that the
evaluators are experts in usability and the domain under
investigation [6, 7]. Without this expert knowledge there is
a risk that a large number of problems reported by the
evaluators would be false positives (not real problems). It
has been suggested that children are experts in the way they
interact with their world and with the technology around
them and, within CCI, capturing this expertise is believed to
be key to designing meaningful artifacts for children [8]. If
children can be considered experts then it may be feasible
that they could successfully perform a heuristic evaluation,
thus providing the CCI community and organizations
developing technology for children a cost effective
evaluation method.
The main objective of this research is to “test the modified
version of heuristic evaluation with children and measure
the acceptability of the new method”. The study seeks to
answer the following research questions:
Will children be able to follow the new procedure and
evaluate a product?
Will children be able to understand and map the
problems to the set of game rules (heuristics)?
Will they be able to assign severity ratings to an identified
problem using the ‘Bad Scale’?
Will the new participant comment forms simplify the
process of recording problems?
METHOD
The aim of the study was to observe the simplified heuristic
evaluation technique without influencing the child
evaluators. The study adopted a mixed-method design, which
does not restrict researchers to planning the study within one
approach. Accordingly, this research relied on two different
research techniques (heuristic evaluation and direct
observation) to elicit data and draw conclusions. The
data was collected concurrently from the participants
(quantitative) and the observers (qualitative)
using these two techniques. The direct observation technique was
chosen over interviews and surveys as it would allow the
children to proceed uninterrupted and would not affect the
heuristic process. Figure 1 gives a graphical depiction of
the two evaluation methods being conducted at the same time. At the
analysis stage both data sets were merged using a concurrent
triangulation strategy to interpret the findings.
Figure 1. Evaluation process with different techniques, pen
and paper shows data being collected during the activity.
Participants
Twelve children aged 10-11 years participated in this study.
They were all from a single primary school within the UK
and the teacher selected the participants from the class (this
was a convenience sample). On selection, the children were
placed in three different groups; each group contained four
participants (numbered 1 to 4).
For the direct observation, three researchers were recruited,
one acting as facilitator and two as observers. The observers
were experienced researchers specialized in HCI and CCI.
The facilitator initiated and coordinated the evaluation
process, whilst the observers documented any problems the
children were experiencing.
Apparatus
In order to carry out the heuristic evaluation all the child
participants used identical touch screen laptops which each
had the music making game called “JamMo” installed on
them. All laptops had the same configuration with the
ability to use a stylus for touch or the track pad. The
participants were allowed to choose any mode of interaction
while using the application. Figure 2 is a screenshot of the
JamMo application used within this study.
Figure 2. Screenshot of singing and composition activity in
JamMo application.
Modifying Heuristics
The process of creating simplified heuristics involved an
analysis of the frequently used heuristics from a previous
study [16] and a detailed literature review into heuristics
developed for game usability [4, 13-15]. Throughout this
process careful consideration was placed on the language
and terminology used within the heuristic set. The most
frequently used heuristics were rephrased to remove jargon
and, in order not to overwhelm the
evaluators, similar heuristics were merged to create a
smaller list of heuristics (see Table 1). It was anticipated
that the children would struggle to understand the meaning
of the term heuristic, thus the activity of removing jargon
also involved changing the title of the list from “list of
heuristic” to “game rules”, making it easier for children to
connect with the context of the evaluation. It is questionable
whether game rules is an appropriate term as the five rules
are more aligned to heuristics than game rules, but the
authors felt that this may be easier to explain to the child
evaluators.
Game rules (Select the rule broken)
1  Sound and visual images support the game
2  The player understands the messages in the game
3  Navigations are simple and easy to use
4  The game helps players to avoid making errors
5  The game provides help to the player
Table 1. Simplified set of heuristics for child evaluators.
Creating the ‘Bad Scale’
It has been previously reported that children faced difficulty
using Nielsen’s severity scale in the evaluation [16]. The
child evaluators struggled to comprehend the severity scale
and attach the problems to the numeric scale [16]. A
modified analogue scale was adapted from the ‘Smileyometer’
and the ‘Likert scale’ to capture severity ratings [17]. In a
heuristic evaluation the evaluators are predicting problems
that the users will encounter with the software. It
is therefore anticipated that all the problems reported will be negative,
and a scale needed to be designed to reflect this. A
new scale was thus developed based upon negative
terminology the children would understand and a visual aid.
The ‘Bad Scale’ contains three facial expressions along
with supportive texts representing feelings such as bad,
very bad and awful, see Table 2.
Bad Scale (Rate your problem)
Bad Very bad Awful
Table 2. A new visual analogue scale “Bad Scale”.
The title was changed from “Severity Ratings” to “Bad
Scale” to match the context and possibly making it easier
for children to interpret.
Integrated Comment Form
Salian, Sim and Read [16] identified that children struggled
to use multiple sheets within the evaluation process, in
particular child evaluators found it difficult moving
between the sheets containing the information to perform
the evaluation and the data capture forms. A new
participant comment form was designed (see Figure 3) to
reduce the activity of referring between the lists (heuristic
and severity). This form integrated both the sub-tasks of
selecting the heuristic which was violated and severity into
one form. The first column was for writing down the
problems, followed by the “Bad Scale” and “Game Rules”.
This form was perceived to be easier than having the
children transfer data from one form to another in terms of
severity and the heuristic. It was feasible for an evaluator to
select more than one game rule if they perceived the
problem violated more than one rule.
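As an illustration only, one row of the integrated comment form can be thought of as a single record holding the problem text, an optional ‘Bad Scale’ rating and a set of game rules. The sketch below is not taken from the study materials: the names CommentEntry, BadScale and GAME_RULES are hypothetical, and the example problem text is a placeholder rather than a comment recorded by a child.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional, Set


class BadScale(Enum):
    """The three 'Bad Scale' levels listed in Table 2."""
    BAD = "bad"
    VERY_BAD = "very bad"
    AWFUL = "awful"


# The five game rules from Table 1, keyed by rule number.
GAME_RULES = {
    1: "Sound and visual images support the game",
    2: "The player understands the messages in the game",
    3: "Navigations are simple and easy to use",
    4: "The game helps players to avoid making errors",
    5: "The game provides help to the player",
}


@dataclass
class CommentEntry:
    """One row of the comment form (hypothetical structure)."""
    problem: str
    rating: Optional[BadScale] = None                  # None = no rating (NR)
    game_rules: Set[int] = field(default_factory=set)  # empty = no mapping (NM)

    def __post_init__(self) -> None:
        unknown = self.game_rules - set(GAME_RULES)
        if unknown:
            raise ValueError(f"Unknown game rule(s): {sorted(unknown)}")

    @property
    def is_unmapped(self) -> bool:
        return not self.game_rules

    @property
    def is_double_mapped(self) -> bool:
        # More than one rule ticked for the same problem.
        return len(self.game_rules) > 1


# Placeholder entry: a problem rated "bad" and mapped to rules 4 and 5.
entry = CommentEntry("placeholder problem text", BadScale.BAD, {4, 5})
```

Keeping the rating and the rules on the same record mirrors the intent of the form, which was to avoid moving between separate sheets.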
Figure 3. Usage of new individual comment form by
participants during heuristic evaluation.
Adding Interactivity
In a previous study the children seemed to have a short
attention span while performing the heuristic evaluation and
needed regular prompts during the session [16]. Therefore in
this study most of the instructions were planned to be given
in the form of small milestones within the tasks.
For the individual evaluation
Tasks were written on the board and read out to help
children understand the activity. They could also refer to it
whenever they needed to.
After a task was read out, the children were encouraged to
indicate their progress to the facilitator after achieving these
milestones, giving a sense of focus in the evaluation, e.g.
“Raise your hand as soon as you find singing game” or
“Inform the facilitator as soon as you find the singing
game”.
For group discussion
The final part of the heuristic evaluation requires the
aggregation of the individual data into a single list of
problems.
To aid the evaluators the problems were written on a white
board by the facilitator whilst the children read out their
individual problems.
The children then had the opportunity to say whether they
had also found that problem.
The aim was to make the process of merging all their
problems into one comprehensive list more interactive.
Data collection
Data was collected from two sources, the heuristic
evaluations and from the observations by the researchers.
During the heuristic evaluation, the children were provided
with a newly designed individual comment sheet to record
the problems they found in the game. To assist the children
in completing the forms a simple example of a problem was
shown on a completed sheet. The researchers who were
watching the study used specially prepared forms to record
their observations; these were used both during the stage
when children were carrying out their individual
evaluations and later during the aggregation process.
Procedure
This study was a field-based evaluation using a computer
lab within a UK primary school. All laptops were placed at
a certain distance from each other so all participants had
enough space to perform their activities. This also
minimized the possibility of the children influencing each
other during the individual component of the evaluation.
Prior to the children commencing the evaluation an
experienced researcher informed them about the study and
about the extent of, and purpose of, their participation in the
subsequent activity. The children were briefed (for about 10
minutes) about the steps involved in performing a heuristic
evaluation:
Playing the game (one task at a time) for a few minutes
Try to find some problems that they believed could hinder them or their friends to finish the tasks
Write the problems down on the sheet
Indicate how bad they think the problem is using the ‘Bad Scale’ and which game rules it broke.
Finally the problems need to be merged into a single list.
Individually read out your problem and if anyone else has the same problem on their sheet raise their hand.
Agree what the ‘Bad Scale’ rating should be
Explanations were kept to a minimum in order to keep the
children engaged. The teacher was present during this
explanation and the children were given the opportunity to
ask any questions.
As each child came to the study, he or she was again
informed about the objective of the study. The roles of the
observers were explained and they were informed that they
could opt out of the proceedings at any time if they did not
want to continue with the evaluation.
The study was in two parts; the first part (taking around 15
– 20 minutes) involved the children performing an
individual evaluation of the game to identify possible
problems. This session was quite structured with two main
game related tasks to perform and these tasks were verbally
communicated and written on the white board. The
participants were first required to find a particular task and
raise their hand or inform one of the team members; once it was
found they were instructed to play the section for a few
minutes, e.g. Activity 1 - Find “singing game” in the
software and raise your hand once you have found it. Once it was
found the children were asked to play the game for a few
minutes by recording songs in their own voice. During this
process they were reminded to keep writing any problems
they encountered on the forms provided. The facilitator
constantly provided assistance with any problems related to
using the comment forms, the heuristic, and the severity
ratings, making sure participants were only guided to
perform the reporting tasks (i.e. the practicalities) rather
than hinting at problems for them. During this time the
observers recorded any issues the children encountered in
completing the forms or understanding what was required
of them.
In the second part of the study, each group of child
participants were asked to share their findings with each
other and merge all the problems into one aggregated list.
One by one, each child read out their list of found problems
to the group whilst the other children sought to find similar
problems in their own lists. Once a similar problem was
found, the children were asked to tick a box beside that
specific problem in their own comment forms to eliminate
duplicate problems. At the same time the facilitator (rather
than the children, who would have taken a long time) wrote
the problems onto a white board placed in front of the
participants. This activity continued until all the children
within the group had shared all their problems and resulted
in there being three sets of data (one set from each group).
Throughout the two parts of the study, the facilitator and
observers occasionally intervened in order to keep the
children focused on the evaluation. For example some of
the children were found playing the game and losing track
of time. Some children had to be given a little more time to
write their problems down after resolving their confusion
with the facilitator.
Analysis
Two sets of data were derived from this study, the first
being the usability problems the child evaluators
documented and the second the observational data relating
to problems the evaluators encountered.
The observational data was analyzed in two
stages to confirm anticipated issues related to children’s
ability to perform an evaluation. The first step included a
“Data reduction” process that aimed to remove duplicate
comments made by the observers and just capture the
frequency. The next stage involved the data being coded
into themes using an open card sort which was carried out
by the authors of this paper. Figure 4 shows a graphical
representation of the data merging process. For the usability
problems, the data from each group was analyzed to
identify issues such as incomplete forms and inaccurate
classifications to game rules. By reading through each
comment, the problems identified by the children and the
mappings of those problems to heuristics and severity
ratings were analyzed by the research team to examine any
critical game play problems and to identify the
effectiveness of the children in following the procedure.
Figure 4. Data merging process, 1) Merging observer problems
within group or individual participant comments and 2)
Merging problems between groups for both data sources.
The three aggregated lists of usability problems were
merged by the researchers into a single list of problems.
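The data reduction and merging steps described above can be sketched roughly as follows. This is a minimal illustration, not the procedure used in the study: the researchers matched comments by judgement, whereas the sketch assumes that two comments are duplicates only when their text is identical after lower-casing and collapsing whitespace. The function names and the sample strings are hypothetical.

```python
from collections import Counter


def normalise(text):
    """Lower-case and collapse whitespace so trivially different copies match."""
    return " ".join(text.lower().split())


def reduce_comments(comments):
    """Data reduction: collapse duplicate comments, keeping their frequency."""
    return Counter(normalise(c) for c in comments)


def merge_problem_lists(group_lists):
    """Merge the per-group lists into one list, keeping first-seen order."""
    merged, seen = [], set()
    for problems in group_lists:
        for problem in problems:
            key = normalise(problem)
            if key not in seen:
                seen.add(key)
                merged.append(problem)
    return merged


# Hypothetical observer comments, for illustration only.
frequencies = reduce_comments(
    ["Needs a prompt", "needs a  prompt", "Unsure which box to tick"]
)
single_list = merge_problem_lists([["a", "b"], ["B", "c"]])
```

Exact-text matching will miss paraphrased duplicates, which is one reason a manual pass such as the open card sort is still needed.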
RESULTS
The aim of this research was to determine whether children can conduct a
heuristic evaluation using the new procedure to evaluate a
product.
All the children were able to follow the instructions and to
some degree successfully conduct a heuristic evaluation.
Overall 18 usability problems were found, with an average
of 1.50 problems per participant across all the groups (see
Table 3).
Group   Unique problems by participant   Merged     Total problems   Mean problems
        A    B    C    D                  problems   (by group)       (by group)
1       1    0    0    1                  1          3                0.75
2       1    1    1    2                  1          6                1.50
3       4    1    1    1                  2          9                2.25
All                                                  18               1.50
Table 3. Number of problems found during heuristic
evaluation across all the groups.
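The means reported in Table 3 follow directly from the group totals and the group size of four. The short sketch below simply reproduces that arithmetic; the variable names are illustrative.

```python
# Total problems per group, as reported in Table 3.
total_by_group = {1: 3, 2: 6, 3: 9}
PARTICIPANTS_PER_GROUP = 4

# Mean problems per participant within each group.
mean_by_group = {
    group: total / PARTICIPANTS_PER_GROUP
    for group, total in total_by_group.items()
}

# Overall total and mean across all 12 participants.
overall_total = sum(total_by_group.values())
overall_mean = overall_total / (PARTICIPANTS_PER_GROUP * len(total_by_group))

print(mean_by_group)                # {1: 0.75, 2: 1.5, 3: 2.25}
print(overall_total, overall_mean)  # 18 1.5
```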
The adult observers did not document any problems related
to the child applying game rules (heuristics) and ‘Bad
Scale’ (severity rating). It appeared that the majority of
child evaluators were able to understand the new simplified
version of game rules with some assistance (clarification
from facilitator) and mapped at least one of their problems
to each of the game rules.
Even though the children seemed to understand the game rules, about
50% of the unique and 22% of the merged (common) problems
were left unmapped. Further analysis of the mapped
problems, confirmed with academics, indicated that all of the
problems that were mapped had been correctly mapped to the
game rules.
Game Rules (NM = No Mapping, * = Double Mapping)
Unique Problems
GR1 GR2 GR3 GR4 GR5 NM Total
2 3 2 1 * 6 14
Table 4. Tagging of game rules (heuristics) to unique problems
by children during evaluation.
There were two instances of “double mapping” where
participants tagged a problem they had found to two game rules.
For example, an evaluator mapped one problem to Game
Rules 4 and 5, see Table 4. The final mapping was decided
by the participants during the group discussion stage. Tables
4 and 5 show the mapping of unique and merged (common)
problems to the game rules respectively (GR indicates game
rule). The numbers and letters (indicating the child) marked
in red in the tables indicate the double mapping
of the problems. For example, two children (C and D)
from Group 1 found the same problem and mapped it to one
game rule (GR2), see Table 5.
Game Rules (NM = No Mapping, * = Double Mapping)
Merged Problems
Groups GR1 GR2 GR3 NM Total
1 (C+D) 1
2 (A+B) 1
3 (A) (D) 1
* (A+C) (D) 1
Table 5. Tagging of game rules (heuristics) to merged problems
by children during evaluation.
Similarly, the participants had no issues relating the
analogue ‘Bad Scale’ to severity. All 18 problems
were mapped to the ‘Bad Scale’. Despite mapping all the
problems, the child participants rated 10 of the 14 unique
problems as “bad” and the other four problems were
recorded as “very bad”, see Table 6. The total number of problems
found (18) and the number of ‘Bad Scale’ ratings will not match, as
the individual rating given by each participant is shown in the merged
problems section, i.e. one merged (common) problem
found in Group 3 by three child evaluators (A+C+D) is
displayed individually (A = bad, C = very bad and D =
bad). The four merged problems were rated nine times in total
by the children, see Table 6.
                          Problems   Bad Scale (NR = No rating)
Groups                    Found      Bad   Very Bad   Awful   NR
Unique Problems
1                         2          1     1          0       0
2                         5          5     0          0       0
3                         7          4     3          0       0
Total - Unique            14         10    4          0       0
Merged Problems
Total - All Merged        4          8     1          0       0
Total - Unique + Merged   18         18    5          0       0
Table 6. Usage of ‘Bad Scale’ by child evaluators during
heuristic evaluation.
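The totals in Table 6 can be checked with a simple tally, which also shows why the number of ratings exceeds the number of problems: each merged problem carries one rating per child who reported it. The sketch below uses only the counts given in Table 6; the variable names are illustrative.

```python
from collections import Counter

# 'Bad Scale' ratings of unique problems per group (Table 6).
unique_ratings = {
    1: Counter(bad=1, very_bad=1),
    2: Counter(bad=5),
    3: Counter(bad=4, very_bad=3),
}
# Individual ratings given to the four merged problems (Table 6).
merged_ratings = Counter(bad=8, very_bad=1)

unique_total = sum(unique_ratings.values(), Counter())
all_ratings = unique_total + merged_ratings

problems_found = 14 + 4                       # unique + merged problems
ratings_recorded = sum(all_ratings.values())  # one rating per child per problem

print(unique_total["bad"], unique_total["very_bad"])  # 10 4
print(all_ratings["bad"], all_ratings["very_bad"])    # 18 5
print(problems_found, ratings_recorded)               # 18 23
```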
No problems were rated as “Awful” by any of the
participants. This could indicate some problem in the usage of the
analogue scale, as there was confusion amongst some children
while using the newly designed comment form and they
needed some assistance from the facilitator to guide them.
Some of the participants were noticed to be writing all the
problems within one column of the comment forms and
interpreting the number beside the problem column as the
task number.
The other research aim was to determine whether children can
inspect interfaces.
Some genuine usability problems were reported by the
participants during the evaluation. Examples of these
problems are reported below, presented in the children’s
own vocabulary:
Drag and drop functionality was viewed by participants as a
problem for children in reception class, with
comments such as “Dragging items
might be a bit of a problem for the little once” and
“They wouldn’t know how hard to click or drag, they
might press hard because they think it’s not working”.
The participants also reported that the game needs to be
more interactive and provide help to the user of the
game e.g., “They (reception class children) may not
know what a track is so there should be an arrow
pointing at it”, “They won’t know that you can move
the track to put more sounds in”, “They won’t know
that when you play the sounds that you can add more
sounds and take away the sounds” and “If you do
something wrong it does not tell you”.
The words used in the game were considered difficult
for younger children to understand e.g., “The word
composition would be difficult for them to understand”
and “Different words then children know”.
Some participants felt the images or buttons in the game
were not self-explanatory and needed supportive text to
avoid confusion for children, e.g., “Write under the
buttons” and “Reception children would not know what
all the buttons are for”.
Similar to the comments above, others indicated that
some images do not represent the sounds attached to
them by saying “Cannot find a piece of music” and
“The noises don’t represent the items of music at
times”.
During the recording session in the singing game, one
participant noted that “It would be good to have writing
underneath, so they know what to sing”.
To aid future work it was also necessary to understand:
What was difficult?
The adult observers reported some major concerns in both
parts of the sessions. A total of 23 problems were observed
and recorded in the first part of the evaluation (individual
session). In the second part (group aggregation) 19
problems were recorded. These identified problems were
then categorized into different themes and scored with the
frequency of their occurrence during the evaluation e.g., 12
participants x 2 observers = 24 frequencies per problem, see
Table 7.
Most of the problems arose from the children having
difficulty in identifying problems; this was observed 10
times, making it the most detected category. It was noted
that participants found problems but did not record them on
their forms. For example, one participant faced a problem
and overcame it but did not record it. Both observers found
that all groups needed prompts to keep the children focused on the
evaluation. On some occasions participants were not sure
what to look for in the game, and a participant from Group 2
did not find any problem as they thought it could be easily
operated. Another participant was identified focusing on
mechanical problems rather than interface problems.
Indicators                              Total Problems
Understanding Tasks                     4
Understanding Game Rules (Heuristics)   3
Bad Scale (Severity Ratings)            2
Identification of Problems              10
Interpret Findings                      0
Understanding Forms                     2
Others                                  2
Table 7. Frequency of problems found during individual
evaluation within and across the groups.
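As a quick consistency check, the theme frequencies in Table 7 sum to the 23 problems observed in the individual session, and the most frequent theme can be read off directly. The sketch below uses only the counts from Table 7; the variable names are illustrative.

```python
# Theme frequencies for the individual session, as reported in Table 7.
individual_session = {
    "Understanding Tasks": 4,
    "Understanding Game Rules (Heuristics)": 3,
    "Bad Scale (Severity Ratings)": 2,
    "Identification of Problems": 10,
    "Interpret Findings": 0,
    "Understanding Forms": 2,
    "Others": 2,
}

total_observed = sum(individual_session.values())
most_frequent = max(individual_session, key=individual_session.get)
share = individual_session[most_frequent] / total_observed

print(total_observed)                  # 23
print(most_frequent)                   # Identification of Problems
print(f"{share:.0%} of observations")  # 43% of observations
```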
Despite the facilitator writing the task on the board and then
verbally communicating it to the participants, they had
problems understanding the tasks (4 times), for example
“find the singing game”. It was observed that some of the
participants were not concentrating on the given tasks and
were instead exploring the application. It was also noted that participants
in some groups found it very difficult to identify which
section signified the “singing game”.
It was observed that participants failed to map game rules to
all the recorded issues; despite the initial training, they
needed some assistance understanding the rules (3 times).
Even though participants understood the ‘Bad Scale’ as
severity, the observers noted that children were unsure
while mapping the severity through the ‘Bad Scale’ (2 times).
Two problems were noted by observers while participants
used the newly designed comment forms. The children seemed to
write multiple problems in one single box and found it
difficult to map game rules and the ‘Bad Scale’ to them. The
theme “others” included problems such as participants writing
down the same problem that friends had said out loud. One of the
participants in the first group had played the game before
and was familiar with the given tasks.
Table 8 below identifies the problems the children
encountered during the data aggregation phase.
Indicators Total Problems
Communication Problems 7
Identification Problems 0
Final Ratings 5
Others 7
Table 8. Frequency of problems found during group
evaluation within and across the groups.
During the group discussion phase children were also seen
to be having some problems. In the “others” theme, the
adult observers had registered problems related to
proceedings of the group discussion session. The facilitator
failed to write the exact same words read by the participants
on the board during the problem merging process e.g.,
Group 3 – Participant A read a problem as “It would be
good to have writing underneath, so they know what to
sing” and the facilitator wrote it on the board as “Need text
underneath”. This could have led to confusion between the
participants trying to identify similar problems in their
comment sheets.
The child evaluators were observed to be confused while
reading out problems from their comment sheets, for
example not reading the problems, ratings and selected
game rules to the group in the correct order. During the
debriefing session it was pointed out that participants were not
able to differentiate between the ‘Bad Scale’ and game rule
sections on the participant comment form and needed
some assistance during the process. The observers
reported that most of the problems were rated “bad” while
confirming the final ratings and supposed that the ratings could be based
on opinion rather than on a metric scale.
DISCUSSION AND FURTHER RESEARCH
The study reconfirmed the ability of children to critically
evaluate a product and identify genuine problems [16, 17].
The new procedure involved more interaction between the
team and participants resulting in a more focused
evaluation. It gave a sense of direction to the product
evaluation by working on the specific task given to the
participants. There were instances when participants were
more engaged in exploring the game and not concentrating
on the given tasks. Timely prompts were required to keep
the children focused on the evaluation e.g. prompting
participants to play the specific activity for a few minutes or
recording problems before starting the next tasks. However,
the majority of children were able to understand the
procedures and accomplish the given tasks. The children
were also able to identify problems and interpret them
comfortably. This was confirmed by observers not
recording any issues under the “interpret findings” theme.
Some issues related to mapping of newly modified game
rules to the found problems were observed across the
groups. While observers reported three problems, the
participant data indicated about 50 percent of unique and
22.22 percent of merged problems were found unmapped. It
confirmed that children to some extent were able to
understand the game rules (heuristics). It showed that reducing
the number of game rules and simplifying them by
removing complicated words from the traditional heuristics
had a good impact on participants and their understanding of
the problems. However given the fact that 50% of problems
were unmapped the game rules might not have offered
sufficient coverage of the problems the children identified.
A new form of visual analogue scale ‘Bad Scale’ was tested
in this evaluation to gather opinion of the children about
their found problems. The participant data revealed that
children were comfortable rating the severity of the
problem using a new analogue scale of ratings with visual
pointers (faces) along with respective labels, which directly
indicated the feeling of the found problems (bad, very bad
and awful). However, the effectiveness of the scale was not
very convincing as most of the problems were consistently
rated as “bad” and “very bad”. No problems were rated as
“Awful” by any of the participants. This might have been a
result of the design of the software and the fact that there
were no real major problems inherent within the game and
the children only identified minor issues.
The new comment forms for the participant were easily
completed by most of the children to capture their
problems. The steps of referring to heuristics and severity
ratings were reduced by incorporating both activities into
one participant comment form. The participants were able
to differentiate between the sections within the participant
comment forms: writing down the problems, rating
the problem by ticking one of the ‘Bad Scale’ indicators
and selecting the closest game rule, all in one form.
However, there was one issue of misinterpretation that was
observed. The participants interpreted each number on the
form as a section in which to write all the problems related to one
task. They were able to quickly recover from their mistake
and managed to record the issues in the individual columns
after a number of prompts. It was noticed that this problem
could easily be avoided by removing the number beside
the “problems found” columns.
During the data aggregation stage, the observers noticed
that participants became confused while reading out the
problems. A prominent partition between the two sections
of the participant comment form is therefore recommended.
The overall feasibility of using the simplified heuristic
evaluation with children was encouraging. At the
same time, a few limitations were identified
throughout the process of this research. The children were
able to find genuine problems and report their findings to
the group. However, owing to their lack of experience in the
area of usability, the children may not recognise the impact
of the problems on the game.
As mentioned earlier, some of the
participants were not concentrating on the given tasks and
needed prompts to stay within the context of the evaluation.
This confirms that the process requires a high level of
assistance and needs an experienced facilitator to coordinate
the evaluation. The participants in some groups found it very
difficult to identify the tasks. There is a need to collaborate
with teachers and parents when framing the questions or
tasks for children.
Falsification testing could not be performed; therefore,
the problem set could not be compared with problem data
from user studies. A user study would be needed to compare the
results and investigate the effectiveness of the method. To
further validate the data, a focus group or interviews in
pairs [18] with children could be conducted. In addition,
conducting a similar heuristic evaluation with adult
participants could strengthen the data by allowing
evaluation results to be compared between children and adults.
Further studies are needed to evaluate the effectiveness of
the new modified version of heuristic evaluation, and more
improvements are needed to refine each component of the
evaluation. The components (the simplified heuristic set,
the ‘Bad Scale’, the comment form, the tasks and the instructions) are
equally important to performing the heuristic
evaluation successfully. These components can be tested and
refined individually; for example, a study could
investigate how children understand words. The number
of points within the ‘Bad Scale’ needs to be investigated to
determine whether a three-point scale is sufficient within
this context. The observer comment form appeared
effective at capturing issues in the evaluation and could
be used in future research.
ACKNOWLEDGMENTS
I am heartily thankful to my supervisors, Gavin Sim and
Janet Read, whose encouragement and guidance at every step
enabled me to develop and complete my research.
I am also very thankful to everyone from the Department of
Computing, Engineering and Physical Sciences at the
University of Central Lancashire, and especially all the staff
from the ChiCI group, for their co-operation and help, which
made it possible to carry out the study.
REFERENCES
1. Read, J.C. Validating the Fun Toolkit: an instrument for
measuring children's opinion of technology. Cognition,
Technology and Work, 10, 2 (2008), 119-128.
2. Baauw, E., and Markopoulos, P. A comparison of think-
aloud and post-task interview for usability testing with
children. In IDC 2004, ACM (2004), 115-116.
3. Zaman, B., and Abeele, V.V. Laddering with Young
Children in User Experience Evaluations: Theoretical
Groundings and a Practical Case. In IDC 2010, ACM
(2010), 156-165.
4. MacFarlane, S., and Pasiali, A. Adapting the heuristic
evaluation method for use with children. In Workshop
on child computer interaction: methodological research
2005, Interact (2005).
5. Nielsen, J. Finding usability problems through heuristic
evaluation. In Proceedings of the SIGCHI conference on
Human factors in computing systems 1992, ACM
(1992), 373-380.
6. Law, E.L.-C., and Hvannberg, E.T. Consolidating
Usability Problems with Novice Evaluators.
In Proceedings of the 5th Nordic conference on Human-
computer interaction: building bridges 2008, ACM
(2008), 495-498.
7. Nielsen, J. Enhancing the Explanatory Power of
Usability Heuristics. In Proceedings of the SIGCHI
conference on Human factors in computing systems:
celebrating interdependence 1994, ACM (1994), 152-
158.
8. Iversen, O.S., and Brodersen, C. Building a BRIDGE
between children and users: a socio-cultural approach to
child–computer interaction. Cognition, Technology &
Work, 10, 2 (2008), 83-93.
9. Read, J.C., et al. Child computer interaction. In CHI '08
Extended Abstracts on Human Factors in Computing
Systems 2008, ACM (2008), 2419-2422.
10. Druin, A., et al. Children as our technology design
partners. Morgan Kaufmann Publishers Inc, San
Francisco, CA, USA, 1998.
11. Scaife, M., and Rogers, Y. Kids as informants: Telling
us what we didn’t know or confirming what we knew
already. Morgan Kaufmann Publishers Inc, San
Francisco, CA, USA, 1999.
12. Usability of websites for children: 70 design guidelines.
http://www.nngroup.com/reports/kids/.
13. Desurvire, H., Caplan, M., and Toth, J.A. Using
heuristics to Evaluate the Playability of Games. In CHI
2004, ACM (2004), 1509-1512.
14. Korhonen, H., and Koivisto, E.M.I. Playability
heuristics for mobile multi-player games. In
Proceedings of the 2nd international conference on
Digital interactive media in entertainment and arts.
2007, ACM (2007), 28-35.
15. Pinelle, D., Wong, J., and Stach, T. Heuristic Evaluation
for Games: Usability Principles for Video Game Design.
In Proceedings of the SIGCHI Conference on Human
Factors in Computing Systems 2008, ACM (2008),
1453-1462.
16. Salian, K., Sim, G., and Read, J.C. Can children perform
a heuristic evaluation? In Proceedings of the 11th Asia
Pacific Conference on Computer Human Interaction
2013, ACM (2013), 137-141.
17. Read, J.C., and MacFarlane, S. Using the Fun Toolkit
and Other Survey Methods to Gather Opinions in Child
Computer Interaction. In Proceedings of the 2006
conference on Interaction design and children 2006,
ACM (2006), 81-88.
18. Children's Websites: Usability Issues in Designing for
Kids. http://www.nngroup.com/articles/childrens-
websites-usability-issues/.
19. Woolrych, A., and Cockton, G. Testing a Conjecture
based on the DR-AR Model of UIM Effectiveness. In
Proceedings of HCI 2002, 30-33.
20. Cockton, G., and Woolrych, A. Understanding
inspection methods: lessons from an assessment of
heuristic evaluation. Springer-Verlag, London, 2001.
21. Nielsen, J. Usability inspection methods. In CHI 1994,
ACM Press (1994), 413-414.
22. Ling, C., and Salvendy, G. Effect of evaluators'
cognitive style on heuristic evaluation: Field dependent
and field independent evaluators. Int. J. Hum.-Comput.
Stud. 67, 4 (2009), 382-393.