Post on 27-Jul-2015
Assessing the User Experience (UX) of Online Museum Collections: Perspectives from Design and Museum Professionals
Craig M. MacDonald, Ph.D. Pratt Institute, School of Information and Library Science
Paper | Museums and the Web 2015 | April 9, 2015
The Online Collection
A common feature of museum websites is the online collection. Idea: allow experts access to museum holdings without needing to be physically present.
Substantial time and effort has been invested in developing these online collections. But, collections are routinely among the least visited sections of the website.
Two Possible Explanations
1. Most people are completely uninterested in viewing museum objects through a computer screen.
2. People want to view digital museum objects but are deterred from doing so due to the poor experiences offered by existing online collection interfaces.
Beyond Usability
Museums understand the importance of a usable website. If a visitor can’t find information about visiting the museum, they probably won’t.
But, usability alone is no longer sufficient. Museums cannot simply provide access to their digital materials; they must also create positive experiences for their users.
UX of Online Museum Collections
Overarching Research Question
How can the experience of using online museum collections be improved?
Related Questions:
1. What factors determine the UX of an online museum collection?
2. How can these factors be used to evaluate the UX of existing online museum collections?
Two Challenges
1. Evaluating interfaces is time-consuming and resource-intensive. Even lightweight usability testing methods can be challenging.
2. UX is a complex concept that is difficult to evaluate well. The relevant UX factors of a mobile banking app are likely not the same as those of an online museum collection.
What’s needed:
An evaluation method that is easy to use, adaptable, and quick.
Assessment Rubrics
Defined as: “criteria for assessing complicated things.” Common in educational settings because they articulate gradations of quality for meaningful dimensions or criteria.

              Scale Level 1   Scale Level 2   Scale Level 3
Dimension 1   description     description     description
Dimension 2   description     description     description
Dimension 3   description     description     description
Benefits of Using Rubrics
Efficiency: Streamline assessment by reducing need to explain why specific scores were given.
Transparency: Clearly define “quality” in objective and observable ways.
Reflectiveness: Don’t directly prescribe specific fixes; instead, reflect on why/how improvements can be made.
Ease of Use: Simple as completing a form, and completed rubric is effective tool for communicating results.
Rubric Creation Process
1. Identify purpose/goals
2. Choose rubric type
3. Identify the dimensions
4. Choose a rating scale
5. Write descriptions for each rating point
What is this rubric for?
This is the most important step, as it will drive all subsequent decisions. Goal: To assess the UX quality of an online museum collection.
Step 1
What type of rubric?
Holistic Rubrics: Look at a product or performance as a whole; contain just one dimension (e.g., “overall quality”).
Analytic Rubrics: Split a product or performance into its component parts; allow for feedback on multiple dimensions.
Step 2
What dimensions matter?
Requires breaking down the product being evaluated into components that are: Observable, Important, Precise.
No prescribed way to do this; just needs to be a process that can be explained and justified.
Step 3
Finding a starting point
Began with a literature search to see if any UX criteria for online museums had already been established.
Starting point: Lin, Fernandez, and Gregor (2012) identified four design characteristics and five design principles associated with user enjoyment.
Characteristics: Novelty, Harmonization, No time constraint, Appropriate facilitation and association
Principles: Multisensory learning experiences, Creating a storyline, Mood building, Fun in learning, Establishing social connection
Step 3
Testing Lin et al.’s model
With a graduate assistant, reviewed 39 online museum collections with respect to these 9 dimensions. This allowed for a bottom-up approach.
Ensured that dimensions were reflective of what the museum community considers valuable.
Step 3
Finding Exemplars
The Rijksmuseum quickly emerged as an exemplar. But, discussing how it excelled uncovered limitations to Lin et al.’s framework. Many dimensions were actually describing multiple concepts, making them difficult to assess independently.
Step 3
Refining the dimensions
In response, we developed a parallel set of dimensions that were more observable and explicit, and that more closely matched our interpretation of Lin et al.’s framework.
This allowed us to: Improve the vocabulary to make it more accessible; Tighten the concepts to make them more distinguishable; and Evaluate the ability of each dimension to capture an important aspect of UX.
Step 3
Iterative testing
We iteratively tested the rubric with various museum collections to further refine and strengthen the dimensions. Goal was to make them less ambiguous and more observable.
• Ex: Harmonization and Mood building became Strength of Visual Content and Visual Aesthetics.
Finally, split the dimensions into 3 categories inspired by Don Norman’s model of Emotional Design: Visceral, Behavioral, Reflective.
Step 3
Choosing a rating scale
Typical rubrics use rating scales of between 2 and 5 points. Four rating scale points were chosen, with neutral, non-judgmental labels: Incomplete, Beginning, Developing, Emerged.
Step 4
Gradations of quality
Final step: writing clear and well-defined gradations of quality for each rubric dimension. A 4-point rating scale should describe quality ratings as: “No”; “No, but”; “Yes, but”; “Yes”.
Step 5
Final Assessment Rubric
Visceral (immediate impact)
1. Strength of visual content
2. Visual aesthetics
Behavioral (immediate usage)
3. System reliability & performance
4. Usefulness of metadata
5. Interface usability
6. Support for casual & expert users
Reflective (long-term usage)
7. Uniqueness of virtual experience
8. Openness
9. Integration of social features
10. Personalization of experiences
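The three categories and ten dimensions listed above lend themselves to a simple data structure. The following is a minimal sketch, not part of the study: the names `SCALE`, `RUBRIC`, and `snapshot`, the input format, and the example scores are all assumptions made for illustration.

```python
from typing import Dict, List

# Rating scale labels as given in the rubric (1 = lowest, 4 = highest).
SCALE: Dict[int, str] = {1: "Incomplete", 2: "Beginning", 3: "Developing", 4: "Emerged"}

# Categories and dimensions as listed in the final rubric.
RUBRIC: Dict[str, List[str]] = {
    "Visceral": ["Strength of visual content", "Visual aesthetics"],
    "Behavioral": [
        "System reliability & performance",
        "Usefulness of metadata",
        "Interface usability",
        "Support for casual & expert users",
    ],
    "Reflective": [
        "Uniqueness of virtual experience",
        "Openness",
        "Integration of social features",
        "Personalization of experiences",
    ],
}


def snapshot(ratings: Dict[str, int]) -> Dict[str, Dict[str, str]]:
    """Group one evaluator's scores by category and swap scores for scale labels.

    Hypothetical helper: the study does not prescribe any implementation.
    """
    expected = {dim for dims in RUBRIC.values() for dim in dims}
    if set(ratings) != expected:
        raise ValueError("ratings must cover exactly the ten rubric dimensions")
    result: Dict[str, Dict[str, str]] = {}
    for category, dims in RUBRIC.items():
        result[category] = {}
        for dim in dims:
            score = ratings[dim]
            if score not in SCALE:
                raise ValueError(f"{dim}: score must be 1-4, got {score}")
            result[category][dim] = SCALE[score]
    return result


# Made-up scores for illustration only; they do not come from the study.
example = {dim: 3 for dims in RUBRIC.values() for dim in dims}
example["Strength of visual content"] = 4
report = snapshot(example)
```

Keeping the scores as labelled ratings per dimension, rather than collapsing them into a single number, matches the analytic (multi-dimension) rubric type chosen in Step 2.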
Ex: Strength of Visual Content
Rating scale: 1 Incomplete, 2 Beginning, 3 Developing, 4 Emerged

Incomplete [No]: Artwork is a peripheral component of the collection, with text the dominant visual element. Images, when present, are too small and low quality. Text is a major distraction from the visual content.
Beginning [No, but]: Artwork is not emphasized throughout the collection, and images are rarely the dominant visual element. Some images are too small and/or low quality. At times, text is too dense and distracts from the visual content.
Developing [Yes, but]: Artwork is featured throughout the collection, but images are not always the dominant visual element. Most images are large and high quality. Text is used purposefully, but some is superfluous.
Emerged [Yes]: Artwork is presented as the primary focus of the collection, with images as the dominant visual element. All images are large and high quality. Text is used purposefully but sparingly to enhance the visual content.
Next Step: Rubric Quality
Four experts – two museum professionals and two UX professionals – were asked to apply the rubric to three online museum collections.
• Sessions took ~90 minutes to complete (approx. 20 minutes per museum)
• Held one-on-one (3 face-to-face, 1 remote)
• Completed in August/September 2014
Three aspects of rubric quality: Reliability, Validity, Utility
What is rubric reliability?
The extent to which using the rubric provides consistent ratings of quality. i.e.: do different raters provide the same (or similar) ratings when applying the rubric to the same interface?
This is known as inter-rater reliability.
Common measure: consensus agreement
UX Rubric Reliability
Participants rated three museum collections on ten different dimensions. 30 potential opportunities for agreement.
Two estimates of agreement:
Conservative: all raters provide the same rating • Target: Approximately 30% or higher
Liberal: all raters are within one rating point • Target: Approximately 80% or higher
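The two agreement estimates above can be computed as sketched below. This is an illustration under stated assumptions, not the study's analysis code: the function name `consensus_agreement` is hypothetical, "within one rating point" is read as the highest and lowest score for an item differing by at most 1, and the sample ratings are made up.

```python
from typing import Sequence, Tuple


def consensus_agreement(ratings: Sequence[Sequence[int]]) -> Tuple[float, float]:
    """Return (conservative, liberal) agreement as fractions of rated items.

    ratings[i] holds every rater's score for rating opportunity i
    (one museum collection on one rubric dimension).
    """
    if not ratings:
        raise ValueError("need at least one rated item")
    # Conservative: every rater gave exactly the same score.
    exact = sum(1 for scores in ratings if max(scores) == min(scores))
    # Liberal (assumed reading): all scores fall within one rating point.
    adjacent = sum(1 for scores in ratings if max(scores) - min(scores) <= 1)
    return exact / len(ratings), adjacent / len(ratings)


# Made-up scores from four raters on five items (not data from the study).
sample = [
    (3, 3, 3, 3),  # exact agreement
    (2, 3, 3, 2),  # within one point
    (1, 3, 2, 4),  # disagreement
    (4, 4, 4, 4),  # exact agreement
    (2, 2, 3, 1),  # spread of two points: disagreement
]
conservative, liberal = consensus_agreement(sample)
```

In the study, each list of scores would correspond to one of the 30 opportunities for agreement (three collections by ten dimensions).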
Reliability: Results [1]
Participant Type   Conservative      Liberal
All (4)            4 / 30 (13.3%)    19 / 30 (63.3%)
Reliability: Results [2]
Participant Type   Conservative      Liberal
All (4)            4 / 30 (13.3%)    19 / 30 (63.3%)
Museum (2)         14 / 30 (46.7%)   28 / 30 (93.3%)
UX (2)             9 / 30 (30.0%)    24 / 30 (80.0%)
Reliability: Discussion
Using the rubric was better than blind guessing, but there is room for improvement, especially when combining UX and Museum experts.
Conclusion: Don’t mix evaluators; they should all share a disciplinary background and professional focus.
What is rubric validity?
The extent to which using the rubric provides accurate measures of quality. Many types of validity; for rubrics, two common types: 1) Content Validity 2) Construct Validity
UX Rubric Content Validity
Content validity refers to the extent to which the rubric measures things that actually matter. i.e., do the dimensions of the rubric make sense?
Ideally, content validity is demonstrated by soliciting feedback from subject matter experts during rubric creation. In this case, study participants were asked to rate the perceived relevance of each rubric dimension.
Content Validity: Discussion
None of the experts proposed any other concepts or elements that should have been included.
Conclusion: Rubric has content validity, but Reflective dimensions may need more refinement. Are social features or personalization options really the best way to engage online visitors? Can the challenges of providing an “open” collection be mitigated?
• These are open research questions.
UX Rubric Construct Validity
Construct validity refers to whether the rubric actually measures the construct it is supposed to measure. i.e., is the UX rubric actually assessing UX?
Ideally, construct validity is demonstrated by showing a correlation between rubric scores and another accepted measure of quality. But, there is no accepted measure of UX quality. • Instead, study participants were asked to provide perceived levels of construct validity.
Construct Validity: Discussion
All participants felt the rubric was an effective measure of UX. But, museum-centric language was a perceived barrier for the UX experts.
Conclusion: Rubric has construct validity, but language could be more accessible to non-museum experts.
What is rubric utility?
The actual impact of using the rubric as an assessment instrument. i.e., does using the rubric make a difference?
Arguably the most complex and most important quality of a rubric. But, measuring actual impact is nearly impossible (too many confounding factors).
UX Rubric Utility
Instead, focus on perceived impact. Evaluators need to think the rubric is valuable, otherwise they’ll be unlikely to use it.
Need to demonstrate the extent to which evaluators believe the rubric is: Useful? Easy to use? Easy to learn?
Utility: Discussion [1]
All participants affirmed the utility of the rubric as an assessment instrument. Biggest benefit is to aid decision-making:
UX expert: the rubric seems like a great tool to “help museums figure out their digital budget.”
How? By providing a snapshot of the assessment results.
Summary
Study results show that the rubric is a reliable, valid, and useful assessment instrument.
Future work:
• Clarify museum-specific language.
• Examine the reflective dimensions more closely.
• Study the practicality of the rubric through an applied case study with a museum partner.
Conclusion: Rubric can provide valuable guidance for museums interested in improving their users’ experience with online collections.