Assessing the User Experience (UX) of Online Museum Collections: Perspectives from Design and Museum Professionals

Craig M. MacDonald, Ph.D. Pratt Institute, School of Information and Library Science

Paper | Museums and the Web 2015 | April 9, 2015

The Online Collection

A common feature of museum websites is the online collection. Idea: allow experts access to museum holdings without needing to be physically present.

Substantial time and effort have been invested in developing these online collections. But collections are routinely among the least visited sections of the website.

Two Possible Explanations

1. Most people are completely uninterested in viewing museum objects through a computer screen.

2. People want to view digital museum objects but are deterred from doing so due to the poor experiences offered by existing online collection interfaces.

Beyond Usability

Museums understand the importance of a usable website. If a visitor can’t find information about visiting the museum, they probably won’t.

But, usability alone is no longer sufficient. Museums cannot simply provide access to their digital materials; they must also create positive experiences for their users.

UX of Online Museum Collections

Overarching Research Question

How can the experience of using online museum collections be improved?

Related Questions:

1.  What factors determine the UX of an online museum collection?

2.  How can these factors be used to evaluate the UX of existing online museum collections?

Two Challenges

1.  Evaluating interfaces is time-consuming and resource intensive. Even lightweight usability testing methods can be challenging.

2.  UX is a complex concept that is difficult to evaluate well. The relevant UX factors of a mobile banking app are likely not the same as those of an online museum collection.

What’s needed:

An evaluation method that is easy to use, adaptable, and quick.

Assessment Rubrics

Defined as: “criteria for assessing complicated things.” Common in educational settings because they articulate gradations of quality for meaningful dimensions or criteria.

               Scale Level 1   Scale Level 2   Scale Level 3
Dimension 1    description     description     description

Benefits of Using Rubrics

Efficiency: Streamline assessment by reducing the need to explain why specific scores were given.

Transparency: Clearly define “quality” in objective and observable ways.

Reflectiveness: Don’t directly prescribe specific fixes; instead, reflect on why/how improvements can be made.

Ease of Use: As simple as completing a form, and the completed rubric is an effective tool for communicating results.

Rubric Creation Process

1.  Identify purpose/goals
2.  Choose rubric type
3.  Identify the dimensions
4.  Choose a rating scale
5.  Write descriptions for each rating point

What is this rubric for?

This is the most important step, as it will drive all subsequent decisions. Goal: To assess the UX quality of an online museum collection.

Step 1

What type of rubric?

Holistic Rubrics: Look at a product or performance as a whole; contain just one dimension (e.g., “overall quality”).

Analytic Rubrics: Split a product or performance into its component parts; allow for feedback on multiple dimensions.

Step 2

What dimensions matter?

Requires breaking down the product being evaluated into components that are observable, important, and precise.

No prescribed way to do this; just needs to be a process that can be explained and justified.

Step 3

Finding a starting point

Began with a literature search to see if any UX criteria for online museums had already been established.

Starting point: Lin, Fernandez, and Gregor (2012) identified four design characteristics and five design principles associated with user enjoyment.

Characteristics: Novelty; Harmonization; No time constraint; Appropriate facilitation and association

Principles: Multisensory learning experiences; Creating a storyline; Mood building; Fun in learning; Establishing social connection

Step 3

Testing Lin et al.’s model

With a graduate assistant, reviewed 39 online museum collections with respect to these 9 dimensions. This allowed for a bottom-up approach.

Ensured that dimensions were reflective of what the museum community considers valuable.

Step 3

Finding Exemplars

The Rijksmuseum quickly emerged as an exemplar. But discussing how it excelled uncovered limitations to Lin et al.’s framework. Many dimensions were actually describing multiple concepts, making them difficult to assess independently.

Step 3

Refining the dimensions

In response, we developed a parallel set of dimensions that were more observable and explicit, and that more closely matched our interpretation of Lin et al.’s framework.

This allowed us to:
•  Improve the vocabulary to make it more accessible;
•  Tighten the concepts to make them more distinguishable; and
•  Evaluate the ability of each dimension to capture an important aspect of UX.

Step 3

Iterative testing

We iteratively tested the rubric with various museum collections to further refine and strengthen the dimensions. Goal was to make them less ambiguous and more observable.
•  Ex: Harmonization and Mood building became Strength of Visual Content and Visual Aesthetics.

Finally, split the dimensions into three categories inspired by Don Norman’s model of Emotional Design: Visceral, Behavioral, Reflective.

Step 3

Choosing a rating scale

Typical rubrics use rating scales with between two and five points. A four-point scale was chosen, with neutral, non-judgmental labels: Incomplete, Beginning, Developing, Emerged.

Step 4

Gradations of quality

Final step: writing clear and well-defined gradations of quality for each rubric dimension. A 4-point rating scale should describe quality ratings as: No; No, but; Yes, but; Yes.

Step 5

Final Assessment Rubric

Visceral (immediate impact)
1.  Strength of visual content
2.  Visual aesthetics

Behavioral (immediate usage)
3.  System reliability & performance
4.  Usefulness of metadata
5.  Interface usability
6.  Support for casual & expert users

Reflective (long-term usage)
7.  Uniqueness of virtual experience
8.  Openness
9.  Integration of social features
10. Personalization of experiences

Rating scale: 1 Incomplete, 2 Beginning, 3 Developing, 4 Emerged
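For teams that want to record completed rubrics electronically, the ten dimensions and four scale points listed above could be captured in a small data structure. The following is a minimal Python sketch and not part of the original study; the names `SCALE`, `RUBRIC`, and `check_completed_rubric` are hypothetical.

```python
# Hypothetical encoding of the rubric; all names here are illustrative only.
SCALE = {1: "Incomplete", 2: "Beginning", 3: "Developing", 4: "Emerged"}

RUBRIC = {
    "Visceral": [
        "Strength of visual content",
        "Visual aesthetics",
    ],
    "Behavioral": [
        "System reliability & performance",
        "Usefulness of metadata",
        "Interface usability",
        "Support for casual & expert users",
    ],
    "Reflective": [
        "Uniqueness of virtual experience",
        "Openness",
        "Integration of social features",
        "Personalization of experiences",
    ],
}


def check_completed_rubric(ratings):
    """Return a list of problems found in a completed rubric.

    `ratings` maps each dimension name to an integer scale point (1-4).
    An empty list means every dimension has a valid rating.
    """
    expected = [dim for dims in RUBRIC.values() for dim in dims]
    problems = []
    for dim in expected:
        if dim not in ratings:
            problems.append(f"missing rating: {dim}")
        elif ratings[dim] not in SCALE:
            problems.append(f"invalid rating for {dim}: {ratings[dim]!r}")
    for dim in ratings:
        if dim not in expected:
            problems.append(f"unknown dimension: {dim}")
    return problems
```

Checking a form before it is shared keeps incomplete assessments from being compared against complete ones.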

Ex: Strength of Visual Content

Incomplete [No]: Artwork is a peripheral component of the collection, with text the dominant visual element. Images, when present, are too small and low quality. Text is a major distraction from the visual content.

Beginning [No, but]: Artwork is not emphasized throughout the collection, and images are rarely the dominant visual element. Some images are too small and/or low quality. At times, text is too dense and distracts from the visual content.

Developing [Yes, but]: Artwork is featured throughout the collection, but images are not always the dominant visual element. Most images are large and high quality. Text is used purposefully, but some is superfluous.

Emerged [Yes]: Artwork is presented as the primary focus of the collection, with images as the dominant visual element. All images are large and high quality. Text is used purposefully but sparingly to enhance the visual content.

Next Step: Rubric Quality

Four experts (two museum professionals and two UX professionals) were asked to apply the rubric to three online museum collections.
•  Sessions took ~90 minutes to complete (approx. 20 minutes per museum).
•  Held one-on-one (3 face-to-face, 1 remote).
•  Completed in August/September 2014.

Three aspects of rubric quality:

Reliability Validity Utility

What is rubric reliability?

The extent to which using the rubric provides consistent ratings of quality. i.e., do different raters provide the same (or similar) ratings when applying the rubric to the same interface?

This is known as inter-rater reliability.

Common measure: consensus agreement

UX Rubric Reliability

Participants rated three museum collections on ten different dimensions. 30 potential opportunities for agreement.

Two estimates of agreement:

Conservative: all raters provide the same rating
•  Target: Approximately 30% or higher

Liberal: all raters are within one rating point
•  Target: Approximately 80% or higher
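The two estimates above can be computed directly from a table of ratings. The following is a minimal Python sketch under the definitions given on this slide; the function name `consensus_agreement` and the sample ratings are hypothetical and are not the study's data.

```python
def consensus_agreement(items):
    """Compute conservative and liberal consensus agreement.

    `items` is a list with one entry per rated item (a collection/dimension
    pair); each entry is a list holding one integer rating per rater.
    Returns (conservative, liberal) as proportions between 0 and 1.
    """
    if not items:
        raise ValueError("at least one rated item is required")
    # Conservative: every rater gave exactly the same rating.
    exact = sum(1 for ratings in items if max(ratings) == min(ratings))
    # Liberal: all ratings lie within one scale point of each other.
    within_one = sum(1 for ratings in items if max(ratings) - min(ratings) <= 1)
    return exact / len(items), within_one / len(items)


# Made-up ratings from four raters on four items, for illustration only.
sample = [
    [3, 3, 3, 3],  # exact agreement
    [2, 3, 3, 2],  # within one point
    [1, 3, 2, 2],  # spread of two points: no agreement
    [4, 4, 4, 4],  # exact agreement
]
conservative, liberal = consensus_agreement(sample)
```

Every item that counts toward the conservative estimate also counts toward the liberal one, so the liberal figure can never be lower.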

Reliability: Results

Participant Type   Conservative      Liberal
All (4)            4 / 30 (13.3%)    19 / 30 (63.3%)
Museum (2)         14 / 30 (46.7%)   28 / 30 (93.3%)
UX (2)             9 / 30 (30.0%)    24 / 30 (80.0%)

Reliability: Discussion

Using the rubric was better than blind guessing, but there is room for improvement, especially when combining UX and Museum experts.

Conclusion: Don’t mix evaluators - they should all share a disciplinary background and professional focus.
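The slides do not say how "blind guessing" was defined. One way to put a number on it, assuming every rater picks one of the four scale points independently and uniformly at random (an assumption made here, not a claim from the study), is to enumerate all possible rating combinations. The function name `chance_agreement` is hypothetical.

```python
from itertools import product


def chance_agreement(n_raters, n_points=4):
    """Agreement expected if every rater chose a scale point uniformly at random.

    Returns (conservative, liberal): the probability that all raters give the
    same rating, and the probability that all ratings fall within one point.
    """
    exact = 0
    within_one = 0
    total = 0
    # Enumerate every possible combination of ratings across the raters.
    for combo in product(range(1, n_points + 1), repeat=n_raters):
        total += 1
        spread = max(combo) - min(combo)
        exact += spread == 0
        within_one += spread <= 1
    return exact / total, within_one / total
```

Under this assumption, four raters would agree exactly on about 1.6% of ratings and fall within one point on about 18%, compared with the 13.3% and 63.3% observed for all four participants. For a pair of raters the chance values are 25% and 62.5%.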

What is rubric validity?

The extent to which using the rubric provides accurate measures of quality. Many types of validity; for rubrics, two common types: 1) Content Validity 2) Construct Validity

UX Rubric Content Validity

Content validity refers to the extent to which the rubric measures things that actually matter. i.e., do the dimensions of the rubric make sense?

Ideally, content validity is demonstrated by soliciting feedback from subject matter experts during rubric creation. In this case, study participants were asked to rate the perceived relevance of each rubric dimension.

Content Validity: Results

Content Validity: Discussion

None of the experts proposed any other concepts or elements that should have been included.

Conclusion: Rubric has content validity, but Reflective dimensions may need more refinement.

Are social features or personalization options really the best way to engage online visitors? Can the challenges of providing an “open” collection be mitigated?
•  These are open research questions.

UX Rubric Construct Validity

Construct validity refers to whether the rubric actually measures the construct it is supposed to measure. i.e., is the UX rubric actually assessing UX?

Ideally, construct validity is demonstrated by showing a correlation between rubric scores and another accepted measure of quality. But, there is no accepted measure of UX quality.
•  Instead, study participants were asked to provide perceived levels of construct validity.

Construct Validity: Results

Construct Validity: Discussion

All participants felt the rubric was an effective measure of UX. But, museum-centric language was a perceived barrier for the UX experts.

Conclusion: Rubric has construct validity, but language could be more accessible to non-museum experts.

What is rubric utility?

The actual impact of using the rubric as an assessment instrument. i.e., does using the rubric make a difference?

Arguably the most complex and most important quality of a rubric. But, measuring actual impact is nearly impossible (too many confounding factors).

UX Rubric Utility

Instead, focus on perceived impact. Evaluators need to think the rubric is valuable, otherwise they’ll be unlikely to use it.

Need to demonstrate the extent to which evaluators believe the rubric is useful, easy to use, and easy to learn.

Utility: Results

Utility: Discussion [1]

All participants affirmed the utility of the rubric as an assessment instrument. Biggest benefit is to aid decision-making:

UX expert: the rubric seems like a great tool to “help museums figure out their digital budget.”

How? By providing a snapshot of the assessment results.
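The snapshot itself is not reproduced in this text. As one possible rendering, a completed rubric could be printed as a short text summary grouped by category. The sketch below is hypothetical: the function name `snapshot`, the bar format, and the example ratings are illustrative assumptions, not the presentation's actual output.

```python
# Scale labels as given in the rubric; everything else here is illustrative.
SCALE = {1: "Incomplete", 2: "Beginning", 3: "Developing", 4: "Emerged"}


def snapshot(ratings):
    """Return text lines summarizing a completed rubric.

    `ratings` maps each category name to a dict of {dimension: scale point}.
    Each dimension gets a four-character bar showing its rating.
    """
    lines = []
    for category, dimensions in ratings.items():
        lines.append(category)
        for name, score in dimensions.items():
            bar = "#" * score + "." * (len(SCALE) - score)
            lines.append(f"  [{bar}] {SCALE[score]:<10} {name}")
    return lines


# Made-up ratings for a fictional collection, for illustration only.
example = {
    "Visceral": {"Strength of visual content": 4, "Visual aesthetics": 3},
    "Reflective": {"Openness": 2, "Integration of social features": 1},
}
print("\n".join(snapshot(example)))
```

Listing each dimension separately, rather than averaging, keeps the ordinal scale labels visible and makes the weakest dimensions easy to spot when deciding where to invest.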

Utility: Discussion [2]

Summary

Study results show that the rubric is a reliable, valid, and useful assessment instrument.

Future work:
•  Clarify museum-specific language.
•  Examine the reflective dimensions more closely.
•  Study the practicality of the rubric through an applied case study with a museum partner.

Conclusion: Rubric can provide valuable guidance for museums interested in improving their users’ experience with online collections.

Thank you

Craig M. MacDonald, Ph.D. cmacdona@pratt.edu @CraigMMacDonald www.craigmacdonald.com
