
1

Flexible Search and Navigation using Faceted Metadata

Prof. Marti Hearst

Dr. Rashmi Sinha, Ame Elliott, Jennifer English, Kirsten Swearingen, Ping Yee

February, 2002
University of California, Berkeley

http://bailando.sims.berkeley.edu/flamenco.html

Research funded by NSF CAREER Grant, NSF9984741

2

Outline

1. Motivation
2. Approach
   Integrate Search into Information Architecture via Faceted Metadata
3. Definitions:
   Information Architecture
   Faceted Metadata
4. Recipe Interface and Usability Study
5. Image Interfaces and Usability Studies
6. Conclusions

3

Motivation and Background

4

Claims

• Web Search is OK
– Gets people to the right starting points
• Web SITE search is NOT ok
• The best way to improve site search is
– NOT to make new fancy algorithms
– Instead … improve the interface

5

The Philosophy

• Information architecture should be designed to integrate search throughout

• Search results should reflect the information architecture.

• This supports an interplay between navigation and search

• This supports the most common human search strategies.

6

An Important Search Strategy

• Do a simple, general search
– Gets results in the generally correct area
• Look around in the local space of those results
• If that space looks wrong, start over
– Akin to Shneiderman’s overview + details
• Our approach supports this strategy
– Integrate navigation with search

7

Following Hyperlinks

• Works great when it is clear where to go next

• Frustrating when the desired directions are undetectable or unavailable

8

An Analogy

text search
hypertext

9

Main Idea

• Use metadata to show where to go next
– More flexible than canned hyperlinks
– Less complex than full search
– Help users see and return to what happened previously

Search Usability Design Goals

1. Strive for Consistency
2. Provide Shortcuts
3. Offer Informative Feedback
4. Design for Closure
5. Provide Simple Error Handling
6. Permit Easy Reversal of Actions
7. Support User Control
8. Reduce Short-term Memory Load

From Shneiderman, Byrd, & Croft, Clarifying Search, DLIB Magazine, Jan 1997. www.dlib.org

11

Information Architecture

12

A Taxonomy of Web Sites

[Figure: chart with two axes, Complexity of Applications (low to high) and Complexity of Data (low to high). Site types shown: Web-Presence Sites, Catalog Sites, Service-Oriented Sites, Web-based Information Systems.]

From: The (Short) Araneus Guide to Website development, by Mecca, et al, Proceedings of WebDB’99, http://www-rocq.inria.fr/~cluet/WEBDB/procwebdb99.html

13

An Important IA Trend

• Generating web pages from databases
• Implications:
– Web sites can adapt to user actions
– Web sites can be instrumented

14

Faceted Metadata

15

Metadata: data about data
Facets: orthogonal categories

Time/Date    Topic    GeoRegion
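As a minimal sketch of this definition, assume a hypothetical three-item collection in which every item carries one value per facet. Because the facets are orthogonal, each one can be constrained independently of the others. The item data and the `filter_items` helper are illustrative only and are not taken from the Flamenco system.

```python
# Hypothetical items; each facet (time, topic, region) is assigned independently.
ITEMS = [
    {"id": "a", "time": "1990s", "topic": "Architecture", "region": "Europe"},
    {"id": "b", "time": "1990s", "topic": "Medicine", "region": "Asia"},
    {"id": "c", "time": "1980s", "topic": "Architecture", "region": "Asia"},
]


def filter_items(items, **constraints):
    """Return the items whose facet values match every given constraint."""
    return [
        item for item in items
        if all(item.get(facet) == value for facet, value in constraints.items())
    ]


# Facets are orthogonal: constraints can be applied alone or in any combination.
by_topic = filter_items(ITEMS, topic="Architecture")
by_topic_and_region = filter_items(ITEMS, topic="Architecture", region="Asia")
print([item["id"] for item in by_topic])             # ['a', 'c']
print([item["id"] for item in by_topic_and_region])  # ['c']
```

Storing one flat value per facet keeps the sketch short; hierarchical facet values would need a path per facet instead.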

16

Faceted Metadata: Biomedical MeSH (Medical Subject Headings)
www.nlm.nih.org/mesh

17

MeSH Facets (one level expanded)

18

Questions we are trying to answer

• How many facets are allowable?
• Should facets be mixed and matched?
• How much is too much?
• Should hierarchies be progressively revealed, tabbed, some combination?
• How should free-text search be integrated?

19

How NOT to do it

• Yahoo uses faceted metadata poorly in both their search results and in their top-level directory

• They combine region + other hierarchical facets in awkward ways

20

Yahoo’s use of facets

21

Yahoo’s use of facets

22

Yahoo’s use of facets

23

Yahoo’s use of facets

Where is Berkeley?

College and University > Colleges and Universities > United States > U > University of California > Campuses > Berkeley

U.S. States > California > Cities > Berkeley > Education > College and University > Public > UC Berkeley

24

Problem with Metadata Previews as Currently Used

– Hand edited, predefined
– Not tailored to task as it develops
– Not personalized
– Often not systematically integrated with search, or within the information architecture in general

25

Recipe Collection Examples

26

From soar.berkeley.edu (a poor example)

27

28

From www.epicurious.com (a good example)

29

30

31

32

Epicurious Metadata Usage

• Advantages
– Creates combinations of metadata on the fly
– Different metadata choices show the same information in different ways
– Previews show how many recipes will result
– Easy to back up
– Supports several task types
  • “Help me find a summer pasta” (ingredient type + event type)
  • “How can I use an avocado in a salad?” (ingredient type + dish type)
  • “How can I bake sea-bass” (preparation type + ingredient type)
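The preview counts described above can be sketched roughly as follows. This is an assumption-laden illustration, not how Epicurious works internally: the recipes, the facet names (`ingredient`, `dish`, `prepare`), and the `refine` and `previews` helpers are all hypothetical.

```python
from collections import Counter

# Hypothetical recipes; facet names loosely follow the slide.
RECIPES = [
    {"name": "Avocado salad", "ingredient": "avocado", "dish": "salad", "prepare": "raw"},
    {"name": "Baked sea bass", "ingredient": "fish", "dish": "main", "prepare": "bake"},
    {"name": "Grilled tuna salad", "ingredient": "fish", "dish": "salad", "prepare": "grill"},
    {"name": "Fish pie", "ingredient": "fish", "dish": "main", "prepare": "bake"},
]
FACETS = ("ingredient", "dish", "prepare")


def refine(recipes, selections):
    """Keep the recipes that match every selected facet value."""
    return [r for r in recipes if all(r[f] == v for f, v in selections.items())]


def previews(recipes, selections):
    """For each unselected facet, count the matching recipes under each value."""
    matching = refine(recipes, selections)
    return {
        facet: Counter(r[facet] for r in matching)
        for facet in FACETS
        if facet not in selections
    }


counts = previews(RECIPES, {"ingredient": "fish"})
print(dict(counts["dish"]))     # {'main': 2, 'salad': 1}
print(dict(counts["prepare"]))  # {'bake': 2, 'grill': 1}
```

Because the counts are computed only over recipes that already match, a value with zero matching recipes never appears as a choice, which is one way to avoid empty result sets.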

33

Metadata usage in Epicurious

[Diagram labels: Prepare, Cuisine, Ingredient, Dish; Recipe]

34

Metadata usage in Epicurious

[Diagram labels: Prepare, Cuisine, Ingredient, Dish; Select; Prepare, Cuisine, Dish, I]

35

Metadata usage in Epicurious

[Diagram labels: Prepare, Cuisine, Ingredient, Dish; I >; Group by; Prepare, Cuisine, Dish]

36

Metadata usage in Epicurious

[Diagram labels: Prepare, Cuisine, Ingredient, Dish; Prepare, Cuisine, Dish, I >; Group by]

37

Metadata usage in Epicurious

[Diagram labels: Prepare, Cuisine, Ingredient, Dish; Prepare, Cuisine, Dish, I >; Group by; Select; Prepare, Cuisine, I]

40

Epicurious Basic Search

Lacks integration with metadata

41

42

Usability Study: epicurious

43

Epicurious Usability Study

• 9 participants
• Three interfaces
– Simple search form
– Enhanced search form
– Browse
• Two task types
– known-item search
– browsing for inspiration

46

Epicurious Usability Study: Preference Data

                              Site   Basic   Enhanced   Browse
Total "Very Likely" to Use:      7       2          4        7
Total "Likely" to Use:           0       1          1        0
Total "Not Likely" to Use:       2       6          4        2

47

Epicurious Usability Study
Interface Preference

Subject       Favorite
Subject_JG    Enhanced
Subject_NS    Enhanced
Subject_SP    Browse
Subject_RM    Browse
Subject_LA    Enhanced
Subject_MC    Browse
Subject_MW    Browse
Subject_NM    Enhanced
Subject_CG    Browse

Why?
• Query previews and navigation. Options to refine by course or season. Choose how you view the results
• Searching within made all the difference. I could see how many results I was getting in each
• Very specific. I can choose more than 1 detail with search for recipe I'm looking for.
• Likes the way it narrows things down. And it gives you the numbers.
• Found it simpler, more readable. Helped you hone in on the season.
• Liked the kid friendly, low fat option
• Can narrow down when you're stuck. You can always refine [your search].
• Allowed me to make specific selections. I liked Browse too. Gave lots to choose from. Depends on what you’re looking for that day
• Can limit and unlimit and limit again in a different way. Prioritize your criteria--change the first thing I clicked and go in a different direction. Easy to back up.

48

Epicurious Usability Study
Feature Preference

49

Epicurious Usability Study
Constraint-based Preferences

Constraint             # of Results: High   # of Results: Low
1 result needed        Enhanced (LA)        Browse (LA)
                       Enhanced (MC)        Browse (MC)
                       Browse (MW)          Browse (MW)
                       Enhanced (NM)        Enhanced (NM)
                       Basic (CG)           Browse (CG)
Many results needed    Enhanced (LA)        Browse (LA)
                       Enhanced (MC)        Browse (MC)
                       Enhanced (MW)        Browse (MW)
                       Enhanced (NM)        Enhanced (NM)
                       Enhanced (CG)        Browse (CG)

50

Usability Study Results: Summary

• People liked the browsing-style metadata-based search and found it helpful

• People sometimes preferred the metadata search when the task was more constrained
– But zero results are frustrating
– This can be alleviated with query previews

• People dis-prefer the standard simple search

51

Missing From Epicurious

• How to scale?
– Hierarchical facets
– Larger collection
• How to integrate search?
• How to allow expansion in addition to refinement?

52

Application to Image Search

53

Current Approaches to Image Search

• Visual Content and Cues, e.g.,
  • QBIC (Flickner et al. ’95)
  • Blobworld (Carson et al. ’99)
  • Body Plans (Forsyth & Fleck ’00)
– Color, texture, shape
– Move through a similarity space
• Keyword based
– Piction (Srihari ’91)
– WebSeek (Smith and Jain ’97)
– Google image search

54

A Commonality Among Current Content-based Approaches:

Emphasis on similarity
Little work on analyzing the search needs

55

The Users

• Architects and City Planners

56

The Collection

• ~40,000 images from the UCB architecture slide library

• The current database and interface is called SPIRO

• Very rich, faceted, hierarchical metadata

57

Architects’ Image Use

• Common activities:
– Use images for inspiration
  • Browsing during early stages of design
– Collage making, sketching, pinning up on walls
– This is different than illustrating powerpoint
• Maintain sketchbooks & shoeboxes of images
– Young professionals have ~500, older ~5k
• No formal organization scheme
– None of 10 architects interviewed about their image collections used indexes
• Do not like to use computers to find images

• Do not like to use computers to find images

58

Development Timeline

• Needs assessment.
– Interviewed architects and conducted contextual inquiries.
• Lo-fi prototyping.
– Showed paper prototype to 3 professional architects.
• Design / Study Round 1.
– Simple interactive version. Users liked metadata idea.
• Design / Study Round 2.
– Developed 4 different detailed versions; evaluated with 11 architects; results somewhat positive but many problems identified. Matrix emerged as a good idea.
• Metadata revision.
– Compressed and simplified the metadata hierarchies
• Design / Study Round 3.
– New version based on results of Round 2
– Highly positive user response

59

The Interface

• Nine hierarchical facets
– Matrix
– SingleTree
• Chess metaphor
– Opening
– Middlegame
– Endgame
• Tightly Integrated Search
• Expand as well as Refine
• Intermediate pages for large categories
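One way to picture "expand as well as refine" over hierarchical facets is to treat a query as one path per facet: refining descends a level, expanding moves back up. The sketch below makes that assumption explicit. The facet names (`location`, `building`), their values, and the three helpers are hypothetical and do not come from the SPIRO metadata or the Flamenco code.

```python
# A query holds, per facet, a path into that facet's hierarchy (hypothetical values).
def refine(query, facet, child):
    """Narrow the query by descending one level within a facet."""
    updated = dict(query)
    updated[facet] = query.get(facet, ()) + (child,)
    return updated


def expand(query, facet):
    """Broaden the query by moving up one level; drop the facet at the top."""
    updated = dict(query)
    path = query.get(facet, ())[:-1]
    if path:
        updated[facet] = path
    else:
        updated.pop(facet, None)
    return updated


def matches(item, query):
    """An item matches when each constrained path is a prefix of the item's path."""
    return all(
        item.get(facet, ())[: len(path)] == path for facet, path in query.items()
    )


item = {"location": ("Asia", "Lebanon"), "building": ("Religious", "Temple")}
q = refine(refine({}, "location", "Asia"), "location", "Lebanon")
q = refine(q, "building", "Religious")
print(matches(item, q))  # True
q = expand(q, "location")
print(q)  # {'location': ('Asia',), 'building': ('Religious',)}
```

Keeping each facet's path separate means one facet can be expanded without disturbing the constraints already set on the others.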

60

61

62

63

64

65

66

67

68

Usability Study on Round 3

• 19 participants
– Architecture/City Planning background
• Two versions of the interface
– Tree (one hierarchical facet at a time)
– Matrix (multiple hierarchical facets)
• Several tasks
• Subjective responses
– All highly positive
– Very strong desire to use the interface in future
– Will replace the current SPIRO interface

69

Study Tasks

1. High Constraint Search: Find images with metadata assigned from 3 facets (e.g., exterior views of temples in Lebanon)
   1.1) Start by using a Keyword Search
   1.2) Start by Browsing (clicking a hyperlink)
   1.3) Start by using method of choice
2. Low Constraint Search: Find a low-constraint set of images (metadata in one facet)
3. Specific Image Search: Given a photograph and no other info, find the same image in the collection
4. Browse for Images of Interest

70

Interface Evaluation

• Users rated Matrix more highly for:
– Usefulness for design work
– Seeing relationships between images
– Flexibility
– Power
• On all except “find this image” task, users also rated the Matrix higher for:
– Feeling “on track” during search
– Feeling confident about having found all relevant images

71

Overall Preferences: Matrix vs. Tree

Task                                                     Matrix   Tree
Simple search (e.g. images of deserts)                       13      5
Complex search (e.g. exteriors of temples in Lebanon)        14      4
Find images like this one                                    16      3
OVERALL PREFERENCE                                           16      3

72

User Comments - Matrix

• “Easier to pursue other queries from each individual page”

• “Powerful at limiting and expanding result sets. Easy to shift between searches.”

• “Keep better track of where I am located as well as possible places to go from there.”

• “Left margin menu made it easy to view other possible search queries, helped in trouble-shooting research problems.”

• “Interface was friendlier, easier, more helpful.”
• “I understood the hierarchical relationships better.”

73

User Comments – Tree

• Pro
– “Simple”
– “More typical of other search engines I’d use”
– “Visually simpler and more intuitive…Matrix a bit overwhelming with choices.”
• Con
– “I found SingleTree difficult to use when I had to refine my search on a search topic which I was not familiar with. I found myself guessing.”
– “SingleTree required more thought to use and to find specific images.”
– “I do not trust my typng and spelling skills. I like having categories.”

74

Task Completion Times

(Find Image is an artificial task: given a photo and no other info, find it in the collection.)

75

When Given A Choice …

For each interface, one task allowed the user to start with either a keyword search or the hyperlinks.

3 chose to search in both interfaces

11 chose to browse in both interfaces

4 chose to search in Matrix, browse in Tree

1 chose to browse in Matrix, search in Tree

76

Precision and Recall

Computed for tasks 1.1-1.3
Pooling used for determining relevant set
Precision based on what was visible on screen
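The measures named above can be sketched as follows, under stated assumptions: pooling is taken to mean collecting every item any participant retrieved for a task and keeping those judged relevant, and precision is computed only over the items visible on screen. The image ids, the judgments, and both helper functions are hypothetical; the study's actual procedure is not described in these slides.

```python
def pooled_relevant(result_sets, is_relevant):
    """Pool every item any participant retrieved, then keep those judged relevant."""
    pool = set().union(*result_sets)
    return {item for item in pool if is_relevant(item)}


def precision_recall(visible, relevant):
    """Precision over the items visible on screen; recall over the pooled relevant set."""
    hits = len(set(visible) & relevant)
    precision = hits / len(visible) if visible else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical image ids retrieved by three participants for one task.
runs = [{"img1", "img2", "img3"}, {"img2", "img4"}, {"img5"}]
judged_relevant = {"img1", "img2", "img4"}  # stand-in for human relevance judgments
relevant = pooled_relevant(runs, lambda item: item in judged_relevant)
print(precision_recall(["img1", "img2", "img3"], relevant))
```

With pooling, recall is relative to what at least one participant found, so items nobody retrieved cannot count against any interface.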

77

Feature Usage Percentages

(Dark bars show subtotals)

78

Feature Usage (%) Types of Actions

[Bar chart: Action Categories, Matrix vs. Tree, axis from 0% to 60%. Categories: Refine search (reduce # of results); Expand search (increase # of results); Arrange results; Start over/backup.]

79

Feature Usage (%) Refining

[Bar chart: Use of Features to Refine Search, Matrix vs. Tree, axis from 0% to 30%. Features: Drill above images; Drill in matrix; Drill from image detail; Drill from large category; Drill by clicking "All N items"; Search within; Disambiguate keyword search; "More" in disambiguation.]

80

Feature Usage – Expanding / Starting Over

[Bar chart: Use of Features to Expand Search / Start Over, Matrix vs. Tree, axis from 0% to 25%. Features: Expand search using breadcrumbs; Expand by clicking X; Expand from image detail; Go back to start mid-search; Search all, mid-task; Back.]

81

Interface Evaluation

• Users rated Matrix more highly for:
– Usefulness for design work
– Seeing relationships between images
– Flexibility
– Power
• On all except “find this image” task, users also rated the Matrix higher for:
– Feeling “on track” during search
– Feeling confident about having found all relevant images

82

Application to Medline

83

Summary and Conclusions

84

Summary

• A new approach to web site search
– Use hierarchical faceted metadata dynamically, integrated with search
• Many difficult design decisions
– Iterating and testing was key

85

Summary

• Two Usability Studies Completed
– Recipes: 13,000 items
– Architecture Images: 40,000 items
• Conclusions:
– Users like and are successful with the dynamic faceted hierarchical metadata, especially for browsing tasks
– Very positive results, in contrast with studies on earlier iterations
– Note: it seems you have to care about the contents of the collection to like the interface

86

Summary

• We have addressed several interface problems:
– How to seamlessly integrate metadata previews with search
  • Show search results in metadata context
  • “Disambiguate” search terms
– How to show hierarchical metadata from several facets
  • The “matrix” view
  • Show one level of depth in the “matrix” view
– How to handle large metadata categories
  • Use intermediate pages
– How to support expanding as well as refining
  • Still working on it to some extent
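A rough sketch of what "disambiguating" a search term could look like: the keyword is matched against category labels, and the hits are shown grouped by facet so the user can pick the intended sense. This is an illustration under assumptions, not the Flamenco implementation; the facet names, labels, and the `disambiguate` helper are hypothetical.

```python
# Hypothetical category labels per facet; not the actual SPIRO metadata.
FACET_LABELS = {
    "Materials": ["Stone", "Glass", "Steel"],
    "Building Types": ["Glass house", "Temple", "Stone bridge"],
    "Locations": ["Lebanon", "Stonehenge"],
}


def disambiguate(keyword, facet_labels):
    """Group the category labels that contain the keyword by their facet."""
    needle = keyword.lower()
    grouped = {}
    for facet, labels in facet_labels.items():
        hits = [label for label in labels if needle in label.lower()]
        if hits:
            grouped[facet] = hits
    return grouped


print(disambiguate("stone", FACET_LABELS))
# {'Materials': ['Stone'], 'Building Types': ['Stone bridge'], 'Locations': ['Stonehenge']}
```

Substring matching is used here only to keep the example small; a real system would likely need stemming or an index.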

87

Advantages of the Approach

• Supports different search types
– Highly constrained known-item searches
– Open-ended, browsing tasks
– Can easily switch from one mode to the other midstream
– Can both expand and refine

88

Advantages of the Approach

• Honors many of the most important usability design goals
– User control
– Provides context for results
– Reduces short term memory load
– Allows easy reversal of actions
– Provides consistent view

89

Advantages of the Approach

• Allows different people to add content without breaking things

• Can make use of standard technology

90

Some Unanswered Questions

• How to integrate with relevance feedback (more like this)?
– Would like to use blobworld-like features

• How to incorporate user preferences and past behavior?

• How to combine facets to reflect tasks?

91

Thank you!

For more information:
bailando.sims.berkeley.edu/flamenco.html