models and interaction mechanisms for exploratory interfaces

COMO CAMPUS

Models and interaction mechanisms for exploratory interfaces Luigi Spagnolo [email protected]

1 Information and Communication Quality

Index

¨  PREVIEW: Online experimentation!
¨  Part I: navigation, search and exploration
  ¤  Break
¨  Part II: Faceted search: the model(s) and the interaction
¨  Visualization issues will be covered in another lecture

PREVIEW: Online Experimentation

Intro

¨  This lecture starts in a rather unusual way :-)
¨  To introduce you to exploratory interfaces, you’ll take part in a research experiment
¨  But don’t worry!
  ¤  It’s not dangerous for your health :-)
  ¤  The questionnaire you’re asked to fill in is anonymous and the answers will not be graded

The application | 1

¨  The latest version of a prototype built for the Italian Ministry of Culture
¨  A map for exploring venues of archaeological interest in Italy
¨  According to three properties (facets):
  ¤  Kind of venue: museum, archaeological site or superintendence (a local branch of the Ministry of Culture devoted to archaeological heritage management).
  ¤  Location: the venue location, at the level of macro-area (Northern Italy, Central Italy, etc.), Italian region and Italian province.
  ¤  Civilization or Period: the ancient civilizations (Romans, Greeks, etc.) or periods (e.g. Middle Ages, Bronze Age) the venues are relevant to.


¨  The tag cloud:
  ¤  Tag size → the number of results that are relevant to the period or civilization in question.
  ¤  Text color → how much the percentage of results that are relevant to the period/civilization deviates from a uniform distribution.
    ▪  Shades of green show a stronger positive correlation between the other selected filters (e.g. the location and/or the venue type) and the civilization/period in question. Red instead shows a negative correlation (the civilization/period is less significant with respect to the other criteria selected).
  ¤  Background color → of the whole set of venues that are relevant to the period/civilization, what percentage is included in the results?
    ▪  Green shows a positive correlation, while red shows a negative correlation.
  ¤  E.g., for venues in a specific region only (e.g. Lombardy), a green tag indicates that the given civilization was particularly relevant for that region.
  ¤  A green background shows instead that the civilization is peculiar to that region, and is less likely to be found elsewhere.
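The three visual variables of the tag cloud can be sketched in a few lines of Python. The slides do not give the exact formulas used by the prototype, so this is only a guess at how such scores might be computed: the toy venues, the function names, and the choice of "share in the whole collection" as the baseline for the text color are all assumptions.

```python
from collections import Counter

# Hypothetical toy data: each venue has a region and the civilizations it is relevant to.
VENUES = [
    {"region": "Lombardy", "civilizations": {"Romans", "Celts"}},
    {"region": "Lombardy", "civilizations": {"Celts"}},
    {"region": "Sicily", "civilizations": {"Greeks", "Romans"}},
    {"region": "Sicily", "civilizations": {"Greeks"}},
    {"region": "Lazio", "civilizations": {"Romans"}},
    {"region": "Lazio", "civilizations": {"Romans", "Etruscans"}},
]

def tag_counts(venues):
    """Number of venues relevant to each civilization."""
    return Counter(c for v in venues for c in v["civilizations"])

def tag_cloud(all_venues, results):
    total, found = tag_counts(all_venues), tag_counts(results)
    cloud = {}
    for tag, n in found.items():
        share_in_results = n / len(results)
        share_overall = total[tag] / len(all_venues)  # assumed baseline
        cloud[tag] = {
            "size": n,                                        # tag size
            "text_score": share_in_results - share_overall,   # > 0 green, < 0 red
            "background_score": n / total[tag],               # share of the tag's venues in the results
        }
    return cloud

lombardy = [v for v in VENUES if v["region"] == "Lombardy"]
cloud = tag_cloud(VENUES, lombardy)
```

With this toy data, "Celts" gets a positive text score and a background score of 1.0 for Lombardy (all Celtic venues are there), while "Romans" gets a negative text score because it is less frequent in Lombardy than in the collection as a whole.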


¨  The map:
  ¤  At three levels: Italian region, Italian province, exact location(s)
  ¤  The color of the circle → the specific type of venue
  ¤  The size of the circle → the number of items of that type in that area

The experiment

¨  Go to http://tinyurl.com/exp-icq
  ¤  (or http://www.ellesseweb.com/mining/)
¨  You will find a page with two links:
  1.  The application
  2.  An online questionnaire (on Rational Survey)
  ¤  Keep both open in the browser
¨  Work individually (1 hour max)
¨  Answer with your own opinions, without looking at other websites, just at the ArchaeoItaly application
  ¤  Remember: the survey is anonymous, and there are no “correct answers”!
  ¤  If you have any doubts, ask me!

Part 1 | Navigation, search and exploration

Let’s start with a scenario

¨  Work in pairs
¨  Imagine you work as journalists for the Horse Illustrated magazine
¨  You have to write an essay about horses in art (and in particular in painting) across the centuries.
¨  Find interesting information on the website of the Louvre Museum
  ¤  http://www.louvre.fr/llv/commun/home.jsp?bmLocale=en

Problems with the Louvre

¨  Artworks are separated by department (an internal “bureaucratic” classification) and by provenance.
¨  It is not possible to search them together by subject (regardless of their age and country of origin).
¨  There is no introductory content on the subject that can guide the student in her search.

Content-intensive websites

¨  Also known as:
  ¤  Information-intensive
  ¤  Often infosuasive = informative + persuasive
  ¤  Like ancient rhetoric: inform and persuade
¨  Mainly intended for:
  ¤  Learning, understanding, discovering, comparing information
  ¤  Leisure and entertainment

Contents

¨  Text, multimedia (audio, video, images)
¨  Hypermedia = multimedia + hyperlinks
¨  Information involves subjective judgment
  ¤  Depends on the author and on the user
  ¤  Objective: “10 km from Como”, “the painting was made in 1886”
  ¤  Subjective: “Near Como”, “the painting is impressionist”

User experience requirements | 1

¤  From the users’ point of view:
  ▪  Usability: usage is effective, efficient and satisfactory
  ▪  Findability: users can locate what they are looking for
  ▪  “At a glance” understandability: users understand the website coverage and can make sense of information
  ▪  Enticing explorability: users are compelled to “stay and play” and discover interesting connections among topics

User experience requirements | 2

¤  From the stakeholders’ point of view:
  ▪  Planned serendipity: promoting the most important contents so that users can stumble upon them
    ▪  E.g. “Readers that purchased this book also bought…”
  ▪  Communication strength and branding: the website conveys the intended “message” and “brand” of the institution behind it
    ▪  E.g. “we have the lowest prices”, “we are very authoritative”, etc.

Information architecture

¤  Purpose: conceptually organizing information
¤  Providing access to contents
  ▪  Index navigation (a)
  ▪  Guided navigation (b)
¤  Providing the possibility of moving from a content item to related ones
  ▪  Contextual navigation (c): cross-reference links, semantic relationships

“Traditional” structure

¤  Taxonomy: hierarchy of categories and subcategories
  ▪  Sections and groups of contents are the branches of the tree
  ▪  Contents are the leaves
¤  Cross-reference links between nodes

An example

Sitemap: Art gallery website

¤  Artworks of the month
¤  Paintings
  ▪  Top 10 masterpieces
  ▪  By artist
  ▪  By artistic movement
  ▪  By subject
¤  Sculptures
  ▪  ...
  ▪  By material
¤  Photographs
  ▪  ...

Problems/1

¨  What if I want to browse all artworks (regardless of their type) by artist?
  ¤  Classifications are “nested” in a fixed order
  ¤  Designers must choose which classification prevails (e.g. by type)
¨  What if I want to find “impressionist paintings portraying animals”?
  ¤  I cannot combine multiple “sibling” classifications (e.g. by style and by subject)

Problems/2

¨  As long as the website is small, a good taxonomy can satisfy user requirements
¨  For large websites (hundreds or thousands of pages):
  ¤  Indexed/guided navigation doesn’t scale
  ¤  Users can’t easily find what they want
  ¤  Users can’t make sense of all that information

Solutions?

¨  What do users do when navigation doesn’t work?
  ¤  They use search!
  ¤  Search arranges contents dynamically and automatically (in a way not predefined by designers)
¨  But keyword-based search is not optimal
  ¤  No hints for users who have no clear idea of what to look for
  ¤  Users must know how the information is described (e.g. the specific jargon used)
  ¤  Suited only to retrieval/focused search
¨  We need a better paradigm: exploratory search

Exploratory search

¨  The model “query → results” is too simple
¨  Search is often like berry picking! (Bates 1989)
  ¤  Users explore a corpus of contents
  ¤  They refine the query (again and again) according to what they learn
  ¤  They pick information here and there, piece by piece

From search to exploration

¨  From finding to understanding (Marchionini)
  ¤  Acquire knowledge about a domain, its jargon, the properties of information items in it.
  ¤  Useful to (better) understand what to look for
  ¤  …but also to analyze a dataset

Goals of exploratory applications

¨  Object seeking
  ¤  Identify the best object(s) whose features match user requirements (e.g. purchasing a camera with concerns regarding price, resolution, etc.)
¨  Knowledge seeking
  ¤  Expand the knowledge about a given topic and related information (e.g. Leonardo da Vinci and the Italian Renaissance)
¨  Wisdom seeking
  ¤  Discover interesting relationships among features in an information space/dataset (e.g. analysis of sales in Esselunga chain stores, according to store location, type of article, price, etc.)
¨  These goals can coexist in the same application

Retrieval vs. exploration models

¨  Retrieval model: query + results
  ¤  The query can be either:
    ▪  Free form (e.g. keyword-based search)
    ▪  Structured (parametric search, e.g. Scholar advanced search)
    ▪  Guided (select data from a predefined set of choices)
¨  Exploration model:
  ¤  Query + results + refinements/feedback
  ¤  Query supported by self-adaptive structures for:
    ▪  Further filtering results to a subset of them
    ▪  Summarizing the features shared by results

Part 2 | Faceted search: model(s) and interaction

(Amazon’s Diamond search was one of the first e-commerce applications of faceted search)

Faceted search

¨  An exploratory search/navigation pattern based on progressive filtering of results
¨  The user selects a combination of metadata values belonging to several facets
¨  Each facet corresponds to a particular dimension that describes the content objects made available for search, e.g. for an artwork:
  ¤  Subject: people portrayed, flowers and plants, abstract...
  ¤  Medium: painting, sculpture, photography...
  ¤  Technique: oil, watercolors, digital art...
  ¤  Style: impressionism, expressionism, abstractism...
  ¤  Location: Prado, Louvre, Guggenheim

Let’s see a couple of examples

¨  Two examples:
  ¤  http://orange.sims.berkeley.edu/cgi-bin/flamenco.cgi/famuseum/Flamenco
  ¤  http://www.artistrising.com
¨  Try the same search we’ve seen before: find horses in art
¨  More examples at: http://www.flickr.com/photos/morville/collections/72157603789246885/

Not just a matter of finding… E.g. you can learn that horses in art are often found in paintings portraying soldiers or warriors and leaders

How the interaction works

¨  When the user chooses a filter, the application selects:
  ¤  The results: items that have been “tagged” with the filter and the other metadata previously chosen
  ¤  The remaining filters: metadata that, combined with the previous choices, can still produce results
¨  The user can continue narrowing the results as long as options are available
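The interaction loop described above can be sketched as follows. This is a minimal illustration rather than the implementation of any particular facet browser; the artworks, the facet names and the helper functions `select` and `remaining_filters` are made up for the example.

```python
from collections import Counter

# Hypothetical collection: every item has exactly one value per facet.
ARTWORKS = [
    {"title": "A", "medium": "painting", "style": "impressionism", "subject": "horses"},
    {"title": "B", "medium": "painting", "style": "impressionism", "subject": "landscape"},
    {"title": "C", "medium": "sculpture", "style": "classical", "subject": "horses"},
    {"title": "D", "medium": "painting", "style": "baroque", "subject": "horses"},
]
FACETS = ("medium", "style", "subject")

def select(items, chosen):
    """The results: items tagged with every filter chosen so far."""
    return [it for it in items if all(it[f] == v for f, v in chosen.items())]

def remaining_filters(results, chosen):
    """For each facet not yet used, the values that still lead to results, with counts."""
    return {
        facet: Counter(it[facet] for it in results)
        for facet in FACETS
        if facet not in chosen
    }

chosen = {"subject": "horses"}
results = select(ARTWORKS, chosen)
filters = remaining_filters(results, chosen)
```

After choosing subject "horses", the application would show three results and offer only the media and styles that actually occur among them, so that no further click can lead to an empty result set.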

A (generalized) formal model | 1

¨  Taxonomy: a pair $(T, \preceq)$
  ¤  A set of concepts or terms: $T = \{t_1, t_2, \ldots, t_n\}$
  ¤  The subsumption relation $\preceq$ connecting narrower terms (hyponyms) to broader concepts (hypernyms), e.g.
       $\text{laptop} \preceq \text{computer}$
       $\text{location}:\text{'Como'} \preceq \text{location}:\text{'Lombardy'} \preceq \text{location}:\text{'Italy'}$
  ¤  Terminal concepts: terms not further specialized (the “leaves”)
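As a rough illustration of the taxonomy definition, the subsumption relation can be stored as a map from each term to its immediate broader concept. The sketch assumes that every term has at most one parent (a tree), which the slides do not require; the names `BROADER`, `is_subsumed` and `terminal_concepts` are invented for the example.

```python
# Each narrower term points to its immediate broader concept (assumption: a single parent).
BROADER = {
    "location:Como": "location:Lombardy",
    "location:Lombardy": "location:Italy",
    "laptop": "computer",
}

def ancestors(term):
    """Yield every broader concept of a term, nearest first."""
    while term in BROADER:
        term = BROADER[term]
        yield term

def is_subsumed(narrower, broader):
    """True if `narrower` equals `broader` or is a (possibly indirect) hyponym of it."""
    return narrower == broader or broader in ancestors(narrower)

def terminal_concepts():
    """Terms that are not the broader concept of any other term (the leaves)."""
    terms = set(BROADER) | set(BROADER.values())
    return terms - set(BROADER.values())
```

For instance, `is_subsumed("location:Como", "location:Italy")` follows the chain Como, Lombardy, Italy, while `terminal_concepts()` returns the two leaves of this small taxonomy.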


¨  For faceted taxonomies, concepts are given in terms of property-value pairs (restrictions): $q = \textit{property} : \textit{value}$
  ¤  E.g. subject: “horse”, location: “Como”
¨  A query is any of:
  ¤  A restriction
  ¤  A conjunction, disjunction or negation of (sub)queries: $q_1 \text{ and } q_2$, $q_1 \text{ or } q_2$, $\text{not } q$
  ¤  In practice, current facet browser implementations limit the ways in which concepts can be combined


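¨  Item description: an information item $o \in O$ is described as a conjunction of restrictions
       $d(o) = \text{subject}:\text{"horse"} \text{ and } \text{style}:\text{"Impressionism"} \text{ and } \ldots$
¨  Extension of a query: the set of items in a context $O$ that match the query
       $\mathrm{ext}_O(q) = \{\, o \in O \mid d(o) \preceq q \,\}$
¨  Properties of extensions:
       $\mathrm{ext}(q_1 \text{ and } q_2) \subseteq \mathrm{ext}(q_1),\ \mathrm{ext}(q_2)$
       $\mathrm{ext}(q_1),\ \mathrm{ext}(q_2) \subseteq \mathrm{ext}(q_1 \text{ or } q_2)$
       $\mathrm{ext}(\text{not } q) \equiv \mathrm{ext}(\mathrm{ALL}) \setminus \mathrm{ext}(q)$
       $t_c \preceq t_p \Rightarrow \mathrm{ext}(t_c) \subseteq \mathrm{ext}(t_p)$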


¨  The result of a query is:
  ¤  Its extension in the given information space, $\mathrm{ext}_O(q)$
  ¤  The set of features shared by these results: i.e. all the concepts that can be derived from the descriptions of objects in $\mathrm{ext}_O(q)$
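The definitions of query, extension and result can be tried out with a small evaluator. It is a sketch under simplifying assumptions: restrictions match values exactly (the taxonomy is ignored), and the "features" of a result are returned as counts, so that a count equal to the size of the extension means the feature is shared by all results. The class and function names are not taken from the slides.

```python
from collections import Counter
from dataclasses import dataclass

@dataclass(frozen=True)
class Restriction:
    prop: str
    value: str

@dataclass(frozen=True)
class And:
    left: object
    right: object

@dataclass(frozen=True)
class Or:
    left: object
    right: object

@dataclass(frozen=True)
class Not:
    query: object

# d(o): each item is described by a set of (property, value) pairs (toy data).
ITEMS = {
    "o1": {("subject", "horse"), ("style", "Impressionism")},
    "o2": {("subject", "horse"), ("style", "Baroque")},
    "o3": {("subject", "flowers"), ("style", "Impressionism")},
}

def ext(q, items=ITEMS):
    """Extension of q: the identifiers of the items matching the query."""
    if isinstance(q, Restriction):
        return {o for o, d in items.items() if (q.prop, q.value) in d}
    if isinstance(q, And):
        return ext(q.left, items) & ext(q.right, items)
    if isinstance(q, Or):
        return ext(q.left, items) | ext(q.right, items)
    if isinstance(q, Not):
        return set(items) - ext(q.query, items)  # ext(ALL) \ ext(q)
    raise TypeError(f"not a query: {q!r}")

def features(q, items=ITEMS):
    """How many results carry each (property, value) pair."""
    return Counter(f for o in ext(q, items) for f in items[o])
```

With `horse = Restriction("subject", "horse")` and `imp = Restriction("style", "Impressionism")`, `ext(And(horse, imp))` is contained in `ext(horse)`, which in turn is contained in `ext(Or(horse, imp))`, as stated by the inclusion properties.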

Query transformations

¨  Operations that allow navigating from one state of the exploration to another
  ¤  Appending new restrictions to the query in conjunction (zoom-in: from a wider to a narrower set of results)
  ¤  Adding alternatives in disjunction to the existing ones (zoom-out: from a narrower to a wider set)
  ¤  Removing existing constraints (zoom-out again)
  ¤  Negating/excluding values
  ¤  Replacing a filter with another (shift)
¨  Implemented by hyperlinks (for conjunctive filters / shift), check boxes (for disjunctions), etc.
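The transformations listed above can be sketched as pure functions over a query state. Here the state is assumed to be a dictionary mapping each facet to a set of alternative values (OR within a facet, AND across facets); negation is left out to keep the example short. The data and function names are illustrative only.

```python
ITEMS = [
    {"style": "impressionism", "location": "Como"},
    {"style": "impressionism", "location": "Milan"},
    {"style": "baroque", "location": "Milan"},
]

def results(query, items=ITEMS):
    """Items whose value for every constrained facet is among the allowed alternatives."""
    return [it for it in items
            if all(it[f] in values for f, values in query.items())]

def add_restriction(query, facet, value):
    """Zoom-in: constrain a facet to a single value."""
    return {**query, facet: frozenset([value])}

def add_alternative(query, facet, value):
    """Zoom-out: add an alternative in disjunction for a facet."""
    return {**query, facet: query.get(facet, frozenset()) | {value}}

def remove_constraint(query, facet):
    """Zoom-out: drop every constraint on a facet."""
    return {f: v for f, v in query.items() if f != facet}

def shift(query, facet, old, new):
    """Shift: replace one value of a facet with another."""
    return {**query, facet: (query[facet] - {old}) | {new}}

q1 = add_restriction({}, "style", "impressionism")
q2 = add_restriction(q1, "location", "Como")
q3 = add_alternative(q2, "location", "Milan")
q4 = shift(q2, "location", "Como", "Milan")
q5 = remove_constraint(q3, "style")
```

Each function returns a new state instead of modifying the old one, which maps naturally onto hyperlinks: every link on the page encodes the state reached by applying one transformation.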

How values are (usually) combined

¨  Filters belonging to different facets are combined in conjunction
  ¤  E.g. “technique:oil” AND “style:impressionism”
¨  Filters belonging to the same facet are:
  ¤  Combined in conjunction if the facet admits several values at the same time for each object
    ▪  E.g. “subject:people” AND “subject:animals”
    ▪  (both people and animals in the same picture)
  ¤  Combined in disjunction if the facet admits only one value
    ▪  E.g. “location:Milan” OR “location:Como”
    ▪  (an object which is in Como or in Milan)
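The usual combination rule can be written down directly. In this sketch the set `MULTI_VALUED` declares which facets admit several values per object; that declaration, the sample items and the function `matches` are assumptions made for the example.

```python
# Facets that admit several values at the same time for each object (assumed).
MULTI_VALUED = {"subject"}

ITEMS = [
    {"title": "A", "location": "Milan", "subject": {"people", "animals"}},
    {"title": "B", "location": "Como", "subject": {"people"}},
    {"title": "C", "location": "Rome", "subject": {"animals", "people"}},
]

def matches(item, selections):
    """AND across facets; AND within multi-valued facets, OR within single-valued ones."""
    for facet, chosen in selections.items():
        if facet in MULTI_VALUED:
            if not chosen <= item[facet]:   # every chosen value must be present
                return False
        elif item[facet] not in chosen:     # any one of the chosen values is enough
            return False
    return True

selections = {"location": {"Milan", "Como"}, "subject": {"people", "animals"}}
titles = [it["title"] for it in ITEMS if matches(it, selections)]
```

Only item A survives: B is in Como but shows no animals, and C shows both subjects but is in neither Milan nor Como.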

Types of facets

¨  Single-valued (functional properties) vs. multi-valued
¨  Flat vs. hierarchical organization of values
  ¤  E.g. hierarchical: nation/region/province
¨  Subjective/arbitrary (facets properly so called) vs. objective (attributes)
  ¤  A date, a location, a price are examples of objective data
  ¤  “Topic”, “Audience”, “Artistic movement”, “Importance” are examples of subjective information
  ¤  Assigning/using a value involves some kind of judgment and interpretation and is influenced by cultural and personal backgrounds

Types of facet values

¨  Terms (strings of text)
  ¤  Taxonomies, controlled vocabularies
  ¤  User-defined tags (folksonomies)
  ¤  From data mining
¨  Numerical values and dates
¨  Boolean values (yes/no)
  ¤  E.g. “Available for purchase?”, “Original?”, “Still living?”
¨  Even shades of color, shapes, etc.
¨  Sortable and comparable?
  ¤  Can we say that value1 <= value2 <= … <= valueN?
  ¤  E.g. dates, magnitudes, scales of judgment, quantitative data
    ▪  E.g. “sufficient” < “excellent”, 10€ < 100€, “Monday” < “Friday”
  ¤  Ranges [value1, value2]
    ▪  E.g. the user is allowed to search for events from 01/06 to 31/08
  ¤  Classes of values
    ▪  E.g. for price: 0-10€, 11-20€, 21-50€, 51-100€, …
    ▪  The way we define classes is arbitrary and depends on the domain
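Mapping a numeric value to one of the price classes above takes only a sorted list of upper bounds. The sketch below covers the four classes named in the example; the catch-all label "over 100€" is an assumption standing in for the classes the slide leaves unspecified.

```python
from bisect import bisect_left

# Upper bounds (in euros) of the classes 0-10, 11-20, 21-50 and 51-100.
UPPER_BOUNDS = [10, 20, 50, 100]
# "over 100€" is a hypothetical catch-all for everything beyond the listed classes.
LABELS = ["0-10€", "11-20€", "21-50€", "51-100€", "over 100€"]

def price_class(price):
    """Return the label of the class a price falls into."""
    return LABELS[bisect_left(UPPER_BOUNDS, price)]

def in_range(value, low, high):
    """Range filter [low, high], usable with any comparable values (numbers, dates...)."""
    return low <= value <= high
```

Since the class boundaries are arbitrary, keeping them in a single list makes it easy to change them per domain without touching the lookup code.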

Benefits of faceted search

¨  Easy and natural, almost like “traditional” browsing
¨  Compared with keyword-based search, users get hints
  ¤  Users can more easily make sense of information (if supported by good interfaces)
  ¤  …and learn about the context by interacting with it
¨  Users can freely combine multiple classifications according to their wishes
  ¤  In traditional browsing, when you reach a terminal concept you can’t refine further
  ¤  With faceted search, you can continue refining with related concepts
¨  Navigation is safe: frustrating “no results found” searches are avoided
  ¤  Only concepts that have been used to classify the current set of results are displayed

Limitations

¨  It works well only with structured data
¨  Faceted search does not provide a ranking of results
  ¤  For “object seeking” tasks this might be a limitation
  ¤  It may be better to compute the “distance” from an “optimal” solution → optimization task
¨  Other limitations are discussed in the following slides on advanced issues
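One way to read the "distance from an optimal solution" idea is to rank items by how far they are from an ideal object. The slides do not say which distance to use, so the normalised Euclidean distance below, the ideal values and the camera data are all assumptions.

```python
# Hypothetical products and a hypothetical ideal object for an "object seeking" task.
CAMERAS = [
    {"name": "A", "price": 300, "megapixels": 12},
    {"name": "B", "price": 450, "megapixels": 24},
    {"name": "C", "price": 900, "megapixels": 26},
]
IDEAL = {"price": 400, "megapixels": 24}
# Rough value ranges, used to make the two features comparable.
SCALE = {"price": 1000, "megapixels": 30}

def distance(item):
    """Normalised Euclidean distance between an item and the ideal object."""
    return sum(((item[k] - IDEAL[k]) / SCALE[k]) ** 2 for k in IDEAL) ** 0.5

ranked = sorted(CAMERAS, key=distance)
```

Unlike a facet filter, this never returns an empty set: items that miss the target are simply ranked lower, which is why such a ranking could complement faceted filtering.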

Advanced (research) issues

Full Boolean queries | 1

¨  How to achieve something like this?

“Given a budget of 250,000 euros, I’m interested in a flat with at least 4 rooms and no central heating in the centre, or a house with at least 5 rooms in the suburbs”

Full Boolean queries | 2

¨  Foci (Ferré et al.): the set of sub-expressions in the semantic tree of the query
¨  A query is a pair $(q, \phi)$, where $q$ is an arbitrary combination of filters and $\phi$ is one of its foci
  ¤  The focus is used to select the subquery to which the new filter should be appended (or the transformation applied)
  ¤  …but also to “inspect” different points of view on the information
  ¤  The main focus represents the “whole” query
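The notion of focus can be illustrated on the flat/house query of the previous slide. The sketch encodes the query as nested tuples, enumerates every sub-expression as a possible focus, and appends a filter at a chosen focus. The string leaves, the path-based addressing and the extra filter "garden:yes" are all hypothetical; this is not how Ferré et al. implement foci.

```python
# The example query as a tree: ("and" | "or" | "not", child, ...) with string leaves.
QUERY = ("and", "price<=250000",
         ("or",
          ("and", "type:flat", "rooms>=4", ("not", "heating:central"), "area:centre"),
          ("and", "type:house", "rooms>=5", "area:suburbs")))

def foci(q, path=()):
    """Yield (path, sub-expression) for every node of the query tree."""
    yield path, q
    if isinstance(q, tuple):
        for i, child in enumerate(q[1:], start=1):
            yield from foci(child, path + (i,))

def append_at(q, path, new_filter):
    """Return a copy of q in which the sub-expression at `path` is and-ed with new_filter."""
    if not path:
        return ("and", q, new_filter)
    i = path[0]
    return q[:i] + (append_at(q[i], path[1:], new_filter),) + q[i + 1:]

all_foci = dict(foci(QUERY))
house = all_foci[(2, 2)]                        # the "house" branch of the disjunction
refined = append_at(QUERY, (2, 2), "garden:yes")
```

The empty path `()` plays the role of the main focus (the whole query), while `(2, 2)` selects the house alternative, so the new filter affects only that branch and leaves the flat alternative untouched.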

Semantic faceted search

¨  We can filter items, but how can we filter facet values?
  ¤  E.g. paintings filtered by artist
  ¤  But how do we filter the Artists facet values by nationality, gender, age, etc.?
¨  Exploring contents at the level of sets using semantic relationships, e.g.
  ¤  The museums that have bronze Greek statues
  ¤  “Women portrayed by women”: paintings with subject:woman and artist:gender:female
  ¤  Schools attended by the daughters of U.S. Democratic presidents (http://www.freebase.com/labs/parallax/)
  ¤  Challenges: effective models and usable interfaces
¨  An example: Sewelis
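The “women portrayed by women” query can be sketched as a two-step filter: first restrict the values of the artist facet by their own properties, then restrict the paintings to those artists. The tiny dataset and the helper names are chosen for illustration; a real system such as Sewelis works on semantic (RDF) data and is not modelled here.

```python
# Facet values (artists) are themselves described by properties.
ARTISTS = {
    "Artemisia Gentileschi": {"gender": "female", "nationality": "Italian"},
    "Leonardo da Vinci": {"gender": "male", "nationality": "Italian"},
    "Rosa Bonheur": {"gender": "female", "nationality": "French"},
}
PAINTINGS = [
    {"title": "Judith Slaying Holofernes", "subject": "woman", "artist": "Artemisia Gentileschi"},
    {"title": "Mona Lisa", "subject": "woman", "artist": "Leonardo da Vinci"},
    {"title": "The Horse Fair", "subject": "horses", "artist": "Rosa Bonheur"},
]

def artists_where(**conditions):
    """Filter the values of the artist facet by their own properties."""
    return {name for name, props in ARTISTS.items()
            if all(props.get(k) == v for k, v in conditions.items())}

def paintings_where(subject, artist_conditions):
    """subject:<subject> and artist:<property>:<value>, as in artist:gender:female."""
    allowed = artists_where(**artist_conditions)
    return [p["title"] for p in PAINTINGS
            if p["subject"] == subject and p["artist"] in allowed]

women_by_women = paintings_where("woman", {"gender": "female"})
```

The intermediate set returned by `artists_where` is what a semantic facet browser would let the user explore in its own right, before going back to the paintings.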

Beyond binary classification | 1

¤  Classification (faceted or not) is usually binary:
  ▪  An item must be either relevant (1) or not relevant (0) to a certain category
  ▪  Problem: this is a quite arbitrary decision in many real domains

¨  How to classify a cathedral by architectural style?
  ¤  Built upon a 6th century building
  ¤  Mainly Gothic
  ¤  17th century (Baroque) towers
  ¤  Rebuilt during Neoclassicism
  ¤  Decorations added in the 19th century
  ¤  Contains Roman forum marbles (donated by Pius IX)
  ¤  …
¨  Do we tag the cathedral with all or only some of these?
¨  A classification may be correct for one kind of user but ineffective for another

¨  The Mona Lisa is a well-known portrait of a woman, but…
¨  There is also a landscape in the background
¨  Do we classify it as “subject: woman” and “subject: Tuscan landscape” too?

¨  Onion is widely used in French cuisine
¨  How do we distinguish “onion-based” recipes from all the recipes that merely contain onion?


¨  A possible solution: associating weights with each item-facet-value triple
  ¤  A statement about the statement
¨  Values between 0 and 1 or other scales
¨  Queries could be specified in terms of facet-value pairs and ranges of weights

¨  Subjective weights
  ¤  Relevance: to what extent the item can be considered as belonging to a certain facet value
  ¤  Significance: the relative importance of the item according to a facet value
¨  Objective weights
  ¤  E.g. concentration or quantity (e.g. an object is made of 10% material:bronze)
  ¤  E.g. for exploring venues: distance from points of interest
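Weighted classification can be sketched by attaching a number to every item-facet-value triple and letting the query specify a range of weights. The weights below are invented for the sake of the example, as are the recipe names and the function `select`.

```python
# Weight of each (item, facet, value) triple, between 0 and 1 (hypothetical values).
WEIGHTS = {
    ("Mona Lisa", "subject", "woman"): 1.0,
    ("Mona Lisa", "subject", "landscape"): 0.3,
    ("onion soup", "ingredient", "onion"): 0.9,
    ("beef stew", "ingredient", "onion"): 0.1,
}

def select(facet, value, min_weight=0.0, max_weight=1.0):
    """Items classified under facet:value with a weight in [min_weight, max_weight]."""
    return sorted(item for (item, f, v), w in WEIGHTS.items()
                  if f == facet and v == value and min_weight <= w <= max_weight)

with_onion = select("ingredient", "onion")                    # every recipe containing onion
onion_based = select("ingredient", "onion", min_weight=0.5)   # only "onion-based" recipes
```

Binary classification is the special case in which every stored weight is 1 and the query range is left at its default.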


¨  Interaction (concepts)

Handling information overload

¨  Too many facets and facet values may generate information overload too!
  ¤  Possible solution: display only the most relevant facets (and facet values) for the user profile or the given context
¨  How to determine the most “interesting” facets in a given context?
  ¤  E.g. those with a less “uniform” distribution of values (more correlation)
  ¤  We will discuss this in a later lecture… :-)
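One possible way to measure how “uniform” the distribution of a facet's values is in the current results is normalised Shannon entropy: 1.0 for a perfectly uniform distribution, lower values for skewed ones. The slides do not name a measure, so entropy is an assumption here, as are the toy results and function names.

```python
from collections import Counter
from math import log2

def normalized_entropy(values):
    """Shannon entropy of the value distribution, scaled to [0, 1]."""
    counts = Counter(values)
    if len(counts) < 2:
        return 0.0
    n = len(values)
    h = -sum(c / n * log2(c / n) for c in counts.values())
    return h / log2(len(counts))

# Hypothetical current result set.
RESULTS = [
    {"kind": "museum", "civilization": "Romans"},
    {"kind": "site", "civilization": "Romans"},
    {"kind": "museum", "civilization": "Romans"},
    {"kind": "site", "civilization": "Greeks"},
]

def rank_facets(results):
    """Facets ordered from the least to the most uniform distribution of values."""
    facets = results[0].keys()
    return sorted(facets, key=lambda f: normalized_entropy([r[f] for r in results]))

ranking = rank_facets(RESULTS)
```

A facet with a single value in the results also scores 0.0, although it offers nothing to filter on, so a real interface would probably need to treat that case separately.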

Interested in MS Theses? Contact us! :-)

¨  Advisors: Prof. Di Blas, Prof. Paolini
¨  Both theoretical and development
¨  Fuzzy facets
¨  Semantic faceted search
¨  Advanced visualizations
¨  …
¨  Your own ideas! :-)

Are you still alive/awake? Thank you for your attention!

Any final questions?
