Models and interaction mechanisms for exploratory interfaces
For the course in Information and Communication Quality (prof. Di Blas) for the MS in Computer Engineering at Politecnico di Milano
COMO CAMPUS
Models and interaction mechanisms for exploratory interfaces
Luigi Spagnolo
1 Information and Communication Quality
Index 2
¨ PREVIEW: Online experimentation!
¨ Part I: navigation, search and exploration
¤ Break
¨ Part II: Faceted search: the model(s) and the interaction
¨ Visualization issues will be covered in another lecture
3 PREVIEW: Online Experimentation
Intro 4
¨ This lecture starts in a rather unusual way :-)
¨ To introduce you to exploratory interfaces, you’ll take part in a research experiment
¨ But don’t worry!
¤ It’s not dangerous for your health :-)
¤ The questionnaire you’re asked to fill in is anonymous and the answers will not be graded
The application | 1 5
The application | 2 6
¨ The latest version of a prototype built for the Italian Ministry of Culture
¨ A map for exploring venues of archaeological interest in Italy
¨ According to three properties (facets):
¤ Kind of venue: museum, archaeological site and superintendence (a local branch of the Ministry of Culture devoted to archaeological heritage management).
¤ Location: the venue location, at the level of macro-area (Northern Italy, Central Italy, etc.), Italian region and Italian province.
¤ Civilization or Period: the ancient civilizations (Romans, Greeks, etc.) or periods (e.g. Middle Ages, Bronze Age) the venues are relevant to.
The application | 3 7
¨ The tag cloud:
¤ Tag size → the number of results that are relevant with respect to the period or civilization in question.
¤ Text color → how much the percentage of results that are relevant for the period/civilization deviates from a uniform distribution.
n Shades of green show a stronger positive correlation between the other selected filters (e.g. the location and/or the venue type) and the civilization/period in question. Red instead shows a negative correlation (the civilization/period is less significant with respect to the other criteria selected).
¤ Background color → of the whole set of venues that are relevant for the period/civilization, what percentage is included in the results?
n Green shows a positive correlation, while red shows a negative correlation.
¤ E.g., for venues in a specific region only (e.g. Lombardy), a green tag indicates that the given civilization was particularly relevant for that region.
¤ A green background shows instead that the civilization is peculiar to that region, and is less likely to be found elsewhere.
The application | 4 8
¨ The map:
¤ At three levels: Italian region, Italian province, exact location(s)
¤ The color of the circle → the specific type of venue
¤ The size of the circle → the number of items of that type in that area
The experiment 9
¨ Go to http://tinyurl.com/exp-icq ¤ (or http://www.ellesseweb.com/mining/)
¨ You will find a page with two links:
1. The application
2. An online questionnaire (on Rational Survey)
¤ Keep both open in the browser
¨ Work individually (1 hour max)
¨ Answer with your opinions, without looking at other websites, just at the ArchaeoItaly application
¤ Remember: the survey is anonymous, and there are no “correct answers”!
¤ If in doubt, ask me!
10 Part 1 | Navigation, search and exploration
Let’s start with a scenario 11
¨ Work in pairs
¨ Imagine you work as journalists for the Horse Illustrated magazine
¨ You have to write an essay about horses in art (and in particular in painting) across the centuries.
¨ Find interesting information on the website of the Louvre Museum
¤ http://www.louvre.fr/llv/commun/home.jsp?bmLocale=en
Problems with the Louvre 12
¨ Artworks are separated by department (internal “bureaucratic” classification) and by provenance.
¨ It is not possible to search them together (regardless of their age and country of origin) by subject.
¨ There is no introductory content on the subject that can guide the student in her search.
Content-intensive websites 13
¨ Also known as:
¤ Information-intensive
¤ Often Infosuasive = informative + persuasive
¤ Like ancient rhetoric: inform and persuade
¨ Mainly intended for:
¤ Learning, understanding, discovering, comparing information
¤ Leisure and entertainment
Contents 14
¨ Text, multimedia (audio, video, images)
¨ Hypermedia = multimedia + hyperlinks
¨ Information involves subjective judgment
¤ Depends on the author and on the user
¤ Objective: “10 km from Como”, “the painting was made in 1886”
¤ Subjective: “Near Como”, “the painting is impressionist”
User experience requirements | 1 15
¤ From the users’ point of view:
n Usability: usage is effective, efficient and satisfactory
n Findability: users can locate what they are looking for
n “At a glance” understandability: users understand the website coverage and can make sense of information
n Enticing explorability: users are compelled to “stay and play” and discover interesting connections among topics
User experience requirements | 2 16
¤ From the stakeholders’ point of view:
n Planned serendipity: promoting the most important contents so that users can stumble upon them
n E.g. “Readers that purchased this book also bought…”
n Communication strength and branding: the website conveys the intended “message” and “brand” of the institution behind it
n E.g. “we have the lowest prices”, “we are very authoritative”, etc.
Information architecture 17
¤ Purpose: conceptually organizing information
¤ Providing access to contents
n Index navigation (a)
n Guided navigation (b)
¤ Providing the possibility of moving from a content item to related ones
n Contextual navigation (c): cross-reference links, semantic relationships
“Traditional” structure 18
¤ Taxonomy: hierarchy of categories and subcategories
n Sections and groups of contents are the branches of the tree
n Contents are the leaves
¤ Cross-reference links between nodes
An example 19
Sitemap: Art gallery website
¤ Artworks of the month
¤ Paintings
n Top 10 masterpieces
n By artist
n By artistic movement
n By subject
¤ Sculptures
n ...
n By material
¤ Photographs
n ...
Problems/1 20
¨ What if I want to browse all artworks (regardless of their type) by artist?
¤ Classifications are “nested” in a fixed order
¤ Designers have to choose which classification should prevail (e.g. by type)
¨ What if I want to find “impressionist paintings portraying animals”?
¤ I cannot combine multiple “sibling” classifications (e.g. by style and by subject)
Problems/2 21
¨ As long as the website is small a good taxonomy can satisfy user requirements
¨ For large websites
¤ (hundreds or thousands of pages)
¤ Indexed/guided navigation doesn’t scale
¤ Users can’t easily find what they want
¤ Users can’t make sense of all such information
Solutions? 22
¨ What do users do when navigation doesn’t work?
¤ They use search!
¤ Search arranges contents dynamically and automatically (in a way not predefined by designers)
¨ But keyword-based search is not optimal
¤ No hints for users who have no clear idea of what to look for
¤ Users must know how the information is described (e.g. the specific jargon used)
¤ Only suited to retrieval/focused search
¨ We need a better paradigm: Exploratory search
Exploratory search 23
¨ The model “query → results” is (too) simple
¨ Search is often like berry picking! (Bates 1989)
¤ Users explore a corpus of contents
¤ They refine the query (again and again) according to what they learn
¤ They pick information here and there, piece by piece
From search to exploration 24
¨ From finding to understanding (Marchionini)
¤ Acquire knowledge about a domain, its jargon, the properties of information items in it.
¤ Useful to (better) understand what to look for
¤ …but also to analyze a dataset
Goals of exploratory applications 25
¨ Object seeking
¤ Identify the best object(s) whose features match user requirements (e.g. purchasing a camera with concerns regarding price, resolution, etc.)
¨ Knowledge seeking
¤ Expand the knowledge about a given topic and related information (e.g. Leonardo da Vinci and the Italian Renaissance)
¨ Wisdom seeking
¤ Discover interesting relationships among features in an information space/dataset (e.g. analysis of sales in Esselunga chain stores, according to store location, type of article, price, etc.)
¨ These goals can possibly coexist in the same application
Retrieval vs. exploration models 26
¨ Retrieval model: query + results
¤ Query can be either:
n Free form (e.g. keyword-based search)
n Structured (parametric search, e.g. Scholar advanced search)
n Guided (select data from a predefined set of choices)
¨ Exploration model:
¤ Query + results + refinements/feedback
¤ Query supported by self-adaptive structures for:
n Further filtering results to a subset of them
n Summarizing the features shared by results
27 Part 2 | Faceted search: model(s) and interaction
(Amazon’s Diamond search was one of the first e-commerce applications of faceted search)
Faceted search 28
¨ An exploratory search/navigation pattern based on progressive filtering of results
¨ The user selects a combination of metadata values belonging to several facets
¨ Each facet corresponds to a particular dimension that describes the content objects made available for search, e.g. for an artwork:
¤ Subject: people portrayed, flowers and plants, abstract...
¤ Medium: painting, sculpture, photography...
¤ Technique: oil, watercolors, digital art...
¤ Style: impressionism, expressionism, abstractism...
¤ Location: Prado, Louvre, Guggenheim
Let’s see a couple of examples 29
¨ Two examples:
¤ http://orange.sims.berkeley.edu/cgi-bin/flamenco.cgi/famuseum/Flamenco
¤ http://www.artistrising.com
¨ Try the same search we’ve seen before: find horses in art
¨ More examples at: http://www.flickr.com/photos/morville/collections/72157603789246885/
Not just a matter of finding… E.g. you can learn that horses in art are often found in paintings portraying soldiers or warriors, and leaders
30
How the interaction works 31
¨ When the user chooses a filter, the application selects:
¤ The results: items that have been “tagged” with the filter and the other metadata previously chosen
¤ The remaining filters: metadata that, combined with the previous choices, can produce results
¨ Users can continue narrowing the results as long as options are available
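The interaction loop described above can be sketched in a few lines. This is only a minimal illustration, not the implementation of any of the applications mentioned: the catalogue `ITEMS`, its item names and the functions `results` and `remaining_filters` are all hypothetical.

```python
# Hypothetical toy catalogue: each item maps a facet to the set of values it is tagged with.
ITEMS = {
    "horse_study": {"subject": {"animals"}, "medium": {"painting"}, "style": {"impressionism"}},
    "cavalry_charge": {"subject": {"animals", "people"}, "medium": {"painting"}, "style": {"romanticism"}},
    "bronze_rider": {"subject": {"animals", "people"}, "medium": {"sculpture"}, "style": {"classicism"}},
    "water_lilies": {"subject": {"flowers"}, "medium": {"painting"}, "style": {"impressionism"}},
}


def results(items, selected):
    """Items tagged with every (facet, value) pair chosen so far."""
    return {
        name
        for name, description in items.items()
        if all(value in description.get(facet, set()) for facet, value in selected)
    }


def remaining_filters(items, selected):
    """For each not-yet-chosen (facet, value) pair, count the current results carrying it."""
    counts = {}
    for name in results(items, selected):
        for facet, values in items[name].items():
            for value in values:
                if (facet, value) not in selected:
                    counts[(facet, value)] = counts.get((facet, value), 0) + 1
    return counts


selected = {("subject", "animals")}
print(sorted(results(ITEMS, selected)))
print(remaining_filters(ITEMS, selected))
```

Because the remaining filters are computed from the current results only, every filter offered to the user leads to at least one item.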
A (generalized) formal model | 1 32
¨ Taxonomy: a pair $(T, \preceq)$
¤ A set of concepts or terms $T = \{t_1, t_2, \dots, t_n\}$
¤ The subsumption relation $\preceq$ connecting narrower terms (hyponyms) to broader concepts (hypernyms)
¤ E.g. $\text{laptop} \preceq \text{computer}$, $\text{location:'Como'} \preceq \text{location:'Lombardy'} \preceq \text{location:'Italy'}$
¤ Terminal concepts: terms not further specialized (the “leaves”)
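As a rough sketch, a taxonomy of this kind can be stored as a map from each term to its direct broader term, and subsumption checked by walking upwards. The names `BROADER`, `subsumed_by` and `terminal_concepts`, and the extra term `location:Milan`, are assumptions made for illustration.

```python
# Hypothetical taxonomy: each term points to its direct broader term (None for a root).
BROADER = {
    "location:Como": "location:Lombardy",
    "location:Milan": "location:Lombardy",
    "location:Lombardy": "location:Italy",
    "location:Italy": None,
}


def subsumed_by(narrower, broader, parent=BROADER):
    """True if `narrower` equals `broader` or lies below it in the hierarchy."""
    term = narrower
    while term is not None:
        if term == broader:
            return True
        term = parent.get(term)  # climb one level; unknown terms end the walk
    return False


def terminal_concepts(parent=BROADER):
    """Terms that are nobody's broader term, i.e. the leaves."""
    broader_terms = set(parent.values())
    return {term for term in parent if term not in broader_terms}


print(subsumed_by("location:Como", "location:Italy"))
print(sorted(terminal_concepts()))
```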
A (generalized) formal model | 2 33
¨ For faceted taxonomies, concepts are given in terms of property-value pairs (restrictions): $q = \text{property}:\text{value}$
¤ E.g. subject: “horse”, location: “Como”
¨ A query is any of:
¤ A restriction
¤ A conjunction ($q_1 \text{ and } q_2$), disjunction ($q_1 \text{ or } q_2$) or negation ($\text{not } q$) of (sub)queries
¤ Actually there are limitations in the way concepts can be combined in current facet browser implementations
A (generalized) formal model | 3 34
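¨ Item description: an information item $o \in O$ is described as a conjunction of restrictions
$d(o) = \text{subject:"horse"} \text{ and } \text{style:"Impressionism"} \text{ and } \dots$
¨ Extension of a query: the set of items in a context $O$ that match the query
$ext_O(q) = \{\, o \in O \mid d(o) \preceq q \,\}$
$ext(q_1 \text{ and } q_2) \subseteq ext(q_1),\ ext(q_2)$
$ext(q_1),\ ext(q_2) \subseteq ext(q_1 \text{ or } q_2)$
$ext(\text{not } q) \equiv ext(\text{ALL}) \setminus ext(q)$
$t_c \preceq t_p \Rightarrow ext(t_c) \subseteq ext(t_p)$
where $\preceq$ denotes subsumption (the narrower term or description is on the left).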
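The extension of a query and the inclusions listed above can be checked on a toy context. The sketch below is an assumption-laden illustration: the class names, the context `O` and the function `ext` are hypothetical, and values are treated as flat (no hierarchy), so a restriction matches only items carrying exactly that pair.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Restriction:
    prop: str
    value: str


@dataclass(frozen=True)
class And:
    left: object
    right: object


@dataclass(frozen=True)
class Or:
    left: object
    right: object


@dataclass(frozen=True)
class Not:
    query: object


# Hypothetical context O: each description d(o) is a set of (property, value) pairs.
O = {
    "o1": {("subject", "horse"), ("style", "Impressionism")},
    "o2": {("subject", "horse"), ("style", "Baroque")},
    "o3": {("subject", "flowers"), ("style", "Impressionism")},
}


def ext(q, context=O):
    """Set of items of the context that match query q."""
    if isinstance(q, Restriction):
        return {o for o, d in context.items() if (q.prop, q.value) in d}
    if isinstance(q, And):
        return ext(q.left, context) & ext(q.right, context)
    if isinstance(q, Or):
        return ext(q.left, context) | ext(q.right, context)
    if isinstance(q, Not):
        return set(context) - ext(q.query, context)  # ext(ALL) minus ext(q)
    raise TypeError(f"not a query: {q!r}")


horse = Restriction("subject", "horse")
impressionism = Restriction("style", "Impressionism")
print(sorted(ext(And(horse, impressionism))))
print(sorted(ext(Or(horse, impressionism))))
print(sorted(ext(Not(horse))))
```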
A (generalized) formal model | 4 35
¨ The result of a query is:
¤ Its extension $ext_O(q)$ in the given information space
¤ The set of features shared by these results: i.e. all the concepts that can be derived from the descriptions of objects in $ext_O(q)$
Query transformations 36
¨ Operations that allow navigating from one state of the exploration to another
¤ Appending new restrictions to the query in conjunction (zoom-in: from a wider to a narrower set of results)
¤ Adding alternatives in disjunction to the existing ones (zoom-out: from a narrower to a wider set)
¤ Removing existing constraints (zoom-out again)
¤ Negating/excluding values
¤ Replacing a filter with another (shift)
¨ Implemented by hyperlinks (for conjunctive filters / shift), check boxes (for disjunctions), etc.
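A minimal sketch of these transformations, assuming a simplified query state: a dict mapping each constrained facet to a set of alternative values (alternatives in disjunction, facets in conjunction). Negation is left out, and all names (`ARTWORKS`, `zoom_in`, `add_alternative`, `remove_constraint`, `shift`, `run`) are hypothetical.

```python
# Hypothetical single-valued items.
ARTWORKS = [
    {"title": "A", "technique": "oil", "location": "Milan"},
    {"title": "B", "technique": "oil", "location": "Como"},
    {"title": "C", "technique": "watercolor", "location": "Como"},
]


def _copy(query):
    return {facet: set(values) for facet, values in query.items()}


def zoom_in(query, facet, value):
    """Append a restriction on a new facet in conjunction (narrower results)."""
    if facet in query:
        raise ValueError("facet already constrained: use add_alternative or shift")
    new = _copy(query)
    new[facet] = {value}
    return new


def add_alternative(query, facet, value):
    """Add an alternative in disjunction to an already constrained facet (wider results)."""
    new = _copy(query)
    new[facet].add(value)  # raises KeyError if the facet is not constrained yet
    return new


def remove_constraint(query, facet):
    """Drop every restriction on a facet (wider results)."""
    new = _copy(query)
    del new[facet]
    return new


def shift(query, facet, old_value, new_value):
    """Replace one filter with another on the same facet."""
    new = _copy(query)
    new[facet].discard(old_value)
    new[facet].add(new_value)
    return new


def run(query):
    """Titles of the items matching every constrained facet."""
    return [
        item["title"]
        for item in ARTWORKS
        if all(item.get(facet) in values for facet, values in query.items())
    ]


q1 = zoom_in({}, "technique", "oil")
q2 = zoom_in(q1, "location", "Milan")
print(run(q1), run(q2), run(add_alternative(q2, "location", "Como")))
```

Each function returns a new state instead of modifying the old one, which makes it easy to keep a history of exploration steps (e.g. for a back button).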
37
How values are (usually) combined
¨ Filters belonging to different facets are combined in conjunction
¤ E.g. “technique:oil” AND “style:impressionism”
¨ Filters belonging to the same facet are:
¤ Combined in conjunction if the facet admits more than one value at the same time for each object
n E.g. “subject:people” AND “subject:animals”
n (both people and animals in the same picture)
¤ Combined in disjunction if the facet admits only one value
n E.g. “location:Milan” OR “location:Como”
n (an object which is in Como or in Milan)
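These combination rules can be written down as a small matching function. This is a sketch under stated assumptions: `MULTI_VALUED`, `PAINTINGS`, `matches` and `select` are hypothetical names, and the item located in Rome is invented for the example.

```python
# Hypothetical: facets that admit several values at the same time for each object.
MULTI_VALUED = {"subject"}

PAINTINGS = {
    "hunt": {"subject": {"people", "animals"}, "location": {"Milan"}},
    "portrait": {"subject": {"people"}, "location": {"Como"}},
    "herd": {"subject": {"animals"}, "location": {"Rome"}},
}


def matches(item, selection):
    """selection maps a facet to the set of values chosen for it."""
    for facet, chosen in selection.items():
        if not chosen:
            continue  # nothing selected on this facet
        values = item.get(facet, set())
        if facet in MULTI_VALUED:
            if not chosen <= values:  # conjunction: every chosen value must be present
                return False
        elif not (chosen & values):  # disjunction: one chosen value is enough
            return False
    return True  # different facets are combined in conjunction


def select(selection):
    return sorted(name for name, item in PAINTINGS.items() if matches(item, selection))


print(select({"subject": {"people", "animals"}}))
print(select({"location": {"Milan", "Como"}}))
```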
38
Types of facets
¨ Single-valued (functional properties) vs. multi-valued
¨ Flat vs. hierarchical organization of values
¤ E.g. hierarchical: nation/region/province
¨ Subjective/arbitrary (properly named facets) vs. objective (attributes)
¤ A date, a location, a price are examples of objective data
¤ “Topic”, “Audience”, “Artistic movement”, “Importance” are examples of subjective information
¤ Assigning/using a value involves some kind of judgment and interpretation and is influenced by cultural and personal backgrounds
Types of facet values
¨ Terms (strings of text)
¤ Taxonomies, controlled vocabularies
¤ User-defined tags (folksonomies)
¤ From data mining
¨ Numerical values and dates
¨ Boolean values (yes/no)
¤ E.g. “Available for buying?”, “original?”, “still living?”
¨ Even shades of color, shapes, etc...
¨ Sortable and comparable?
¤ Can we say that value1<=value2<=…<=valueN?
¤ E.g. dates, magnitudes, scales of judgment, quantitative data
n e.g. “sufficient”<“excellent”, 10€<100€, “Monday”<“Friday”
¤ Ranges [value1, value2]
n E.g. the user is allowed to search for events from 01/06 to 31/08
¤ Classes of values
n e.g. for price: 0-10€, 11-20€, 21-50€, 51-100€, …
n The way we define classes is arbitrary and depends on the domain
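For sortable values, classes and ranges are straightforward to compute. The sketch below assumes whole-euro prices and covers only the four price classes spelled out in the example, so higher prices are rejected rather than guessed; `UPPER_BOUNDS`, `LABELS`, `price_class` and `in_range` are hypothetical names, and the year 2024 in the usage line is an arbitrary choice.

```python
from bisect import bisect_left
from datetime import date

# Upper bounds of the price classes named in the example (whole euros assumed).
UPPER_BOUNDS = [10, 20, 50, 100]
LABELS = ["0-10€", "11-20€", "21-50€", "51-100€"]


def price_class(price):
    """Map a whole-euro price to its class label."""
    if price < 0 or price > UPPER_BOUNDS[-1]:
        raise ValueError(f"no class defined for price {price}")
    # bisect_left finds the first class whose upper bound is >= price
    return LABELS[bisect_left(UPPER_BOUNDS, price)]


def in_range(value, low, high):
    """Closed range [low, high]; works for any mutually comparable values."""
    return low <= value <= high


print(price_class(15))
# Events from 01/06 to 31/08 of a hypothetical year
print(in_range(date(2024, 7, 15), date(2024, 6, 1), date(2024, 8, 31)))
```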
39
Benefits of faceted search 40
¨ Easy and natural, almost like “traditional” browsing
¨ With respect to keyword-based search, users have hints
¤ Users can more easily make sense of information (if supported by good interfaces)
¤ …and learn about the context by interacting with it
¨ Users can freely combine multiple classifications according to their wishes
¤ In traditional browsing, when you reach a terminal concept you can’t refine further
¤ With faceted search, you can continue refining with related concepts
¨ Navigation is safe: frustrating “no results found” searches are avoided
¤ Only concepts that have been used to classify the current set of results are displayed
Limitations 41
¨ It works well only with structured data
¨ Faceted search does not provide a ranking of results
¤ For “object seeking” tasks this might be a limitation
¤ It may be better to compute the “distance” with respect to an “optimal” solution → optimization task
¨ Other limitations are discussed in the following slides on advanced issues
Advanced (research) issues 42
43
Full Boolean queries | 1
¨ How to achieve something like this?
“Given a budget of 250,000 euros, I’m interested in a flat with at least 4 rooms and no central heating in the centre, or a house with at least 5 rooms in the suburbs”
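One possible reading of this request as a nested Boolean expression is sketched below. The interpretation is an assumption (the budget applies to both alternatives, and “in the centre” qualifies the flat), and the field names and listings are hypothetical.

```python
def wanted(listing):
    """Budget AND ((flat in the centre ...) OR (house in the suburbs ...))."""
    flat_in_centre = (
        listing["kind"] == "flat"
        and listing["rooms"] >= 4
        and listing["heating"] != "central"  # the negated restriction
        and listing["area"] == "centre"
    )
    house_in_suburbs = (
        listing["kind"] == "house"
        and listing["rooms"] >= 5
        and listing["area"] == "suburbs"
    )
    return listing["price"] <= 250_000 and (flat_in_centre or house_in_suburbs)


# Hypothetical listings
LISTINGS = [
    {"id": 1, "kind": "flat", "rooms": 4, "heating": "autonomous", "area": "centre", "price": 240_000},
    {"id": 2, "kind": "flat", "rooms": 5, "heating": "central", "area": "centre", "price": 200_000},
    {"id": 3, "kind": "house", "rooms": 6, "heating": "central", "area": "suburbs", "price": 250_000},
    {"id": 4, "kind": "house", "rooms": 5, "heating": "autonomous", "area": "suburbs", "price": 300_000},
]

print([listing["id"] for listing in LISTINGS if wanted(listing)])
```

A plain facet browser that only combines facets in conjunction cannot express the top-level “or” between two differently constrained branches, which is why this query is hard to achieve.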
44
Full Boolean queries | 2
¨ Foci (Ferré et al.): the set of sub-expressions in the semantic tree of the query
¨ A query is a pair $(q, \phi)$, where $q$ is an arbitrary combination of filters and $\phi$ is one of its foci
¤ The focus is used to select the subquery at which the new filter should be appended (or the transformation should be applied)
¤ …But also to “inspect” different points of view of information
¤ The main focus represents the “whole” query
Semantic faceted search 45
¨ We can filter items, but how can we filter facet values?
¤ E.g. paintings filtered by artists
¤ But how do we filter the Artists facet values by nationality, gender, age, etc.?
¨ Exploring contents at the level of sets using semantic relationships, e.g.
¤ The museums that have bronze Greek statues
¤ “Women portrayed by women”: paintings with subject:woman and artist:gender:female
¤ Schools attended by the daughters of U.S. Democratic presidents (http://www.freebase.com/labs/parallax/)
¤ Challenges: effective models and usable interfaces
¨ An example: Sewelis
Beyond binary classification | 1
¤ Classification (faceted or not) is usually binary:
¤ An item must be either relevant (1) or not relevant (0) to a certain category
¤ Problem: this is a quite arbitrary decision in many real domains
Beyond binary classification | 2
¨ How to classify a cathedral by architectural style?
¤ Built upon a 6th century building
¤ Mainly gothic
¤ 17th century (baroque) towers
¤ Rebuilt during neoclassicism
¤ Decorations added in 19th century
¤ Contains Roman forum marbles (donated by Pius IX)
¤ …
¨ Do we tag the cathedral with all or only some of these?
¨ A classification may be correct for one kind of user but ineffective for another
Beyond binary classification | 3
¨ The Mona Lisa is a well-known portrait of a woman, but…
¨ There is also a landscape in the background
¨ Do we classify it as “subject: woman” and “subject: Tuscan landscape” too?
Beyond binary classification | 4
¨ Onion is widely used in French cuisine
¨ How do we distinguish “onion-based” recipes from all the recipes that merely contain onion?
Beyond binary classification | 5
¨ A possible solution: associating weights to each triple item-facet-value
¤ A statement about the statement
¨ Values between 0 and 1 or other scales
¨ Queries could be specified in terms of facet-value pairs and ranges of weights
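A minimal sketch of weighted triples and a query with a weight range, using the onion example. The recipe names, the weights and the function `weighted_ext` are hypothetical, and triples are assumed to be stored only when the value applies at all.

```python
# Hypothetical weights for (item, facet, value) triples, on a 0..1 scale.
WEIGHTS = {
    ("onion_soup", "ingredient", "onion"): 0.9,
    ("beef_stew", "ingredient", "onion"): 0.2,
    ("beef_stew", "ingredient", "beef"): 0.8,
}


def weighted_ext(weights, facet, value, low=0.0, high=1.0):
    """Items tagged with facet:value whose weight falls in the closed range [low, high]."""
    return {
        item
        for (item, f, v), weight in weights.items()
        if f == facet and v == value and low <= weight <= high
    }


print(sorted(weighted_ext(WEIGHTS, "ingredient", "onion")))           # any recipe with onion
print(sorted(weighted_ext(WEIGHTS, "ingredient", "onion", low=0.5)))  # "onion-based" only
```

With the default range the query behaves like ordinary binary classification; narrowing the range separates items where the value is central from those where it is marginal.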
Beyond binary classification | 6
¨ Subjective weights
¤ Relevance: to what extent the item can be considered as belonging to a certain facet value
¤ Significance: the relative importance of the item according to a facet value
¨ Objective weights
¤ E.g. concentration or quantity (e.g. a thing is made of 10% material:bronze)
¤ E.g. for exploring venues: distance from points of interest
Beyond binary classification | 7
¨ Interaction (concepts)
Handling information overload 53
¨ Too many facets and facet values may generate information overload too!
¤ Possible solution: display only the most relevant facets (and facet values) for the user profile or the given context
¨ How to determine the most “interesting” facets in a given context?
¤ E.g. those with a less “uniform” distribution of values (more correlation)
¤ We will discuss this in an upcoming lecture… :-)
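One possible way to quantify how “uniform” a facet is within the current results is the normalized Shannon entropy of its value distribution. This is only one candidate measure, not necessarily the one discussed in the later lecture; `normalized_entropy`, `rank_facets` and the sample results are hypothetical.

```python
from collections import Counter
from math import log


def normalized_entropy(values):
    """1.0 for a perfectly uniform distribution, lower for skewed ones."""
    counts = Counter(values)
    if len(counts) < 2:
        return 0.0
    total = sum(counts.values())
    entropy = -sum((c / total) * log(c / total) for c in counts.values())
    return entropy / log(len(counts))


def rank_facets(results):
    """Facets ordered from least to most uniform; single-valued facets are skipped."""
    by_facet = {}
    for item in results:
        for facet, value in item.items():
            by_facet.setdefault(facet, []).append(value)
    scores = {
        facet: normalized_entropy(values)
        for facet, values in by_facet.items()
        if len(set(values)) > 1  # a facet with one value cannot refine the results
    }
    return sorted(scores, key=scores.get)


# Hypothetical current result set
RESULTS = [
    {"kind": "museum", "period": "Roman", "country": "Italy"},
    {"kind": "site", "period": "Roman", "country": "Italy"},
    {"kind": "museum", "period": "Roman", "country": "Italy"},
    {"kind": "site", "period": "Greek", "country": "Italy"},
]

print(rank_facets(RESULTS))
```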
Interested in MS Theses? Contact us! :-)
¨ Advisors: Prof. Di Blas, Prof. Paolini
¨ Both theoretical and development
¨ Fuzzy facets
¨ Semantic faceted search
¨ Advanced visualizations
¨ …
¨ Your own ideas! :-)
54
Are you still alive/awake? Thank you for your attention!
Any final questions? 55