Faceted Navigation (LACASIS Fall Workshop 2005)

Faceted Navigation. Presentation to LACASIS 2005 Fall Workshop, Search Forward: Emerging Internet Capabilities, November 18th 2005. Brad Allen, Founder and CTO, Siderean Software, Inc.

Page 1: Faceted Navigation (LACASIS Fall Workshop 2005)

Faceted Navigation

Presentation to LACASIS 2005 Fall Workshop

Search Forward: Emerging Internet Capabilities

November 18th 2005

Brad Allen, Founder and CTO

Siderean Software, Inc.

Page 2: Faceted Navigation (LACASIS Fall Workshop 2005)

Copyright © 2005 Siderean Software, Inc. All rights reserved. 2

Overview

Problem: Knowing what information is available

Solution: Faceted navigation

How navigation differs from search

Case studies and business applications

Lessons learned

Challenges

Demonstration

Discussion

Page 3: Faceted Navigation (LACASIS Fall Workshop 2005)

Problem: Knowing what information is available

Page 4: Faceted Navigation (LACASIS Fall Workshop 2005)

Faceted navigation: providing “a bird’s eye view” of available information

vs.

Page 5: Faceted Navigation (LACASIS Fall Workshop 2005)

How faceted navigation differs from search

Faceted navigation is a new type of software application

It goes beyond search and browsing by providing:

Scope: an overview of all available information

Context: a frame of reference to orient oneself in a dynamic information space

Repeatability: using scope and context as cues to lead users back to relevant information

Universality: a unified means of accessing information that is independent of type or source

Faceted navigation provides the insight of analytics with the ease of search

Page 6: Faceted Navigation (LACASIS Fall Workshop 2005)

Faceted navigation: origins

Library science

Ranganathan and the invention of faceted classification

Digital library efforts

Information retrieval

Parametric search

Query by example

Retrieval by reformulation

Rabbit, Argon

Systems have been moving from academic prototypes into commercial use over the last four years

Marti Hearst as a pioneer in this area

Siderean, Endeca, Vivisimo, FAST driving technology into enterprises

Page 7: Faceted Navigation (LACASIS Fall Workshop 2005)

Facets: the basis of navigation

Facets are metadata properties whose ranges form a near-orthogonal set of controlled vocabularies

Creator: Dickens, Charles

Subject: Arsenic, Antimony

Location: World > U.S. > California > Venice

Facets form a frame of reference for information overview, access and discovery

Other properties serve as landmarks and cues
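
A minimal sketch of how facet values can drive narrowing and counting, assuming each record is a plain dictionary from facet name to a list of terms. The records and the `select` and `facet_counts` helpers are hypothetical and only illustrate the idea; they are not taken from any Siderean product.

```python
from collections import Counter

# Hypothetical records: each maps a facet name to controlled-vocabulary terms.
# Hierarchical terms are written as " > " separated paths, as on the slide.
RECORDS = [
    {"Creator": ["Dickens, Charles"], "Subject": ["Arsenic"],
     "Location": ["World > U.S. > California > Venice"]},
    {"Creator": ["Dickens, Charles"], "Subject": ["Antimony"],
     "Location": ["World > U.S. > California"]},
    {"Creator": ["Doe, Jane"], "Subject": ["Arsenic"],
     "Location": ["World > U.K."]},
]

def select(records, facet, term):
    """Keep records whose value for `facet` is `term` or lies below it in a path."""
    return [
        r for r in records
        if any(v == term or v.startswith(term + " > ") for v in r.get(facet, []))
    ]

def facet_counts(records, facet):
    """Count how many records carry each term of one facet."""
    counts = Counter()
    for r in records:
        counts.update(set(r.get(facet, [])))
    return counts

# Narrow by one facet, then summarize the remaining records by another.
narrowed = select(RECORDS, "Location", "World > U.S.")
subject_overview = facet_counts(narrowed, "Subject")
```

Recomputing the counts after every selection is what gives the user an overview of what remains available, rather than a flat result list.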

Page 8: Faceted Navigation (LACASIS Fall Workshop 2005)

Building navigation applications

Metadata about data and content is aggregated…

Organized into a unified information architecture…

Analyzed to generate faceted views…

Providing faceted navigation across the data and content

Diagram labels: Term, Event, Person, Place, Text, View

Page 9: Faceted Navigation (LACASIS Fall Workshop 2005)

Case study: NASA JPL

Delivery to implementation in weeks using 3 internal resources

Brings together SharePoint, DocuShare, and structured trouble ticketing databases

Provides uniform access to all relevant information about previous projects in one place

Incorporates corporate vocabulary for concept-based search

Allows user community to contribute to organization of information

Page 10: Faceted Navigation (LACASIS Fall Workshop 2005)

Metadata in today’s enterprises

From thirty interviews conducted with Fortune 1000 organizations during Fall 2004

Use of metadata not yet widespread but emerging

Understanding varies widely across enterprises

Three basic approaches:

Top down

CEO says “We must be an information-driven company”

“Corporate controlled vocabulary that all divisions will use”

The effort is multi-year, ROI hard to track, and may not be implemented or adopted widely

Bottom up

Groups determine their vocabulary while describing their process

Light tagging of content when it is created or when the content is published to a portal

Give up

Assumption: too difficult to create metadata from existing content

But still feel that metadata would improve matters, particularly within business units

Page 11: Faceted Navigation (LACASIS Fall Workshop 2005)

Verticals for faceted navigation

Vertical: Federal Government

Strong identified application fit: Search, analyze and monitor complex, dynamic intelligence, project and problem information across organizations and projects (Columbia, Iraq, 9/11)

Existing metadata: Scads of all types, with unstructured information often preprocessed to boot

Adopting semantic technologies: Commitment to RDF/OWL as solution for cross-agency interoperability, actively using RSS

Business users: Intelligence analysts

Vertical: E-Commerce

Strong identified application fit: Search and browse catalogs of products and services, consumer-generated information

Existing metadata: Product catalogs, customer reviews, customer service data, advertising

Adopting semantic technologies: Pervasive adoption of XML standards for moving product and customer data across value chains

Business users: Consumers, marketers

Vertical: Financial Services

Strong identified application fit: Search, analyze and monitor dynamic financial and market data

Existing metadata: News feeds, financial DBs, market data

Adopting semantic technologies: Adoption of RSS for market news emerging

Business users: Traders, industry analysts, investment bankers

Page 12: Faceted Navigation (LACASIS Fall Workshop 2005)

Navigation requires metadata

Ontologies

Specifications of how to represent classes, instances and their properties

Sometimes called “vocabularies”

Controlled vocabularies

Terms for saying what something is about

Also called “taxonomies” and “thesauri”

Instances

Descriptions of resources

Application profiles

Specifications of which classes and properties are useful and how they are to be used in an application

Page 13: Faceted Navigation (LACASIS Fall Workshop 2005)

Lessons learned

Balanced incremental approach

Leverage metadata and indices at hand

Exploit statistics where desirable

But layer a framework on top to structure the statistics

Significant mileage from very simple frameworks

Page 14: Faceted Navigation (LACASIS Fall Workshop 2005)

The utility of RDF for commercial metadata

RDF can make metadata use easier and less costly

An open standard for metadata reduces cost and avoids technology and vendor lock-in

A “universal solvent” for data and content

A platform for reuse and sharing

Page 15: Faceted Navigation (LACASIS Fall Workshop 2005)

Building navigation systems with RDF

Define/reuse ontologies expressed in RDF(S)

Classes for defining instances and controlled vocabularies

Properties for facets and additional attributes

Import/transform instances into an RDF representation

Resources referred to via URIs

Content and controlled vocabularies

Write application profiles in terms of RDF
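
The steps on this slide can be sketched in plain Python, without an RDF library, by treating instance data as (subject, predicate, object) triples. The `example.org` URIs, the `PROFILE` mapping and the `build_facets` helper are assumptions made for illustration; only the Dublin Core namespace is a real vocabulary URI.

```python
from collections import defaultdict

DC = "http://purl.org/dc/elements/1.1/"  # Dublin Core element namespace

# Hypothetical instance data as (subject, predicate, object) triples.
# Resources are referred to via URIs; the example.org URIs are made up.
TRIPLES = [
    ("http://example.org/doc/1", DC + "creator", "Dickens, Charles"),
    ("http://example.org/doc/1", DC + "subject", "Arsenic"),
    ("http://example.org/doc/1", DC + "title", "A made-up title"),
    ("http://example.org/doc/2", DC + "creator", "Dickens, Charles"),
    ("http://example.org/doc/2", DC + "subject", "Antimony"),
]

# Hypothetical application profile: which properties act as facets,
# and the label each facet is shown under.
PROFILE = {DC + "creator": "Creator", DC + "subject": "Subject"}

def build_facets(triples, profile):
    """Group resources by facet and value: {label: {value: {resource URIs}}}."""
    facets = defaultdict(lambda: defaultdict(set))
    for subject, predicate, obj in triples:
        label = profile.get(predicate)
        if label is not None:  # properties outside the profile are not facets
            facets[label][obj].add(subject)
    return facets

facets = build_facets(TRIPLES, PROFILE)
```

Keeping the profile separate from the data means the same triples can back different navigation applications by swapping the profile alone.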

Page 16: Faceted Navigation (LACASIS Fall Workshop 2005)

Lessons: ontologies

Don’t do: assume you have to build elaborate OWL ontologies

Don’t have to boil the ocean to get the benefits

OWL DL, OWL Full are overkill for this class of application

Side issue: description logic for navigation is not addressed adequately by OWL

Class/subclass versus arbitrary hierarchical relations

Do: Tiny Ontologies All Stitched Together (TOAST)

RDF Schema with a smattering of RDF/OWL properties (e.g., owl:inverseOf)

Start with DC + SKOS + FOAF

Page 17: Faceted Navigation (LACASIS Fall Workshop 2005)

Lessons: controlled vocabularies

Don’t do: huge monolithic taxonomies

Unless they are ready at hand and can be reused largely without modification

Do: bite-sized controlled vocabularies that exploit faceted approaches

4 facets x 10 terms per facet versus 10^4 terms in a single taxonomy

Start with flat term lists

Add BT/NT/RT relationships over time
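
The arithmetic behind the facet comparison can be checked with a short sketch, reading the slide as 4 facets of 10 terms each against 10^4 terms in one taxonomy. The facet and term names below are placeholders, and the assumption is that a single taxonomy would need one entry per combination of facet values.

```python
from itertools import product

# Placeholder vocabulary: 4 facets with 10 flat terms each.
facets = {
    f"facet{i}": [f"facet{i}-term{j}" for j in range(10)]
    for i in range(4)
}

# Terms someone has to create and maintain under the faceted approach.
terms_to_maintain = sum(len(terms) for terms in facets.values())

# Distinct combinations those terms can express, one term per facet.
combinations = sum(1 for _ in product(*facets.values()))

print(terms_to_maintain, combinations)
```

Under these assumptions, 40 maintained terms cover the same 10,000 combinations that a single pre-combined taxonomy would have to enumerate.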

Page 18: Faceted Navigation (LACASIS Fall Workshop 2005)

Lessons: instances

Manual creation

Don’t do: exhaustive author creation of metadata

Do: community annotation and tagging

(Semi-)automated creation

Don’t do: assume elaborate information extraction based on NLP, subject tagging and categorization

Do: quick and dirty named entity extraction, or better yet, stick to readily available asset and relational metadata (date, creator, document type/genre)

Much of the benefit at a fraction of the effort

Page 19: Faceted Navigation (LACASIS Fall Workshop 2005)

Lessons: application profiles

Metadata is increasingly pervasive

The way to leverage existing information infrastructure

Exploit the “on-demand” information integration feature of RDF

DB + XML XSLT RDF(S): a simple, sloppy framework

Part of Adam Bosworth’s “Web of data”

Page 20: Faceted Navigation (LACASIS Fall Workshop 2005)

The big question: statistics vs. knowledge

Statistics can’t deliver everything

Alan Kay’s puppy analogy

Vitanyi’s work on “Google learning”

On the other hand, knowledge is dearly won

CYC

Need a balance that enables adoption without losing the benefits

Lessons from

Statistics vs. knowledge in NLP

Expert systems

Page 21: Faceted Navigation (LACASIS Fall Workshop 2005)

Future directions

User tagging + RDF: the killer Semantic Web application?

The rehabilitation of metadata in the social software community

The re-emergence of RSS 1.0

“Folksonomy”-driven social search

Del.icio.us, Flickr, CiteULike

Towards social navigation: fac.etio.us

Page 22: Faceted Navigation (LACASIS Fall Workshop 2005)

fac.etio.us

Aggregated feeds from del.icio.us social bookmarking site

10^5 Web pages

10^4 tags

10^4 contributors

10^4 originating sites

Superior user experience with 10 minutes’ effort

“In 3 clicks, I drilled down through 9700+ sites, to a more specific set of 98 things, down to one I found useful.”

Tagging the tags to add semantics

Bootstrapping folksonomies into taxonomies without impacting user creation of metadata

Merging anarchy with governance
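
One way to picture "tagging the tags" is a small curated mapping from free-form tags to facets, kept apart from the bookmarks so users go on tagging as before. This is a sketch under that assumption, not a description of how fac.etio.us was built; the bookmarks, the `TAG_FACETS` mapping and both helper functions are hypothetical.

```python
from collections import defaultdict

# Hypothetical bookmarks: URL -> free-form user tags (the folksonomy).
BOOKMARKS = {
    "http://example.org/a": {"python", "tutorial"},
    "http://example.org/b": {"rdf", "tutorial"},
    "http://example.org/c": {"python", "reference"},
}

# "Tagging the tags": a made-up mapping that assigns each tag to a facet.
TAG_FACETS = {
    "python": "Topic",
    "rdf": "Topic",
    "tutorial": "Genre",
    "reference": "Genre",
}

def facet_view(bookmarks, tag_facets):
    """Return {facet: {tag: set of URLs}}; unmapped tags go under 'Unclassified'."""
    view = defaultdict(lambda: defaultdict(set))
    for url, tags in bookmarks.items():
        for tag in tags:
            facet = tag_facets.get(tag, "Unclassified")
            view[facet][tag].add(url)
    return view

def drill_down(bookmarks, *tags):
    """Return the URLs that carry every one of the given tags, sorted."""
    wanted = set(tags)
    return sorted(url for url, have in bookmarks.items() if wanted <= have)

view = facet_view(BOOKMARKS, TAG_FACETS)
hits = drill_down(BOOKMARKS, "python", "tutorial")
```

Because the mapping lives outside the bookmarks, it can be refined over time without touching what contributors have already tagged.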

Page 23: Faceted Navigation (LACASIS Fall Workshop 2005)

Challenges

Scale

Must be commensurate with expectations and requirements from traditional web and enterprise search

Algorithms

Many alternatives still being explored

Usability

Lots of work to be done to validate benefits

Security, trust and provenance

Just beginning to understand

Page 24: Faceted Navigation (LACASIS Fall Workshop 2005)

Challenges: scale

Navigation has to live up to the scaling expectations set by search, while it is doing a lot more work

Number of objects, feeds: 10^6 to 10^9

Ingest rates: ~10^3 to 10^4 triples/sec, how many per resource?

Latency: < 0.5 sec user time regardless of application

Implementations exploit RAM to deliver low latency, but this is an impediment to terabyte-scale bodies of information
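
A back-of-the-envelope sketch of what these figures imply for load time, reading them as 10^6 to 10^9 objects and 10^3 to 10^4 triples/sec. The slide leaves the number of triples per resource open, so the value of 10 used here is purely an assumption, and `ingest_seconds` is a hypothetical helper.

```python
def ingest_seconds(resources, triples_per_resource, triples_per_second):
    """Seconds needed to load every triple at a steady ingest rate."""
    return resources * triples_per_resource / triples_per_second

# Assumption for illustration only: 10 triples per resource.
TRIPLES_PER_RESOURCE = 10

# Small corpus at the fast rate, large corpus at the slow rate.
best = ingest_seconds(10**6, TRIPLES_PER_RESOURCE, 10**4)
worst = ingest_seconds(10**9, TRIPLES_PER_RESOURCE, 10**3)

print(f"best case:  {best / 60:.1f} minutes")
print(f"worst case: {worst / 86400:.1f} days")
```

With that assumption the range runs from under twenty minutes to several months, which is why the triples-per-resource question matters as much as the raw ingest rate.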

Page 25: Faceted Navigation (LACASIS Fall Workshop 2005)

Challenges: algorithms

Federated services vs. centralized servers

Relationship to relevance ranking

Support for aggregate and text search operators in RDF query

Integration of multimedia retrieval algorithms as equal citizens to free text retrieval

Page 26: Faceted Navigation (LACASIS Fall Workshop 2005)

Challenges: usability

Navigation interfaces in their infancy

Tagging interfaces even more so

Principled analyses of precision and recall have yet to be done

Visualization beyond “sticks and ovals” is begging to be integrated

Navigate to a small result set, then visualize

Page 27: Faceted Navigation (LACASIS Fall Workshop 2005)

Summary

Faceted navigation is a new software product category that addresses the pain associated today with finding and discovering actionable information

The use of Semantic Web standards, principally RDF, enables the development of faceted navigation applications

It is “early days” for faceted navigation applications and challenges remain, but we believe the potential is significant

Page 28: Faceted Navigation (LACASIS Fall Workshop 2005)

Siderean Software, Inc.

390 North Sepulveda Blvd., Suite 2070

El Segundo, CA 90245-4475 USA

+1 310 647-4266

http://www.siderean.com

[email protected]