Taxonomy Strategies LLC
May 16, 2005 Copyright 2005 Taxonomy Strategies LLC. All rights reserved.
FAQs About Taxonomies & Metadata
Joseph A. Busch & Ron Daniel, Jr.
Taxonomy Strategies LLC: The business of organized information
Agenda
9:00 Who are we?
9:10 What are taxonomies & metadata?
9:30 What kinds of taxonomies are there, and what do I need?
9:40 How do I get a good taxonomy?
10:05 How do I associate the taxonomy with content?
10:30 Break
10:45 What do taxonomies and metadata have to do with search?
11:15 How can I sell my management on a taxonomy project?
11:45 Any more questions?
12:00 Adjourn
Who is Joseph Busch?
Over 25 years in the business of organized information
  - Founder, Taxonomy Strategies
  - Director, Solutions Architecture, Interwoven
  - VP, Infoware, Metacode Technologies
  - Program Manager, Getty Foundation
  - Manager, Pricewaterhouse
Metadata and taxonomies community leadership
  - President, American Society for Information Science & Technology
  - Director, Dublin Core Metadata Initiative
  - Adviser, National Research Council Computer Science and Telecommunications Board
  - Reviewer, National Science Foundation Division of Information and Intelligent Systems
  - Founder, Networked Knowledge Organization Systems/Services
Who is Ron Daniel, Jr.?
Over 15 years in the business of metadata & automatic classification
  - Principal, Taxonomy Strategies
  - Standards Architect, Interwoven
  - Senior Information Scientist, Metacode Technologies
  - Technical Staff Member, Los Alamos National Laboratory
Metadata and taxonomies community leadership
  - Chair, PRISM (Publishers Requirements for Industry Standard Metadata) working group
  - Acting Chair, XML Linking working group
  - Member, RDF working groups
  - Co-editor, PRISM, XPointer, 3 IETF RFCs, and Dublin Core 1 & 2 reports
Who has Taxonomy Strategies worked with?
Government: Commodity Futures Trading Commission, Defense Intelligence Agency, ERIC, Federal Aviation Administration, Federal Reserve Bank of Atlanta, Forest Service, GSA Office of Citizen Services (www.firstgov.gov), Head Start, Infocomm Development Authority of Singapore, NASA (nasataxonomy.jpl.nasa.gov), Small Business Administration, Social Security Administration, USDA Economic Research Service, USDA e-Government Program (www.usda.gov)
International orgs & Non-profits: CEN, IDEAlliance, IMF, OCLC
Commercial: Allstate Insurance, Blue Shield of California, Debevoise & Plimpton, Halliburton, Hewlett Packard, Motorola, PeopleSoft, PricewaterhouseCoopers, Siderean Software, Sprint, Time Inc.
Commercial subcontracts:
  - Agency.com – Top financial services
  - Critical Mass – Fortune 50 retailers
  - Deloitte Consulting – Big credit card
  - Gistics/OTB – Direct selling giant
What we do
Organize Stuff
Who are you? What do you want out of today?
Government / NGO / SME / Global 2000?
IT / Library & IM / Public Affairs / Product Management / Engineering / HR & Finance / Other?
Webmaster / Technical / Researcher / Editorial / Supervisory / Executive?
Competing session – Search & Content Management: Putting the Puzzle Pieces Together
What brought you HERE instead of THERE?
What is metadata? Different definitions
Library & Information Science
  - Author/Title/Subject
  - Controlled Vocabularies for Subject Codes (e.g. Dewey)
  - Authority Files for Author Names
Database
  - Tables/Columns/Datatypes/Relationships
  - References for some values
What is metadata? Another view of Dublin Core
Asset metadata – Who, Where & When: Title, Creator, Publisher, Contributor, Date, Type, Format, Identifier, Source, Language
Subject metadata – What & Why: Subject, Description, Coverage
Relational metadata – Links between and to: Relation
Use metadata – How can it be used: Rights & Permissions
[Figure axis labels: Functionality; Difficult to Generate]
Better resource description = Better navigation & discovery
Are there extensions to the Dublin Core?
Elements: 1. Identifier, 2. Title, 3. Creator, 4. Contributor, 5. Publisher, 6. Subject, 7. Description, 8. Coverage, 9. Format, 10. Type, 11. Date, 12. Relation, 13. Source, 14. Rights, 15. Language
Refinements: Abstract, Access rights, Alternative, Audience, Available, Bibliographic citation, Conforms to, Created, Date accepted, Date copyrighted, Date submitted, Education level, Extent, Has format, Has part, Has version, Is format of, Is part of, Is referenced by, Is replaced by, Is required by, Issued, Is version of, License, Mediator, Medium, Modified, Provenance, References, Replaces, Requires, Rights holder, Spatial, Table of contents, Temporal, Valid
Encodings: Box, DCMIType, DDC, IMT, ISO3166, ISO639-2, LCC, LCSH, MESH, Period, Point, RFC1766, RFC3066, TGN, UDC, URI, W3CDTF
Types: Collection, Dataset, Event, Image, Interactive Resource, Moving Image, Physical Object, Service, Software, Sound, Still Image, Text
What is metadata: A scheme for recipes
Element | Data Type | Length | Source | Purpose
Asset Metadata
Unique ID | Integer | Fixed | System supplied | Basic accountability
Recipe Title | String | Variable | Licensed Content | Text search & results display
Recipe summary | String | Variable | Licensed Content | Content
Main Ingredients | List | Variable | Main Ingredients vocabulary | Key index to retrieve & aggregate recipes, & generate shopping list
Subject Metadata
Meal Types | List | Variable | Meal Types vocab | Browse or group recipes & filter search results
Cuisines | List | Variable | Cuisines |
Courses | List | Variable | Courses vocab |
Cooking Method | Flag | Fixed | Cooking vocab |
Link Metadata
Recipe Image | Pointer | Variable | Product Group | Merchandise products
Use Metadata
Rating | String | Variable | Licensed Content | Filter, rank, & evaluate recipes
Release Date | Date | Fixed | Product Group | Publish & feature new recipes
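The recipe scheme above can be read as a record type whose list-valued fields draw on controlled vocabularies. The following is a minimal sketch, not the presenters' implementation: the class name, field names, and the sample vocabulary values are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional

# Hypothetical controlled vocabularies; real values would come from the
# "Meal Types" and "Cuisines" vocabularies named in the scheme.
MEAL_TYPES = {"Breakfast", "Lunch", "Dinner"}
CUISINES = {"Italian", "Mexican", "Thai"}


@dataclass
class Recipe:
    # Asset metadata
    unique_id: int
    title: str
    summary: str
    main_ingredients: List[str]
    # Subject metadata
    meal_types: List[str] = field(default_factory=list)
    cuisines: List[str] = field(default_factory=list)
    # Link metadata
    image_url: Optional[str] = None
    # Use metadata
    rating: Optional[str] = None
    release_date: Optional[date] = None

    def invalid_terms(self) -> List[str]:
        """Return subject values that are not in the controlled vocabularies."""
        bad = [t for t in self.meal_types if t not in MEAL_TYPES]
        bad += [c for c in self.cuisines if c not in CUISINES]
        return bad


recipe = Recipe(
    unique_id=1,
    title="Pad Thai",
    summary="Stir-fried rice noodles.",
    main_ingredients=["rice noodles", "shrimp"],
    meal_types=["Dinner"],
    cuisines=["Thai", "Fusion"],
)
print(recipe.invalid_terms())
```

Checking values against the vocabulary at entry time is one way to keep the subject fields usable for browsing and filtering.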
What is a taxonomy? Systematics view
Biological taxonomy places an organism in one and only one place.
Linnaeus: Kingdom Animalia > Phylum Chordata > Class Mammalia > Order Carnivora > Family Canidae > Genus Canis > Species C. familiaris
Pragmatic view
But most of the time things belong to more than one category.
Example categories: Pets, Dogs, Farm Animals, Mammals
Are there other organizational schemes?
Type | Remarks
Synonym Ring | Connects a series of terms together. Treats them as equivalent for search purposes.
Authority File | Used to control variant names with a preferred term. Typically used for names of countries, individuals, organizations.
Classification Scheme | An arrangement of knowledge. Does not follow taxonomy rules. Usually enumerated, e.g., LC or Dewey.
Thesaurus | Expresses semantic relationships of: Hierarchy (broader & narrower terms), Equivalence (synonyms), Associative (related terms).
Ontology | Resembles faceted taxonomy but uses richer semantic relationships among terms and attributes and strict specification rules.
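As an illustration of the simplest scheme in the list above, a synonym ring can be sketched as a set of terms that a search front end treats as interchangeable. The ring contents and the function name below are hypothetical.

```python
# Hypothetical synonym rings; each ring is a set of equivalent terms.
SYNONYM_RINGS = [
    {"car", "automobile", "auto"},
    {"physician", "doctor"},
]


def expand_query(term: str) -> set:
    """Return every term in the ring containing `term`, or just the term."""
    term = term.lower()
    for ring in SYNONYM_RINGS:
        if term in ring:
            return set(ring)
    return {term}


print(sorted(expand_query("Doctor")))
print(sorted(expand_query("taxonomy")))
```

An authority file differs in that one member of each set would be marked as the preferred term; a ring has no preferred member.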
Another point of view ….
Source: Amy Warner. Metadata and Taxonomies for a More Flexible Information Architecture (http://www.lexonomy.com/presentations/metadataAndTaxonomies.ppt)
[Figure: controlled vocabularies arranged from Simple to Complex. Vocabularies: Synonym Rings, Authority Files, Classification Schemes, Thesauri. Relationships: Equivalence, Hierarchical, Associative. Taxonomies and Ontologies are also placed on the diagram.]
Taxonomic metadata – e-Forms example
Metadata Elements: Agency, Form Type, Topic, Audience, Industry Impact, Jurisdiction, BRM Impact, Keyword
Taxonomies:
  - Agency: 0001 Legislative; 1000 Judicial; 1100 Executive Office of Pres; 0003 Exec Depts (1200 Agriculture, 1300 Commerce, 9700 Defense, 9100 Education, 8900 Energy, 7500 HHS, 7000 DHS, 8600 HUD, 1400 Interior, 1500 Justice, 1600 Labor, 1900 State, 6900 Transport, 2000 Treasury, 3600 Veterans); Ind Agencies; Intl Orgs
  - Form Type: Application, Approval, Claim, Information request, Information submission, Instructions, Legal filing, Payment, Procurement, Renewal, Reservation, Service request, Test, Other input, Other transaction
  - Topic: Agriculture & food, Commerce, Communications, Education, Energy, Env pro, Foreign rels, Govt, Health & safety, Housing & comm dev, Labor, Law, Named grps, National def, Nat resources, Recreation, Sci & tech, Social pgms, Transport
  - Audience: All, General, Citizen, Business, Govt, Employee, Native American, Non-resident, Tourist, Special group
  - Industry Impact: 00 Generic, 11 Agriculture, 21 Mining, 22 Utilities, 23 Construct, 31-33 Manuf, 42 Wholesale, 44-45 Retail, 48-49 Trans, 51 Info, 52 Finance, 54 Profession, 55 Mgmt, 56 Support, 61 Education, 62 Health Care, 71 Arts, 72 Hospitality, 81 Other Services, 92 Public Admin
  - Jurisdiction: Federal, State +, Local +, Other +
  - BRM Impact: Citizen Srvcs, Social Srvs, Defense, Disasters, Econ Dev, Education, Energy, Env Mgmt, Law Enf, Judicial, Correctional, Health, Security, Income Sec, Intelligence, Intl Affairs, Nat Resour, Transport, Workforce, Science, Delivery, Support, Management
Why use faceted taxonomies?
4 independent categories of 10 nodes each have the same discriminatory power as one hierarchy of 10,000 nodes (10^4)
  - Easier to maintain
  - Can be easier to navigate
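The arithmetic behind the faceted claim can be checked with a short sketch. The facet names and values are placeholders, not a recommended set.

```python
from itertools import product

# Four hypothetical facets with 10 placeholder values each.
facets = {
    name: [f"{name}-{i}" for i in range(10)]
    for name in ("agency", "form_type", "topic", "audience")
}

# Terms someone has to maintain: 10 + 10 + 10 + 10.
nodes_to_maintain = sum(len(values) for values in facets.values())

# Distinct combinations a piece of content can be filed under: 10 ** 4.
combinations = sum(1 for _ in product(*facets.values()))

print(nodes_to_maintain)
print(combinations)
```

Forty maintained terms distinguish as many cases as a single enumerated hierarchy with ten thousand leaf nodes.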
Agenda
9:00 Who are we?
9:10 What are taxonomies & metadata?
9:30 What kinds of taxonomies are there, and what do I need?
9:40 How do I get a good taxonomy?
  - Can I get a taxonomy off-the-shelf or create one with software?
  - How do you know it is good?
  - How do you build or modify to make it good?
10:05 How do I associate the taxonomy with content?
10:30 Break
10:45 What do taxonomies and metadata have to do with search?
11:15 How can I sell my management on a taxonomy project?
11:45 Any more questions?
12:00 Adjourn
How do I get a good Taxonomy? – Seven practical rules
1) Incremental, extensible process that identifies and enables users, and engages stakeholders.
2) Quick implementation that provides measurable results as quickly as possible.
3) Not monolithic: has separately maintainable facets.
4) Re-uses existing IP as much as possible.
5) A means to an end, and not the end in itself.
6) Not perfect, but it does the job it is supposed to do, such as improving search and navigation.
7) Improved over time, and maintained.
Can I get a taxonomy off the shelf?
Sure: www.taxonomywarehouse.com
  - There are usually license fees, but they will be less than the effort to develop an equivalent taxonomy.
  - The voice of experience says these will usually not be what you want.
We recommend:
  - Adopt a faceted approach.
  - Reuse existing (esp. internal) vocabularies for as many of the facets as reasonable.
  - Plan on doing full-custom “Content Type” and “Subject” taxonomies.
Sources for 8 common taxonomies
Taxonomy | Definition | Potential Sources
Organization | Organizational structure. | FIPS 95-2, U.S. Government Manual, Your organizational structure, etc.
Content Type | Structured list of the various types of content being managed or used. | DC Types, AGLS Document Type, AAT Information Forms, Your records management policy, etc.
Industry | Broad market categories such as lines of business, life events, or industry codes. | FIPS 66, SIC, NAICS, Your market segments, etc.
Location | Place of operations or constituencies. | FIPS 5-2, FIPS 55-3, ISO 3166, UN Statistics Div, US Postal Service, Your sales regions, etc.
Function | Functions and processes performed to accomplish mission and goals. | FEA Business Reference Model, Enterprise Ontology, AAT Functions, Your business functions, etc.
Topic | Business topics relevant to your mission & goals. | Federal Register Thesaurus, NAL Agricultural Thesaurus, LCSH, Your research areas, etc.
Audience | Subset of constituents to whom a piece of content is directed or intended to be used. | GEM, ERIC Thesaurus, IEEE LOM, Your psychographics or personas, etc.
Products & Services | Names of products/programs & services. | ERP system, Your products and services, etc.
What about automatically created taxonomies?
Documents can be ‘clustered’ based on similarities and differences.
Problems:
  - Typically only a single hierarchy
  - No overall plan
  - Results hard for people to navigate
[Figure caption: What does “North” mean on this map?]
What should I expect from automatic taxonomy construction software?
Software can scan large quantities of content and extract statistically significant words and phrases.
  - Example: Archive of 10 publications was analyzed for topics significant to ‘copyright’.
Software does a poor job of
  - de-duplication
  - turning those significant words and phrases into a larger structure
  - discriminating between gold and garbage
Software is good for
  - getting an understanding of the key phrases in a large amount of content
  - providing test cases for evaluating a taxonomy
Source: Sample data courtesy of Randy Marcinko and nStein.
How can I test a Taxonomy? – Qualitative methods
Method | Process | Validation
Walk-throughs | Show and explain | Approach; Consistency to rules; Appropriateness to task
Usability Testing | Contextual analysis (card sorting, scenario testing, etc.) | Tasks are completed successfully; Time to complete task is reduced
User Satisfaction | Survey | Reaction to new interface; Reaction to search results
Tagging samples | Tag sample content with taxonomy | Content ‘fit’; Fills out content inventory; Training materials for people & algorithms; Basis for quantitative methods
Quantitative Method – How evenly does it divide the content?
Background:
  - Documents do not distribute uniformly across categories
  - Zipf (1/x) distribution is expected behavior
  - 80/20 rule in action (actually 70/20 rule)
Methodology:
  - Part of alpha test of ‘content type’ for corporate intranet
  - 115 URLs selected at random from search index were manually categorized
  - Inaccessible files and ‘junk’ were removed
Results:
  - Results were slightly more uniform than the Zipf distribution, which is better than expected
[Figure: Measured and Expected Distribution of Content Types in an Intranet. X axis: Content Type (People, Groups & Places; News & Events; Manuals & Learning Materials; Operations & Internal Communications; Marketing & Sales; Regulations, Policies, Procedures; Papers & Presentations; Other & Unclassified; Programs, Proposals, Plans & Schedules). Y axis: # Documents (0 to 25). Series: Measured, Expected.]
[Figure: Measured and Expected Distribution of Top 10 Content Types in Library of Congress Database. X axis: Top 10 Content Types (labels recovered: Congresses, Biography, Periodicals, Maps, Fiction, Exhibitions, Juvenile literature, Bibliography, Statistics). Y axis: Number of Records (0 to 350,000).]
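The comparison against a Zipf (1/x) distribution can be sketched as follows. The measured counts are hypothetical stand-ins, since the slide's actual counts are only available as a chart; the function name is also an assumption.

```python
def zipf_expected(total: int, categories: int) -> list:
    """Expected count per rank if counts are proportional to 1/rank."""
    harmonic = sum(1.0 / r for r in range(1, categories + 1))
    return [total / (r * harmonic) for r in range(1, categories + 1)]


# Hypothetical document counts per content type, sorted largest first.
measured = [20, 16, 13, 11, 9, 8, 7, 6, 5]
expected = zipf_expected(sum(measured), len(measured))

for rank, (m, e) in enumerate(zip(measured, expected), start=1):
    print(f"rank {rank}: measured {m:3d}, expected {e:5.1f}")
```

If the measured top category is smaller than the Zipf expectation and the tail categories are larger, the taxonomy divides content more evenly than the baseline.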
Quantitative Method – How intuitive (repeatable) are the categorizations?
Methodology: Closed Card Sort
  - For alpha test of a grocery site
  - 15 testers put each of 100 best-selling products into one of 10 pre-defined categories
  - Categories where fewer than 14 of 15 testers put product into same category were flagged
Results:
% of Testers | Cumulative % of Products
15/15 | 54%
14/15 | 70%
13/15 | 77%
12/15 | 83%
11/15 | 85%
<11/15 | 100%
In the trade, “Corn Tortillas” are a Dairy item!
“Cocoa Drinks – Powder” is best categorized in both “Beverages” and “Grocery”.
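Scoring a closed card sort comes down to counting, per product, how many testers chose the most popular category. A minimal sketch follows; the placements are invented for illustration and are not the study's data.

```python
from collections import Counter


def agreement(placements):
    """Return (most common category, number of testers who chose it)."""
    category, votes = Counter(placements).most_common(1)[0]
    return category, votes


# Hypothetical results: product -> category chosen by each of 15 testers.
results = {
    "Whole Milk": ["Dairy"] * 15,
    "Corn Tortillas": ["Bakery"] * 9 + ["Dairy"] * 4 + ["Grocery"] * 2,
    "Cocoa Drinks - Powder": ["Beverages"] * 8 + ["Grocery"] * 7,
}

THRESHOLD = 14  # flag products where fewer than 14 of 15 testers agree
flagged = {
    product: agreement(votes)
    for product, votes in results.items()
    if agreement(votes)[1] < THRESHOLD
}
print(flagged)
```

Flagged products are candidates for renaming a category, moving the product, or listing it in more than one place.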
Quantitative Method – How does taxonomy “shape” match that of content?
Term Group | % Terms | % Docs
Administrators | 7.8 | 15.8
Community Groups | 2.8 | 1.8
Counselors | 3.4 | 1.4
Federal Funds Recipients and Applicants | 9.5 | 34.4
Librarians | 2.8 | 1.1
News Media | 0.6 | 3.1
Other | 7.3 | 2.0
Parents and Families | 2.8 | 6.0
Policymakers | 4.5 | 11.5
Researchers | 2.2 | 3.6
School Support Staff | 2.2 | 0.2
Student Financial Aid Providers | 1.7 | 0.7
Students | 27.4 | 7.0
Teachers | 25.1 | 11.4
Source: Courtesy Keith Stubbs, U.S. Dept. of Ed.
Background:
  - Hierarchical taxonomies allow comparison of “fit” between content and taxonomy areas
Methodology:
  - 25,380 resources tagged with taxonomy of 179 terms. (Avg. of 2 terms per resource)
  - Counts of terms and documents summed within taxonomy hierarchy
Results:
  - Roughly Zipf distributed (top 20 terms: 79%; top 30 terms: 87%)
  - Mismatches between term% and document% flagged
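One way to flag mismatches between term% and document% is to compare the two shares as a ratio. The sketch below uses the percentages from the table; the factor-of-three threshold is an assumption chosen for illustration, not the rule used in the study.

```python
# (term group, % of taxonomy terms, % of tagged documents) from the table.
GROUPS = [
    ("Administrators", 7.8, 15.8),
    ("Community Groups", 2.8, 1.8),
    ("Counselors", 3.4, 1.4),
    ("Federal Funds Recipients and Applicants", 9.5, 34.4),
    ("Librarians", 2.8, 1.1),
    ("News Media", 0.6, 3.1),
    ("Other", 7.3, 2.0),
    ("Parents and Families", 2.8, 6.0),
    ("Policymakers", 4.5, 11.5),
    ("Researchers", 2.2, 3.6),
    ("School Support Staff", 2.2, 0.2),
    ("Student Financial Aid Providers", 1.7, 0.7),
    ("Students", 27.4, 7.0),
    ("Teachers", 25.1, 11.4),
]

FACTOR = 3.0  # assumed threshold: flag when shares differ by more than 3x


def mismatches(groups, factor=FACTOR):
    """Return names whose document share is far above or below term share."""
    flagged = []
    for name, pct_terms, pct_docs in groups:
        ratio = pct_docs / pct_terms
        if ratio > factor or ratio < 1.0 / factor:
            flagged.append(name)
    return flagged


print(mismatches(GROUPS))
```

A high ratio suggests an area with too few terms for its content; a low ratio suggests terms that are rarely used.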
How do large corporations typically extend the Dublin Core?
[Bar chart, values in label order]
  - Doc Types: 100%
  - Products & Services: 86%
  - Roles: 57%
Base: 20 corporate information managers
Source: CEN/ISSS Workshop on Dublin Core. Guidance information for the deployment of Dublin Core metadata in Corporate Environments (http://www.cenorm.be/cenorm/businessdomains/businessdomains/isss/cwa/cwa15247.asp)
Agenda
9:00 Who are we?
9:10 What are taxonomies & metadata?
9:30 What kinds of taxonomies are there, and what do I need?
9:40 How do I get a good taxonomy?
10:05 How do I associate the taxonomy with content?
  - How are we going to populate metadata elements with complete and consistent values?
  - What can we expect to get from automatic classifiers?
  - What kinds of tools do people use?
  - How do different automatic classification tools compare?
  - What else should I keep in mind?
10:30 Break
10:45 What do taxonomies and metadata have to do with search?
11:15 How can I sell my management on a taxonomy project?
11:45 Any more questions?
12:00 Adjourn
General remarks on tagging
Province of authors (SMEs) or editors?
Taxonomy often highly granular to meet task and re-use needs.
Vocabulary dependent on originating department.
The more tags there are (and the more values for each tag), the more hooks to the content.
If there are too many, authors will resist and use “general” tags (if available)
Automatic classification tools exist, and are valuable, but results are not as good as humans can do. “Semi-automated” is best. Degree of human involvement is a cost/benefit tradeoff.
What methods do large companies use to create & maintain metadata?
[Bar chart, values in label order]
  - Forms: 71%
  - Distributed production: 57%
  - Centralized production: 43%
  - Not automated: 43%
Base: 20 corporate information managers
Source: CEN/ISSS Workshop on Dublin Core. Guidance information for the deployment of Dublin Core metadata in Corporate Environments (http://www.cenorm.be/cenorm/businessdomains/businessdomains/isss/cwa/cwa15247.asp)
How do tools compare? Analyst viewpoint
[Figure: tools plotted on two axes, Accuracy Level (high to low) and Content Volumes (low to high).]
What accuracy should we expect from an automatic classifier?
Classification performance is measured by “inter-cataloger agreement”
  - Trained librarians agree less than 80% of the time
  - Errors are subtle differences in judgment, or big goofs
Automatic classification struggles to match human performance
  - Exception: Entity recognition can exceed human performance
Classifier performance limited by algorithms available, which is limited by development effort
Very wide variance in one vendor’s performance depending on who does the implementation, and how much time they have to do it
1) 80/20 tradeoff where 20% of effort gives 80% of performance.
2) Smart implementation of inexpensive tools will outperform naive implementations of world-class tools.
[Figure: Accuracy plotted against Development Effort / Licensing Expense, with Regexps and Trained Librarians marked and the potential performance gain indicated.]
How do tools compare? Pragmatic viewpoint
[Figure: tools plotted on two axes, Accuracy Level (high to low) and Content Volumes (low to high).]
What kind of metadata creation and maintenance process is needed?
Even ‘purely’ automatic meta-tagging systems need a manual error correction procedure.
  - Should add a QA sampling mechanism
Tagging models:
  - Author-generated
  - Central librarians
  - Hybrid – central auto-tagging service, distributed manual review and correction
[Figure: Sample of ‘author-generated’ metadata workflow. Elements shown include: Compose in Template; Submit to CMS; Automatically fill-in metadata (Tagging Tool); Approve/Edit metadata; Review content (Analyst, Editor) with a Problem? Y/N decision; Copy Edit content (Copywriter) with a Problem? Y/N decision; Sys Admin; outputs Hard Copy and Web site.]
Tagging tool example: Interwoven MetaTagger
Manual form fill-in w/ check boxes, pull-down lists, etc.
Auto keyword & summarization
Tagging tool example: Interwoven MetaTagger
Auto-categorization
Parse & lookup (recognize names)
Rules & pattern matching
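The "parse & lookup" and "rules & pattern matching" approaches named above can be sketched generically. This is not Interwoven MetaTagger's API; the rules, the organization list, and the function name are hypothetical.

```python
import re

# Hypothetical rules: category -> regular expression that signals it.
RULES = {
    "Payment": re.compile(r"\b(invoice|payment|remit)\w*", re.IGNORECASE),
    "Renewal": re.compile(r"\brenew\w*", re.IGNORECASE),
}

# Hypothetical authority list used for the lookup step.
KNOWN_ORGS = {"Forest Service", "Social Security Administration"}


def suggest_tags(text: str) -> dict:
    """Suggest categories by pattern match and organizations by lookup."""
    categories = sorted(c for c, rx in RULES.items() if rx.search(text))
    lowered = text.lower()
    orgs = sorted(o for o in KNOWN_ORGS if o.lower() in lowered)
    return {"categories": categories, "organizations": orgs}


tags = suggest_tags("Renew your Forest Service permit and submit payment online.")
print(tags)
```

Output like this would normally be treated as a suggestion for a person to approve or edit rather than as a final tag.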
Where do I put the metadata?
Where can I store metadata?
  - In the content – HTML Headers, File properties, etc.
  - In a centralized repository – Search index, Metadata database, etc.
Where should I store metadata? It depends.
  - If you are moving files through a process, putting it in the file keeps it from getting dropped at system borders.
  - If you are doing search across multiple documents, it has to be at least copied out of the files.
  - If you make copies of files and modify them, consistent in-file metadata will be impossible.
Real question is not where to STORE the metadata, it is how to MAINTAIN the metadata.
  - Web CMS as an example
Agenda
9:00 Who are we?
9:10 What are taxonomies & metadata?
9:30 What kinds of taxonomies are there, and what do I need?
9:40 How do I get a good taxonomy?
10:05 How do I associate the taxonomy with content?
10:30 Break
10:45 What do taxonomies and metadata have to do with search?
  - Does adding a taxonomy mean replacing my search engine?
  - How are they used behind the scenes in a search implementation?
  - How are they used in the Search UI to aid searching?
  - How can we make our current search engine better?
11:15 How can I sell my management on a taxonomy project?
11:45 Any more questions?
12:00 Adjourn
How to fix search? … Add metadata to search on!
“Adding metadata to unstructured content allows it to be managed like structured content. Applications that use structured content work better.”
“Enriching content with structured metadata is critical for supporting search and personalized content delivery.”
“Content that has been adequately tagged with metadata can be leveraged in usage tracking, personalization and improved searching.”
“Better structure equals better access: Taxonomy serves as a framework for organizing the ever-growing and changing information within a company. The many dimensions of taxonomy can greatly facilitate Web site design, content management, and search engineering. If well done, taxonomy will allow for structured Web content, leading to improved information access.”
How does Google do so well without metadata?
They don’t, they just use particular types of metadata:
  - Number of incoming links
  - PageRank for each incoming link
  - Text of incoming links
Dublin Core framework for corporate use
Not just 15 elements
A framework to enable cross-resource exploration and use
Dublin Core is framework for “integration metadata” at BellSouth
Source: Courtesy of Todd Stephens, BellSouth
What about Search? Integration Metadata
Element | Data Type | Length | Req. / Repeat | Source | Purpose | Dublin Core
Asset Metadata
Unique ID | Integer | Fixed | 1 | System supplied | Basic accountability | dc:identifier
Recipe Title | String | Variable | 1 | Licensed Content | Text search & results display | dc:title
Recipe summary | String | Variable | 1 | Licensed Content | Content | dc:description
Main Ingredients | List | Variable | ? | Main Ingredients vocabulary | Key index to retrieve & aggregate recipes, & generate shopping list | X
Subject Metadata
Meal Types | List | Variable | * | Meal Types vocab | Browse or group recipes & filter search results | X
Cuisines | List | Variable | * | Cuisines | | X
Courses | List | Variable | * | Courses vocab | | X
Cooking Method | Flag | Fixed | * | Cooking vocab | | X
Link Metadata
Recipe Image | Pointer | Variable | ? | Product Group | Merchandise products | dcterms:hasPart
Use Metadata
Rating | String | Variable | 1 | Licensed Content | Filter, rank, & evaluate recipes |
Release Date | Date | Fixed | 1 | Product Group | Publish & feature new recipes | dc:date
Legend: ? – 1 or more; * – 0 or more
Fixed values: dc:type=“recipe”, dc:format=“text/html”, dc:language=“en”
How do I sell Management on a Taxonomy Project?
Don’t sell “metadata” or “taxonomy”, sell the vision of what you want to be able to do.
Clearly understand what the problem is and what the opportunities are.
Do the calculus (costs and benefits)
Design the taxonomy (in terms of LOE) in relation to the value at hand.
Fundamentals of metadata ROI
Tagging content using metadata and a taxonomy are costs, not benefits.
There is no benefit without exposing the tagged content to users in some way that cuts costs or improves revenues.
Putting metadata and a taxonomy into operation requires UI changes and/or backend system changes, as well as data changes.
You need to determine those changes, and their costs, as part of the ROI.
What are the typical metadata ROI scenarios?
Catalog site: Increased sales. Increased productivity.
Customer support: Cutting costs. Increased sales.
Compliance: Avoiding penalties.
Knowledge worker productivity: Less time searching, more time working.
Metadata ROI: Catalog site
Guided Navigation
  - 2-3 clicks to product
  - No dead ends
http://www.tesco.com/winestore
Metadata ROI: Catalog site
Increased sales
  - Product findability.
  - Product cross-sells and up-sells.
  - Customer loyalty.
1-5% increase in sales
  - $57.6B sales (’04): $600M to $2B/year
  - $2.1B net income (’04): $21M to $105M/year
1-5% increase in productivity
  - $50K average cost per employee
  - 310,400 employees (’04)
  - $155M to $776M/year
Enterprise portal cost: $6M
Source: Proforma based on Hoover’s data.
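The net income and productivity ranges on this slide follow from simple percentage arithmetic, sketched below. The function name is an assumption; the inputs are the figures quoted on the slide.

```python
def benefit_range(base: float, low_pct: float, high_pct: float) -> tuple:
    """Return (low, high) benefit for a percentage improvement on a base."""
    return base * low_pct / 100, base * high_pct / 100


net_income = 2_100_000_000   # $2.1B net income ('04)
employees = 310_400          # employees ('04)
cost_per_employee = 50_000   # $50K average cost per employee
portal_cost = 6_000_000      # $6M enterprise portal cost

income_low, income_high = benefit_range(net_income, 1, 5)
prod_low, prod_high = benefit_range(employees * cost_per_employee, 1, 5)

print(f"Net income gain:   ${income_low / 1e6:.0f}M to ${income_high / 1e6:.0f}M per year")
print(f"Productivity gain: ${prod_low / 1e6:.0f}M to ${prod_high / 1e6:.0f}M per year")
print(f"Low-end net income gain is {income_low / portal_cost:.1f}x the portal cost")
```

Even the low end of either range is several times the quoted portal cost, which is the point of the pro forma.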
Metadata ROI: Customer support model
Policy categories for browsing
Type and go to search for specific policies
Good search results for policy topics, e.g., “pets”
Refine search offered with results
Help on search page, not a click away.
Metadata ROI: Customer support model
Self service
  - Fewer customer calls.
  - Faster, more accurate CSR responses through better information access.
25-50% service efficiency increase
  - 300K customer service calls per month
  - $6 cost per call
  - $5.4M to $10.8M/yr
Manual processing
  - 100,000 documents
  - 2 pages per document
  - $4 per page
  - $800K
1-5% increased sales
  - $18.6B sales (’04): $186M to $930M/year
  - ($761M) net income (’04): ($575M) to $169M/year
Source: Proforma based on Hoover’s data.
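The customer support figures can be reproduced the same way. This sketch assumes the efficiency gain applies directly to annual call-handling cost; variable names are illustrative.

```python
calls_per_month = 300_000
cost_per_call = 6            # dollars
annual_call_cost = calls_per_month * cost_per_call * 12

# 25-50% service efficiency increase applied to annual call cost.
savings_low = annual_call_cost * 25 / 100
savings_high = annual_call_cost * 50 / 100

# Manual processing: documents x pages per document x cost per page.
manual_processing_cost = 100_000 * 2 * 4

print(f"Annual call cost:  ${annual_call_cost / 1e6:.1f}M")
print(f"Savings:           ${savings_low / 1e6:.1f}M to ${savings_high / 1e6:.1f}M per year")
print(f"Manual processing: ${manual_processing_cost / 1e3:.0f}K")
```

The one-time manual tagging cost is small next to a single year of savings at the low end of the range.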
Metadata ROI: Compliance
Avoiding penalties for breaching regulations
  - SOX: up to 5 years in jail
  - SOX: up to $5M
Following required procedures
  - Loss of company: $100B revenue (’00)
  - Loss of partner companies: Arthur Andersen
Source: Proforma based on Hoover’s data.
[Figure: knowledge worker time by activity: Searching, Creating, Communicating.]
Knowledge workers spend up to 2.5 hours each day looking for information …
… But find what they are looking for only 40% of the time.
— Kit Sims Taylor
High cost of not finding information
“The amount of time wasted in futile searching for vital information is enormous, leading to staggering costs …”
– Sue Feldman
High cost of poor classification
Poor classification costs a 10,000 user organization $10M each year—about $1,000 per employee.
— Jakob Nielsen, useit.com
But “better search” itself is a weak ROI
[Figure: knowledge worker time by activity: Creating new content, Recreating existing content, Searching, Communicating. Values shown: 26%, 9%.]
Knowledge workers spend more time re-creating existing content than creating new content
— Kit Sims Taylor
Metadata ROI: Productivity
Decreased cost to market
  - Decreased development cost
  - Increased R&D productivity
  - Reduced time for sales & marketing
1-5% decrease in drug development cost
  - $800M/drug
  - $8M to $16M/drug
5-10% increase in R&D productivity
  - 13% of revenue
  - $39B in sales (’04)
  - $254M to $507M/year
10-20% decrease in time for sales & marketing
  - 13% of revenue
  - $254M to $507M/year
Enterprise document management system cost: $10M
Source: Proforma based on Hoover’s data.
Metadata ROI: Executive Mandate
There is no ROI out of the box
  - Just someone with a vision
  - …and the budget to make it happen.
What’s really needed?
  - Demos and proofs of value.
  - So that a stronger cost benefit argument can be made for continuing the work
Productivity, loyalty, and revenue have provided the ROI
Intranet has provided the best ROI
Intranet
Web/online customer sales
Web dev infrastructure
Middleware to link Web to ERP
e-billing/payment systems
Web/online business sales
Wireless Web access
Extranet/supply chain
e-marketplace/ portal
None
Contact Info
Ron Daniel: 925-368-8371
Joseph Busch: 415-377-7912