Ontologies: What, Why, and How?
Jon Corson-Rikert, Mann Library
Metadata Working Group
4/18/03
What problems are we trying to solve?
• Problems with content
  • Inconsistency
  • Incompatibility
  • Incompleteness
  • Unboundedness
• Need for automation
  • Discovery
  • Filtering
  • Assembly
  • Interoperability
Why consider ontologies?
• Sharing common understanding of the structure of information among people or software agents
• Codifying domain assumptions
  · Terminology
  · Relationships
• Reuse of domain knowledge
• Improving information retrieval success
  • Augmenting or refining search terms
    · Preferred terminology
    · Discriminating among alternative meanings (e.g., WordNet)
  • Language translation
  • Bridging across domains
The Evolution of Knowledge Management
Diagram (Johannes Keizer, FAO) in three stages, with labels as extracted:
• Pre-Web: Libraries/Archives/File systems; Bibliographic Catalogues on Cards or Computers; Human Indexing; Bibliographies; Reviews; Human reading, checking and classifying; Books, Magazines, Articles, …; Thesauri, Classification Schemes, Glossaries
• Web: Libraries/Archives/File Systems/Websites; Bibliographic Catalogues, Machine Index Catalogues; Human Indexing, Machine Indexing; Statistical Analysis by Machines; Bibliographies/Output from Fulltext Search Engines; Books, Magazines, Articles, Databases, Webpages
• Semantic Web: Electronic Repositories; Machine Readable Metadata Repositories; Machine Indexing, Human Indexing; Semantical Analysis by Machines; Knowledge based specialized webportals; Defined Electronic Information Elements; Knowledge Mining; Ontologies
What is an ontology? - 1
A thesaurus on steroids
• Ordered terminology
• Prescribed relationships among terms
What is an ontology? - 2
A shallow classification of basic categories
• Defines categories, and hence terminology
• Defines rules
(Soergel 1999)
What is an ontology? - 3
In information science:
A characterization, through formal, explicit knowledge, of the intended meanings and relationships of a vocabulary of concepts
(Gruber 1993)
What is an ontology? - 4
A formal explicit description of concepts in a domain of discourse (classes, or concepts),
with properties of each concept describing various features and attributes of the concepts (slots, roles, or properties)
and restrictions on slots (facets).
An ontology together with a set of individual instances of classes constitutes a knowledge base
(Ontology 101)
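The definition above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `Slot`, `OntClass`, `Instance`, and the `Crop` example are hypothetical, and the two facets shown (value type and maximum number of values) are a small subset of what a real ontology editor supports.

```python
from dataclasses import dataclass, field


@dataclass
class Slot:
    """A property of a class; its facets restrict the values it may hold."""
    name: str
    value_type: type      # facet: allowed value type
    max_values: int = 1   # facet: maximum number of values


@dataclass
class OntClass:
    """A concept in the domain of discourse."""
    name: str
    slots: dict = field(default_factory=dict)

    def add_slot(self, slot: Slot) -> None:
        self.slots[slot.name] = slot


@dataclass
class Instance:
    """An individual member of a class, holding values for that class's slots."""
    name: str
    cls: OntClass
    values: dict = field(default_factory=dict)

    def set_value(self, slot_name: str, *vals) -> None:
        slot = self.cls.slots[slot_name]  # KeyError if the class has no such slot
        if len(vals) > slot.max_values:
            raise ValueError(f"{slot_name}: at most {slot.max_values} value(s)")
        for v in vals:
            if not isinstance(v, slot.value_type):
                raise TypeError(f"{slot_name}: expected {slot.value_type.__name__}")
        self.values[slot_name] = list(vals)


# Ontology: one class with two slots (hypothetical example).
crop_class = OntClass("Crop")
crop_class.add_slot(Slot("common_name", str, max_values=1))
crop_class.add_slot(Slot("growing_region", str, max_values=3))

# Knowledge base: the ontology together with individual instances.
crop = Instance("crop_1", crop_class)
crop.set_value("common_name", "maize")
crop.set_value("growing_region", "Midwest", "Northeast")
knowledge_base = {"classes": [crop_class], "instances": [crop]}
```

Keeping the facets on the slot rather than on the instance means every instance of a class is checked against the same restrictions.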
Ontologies have …
Concepts
Relations between concepts
• Synonyms
• Class/subclass (broader/narrower; dog is to mammal)
• Membership (“is a”: Spot is a dog)
• Part/whole (hand is part of arm, car has fender)
• Inverse (e.g., pest damages plant so plant is damaged by pest)
Axioms (properties and attributes of concepts)
• Definitions specifying both necessary and sufficient criteria for membership
• Constraints such as domain and range, minimum or maximum number of values
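One common way to hold such relations is as subject-relation-object triples. The sketch below uses the examples from this slide; the relation names and the `INVERSE` table are assumptions made for illustration, not part of any standard vocabulary.

```python
# Each fact is a (subject, relation, object) triple, using the slide's examples.
facts = {
    ("dog", "subclass_of", "mammal"),
    ("Spot", "is_a", "dog"),
    ("hand", "part_of", "arm"),
    ("pest", "damages", "plant"),
}

# Hypothetical inverse table: relation name -> name of its inverse relation.
INVERSE = {
    "damages": "is_damaged_by",
    "part_of": "has_part",
    "subclass_of": "has_subclass",
}


def with_inverses(triples):
    """Return the triples plus the inverse of every triple whose relation has one."""
    result = set(triples)
    for subj, rel, obj in triples:
        if rel in INVERSE:
            result.add((obj, INVERSE[rel], subj))
    return result


expanded = with_inverses(facts)
print(("plant", "is_damaged_by", "pest") in expanded)  # True
```

Only the forward facts are stored by hand; the inverse statements are derived, so the two directions cannot drift apart.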
Ontologies will (eventually) support:
Automatic classification and query
• Where does a target word or phrase fit into the ontology
• Locating a concept or a cluster of concepts based on a description and/or relationships
• Vocabulary switching between domains
Inference
• Using relationships to determine, given A and B, what C might be and how you know it
• Analysis to enhance navigation
Consistency checking
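A minimal sketch of the inference idea, "given A and B, what C might be and how you know it", follows. It assumes each class has at most one direct superclass; the `animal` class and the function name `explain_membership` are made up for the example.

```python
SUBCLASS_OF = {"dog": "mammal", "mammal": "animal"}  # class -> direct superclass
IS_A = {"Spot": "dog"}                                # individual -> asserted class


def explain_membership(individual, target_class):
    """Return the chain of facts showing that individual belongs to target_class.

    Returns None when no such chain exists.
    """
    cls = IS_A.get(individual)
    if cls is None:
        return None
    steps = [f"{individual} is_a {cls}"]
    seen = {cls}
    while cls != target_class:
        parent = SUBCLASS_OF.get(cls)
        if parent is None or parent in seen:  # ran out of superclasses, or a cycle
            return None
        steps.append(f"{cls} subclass_of {parent}")
        seen.add(parent)
        cls = parent
    return steps


print(explain_membership("Spot", "mammal"))
# ['Spot is_a dog', 'dog subclass_of mammal']
```

The returned list is the "how you know it" part: each entry is a stored fact that the conclusion depends on.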
From common data to common structure
• Controlled vocabulary
  • Very simple structure (nearly flat)
  • The terms are the data
• Taxonomy
  • Primarily to define position within a hierarchy – e.g., species
• Thesaurus
  • More options for relationships
  • Often leverages retrieval and organization of additional data
• Meta-thesaurus
  • A federation of similar thesaurus structures to allow bridging data across languages or across domains
• Ontology
  • Whatever can’t be done by the above
Typical thesaurus implementation
• A controlled vocabulary or thesaurus limited to the domain
• A set of separate database tables, each with predictable attributes
  • People
  • Departments
  • Resources
• Thesaurus cross-references this content for internal navigation
• Incoming keyword queries can provide a rich context of links to data tables
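The arrangement described above might look like the following sketch, with plain dictionaries standing in for database tables. All rows, terms, and the `lookup` function are hypothetical; a real site would query its own tables.

```python
# Thesaurus: each entry term maps to its preferred term (hypothetical entries).
PREFERRED = {"corn": "maize", "maize": "maize", "zea mays": "maize", "wheat": "wheat"}

# Separate "tables"; each row is tagged with preferred thesaurus terms.
TABLES = {
    "People": [{"name": "A. Researcher", "terms": {"maize"}}],
    "Departments": [{"name": "Plant Breeding", "terms": {"maize", "wheat"}}],
    "Resources": [{"name": "Wheat rust bulletin", "terms": {"wheat"}}],
}


def lookup(keyword):
    """Resolve a keyword through the thesaurus, then collect matching rows per table."""
    term = PREFERRED.get(keyword.strip().lower())
    if term is None:
        return {}
    hits = {}
    for table, rows in TABLES.items():
        names = [row["name"] for row in rows if term in row["terms"]]
        if names:
            hits[table] = names
    return hits


print(lookup("Corn"))
# {'People': ['A. Researcher'], 'Departments': ['Plant Breeding']}
```

Because every table is tagged with the same preferred terms, one keyword yields links into several tables at once.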
Website with thesaurus
Diagram: Queries and the Thesaurus linked to data tables for People, Orgs, Projects, Publications, Crops, and Genes
http://mcknight.ccrp.cornell.edu
Thesaurus as leveraging agent
Diagram labels: Input query; Refinement; thesaurus; 2nd thesaurus; 3rd thesaurus; then search against data warehouse
Gazetteer as leveraging agent
Scenario:
• User finds library record (e.g., book or photo) with place name reference (e.g., neighborhood in L.A.)
• Place name and desired action sent to gazetteer (e.g., find other photos in nearby L.A. neighborhoods using appropriate historical neighborhood names)
• Gazetteer matches incoming place name with coordinate footprint
• Other place names near footprint and in L.A. retrieved
• Records related to neighboring places returned to user
Requires:
• Structured data (place names, coordinates)
• Relationships (historical to modern names, neighborhoods to city)
• Functionality (coordinate-based spatial analysis)
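The scenario can be sketched roughly as follows. Everything here is an assumption for illustration: the place names, coordinates, and records are invented, each footprint is reduced to a single point, and "near" is a simple box test in degrees rather than a real spatial analysis.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Place:
    name: str
    city: str
    lat: float
    lon: float


# Hypothetical gazetteer entries; a historical name can share a footprint
# with the modern name for the same neighborhood.
GAZETTEER = [
    Place("Old Mill District", "L.A.", 34.05, -118.25),  # historical name
    Place("Mill Park", "L.A.", 34.05, -118.25),          # modern name
    Place("Harbor Row", "L.A.", 34.06, -118.24),
    Place("Far Valley", "L.A.", 34.30, -118.60),
]

# Hypothetical library records, keyed by the place name used in the record.
RECORDS = {
    "Old Mill District": ["photo 101"],
    "Harbor Row": ["photo 202", "book 7"],
    "Far Valley": ["photo 303"],
}


def nearby_records(place_name, radius_deg=0.05):
    """Return records for other places in the same city within radius_deg."""
    origin = next((p for p in GAZETTEER if p.name == place_name), None)
    if origin is None:
        return []
    results = []
    for p in GAZETTEER:
        same_city = p.city == origin.city
        close = (abs(p.lat - origin.lat) <= radius_deg
                 and abs(p.lon - origin.lon) <= radius_deg)
        if same_city and close and p.name != origin.name:
            results.extend(RECORDS.get(p.name, []))
    return results


print(nearby_records("Mill Park"))
# ['photo 101', 'photo 202', 'book 7']
```

The record filed under the historical name is found because the match is made on coordinates, not on the name string.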
Agriculture Heritage Project
• Wide variety of content from diverse organizations
• Open-ended content
• Time and place as first-order variables
• Data likely to cluster by theme, time, and place
• Many areas with sparse data
• Need to appeal to diverse audiences
• Need to produce independently functional results
• Goal: transform flat archives into dynamic context of people,
places, and events
Approach
• Simple underlying content model
• Adaptive relationships among content
• Sometimes very detailed
• Often very general
• Approachable from any viewpoint
• Time, space, originating organization, historical event, personalities,
crops, thematic interests
• Capability for encapsulation and export as curricular units
The ABC Ontology Model
• A rich model incorporating time, place, and events as well as information more traditionally encoded in metadata
• Designed for exchange and interoperability as RDF-XML metadata
• A set of generalized classes and canonical relationships among them
• An ontology framework independent of the data it accompanies
ABC Ontology classes
Class diagram labels: Entity; actuality, temporality, abstraction; time, place; agent, artifact; action, event, situation; work, manifestation, item
ABC Ontology diagrams - 1
Events precede or follow situations
Diagram: chain of EV0 (creation), ST0, EV1 (publication), ST1, EV2 (acquisition)
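As a rough sketch of "events precede or follow situations", the chain in this diagram can be written as an alternating list and turned into links. The ordering EV0, ST0, EV1, ST1, EV2 is a reading of the diagram labels, and the function name is hypothetical; the property names are the Situation-Event pairs from the ABC relationship tables.

```python
# Alternating timeline: events (EV) separated by situations (ST).
timeline = [
    ("EV0", "event", "creation"),
    ("ST0", "situation", None),
    ("EV1", "event", "publication"),
    ("ST1", "situation", None),
    ("EV2", "event", "acquisition"),
]


def situation_links(chain):
    """Derive (subject, property, object) links between neighbouring nodes."""
    links = []
    for (a_id, a_kind, _), (b_id, b_kind, _) in zip(chain, chain[1:]):
        if a_kind == b_kind:
            raise ValueError(f"{a_id} and {b_id}: events and situations must alternate")
        if a_kind == "event":
            # Event comes first, so the situation follows it.
            links.append((b_id, "follows", a_id))
            links.append((a_id, "isFollowedBy", b_id))
        else:
            # Situation comes first, so it precedes the next event.
            links.append((a_id, "precedes", b_id))
            links.append((b_id, "isPrecededBy", a_id))
    return links


links = situation_links(timeline)
print(("ST0", "precedes", "EV1") in links)  # True
```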
ABC Ontology diagrams - 2
Most agents, actions, times, and places modify events
Diagram: the chain EV0 (creation), ST0, EV1 (publication), ST1, EV2 (acquisition), with events annotated through hasAction, hasAgent, inPlace, and atTime. Labels: AC0, AG0, AC1, AG1, photo taken, photographer, photo published, publishing house
ABC Ontology diagrams - 3
Manifestations exist in situations
Diagram: the chain EV0 (creation), ST0, EV1 (publication), ST1, EV2 (acquisition), with manifestations MN0, MN1, MN2, and MN3 and work WK0 (Tulips). Labels: the photo, color transparency original, color print, poster, the poster, Kodak archive, collection. Properties shown: hasRealization, isPartOf, contains, instanceOf
Complete ABC diagram
Source: http://metadata.net/harmony/cimi_modelling.htm
ABC class-property relationships
• Set of canonical relationships
• All bi-directional (inverses)
• Provide a domain of possible connections
• Serve as the basis for model traversal
ABC class-property relationships - 1
Entity-Entity            contains - isPartOf
Entity-Place             inPlace - isLocationOf
Actuality-Actuality      hasPhase - isPhaseOf
Actuality-Situation      inContext - isContextFor
Work-Manifestation       hasRealization - isRealizationOf
Manifestation-Item       hasCopy - isCopyOf
ABC class-property relationships - 2
Temporality-Agent        hasParticipant - isParticipant
Temporality-Actuality    involves - isInvolvedIn
                         transforms - isTransformedBy
                         usesTool - usedAsToolIn
                         destroys - isDestroyedBy
                         hasResult - isResultOf
                         creates - isCreatedBy
Event-Action             hasAction - isActionOf
Event-Agent              hasPresence - isPresentIn
Situation-Event          precedes - isPrecededBy
                         follows - isFollowedBy
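To illustrate how bi-directional canonical relationships can bound the possible connections and support traversal, here is a small sketch using three pairs from the tables. The `Model` class, its methods, and the item `IT0` are hypothetical, and class matching is exact: it does not handle subclasses, so a row such as Entity-Entity would need extra logic.

```python
# A few rows from the tables: (domain class, range class, property, inverse).
CANONICAL = [
    ("Work", "Manifestation", "hasRealization", "isRealizationOf"),
    ("Manifestation", "Item", "hasCopy", "isCopyOf"),
    ("Event", "Action", "hasAction", "isActionOf"),
]

# Index every property and its inverse so a link can be checked from either side.
RULES = {}
for dom, rng, prop, inv in CANONICAL:
    RULES[prop] = (dom, rng, inv)
    RULES[inv] = (rng, dom, prop)


class Model:
    def __init__(self):
        self.cls = {}    # node id -> class name
        self.edges = {}  # node id -> list of (property, target id)

    def add(self, node, cls):
        self.cls[node] = cls
        self.edges[node] = []

    def link(self, subj, prop, obj):
        """Add a link and its inverse, rejecting class pairs the table does not allow."""
        dom, rng, inv = RULES[prop]
        if (self.cls[subj], self.cls[obj]) != (dom, rng):
            raise TypeError(f"{prop} connects {dom} to {rng}")
        self.edges[subj].append((prop, obj))
        self.edges[obj].append((inv, subj))

    def follow(self, start, *props):
        """Walk one hop per property name; return the node reached, or None."""
        node = start
        for prop in props:
            targets = [t for p, t in self.edges[node] if p == prop]
            if not targets:
                return None
            node = targets[0]
        return node


m = Model()
m.add("WK0", "Work")
m.add("MN1", "Manifestation")
m.add("IT0", "Item")  # hypothetical item, not shown in the slides
m.link("WK0", "hasRealization", "MN1")
m.link("MN1", "hasCopy", "IT0")
print(m.follow("IT0", "isCopyOf", "isRealizationOf"))  # WK0
```

Because each link is stored with its inverse, the model can be walked from the item back to the work as easily as from the work down to the item.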
Work in progress
Demo of Agriculture Heritage site prototype
Is it worth it?
• It’s worth exploring
• Must be easier to build
• Useful to rethink typical site structure
• Not clear how to leverage all the potential power
• Need more use cases
• What does it mean for metadata?
References
• “Indirect geospatial referencing through place names in the digital library: Alexandria Digital Library experience with developing and implementing gazetteers,” Linda L. Hill, Zi Zheng, Proceedings of the American Society for Information Science Annual Meeting, Washington, D.C., Oct. 31- Nov. 4, 1999, pp. 57-69.
• “Ontology Development 101: A Guide to Creating Your First Ontology”, Natalya F. Noy, Deborah L. McGuinness, Stanford University, Stanford, CA 94305
• “Science and the Semantic Web,” James Hendler, Science, vol. 299, 1/24/03
• “The ABC Ontology and Model,” Carl Lagoze and Jane Hunter, Journal of Digital Information, volume 2 issue 2, November, 2001.
• “The Rise of Ontologies or the Reinvention of Classification,” Dagobert Soergel, Journal of the American Society for Information Science, 50(12):1119-1120, 1999
• “Toward Principles for the Design of Ontologies Used for Knowledge Sharing,” Thomas R. Gruber, Revision: August 23, 1993, Stanford Knowledge Systems Laboratory