Development of Ontologies
Guus Schreiber
SWI, University of Amsterdam
Co-chair, W3C Web Ontology Working Group
COST G9 workshop, 11 Oct 2002
Overview
• The notion of ontology
• Ontology types and examples
• Ontology languages
• Ontology engineering: methods and tools
What is an Ontology?
• In philosophy: theory of what exists in the world
• In IT: consensual & formal description of shared concepts in a domain
• Aid to human communication and shared understanding, by specifying meaning
• Machine-processable (e.g., agents use ontologies in communication)
• Ontology = key technology in semantic information processing
• Applications: knowledge management, e-business, industrial engineering, semantic world-wide web
What is an Ontology? (2)
Source: Financial Times, e-procurement, Oct. 2000
The notion of ontology
• Ontology = “explicit specification of a shared conceptualization that holds in a particular context” (several authors)
• Captures a viewpoint on a domain:
– Taxonomies of species
– Physical, functional, & behavioral system descriptions
– Task perspective: instruction, planning
• Main difference with data models is not the content, but the purpose (generalizes over applications)
Ontology should allow for “representational promiscuity”
Ontology: parameter, constraint-expression
Knowledge base A:
  cab.weight + safety.weight = car.weight
  cab.weight < 500
Knowledge base B (knowledge base A rewritten as):
  parameter(cab.weight)
  parameter(safety.weight)
  parameter(car.weight)
  constraint-expression(cab.weight + safety.weight = car.weight)
  constraint-expression(cab.weight < 500)
Other figure labels: viewpoint, mapping rules
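The rewriting of knowledge base A into knowledge base B can be sketched in Python. This is only an illustration under stated assumptions: knowledge base A is taken to be a list of expression strings, any dotted identifier such as cab.weight is taken to be a parameter, and the names `PARAMETER` and `rewrite_kb` are hypothetical, not from the slides.

```python
import re

# Assumption: a parameter is any dotted identifier such as "cab.weight".
PARAMETER = re.compile(r"[A-Za-z_]\w*\.[A-Za-z_]\w*")


def rewrite_kb(expressions):
    """Rewrite raw expressions (knowledge base A) into ontology terms (knowledge base B)."""
    parameters = []
    for expr in expressions:
        for name in PARAMETER.findall(expr):
            if name not in parameters:  # keep first-seen order, no duplicates
                parameters.append(name)
    facts = [f"parameter({name})" for name in parameters]
    facts += [f"constraint-expression({expr})" for expr in expressions]
    return facts


kb_a = ["cab.weight + safety.weight = car.weight", "cab.weight < 500"]
kb_b = rewrite_kb(kb_a)
for fact in kb_b:
    print(fact)
```

The point of the sketch is that the ontology terms (parameter, constraint-expression) are added on top of the original expressions, which stay unchanged inside the rewritten facts.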
Ship design: STEP product model used for data exchange
[Figure: STEP-based data exchange between a Ship Builder/Designer and Lloyd's Register]
Ship Builder/Designer: Design System, private program representation, 'C' API, “WriteDesign”, “RetrieveDesign Markups”
Lloyd's Register: Assessment System, private program representation, Prolog API, “RetrieveDesign”, “WriteDesign Markups”
Exchanged in neutral format (SPF): Submitted Design, Assessment Report
Further labels: KACTUS, AP218, Assessment Ontology, Express instances, Prolog instances, instance mappings (marked “?”), CML world, EXPRESS world
The importance of context
Principle 1: “The representation of real-world objects always depends on the context in which the object is used. This context can be seen as a “viewpoint” taken on the object. It is usually impossible to enumerate in advance all the possible useful viewpoints on (a class of) objects.”
Principle 2: “Reuse of some piece of information requires an explicit description of the viewpoints that are inherently present in the information. Otherwise, there is no way of knowing whether, and why this piece of information is applicable in a new application setting.”
Multiple views on a domain
• typical viewpoints captured in ontologies:
– physical, functional, behavioral, process type: flow, energy, ...
• viewpoints typically overlap
• applications require combinations of viewpoints
[Figure: Heat Exchanger, with labels: platform design, diagnosis, process simulation, physical structure, connections, mathematics of heat exchange process, temperature differences]
Ontology as conceptual structuring: multiple viewpoints & abstraction levels
• viewpoint decomposition
– shape, geometry
– function
– behavior
– causality
– structure: part-of (mereology), aggregation
– connectedness (topology)
• abstraction (generalization) level organization:
– Intel 166 MHz
– micro-processor
– device component
– (sub)system: part-of, connectedness
– thing
Leveling of ontologies
• Ontologies can have a recursive structure:
• One ontology expresses a viewpoint on another ontology.
• Entails a reformulation and/or reinterpretation of the underlying domain theories.
• Often used to specify increasingly application-specific interpretations and/or reformulations of domain expressions.
• Notion of ontology mapping
– Still poorly understood
Multiple ontology levels
x + y = z
0 < z < 10
Interpreted as a numerical/logical dependency between system parameters:
constraint-expression(x + y = z)
constraint-expression(0 < z < 10)
Interpretation according to role in the problem solving process:
calculation(x + y = z)
constraint(0 < z < 10)
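The step from the constraint-expression level to the role level can be sketched as a small Python function. The decision rule used here (an inequality acts as a constraint, a plain equality as a calculation) is an assumption made for illustration; the slide only shows the outcome for these two expressions. The name `interpret_by_role` is hypothetical.

```python
def interpret_by_role(expression):
    """Assign a problem-solving role to a constraint expression."""
    # Assumption for illustration: inequalities restrict values (constraint),
    # while a plain equality is used to compute a value (calculation).
    if "<" in expression or ">" in expression:
        return f"constraint({expression})"
    if "=" in expression:
        return f"calculation({expression})"
    raise ValueError(f"unrecognized expression: {expression}")


levels = [interpret_by_role(e) for e in ["x + y = z", "0 < z < 10"]]
print(levels)
```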
Context specification through ontology types
• Domain-specific ontologies
– Medicine: UMLS, SNOMED, Galen
– Art history: AAT, ULAN
– STEP application protocols
• Task-specific ontologies
– Classification
– E-commerce
• Generic ontologies
– Top-level categories
– Units and dimensions
Art and Architecture Thesaurus
Domain ontology of a traffic light control system
Classification ontology
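[Figure: classification ontology diagram]
Concept labels: description universe, description dimension, descriptor, value set, value, descriptor value, descriptor value set, object, object type, object class, class constraint
Relation labels: has feature, in dimension, instance of, class of, has descriptor
Cardinality 1+ appears on six relation ends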
Ontology for e-commerce
Top-level categories: many different proposals
Chandrasekaran et al. (1999)
Ontology specification
• Many different languages
– KIF
– Ontolingua
– Express
– LOOM
– UML
– RDF Schema / DAML+OIL / OWL
• Common basis
– Class (concept)
– Subclass with inheritance
– Relation (slot)
Additional expressivity (1 of 2)
• Multiple subclasses
• Aggregation
– Built-in part-whole representation
• Relation-attribute distinction
– “Attribute” is a relation/slot that points to a data type
• Treating relations as classes
– Sub relations
– Reified relations (e.g., UML “association class”)
• Constraint language
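A reified relation, in the sense of a UML association class, can be sketched with Python dataclasses. The domain (persons, companies, employment) and all names below are hypothetical and chosen only for illustration; they do not come from the slides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Person:
    name: str


@dataclass(frozen=True)
class Company:
    name: str


@dataclass(frozen=True)
class Employment:
    """A reified relation: the link between Person and Company is itself an object."""
    employee: Person   # relation: points to another class
    employer: Company  # relation: points to another class
    start_year: int    # attribute: a slot that points to a data type
    role: str          # attribute


job = Employment(Person("Ann"), Company("Acme"), 2001, "engineer")
print(job.employee.name, "works at", job.employer.name, "since", job.start_year)
```

Because the relation is an object, it can carry its own attributes (here `start_year` and `role`), which a plain slot from Person to Company could not.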
Additional expressivity (2 of 2)
• Class/subclass semantics
– Primitive vs. defined classes
– Complete/partial, disjoint/overlapping subclasses
• Set of basic data types
• Modularity
– Import/export of an ontology
• Ontology mapping
– Renaming ontological elements
– Transforming ontological elements
• Sloppy class/instance distinction
– Class-level attributes/relations
– Meta classes
Priority list for expressivity
• Depends on goal:
– Deductive capability: “limit to subset of first-order logic”
– Maximal content: “as much as (pragmatically) possible”
• My priority list (from a “maximal-content” representative)
1. Multiple subclasses
2. Reified relations
3. Import/export mechanism
4. Sloppy class/instance distinction
5. Aggregation
6. Constraint language
Expressivity of RDF Schema
• Class
– Describes collection of resources
• Property
– Links class to another class or to a “literal” (data value)
– Domain and range restrictions
• Subclass relation
– Property inheritance
• Subproperty relation
• Classes and properties are themselves also resources
– Cf. “classes as instances”
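The effect of the subclass relation can be sketched with plain Python tuples standing in for RDF triples. This is a toy illustration, not an RDF library: the `ex:` resources are hypothetical, and only one inference is implemented, namely that an instance of a class is also an instance of all its superclasses.

```python
RDF_TYPE = "rdf:type"
SUBCLASS_OF = "rdfs:subClassOf"

# Hypothetical example data; "ex:" names are illustrative only.
triples = {
    ("ex:Fokker70", SUBCLASS_OF, "ex:Aircraft"),
    ("ex:Aircraft", SUBCLASS_OF, "ex:Vehicle"),
    ("ex:PH-851", RDF_TYPE, "ex:Fokker70"),
    ("ex:Aircraft", RDF_TYPE, "rdfs:Class"),  # a class is itself a resource
}


def superclasses(cls, graph):
    """Return all direct and indirect superclasses of cls."""
    found, todo = set(), [cls]
    while todo:
        current = todo.pop()
        for s, p, o in graph:
            if s == current and p == SUBCLASS_OF and o not in found:
                found.add(o)
                todo.append(o)
    return found


def types_of(resource, graph):
    """Return asserted types plus those inferred through the subclass relation."""
    direct = {o for s, p, o in graph if s == resource and p == RDF_TYPE}
    inferred = set(direct)
    for cls in direct:
        inferred |= superclasses(cls, graph)
    return inferred


print(sorted(types_of("ex:PH-851", triples)))
```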
OWL: W3C Web Ontology Language
• Basis = RDF Schema
• Basic features (OWL Lite/Core):
– Cardinality restrictions (limited)
– Local range constraints
– Equality of resources
– Inverse, symmetric and transitive properties
– Datatypes (reference to XML Schema)
• Advanced features (OWL DL)
– Boolean class combinations
– Disjointness and completeness
– Nameless classes
– Cardinality restrictions (full)
• Under development, see http://www.w3.org
Example UML presentation of OWL
Modelling issue: classes as instances
Aircraft-type
  no-of-engines: integer > 0
  propulsion: {propeller, jet}
Fokker-70
  instance of Aircraft-type
  no-of-engines = 2
  propulsion = jet
Aircraft
  no-of-seats: positive integer
  owner: Airline
Fokker-70
  subclass of Aircraft
  no-of-seats: 60-80
PH-851
  instance of Fokker-70
  no-of-seats = 65
  owner = KLM
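Python metaclasses give one way to sketch this double role of Fokker-70, as an instance of Aircraft-type and as a subclass of Aircraft. The code is an illustration under assumptions: the owner is stored as a plain string rather than an Airline object, the seat range 60-80 is read as inclusive, and the Python identifiers are hypothetical renderings of the slide's names.

```python
class AircraftType(type):
    """Metaclass: each of its instances is a class describing one aircraft type."""

    def __init__(cls, name, bases, namespace):
        super().__init__(name, bases, namespace)
        engines = namespace.get("no_of_engines")
        if engines is not None and engines <= 0:
            raise ValueError("no_of_engines must be greater than 0")
        propulsion = namespace.get("propulsion")
        if propulsion is not None and propulsion not in {"propeller", "jet"}:
            raise ValueError("propulsion must be 'propeller' or 'jet'")


class Aircraft(metaclass=AircraftType):
    seat_range = None  # subclasses may narrow the allowed number of seats

    def __init__(self, registration, no_of_seats, owner):
        if no_of_seats <= 0:
            raise ValueError("no_of_seats must be a positive integer")
        if self.seat_range is not None and no_of_seats not in self.seat_range:
            raise ValueError("no_of_seats outside the range for this type")
        self.registration = registration
        self.no_of_seats = no_of_seats
        self.owner = owner  # assumption: a string instead of an Airline object


class Fokker70(Aircraft):
    # Values that Fokker-70 has as an instance of Aircraft-type.
    no_of_engines = 2
    propulsion = "jet"
    # Restriction that Fokker-70 adds as a subclass of Aircraft (60-80 inclusive).
    seat_range = range(60, 81)


ph_851 = Fokker70("PH-851", 65, "KLM")
print(isinstance(Fokker70, AircraftType), issubclass(Fokker70, Aircraft))
```

In this sketch the class-level values (`no_of_engines`, `propulsion`) describe the type, while `no_of_seats` and `owner` describe the individual aircraft PH-851.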
Modelling issue: definitional and default knowledge
IF style/period = “Late Georgian”
THEN (by definition) culture = “British” AND date.created between 1760-1811
IF type = “chest of drawers” AND style/period = “Late Georgian”
THEN (this typically suggests) material.main = “mahogany”
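The difference between the two rules can be sketched in Python: definitional knowledge is always asserted, while default knowledge only fills a gap. This is an illustrative sketch, assuming records are plain dictionaries keyed by the field names used in the rules; the function name `apply_rules` is hypothetical. A real system might report a conflict instead of overwriting a recorded culture.

```python
def apply_rules(record):
    """Return a copy of record with definitional and default knowledge applied."""
    result = dict(record)
    if result.get("style/period") == "Late Georgian":
        # Definitional: always holds, so it is asserted unconditionally.
        result["culture"] = "British"
        result["date.created"] = (1760, 1811)
        if result.get("type") == "chest of drawers":
            # Default: only a suggestion, so it never overrides a recorded value.
            result.setdefault("material.main", "mahogany")
    return result


unknown_material = apply_rules(
    {"type": "chest of drawers", "style/period": "Late Georgian"}
)
oak_chest = apply_rules(
    {"type": "chest of drawers", "style/period": "Late Georgian", "material.main": "oak"}
)
print(unknown_material["material.main"], oak_chest["material.main"])
```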
Modelling issue: dealing with existing hierarchies
<color>
<chromatic color>
pink
vivid pink
strong pink
<intermediate pink>
purplish pink
brilliant purplish pink
yellowish pink
<neutral color>
Limitations of Hierarchies
• What’s in a link?
– Hierarchical links often have different semantics
• “Dimensions” of distinction making provide rationale for hierarchical levels
– (Multiple) classification along different dimensions within single hierarchy creates confusion and makes applications unnecessarily complex
• Hierarchy enforces a single fixed sequence of dimensions
– fixed ordering not always possible or desirable
Two different organizations of the disease hierarchy
infection
  meningitis
  pneumonia
    bacterial pneumonia
    viral pneumonia
      acute viral pneumonia
      chronic viral pneumonia
infection
  meningitis
  pneumonia
    chronic pneumonia
    acute pneumonia
      acute viral pneumonia
      acute bacterial pneumonia
Characteristics of ontologies: viewpoints - simultaneous multiple classifications
infection
  time factor: acute infection, chronic infection
  causal agent: viral infection, bacterial infection
  meningitis, pneumonia
acute viral meningitis
Note: different dimensions along which distinctions are made (e.g. time, location, cause,…) often occur and are used simultaneously in a task.
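Simultaneous classification along several dimensions can be sketched with Python multiple inheritance, where each dimension contributes its own set of subclasses of infection. This is an illustration only; the class names are hypothetical renderings of the labels in the figure.

```python
class Infection:
    """Root concept."""


# Dimension: time factor
class AcuteInfection(Infection):
    pass


class ChronicInfection(Infection):
    pass


# Dimension: causal agent
class ViralInfection(Infection):
    pass


class BacterialInfection(Infection):
    pass


# Third dimension from the figure (meningitis vs. pneumonia)
class Meningitis(Infection):
    pass


class Pneumonia(Infection):
    pass


class AcuteViralMeningitis(AcuteInfection, ViralInfection, Meningitis):
    """Classified along all three dimensions at once."""


print([c.__name__ for c in AcuteViralMeningitis.__mro__])
```

No dimension has to be placed above another, in contrast to a single hierarchy that fixes the order of the distinctions.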
Modelling issue: part-whole relation
• Examples:
– a wing spar is part of a wing assembly
– chests of drawers have feet with their own style
• Most items in collections have some internal structure
Part-whole relations
• Important for describing objects with structure
• Semantics are complicated
• Different types of part-whole relations can be distinguished
• Good overview article:
– A. Artale, E. Franconi, N. Guarino and L. Pazzi. Part-Whole Relations in Object-Centered Systems: An Overview. Data and Knowledge Engineering. October 1996
WCH typology of part-whole relations
• Three features
– Do the parts play a functional role in the whole?
– Is the part made of the same thing as the whole?
– Can the parts be separated from the whole?
• Component / Integral object
– Example: elevator, car
– Functional, separable, non-homogeneous
• Member / Collection
– Idem, but non-functional (tree in forest)
• Portion / Mass
– Separable, homogeneous (slice of bread)
• Place / Area
– Not separable, homogeneous (Lunteren part-of Gelderland)
• Stuff / Object
– Not separable, not homogeneous (steel in bike)
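The typology above can be written down as a small lookup table in Python. This is a sketch under one loud assumption: where the slide does not state whether a type is functional, the value is recorded as `None` (not stated) rather than filled in. The names `PartWholeType`, `WCH_TYPES` and `candidates` are hypothetical.

```python
from typing import NamedTuple, Optional


class PartWholeType(NamedTuple):
    name: str
    functional: Optional[bool]  # None = not stated on the slide
    homogeneous: bool
    separable: bool


WCH_TYPES = [
    PartWholeType("component/integral object", True, False, True),
    PartWholeType("member/collection", False, False, True),
    PartWholeType("portion/mass", None, True, True),
    PartWholeType("place/area", None, True, False),
    PartWholeType("stuff/object", None, False, False),
]


def candidates(functional, homogeneous, separable):
    """Return the names of the types compatible with the three feature values."""
    return [
        t.name
        for t in WCH_TYPES
        if t.homogeneous == homogeneous
        and t.separable == separable
        and (t.functional is None or t.functional == functional)
    ]


print(candidates(functional=True, homogeneous=False, separable=True))
```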
Modelling of part-whole relations
• Explicit introduction of wholes
• Distinction between parts and other features (attributes, relations) of the whole
• Built-in transitivity of parts
– If A part-of B and B part-of C then A part-of C
• Generic names for parts
– Typically describe functional roles (car has wheels)
• Vertical relationships
– Existence dependency between whole and part
– Feature dependencies:
  • Inheritance from part to whole: “defective”
  • Inheritance from whole to part: “owner”
  • Systematic relation: weight whole = sum weight parts
• Horizontal relationships
– Constraints between parts
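Several of these modelling points (transitive part-of, and the three vertical feature dependencies) can be sketched in one small Python class. This is an illustration under assumptions: the class name `Part`, the example objects and the weight of 120.0 are hypothetical, and a whole is allowed an own weight in addition to the weights of its parts.

```python
class Part:
    def __init__(self, name, own_weight=0.0, defective=False):
        self.name = name
        self.own_weight = own_weight
        self._defective = defective
        self._owner = None
        self.whole = None  # explicit link to the whole
        self.parts = []

    def add(self, part):
        part.whole = self
        self.parts.append(part)
        return part

    def is_part_of(self, other):
        """Transitive part-of: follow the chain of wholes upward."""
        whole = self.whole
        while whole is not None:
            if whole is other:
                return True
            whole = whole.whole
        return False

    @property
    def weight(self):
        """Systematic relation: weight of the whole sums the weights of its parts."""
        return self.own_weight + sum(p.weight for p in self.parts)

    @property
    def defective(self):
        """Inherited from part to whole."""
        return self._defective or any(p.defective for p in self.parts)

    @property
    def owner(self):
        """Inherited from whole to part, unless set on the part itself."""
        if self._owner is not None or self.whole is None:
            return self._owner
        return self.whole.owner

    @owner.setter
    def owner(self, value):
        self._owner = value


aircraft = Part("aircraft")
wing = aircraft.add(Part("wing assembly"))
spar = wing.add(Part("wing spar", own_weight=120.0, defective=True))
aircraft.owner = "KLM"
print(spar.is_part_of(aircraft), aircraft.defective, spar.owner, aircraft.weight)
```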
Ontology mappings
[Figure: mappings between ontologies]
animal+image description ontology
  animal description: geographical location, terrain type
species ontology
  species: geographical range, typical habitats
WordNet
geographical standard
geographical location: continent Asia, country Indonesia, region Java, city Bandung
terrain type: rain forest, savanna, pampa, tundra, taiga
Guidelines for ontological engineering (1)
• Do not develop from scratch
• Use existing data models and domain standards as starting point
• Start with constructing an ontology of common concepts
• If many data models, start with two typical ones
• Make the purpose and context of the ontology explicit
– E.g. data exchange between ship designers and assessors
– Operationalize purpose/context with use cases
• Use multiple hierarchies to express different viewpoints on classes
• Consider treating central relationships as classes
Guidelines for ontological engineering (2)
• Do not confuse terms and concepts
• Small ontologies are fine, as long as they meet their goal
• Don’t be overly ambitious: complete unified models are difficult
• Ontologies represent static aspects of a domain
– Do not include work flow
• Use a standard representation format, preferably with a possibility for graphical representation
• Decide about the abstraction level of the ontology early on in the process.
– E.g., ontology only as meta model
Ontology tools
Some well-known tools
• Protégé (Stanford)
• OntoEdit (now: OI Modeller / KAON)
• OilEd (Manchester)
Decision points:
– Expressivity
– Graphical representation
– DB backend
– Modularization support
– Versioning
Small ontology construction example
Source: M. Fowler, “Analysis Patterns”
Translated into UML
Goal: conceptual model for observations in medical practice
A simple representation
The notion of quantity
John has a height of 185 (unit = cm)
Unit conversion
Inches can be converted into centimeters by multiplying by 2.54
Degrees Celsius can be converted into Fahrenheit with the formula F = 32 + 9C/5
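The notion of quantity and the two conversion rules can be sketched in Python as follows. This is an illustrative sketch, not the model from the slides: the names `Quantity`, `CONVERSIONS` and `convert`, and the unit strings, are hypothetical, and only the two conversions mentioned in the text are included.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Quantity:
    amount: float
    unit: str


# Each conversion maps (from_unit, to_unit) to a function on the amount.
CONVERSIONS = {
    ("inch", "cm"): lambda x: x * 2.54,
    ("celsius", "fahrenheit"): lambda c: 32 + 9 * c / 5,
}


def convert(quantity, to_unit):
    """Convert a quantity to another unit, if a conversion rule is known."""
    if quantity.unit == to_unit:
        return quantity
    try:
        rule = CONVERSIONS[(quantity.unit, to_unit)]
    except KeyError:
        raise ValueError(f"no conversion from {quantity.unit} to {to_unit}") from None
    return Quantity(rule(quantity.amount), to_unit)


height = Quantity(185, "cm")
boiling = convert(Quantity(100, "celsius"), "fahrenheit")
print(height, boiling)
```

Keeping the unit inside the quantity means a bare number such as 185 is never stored without saying what it measures in.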
Introducing phenomenon types
For John (person) a height (phenomenon type) with a quantity of 185 (unit = cm) was measured on 11/11/2000 15:43 (time stamp)
Qualitative observations
• Qualitative observation: “category”
• Example: John has blood group A
• “Blood group” is a phenomenon type
• “Blood group A” is a phenomenon
• The fact that “Blood group A” is present for John is a category observation
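The distinction between quantitative and qualitative observations can be sketched with Python dataclasses. This is a minimal illustration under assumptions: persons are plain strings, and the class names (`PhenomenonType`, `Phenomenon`, `Measurement`, `CategoryObservation`) are hypothetical renderings of the terms used above, not the UML model shown in the talk.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class PhenomenonType:
    name: str


@dataclass(frozen=True)
class Phenomenon:
    name: str
    phenomenon_type: PhenomenonType


@dataclass(frozen=True)
class Measurement:
    """Quantitative observation: a quantity measured for a phenomenon type."""
    person: str
    phenomenon_type: PhenomenonType
    amount: float
    unit: str
    timestamp: datetime


@dataclass(frozen=True)
class CategoryObservation:
    """Qualitative observation: a phenomenon is present for a person."""
    person: str
    phenomenon: Phenomenon


height = PhenomenonType("height")
blood_group = PhenomenonType("blood group")

measurement = Measurement("John", height, 185, "cm", datetime(2000, 11, 11, 15, 43))
category = CategoryObservation("John", Phenomenon("blood group A", blood_group))
print(measurement.phenomenon_type.name, category.phenomenon.name)
```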
Qualitative and quantitative observations
Observation method and observer
Dr. Smith has observed the height of John by means of a length pole
Resources
• Web portals
– www.ontoweb.org
– www.semanticweb.org
• Articles, books on modelling:
– T. R. Gruber, Toward principles for the design of ontologies used for knowledge sharing. In: N. Guarino and R. Poli (eds.), Formal Ontology in Conceptual Analysis and Knowledge Representation. Boston, Kluwer, 1994.
– J. Martin & J. Odell, Object-Oriented Methods: A Foundation. UML edition. Upper Saddle River, NJ, Prentice Hall, 1997.
– M. Fowler, Analysis Patterns: Reusable Object Models. Menlo Park, CA, Addison-Wesley, 1997.