
Ontology Engineering: Tools and Methodologies

Ian Horrocks <[email protected]>

Information Management Group

School of Computer Science

University of Manchester


Tutorial Resources

http://www.cs.man.ac.uk/~horrocks/nsd07/


Ontologies

Ontology: Origins and History

• In Philosophy, fundamental branch of metaphysics

– Studies “being” or “existence” and their basic categories

– Aims to find out what entities and types of entities exist

Ontology in Information Science

• An ontology is an engineering artefact consisting of:

– A vocabulary used to describe (a particular view of) some domain

– An explicit specification of the intended meaning of the vocabulary

• Often includes classification-based information

– Constraints capturing background knowledge about the domain

• Ideally, an ontology should:

– Capture a shared understanding of a domain of interest

– Provide a formal and machine-manipulable model


Example Ontology (Protégé)


The Web Ontology Language OWL

OWL History

• Semantic Web led to requirement for a “web ontology language”

• W3C set up the Web-Ontology (WebOnt) Working Group

– WebOnt developed the OWL language

– OWL is based on the earlier languages RDF, OIL and DAML+OIL

– OWL is now a W3C recommendation (i.e., a standard)

• OWL is a family of 3 languages: OWL Lite, OWL DL and OWL Full

• OIL, DAML+OIL and OWL (DL & Lite) are based on Description Logics

– Many OWL DL/Lite tools & ontologies

– Relatively few OWL Full tools or ontologies

What Are Description Logics?

• A family of logic based Knowledge Representation formalisms

– Descendants of semantic networks and KL-ONE

– Describe domain in terms of concepts (classes), roles (properties, relationships) and individuals

– Operators allow for composition of complex concepts

– Names can be given to complex concepts, e.g.:

HappyParent ≡ Parent ⊓ ∀hasChild.(Intelligent ⊔ Athletic)
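To make the HappyParent definition concrete, here is a minimal Python sketch (not part of the tutorial) that evaluates a concept expression over a small finite interpretation. The nested-tuple encoding of concepts, the function name `extension` and the sample individuals are all assumptions made for illustration; real DL reasoners do not enumerate interpretations like this.

```python
# Illustrative only: concepts are encoded as nested tuples (an assumed encoding).
#   "Parent"           -> atomic concept
#   ("and", C, D)      -> C ⊓ D
#   ("or", C, D)       -> C ⊔ D
#   ("all", role, C)   -> ∀role.C

def extension(concept, domain, concepts, roles):
    """Return the set of individuals in `domain` that belong to `concept`."""
    if isinstance(concept, str):
        return set(concepts.get(concept, set()))
    op = concept[0]
    if op == "and":
        return (extension(concept[1], domain, concepts, roles)
                & extension(concept[2], domain, concepts, roles))
    if op == "or":
        return (extension(concept[1], domain, concepts, roles)
                | extension(concept[2], domain, concepts, roles))
    if op == "all":
        _, role, filler = concept
        filler_ext = extension(filler, domain, concepts, roles)
        pairs = roles.get(role, set())
        # x qualifies when every role-successor of x is in the filler's extension.
        return {x for x in domain
                if all(y in filler_ext for (s, y) in pairs if s == x)}
    raise ValueError(f"unknown constructor: {op!r}")

# Hypothetical interpretation: four individuals, two parents.
domain = {"john", "mary", "bob", "sue"}
concepts = {"Parent": {"john", "bob"}, "Intelligent": {"mary"}, "Athletic": set()}
roles = {"hasChild": {("john", "mary"), ("bob", "sue")}}

happy_parent = ("and", "Parent",
                ("all", "hasChild", ("or", "Intelligent", "Athletic")))
happy_parents = extension(happy_parent, domain, concepts, roles)
print(sorted(happy_parents))
```

In this made-up interpretation only john qualifies, because bob's child sue is neither Intelligent nor Athletic.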

Why (Description) Logic?

• OWL exploits results of 15+ years of DL research

– Well defined (model theoretic) semantics

[Figure: semantic network with nodes Cat, Animal, Black, Felix, Mat and edges IS-A, has-color, sits-on; Quillian, 1967]

– Formal properties well understood (complexity, decidability)

[Cartoon: “I can’t find an efficient algorithm, but neither can all these famous people.” From Garey & Johnson. Computers and Intractability: A Guide to the Theory of NP-Completeness. Freeman, 1979.]

– Known reasoning algorithms

– Implemented systems (highly optimised)

[Logos: Pellet, KAON2, CEL]

Why the Strange Names?

• Description Logics are a family of KR formalisms

– Mainly distinguished by available operators

• Available operators indicated by letters in name, e.g.,

S : basic DL (ALC) plus transitive roles (e.g., ancestor ∈ R+)

H : role hierarchy (e.g., hasDaughter ⊑ hasChild)

O : nominals/singleton classes (e.g., {Italy})

I : inverse roles (e.g., isChildOf ≡ hasChild⁻)

N : number restrictions (e.g., ≥2 hasChild, ≤3 hasChild)

• Basic DL + role hierarchy + nominals + inverse + NR = SHOIN

– The basis for OWL-DL

• SHOIN is very expressive, but still decidable (just)

– Decidable ⇒ we can build reliable tools and reasoners
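The naming scheme above can be read as a simple lookup from letters to features. The following Python sketch is only an illustration of that reading: the table holds exactly the letters listed on the slide, and the names `FEATURES` and `describe` are assumptions introduced here.

```python
# Letter meanings exactly as listed on the slide; other letters are reported as unknown.
FEATURES = {
    "S": "basic DL (ALC) plus transitive roles",
    "H": "role hierarchy",
    "O": "nominals/singleton classes",
    "I": "inverse roles",
    "N": "number restrictions",
}

def describe(name):
    """Expand a DL name such as 'SHOIN' into its list of features."""
    return [FEATURES.get(letter, f"unknown letter {letter!r}") for letter in name]

shoin = describe("SHOIN")
for letter, feature in zip("SHOIN", shoin):
    print(f"{letter}: {feature}")
```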

Why (Description) Logic?

• Foundational research was crucial to design of OWL

– Informed Working Group decisions at every stage, e.g.:

• “Why not extend the language with feature x, which is clearly harmless?”

• “Adding x would lead to undecidability - see proof in […]”

Class/Concept Constructors

• C is a concept (class); P is a role (property); x is an individual name

• XMLS datatypes as well as classes in ∀P.C and ∃P.C

– Restricted form of DL concrete domains


Knowledge Base / Ontology Axioms

Knowledge Base / Ontology

• A TBox is a set of “schema” axioms (sentences), e.g.:

{Parent ⊑ Person ⊓ ≥1 hasChild,

HappyParent ≡ Parent ⊓ ∀hasChild.(Intelligent ⊔ Athletic)}

• An ABox is a set of “data” axioms (ground facts), e.g.:

{John:HappyParent,

John hasChild Mary}

• An OWL ontology is just a SHOIN KB

OWL RDF/XML Exchange Syntax

E.g., Parent ⊓ ∀hasChild.(Intelligent ⊔ Athletic):

<owl:Class>
  <owl:intersectionOf rdf:parseType="Collection">
    <owl:Class rdf:about="#Parent"/>
    <owl:Restriction>
      <owl:onProperty rdf:resource="#hasChild"/>
      <owl:allValuesFrom>
        <owl:unionOf rdf:parseType="Collection">
          <owl:Class rdf:about="#Intelligent"/>
          <owl:Class rdf:about="#Athletic"/>
        </owl:unionOf>
      </owl:allValuesFrom>
    </owl:Restriction>
  </owl:intersectionOf>
</owl:Class>
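As a rough check of how the RDF/XML fragment corresponds to the DL expression, the following Python sketch parses the fragment with the standard library and prints it back in DL notation. It is an illustration only: the enclosing `rdf:RDF` element and namespace declarations are added here so the fragment parses on its own, `parseType` is written as `Collection`, and the `render` function is a hypothetical helper that handles just the constructs appearing in this example.

```python
import xml.etree.ElementTree as ET

OWL = "http://www.w3.org/2002/07/owl#"
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# The slide's fragment, wrapped in rdf:RDF with namespace declarations (added here).
SNIPPET = f"""
<rdf:RDF xmlns:owl="{OWL}" xmlns:rdf="{RDF}">
  <owl:Class>
    <owl:intersectionOf rdf:parseType="Collection">
      <owl:Class rdf:about="#Parent"/>
      <owl:Restriction>
        <owl:onProperty rdf:resource="#hasChild"/>
        <owl:allValuesFrom>
          <owl:unionOf rdf:parseType="Collection">
            <owl:Class rdf:about="#Intelligent"/>
            <owl:Class rdf:about="#Athletic"/>
          </owl:unionOf>
        </owl:allValuesFrom>
      </owl:Restriction>
    </owl:intersectionOf>
  </owl:Class>
</rdf:RDF>
"""

def render(node):
    """Render the handful of OWL elements used above in DL notation."""
    tag = node.tag
    if tag == f"{{{OWL}}}Class":
        about = node.get(f"{{{RDF}}}about")
        if about is not None:
            return about.lstrip("#")
        return render(node[0])  # anonymous class: render its single child
    if tag == f"{{{OWL}}}intersectionOf":
        return " ⊓ ".join(render(child) for child in node)
    if tag == f"{{{OWL}}}unionOf":
        return "(" + " ⊔ ".join(render(child) for child in node) + ")"
    if tag == f"{{{OWL}}}Restriction":
        prop = node.find(f"{{{OWL}}}onProperty").get(f"{{{RDF}}}resource").lstrip("#")
        filler = node.find(f"{{{OWL}}}allValuesFrom")
        return f"∀{prop}.{render(filler[0])}"
    raise ValueError(f"unsupported element: {tag}")

root = ET.fromstring(SNIPPET)
dl_text = render(root[0])
print(dl_text)
```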


Ontology Reasoning

Why Ontology Reasoning?

• Given key role of ontologies in many applications, it is essential to provide tools and services to help users:

– Design and maintain high quality ontologies, e.g.:

• Meaningful: all named classes can have instances

• Correct: captures intuitions of domain experts

• Minimally redundant: no unintended synonyms

[Figure labels: Banana split, Banana sundae]

– Answer queries, e.g.:

• Find more general/specific classes

• Retrieve individuals/tuples matching a given query


Ontology Applications

e-Science

• E.g., Open Biomedical Ontologies Consortium (GO, MGED)

– Used, e.g., for “in silico” investigations relating theory and data

• E.g., relating data on phosphatases to (model of) biological knowledge

Medicine

• Building/maintaining terminologies such as Snomed, NCI, Galen and FMA

– Used, e.g., for semi-automated annotation of MRI images

[Figure: annotated brain image with labels Frontal Lobe, Temporal Lobe, Parietal Lobe, Occipital Lobe, Central Sulcus, Lateral Sulcus]

Organising Complex Information

• E.g., UN-FAO, NASA, Ordnance Survey, General Motors, Lockheed Martin, …

OWL Experiences and Directions

• Workshop at ESWC’07 (Innsbruck, Austria, 6-7 June)

• Brings together users, implementors and researchers

• Submissions include:

– Enterprise Integration (Mitre)

– Product development (Lockheed Martin)

– Role based access control (NASA)

– Healthcare (SNOMED)

– Agriculture and fisheries (UN Food & Agriculture Organization)

– Oral Medicine (Chalmers)

– …


Ontology Engineering

Ontology Engineering Tasks

• Typical tasks in Ontology Engineering:

– author concept descriptions

– refine the ontology

– manage errors

– integrate different ontologies

– (partially) reuse ontologies

• These tasks are highly challenging; need for:

– tool & infrastructure support

– design methodologies

Tools and Infrastructure

• Editors/environments

– OilEd, Protégé, Swoop, TopBraid Composer, Construct, Ontotrack, …

• Reasoning systems

– Cerebra, FaCT++, Kaon2, Pellet, Racer, …

[Logos: Pellet, KAON2, CEL]

• Design methodologies

– Modularity, foundational ontologies, etc.

[Figure: fragment of a foundational ontology taxonomy with nodes Entity, Endurant, Perdurant, Substantial, Quality, Event, Stative, Achievement, Accomplishment]


Development & Maintenance

Development Environments

• Most widely used free to download tools are

– Protégé (Stanford / Manchester); be sure to get v4.x

– Swoop (UMD / Clark & Parsia)

• Commercial tools include

– TopBraid, RacerPro, …

• Facilities typically include

– Range of display modes and editing features

– Visualisation

– Consistency and subsumption checking

• Useful extras may include

– Debugging and explanation

– Repair

– Integration and/or partitioning

http://protege.stanford.edu/

http://code.google.com/p/swoop/

Demo Ontologies

• GALEN

– http://www.cs.man.ac.uk/~horrocks/OWL/Ontologies/galen.owl

• NCI

– http://www.mindswap.org/2003/CancerOntology

• Tambis

– http://www.cs.man.ac.uk/~horrocks/OWL/Ontologies/tambis.owl

GALEN

• Ontology about medical terms and surgical procedures.

• Work started in the 90s within the OpenGALEN project.

• Main applications:

– Integration of clinical records, and

– decision support.

• GALEN:

– is very large (~35,000 concepts),

– is fairly expressive (SHIF description logic),

– has not been classified yet by any DL reasoner

• We will look at a smaller version, which:

– is still large (~3,000 concepts),

– is similarly expressive as full GALEN,

– was first classified by the FaCT system.

GALEN: The Ontology at a Glance

• Size:

– ~ 3,000 classes

– ~ 500 object properties

– no individuals or datatypes

• Expressivity

– ~350 General Concept Inclusion Axioms (GCIs).

– Concept constructors:

• Conjunction (intersectionOf)

• Existential restrictions (someValuesFrom)

– 150 functional properties

– 26 transitive properties


GALEN: The (Unclassified) Hierarchies

• The class hierarchy:

– Number of subsumption relations: 1,978

– Maximum depth of the tree: 13

– No multiple inheritance

• The property hierarchy:

– 4 properties with multiple inheritance

GALEN: Concept definitions and GCIs

Concept definition

– Axiom of the form A ≡ C with:

• A a concept name

• C a (possibly complex) concept

– A definition assigns a name A to a complex concept C

Some examples:

LungPathology ≡ PathologicalCondition ⊓ ∃locativeAttribute.Lung

RenalTransplant ≡ Transplanting ⊓ ∃actsOn.Kidney

GALEN: Concept definitions and GCIs

Inclusion axioms:

– Axioms of the form A ⊑ C:

• A is a concept name

• C is a possibly complex concept

– Represent an incomplete (“partial”) definition

• Examples:

XRayMachine ⊑ ImagingDevice

Candida ⊑ Fungus ⊓ ∃hasFunction.AerobicMetabolicProcess

• In GALEN, some of these can be very complex:

– check out the definitions of Knee Joint and Kidney!

GALEN: Concept definitions and GCIs

General Concept Inclusion Axioms (GCIs)

– Axioms of the form C ⊑ D

• C, D can be complex

• May describe general (background) knowledge about the ontology

Examples:

Secretion ⊓ ∃actsSpecificallyOn.Leucocidin ⊑ ∃isFunctionOf.StaphylococcusAureus

Transport ⊓ ∃actsOn.Glucose ⊓ ∃carriesFrom.Blood ⊑ ∃carriesTo.Cell

Classifying GALEN

Ontology statistics (revisited):

– Number of class subsumption relations: 6729

• 1978 of which are “told” and the rest inferred

– Maximum depth of the class tree: 15

• As opposed to 13 in the case of the unclassified tree

– Classes with multiple inheritance: 408

• All multiple inheritance relations have been inferred!

• This was intended in the design of GALEN

– Maximum depth of the property tree: 9

• No change with respect to the “told” tree

– Properties with multiple inheritance: 4

• Again, no change with respect to the “told” tree

Reasoning is mostly performed on classes and not on properties
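Statistics of this kind (number of subsumption relations, maximum depth, classes with multiple inheritance) can be computed from a list of direct subclass pairs. The Python sketch below shows one way to do so under stated assumptions: the function name `hierarchy_stats` is hypothetical, the hierarchy is assumed to be acyclic, and the class names A to D are placeholders rather than GALEN classes. It does not perform any reasoning; the "inferred" edges are simply given as input.

```python
from collections import defaultdict

def hierarchy_stats(subclass_edges):
    """subclass_edges: iterable of (child, parent) direct subsumption pairs (assumed acyclic)."""
    parents = defaultdict(set)
    classes = set()
    for child, parent in subclass_edges:
        parents[child].add(parent)
        classes.update((child, parent))

    depth_cache = {}

    def depth(cls):
        # Depth = length of the longest path from cls up to a root (roots have depth 0).
        if cls not in depth_cache:
            depth_cache[cls] = 1 + max((depth(p) for p in parents.get(cls, ())), default=-1)
        return depth_cache[cls]

    return {
        "subsumptions": sum(len(ps) for ps in parents.values()),
        "max_depth": max((depth(c) for c in classes), default=0),
        "multiple_inheritance": sum(1 for ps in parents.values() if len(ps) > 1),
    }

# Placeholder hierarchy: "told" edges, then the same edges plus two extra ones
# standing in for subsumptions that a reasoner might infer.
told = [("B", "A"), ("C", "B")]
classified = told + [("C", "D"), ("D", "A")]

told_stats = hierarchy_stats(told)
classified_stats = hierarchy_stats(classified)
print(told_stats)
print(classified_stats)
```

Comparing the two dictionaries mirrors the told-versus-classified comparison on the slide: in the placeholder data, multiple inheritance appears only after the extra edges are added.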

Modeling Choices

• The “upper” part:

– Composed of the domain-independent concepts and roles.

– Examples:

• TopCategory, DomainCategory, GeneralisedStructure…

– Shallowly defined (mostly a taxonomy)

• The “domain specific” part:

– Examples:

• Plant, LungPathology, …

– Richly defined

• Much more than just a taxonomy!


Inferred Knowledge

A trivial subsumption:

– Why is PathologicalCondition a subclass of DomainCategory?

• Simply look at the definition of Pathological Condition!

Another example:

– Why is PathologicalBehavior a subclass of PathologicalCondition?

• Look at the definition of both classes

• Notice that Behavior is a subclass of DomainCategory

A non-trivial subsumption:

– Why is AchalasiaProcesses a PathologicalBodyProcesses?


Classifying GALEN

• Simple and multiple inheritance

– Focus, for example, on PathologicalBodyProcess

– Navigate to its super-classes

– Visualisation can be useful

• In Swoop we can “Fly the mother ship”!


The NCI Ontology

• Huge bio-medical ontology describing the Cancer domain

• Maintained by dozens of domain experts

• Contains information about:

– genes,

– diseases,

– drugs,

– research institutions, …

All with a cancer-centric focus

NCI: The Ontology at a Glance

• Size:

– ~ 30,000 classes

– ~ 70 object properties

– no individuals or datatypes

• Expressivity

– Concept constructors:

• Conjunction (intersectionOf)

• Existential restrictions (someValuesFrom)

– Axioms:

• Definitions (no GCIs)

• Domain and range of properties

NCI: The (Unclassified) Hierarchies

• The class hierarchy:

– Number of subsumption relations: 103,232

– Maximum depth of the tree: 19

– Classes with multiple inheritance: 4636

– Browse through it!

• The property hierarchy:

– No properties with multiple inheritance

– Browse through it!

Axioms in NCI

Examples:

Cancer_Gene ⊑ Gene ⊓ ∃hasFunction.Tumoregenesis

Alzheimer_Disease v Dementia

Domain(rAnatomic_Structure_Has_Location) = Anatomy_Kind

Range(rTechnique_Has_Purpose) = Clinical_Or_Research_Activity_Kind


The NCI Kinds

• “Upper” concepts representing the sub-domains of NCI

• Examples:

– Anatomy.

– Biological processes.

– Chemicals and drugs.

– Organisms …

• Properties relating the Kinds


NCI

• Partitioning and crop-circles view of the partitioning

• Gives an intuition about the different sub-domains in NCI, which ones are central, and which ones are “side” domains


NCI and GALEN

• The domains of NCI and GALEN overlap. Both ontologies define concepts such as:

– Anatomical parts: bone, tissue, etc.

– Diseases

– Organisms,…

• Example:

– Check out how Femur is defined in NCI and GALEN

– Different modeling decisions and focus of interest


Tambis

• TAMBIS is a medical ontology constructed during the early days of the Web.

• The intended application was the integrated access to information in a set of databases.

• The OWL version was generated from the old format using a (buggy) script.

Tambis: The Ontology at a Glance

• Size:

– ~ 400 classes

– ~ 100 object properties

– no individuals or datatypes

• Expressivity

– No General Concept Inclusion Axioms.

– Concept constructors:

• Conjunction (intersectionOf)

• Disjunction (unionOf)

• Existential restrictions (someValuesFrom)

• Universal restriction (allValuesFrom)

• Cardinality restrictions

– Axioms

• Definitions (complete and partial)

• Transitive, functional, symmetric and inverse properties


Tambis: the (unclassified) hierarchies

• Subclass relationships: 226

• No multiple inheritance

• Maximum depth of class tree: 6

• Maximum depth of property tree: 2

Tambis: Example Axioms

• Tambis uses cardinality restrictions profusely

– See definition of anion

• Use of disjunction

– See definition of atom

• Use of universal restrictions

– See definition of book-title

• Use of complex nested restrictions

– See definition of complement-dna

– See definition of gene

• Disjointness axioms

– See definitions of metal, non-metal and metalloid


Tambis: Classification

• Subclass relationships: 600

– compared to 226

• Classes with multiple inheritance: 19

– compared to none

• Maximum depth of class tree: 7

– compared to 6

• Maximum depth of property tree: 2

• 144 unsatisfiable concepts!

Tambis: Unsatisfiable concepts

• Almost half of the concepts in Tambis are unsatisfiable

• The explanations are non-trivial

– E.g., protein-structure and macromolecular-part

• Distinguishing root and derived unsatisfiable classes:

– derived unsatisfiable classes are unsatisfiable because they depend on another unsatisfiable concept.

• definition of Enzyme,

• definition of Binding-site

– root unsatisfiable classes contain an “inherent” contradiction

• definition of Metal,

• definition of Non-metal,

• definition of Metalloid
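The root/derived distinction can be approximated with a simple dependency check: an unsatisfiable class is treated as derived if its definition mentions another unsatisfiable class, and as root otherwise. The Python sketch below implements only this simplification. The function name `split_unsatisfiable`, the `references` map and the class names are hypothetical and are not taken from Tambis or from any debugging tool; classes that depend on each other in a cycle would all be reported as derived.

```python
def split_unsatisfiable(unsat, references):
    """Partition unsatisfiable classes into (root, derived).

    unsat:      set of class names reported unsatisfiable by a reasoner
    references: dict mapping a class name to the class names used in its definition
    """
    root, derived = set(), set()
    for cls in unsat:
        # Other unsatisfiable classes that this class's definition depends on.
        depends_on_unsat = (set(references.get(cls, ())) & unsat) - {cls}
        (derived if depends_on_unsat else root).add(cls)
    return root, derived

# Made-up example: RootA contradicts itself; DerivedB is defined in terms of RootA.
unsat = {"RootA", "DerivedB"}
references = {
    "RootA": {"SatisfiableX", "SatisfiableY"},
    "DerivedB": {"RootA", "SatisfiableX"},
}

root, derived = split_unsatisfiable(unsat, references)
print("root:", sorted(root))
print("derived:", sorted(derived))
```

Under this reading, repairing the root classes first is the natural strategy, since derived classes may become satisfiable once the classes they depend on are fixed.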


Advanced Issues and Design Patterns

Qualified Number Restrictions (QCRs)

• Existential restrictions in OWL DL are qualified:

– Person ⊓ ∃hasChild.Male

• Cardinality restrictions can only be qualified with ⊤

– Person ⊓ ≥2 hasChild

• The lack of QCRs has been identified as a major limitation of OWL, especially in biomedical applications:

– A quadruped is an animal with exactly four parts that are legs

– A medical oversight committee is a committee which consists of at least five members of which two are medical doctors, one is a manager and two are members of the public.

Qualified Cardinality Restrictions

Can be approximated using property inclusion and property range.

Quadruped ≡ Animal ⊓ (=4 hasLeg)

hasLeg ⊑ hasPart

Range(hasLeg) = Leg

Qualified Cardinality Restrictions

This approximation is unsound in general:

MedicalCommittee ≡ Committee ⊓ (=3 hasMember) ⊓ ≤1 hasMember.MD ⊓ ≤1 hasMember.¬MD

Approximated by:

MedicalCommittee ≡ (=3 hasMember) ⊓ ≤1 hasMDMember ⊓ ≤1 hasNotMDMember

hasMDMember ⊑ hasMember

hasNotMDMember ⊑ hasMember

Range(hasMDMember) = MD

Range(hasNotMDMember) = ¬MD
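One way to see the unsoundness is by brute force over a single committee with three distinct members. The Python sketch below is an illustration under stated assumptions: it reads the garbled restriction symbol as "at most" (≤), and the function names and member names are hypothetical. With the qualified restrictions, every member is either an MD or not, so no assignment keeps both groups at size at most one; with the approximation, members need not be linked through either sub-property, so the constraints are trivially satisfiable.

```python
from itertools import product

MEMBERS = ("m1", "m2", "m3")  # exactly three hasMember successors (assumed distinct)

def original_ok(is_md):
    """≤1 hasMember.MD and ≤1 hasMember.¬MD, counted over all members."""
    md_count = sum(is_md.values())
    return md_count <= 1 and (len(is_md) - md_count) <= 1

def approximation_ok(is_md, md_links, not_md_links):
    """≤1 hasMDMember and ≤1 hasNotMDMember, respecting the declared ranges."""
    ranges_ok = (all(is_md[m] for m in md_links)
                 and all(not is_md[m] for m in not_md_links))
    return ranges_ok and len(md_links) <= 1 and len(not_md_links) <= 1

# Every way of deciding which of the three members is a medical doctor.
assignments = [dict(zip(MEMBERS, bits)) for bits in product((True, False), repeat=3)]

original_models = [a for a in assignments if original_ok(a)]
# In the approximation, no member has to be linked via either sub-property.
approx_models = [a for a in assignments
                 if approximation_ok(a, md_links=set(), not_md_links=set())]

print(len(original_models), len(approx_models))
```

The original definition admits no model in this setting, while the approximation admits all of them, so a reasoner working with the approximation would miss the unsatisfiability.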

Transitive Propagation of Properties

• In OWL, we can express transitive propagation of a property:

– If Paris is located in France and France is located in Europe, then Paris is located in Europe.

– If the hand is a part of the arm and the arm is part of the human body, then the hand is a part of the human body.

• In OWL, however, we cannot express transitive propagation of a property along a different property:

– If an ulcer is located in the gastric mucosa and the gastric mucosa is a part of the stomach, then the ulcer is located in the stomach.

– If a burn is located in the foot and the foot is part of the leg, then the burn is located in the leg.
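The intended inference in the ulcer and burn examples can be sketched outside OWL as a fixpoint computation over pairs of individuals. The Python code below is a minimal illustration, not a description of how any reasoner works; the function name `propagate` and the sample facts are assumptions based on the examples above.

```python
def propagate(located_in, part_of):
    """Close located_in under: located_in(x, y) and part_of(y, z) imply located_in(x, z)."""
    closed = set(located_in)
    changed = True
    while changed:
        changed = False
        for (x, y) in list(closed):
            for (y2, z) in part_of:
                if y == y2 and (x, z) not in closed:
                    closed.add((x, z))
                    changed = True
    return closed

# Facts taken from the two examples on the slide.
located_in = {("ulcer", "gastric_mucosa"), ("burn", "foot")}
part_of = {("gastric_mucosa", "stomach"), ("foot", "leg")}

inferred = propagate(located_in, part_of)
for pair in sorted(inferred):
    print(pair)
```

Because the loop repeats until nothing changes, longer part-of chains are followed as well, which is the behaviour a property-chain rule of this shape would give.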

Transitive Propagation of Properties

Various patterns that approximate transitive propagation have been proposed and used in ontologies.

• Use of the property hierarchy and transitivity:

Part_Of ⊑ Located_In

Transitive(Part_Of)

• This pattern may yield undesired results, since part-whole relations may not always imply location:

– The orange peel is part of the orange, but is it located in the orange?


Design Methodologies


Modularity in Software Engineering

Typically referred to as the extent to which software is divided into components with:

– high internal cohesion

– controlled coupling between each other through simple interfaces (encapsulation)

Benefits of modular software design:

– software maintainability

– software understandability


Modularity in Ontology Engineering

Benefits of a modular ontology design: to simplify

• ontology refinement/update

modifying a module should not lead to modifications in parts of the ontology that are not conceptually related

• understanding

relationships between different modules in an ontology are controlled and well-understood

• integration with other ontologies

no unexpected consequences

• partial reuse

reuse only the relevant part/module of an ontology

Q

1 CysticFibrosis ⊑ Fibrosis ⊓ ∃locatedIn.Pancreas ⊓ ∃hasOrigin.GeneticOrigin

2 GeneticFibrosis ⊑ Fibrosis ⊓ ∃hasOrigin.GeneticOrigin

3 Fibrosis ⊓ ∃locatedIn.Pancreas ⊑ GeneticFibrosis

4 GeneticFibrosis ⊑ GeneticDisorder

P

1 GenDisorderProject = Project ⊓ ∃hasFocus.GeneticDisorder

2 CysticFibProject = Project ⊓ ∃hasFocus.CysticFibrosis

3 ∃hasFocus.⊤ ⊑ Project

4 Project ⊓ (GeneticFibrosis ⊓ GeneticDisorder) ⊑ ⊥

5 ∀hasFocus.CysticFibrosis ⊑ ∃hasFocus.GeneticDisorder

Q ⊨ CysticFibrosis ⊑ GeneticDisorder

P ∪ Q ⊨ ⊤ ⊑ ∃hasFocus.⊤

P ∪ Q ⊨ ⊤ ⊑ Project

P ∪ Q ⊨ GeneticFibrosis ⊔ GeneticDisorder ⊑ ⊥

P ∪ Q ⊨ CysticFibProject ⊑ GenDisorderProject

Foundational Ontologies

• E.g., DOLCE

Recent Work and Research Challenges

Increasing Expressive Power

• Complex role inclusion axioms [Horrocks, Kutz & Sattler, KR-06]

– E.g., hasLocation ∘ partOf ⊑ hasLocation

• Concrete domains/datatypes, e.g., [Lutz, IJCAI-99; Pan et al, ISWC-03]

– E.g., value comparison (income > expenditure)

• OWL 1.1 (see http://webont.org/owl/1.1/)

– Syntactic sugar to make commonly-stated things easier to say

– New class & property constructors

– Expanded datatype expressiveness

– Meta-modelling constructs

– Semantic-free comments

– Now a W3C Member Submission


• Database style keys [Lutz et al, JAIR 2004]

– E.g., make + model + chassis-number is a key for Vehicles

• Rule language extensions

– W3C RIF WG (see http://www.w3.org/2005/rules/)

– First order extensions (e.g., SWRL) [Horrocks et al, JWS, 2005]

– Hybrid language extensions, e.g., [Eiter et al, KR-04; Motik et al, ISWC-04; Rosati, JoWS, 2005]

– LP/F-Logic/Common Logic [Chen et al, JLP, 1993; de Bruijn et al, WWW-05]


Improving Scalability

• Optimisation techniques

– Improve performance of DL reasoners, e.g., [Sirin et al, KR-06]

• Reduction to disjunctive Datalog [Motik et al, KR-04]

– Transform SHOIN ontology to Datalog∨ rules

– Use LP techniques to deal with large numbers of ground facts

• Hybrid DL-DB systems [Horrocks et al, CADE-05]

– Use DB to store “Abox” (individual) axioms

– Cache inferences and use DB queries to answer/scope logical queries

• Polynomial time algorithms for sub-ALC logics

– Graph based techniques for EL+ [Baader et al, IJCAI-05]

– Database techniques for DL-Lite [Calvanese et al, AAAI-05]

Summary

• OWL Ontologies provide vocabulary for annotations

– Terms have well defined meaning

• OWL now being used in a wide range of applications

– e-Science, medicine, geography, geology, …

• Reasoning enabled tools are of crucial importance

– For both design and deployment of ontologies

• Large and extremely active R&D area

– New and improved tools & methodologies constantly appearing

• Research challenges remain

– But tools now mature enough for “prime time” applications


Acknowledgements

Thanks to my many friends in the DL and Semantic Web communities, in particular:

– Alan Rector

– Franz Baader

– Uli Sattler

– The Swoop/Pellet team:

• Aditya Kalyanpur

• Evren Sirin

• Bernardo Cuenca Grau

• Bijan Parsia

Resources:

• FaCT++ system (open source)

– http://owl.man.ac.uk/factplusplus/

• OWL

– http://www.w3.org/TR/owl-features/

• OWL Experiences and Directions Workshop

– http://owled2007.iut-velizy.uvsq.fr/

• Protégé

– http://protege.stanford.edu/plugins/owl/

• OWL 1.1 Proposal

– http://webont.org/owl/1.1/

Any questions?

Thank you for listening