Transcript
Page 1: Ontology Mapping - Out Of The Babel Tower

Ontology mapping: a way out of

the medical tower of Babel?

Frank van Harmelen, Vrije Universiteit Amsterdam

The Netherlands Antilles

Page 2: Ontology Mapping - Out Of The Babel Tower

Before we start…

a talk on ontology mappings is a difficult talk to give: no consensus in the field

• on merits of the different approaches
• on classifying the different approaches

no one can speak with authority on the solution

this is a personal view, with a sell-by date

other speakers will entirely disagree (or disapprove)

Page 3: Ontology Mapping - Out Of The Babel Tower

Good overviews of the topic

Knowledge Web D2.2.3: “State of the art on ontology alignment”

Ontology Mapping Survey, talk by Siyamed Seyhmus SINIR

ESWC'05 Tutorial on Schema and Ontology Matching by Pavel Shvaiko and Jerome Euzenat

KER 2003 paper by Kalfoglou & Schorlemmer

These are all different & incompatible…

Page 4: Ontology Mapping - Out Of The Babel Tower

Ontology mapping: a way out of

the medical tower of Babel?

Page 5: Ontology Mapping - Out Of The Babel Tower

The Medical tower of Babel

MeSH
• Medical Subject Headings, National Library of Medicine
• 22.000 descriptions

EMTREE
• Commercial Elsevier, Drugs and diseases
• 45.000 terms, 190.000 synonyms

UMLS
• Integrates 100 different vocabularies

SNOMED
• 200.000 concepts, College of American Pathologists

Gene Ontology
• 15.000 terms in molecular biology

NCI Cancer Ontology
• 17.000 classes (about 1M definitions)

Page 6: Ontology Mapping - Out Of The Babel Tower

Ontology mapping: a way out of

the medical tower of Babel?

Page 7: Ontology Mapping - Out Of The Babel Tower

What are ontologies & what are they used for

no shared understanding
Conceptual and terminological confusion
Actors: both humans and machines

Agree on a conceptualization
Make it explicit in some language.

[Figure labels: world, concept, language]

Page 8: Ontology Mapping - Out Of The Babel Tower

Ontologies come in very different kinds

From lightweight to heavyweight:
• Yahoo topic hierarchy
• Open directory (400.000 general categories)
• Cyc, 300.000 axioms

From very specific to very general
• METAR code (weather conditions at air terminals)
• SNOMED (medical concepts)
• Cyc (common sense knowledge)

Page 9: Ontology Mapping - Out Of The Babel Tower

What’s inside an ontology?

(in order of increasing semantic “weight”)
• terms + specialisation hierarchy
• classes + class-hierarchy
• instances
• slots/values
• inheritance (multiple? defaults?)
• restrictions on slots (type, cardinality)
• properties of slots (symm., trans., …)
• relations between classes (disjoint, covers)
• reasoning tasks: classification, subsumption

Page 10: Ontology Mapping - Out Of The Babel Tower

In short (for the duration of this talk)

Ontologies are not definitive descriptions of what exists in the world (= philosophy)

Ontologies are models of the world, constructed to facilitate communication

Yes, ontologies exist (because we build them)

Page 11: Ontology Mapping - Out Of The Babel Tower

Ontology mapping: a way out of

the medical tower of Babel?

Page 12: Ontology Mapping - Out Of The Babel Tower

Ontology mapping is old & inevitable

Ontology mapping is old
• db schema integration
• federated databases

Ontology mapping is inevitable
• ontology language is standardised,
• don't even try to standardise contents

Page 13: Ontology Mapping - Out Of The Babel Tower

Ontology mapping is important

database integration, heterogeneous database retrieval (traditional)
catalog matching (e-commerce)
agent communication (theory only)
web service integration (urgent)
P2P information sharing (emerging)
personalisation (emerging)

Page 14: Ontology Mapping - Out Of The Babel Tower

Ontology mapping is now urgent

Ontology mapping has acquired new urgency
• physical and syntactic integration is ± solved (open world, web)
• automated mappings are now required (P2P)
• shift from off-line to run-time matching

Ontology mapping has new opportunities
• larger volumes of data
• richer schemas (relational vs. ontology)
• applications where partial mappings work

Page 15: Ontology Mapping - Out Of The Babel Tower

Different aspects of ontology mapping

how to discover a mapping
how to represent a mapping
• subset/equal/disjoint/overlap/is-somehow-related-to
• logical/equational/category-theoretical
• atomic/complex arguments, confidence measure
how to use it

We only talk about “how to discover”

Page 16: Ontology Mapping - Out Of The Babel Tower

Many experimental systems: (non-exhaustive!)

Prompt (Stanford SMI)
Anchor-Prompt (Stanford SMI)
Chimaera (Stanford KSL)
Rondo (Stanford U./ULeipzig)
MoA (ETRI)
Cupid (Microsoft research)
Glue (Uof Washington)
FCA-merge (UKarlsruhe)
IF-Map
Artemis (UMilano)
T-tree (INRIA Rhone-Alpes)
S-MATCH (UTrento)
Coma (ULeipzig)
Buster (UBremen)
MULTIKAT (INRIA S.A.)
ASCO (INRIA S.A.)
OLA (INRIA R.A.)
Dogma's Methodology
ArtGen (Stanford U.)
Alimo (ITI-CERTH)
Bibster (UKarlsruhe)
QOM (UKarlsruhe)
KILT (INRIA LORRAINE)

Page 17: Ontology Mapping - Out Of The Babel Tower

Different approaches to ontology matching

Linguistics & structure

Shared vocabulary

Instance-based matching

Shared background knowledge

Page 18: Ontology Mapping - Out Of The Babel Tower

Linguistic & structural mappings

normalisation (case,blanks,digits,diacritics)

lemmatization, N-grams, edit-distance, Hamming distance

distance = fraction of common parents

elements are similar if their parents/children/siblings are similar

decreasing order of boredom
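A minimal sketch of three of the techniques named on this slide: label normalisation, edit distance, and a parent-overlap measure. The function names are illustrative, and reading "fraction of common parents" as a Jaccard ratio is an assumption, not something the slide specifies.

```python
import unicodedata


def normalise(label: str) -> str:
    """Lower-case, strip diacritics, drop digits, collapse blanks."""
    decomposed = unicodedata.normalize("NFKD", label)
    no_marks = "".join(c for c in decomposed if not unicodedata.combining(c))
    kept = "".join(c for c in no_marks.lower() if not c.isdigit())
    return " ".join(kept.split())


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance, computed row by row."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,               # delete ca
                current[j - 1] + 1,            # insert cb
                previous[j - 1] + (ca != cb),  # substitute (free if equal)
            ))
        previous = current
    return previous[-1]


def parent_overlap(parents_a: set, parents_b: set) -> float:
    """Assumed reading of 'fraction of common parents': Jaccard ratio."""
    union = parents_a | parents_b
    return len(parents_a & parents_b) / len(union) if union else 0.0
```

In practice the string measures would run on normalised labels, for example `edit_distance(normalise(x), normalise(y))`, with a threshold chosen per data set.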

Page 19: Ontology Mapping - Out Of The Babel Tower

Different approaches to ontology matching

Linguistics & structure

Shared vocabulary

Instance-based matching

Shared background knowledge

Page 20: Ontology Mapping - Out Of The Babel Tower

Matching through shared vocabulary

Low(Q) ⊆ Q ⊆ Up(Q)

[Figure: concept Q enclosed between its lower bound Low(Q) and its upper bound Up(Q)]
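One way to picture the bounds Low(Q) and Up(Q) is with concepts modelled as sets of instances. The sketch below assumes that Low(Q) is the union of the shared-vocabulary concepts contained in Q and that Up(Q) is the intersection of those containing Q; the slide does not give these definitions, and the function name and example data are hypothetical.

```python
def approximate(q: set, shared: dict) -> tuple:
    """Return (low, up) bounds of q built from shared-vocabulary extensions.

    shared maps a shared term to its set of instances (an assumed encoding).
    up is None when no shared term contains q, i.e. only the top concept does.
    """
    below = [ext for ext in shared.values() if ext <= q]
    above = [ext for ext in shared.values() if q <= ext]
    low = set().union(*below)
    up = set.intersection(*above) if above else None
    return low, up


# Hypothetical extensions: instances are just numbered items.
shared_terms = {"Vessel": {1, 2, 3, 4}, "Aorta": {1}, "Vein": {4}}
low, up = approximate({1, 2}, shared_terms)
```

With this data the lower bound is the extension of "Aorta" and the upper bound that of "Vessel", so the query concept is sandwiched between terms both parties understand.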

Page 21: Ontology Mapping - Out Of The Babel Tower

Matching through shared vocabulary

Used in mapping geospatial databases from German land-registration authorities (small)

Used in mapping bio-medical and genetic thesauri (large)

Page 22: Ontology Mapping - Out Of The Babel Tower

Different approaches to ontology matching

Linguistics & structure

Shared vocabulary

Instance-based matching

Shared background knowledge

Page 23: Ontology Mapping - Out Of The Babel Tower

Matching through shared instances

Page 24: Ontology Mapping - Out Of The Babel Tower

Matching through shared instances

Used by Ichise et al (IJCAI’03) to successfully map parts of Yahoo to parts of Google

Yahoo = 8402 classes, 45.000 instances
Google = 8343 classes, 82.000 instances
Only 6000 shared instances
70% - 80% accuracy obtained (!)

Conclusions from authors:
• semantics is needed to improve on this ceiling
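The idea of matching through shared instances can be sketched as follows: two classes are candidates for a match when the sets of instances filed under them overlap strongly. The Jaccard ratio and the 0.5 threshold are illustrative choices, not necessarily the measure used in the cited work, and the class names and URLs are made up.

```python
def jaccard(a: set, b: set) -> float:
    """Size of the intersection divided by size of the union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def match_by_instances(source: dict, target: dict, threshold: float = 0.5) -> list:
    """Pair classes whose instance sets overlap at least `threshold`.

    source/target map a class name to the set of instances filed under it.
    Returns (source_class, target_class, score) tuples, best score first.
    """
    pairs = []
    for s_name, s_inst in source.items():
        for t_name, t_inst in target.items():
            score = jaccard(s_inst, t_inst)
            if score >= threshold:
                pairs.append((s_name, t_name, score))
    return sorted(pairs, key=lambda p: -p[2])


# Hypothetical directory fragments; instances are page identifiers.
dir_a = {"Recreation/Autos": {"u1", "u2", "u3"}, "Science/Biology": {"u7"}}
dir_b = {"Autos": {"u1", "u2", "u4"}, "Biology": {"u7", "u8"}}
candidates = match_by_instances(dir_a, dir_b)
```

Only instances present in both directories contribute to the score, which is why a small shared set, as in the figures above, limits what this approach can recover.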

Page 25: Ontology Mapping - Out Of The Babel Tower

Different approaches to ontology matching

Linguistics & structure

Shared vocabulary

Instance-based matching

Shared background knowledge

Page 26: Ontology Mapping - Out Of The Babel Tower

Matching using shared background knowledge

[Figure: ontology 1 and ontology 2, each linked to shared background knowledge]

Page 27: Ontology Mapping - Out Of The Babel Tower

Ontology mapping using background knowledge: Case study 1

Work with Zharko Aleksovski @ Philips, Michel Klein @ VU, KIK @ AMC

Page 28: Ontology Mapping - Out Of The Babel Tower

Overview of test data

Two terminologies from intensive care domain

OLVG list
• List of reasons for ICU admission

AMC list
• List of reasons for ICU admission

DICE hierarchy
• Additional hierarchical knowledge describing the reasons for ICU admission

Page 29: Ontology Mapping - Out Of The Babel Tower

OLVG list

developed by clinician
3000 reasons for ICU admission
1390 used in first 24 hours of stay
• 3600 patients since 2000
based on ICD9 + additional material
List of problems for patient admission
Each reason for admission is described with one label
• Labels consist of 1.8 words on average
• redundancy because of spelling mistakes
• implicit hierarchy (e.g. many fractures)

Page 30: Ontology Mapping - Out Of The Babel Tower

AMC list

List of 1460 problems for ICU admission
Each problem is described using 5 aspects from the DICE terminology:
2500 concepts (5000 terms), 4500 links
• Abnormality (size: 85)
• Action taken (size: 55)
• Body system (size: 13)
• Location (size: 1512)
• Cause (size: 255)
expressed in OWL
allows for subsumption & part-of reasoning

Page 31: Ontology Mapping - Out Of The Babel Tower

Why map AMC list ↔ OLVG list?

allow easy entering of OLVG data
re-use of data in
• epidemiology
• quality of care assessment
• data-mining (patient prognosis)

Page 32: Ontology Mapping - Out Of The Babel Tower

Linguistic mapping:

Compare each pair of concepts
Use labels and synonyms of concepts
Heuristic method to discover equivalence and subclass relations

[Figure: “Long brain tumor” is more specific than “Long tumor”]

First round
• compare with complete DICE
• 313 suggested matches, around 70% correct

Second round:
• only compare with “reasons for admission” subtree
• 209 suggested matches, around 90% correct

High precision, low recall (“the easy cases”)
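A label heuristic of this kind could look roughly like the sketch below: identical word sets suggest equivalence, and a label that adds qualifying words to another suggests a subclass. This rule is an assumption chosen for illustration; the slides do not spell out the heuristic that was actually used, and `relate` is a hypothetical name.

```python
def relate(label_a: str, label_b: str):
    """Guess the relation of label_a to label_b from their word sets.

    Returns "equivalent", "more specific than", "more general than",
    or None when neither word set contains the other.
    """
    a = set(label_a.lower().split())
    b = set(label_b.lower().split())
    if a == b:
        return "equivalent"
    if b < a:  # a carries every word of b plus extra qualifiers
        return "more specific than"
    if a < b:
        return "more general than"
    return None


relation = relate("Long brain tumor", "Long tumor")
```

Such a rule only fires when labels share their exact words, which is consistent with the high precision and low recall reported above.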

Page 33: Ontology Mapping - Out Of The Babel Tower

Using background knowledge

Use properties of concepts
Use other ontologies to discover relation between properties

[Figure: two lists of concepts with an unknown (?) relation between them]

Page 34: Ontology Mapping - Out Of The Babel Tower

DICE aspect taxonomies: Action taxonomy, Abnormality taxonomy, Body system taxonomy, Location taxonomy, Cause taxonomy

[Figure: the OLVG problem list and the DICE problem list, linked through the DICE aspect taxonomies. Edge labels: Given; Lexical match; Implicit matching: property match; Semantic match (?)]

Page 35: Ontology Mapping - Out Of The Babel Tower

Semantic match

Taxonomy of body parts:
• Blood vessel is more general than Artery and Vein
• Artery is more general than Aorta

“Aorta thoracalis dissection”: lexical match, has location Aorta
“Dissection of artery”: lexical match, has location Artery

Location match: has more general location

Reasoning: implies
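The reasoning step in this example can be sketched with a small broader-term table standing in for the body-part taxonomy. The dictionary encoding and the function names are assumptions for illustration, and the sketch compares locations only, taking the other aspects (here the abnormality, dissection) to be equal.

```python
# child -> parent ("is more general") edges, taken from the slide's taxonomy
BROADER = {"Aorta": "Artery", "Artery": "Blood vessel", "Vein": "Blood vessel"}


def is_more_general(general: str, specific: str, broader: dict = BROADER) -> bool:
    """True if `general` is a proper ancestor of `specific` (acyclic table assumed)."""
    while specific in broader:
        specific = broader[specific]
        if specific == general:
            return True
    return False


def semantic_match(location_a: str, location_b: str):
    """Relate two problems by their locations, other aspects assumed equal."""
    if location_a == location_b:
        return "equivalent"
    if is_more_general(location_b, location_a):
        return "a implies b"  # a has the more specific location
    if is_more_general(location_a, location_b):
        return "b implies a"
    return None


# "Aorta thoracalis dissection" (Aorta) vs. "Dissection of artery" (Artery)
result = semantic_match("Aorta", "Artery")
```

Here the first problem implies the second, because Artery is a more general location than Aorta.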

Page 36: Ontology Mapping - Out Of The Babel Tower

Example: “Heroin intoxication” – “drugs overdose”

Cause taxonomy: Drugs is more general than Heroin
Abnormality taxonomy: Intoxicatie is more general than Overdosis

“Heroin intoxication”: lexical match on cause and on abnormality
“Drugs overdosis”: lexical match on cause and on abnormality

Cause match: has more specific cause

Abnormality match: has more general abnormality

Page 37: Ontology Mapping - Out Of The Babel Tower

Example results

• OLVG: Acute respiratory failure / DICE: Asthma cardiale

• OLVG: Aspergillus fumigatus / DICE: Aspergilloom

• OLVG: duodenum perforation / DICE: Gut perforation

• OLVG: HIV / DICE: AIDS

• OLVG: Aorta thoracalis dissectie type B / DICE: Dissection of artery

[Match types shown on the slide: cause; abnormality, cause; cause; location, abnormality; abnormality]

Page 38: Ontology Mapping - Out Of The Babel Tower

Ontology mapping using background knowledge: Case study 2

Work with Heiner Stuckenschmidt @ VU

Page 39: Ontology Mapping - Out Of The Babel Tower

Case Study:

1. Map GALEN & Tambis, using UMLS as background knowledge
2. Select three topics with sufficient overlap
   • Substances
   • Structures
   • Processes
3. Define some partial & ad-hoc manual mappings between individual concepts
4. Represent mappings in C-OWL
5. Use semantics of C-OWL to verify and complete mappings
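The "complete mappings" step can be illustrated by propagating a manual equivalence down a subclass hierarchy. This is plain transitive reasoning over dictionaries, not C-OWL or its bridge-rule semantics, and the `A:`/`B:` concept names and the `derive` function are hypothetical.

```python
def derive(subclass_of: dict, bridges: dict) -> dict:
    """Derive mappings for concepts whose ancestor has a manual mapping.

    subclass_of: child -> parent inside ontology A (assumed acyclic)
    bridges:     concept in A -> equivalent concept in B (manual mappings)
    Returns concept in A -> concept in B that subsumes it.
    """
    derived = {}
    for concept in subclass_of:
        ancestor = subclass_of.get(concept)
        while ancestor is not None:
            if ancestor in bridges:
                # concept is below a bridged ancestor, so it is below the target too
                derived[concept] = bridges[ancestor]
                break
            ancestor = subclass_of.get(ancestor)
    return derived


hierarchy = {"A:OrganicChemical": "A:Chemicals", "A:Chemicals": "A:Substance"}
manual = {"A:Chemicals": "B:ChemicalSubstance"}
new_mappings = derive(hierarchy, manual)
```

Verification would work in the opposite direction: a manual mapping that contradicts what the hierarchies imply is flagged for review.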

Page 40: Ontology Mapping - Out Of The Babel Tower

Case Study:

[Figure: GALEN (medical ontology) and Tambis (genetic ontology) are each connected to UMLS (medical terminology) by a lexical mapping, followed by verification & derivation; a derived mapping links GALEN and Tambis]

Page 41: Ontology Mapping - Out Of The Babel Tower

Ad hoc mappings: Substances

Notice: mappings high and low in the hierarchy, few in the middle

UMLS GALEN

Page 42: Ontology Mapping - Out Of The Babel Tower

Ad hoc mappings: Substances

UMLS Tambis

Notice different grain size: UMLS coarse, Tambis fine

Page 43: Ontology Mapping - Out Of The Babel Tower

Verification of mappings

UMLS:Chemicals

Tambis:Chemical

Tambis:enzyme

UMLS:Chemicals_viewed_structurally

UMLS:Chemicals_viewed_functionally

UMLS:enzyme

=

=

?

Page 44: Ontology Mapping - Out Of The Babel Tower

Deriving new mappings

UMLS:substance

Galen:ChemicalSubstance

UMLS:Phenomenon_or_process

UMLS:Chemicals

UMLS:OrganicChemical

=

Page 45: Ontology Mapping - Out Of The Babel Tower

Ontology mapping: a way out of

the medical tower of Babel?

Page 46: Ontology Mapping - Out Of The Babel Tower

“Conclusions”

Ontology mapping is (still) hard & open

Many different approaches will be required:
• linguistic
• structural
• statistical
• semantic
• …

Currently no roadmap theory on what's good for which problems

Page 47: Ontology Mapping - Out Of The Babel Tower

Challenges

roadmap theory
run-time matching
“good-enough” matches
large scale evaluation methodology
hybrid matchers (needs roadmap theory)

Page 48: Ontology Mapping - Out Of The Babel Tower

Ontology mapping: a way out of

the medical tower of Babel?

?

