Introduction to Databases: From Data to Knowledge Bases

Instructors: Bertram Ludaescher, Kai Lin

Upload: clare-torbett

Post on 01-Apr-2015


Page 1

Page 2

Introduction to KR, B. Ludaescher & K. Lin

Overview
• 08:30-9:30 Introduction to KR (1h)
• 9:30-9:45 BREAK (15’)
• 9:45-11:20 Intro to KR (1h45’)
• 11:20-11:50 Demos (30’)
• 11:50-13:15 LUNCH (1h25’)

• Demonstrations/Hands-on (~30’)
  – Ontology-enabled data integration
  – Concept map creation tool
  – Ontology creation tool

Page 3

The Problem: Scientific Data Integration

or: … from Questions to Queries …

Page 4

Ontology Cheat Sheet (1/2)
• What is an ontology? An ontology usually …
  – specifies a theory (a set of logic models) by …
  – defining and relating …
  – concepts representing features of a domain of interest
• Also overloaded (sloppy) for:
  – Controlled vocabularies
  – Database schema (relational, XML Schema/DTD, …)
  – Conceptual schema (ER, UML, …)
  – Thesauri (synonyms, broader term/narrower term)
  – Taxonomies (classifications)
  – Informal/semi-formal knowledge representations
    • “Concept spaces”, “concept maps”
    • Labeled graphs / semantic networks (RDF)
  – Formal ontologies, e.g., in [Description] Logic (OWL)
    • “formalization of a specification” constrains possible interpretation of terms

Page 5

Ontology Cheat Sheet (2/2)
• What are ontologies used for?
  – Conceptual models of a domain or application (communication means, system design, …)
  – Classification of …
    • concepts (taxonomy) and
    • data/object instances through classes
  – Analysis of ontologies, e.g.
    • Graph queries (reachability, path queries, …)
    • Reasoning (concept subsumption, consistency checking, …)
  – Targets for semantic data registration
  – Conceptual indexes and views for
    • searching,
    • browsing,
    • querying, and
    • integration of registered data
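For an ontology that contains only the subclass relation, concept subsumption reduces to graph reachability, and classifying instances "through classes" follows from it. A minimal sketch, assuming a made-up toy taxonomy (the class names and the `SUBCLASS_OF` table are illustrative, not taken from the GEON ontologies):

```python
from collections import deque

# Hypothetical toy taxonomy: class -> list of direct superclasses.
SUBCLASS_OF = {
    "Granite": ["Plutonic"],
    "Plutonic": ["Igneous"],
    "Basalt": ["Volcanic"],
    "Volcanic": ["Igneous"],
    "Igneous": ["Rock"],
    "Sandstone": ["Sedimentary"],
    "Sedimentary": ["Rock"],
}


def is_subsumed_by(concept, ancestor, subclass_of=SUBCLASS_OF):
    """True if `ancestor` is reachable from `concept` via subclass edges."""
    seen = {concept}
    queue = deque([concept])
    while queue:
        current = queue.popleft()
        if current == ancestor:
            return True
        for parent in subclass_of.get(current, []):
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return False


def classify(instances, target, subclass_of=SUBCLASS_OF):
    """Names of instances whose asserted class falls under `target`."""
    return sorted(name for name, cls in instances.items()
                  if is_subsumed_by(cls, target, subclass_of))


# Hypothetical samples, each registered to one class.
SAMPLES = {"s1": "Granite", "s2": "Sandstone", "s3": "Basalt"}
print(classify(SAMPLES, "Igneous"))
```

Richer ontologies (attributes, Description Logic constraints) need a real reasoner; plain reachability only covers the subclass-only case.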

Page 6

Ontologies as Metadata++

Ontologies = Smarter Metadata™

Page 7

Smarter (Meta)data I: Logical Data Views

Source: NADAM Team (Boyan Brodaric et al.)

Adoption of a standard (meta)data model => wrap data sets into unified virtual views

Page 8

Smarter Metadata II: Multihierarchical Rock Classification for “Thematic Queries” (GSC)
–– or: Taxonomies are not only for biologists ...

[Figure: four concept hierarchies: Composition, Genesis, Fabric, Texture]

“Smart discovery & querying” via multiple, independent concept hierarchies (controlled vocabularies)
• data at different description levels can be found and processed

Page 9

Biomedical Informatics Research Network
http://nbirn.net

Smarter Metadata III: Source Contextualization & Ontology Refinement

The next frontier: Capturing Knowledge about Dynamic Processes

“Process Ontologies”

Page 10

Ontology-Enabled Application Example: Geologic Map Integration

Show formations where AGE = ‘Paleozoic’ (without age ontology)

Show formations where AGE = ‘Paleozoic’ (with age ontology)

+/- a few hundred million years

[Figure: Nevada map, query results with and without the age ontology; labels: domain knowledge, knowledge representation, AGE ONTOLOGY]
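Why the two results differ can be sketched in a few lines: without the age ontology only records labeled exactly "Paleozoic" match, while with it the query term is first expanded to its sub-ages. The hierarchy below is a small fragment of the standard geologic time scale; the formation records and the function names are hypothetical.

```python
# Fragment of a geologic age hierarchy (era -> periods), for illustration.
AGE_CHILDREN = {
    "Paleozoic": ["Cambrian", "Ordovician", "Silurian",
                  "Devonian", "Carboniferous", "Permian"],
    "Mesozoic": ["Triassic", "Jurassic", "Cretaceous"],
}


def expand(term, children=AGE_CHILDREN):
    """Return `term` together with all of its descendants."""
    result = {term}
    for child in children.get(term, []):
        result |= expand(child, children)
    return result


def select_by_age(formations, age, use_ontology):
    """Names of formations whose AGE matches `age`, optionally expanded."""
    wanted = expand(age) if use_ontology else {age}
    return [f["name"] for f in formations if f["AGE"] in wanted]


# Hypothetical map records; names and ages are made up for the example.
FORMATIONS = [
    {"name": "F1", "AGE": "Paleozoic"},
    {"name": "F2", "AGE": "Devonian"},
    {"name": "F3", "AGE": "Permian"},
    {"name": "F4", "AGE": "Jurassic"},
]

print(select_by_age(FORMATIONS, "Paleozoic", use_ontology=False))
print(select_by_age(FORMATIONS, "Paleozoic", use_ontology=True))
```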

Page 11

Integrated querying of multiple datasets via different “ontologies” (conceptual views)

Page 12

Querying by Geologic Age …

Page 13

Querying by Geologic Age: Result

Page 14

Querying by Chemical Composition …

Page 15

Querying by Chemical Composition: Results

DO know: It’s NOT there!

DON’T know! (not registered)

Note the fine differences in shades of gray:

OK – we got to work on the color coding ;-)

Page 16

Querying w/ British Rock Classification

Uses a GSC-BRC inter-ontology articulation mapping

Page 17

British Rock Classification Query: Results

Uses a GSC-BRC inter-ontology articulation mapping

Page 18

Different views on State Geological Maps

Page 19

The Query: Show sedimentary rocks
The Puzzle: Find the 17 differences in the results

… but first: what states are we looking at?

Page 20

Sedimentary Rocks: BGS Ontology

Page 21

Sedimentary Rocks: GSC Ontology

Page 22

Differing Conceptual Views: Why?
• We are looking at the same datasets – why do they look different?
  – Different rock classifications (GSC, BGS) are used as “targets” for registering data to
  – Not every rock name/rock type found in the raw data is found in both classifications
  – The mapping (“articulation”) between the classifications is an approximation only
• Yet: having “conceptual views” (even if different) on the data really seems like a good idea…

Page 23

Geologic Map Integration

• Given:
  – Geologic maps from different state geological surveys (shapefiles w/ different data schemas)
  – Different ontologies:
    • Geologic age ontology
    • Rock classification ontologies:
      – Multiple hierarchies (chemical, fabric, texture, genesis) from Geological Survey of Canada (GSC)
      – Single hierarchy from British Geological Survey (BGS)
• Problem:
  – Support uniform queries using different ontologies
  – Support registration w/ ontology A, querying w/ ontology B
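The second problem (register with ontology A, query with ontology B) can be sketched as a lookup through an articulation mapping. Everything below is hypothetical: the `A:`/`B:` term names, the mapping table and the registered items are made up to show the mechanism, not taken from the GSC or BGS classifications.

```python
# Hypothetical articulation: term in ontology B -> set of terms in ontology A.
# An empty set means A has no counterpart, so the mapping is only approximate.
ARTICULATION_B_TO_A = {
    "B:SedimentaryRock": {"A:Sedimentary", "A:Clastic"},
    "B:Limestone": {"A:Carbonate"},
    "B:Ironstone": set(),
}

# Hypothetical registration of data items to ontology A terms.
REGISTERED_TO_A = {
    "polygon-1": "A:Clastic",
    "polygon-2": "A:Carbonate",
    "polygon-3": "A:Igneous",
}


def query_via_b(term_b, articulation=ARTICULATION_B_TO_A,
                registered=REGISTERED_TO_A):
    """Answer a query phrased in ontology B over data registered to A."""
    terms_a = articulation.get(term_b, set())
    return sorted(item for item, term in registered.items()
                  if term in terms_a)


print(query_via_b("B:SedimentaryRock"))
print(query_via_b("B:Ironstone"))
```

Terms without a counterpart silently return nothing, which is one way the same data can look different under two classifications.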

Page 24

A Multi-Hierarchical Rock Classification “Ontology” (really: Taxonomy)

[Figure: four hierarchies: Composition, Genesis, Fabric, Texture]

Page 25

Implementation in OWL: Not only “for the machine” …

Page 26

Demonstration of Ontology-enabled Map Integration (OMI) v2

[Figure: three columns labeled Data sets, Ontologies, Applications. Data sets: four “Data” boxes, linked to the ontologies by “Semantic Registration”. Ontologies: ontology A, ontology B, ontology C. Applications: Ontology-enabled Map Integrator {A,B}, Application (B), Application (C).]

Page 27

Ontology Mapping: Overview

• Align ontologies
• Integrate data sets which are registered to different ontologies
• Query data sets through different ontologies

[Figure labels: Data set 1, Data set 2, Ontology 1, Ontology 2, register (on both data sets), Ontology mappings, queries]

Page 28

Geology Workbench: Initial State

click on Ontologies
click on Datasets
click on Applications

An Ontology-based Mediator

Page 29

Geology Workbench: Uploading Ontologies

click on Ontology Submission
Choose an OWL file to upload
Click to check its detail
Name Space
Can be used to import this ontology into others

Page 30

Geology Workbench: Data (to Ontology!) Registration

Step 1: Choose Classes

Click on Submission Data set name

Select a shapefile

Choose an ontology

Page 31

Geology Workbench: Data Registration
Step 2: Choose Columns for Selected Classes

AREA

PERIMETER

AZ_1000

AZ_1000_ID

GEO

PERIOD

ABBREV

DESCR

D_SYMBOL

P_SYMBOL

It contains information about geologic age

Page 32

Geology Workbench: Data Registration
Step 3: Resolve Mismatches

Two terms are not matched to any ontology terms

Manually mapping algonkian into the ontology
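The mismatch-resolution step can be sketched as follows: column values are matched against ontology terms automatically, and whatever is left over needs a manual mapping. This is not the workbench's code; the function name, the case-insensitive matching rule, and the choice of "Proterozoic" as the target for algonkian are assumptions made for the example.

```python
def register_terms(data_terms, ontology_terms, manual_mapping=None):
    """Match raw column values to ontology terms.

    Returns (matched, unmatched): a dict raw value -> ontology term, and
    the list of raw values that still need a manual mapping.
    """
    manual_mapping = manual_mapping or {}
    by_lower = {term.lower(): term for term in ontology_terms}
    matched, unmatched = {}, []
    for raw in data_terms:
        key = raw.strip().lower()
        if key in by_lower:            # automatic, case-insensitive match
            matched[raw] = by_lower[key]
        elif key in manual_mapping:    # user-supplied mapping
            matched[raw] = manual_mapping[key]
        else:
            unmatched.append(raw)
    return matched, unmatched


# Hypothetical age terms and column values.
ONTOLOGY_TERMS = ["Paleozoic", "Proterozoic", "Cambrian"]
COLUMN_VALUES = ["cambrian", "algonkian"]

print(register_terms(COLUMN_VALUES, ONTOLOGY_TERMS))
print(register_terms(COLUMN_VALUES, ONTOLOGY_TERMS,
                     manual_mapping={"algonkian": "Proterozoic"}))
```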

Page 33

Geology Workbench: Ontology-enabled Map Integrator

Click on the name

Choose interesting classes

All areas with the age Paleozoic

Page 34

Geology Workbench: Change Ontology

Submit a mapping

Ontology mapping between British Rock Classification and Canadian Rock Classification

Switch from Canadian Rock Classification to British Rock Classification

Run it

New query interface

Page 35

Ontology Repository
• Accept user-defined ontologies in OWL
• Any ontology saved in the system, or otherwise accessible, can be imported into another user-defined ontology (inter-ontology references)
• Provide tool to browse the ontologies in the repository

composition.owl:

...
<owl:Ontology>
  <owl:imports rdf:resource="http://compute5.sdsc.geongrid.org:8080/workbench/jsp/ontologies/genesis.owl"/>
</owl:Ontology>
...
<owl:Class rdf:ID="Ultramafite">
  <rdfs:subClassOf rdf:resource="#Ultramafic"/>
  <rdfs:subClassOf rdf:resource="http://compute5.sdsc.geongrid.org:8080/workbench/jsp/ontologies/genesis.owl#Igneous"/>
</owl:Class>
...
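To see what a repository can read out of such a file, here is a minimal sketch that parses a well-formed stand-in for the composition.owl fragment with Python's standard library and lists its imports and subclass links. The enclosing `rdf:RDF` element, the namespace declarations and the function name `read_ontology` are assumptions added so that the fragment parses; only the URLs and class names come from the slide. No network access is involved.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
OWL = "http://www.w3.org/2002/07/owl#"
GENESIS = ("http://compute5.sdsc.geongrid.org:8080/workbench/jsp/"
           "ontologies/genesis.owl")

# Stand-in document: wrapper element and xmlns declarations are assumed.
COMPOSITION_OWL = f"""
<rdf:RDF xmlns:rdf="{RDF}" xmlns:rdfs="{RDFS}" xmlns:owl="{OWL}">
  <owl:Ontology rdf:about="">
    <owl:imports rdf:resource="{GENESIS}"/>
  </owl:Ontology>
  <owl:Class rdf:ID="Ultramafite">
    <rdfs:subClassOf rdf:resource="#Ultramafic"/>
    <rdfs:subClassOf rdf:resource="{GENESIS}#Igneous"/>
  </owl:Class>
</rdf:RDF>
"""


def read_ontology(xml_text):
    """Return (imports, superclasses) found in an RDF/XML ontology string."""
    root = ET.fromstring(xml_text)
    imports = [e.get(f"{{{RDF}}}resource")
               for e in root.iter(f"{{{OWL}}}imports")]
    superclasses = {}
    for cls in root.iter(f"{{{OWL}}}Class"):
        name = cls.get(f"{{{RDF}}}ID")
        superclasses[name] = [s.get(f"{{{RDF}}}resource")
                              for s in cls.findall(f"{{{RDFS}}}subClassOf")]
    return imports, superclasses


imports, superclasses = read_ontology(COMPOSITION_OWL)
print(imports)
print(superclasses)
```

The second `subClassOf` is the inter-ontology reference: it points at a class defined in the imported genesis.owl rather than in the local file.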

Page 36

Ontology-Enabled Map Integration: Where do we stand?

The simple case (done): ontologies contain only the subclass relation

More complicated cases (coming soon):
• ontologies contain classes with attributes
• ontologies with constraints in Description Logic

Implementation:
• v1, v2 prototypes: detail-level registration to ontology
• v3 (portal): item-level registration to ontology

Page 37

Current Ontology Registration (Item-level) v3

Domain Knowledge Ontologies

Arizona

Page 38

GEON Search: Concept-based Querying

Page 39

System Overview

[Figure: GEON middleware architecture. Recoverable labels:
– User Access (via Portal)
– Client Access (via web services); other distributed apps: Kepler, DLESE, …
– GEONsearch; search condition(s): spatial, temporal, concept
– GEONworkbench; GEON Workspace (user); user actions: add, delete, manipulate
– Resource Registration; myOntology.owl, myDataset.foo, metadata
– GEON Catalog, SRB, Log
– external services: Gazetteer, DLESE, …; Geologic Age, Chronos, …]

Page 40

Introduction to Knowledge Representation and Ontologies

Page 41

Complex Multiple-Worlds Mediation and XML

• XML is Syntax
  – DTDs talk about element nesting
  – XML Schema schemas give you data types
  – need anything else? => write comments!
• Domain Semantics is complex:
  – implicit assumptions, hidden semantics
  – sources seem unrelated to the non-expert
• Need Structure and Semantics beyond XML trees!
  – employ richer OO models
  – make domain semantics and “glue knowledge” explicit
  – use ontologies to fix terminology and conceptualization
  – avoid ambiguities by using formal semantics

Page 42

XML-Based vs. Model-Based Mediation

[Figure: side-by-side comparison of the two mediation approaches, both on top of raw data sources. Recoverable labels:

XML-based mediation:
– XML Models; XML Elements
– Structural Constraints (DTDs): Parent, Child, Sibling, ...; A = (B*|C),D; B = ...
– No Domain Constraints
– Integrated-DTD := XQuery(Src1-DTD,...)

Model-based mediation:
– Conceptual Models; (XML) Objects
– Classes, Relations, is-a, has-a, ... (C1, C2, C3, R)
– Logical Domain Constraints (IF ... THEN ...)
– Ontologies, DMs, PMs
– Integrated-CM := CM-QL(Src1-CM,...)]

CM ~ {Descr.Logic, ER, UML, RDF/XML(-Schema), …}
CM-QL ~ {F-Logic, OWL, …}

Page 43

Knowledge Representation: Relating Theory to the World via Formal Models

Source: John F. Sowa, Knowledge Representation: Logical, Philosophical, and Computational Foundations

“All models are wrong, but some models are useful!”

Page 44

What is an ontology??

Page 45

Glossary (wordreference.com)
• ontology noun
  1 (Philosophy) the branch of metaphysics that deals with the nature of being
  2 (Logic) the set of entities presupposed by a theory
• taxonomy noun
  1 a the branch of biology concerned with the classification of organisms into groups based on similarities of structure, origin, etc.
    b the practice of arranging organisms in this way
  2 the science or practice of classification
  [ETYMOLOGY: 19th Century: from French taxonomie, from Greek taxis order + -nomy]
• thesaurus noun (plural: -ruses, -ri [-raı])
  1 a book containing systematized lists of synonyms and related words
  2 a dictionary of selected words or topics
  3 (rare) a treasury
  [ETYMOLOGY: 18th Century: from Latin, Greek: treasure]

Page 46

Glossary (wordreference.com)
• concept noun
  1 an idea, esp. an abstract idea; example: the concepts of biology
  2 (Philosophy) a general idea or notion that corresponds to some class of entities and that consists of the characteristic or essential features of the class
  3 (Philosophy)
    a the conjunction of all the characteristic features of something
    b a theoretical construct within some theory
    c a directly intuited object of thought
    d the meaning of a predicate
  4 [modifier] (of a product, esp. a car) created as an exercise to demonstrate the technical skills and imagination of the designers, and not intended for mass production or sale
  [ETYMOLOGY: 16th Century: from Latin conceptum something received or conceived, from concipere to take in, conceive]
• contingent adjective
  1 [when postpositive, often foll by on or upon] dependent on events, conditions, etc., not yet known; conditional
  2 (Logic) (of a proposition) true under certain conditions, false under others; not necessary
  3 (in systemic grammar) denoting contingency (sense 4)
  4 (Metaphysics) (of some being) existing only as a matter of fact; not necessarily existing
  5 happening by chance or without known cause; accidental
  6 that may or may not happen; uncertain
• glossary noun (plural: -ries)
  an alphabetical list of terms peculiar to a field of knowledge with definitions or explanations. Sometimes called: gloss
  [ETYMOLOGY: 14th Century: from Late Latin glossarium; see gloss2]

Page 47

1st Attempt: Ontologies in CS

• An ontology is ...
  – an explicit specification of a conceptualization [Gruber93]
  – a shared understanding of some domain of interest [Uschold, Gruninger96]
• Different aspects:
  – a formal specification (reasoning and “execution”)
  – ... of a conceptualisation of a domain (community)
  – ... of some part of the world of interest (application, science domain)
• Provides:
  – A common vocabulary of terms
  – Some specification of the meaning of the terms (semantics)
  – A shared “understanding” for people and machines

Page 48

Ontology as a philosophical discipline

• Ontology as a philosophical discipline, which deals with the nature and the organization of reality:
  – Ontology as such is usually contrasted with Epistemology, which deals with the nature and sources of our knowledge [a.k.a. Theory of Knowledge]. Aristotle defined Ontology as the science of being as such: unlike the special sciences, each of which investigates a class of beings and their determinations, Ontology regards "all the species of being qua being and the attributes which belong to it qua being" (Aristotle, Metaphysics, IV, 1).
• In this sense Ontology tries to answer the question: What is being? What exists?
  – the nature of being, not an enumeration of “stuff” around us…

Page 49

Some different uses of the word “Ontology” [Guarino’95]

1. Ontology as a philosophical discipline
2. Ontology as an informal conceptual system
3. Ontology as a formal semantic account
4. Ontology as a specification of a “conceptualization”
5. Ontology as a representation of a conceptual system via a logical theory
   5.1 characterized by specific formal properties
   5.2 characterized only by its specific purposes
6. Ontology as the vocabulary used by a logical theory
7. Ontology as a (meta-level) specification of a logical theory

http://ontology.ip.rm.cnr.it/Papers/KBKS95.pdf

Page 50

Ontologies vs Conceptualizations
• Given a logical language L ...
  – ... a conceptualization is a set of models of L which describes the admittable (intended) interpretations of its non-logical symbols (the vocabulary)
  – ... an ontology is a (possibly incomplete) axiomatization of a conceptualization.

[Figure: the set of all models M(L), containing the conceptualization C(L) and the ontology; logic theories (consistent sets of sentences; closed under logical consequence)]

[Guarino96] http://www-ksl.stanford.edu/KR96/Guarino-What/P003.html
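The relation between M(L), C(L) and an incomplete ontology can be made concrete in a small propositional setting. This is only a sketch of one reading of the definitions: the three-atom vocabulary, the axioms, and the encoding of a model as the set of atoms it makes true are all assumptions chosen for illustration.

```python
from itertools import combinations

VOCAB = ("igneous", "sedimentary", "rock")


def all_models(vocab=VOCAB):
    """M(L): every interpretation, encoded as the set of atoms it makes true."""
    return {frozenset(c)
            for r in range(len(vocab) + 1)
            for c in combinations(vocab, r)}


def implies(a, b):
    """Axiom 'a => b' as a test on a model."""
    return lambda m: a not in m or b in m


def disjoint(a, b):
    """Axiom 'not (a and b)' as a test on a model."""
    return lambda m: not (a in m and b in m)


def models_of(axioms, vocab=VOCAB):
    """Models satisfying every axiom."""
    return {m for m in all_models(vocab) if all(ax(m) for ax in axioms)}


# C(L): the intended interpretations, described here by three constraints.
CONCEPTUALIZATION = models_of([
    implies("igneous", "rock"),
    implies("sedimentary", "rock"),
    disjoint("igneous", "sedimentary"),
])

# An incomplete ontology: it leaves out the disjointness axiom.
ONTOLOGY_MODELS = models_of([
    implies("igneous", "rock"),
    implies("sedimentary", "rock"),
])

# Models the ontology admits although they were never intended.
UNINTENDED = ONTOLOGY_MODELS - CONCEPTUALIZATION
print(len(all_models()), len(ONTOLOGY_MODELS), len(CONCEPTUALIZATION))
print(UNINTENDED)
```

Adding the missing axiom shrinks the ontology's models down to exactly the intended ones; in that sense an ontology approximates its conceptualization from outside.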

Page 51

Ontologies vs Knowledge Bases
• An ontology is a particular KB, describing facts assumed to be always true by a community of users:
  – in virtue of the agreed-upon meaning of the vocabulary used (analytical knowledge):
    • black => not white
  – ... whose truth does not descend from the meaning of the vocabulary used (non-analytical, common knowledge)
    • Rome is the capital of Italy
• An arbitrary KB may describe facts which are contingently true, and relevant to a particular epistemic state:
  – Mr Smith’s pathology is either cirrhosis or diabetes

Page 52

Formal Ontology [Guarino’96]
• Theory of formal distinctions
  – among things
  – among relations
• Basic tools
  – Theory of parthood
    • What counts as a part of a given entity? What properties does the part relation have? Are there different kinds of parts?
  – Theory of integrity
    • What counts as a whole? In which sense are its parts connected?
  – Theory of identity
    • How can an entity change while keeping its identity? What are its essential properties? Under which conditions does an entity lose its identity? Does a change of “point of view” change the identity conditions?
  – Theory of dependence
    • Can a given entity exist alone, or does it depend on other entities?

Page 53

Ontology: Definition and Scope [Sowa]

• The subject of ontology is the study of the categories of things that exist or may exist in some domain. The product of such a study, called an ontology, is a catalog of the types of things that are assumed to exist in a domain of interest D from the perspective of a person who uses a language L for the purpose of talking about D. The types in the ontology represent the predicates, word senses, or concept and relation types of the language L when used to discuss topics in the domain D.

• An uninterpreted logic, such as predicate calculus, conceptual graphs, or KIF, is ontologically neutral. It imposes no constraints on the subject matter or the way the subject may be characterized. By itself, logic says nothing about anything, but the combination of logic with an ontology provides a language that can express relationships about the entities in the domain of interest.

http://users.bestweb.net/~sowa/ontology/index.htm

Page 54

Ontology: Definition and Scope [Sowa]

• An informal ontology may be specified by a catalog of types that are either undefined or defined only by statements in a natural language. A formal ontology is specified by a collection of names for concept and relation types organized in a partial ordering by the type-subtype relation. Formal ontologies are further distinguished by the way the subtypes are distinguished from their supertypes:
  – an axiomatized ontology distinguishes subtypes by axioms and definitions stated in a formal language, such as logic or some computer-oriented notation that can be translated to logic
  – a prototype-based ontology distinguishes subtypes by a comparison with a typical member or prototype for each subtype.
• Large ontologies often use a mixture of definitional methods: formal axioms and definitions are used for the terms in mathematics, physics, and engineering; and prototypes are used for plants, animals, and common household items.

http://users.bestweb.net/~sowa/ontology/index.htm

Page 55

Why develop an ontology?

• To make domain assumptions explicit
  – Easier to change domain assumptions
  – Easier to understand, update, and integrate legacy data => data integration
• To separate domain knowledge from operational knowledge
  – Re-use domain and operational knowledge separately
• A community reference for applications
• To share a consistent understanding of what information means.

[Carole Goble, Nigel Shadbolt, Ontologies and the Grid Tutorial]

Page 56

What is being shared?

Metadata
• Data describing the content and meaning of resources and services.
• But everyone must speak the same language…

Terminologies
• Shared and common vocabularies
• For search engines, agents, curators, authors and users
• But everyone must mean the same thing…

Ontologies
• Shared and common understanding of a domain
• Essential for search, exchange and discovery

Ontologies aim at sharing meaning

[Carole Goble, Nigel Shadbolt, Ontologies and the Grid Tutorial]

Page 57

Origin and History
• Humans require words (or at least symbols) to communicate efficiently. The mapping of words to things is indirect. We do it by creating concepts that refer to things.
• The relation between symbols and things has been described in the form of the meaning triangle:

[Figure: meaning triangle; labels: “Jaguar“, Concept]

Ogden, C. K. & Richards, I. A. 1923. "The Meaning of Meaning." 8th Ed. New York, Harcourt, Brace & World, Inc.
before: Frege, Peirce; see [Sowa 2000]

[Carole Goble, Nigel Shadbolt, Ontologies and the Grid Tutorial]

Page 58

Human and machine communication

[Figure: the meaning triangle (Symbol ‘‘JAGUAR“, Concept, Things) extended to several agents. Recoverable labels:
– Human Agent 1 (HA1), Human Agent 2 (HA2): exchange symbol, e.g. via nat. language; Internal models
– Machine Agent 1 (MA1), Machine Agent 2 (MA2): exchange symbol, e.g. via protocols; Formal models
– Ontology, Ontology Description, Formal Semantics; the agents “commit” to the ontology
– a specific domain, e.g. animals]

[Maedche et al., 2002]

Page 59

Introduction to Description Logics

References:
• F. Baader, W. Nutt. Basic Description Logics. In the Description Logic Handbook, edited by F. Baader, D. Calvanese, D.L. McGuinness, D. Nardi, P.F. Patel-Schneider, Cambridge University Press, 2002, pages 47-100.
• Description Logics Tutorial, Ian Horrocks and Ulrike Sattler, ECAI-2002, Lyon, France, July 23rd, 2002.
• Emerging Sparrow toolkit (Bowers, Ludaescher)


Example: Description Logic

• DL definition of “Happy Father” (Example from Ian Horrocks, Ulrike Sattler, U Manchester)


Science Example: Ontology for SYNAPSE and NCMIR

Domain Expert Knowledge:

Purkinje cells and Pyramidal cells have dendrites that have higher-order branches that contain spines. Dendritic spines are ion (calcium) regulating components. Spines have ion binding proteins. Neurotransmission involves ionic activity (release). Ion-binding proteins control ion activity (propagation) in a cell. Ion-regulating components of cells affect ionic activity (release).

[Figures: Domain Map (DM); DM in Description Logic]


Source Contextualization, Ontology Refinement

In addition to registering (“hanging off”) data relative to existing concepts, a source may also refine the mediator’s domain map...

sources can register new concepts at the mediator ...


Some Description Logics History

• “Structured Inheritance Networks” [Brachman 1977]
• KL-ONE [Brachman, Schmolze 1985]
• Core ideas:
– Building blocks: atomic concepts (unary predicates), atomic roles (binary predicates), individuals (constants)
– Constructors for building complex concepts and roles from simpler ones
– Automated inference for concept subsumption and instance classification (is-a/is-instance-of are not explicitly given by the user, but inferred from concept definitions/instance properties)
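
The building blocks and constructors listed above can be made concrete with a small sketch. The Python fragment below is an illustration only, not code from KL-ONE or any DL system: the tuple encoding of concepts, the function name `extension`, and the individuals and names (anna, bob, carl, Person, Female, hasChild) are all assumptions made for this example. It computes the set of individuals that belong to a concept in one finite interpretation.

```python
# Illustrative encoding (an assumption of this sketch, not a standard API):
#   ("atom", A)  ("not", C)  ("and", C, D)  ("exists", r, C)  ("forall", r, C)

def extension(c, domain, concepts, roles):
    """Return the set of individuals in `domain` that belong to concept `c`."""
    def ext(d):
        return extension(d, domain, concepts, roles)

    kind = c[0]
    if kind == "atom":
        return set(concepts.get(c[1], ()))
    if kind == "not":
        return domain - ext(c[1])
    if kind == "and":
        return ext(c[1]) & ext(c[2])
    if kind in ("exists", "forall"):
        pairs = roles.get(c[1], set())
        filler = ext(c[2])
        if kind == "exists":
            # at least one r-successor of x lies in the filler concept
            return {x for x in domain
                    if any(a == x and b in filler for a, b in pairs)}
        # forall: every r-successor of x lies in the filler concept
        return {x for x in domain
                if all(b in filler for a, b in pairs if a == x)}
    raise ValueError(f"unknown constructor: {kind!r}")


# Hypothetical interpretation: atomic concepts are sets, roles are sets of pairs.
domain = {"anna", "bob", "carl"}
concepts = {"Person": {"anna", "bob", "carl"}, "Female": {"anna"}}
roles = {"hasChild": {("anna", "bob"), ("bob", "carl")}}

# Person and Female and (exists hasChild/Person)
mother = ("and",
          ("and", ("atom", "Person"), ("atom", "Female")),
          ("exists", "hasChild", ("atom", "Person")))
mothers = extension(mother, domain, concepts, roles)
only_daughters = extension(("forall", "hasChild", ("atom", "Female")),
                           domain, concepts, roles)
print(mothers, only_daughters)
```

Note that this only evaluates concepts in one given interpretation. Subsumption, as inferred by a DL system, has to hold in every model, which is why it is decided by symbolic reasoning rather than by evaluation.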


Source: Description Logics Tutorial, Ian Horrocks and Ulrike Sattler, ECAI-2002, Lyon, France, July 23rd, 2002


Knowledge Base (DL-Style)
• Terminological Knowledge (TBox)
– Concept Definition (naming of concepts):
– Axiom (constraining of concepts):
=> a mediator's “glue knowledge source”
• Assertional Knowledge (ABox) about Individuals
– n27_img118 : Neuron
=> the concrete instances/individuals of the concepts/classes that your sources export


Example TBox
Atomic concepts = {P, F, W, M1, …}
Base concepts = {P, F}
Defined concepts = {W, M1, M2, …}
Roles = {h1, h2}

Concept Definition

Axiom

where A atomic concept, C, D complex concept expressions


Example TBox
• Base concepts = {Person, Female} … occur on the RHS only
• Defined concepts = {P, F, W, …} … occur on the LHS (& maybe RHS)
• Base interpretation J: interpret base concepts only
• Extension I of J: on same domain as J and agrees (on base) with J
• TBox T is definitorial if every base interpretation has exactly one extension that is a model of T
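
For an acyclic TBox, the unique extension of a base interpretation can be computed by evaluating the definitions in dependency order. The following Python sketch illustrates this under stated assumptions: the three definitions (Woman, Man, Mother) follow the style of the family example in the Baader and Nutt chapter cited earlier, and the tuple encoding, function names and individuals are hypothetical.

```python
def evaluate(c, domain, interp, roles):
    """Extension of concept c, given extensions for all names it mentions."""
    kind = c[0]
    if kind == "atom":
        return set(interp[c[1]])
    if kind == "not":
        return domain - evaluate(c[1], domain, interp, roles)
    if kind == "and":
        return (evaluate(c[1], domain, interp, roles)
                & evaluate(c[2], domain, interp, roles))
    if kind == "exists":
        filler = evaluate(c[2], domain, interp, roles)
        return {a for a, b in roles.get(c[1], set()) if b in filler}
    raise ValueError(f"unknown constructor: {kind!r}")


def names(c):
    """Concept names occurring in c."""
    if c[0] == "atom":
        return {c[1]}
    parts = c[2:] if c[0] == "exists" else c[1:]
    return set().union(*(names(d) for d in parts))


def extend(tbox, domain, base, roles):
    """Extend base interpretation J to the defined concepts of an acyclic TBox."""
    interp = {name: set(ext) for name, ext in base.items()}
    pending = dict(tbox)
    while pending:
        # a definition is ready once every name on its RHS is interpreted
        ready = [a for a, c in pending.items() if names(c) <= set(interp)]
        if not ready:
            raise ValueError("cyclic definitions or undefined concept names")
        for a in ready:
            interp[a] = evaluate(pending.pop(a), domain, interp, roles)
    return interp


PERSON, FEMALE = ("atom", "Person"), ("atom", "Female")
TBOX = {  # assumed example definitions
    "Woman": ("and", PERSON, FEMALE),
    "Man": ("and", PERSON, ("not", ("atom", "Woman"))),
    "Mother": ("and", ("atom", "Woman"), ("exists", "hasChild", PERSON)),
}
domain = {"anna", "bob", "carl"}
base = {"Person": {"anna", "bob", "carl"}, "Female": {"anna"}}  # J
roles = {"hasChild": {("anna", "bob")}}

interp = extend(TBOX, domain, base, roles)  # I, the extension of J
print(interp["Woman"], interp["Man"], interp["Mother"])
```

Because each defined concept is computed from already fixed sets, no other choice of extension satisfies the definitions, which is the sense in which such a TBox is definitorial.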


Brains-On (Hands-off) Session™


What do we mean here?

Starting with the base interpretation of
• I(Person) := “the class of persons”
• I(Female) := “the class of females”

… what is the meaning of the defined concepts?
… what role do the roles play in this process?


And the answer is …

• atomic concept
• atomic concept
• concept def. w/ intersection
• … plus negation
• … existential restriction
• … value restriction


Digression: “Sparrow” (Prolog) Syntax for DL

Sparrow “Grammar” and “Parser”

Example in Sparrow Syntax


Back to Reasoning with the Family ...

• concept definition: MyConcept ≡ DL-formula
• concept inclusion: MyConcept ⊑ DL-formula
• a finite set of definitions is a terminology or TBox if for every atomic concept A there is at most one axiom whose lhs is A
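
The "at most one axiom per left-hand side" condition is easy to check mechanically. A minimal Python sketch follows; the representation of axioms as (lhs, rhs) pairs and the function names are assumptions of this example, and the right-hand sides are kept as plain strings because only the left-hand sides matter here.

```python
from collections import Counter


def duplicate_lhs(axioms):
    """Names that appear as the left-hand side of more than one axiom."""
    counts = Counter(lhs for lhs, _ in axioms)
    return sorted(name for name, n in counts.items() if n > 1)


def is_terminology(axioms):
    """True if every atomic concept has at most one defining axiom."""
    return not duplicate_lhs(axioms)


# Hypothetical axioms in the style of the family example.
good = [("Woman", "Person and Female"), ("Man", "Person and not Woman")]
bad = good + [("Woman", "Female")]
print(is_terminology(good), is_terminology(bad), duplicate_lhs(bad))
```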


Expansion of Terminologies
• For acyclic T we can “unfold” concept definitions until every defined concept is specified in terms of primitive concepts only; the result is the expansion of the TBox T
• Example:
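
The slide's own example is not preserved in this transcript. As a stand-in, the following Python sketch shows unfolding on an assumed family TBox (Woman, Man, Mother, in the style of the Baader and Nutt chapter); the tuple encoding and the function names `unfold` and `expand` are hypothetical.

```python
def unfold(c, tbox):
    """Replace defined names by their definitions until only primitive
    (base) names remain. Terminates only if the TBox is acyclic."""
    kind = c[0]
    if kind == "atom":
        return unfold(tbox[c[1]], tbox) if c[1] in tbox else c
    if kind in ("exists", "forall"):
        # c = (kind, role, filler): keep the role, unfold the filler
        return (kind, c[1], unfold(c[2], tbox))
    # "not" and "and": unfold every sub-concept
    return (kind,) + tuple(unfold(d, tbox) for d in c[1:])


def expand(tbox):
    """The expansion of the TBox: every definition fully unfolded."""
    return {name: unfold(c, tbox) for name, c in tbox.items()}


PERSON, FEMALE = ("atom", "Person"), ("atom", "Female")
TBOX = {  # assumed example definitions
    "Woman": ("and", PERSON, FEMALE),
    "Man": ("and", PERSON, ("not", ("atom", "Woman"))),
    "Mother": ("and", ("atom", "Woman"), ("exists", "hasChild", PERSON)),
}

expansion = expand(TBOX)
print(expansion["Man"])
```

In the expansion, Man no longer mentions Woman: it reads Person and not (Person and Female). Unfolding can blow up the size of definitions, so this is a conceptual device more than an efficient procedure.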


Reasoning in the Tableaux calculus

[Figure: a TBox and its expansion; panels labelled “From this” and “We want to show this”; the same in first-order (LeanTap) syntax]


Reasoning Services

• Remember the distinction between evaluating a query (over a DB) vs reasoning with queries (symbolic expressions)?

• The former can be very hard, esp. for large databases and complex queries

• The latter is much harder still, even for small queries and knowledge bases, ontologies

• Specialized DL reasoners (FaCT, Racer, …) perform better than general-purpose FO reasoners


OK – enough of that jazz…

let’s look at some demos …


Tools for Editing and Processing Ontologies

1. Protégé 2000 (RDF, OWL) http://protege.stanford.edu/

2. CmapTools (concept map) http://cmap.ihmc.us/

3. Java API Jena http://www.hpl.hp.com/semweb/jena.htm

OWL API http://sourceforge.net/projects/owlapi

Geology Map Integration Demo: http://geon01.sdsc.edu:8080/workbench/jsp/onto-list.jsp


ANOTHER APPLICATION OF ONTOLOGIES:

An Ontology-Driven Framework for Data Transformation in Scientific Workflows (from DILS’04)

Shawn Bowers
Bertram Ludäscher

San Diego Supercomputer Center
University of California, San Diego


Outline
• Background (SEEK Project)
• Scientific Workflows
• The Problem: Reusing Structurally Incompatible Services
• The Ontology-Driven Framework
• Future Work



Science Environment for Ecological Knowledge (SEEK)

• Domain Science Driver
– Ecology (LTER), biodiversity,
• Analysis & Modeling System
– Design and execution of ecological models and analysis
– End user focus
– {application,upper}-ware
• Semantic Mediation System
– Data Integration of hard-to-relate sources and processes
– Semantic Types and Ontologies
– upper middleware
• EcoGrid
– Access to ecology data and tools
– {middle,under}-ware

[Figure: SEEK architecture (cf. US cyberinfrastructure, UK e-Science); the part addressed by “this paper” is marked]


Outline
• The SEEK Project
• Scientific Workflows
– Focus: analysis & component integration on top of data integration
• The Problem: Reusing Structurally Incompatible Services
• The Ontology-Driven Framework
• Future Work


Promoter Identification in Kepler [SSDBM’03]

• Problems
– Many components (web services) are NOT designed to fit!
“The problem P that X solves is simple, and X doesn’t solve it well”
– Semantically meaningful connections are structurally incompatible
• Approach
– Distinguish structural type and semantic type
– Structural type: e.g. XML Schema
– Semantic type: e.g. OWL expressions
– Exploit the (optional!) semantic type as much as possible


Service Reusability
A scientist wishes to connect two (independent) services

[Figure: SourceService (port Ps) and TargetService (port Pt), with the desired connection from Ps to Pt]


Service Reusability
In Ptolemy II/Kepler (and in web services), input and output ports (message parts) have structural types (XML Schema)

[Figure: SourceService (port Ps, StructuralType Ps) and TargetService (port Pt, StructuralType Pt), with the desired connection from Ps to Pt]


Service Reusability
Unless “designed to fit,” independent services are structurally incompatible
Generally, the source output type will not be a subtype of the target input type

[Figure: SourceService (port Ps, StructuralType Ps) and TargetService (port Pt, StructuralType Pt); the desired connection is incompatible because StructuralType Ps is not a subtype (⋠) of StructuralType Pt]


Service Reusability
A transformation mapping is required to connect the services … artificially creating subtype compatibility
If such a mapping exists, the services are “structurally feasible”

[Figure: SourceService (port Ps, StructuralType Ps) and TargetService (port Pt, StructuralType Pt); the structural types are incompatible (⋠), but the transformed output of Ps is a subtype (≺) of StructuralType Pt]


Service Reusability
We can annotate services with semantic types for discovery and interoperability of services

[Figure: SourceService (port Ps, SemanticType Ps) and TargetService (port Pt, SemanticType Pt); the semantic types are drawn from ontologies (OWL) and the desired connection is compatible (⊑)]


Service Reusability
Services can be semantically compatible, but structurally incompatible

[Figure: SourceService (port Ps) and TargetService (port Pt), each with a semantic type from ontologies (OWL) and a structural type; the semantic types are compatible (⊑), the structural types are incompatible (⋠), and the transformed output of Ps is a subtype (≺) of StructuralType Pt]


Example Structural Types (XML)

[Figure: workflow with services S1 (life stage property) and S2 (mortality rate for period) and ports P1 to P5; the connection of interest is from P2 to P3]

structType(P2):
  root population = (sample)*
  elem sample = (meas, lsp)
  elem meas = (cnt, acc)
  elem cnt = xsd:integer
  elem acc = xsd:double
  elem lsp = xsd:string

  <population>
    <sample>
      <meas>
        <cnt>44,000</cnt>
        <acc>0.95</acc>
      </meas>
      <lsp>Eggs</lsp>
    </sample>
    …
  </population>

structType(P3):
  root cohortTable = (measurement)*
  elem measurement = (phase, obs)
  elem phase = xsd:string
  elem obs = xsd:integer

  <cohortTable>
    <measurement>
      <phase>Eggs</phase>
      <obs>44,000</obs>
    </measurement>
    …
  </cohortTable>


Example Semantic Types
Portion of SEEK measurement ontology

[Figure: ontology diagram. Classes: Observation, MeasContext, MeasProperty, Entity, AccuracyQualifier, EcologicalProperty, AbundanceCount, LifeStageProperty, NumericValue, SpatialLocation. Properties: hasContext, appliesTo, hasProperty, itemMeasured, hasLocation, hasCount, hasValue. Edges carry cardinality labels (0:*, 1:1, 1:*).]

Same in OWL, a description logic standard (here, Sparrow syntax):

Observation subClassOf forall hasContext/MeasContext and forall hasProperty/MeasProperty and exists itemMeasured/Entity.

MeasContext subClassOf exists appliesTo/Entity and atMost 1/appliesTo.

EcologicalProperty subClassOf Entity.

LifeStageProperty subClassOf EcologicalProperty.

AbundanceCount subClassOf EcologicalProperty and exists hasLocation/SpatialLocation and atMost 1/hasLocation and exists hasCount/NumericValue and atMost 1/hasCount.


Example Semantic Types
Semantic types for P2 and P3

[Figure: workflow with services S1 (life stage property) and S2 (mortality rate for period) and ports P1 to P5. semType(P3): an Observation with hasContext (1:1) a MeasContext that appliesTo (1:1) a LifeStageProperty, and itemMeasured (1:1) an AbundanceCount with hasCount (1:1) a NumberValue. semType(P2): the same, plus hasProperty (1:1) an AccuracyQualifier with hasValue (1:1).]

semType(P3) subClassOf Observation and exists hasContext/(MeasurementContext and exists appliesTo/LifeStageProperty and atMost 1/appliesTo) and exists itemMeasured/AbundanceCount and atMost 1/itemMeasured.

semType(P2) subClassOf Observation and exists hasContext/(MeasurementContext and exists appliesTo/LifeStageProperty and atMost 1/appliesTo) and exists itemMeasured/AbundanceCount and atMost 1/itemMeasured and exists hasProperty/AccuracyQualifier and atMost 1/hasProperty.
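
The two right-hand-side expressions differ only in that the one for P2 has two extra conjuncts. A purely syntactic check already shows that the P2 expression is at least as specific as the P3 expression. The Python sketch below is an assumption-laden illustration, not how a DL reasoner works: conjuncts are compared as opaque strings copied from the slide text, and the function name `conjunct_subsumed` is made up.

```python
# Conjuncts of the expression given for semType(P3), copied from the slide.
P3_EXPR = frozenset({
    "Observation",
    "exists hasContext/(MeasurementContext and exists "
    "appliesTo/LifeStageProperty and atMost 1/appliesTo)",
    "exists itemMeasured/AbundanceCount",
    "atMost 1/itemMeasured",
})

# The expression for semType(P2) adds two conjuncts about accuracy.
P2_EXPR = P3_EXPR | {
    "exists hasProperty/AccuracyQualifier",
    "atMost 1/hasProperty",
}


def conjunct_subsumed(sub, sup):
    """Sufficient (not necessary) test that conjunction `sub` is subsumed by
    conjunction `sup`: every conjunct of `sup` also occurs in `sub`."""
    return sup <= sub


print(conjunct_subsumed(P2_EXPR, P3_EXPR))  # source expression vs. target
print(conjunct_subsumed(P3_EXPR, P2_EXPR))  # the converse direction
```

The test is sound for conjunctions but incomplete: it misses subsumptions that follow from the ontology (for example via LifeStageProperty subClassOf EcologicalProperty), which is what a DL reasoner is for.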


Outline
• The SEEK Project
• Scientific Workflows
• The Problem: Reusing Structurally Incompatible Services
• The Ontology-Driven Framework
• Future Work


The Ontology-Driven Framework
Define semantic registration mappings (“semantic views”) to connect structural and semantic types

Use registration mappings to (semi-)automate transformation, based on derived structural correspondences

Depending on the ontologies and registration mappings, it may not be possible to find an appropriate … (since the correspondence is often under-specified)


The Ontology-Driven Framework

[Figure: SourceService (port Ps) and TargetService (port Pt) with the desired connection. Each port has a structural type and a semantic type; the semantic types come from ontologies (OWL) and are compatible (⊑). A registration mapping (output) links StructuralType Ps to SemanticType Ps; a registration mapping (input) links StructuralType Pt to SemanticType Pt.]


Registration Example (simple XPaths)

/population/sample == semType(P2)
/population/sample/meas/cnt == semType(P2).itemMeasured
/population/sample/meas/cnt/text() == semType(P2).itemMeasured.hasCount
/population/sample/meas/acc == semType(P2).hasProperty
/population/sample/meas/acc/text() == semType(P2).hasProperty.hasValue
/population/sample/lsp/text() == semType(P2).hasContext.appliesTo

structType(P2):
  root population = (sample)*
  elem sample = (meas, lsp)
  elem meas = (cnt, acc)
  elem cnt = xsd:integer
  elem acc = xsd:double
  elem lsp = xsd:string

  <population>
    <sample>
      <meas>
        <cnt>44,000</cnt>
        <acc>0.95</acc>
      </meas>
      <lsp>Eggs</lsp>
    </sample>
    …
  </population>

• Each sample is an instance of the semantic type
• Each sample’s cnt represents the itemMeasured object
• Each sample’s cnt’s value represents the hasCount value of the corresponding itemMeasured object
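
To see what such a registration does on data, the following Python sketch evaluates the six XPaths against the sample document and groups the selected fragments by semantic path. It is an illustration under assumptions: the helper `select` handles only the simple child-axis paths used on the slide (optionally ending in `/text()`), and the names `REGISTRATION`, `select` and `annotated` are hypothetical. It uses the standard-library `xml.etree.ElementTree` module.

```python
import xml.etree.ElementTree as ET

DOC = """<population>
  <sample>
    <meas><cnt>44,000</cnt><acc>0.95</acc></meas>
    <lsp>Eggs</lsp>
  </sample>
</population>"""

# (XPath on the structural type, path in the semantic type), as on the slide.
REGISTRATION = [
    ("/population/sample", "semType(P2)"),
    ("/population/sample/meas/cnt", "semType(P2).itemMeasured"),
    ("/population/sample/meas/cnt/text()", "semType(P2).itemMeasured.hasCount"),
    ("/population/sample/meas/acc", "semType(P2).hasProperty"),
    ("/population/sample/meas/acc/text()", "semType(P2).hasProperty.hasValue"),
    ("/population/sample/lsp/text()", "semType(P2).hasContext.appliesTo"),
]


def select(root, xpath):
    """Evaluate a simple absolute child-axis path; a trailing /text()
    yields the text of the selected elements instead of the elements."""
    steps = xpath.strip("/").split("/")
    want_text = steps[-1] == "text()"
    if want_text:
        steps = steps[:-1]
    if steps[0] != root.tag:
        return []
    # ElementTree paths are relative to the element, so drop the root step
    nodes = [root] if len(steps) == 1 else root.findall("/".join(steps[1:]))
    return [n.text for n in nodes] if want_text else nodes


root = ET.fromstring(DOC)
annotated = {sem: select(root, path) for path, sem in REGISTRATION}
print(annotated["semType(P2).itemMeasured.hasCount"])
print(annotated["semType(P2).hasContext.appliesTo"])
```

The value "44,000" is kept as text exactly as on the slide; with its thousands separator it would not parse as a number without further cleaning.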


Registration Example (simple XPaths)

… similarly for P3 …

/cohortTable/measurement == semType(P3)
/cohortTable/measurement/obs == semType(P3).itemMeasured
/cohortTable/measurement/obs/text() == semType(P3).itemMeasured.hasCount
/cohortTable/measurement/phase/text() == semType(P3).hasContext.appliesTo

structType(P3):
  root cohortTable = (measurement)*
  elem measurement = (phase, obs)
  elem phase = xsd:string
  elem obs = xsd:integer

  <cohortTable>
    <measurement>
      <phase>Eggs</phase>
      <obs>44,000</obs>
    </measurement>
    …
  </cohortTable>


The Ontology-Driven Framework

[Figure: SourceService (port Ps) and TargetService (port Pt) with the desired connection. Each port has a structural type and a semantic type; the semantic types come from ontologies (OWL) and are compatible (⊑). The registration mappings (output for Ps, input for Pt) link structural to semantic types, and together they yield a correspondence between StructuralType Ps and StructuralType Pt.]


Correspondence Example/population/sample == semType(P2)/population/sample/meas/cnt == semType(P2).itemMeasured/population/sample/meas/cnt/text() == semType(P2).itemMeasured.hasCount/population/sample/meas/acc == semType(P2).hasProperty/population/sample/meas/acc/text() == semType(P2).hasProperty.hasValue/population/sample/lsp/text() == semType(P2).hasContext.appliesTo

/cohortTable/measurement == semType(P3)/cohortTable/measurement/obs == semType(P3).itemMeasured/cohortTable/measurement/obs/text() == semType(P3).itemMeasured.hasCount/cohortTable/measurement/phase/text() == semType(P3).hasContext.appliesTo

Source-side semantic registration mapping

Target-side semantic registration mapping

populationsample *meascnt

xsd:double

xsd:stringlsp

xsd:integer

acc

cohortTablemeasurement *obsxsd:integer

phasexsd:string

Page 105: Introduction to Databases: From Data to Knowledge Bases Instructors: Bertram Ludaescher Kai Lin Instructors: Bertram Ludaescher Kai Lin

105Introduction to KR, B. Ludaescher & K. Lin

Correspondence Example/population/sample == semType(P2)/population/sample/meas/cnt == semType(P2).itemMeasured/population/sample/meas/cnt/text() == semType(P2).itemMeasured.hasCount/population/sample/meas/acc == semType(P2).hasProperty/population/sample/meas/acc/text() == semType(P2).hasProperty.hasValue/population/sample/lsp/text() == semType(P2).hasContext.appliesTo

/cohortTable/measurement == semType(P3)/cohortTable/measurement/obs == semType(P3).itemMeasured/cohortTable/measurement/obs/text() == semType(P3).itemMeasured.hasCount/cohortTable/measurement/phase/text() == semType(P3).hasContext.appliesTo

Source

Target

populationsample *meascnt

xsd:double

xsd:stringlsp

xsd:integer

acc

cohortTablemeasurement *obsxsd:integer

phasexsd:string

We want to “compose”the registrations to obtain

structural correspondences

We want to “compose”the registrations to obtain

structural correspondences

Page 106: Introduction to Databases: From Data to Knowledge Bases Instructors: Bertram Ludaescher Kai Lin Instructors: Bertram Ludaescher Kai Lin

106Introduction to KR, B. Ludaescher & K. Lin

Correspondence Example/population/sample == semType(P2) /population/sample/meas/cnt == semType(P2).itemMeasured/population/sample/meas/cnt/text() == semType(P2).itemMeasured.hasCount/population/sample/meas/acc == semType(P2).hasProperty/population/sample/meas/acc/text() == semType(P2).hasProperty.hasValue/population/sample/lsp/text() == semType(P2).hasContext.appliesTo

/cohortTable/measurement == semType(P3)/cohortTable/measurement/obs == semType(P3).itemMeasured/cohortTable/measurement/obs/text() == semType(P3).itemMeasured.hasCount/cohortTable/measurement/phase/text() == semType(P3).hasContext.appliesTo

Source

Target

populationsample *meascnt

xsd:double

xsd:stringlsp

xsd:integer

acc

cohortTablemeasurement *obsxsd:integer

phasexsd:string

/population/sample == semType(P2)

/cohortTable/measurement == semType(P3)

These fragments correspond

Page 107: Introduction to Databases: From Data to Knowledge Bases Instructors: Bertram Ludaescher Kai Lin Instructors: Bertram Ludaescher Kai Lin

107Introduction to KR, B. Ludaescher & K. Lin

Correspondence Example/population/sample == semType(P2) /population/sample/meas/cnt == semType(P2).itemMeasured/population/sample/meas/cnt/text() == semType(P2).itemMeasured.hasCount/population/sample/meas/acc == semType(P2).hasProperty/population/sample/meas/acc/text() == semType(P2).hasProperty.hasValue/population/sample/lsp/text() == semType(P2).hasContext.appliesTo

/cohortTable/measurement == semType(P3)/cohortTable/measurement/obs == semType(P3).itemMeasured/cohortTable/measurement/obs/text() == semType(P3).itemMeasured.hasCount/cohortTable/measurement/phase/text() == semType(P3).hasContext.appliesTo

Source

Target

populationsample *meascnt

xsd:double

xsd:stringlsp

xsd:integer

acc

cohortTablemeasurement *obsxsd:integer

phasexsd:string

/population/sample/meas/cnt == semType(P2).itemMeasured

/cohortTable/measurement/obs == semType(P3).itemMeasured

These fragments correspond

Page 108: Introduction to Databases: From Data to Knowledge Bases Instructors: Bertram Ludaescher Kai Lin Instructors: Bertram Ludaescher Kai Lin

108Introduction to KR, B. Ludaescher & K. Lin

Correspondence Example/population/sample == semType(P2) /population/sample/meas/cnt == semType(P2).itemMeasured/population/sample/meas/cnt/text() == semType(P2).itemMeasured.hasCount/population/sample/meas/acc == semType(P2).hasProperty/population/sample/meas/acc/text() == semType(P2).hasProperty.hasValue/population/sample/lsp/text() == semType(P2).hasContext.appliesTo

/cohortTable/measurement == semType(P3)/cohortTable/measurement/obs == semType(P3).itemMeasured/cohortTable/measurement/obs/text() == semType(P3).itemMeasured.hasCount/cohortTable/measurement/phase/text() == semType(P3).hasContext.appliesTo

Source

Target

populationsample *meascnt

xsd:double

xsd:stringlsp

xsd:integer

acc

cohortTablemeasurement *obsxsd:integer

phasexsd:string

/population/sample/meas/cnt/text() == semType(P2).itemMeasured.hasCount

/cohortTable/measurement/obs/text() == semType(P3).itemMeasured.hasCount

These fragments correspond

Page 109: Introduction to Databases: From Data to Knowledge Bases Instructors: Bertram Ludaescher Kai Lin Instructors: Bertram Ludaescher Kai Lin

109Introduction to KR, B. Ludaescher & K. Lin

Correspondence Example

/population/sample == semType(P2)
/population/sample/meas/cnt == semType(P2).itemMeasured
/population/sample/meas/cnt/text() == semType(P2).itemMeasured.hasCount
/population/sample/meas/acc == semType(P2).hasProperty
/population/sample/meas/acc/text() == semType(P2).hasProperty.hasValue
/population/sample/lsp/text() == semType(P2).hasContext.appliesTo

/cohortTable/measurement == semType(P3)
/cohortTable/measurement/obs == semType(P3).itemMeasured
/cohortTable/measurement/obs/text() == semType(P3).itemMeasured.hasCount
/cohortTable/measurement/phase/text() == semType(P3).hasContext.appliesTo

[Figure: Source schema tree (population, sample *, meas, cnt, acc, lsp; leaf types xsd:integer, xsd:double, xsd:string) and Target schema tree (cohortTable, measurement *, obs: xsd:integer, phase: xsd:string)]

/population/sample/lsp/text() == semType(P2).hasContext.appliesTo

/cohortTable/measurement/phase/text() == semType(P3).hasContext.appliesTo

These fragments correspond
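The pairing shown on these slides can be sketched in Python. This is only an illustration, not the instructors' implementation: it assumes that two fragments correspond when their semantic-type expressions are identical once the port name (P2 or P3) is dropped, and that the two ports have already been found semantically compatible. The names `parse` and `corresponding_fragments` are hypothetical.

```python
import re

# Correspondences copied from the slide: XPath == semantic-type expression.
SOURCE = """
/population/sample == semType(P2)
/population/sample/meas/cnt == semType(P2).itemMeasured
/population/sample/meas/cnt/text() == semType(P2).itemMeasured.hasCount
/population/sample/meas/acc == semType(P2).hasProperty
/population/sample/meas/acc/text() == semType(P2).hasProperty.hasValue
/population/sample/lsp/text() == semType(P2).hasContext.appliesTo
"""
TARGET = """
/cohortTable/measurement == semType(P3)
/cohortTable/measurement/obs == semType(P3).itemMeasured
/cohortTable/measurement/obs/text() == semType(P3).itemMeasured.hasCount
/cohortTable/measurement/phase/text() == semType(P3).hasContext.appliesTo
"""


def parse(block):
    """Map each ontology property path (port name dropped) to its XPath."""
    result = {}
    for line in block.strip().splitlines():
        xpath, sem = (part.strip() for part in line.split("=="))
        # "semType(P2).itemMeasured" -> "itemMeasured"; "semType(P2)" -> ""
        prop_path = re.sub(r"^semType\(\w+\)\.?", "", sem)
        result[prop_path] = xpath
    return result


def corresponding_fragments(source, target):
    """Pair source and target XPaths that share the same property path."""
    src, tgt = parse(source), parse(target)
    return [(src[p], tgt[p]) for p in src if p in tgt]


pairs = corresponding_fragments(SOURCE, TARGET)
for s, t in pairs:
    print(s, "<->", t)
```

Under this assumption the two `acc` paths (hasProperty, hasProperty.hasValue) find no partner, because the target registers nothing under hasProperty.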


The Ontology-Driven Framework

[Figure: framework diagram. A Source Service with port Ps and a Target Service with port Pt are linked by a Desired Connection. Each port has a Semantic Type and a Structural Type, related through a Registration Mapping (Output for Ps, Input for Pt) to the Ontologies (OWL). Remaining labels: Compatible (⊑), Correspondence, Generate (Ps), Transformation.]


Example Result (XQuery)

Based on the structural correspondences and certain assumptions, we derive the transformation XQuery:

<cohortTable> {
  for $s in /population/sample
  return
    <measurement>
      { for $c in $s/meas/cnt return <obs>{$c/text()}</obs> }
      { for $l in $s/lsp return <phase>{$l/text()}</phase> }
    </measurement>
}</cohortTable>
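To see what this query does to a concrete document, the same loop structure can be sketched in Python with the standard library's `xml.etree.ElementTree`. This is an illustrative re-implementation, not part of the original slides; the function name `transform` and the sample values (12, 0.9, larva, and so on) are made up for the example.

```python
import xml.etree.ElementTree as ET


def transform(population):
    """Mirror the XQuery: one <measurement> per <sample>, with each
    meas/cnt copied to <obs> and each lsp copied to <phase>."""
    cohort = ET.Element("cohortTable")
    for sample in population.findall("sample"):
        measurement = ET.SubElement(cohort, "measurement")
        for cnt in sample.findall("meas/cnt"):
            ET.SubElement(measurement, "obs").text = cnt.text
        for lsp in sample.findall("lsp"):
            ET.SubElement(measurement, "phase").text = lsp.text
    return cohort


# Hypothetical source document; the values are invented for illustration.
SAMPLE = """<population>
  <sample><meas><cnt>12</cnt><acc>0.9</acc></meas><lsp>larva</lsp></sample>
  <sample><meas><cnt>7</cnt><acc>0.8</acc></meas><lsp>adult</lsp></sample>
</population>"""

result = transform(ET.fromstring(SAMPLE))
print(ET.tostring(result, encoding="unicode"))
```

Note that `acc` is dropped: it has no counterpart in the target schema, so neither the XQuery nor this sketch copies it.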


Assumptions Made (or why this may not work for you…)

• Common XPath prefixes refer to the same element
• Elements in correspondences have compatible cardinalities
  – source is equivalent to or stricter than target (e.g., + is stricter than *)
• Primitive data types are compatible
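The cardinality assumption can be made concrete with a small Python check. This is one possible reading, not taken from the slides: each occurrence indicator is modeled as a (min, max) bound, and "equivalent to or stricter than" is interpreted as the source's range lying inside the target's range. The names `BOUNDS` and `cardinality_compatible` are hypothetical.

```python
import math

# Occurrence indicators as (min, max) bounds, in DTD-style notation.
# "1" stands for an element with no indicator (exactly one occurrence).
BOUNDS = {
    "1": (1, 1),
    "?": (0, 1),
    "+": (1, math.inf),
    "*": (0, math.inf),
}


def cardinality_compatible(source, target):
    """True if the source cardinality is equivalent to or stricter than
    the target's, i.e. every allowed source count is allowed by the target."""
    s_min, s_max = BOUNDS[source]
    t_min, t_max = BOUNDS[target]
    return t_min <= s_min and s_max <= t_max


print(cardinality_compatible("+", "*"))  # + is stricter than *
print(cardinality_compatible("*", "+"))  # * may yield zero items, + forbids that
```

In the running example both sample and measurement carry *, so the check passes trivially; it matters when a source that may be empty feeds a target that requires at least one element.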