Page 1: SEEK Semantic Mediation

SEEK Semantic Mediation

Shawn Bowers, Bertram Ludäscher

e-Science Centre, May 11-14, 2004

Page 2: SEEK Semantic Mediation

Outline

• The Sparrow Toolkit

• Semantic Registration

• Ontology-Driven Structural Transformation

Page 3: SEEK Semantic Mediation

Outline

• The Sparrow Toolkit

• Semantic Registration

• Ontology-Driven Structural Transformation

Page 4: SEEK Semantic Mediation

Semantic Mediation in SEEK: Our focus

Resource Discovery
– Ontology-driven tools to help search for datasets and services using semantic descriptions …

Data Transformation
– Determine and execute mappings to compose services and bind data to services

Data Integration
– Provide reconciled, uniform access to multiple datasets

“Semantic” Workflow Analysis
– Verify semantic correctness, accumulate semantic information, and provide workflow planning/suggestion services … the future

Page 5: SEEK Semantic Mediation

The Sparrow Toolkit: Vision

Lightweight languages and command-line-style services to support mediation

– Syntax and language conversion
  • DL, FOL, OWL, RDF, …
– Reasoning
  • subsumption, classification, consistency, satisfiability, datatypes, instance classification, …
– Display utilities
  • hierarchies, OO/ER style models, OWL DLs?
– Query
  • Query answering, semantic query rewriting, semantic registration, integration, …

Logic-based implementation (Prolog)
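
As a rough illustration of the reasoning services listed above, subsumption over a hierarchy of told is-a axioms can be treated as reachability. The sketch below is in Python rather than Prolog, covers atomic concepts only, and does not reproduce Sparrow's actual interface. The axioms and the `subsumes` helper are assumptions made up for the example.

```python
from collections import defaultdict

# Hypothetical told is-a axioms (child, parent); assumed for illustration,
# not taken from an actual SEEK ontology.
AXIOMS = [
    ("Species", "TaxonomicRank"),
    ("Genus", "TaxonomicRank"),
    ("AbundanceCount", "EcologicalProperty"),
    ("EcologicalProperty", "MeasProperty"),
]

def build_parents(axioms):
    """Index the axioms as concept -> set of direct parents."""
    parents = defaultdict(set)
    for child, parent in axioms:
        parents[child].add(parent)
    return parents

def ancestors(concept, parents):
    """All concepts that subsume `concept` (reflexive and transitive)."""
    seen = {concept}
    stack = [concept]
    while stack:
        current = stack.pop()
        for parent in parents.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen

def subsumes(general, specific, parents):
    """True if `specific` is a (possibly indirect) subconcept of `general`."""
    return general in ancestors(specific, parents)

PARENTS = build_parents(AXIOMS)
```

Classification in this simplified setting amounts to computing `ancestors` for every concept; a description-logic reasoner would also have to handle role restrictions and other constructors.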

Page 6: SEEK Semantic Mediation

Some sparrow-dl (Taxon example)

Page 7: SEEK Semantic Mediation

Some more sparrow-dl (“textbook” example)

Page 8: SEEK Semantic Mediation

display_formulas(KB)

Page 9: SEEK Semantic Mediation

display_preclassified_hierarchy(K)

Page 10: SEEK Semantic Mediation

display_classified_hierarchy(K)

Page 11: SEEK Semantic Mediation

display_classified(K)

Page 12: SEEK Semantic Mediation

Outline

• The Sparrow Toolkit

• Semantic Registration

• Ontology-Driven Structural Transformation

Page 13: SEEK Semantic Mediation

Adding semantics to EML: Observations

The finer-grained the annotation, the more opportunity for discovery, integration, and transformation …

The coarser-grained the annotation, the harder it is to do useful operations, unless your ontology is very deep

[Figure: annotation granularity (fine to coarse) plotted against ontology depth (shallow to deep), with a region marked “maximal ontology/annotation leverage”]

Page 14: SEEK Semantic Mediation

Semantic Registration (SSDBM’04)

By annotation granularity, we mean:

– Resource-Level “Metadata”
– Attribute Level (the attribute itself)
– Attribute Level (as a collection-value)
– Attribute Level (as independent values)
– Attribute Groups (as a collection-value or independent values)
– Filtered values (e.g., SQL where-clause)
– Specific value annotations (as a mapping function or stated by-hand)

Often, integration and transformation require very detailed annotations

Page 15: SEEK Semantic Mediation

Some Examples (arguments against concepts-as-labels)

r(…, lt, ln, …)

sem(lt) == latitude
sem(ln) == longitude

Question: What do these annotations mean?
1. The name “lt” itself refers to latitude?
2. The set of values in the column, taken as a whole, makes up a latitude (like coverage)
3. Each individual value in the column denotes a separate latitude (Is it a latitude though? Or just a coded rep.?)

We want to avoid these ambiguous annotations … often

Page 16: SEEK Semantic Mediation

Some Examples (still not enough)

r(…, lt, ln, …)

sem(lt) == values represent latitude
sem(ln) == values represent longitude

More problems: How do I know lt and ln go together to form a location, for example, …

[Figure: ontology fragment in which Location has roles lat and lon pointing to Latitude and Longitude]

Page 17: SEEK Semantic Mediation

Some Examples (still not enough)

r(…, lt, ln, lt-end, ln-end, …)

sem(lt) == values represent latitude
sem(ln) == values represent longitude
sem(lt-end) == values represent latitude
sem(ln-end) == values represent longitude

Which lat goes with which lon?

[Figure: ontology fragment in which Location has roles lat and lon pointing to Latitude and Longitude]

Page 18: SEEK Semantic Mediation

Some Examples (still not enough)

r(…, lt, ln, lt-end, ln-end, …)

sem(lt, ln) == values represent location and lat leads to semval(lt) and lon leads to semval(ln) *

sem(lt, ln) == values represent location
sem(lt) == values represent latitude
sem(ln) == values represent longitude
sem(lt, ln) == values represent location and …
sem(lt-end) == values represent latitude
sem(ln-end) == values represent longitude

What if we want to integrate with another dataset with two lat/lons? What do we do?

[Figure: ontology fragment in which Location has roles lat and lon pointing to Latitude and Longitude]

* We could infer the lat and lon roles here; in general, I don’t think we can infer roles as such…

Page 19: SEEK Semantic Mediation

Some Examples (still not enough)

r(…, lt, ln, lt-end, ln-end, …)

sem(lt, ln, lt-end, ln-end) == values represent transect and start leads to semval(lt, ln) and end leads to semval(lt-end, ln-end)

sem(lt, ln) == values represent location and …
sem(lt) == values represent latitude
sem(ln) == values represent longitude
sem(lt, ln) == values represent location and …
sem(lt-end) == values represent latitude
sem(ln-end) == values represent longitude

So, even in very simple cases, annotations can become complex …

[Figure: ontology fragment in which Transect has roles start and end pointing to Location, and Location has roles lat and lon pointing to Latitude and Longitude]
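
One way to picture the nested annotation above is as a small tree that binds columns to ontology roles, so that it is explicit which lat goes with which lon. The following Python sketch is only an illustration: the `Annotation` class and `instantiate` function are hypothetical, the role names follow the lat/lon/start/end labels used on the slides, and the end-point values in the sample row are made up.

```python
from dataclasses import dataclass, field
from typing import Dict, Union

@dataclass
class Annotation:
    """A concept plus roles; each role points to a column name or a nested annotation."""
    concept: str
    roles: Dict[str, Union[str, "Annotation"]] = field(default_factory=dict)

def location(lat_col, lon_col):
    """Bind a pair of columns to the lat and lon roles of one Location."""
    return Annotation("Location", {"lat": lat_col, "lon": lon_col})

# sem(lt, ln, lt-end, ln-end): a transect whose start and end are locations.
TRANSECT = Annotation("Transect", {
    "start": location("lt", "ln"),
    "end": location("lt-end", "ln-end"),
})

def instantiate(annotation, row):
    """Build a nested dict for one row by following the annotation's roles."""
    result = {"concept": annotation.concept}
    for role, target in annotation.roles.items():
        if isinstance(target, Annotation):
            result[role] = instantiate(target, row)
        else:
            result[role] = row[target]
    return result

# Sample row; the lt-end and ln-end values are invented for the example.
row = {"lt": 41.6, "ln": -119.383, "lt-end": 41.7, "ln-end": -119.3}
transect = instantiate(TRANSECT, row)
```

Because the pairing of columns lives in the annotation itself, a second dataset with two lat/lon pairs can be registered with its own `Annotation` tree without guessing from column names.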

Page 20: SEEK Semantic Mediation

Executable, Fine-Grain Semantic Registration

genus           species       count  lat      lon
'Acanthomyops'  'latipes'     1      41.6     -119.383
'Acromyrmex'    'versicolor'  1      33.1839  -114.866
'Anergates'     'atratulus'   1      37.9833  -84.5167
'Anergates'     'atratulus'   4      38.8833  -77.1167

Each row represents a RatioMeasurement

RatioMeasurement

Page 21: SEEK Semantic Mediation

Executable, Fine-Grain Semantic Registration (cont.)

genus           species       count  lat      lon
'Acanthomyops'  'latipes'     1      41.6     -119.383
'Acromyrmex'    'versicolor'  1      33.1839  -114.866
'Anergates'     'atratulus'   1      37.9833  -84.5167
'Anergates'     'atratulus'   4      38.8833  -77.1167

For a row, count is the value of the measurement

[Figure: instance graph for the first row. Classes: RatioMeasurement, LocalInteger. Properties: value, dataValue. Value shown: 1]

Page 22: SEEK Semantic Mediation

Executable, Fine-Grain Semantic Registration (cont.)

genus           species       count  lat      lon
'Acanthomyops'  'latipes'     1      41.6     -119.383
'Acromyrmex'    'versicolor'  1      33.1839  -114.866
'Anergates'     'atratulus'   1      37.9833  -84.5167
'Anergates'     'atratulus'   4      38.8833  -77.1167

For a row, lat/lon are the location values of the measurement

[Figure: instance graph for the first row. Classes: RatioMeasurement, LocalInteger, LocationContext, GeogCoordPoint. Properties: value, dataValue, context, location, latitude, longitude. Values shown: 1, 41.6, -119.383]

Page 23: SEEK Semantic Mediation

Executable, Fine-Grain Semantic Registration (cont.)

genus           species       count  lat      lon
'Acanthomyops'  'latipes'     1      41.6     -119.383
'Acromyrmex'    'versicolor'  1      33.1839  -114.866
'Anergates'     'atratulus'   1      37.9833  -84.5167
'Anergates'     'atratulus'   4      38.8833  -77.1167

For a row, genus/species are mapped to standard values, associated

[Figure: instance graph for the first row. Classes: RatioMeasurement, Count, Entity, TaxonomicGroup, SimpleTaxonomicId, Genus, Species. Properties: itemMeasured, property, taxonomicID, genus, species, rankName, subCat, superCat. Identifiers shown: taxon:1883/5, taxon:1883/3]
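
The row-to-instance registration built up over the preceding slides can be sketched in Python. The class and property names follow the slide labels (RatioMeasurement, LocalInteger, LocationContext, GeogCoordPoint, value, dataValue, context, location, itemMeasured), but how they nest is an assumption, since the original figure is lost. The `register_row` function is hypothetical and is not Sparrow's syntax, and the mapping of genus/species names to standard taxon identifiers is left out.

```python
import csv
import io

# Sample rows from the slide, rewritten as CSV.
DATA = """genus,species,count,lat,lon
Acanthomyops,latipes,1,41.6,-119.383
Acromyrmex,versicolor,1,33.1839,-114.866
Anergates,atratulus,1,37.9833,-84.5167
Anergates,atratulus,4,38.8833,-77.1167
"""

def register_row(row):
    """Turn one flat row into a nested, ontology-shaped instance (assumed shape)."""
    return {
        "type": "RatioMeasurement",
        # count is the value of the measurement
        "value": {"type": "LocalInteger", "dataValue": int(row["count"])},
        # lat/lon give the location context of the measurement
        "context": {
            "type": "LocationContext",
            "location": {
                "type": "GeogCoordPoint",
                "latitude": float(row["lat"]),
                "longitude": float(row["lon"]),
            },
        },
        # genus/species describe what was measured; the lookup of
        # standard taxon identifiers is omitted in this sketch
        "itemMeasured": {
            "type": "TaxonomicGroup",
            "genus": row["genus"],
            "species": row["species"],
        },
    }

instances = [register_row(r) for r in csv.DictReader(io.StringIO(DATA))]
```

Each row becomes a partial instantiation of ontology classes, which is what allows the resource to be queried through the ontology rather than through its column names.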

Page 24: SEEK Semantic Mediation

Querying based on Semantic Registrations

[Figure: the combined instance graph from the preceding slides, covering the measurement value (RatioMeasurement, LocalInteger), the location context (LocationContext, GeogCoordPoint, latitude 41.6, longitude -119.383), and the item measured (TaxonomicGroup, SimpleTaxonomicId, Genus taxon:1883/5, Species taxon:1883/3)]

Find all datasets that measure species of ‘Acanthomyops’ in South Africa … and return a set of all lat/lon “points” (demo …)
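
A query of this kind can be pictured as a filter over registered instances. The Python sketch below is an assumption-laden illustration, not the demo itself: the `Measurement` and `BoundingBox` types and the `query_points` function are hypothetical, the bounding box for South Africa is only a rough approximation, and the registered instances are invented sample data (only the first coordinate pair appears in the slides).

```python
from typing import NamedTuple

class Measurement(NamedTuple):
    dataset: str
    genus: str
    latitude: float
    longitude: float

class BoundingBox(NamedTuple):
    south: float
    north: float
    west: float
    east: float

    def contains(self, lat, lon):
        return self.south <= lat <= self.north and self.west <= lon <= self.east

# Rough, assumed bounding box for South Africa (degrees).
SOUTH_AFRICA = BoundingBox(south=-35.0, north=-22.0, west=16.0, east=33.0)

# Invented registered instances, for illustration only.
REGISTERED = [
    Measurement("dataset-a", "Acanthomyops", 41.6, -119.383),
    Measurement("dataset-b", "Acanthomyops", -25.35, 28.0),
    Measurement("dataset-b", "Camponotus", -26.0, 27.5),
]

def query_points(instances, genus, region):
    """Return the matching dataset names and the set of (lat, lon) points."""
    hits = [m for m in instances
            if m.genus == genus and region.contains(m.latitude, m.longitude)]
    datasets = sorted({m.dataset for m in hits})
    points = {(m.latitude, m.longitude) for m in hits}
    return datasets, points

datasets, points = query_points(REGISTERED, "Acanthomyops", SOUTH_AFRICA)
```

The query never mentions column names such as lt or ln; it is phrased against the ontology-level structure, which is the point of registering datasets at a fine grain.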

Page 25: SEEK Semantic Mediation

Architecture

[Figure: architecture diagram. Labelled parts: SMS Operations (discover_resources, query_resources, integrate_resources), Dataset repository (heterogeneous), Semantic Annotations, Mappings, Ontology repository, Taxon Services (Synonyms, Concept IDs), Lat/Lon and Species Queries, Results]

Page 26: SEEK Semantic Mediation

Finding user interfaces that are easy-to-use, but provide detailed annotations

resource id: antweb:040412

<<registration information/properties>>

<<ontology view>> <<sample instance view>>

<<annotation, schema, and data>>

Columns: genus, species, lat, lon, count
Annotations: TaxaConceptID, Value, Value, Value

41.6 -119.4 5 ‘Manica’ ‘bradleyi’
34.9 -120.7 2 ‘Formica’ ‘fusca’

Page 27: SEEK Semantic Mediation

A Sparrow Executable Semantic Annotation Registration

A partial object instantiation (of onto classes)

The resource can be queried directly using the object structure (i.e., using the ontology)

Page 28: SEEK Semantic Mediation

Outline

• The Sparrow Toolkit

• Semantic Registration

• Ontology-Driven Structural Transformation

Page 29: SEEK Semantic Mediation

Example Structural Types (XML)

[Figure: workflow fragment with services S1 (life stage property) and S2 (mortality rate for period) and ports P1 to P5]

structType(P2):

root population = (sample)*
elem sample = (meas, lsp)
elem meas = (cnt, acc)
elem cnt = xsd:integer
elem acc = xsd:double
elem lsp = xsd:string

<population>
  <sample>
    <meas>
      <cnt>44,000</cnt>
      <acc>0.95</acc>
    </meas>
    <lsp>Eggs</lsp>
  </sample>
  …
</population>

structType(P3):

root cohortTable = (measurement)*
elem measurement = (phase, obs)
elem phase = xsd:string
elem obs = xsd:integer

<cohortTable>
  <measurement>
    <phase>Eggs</phase>
    <obs>44,000</obs>
  </measurement>
  …
</cohortTable>
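
To make the notion of a structural type concrete, the population schema can be written as a small content-model table and a document checked against it. This Python sketch is an assumption: the table encoding and the `conforms` function are hypothetical, only element names and nesting are checked, and leaf datatypes such as xsd:integer are not validated. The count is written without a thousands separator.

```python
import xml.etree.ElementTree as ET

# Structural type of the population schema: element -> (kind, spec).
POPULATION_TYPE = {
    "population": ("*", "sample"),        # zero or more sample children
    "sample": ("seq", ["meas", "lsp"]),   # fixed sequence of children
    "meas": ("seq", ["cnt", "acc"]),
    "cnt": ("leaf", "xsd:integer"),
    "acc": ("leaf", "xsd:double"),
    "lsp": ("leaf", "xsd:string"),
}

def conforms(element, struct_type):
    """Check element names and nesting only; leaf datatypes are not validated."""
    if element.tag not in struct_type:
        return False
    kind, spec = struct_type[element.tag]
    children = [child.tag for child in element]
    if kind == "*":
        ok = all(tag == spec for tag in children)
    elif kind == "seq":
        ok = children == spec
    else:  # leaf: no child elements allowed
        ok = not children
    return ok and all(conforms(child, struct_type) for child in element)

DOC = """<population>
  <sample>
    <meas><cnt>44000</cnt><acc>0.95</acc></meas>
    <lsp>Eggs</lsp>
  </sample>
</population>"""

root = ET.fromstring(DOC)
is_valid = conforms(root, POPULATION_TYPE)
```

The cohortTable schema could be encoded the same way, with `measurement` as a sequence of `phase` and `obs`.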

Page 30: SEEK Semantic Mediation

Example Semantic Types

Portion of SEEK measurement ontology

[Figure: ontology diagram. Classes: Observation, MeasContext, Entity, MeasProperty, AccuracyQualifier, EcologicalProperty, AbundanceCount, LifeStageProperty, NumericValue, SpatialLocation. Properties: hasContext, appliesTo, hasProperty, hasLocation, hasCount, hasValue, itemMeasured. Cardinalities shown: 0:*, 1:1, 1:*]

Page 31: SEEK Semantic Mediation

Example Semantic Types

Semantic types for P2 and P3

[Figure: workflow fragment with services S1 (life stage property) and S2 (mortality rate for period) and ports P1 to P5]

[Figure: semType(P3), built from Observation, MeasContext, LifeStageProperty, AbundanceCount, and NumberValue, using the properties hasContext, appliesTo, itemMeasured, and hasCount, each with cardinality 1:1]

[Figure: semType(P2), which additionally includes AccuracyQualifier, using the properties hasProperty and hasValue with cardinality 1:1]

Page 32: SEEK Semantic Mediation

The Ontology-Driven Framework

[Figure: a source service with port Ps and a target service with port Pt, joined by a desired connection. Each port has a structural type and a semantic type, linked by a registration mapping (output for Ps, input for Pt). The semantic types come from ontologies (OWL) and are checked for compatibility (⊑)]

Page 33: SEEK Semantic Mediation

The Ontology-Driven Framework

[Figure: a source service with port Ps and a target service with port Pt, joined by a desired connection. Each port has a structural type and a semantic type, linked by a registration mapping (output for Ps, input for Pt). The semantic types come from ontologies (OWL) and are checked for compatibility (⊑). A correspondence is added between the structural types]

Page 34: SEEK Semantic Mediation

The Ontology-Driven Framework

[Figure: a source service with port Ps and a target service with port Pt, joined by a desired connection. Each port has a structural type and a semantic type, linked by a registration mapping (output for Ps, input for Pt). The semantic types come from ontologies (OWL) and are checked for compatibility (⊑). A correspondence between the structural types is used to generate a transformation, labelled Generate (Ps)]
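
The idea of deriving a transformation from registration mappings can be sketched in Python for the population and cohortTable example. Everything here is an assumption for illustration: the mapping tables, the `correspondences` and `transform` functions, and the choice to pair paths by exact concept name rather than by subsumption (⊑) are all simplifications, and target paths are assumed to be single element names.

```python
import xml.etree.ElementTree as ET

# Assumed registration mappings: path inside one record -> ontology concept.
SOURCE_MAPPING = {   # source port Ps; records are <sample> elements
    "meas/cnt": "AbundanceCount",
    "meas/acc": "AccuracyQualifier",
    "lsp": "LifeStageProperty",
}
TARGET_MAPPING = {   # target port Pt; records are <measurement> elements
    "phase": "LifeStageProperty",
    "obs": "AbundanceCount",
}

def correspondences(source, target):
    """Pair each target path with the source path registered to the same concept."""
    by_concept = {concept: path for path, concept in source.items()}
    return {t_path: by_concept[concept]
            for t_path, concept in target.items() if concept in by_concept}

def transform(population, pairs):
    """Generate a cohortTable document by copying values along the correspondences."""
    table = ET.Element("cohortTable")
    for sample in population.findall("sample"):
        measurement = ET.SubElement(table, "measurement")
        for t_path, s_path in pairs.items():
            ET.SubElement(measurement, t_path).text = sample.findtext(s_path)
    return table

SOURCE_DOC = ET.fromstring(
    "<population><sample><meas><cnt>44000</cnt><acc>0.95</acc></meas>"
    "<lsp>Eggs</lsp></sample></population>"
)
PAIRS = correspondences(SOURCE_MAPPING, TARGET_MAPPING)
RESULT = ET.tostring(transform(SOURCE_DOC, PAIRS), encoding="unicode")
```

The accuracy value has no counterpart in the target mapping, so it is dropped; the target only requires what its semantic type mentions.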

Page 35: SEEK Semantic Mediation
Page 36: SEEK Semantic Mediation

Datasets used in the Prototype

[Figure: datasets labelled Antweb, SouthAfrica, Museum, “faked”, and Dulosis Parasite/Host]

genus species count lat lon
'Acromyrmex' 'versicolor' 1 33.1839 -114.866
…

genus species cnt lt ln
'Camponotus' 'festinatus' 3 30.55 -103.833
…

mbcnt cfcnt lat lon
1 2 -25.35 -77.1167
…

genus1 species1 genus2 species2
Manica parasitica Manica bradleyi
…