04/18/23 for KEG seminar 1

A posteriori evaluation of Ontology Mapping results

Graph-based methods for Ontology Matching

Ondřej Šváb, KIZI

04/18/23 for KEG seminar 2

Agenda
Conference track within OAEI-2006
Initial manual empirical evaluation
Empirical Evaluation via Logical Reasoning
  Mapping debugging based on Drago system
  Experiments with OntoFarm collection
Consensus Building Workshop
Mining over the mappings with meta-data

04/18/23 for KEG seminar 4

Conference track - Features

Broadly understandable domain: conference organisation
Free exploration by participants within 10 ontologies
No a priori reference alignment
Participants: 6 research groups

04/18/23 for KEG seminar 5

Conference track - Dataset

http://nb.vse.cz/~svabo/oaei2006/index2.html

OntoFarm collection

04/18/23 for KEG seminar 6

Conference track - Participants

6 participants: Automs, Coma++, OWL-CtxMatch, Falcon, HMatch, RiMOM

04/18/23 for KEG seminar 7

Conference track - Goals

Focus on interesting mappings and unclear mappings
  Why should they be mapped? Arguments: against and for
  Which systems discovered them?
  Differences in similarity measures
  Underlying techniques?

04/18/23 for KEG seminar 8

Evaluation

Processing all mappings by hand
Assessment based on the personal judgement of the organisers (consistency problem)
Tags: TP, FP, interesting, ?, heterogeneous mapping
Types of errors and phenomena: subsumption, inverse property, siblings, lexical confusion

04/18/23 for KEG seminar 9

Evaluation…
Subsumption mistaken for equivalence: Author, Paper_Author; Conference_Trip, Conference_part
Inverse property: has_author, authorOf
Siblings mistaken for equivalence: ProgramCommittee, Technical_commitee
Lexical confusion error: program, Program_chair
Relation – class mapping: has_abstract, Abstract; Topic, coversTopic; read_paper, Paper

04/18/23 for KEG seminar 10

Evaluation…

Some statistics as a side-effect of processing

04/18/23 for KEG seminar 11

Evaluation…

04/18/23 for KEG seminar 12

Agenda
Conference track within OAEI-2006
Initial manual empirical evaluation
Empirical Evaluation via Logical Reasoning
  Mapping debugging based on Drago system
  Experiments with OntoFarm collection
Consensus Building Workshop
Mining over the mappings with meta-data

04/18/23 for KEG seminar 13

Mapping debugging
Goal: to improve the quality of automatically generated mapping sets using logical reasoning about mappings
Prototype of the debugger/minimizer implemented on top of the DRAGO DDL reasoner
Semi-automatic process

04/18/23 for KEG seminar 14

Drago – Distributed Reasoning Architecture for Galaxy of Ontologies
Tool for distributed reasoning
Based on DDL (Distributed Description Logics)
Services: check ontology consistency, build classification, verify concept satisfiability, check entailment
Resource: http://drago.itc.it/

04/18/23 for KEG seminar 15

DDL
Representation framework for semantically connected ontologies
Extension of Description Logics (local interpretation, distributed, …)
Distributed T-box
Semantic relations represented via directed bridge rules, stated from the point of view of ontology j

04/18/23 for KEG seminar 16

DDL inference mechanism
Extension of the tableau algorithm
Inference of "new" subsumptions via the "subsumption propagation mechanism"
And its generalized form with disjunctions, …
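For reference, bridge rules and the propagation rule are usually written as follows in the DDL literature. This notation is an assumption here, since the slide's own formulas are not reproduced in the transcript; C, D, A, B, G, H are placeholder concept names.

```latex
% Bridge rules from ontology i to ontology j (read from j's point of view)
i : C \xrightarrow{\sqsubseteq} j : D \qquad \text{(into-bridge rule)}
i : C \xrightarrow{\sqsupseteq} j : D \qquad \text{(onto-bridge rule)}

% Simple subsumption propagation
\frac{\; i : A \xrightarrow{\sqsupseteq} j : G \qquad
         i : B \xrightarrow{\sqsubseteq} j : H \qquad
         \mathcal{T}_i \models A \sqsubseteq B \;}
     {\mathcal{T}_j \models G \sqsubseteq H}

% Generalized form with disjunctions
\frac{\; i : A \xrightarrow{\sqsupseteq} j : G \qquad
         i : B_k \xrightarrow{\sqsubseteq} j : H_k \ (1 \le k \le n) \qquad
         \mathcal{T}_i \models A \sqsubseteq B_1 \sqcup \dots \sqcup B_n \;}
     {\mathcal{T}_j \models G \sqsubseteq H_1 \sqcup \dots \sqcup H_n}
```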

04/18/23 for KEG seminar 17

Drago - architecture

DRP = Drago Reasoning Peer
Peer-to-peer network of DRPs

04/18/23 for KEG seminar 18

Drago - implementation
Ontological language: OWL
Mappings between ontologies represented in C-OWL
Distributed reasoner: extension of the OWL reasoner Pellet (http://www.mindswap.org/2003/pellet/)
Communication among DRPs via HTTP

04/18/23 for KEG seminar 19

Mapping debugging

1st step: diagnosis - detect unsatisfiable concepts (inconsistent ontology)
Assumption: semantically connected ontologies are consistent (without unsatisfiable concepts)
Therefore, unsatisfiable concepts in the target ontology are caused by some mappings

04/18/23 for KEG seminar 20

Mapping debugging

2nd step: discovering a minimal conflict set
Two conditions:
  the set of mappings causes the inconsistency, and
  removing any one mapping from the set makes the concept satisfiable
3rd step: debugging
  User feedback
  Removing the mapping with the lowest degree of confidence
  Computing the semantic distance of the concept names using WordNet synsets

04/18/23 for KEG seminar 21

Mapping debugging

4th step: minimization
Removing redundant mappings
This leads to a minimal mapping set that keeps all the semantics (a logically equivalent minimal version)
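Steps 1 to 3 can be sketched as follows. This is only an illustration under stated assumptions, not the Drago-based prototype: `is_satisfiable` is a hypothetical oracle standing in for the DDL reasoner, mappings are opaque identifiers, and the shrinking loop assumes that adding mappings never restores satisfiability.

```python
def minimal_conflict_set(mappings, is_satisfiable):
    """Shrink an unsatisfiable mapping set to a minimal conflict set.

    is_satisfiable is a hypothetical oracle (a stand-in for the reasoner):
    it returns True when the target concept is satisfiable under the
    given list of mappings.
    """
    conflict = list(mappings)
    if is_satisfiable(conflict):
        return []  # diagnosis found nothing to repair
    for m in list(conflict):
        trial = [x for x in conflict if x != m]
        if not is_satisfiable(trial):
            conflict = trial  # m is not needed to reproduce the conflict
    return conflict


def debug_mappings(mappings, confidence, is_satisfiable):
    """Repeatedly drop the least confident mapping of a minimal conflict set."""
    kept = list(mappings)
    while True:
        conflict = minimal_conflict_set(kept, is_satisfiable)
        if not conflict:
            return kept
        weakest = min(conflict, key=lambda m: confidence[m])
        kept.remove(weakest)


# Toy oracle (invented for illustration): the concept becomes
# unsatisfiable whenever m1 and m3 are both present.
def toy_oracle(ms):
    return not {"m1", "m3"} <= set(ms)


conflict = minimal_conflict_set(["m1", "m2", "m3"], toy_oracle)
repaired = debug_mappings(
    ["m1", "m2", "m3"], {"m1": 0.9, "m2": 0.7, "m3": 0.4}, toy_oracle
)
```

In a semi-automatic setting the `min(...)` choice would be replaced or confirmed by user feedback; the confidence-based rule is just one of the options listed on the slide.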

04/18/23 for KEG seminar 22

Experiments with OntoFarm
Mapping between class names
Six ontologies involved
Results from four matching systems were analysed
Results of reasoning-based analysis:

04/18/23 for KEG seminar 23

Experiments with OntoFarm

Interpretation:
1. The lower the number of inconsistent alignments, the better the quality of the mappings
2. This analysis reveals non-obvious errors in mappings

Obviously incorrect mappings; non-obvious errors in mappings

04/18/23 for KEG seminar 24

Agenda
Conference track within OAEI-2006
Initial manual empirical evaluation
Empirical Evaluation via Logical Reasoning
  Mapping debugging based on Drago system
  Experiments with OntoFarm collection
Consensus Building Workshop
Mining over the mappings with meta-data

04/18/23 for KEG seminar 25

Consensus Building Workshop
Discussion about interesting mappings discovered during manual and automatic evaluation
Reaching agreement
  Why should they be mapped? Arguments: against and for
During the discussion the following order of arguments was taken into account:
  lexical reasons
  context of elements (subclasses, superclasses, subproperties, superproperties)
  extensions of classes (set interpretation)
  properties related to classes
  axioms (more complex restrictions)

04/18/23 for KEG seminar 26

Illustrative examples

Person vs. Human

Against: different sets of subconcepts
For: the same domain
Result: YES

04/18/23 for KEG seminar 27

Illustrative examples

PC_Member vs. Member_PC

Who is a member of the ProgramCommittee?

The ontologies have different interpretations.

Either PC_Chair = Chair_PC or PC_Member = Member_PC

Result: PC_Chair = Chair_PC
Therefore: PC_Member != Member_PC

04/18/23 for KEG seminar 28

Illustrative examples

Rejection vs. Reject
Both are related to the outcome of the review of a submitted paper
Their positions in the taxonomy reveal differences in meaning

Recommendation is the input and Decision is the output of the reviewing process

04/18/23 for KEG seminar 29

Illustrative examples

Location vs. Place
Location relates to the country and city where the conference is held
Place relates to the parts of a building where particular events take place

It is necessary to look at the range and domain restrictions of related properties:

Location is domain of properties: locationOf
Location is range of properties: heldIn

iasted:Place is domain of properties: is_equipped_by
sigkdd:Place is range of properties: can_stay_in

04/18/23 for KEG seminar 30

Lessons learned
Relevance of context
  Lexical matching not enough
  Local structure not enough?
  Advice: employ semantics, background knowledge (e.g. Recommendation and Decision case)
Semantic relations
  Equivalent mappings quite often lead to inconsistencies
  Many concepts are closely related but not exactly the same
  Advice: discover not only equivalent mappings

04/18/23 for KEG seminar 31

Lessons learned

Alternative interpretations (intended meaning)
  Incomplete specification in ontologies leads to diverse interpretations (PC_Member case)
  Advice: check consistency of proposed mappings

04/18/23 for KEG seminar 32

Agenda
Conference track within OAEI-2006
Initial manual empirical evaluation
Empirical Evaluation via Logical Reasoning
  Mapping debugging based on Drago system
  Experiments with OntoFarm collection
Consensus Building Workshop
Mining over the mappings with meta-data

04/18/23 for KEG seminar 33

Mining over the mappings with meta-data

Introduction to Mapping Patterns
Mining
  4ft-Miner
  Mining over Mapping Results

04/18/23 for KEG seminar 34

Mapping patterns
Deal with (at least) two ontologies
Reflect the structure of ontologies and include mappings between elements of ontologies
A mapping pattern is a graph structure
  Nodes are concepts, relations or instances
  Edges are mappings, relations between elements (domain, range), or structural relations between classes (subclasses, siblings)

04/18/23 for KEG seminar 35

Mapping patterns - examples

The simplest one

Parent-child triangle

04/18/23 for KEG seminar 36

Mapping patterns - examples

Mapping along taxonomy

Sibling-sibling triangle
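The two triangle patterns can be detected with a few lines of code. The sketch below assumes that a parent-child triangle means one class mapped both to a class and to that class's direct subclass, and that a sibling-sibling triangle means one class mapped to two siblings; these readings, the function names and the data layout are assumptions, since the slide figures are not reproduced here.

```python
from collections import defaultdict


def _targets_by_source(alignment):
    """Group the mapped targets of each source element."""
    grouped = defaultdict(set)
    for a, b in alignment:
        grouped[a].add(b)
    return grouped


def parent_child_triangles(alignment, parent2):
    """Find (a, b, c) where a is mapped to b and to c, and b is the parent of c.

    alignment: set of (a, b) pairs, a from ontology 1 and b from ontology 2.
    parent2:   dict child -> parent for ontology 2 (single inheritance).
    """
    found = []
    for a, targets in _targets_by_source(alignment).items():
        for c in targets:
            b = parent2.get(c)
            if b in targets:
                found.append((a, b, c))
    return sorted(found)


def sibling_sibling_triangles(alignment, parent2):
    """Find (a, c, d) where a is mapped to two siblings c and d."""
    found = []
    for a, targets in _targets_by_source(alignment).items():
        ordered = sorted(targets)
        for i, c in enumerate(ordered):
            for d in ordered[i + 1:]:
                if c in parent2 and parent2.get(c) == parent2.get(d):
                    found.append((a, c, d))
    return sorted(found)


# Toy data loosely based on class names from the evaluation slides.
parent2 = {
    "Paper_Author": "Person",
    "ProgramCommittee": "Committee",
    "Technical_commitee": "Committee",
}
alignment = {
    ("Author", "Person"),
    ("Author", "Paper_Author"),
    ("Committee", "ProgramCommittee"),
    ("Committee", "Technical_commitee"),
}
pc = parent_child_triangles(alignment, parent2)
ss = sibling_sibling_triangles(alignment, parent2)
```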

04/18/23 for KEG seminar 37

Mapping patterns - usage

Mining knowledge about habits?
Enhance Ontology Mapping?

04/18/23 for KEG seminar 38

4ft-Miner
Procedure from the LISp-Miner data mining system
This procedure mines for association rules φ ≈ ψ / χ, where
  φ is the antecedent, ψ is the succedent and χ is the condition
  ≈ is a 4ft-quantifier: a statistical or heuristic test over the four-fold contingency table of φ and ψ

04/18/23 for KEG seminar 39

Mining over Mapping Results - data
Data matrix:
  Name of mapping system
  Names of elements in mapping
  Types of elements (‘c’, ‘dp’, ‘op’)
  Validity of the correspondence
  Ontologies to which the elements belong
  Types of ontologies (‘tool’, ‘insider’, ‘web’)
  Manual label – ‘correctness’ (‘+’, ‘-’, ‘?’)
  Information about patterns in which this mapping plays a role
  Measure and result of the other mapping from the pattern

04/18/23 for KEG seminar 40

Mining over Mapping Results – analytic questions

1. Which systems give higher/lower validity than others to the mappings that are deemed ‘in/correct’?

2. Which systems produce certain mapping patterns more often than others?

3. Which systems are more successful on certain types of ontologies?

04/18/23 for KEG seminar 41

Mining over Mapping Results
Output:
Ad 1)
  Falcon system: ‘incorrect’ mappings with medium validity twice as often as all systems (on average)
  RiMOM and HMatch systems: more ‘correct’ mappings with high validity than all systems (on average)
Ad 2)
  HMatch: its mappings with medium validity are more likely to instantiate Pattern 1 than such correspondences with all validity values
  RiMOM: its mappings with high validity are more likely to instantiate Pattern 2 than such correspondences with all validity values
Ad 3)
  Automs: has more correct mappings between ontologies developed according to web pages than all systems (on average)
  OWL-CtxMatch: has more correct mappings between ontologies developed by insiders than all systems (on average)

‘on average’ relates to the average difference: a(a+b+c+d)/((a+b)(a+c)) - 1
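The average difference can be computed directly from the four-fold table, where a counts rows satisfying both antecedent and succedent, b antecedent only, c succedent only, and d neither. The sketch below is a plain re-implementation of the formula, not LISp-Miner code; the row fields and the counts are invented toy data.

```python
def four_fold_table(rows, antecedent, succedent):
    """Count the four-fold contingency table (a, b, c, d) over the rows."""
    a = b = c = d = 0
    for row in rows:
        phi, psi = antecedent(row), succedent(row)
        if phi and psi:
            a += 1
        elif phi:
            b += 1
        elif psi:
            c += 1
        else:
            d += 1
    return a, b, c, d


def average_difference(a, b, c, d):
    """Return a(a+b+c+d)/((a+b)(a+c)) - 1.

    A value of 1.0 means the succedent holds twice as often among rows
    satisfying the antecedent as it does in the whole data matrix.
    """
    if (a + b) == 0 or (a + c) == 0:
        raise ValueError("antecedent and succedent must each hold for some row")
    n = a + b + c + d
    return a * n / ((a + b) * (a + c)) - 1


# Invented toy data: hypothetical 'system' and 'label' columns.
rows = (
    [{"system": "Falcon", "label": "-"}] * 20
    + [{"system": "Falcon", "label": "+"}] * 20
    + [{"system": "other", "label": "-"}] * 30
    + [{"system": "other", "label": "+"}] * 130
)
table = four_fold_table(
    rows,
    antecedent=lambda r: r["system"] == "Falcon",
    succedent=lambda r: r["label"] == "-",
)
lift = average_difference(*table)
```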

04/18/23 for KEG seminar 42

Graph-based methods for Ontology Matching (first experience)

Ondřej Šváb

04/18/23 for KEG seminar 43

Agenda
Graph in Ontology Mapping
Graph Matching Problem
Similarity Flooding
Structural Method

04/18/23 for KEG seminar 44

Basic notation and terminology
Graph G = (V, E)
  V is the set of vertices (nodes)
  E is the set of edges (arcs)
Types of graphs
  Directed, undirected
  According to the information attached to nodes and edges: labelled graph, attributed graph
A tree is a connected graph without cycles
  Rooted tree, …

04/18/23 for KEG seminar 45

Ontology Mapping - formal definition

An ontology contains entities = {concepts, relations and instances}

Ontologies O1, O2 are considered as directed cyclic graphs with labelled edges and labelled nodes

An alignment A is the set of mapped pairs (a, b), where a ∈ N1 and b ∈ N2 (N1, N2 being the node sets of O1, O2).

04/18/23 for KEG seminar 46

Simplification – example1
Consider just the subclass/superclass relation, without multiple inheritance

Ontologies as rooted trees

04/18/23 for KEG seminar 47

Labels of concepts – example1

04/18/23 for KEG seminar 48

Suggested structure-based technique

Onto2Tree
ExactMatch -> initial mapping (s:Thing=t:Thing)
PropagateInitMappings – using the structures of the trees and the initial mappings to deduce new subsumption relations
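One way to read the last two steps is sketched below. It assumes each ontology has already been reduced to a tree stored as a child -> parent dictionary (the role of Onto2Tree), that ExactMatch compares labels case-insensitively, and that propagation derives "s is subsumed by u" whenever an ancestor-or-self of s is matched to a descendant-or-self of u. The function names and these rules are illustrative assumptions, not the author's implementation.

```python
def ancestors_or_self(node, parent):
    """Return node followed by its ancestors, nearest first."""
    chain = [node]
    while node in parent:
        node = parent[node]
        chain.append(node)
    return chain


def tree_nodes(parent):
    """All nodes mentioned in a child -> parent dictionary."""
    return set(parent) | set(parent.values())


def exact_match(parent1, parent2):
    """Initial mapping: pairs of nodes whose labels are equal ignoring case."""
    by_label = {n.lower(): n for n in tree_nodes(parent2)}
    return {
        (s, by_label[s.lower()])
        for s in tree_nodes(parent1)
        if s.lower() in by_label
    }


def propagate_init_mappings(parent1, parent2, initial):
    """Deduce subsumptions (s, u), read as 's is subsumed by u'."""
    matched = dict(initial)
    subsumptions = set()
    for s in tree_nodes(parent1):
        for anchor in ancestors_or_self(s, parent1):
            if anchor not in matched:
                continue
            for u in ancestors_or_self(matched[anchor], parent2):
                if (s, u) not in initial:
                    subsumptions.add((s, u))
    return subsumptions


# Toy trees; s: and t: prefixes from the slide are left implicit.
source = {"Person": "Thing", "Author": "Person"}
target = {"Person": "Thing", "Reviewer": "Person"}
initial = exact_match(source, target)
new_subsumptions = propagate_init_mappings(source, target, initial)
```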

04/18/23 for KEG seminar 49

New subsumptions – example1

04/18/23 for KEG seminar 50

Graph Matching Problem
Graphs are used in many fields (an effective way of representing objects)
Exact graph matching (isomorphism)
Inexact graph matching (homomorphism)
  One-to-one
  Many-to-many matching (even more difficult to solve, but preferable: more concrete results)
Complexity problem! – combinatorial nature of the graph matching problem

04/18/23 for KEG seminar 51

How to measure the similarity between nodes and arcs?

Isomorphism in graph
Graph edit distance measures
Similarity Flooding algorithm

04/18/23 for KEG seminar 52

How to measure the similarity between nodes and arcs?

Isomorphism in graph? Rather homomorphism
Tree simplification – efficient algorithms exist
  Begin with leaves
  Assign to each vertex the set of vertices which might be isomorphic to it
  According to the degree of the vertices and how many leaves or non-leaves they are adjacent to
  Make partitions of potentially isomorphic vertices
  Classes of equivalence
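The partitioning step can be illustrated with a small sketch. It only computes the first coarse partition (degree, number of adjacent leaves, number of adjacent non-leaves) and is not a complete tree isomorphism algorithm; the adjacency-dictionary layout and function names are assumptions.

```python
from collections import defaultdict


def signature_classes(adjacency):
    """Partition vertices of an undirected tree by a local signature.

    adjacency: dict vertex -> set of neighbouring vertices.
    Returns a dict mapping (degree, adjacent leaves, adjacent non-leaves)
    to the set of vertices sharing that signature.
    """
    leaves = {v for v, nbrs in adjacency.items() if len(nbrs) == 1}
    classes = defaultdict(set)
    for v, nbrs in adjacency.items():
        leaf_nbrs = sum(1 for n in nbrs if n in leaves)
        classes[(len(nbrs), leaf_nbrs, len(nbrs) - leaf_nbrs)].add(v)
    return dict(classes)


def candidate_pairs(adjacency1, adjacency2):
    """Pairs of vertices from two trees that fall into the same class."""
    classes1 = signature_classes(adjacency1)
    classes2 = signature_classes(adjacency2)
    return {
        (v, w)
        for sig, group in classes1.items()
        for v in group
        for w in classes2.get(sig, ())
    }


# Two three-node paths: only the middle nodes and the end nodes can match.
tree1 = {"a": {"b"}, "b": {"a", "c"}, "c": {"b"}}
tree2 = {"x": {"y"}, "y": {"x", "z"}, "z": {"y"}}
classes = signature_classes(tree1)
pairs = candidate_pairs(tree1, tree2)
```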

04/18/23 for KEG seminar 53

How to measure the similarity between nodes and arcs?

Graph edit distance measures
Tree simplification again
Compute the minimum cost to transform one tree into another using elementary operations, such as
  Substitution (replacing the label of a node)
  Insertion (of a node)
  Deletion (of a node)
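A compact way to compute such a distance for small trees is the memoised forest recursion below. It assumes ordered trees, unit cost for every operation, and trees encoded as nested `(label, children)` tuples; these are simplifying assumptions for illustration, and the exponential-looking recursion is only practical for small inputs.

```python
from functools import lru_cache

# A tree is (label, (child, child, ...)); a forest is a tuple of trees.


@lru_cache(maxsize=None)
def forest_distance(f1, f2):
    """Edit distance between two ordered forests with unit costs."""
    if not f1 and not f2:
        return 0
    if not f2:
        # Delete the rightmost root of f1; its children take its place.
        _, children = f1[-1]
        return forest_distance(f1[:-1] + children, ()) + 1
    if not f1:
        # Insert the rightmost root of f2.
        _, children = f2[-1]
        return forest_distance((), f2[:-1] + children) + 1
    (label1, kids1), (label2, kids2) = f1[-1], f2[-1]
    delete = forest_distance(f1[:-1] + kids1, f2) + 1
    insert = forest_distance(f1, f2[:-1] + kids2) + 1
    substitute = (
        forest_distance(kids1, kids2)
        + forest_distance(f1[:-1], f2[:-1])
        + (0 if label1 == label2 else 1)
    )
    return min(delete, insert, substitute)


def tree_edit_distance(t1, t2):
    """Minimum number of substitutions, insertions and deletions."""
    return forest_distance((t1,), (t2,))


# Toy class trees: one inserted node (Author) and one relabelled node.
t1 = ("Thing", (("Person", ()), ("Paper", ())))
t2 = ("Thing", (("Person", (("Author", ()),)), ("Article", ())))
distance = tree_edit_distance(t1, t2)
```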

04/18/23 for KEG seminar 54

Similarity Flooding algorithm

input: two structures (generally)
output: mappings between corresponding nodes

author: Sergey Melnik, 2001

04/18/23 for KEG seminar 55

Similarity Flooding

1. Models are converted into directed labeled graphs

Intuition behind: elements of two distinct models are similar when their adjacent elements are similar

2. The similarity of two elements is propagated (partly) to their respective neighbors (fixpoint computation)

3. Some filters are applied to the mappings -> results

04/18/23 for KEG seminar 56

Similarity Flooding

Algorithm
1. G1=Graph(S1); G2=Graph(S2)
2. initialMap = StringMatch(G1,G2)
3. product = SFJoin(G1,G2,initialMap)
4. result = selectThreshold(product)
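The propagation and filtering steps can be sketched in a few dozen lines. This is a simplified illustration, not Melnik's reference implementation: graphs are lists of `(source, label, target)` edges, propagation weights are inverse counts of same-label edges, the fixpoint uses the basic update normalize(σ + φ(σ)), and the initial similarities are supplied by hand instead of a string matcher. All function names are hypothetical.

```python
from collections import defaultdict


def sf_join(edges1, edges2, initial, iterations=50, eps=1e-4):
    """Propagate similarities over the pairwise connectivity graph."""
    # Node pairs (a, b) and (a2, b2) are connected when both graphs
    # contain an edge with the same label between the respective nodes.
    pcg = [
        ((a, b), label1, (a2, b2))
        for a, label1, a2 in edges1
        for b, label2, b2 in edges2
        if label1 == label2
    ]
    if not pcg:
        return {}
    out_count, in_count = defaultdict(int), defaultdict(int)
    for src, label, dst in pcg:
        out_count[(src, label)] += 1
        in_count[(dst, label)] += 1
    # Similarity flows along each edge in both directions.
    weights = defaultdict(float)
    for src, label, dst in pcg:
        weights[(src, dst)] += 1.0 / out_count[(src, label)]
        weights[(dst, src)] += 1.0 / in_count[(dst, label)]
    nodes = {n for src, _, dst in pcg for n in (src, dst)}
    sigma = {n: initial.get(n, 0.0) for n in nodes}
    for _ in range(iterations):
        nxt = dict(sigma)
        for (src, dst), w in weights.items():
            nxt[dst] += sigma[src] * w
        top = max(nxt.values()) or 1.0
        nxt = {n: v / top for n, v in nxt.items()}  # normalise to [0, 1]
        delta = sum((nxt[n] - sigma[n]) ** 2 for n in nodes) ** 0.5
        sigma = nxt
        if delta < eps:
            break
    return sigma


def select_threshold(sigma, threshold):
    """Keep the node pairs whose similarity reaches the threshold."""
    return {pair for pair, value in sigma.items() if value >= threshold}


# Toy models with one shared edge label; names are invented.
g1 = [("Paper", "hasAuthor", "Author")]
g2 = [("Article", "hasAuthor", "Writer")]
product = sf_join(g1, g2, {("Paper", "Article"): 1.0})
result = select_threshold(product, 0.5)
```

Even though only the pair (Paper, Article) starts with a non-zero similarity, the pair (Author, Writer) ends up in the result because it is adjacent to it in both graphs, which is the intuition stated on the previous slide.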

04/18/23 for KEG seminar 57

Example2 - crs_dr.owl, pcs.owl

crs_dr.owl

pcs.owl

04/18/23 for KEG seminar 58

Simple Structural method
Nodes in trees are represented with attributes derived from their place in the tree

Attributes for node s: Level(s), Length(s), Children(s), Max_children(s), Siblings(s), Max_siblings(s)

relLevel(s)

relChildren(s)

relSiblings(s)

04/18/23 for KEG seminar 59

Structural Method

StructureDistance(Ontology O1, Ontology O2)
  S=Level_order(O1); T=Level_order(O2) //trees
  S’=Attributes(S); T’=Attributes(T)
  for each s in S’
    for each t in T’
      distance = distance(s,t) //euclidean distance in three-dimensional space
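A runnable reading of this pseudocode is given below. The definitions of the three relative attributes are not reproduced in the transcript, so the sketch assumes relLevel = Level / tree depth, relChildren = Children / Max_children and relSiblings = Siblings / Max_siblings; these ratios, the parent -> children dictionary layout and the function names are assumptions.

```python
import math


def attributes(children, root):
    """Map each node to (relLevel, relChildren, relSiblings).

    children: dict node -> list of child nodes (leaves may be omitted).
    The three ratios are assumed definitions, see the note above.
    """
    level, siblings, order = {root: 0}, {root: 0}, [root]
    for node in order:  # level-order traversal; the list grows as we go
        kids = children.get(node, [])
        for kid in kids:
            level[kid] = level[node] + 1
            siblings[kid] = len(kids) - 1
            order.append(kid)
    depth = max(level.values()) or 1
    max_children = max(len(children.get(n, [])) for n in order) or 1
    max_siblings = max(siblings.values()) or 1
    return {
        n: (
            level[n] / depth,
            len(children.get(n, [])) / max_children,
            siblings[n] / max_siblings,
        )
        for n in order
    }


def structure_distance(children1, root1, children2, root2):
    """Euclidean distance in attribute space for every pair of nodes."""
    s_attrs = attributes(children1, root1)
    t_attrs = attributes(children2, root2)
    return {
        (s, t): math.dist(a, b)
        for s, a in s_attrs.items()
        for t, b in t_attrs.items()
    }


# Toy trees with invented class names.
tree1 = {"Thing": ["Person", "Paper"]}
tree2 = {"Thing": ["Human", "Article"]}
distances = structure_distance(tree1, "Thing", tree2, "Thing")
```

A distance of 0 only says that two nodes occupy structurally similar positions; it carries no lexical evidence, which is why the slide title calls the method simple.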

04/18/23 for KEG seminar 60

Example3 - crs_dr.owl, pcs.owl

crs_dr.owl

04/18/23 for KEG seminar 61

Example3 - crs_dr.owl, pcs.owl

pcs.owl

04/18/23 for KEG seminar 62

Example3 - crs_dr.owl, pcs.owl

Euclidean distance for some pairs in three-dimensional space

04/18/23 for KEG seminar 63

Example3 - crs_dr.owl, pcs.owl

Table columns: onto1, onto2, name1, name2, structure, sf2

04/18/23 for KEG seminar 64

Thank you for your attention!