Evaluating a Generalization of the Winkler Extension in the Context of Ontology Mapping
Maurice Hermans
Bachelor Conference 2
Ontologies
Ontology mapping
Research question
Similarities
◦ Compared similarities
Proposed extension
Evaluation
Results
Outline
22-6-2012
Provide a vocabulary of terms that describe a domain of interest
There are several ways in which ontologies can differ:
◦ Encoding
◦ Lexical
◦ Syntactic
◦ Semantic
◦ Semiotic
Ontologies
Knowledge systems used in the same domain can be built according to different specifications and requirements
This makes it very hard to exchange data between multiple knowledge systems which do not use the same ontology
Ontology mapping frameworks provide knowledge systems with the capacity to exchange information with other knowledge systems which use different ontologies.
Ontology mapping
To what extent can string similarities, applied to concept names, be improved such that they are better suited for ontology mapping?
Research Question
Levenshtein
◦ Uses the number of edit operations required to convert one string into another
Jaro
◦ Uses the number of matching characters between two strings and their relative positions
Jaccard
◦ Compares the sets of tokens of two strings
SoftTFIDF
◦ Includes tokens which are similar according to a secondary similarity function
String Similarities
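As an illustration of the first and third similarity above, the following is a minimal Python sketch, not the implementation used in this work. The function names, the normalisation of the edit distance by the longer string length, and the whitespace tokenisation for Jaccard are assumptions made for the example. SoftTFIDF is left out because it needs corpus-wide token statistics.

```python
def levenshtein(s: str, t: str) -> int:
    """Minimum number of single-character insertions, deletions and substitutions."""
    prev = list(range(len(t) + 1))  # distances from the empty prefix of s
    for i, cs in enumerate(s, start=1):
        curr = [i]
        for j, ct in enumerate(t, start=1):
            cost = 0 if cs == ct else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def levenshtein_similarity(s: str, t: str) -> float:
    """Edit distance rescaled to [0, 1] (assumed normalisation: longer length)."""
    longest = max(len(s), len(t))
    return 1.0 if longest == 0 else 1.0 - levenshtein(s, t) / longest


def jaccard(s: str, t: str) -> float:
    """Overlap of the token sets of s and t (assumed: lower-cased, split on whitespace)."""
    a, b = set(s.lower().split()), set(t.lower().split())
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)
```

For example, `levenshtein("kitten", "sitting")` needs three edits, and `jaccard("conference paper", "paper review")` shares one token out of three.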
Uses the length of the longest common prefix of s and t to assign more favourable ratings
Most commonly used with the Jaro similarity
◦ Where: Sim is the basis similarity and P’ the length of the common prefix bounded at 4
Winkler extension
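A commonly cited form of the Winkler extension is Sim_w(s, t) = Sim(s, t) + P' * p * (1 - Sim(s, t)), where p is a scaling factor. The sketch below applies it to a straightforward Jaro implementation. The value p = 0.1 and the function names are assumptions taken from common usage, not from this presentation.

```python
def jaro(s: str, t: str) -> float:
    """Jaro similarity: matching characters within a window, penalised for transpositions."""
    if s == t:
        return 1.0
    if not s or not t:
        return 0.0
    window = max(max(len(s), len(t)) // 2 - 1, 0)
    s_matched = [False] * len(s)
    t_matched = [False] * len(t)
    matches = 0
    for i, cs in enumerate(s):
        lo, hi = max(0, i - window), min(len(t), i + window + 1)
        for j in range(lo, hi):
            if not t_matched[j] and t[j] == cs:
                s_matched[i] = t_matched[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    # Matched characters in order of appearance; mismatched pairs are half-transpositions.
    s_seq = [c for c, m in zip(s, s_matched) if m]
    t_seq = [c for c, m in zip(t, t_matched) if m]
    transpositions = sum(a != b for a, b in zip(s_seq, t_seq)) // 2
    return (matches / len(s) + matches / len(t)
            + (matches - transpositions) / matches) / 3


def winkler(sim: float, s: str, t: str, scaling: float = 0.1, max_prefix: int = 4) -> float:
    """Add a bonus for the common prefix of s and t, bounded at max_prefix characters."""
    prefix = 0
    for cs, ct in zip(s, t):
        if cs != ct or prefix == max_prefix:
            break
        prefix += 1
    return sim + prefix * scaling * (1.0 - sim)
```

With these assumptions, `jaro("MARTHA", "MARHTA")` is about 0.944 and the prefix "MAR" raises it to about 0.961.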
Uses the length of the longest common substring of s and t to assign more favourable ratings
◦ Where: Sim is the basis similarity, LCS the length of the longest common substring and S the scaling for the bonus
Proposed extension
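The length of the longest common substring can be computed with a standard dynamic programme, sketched below. The bonus function that follows it is purely hypothetical: it mirrors the shape of the Winkler bonus, normalises LCS by the shorter string length, and assumes S lies in [0, 1]. It should not be read as the formula proposed in this work.

```python
def longest_common_substring_length(s: str, t: str) -> int:
    """Length of the longest contiguous run of characters shared by s and t."""
    best = 0
    prev = [0] * (len(t) + 1)  # run lengths ending at the previous character of s
    for cs in s:
        curr = [0]
        for j, ct in enumerate(t, start=1):
            length = prev[j - 1] + 1 if cs == ct else 0
            curr.append(length)
            best = max(best, length)
        prev = curr
    return best


def substring_bonus(sim: float, s: str, t: str, scaling: float) -> float:
    """HYPOTHETICAL bonus, by analogy with Winkler; not the formula from the slides."""
    shortest = min(len(s), len(t))
    if shortest == 0:
        return sim
    lcs = longest_common_substring_length(s, t)
    # Assumed form: the share of the shorter string covered by the common substring
    # closes a proportional part of the gap between sim and 1.
    return sim + scaling * (lcs / shortest) * (1.0 - sim)
```

Unlike the prefix bonus, this rewards a shared run anywhere in the two names, for instance "ontolog" in "ontology" and "ontologies".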
Two partial ontologies from the OAEI dataset
Example
Two datasets are used:
◦ 2010 Ontology Alignment Evaluation Initiative
◦ Dataset created by Cohen et al. 2000
Similarities are evaluated using precision and recall values
Evaluation
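Precision and recall for a set of proposed correspondences can be computed against a reference alignment as in the sketch below. Representing a correspondence as a pair of concept names, and the example pairs themselves, are assumptions made for illustration only.

```python
def precision_recall(found: set, reference: set) -> tuple:
    """Return (precision, recall) of the found correspondences against the reference."""
    correct = len(found & reference)
    precision = correct / len(found) if found else 0.0
    recall = correct / len(reference) if reference else 0.0
    return precision, recall


# Hypothetical correspondences, each a (concept in ontology 1, concept in ontology 2) pair.
reference = {("Paper", "Article"), ("Author", "Writer"),
             ("Conference", "Meeting"), ("Topic", "Subject")}
found = {("Paper", "Article"), ("Author", "Writer"), ("Review", "Chair")}

precision, recall = precision_recall(found, reference)
```

Here two of the three proposed pairs are correct, and two of the four reference pairs are found.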
[Figure: results for the OAEI and Cohen datasets]
Optimal weight for both datasets is around 0.8
Results
Discussion
Conclusion