ontology matching

Ícaro Medeiros, Jaumir Valença da Silveira, Franklin Amorim, Pedro Henrique

Upload: icaro-medeiros

Post on 24-May-2015


Page 1: Ontology matching

Ícaro Medeiros, Jaumir Valença da Silveira, Franklin Amorim, Pedro Henrique

Ontology matching

Page 2: Ontology matching

Outline

● Context
● Definitions
● Classifications of Ontology Matching Techniques
● Basic Techniques
● Matching Strategies

Page 3: Ontology matching

Bibliography

[1] Jerome Euzenat and Pavel Shvaiko. 2010. Ontology Matching (1st ed.). Springer Publishing Company, Incorporated.
[2] Namyoun Choi, Il-Yeol Song, and Hyoil Han. 2006. A survey on ontology mapping. SIGMOD Rec. 35, 3 (September 2006), 34-41.
[3] Yannis Kalfoglou and Marco Schorlemmer. 2003. Ontology mapping: the state of the art. Knowl. Eng. Rev. 18, 1 (January 2003), 1-31.
[4] N. Noy. 2005. Ontology Mapping and Alignment. Search, p.1-34. Available at: http://www.aifb.uni-karlsruhe.de/WBS/meh/foam/.
[5] M. A. Casanova. 2012. Tecnologias de Banco de Dados para a Web Semântica - Módulo 9a - Ontologias - Matching.

Page 5: Ontology matching

Context

● We have to deal with heterogeneity
  ○ Different models are based on different domains of knowledge and use different tools, at different levels of detail
● The distributed nature of ontology development has led to different ontologies in the same or overlapping domains

Page 6: Ontology matching

The need for ontology matching

● Creating global ontologies from local ontologies
● Reusing information between ontologies
● Dealing with heterogeneity
● Queries across multiple distributed resources
● Data transformation

Page 8: Ontology matching
Page 9: Ontology matching

What is ontology matching?

It is the process of finding relationships or correspondences between entities of different ontologies.

Entities can be classes, instances, properties or formulas.

Page 10: Ontology matching

Other terms used

Page 11: Ontology matching

The matching process

● Inputs: ontologies o and o', an input alignment A, parameters, resources
● Output: alignment A'
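
The process above can be read as a function from two ontologies, an input alignment, parameters and resources to a new alignment. The following is only a minimal sketch under strong assumptions: an ontology is reduced to a set of entity labels, the only technique applied is case-insensitive label equality, and the names `Correspondence` and `match` are illustrative rather than taken from the slides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Correspondence:
    """One element of an alignment (illustrative structure)."""
    entity1: str
    entity2: str
    relation: str      # e.g. "=" for equivalence
    confidence: float  # in [0, 1]


def match(o1, o2, alignment=frozenset(), parameters=None, resources=None):
    """Return alignment A' for ontologies o1 and o2, extending the input alignment A.

    Assumption: an ontology is just a set of entity labels. `parameters` and
    `resources` are accepted to mirror the process but are unused in this sketch.
    """
    result = set(alignment)
    for e1 in o1:
        for e2 in o2:
            # Placeholder technique: case-insensitive label equality.
            if e1.lower() == e2.lower():
                result.add(Correspondence(e1, e2, "=", 1.0))
    return frozenset(result)


a_prime = match({"Book", "Person"}, {"book", "Author"})
```

Real matchers replace the equality test with the techniques described in the rest of the deck.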

Page 12: Ontology matching

Ontology matching example

Page 14: Ontology matching

Classifying ontology matching by use

● Matching local ontologies to global ontologies
● Matching ontologies of complementary domains
● Merging two ontologies of the same domain

Page 15: Ontology matching
Page 16: Ontology matching

Synthetic Classifications

● Granularity/Input Interpretation Layer
  ○ e.g. element- or structure-level
● Kind of Input Layer
  ○ Classification based on the kind of input used by elementary matching techniques
● Basic Techniques Layer
  ○ Classification based on how input information is interpreted

Page 17: Ontology matching

Granularity/Input Interpretation Layer

● Element-level matching techniques
  ○ Analyse entities or instances in isolation
  ○ Ignore their relations with other entities or their instances
● Structure-level techniques
  ○ Analyse how entities or their instances appear together in a structure (e.g. by representing ontologies as a graph)

Page 18: Ontology matching

Granularity/Input Interpretation Layer

● Syntactic techniques
  ○ Interpret the input with regard to its sole structure
● External techniques
  ○ Use external resources of a domain and common knowledge
● Semantic techniques
  ○ Interpret the input by using model-theoretic semantics

Page 19: Ontology matching
Page 20: Ontology matching

Kind of Input Layer

● Terminological
  ○ Strings found in the ontology descriptions
● Structural
  ○ Structures found in the ontology descriptions
● Semantic
  ○ Requires some semantic interpretation of the ontology
● Extensional
  ○ Uses data instances
● In some papers, semantic = logic and extensional = semantic

Page 21: Ontology matching

Kind of Input Layer (Second level)

● Terminological
  ○ String-based: terms as sequences of characters
  ○ Linguistic: interpretation of the terms as linguistic objects
● Structural
  ○ Internal: consider the internal structure of entities
  ○ Relational: consider the relation of entities with other entities

Page 22: Ontology matching
Page 23: Ontology matching

Basic Techniques Layer

● A label can be interpreted as
  ○ A string (a sequence of letters)
  ○ A word or a phrase in some natural language
● A hierarchy can be considered as
  ○ A graph
  ○ A taxonomy

Page 24: Ontology matching

Basic Techniques Layer

Element-level

● String-based
● Language-based
● Based on linguistic resources
● Constraint-based
● Alignment reuse
● Based on upper level and domain specific formal ontologies

Page 25: Ontology matching

Basic Techniques Layer

Structure-level

● Graph-based
● Taxonomy-based

Page 26: Ontology matching

Element-level Techniques

● String-based techniques
  ○ The more similar the strings, the more likely they are to denote the same concepts
  ○ Distance functions map a pair of strings to a real number
● Language-based techniques
  ○ Based on natural language processing techniques exploiting morphological properties of the input words

Page 27: Ontology matching

Element-level Techniques

● Constraint-based techniques
  ○ Deal with the internal constraints applied to the definitions of entities, such as types, cardinality of attributes, etc.
● Linguistic resources
  ○ Lexicons or domain specific thesauri, used to match words based on linguistic relations between them, such as synonyms, hyponyms, etc.

Page 28: Ontology matching

Element-level Techniques

● Alignment reuse
  ○ Record the alignments of previously matched ontologies so that they can be reused
● Upper level and domain specific ontologies
  ○ Used as external sources of common knowledge

Page 29: Ontology matching

Structure-level Techniques

● Graph-based techniques
  ○ Treat input ontologies as labelled graphs
  ○ If two nodes from two ontologies are similar, their neighbours may also be somehow similar
● Taxonomy-based techniques
  ○ is-a links connect terms that are already similar, therefore their neighbours may also be somehow similar

Page 31: Ontology matching

Basic Techniques

● Examples of metrics: Similarity and Distance
● Name-based techniques
● Structure-based techniques
● Extensional techniques
● Semantic-based techniques

Page 32: Ontology matching

Basic Techniques

Similarity: Function from a pair of entities to a real number

Page 33: Ontology matching

Name-based Techniques

● They can be applied to the name, the label or the comments of entities in order to find those which are similar

● They can be used for comparing class names and/or URIs

Page 34: Ontology matching

String-based methods

● Based on string similarity only

● Useful if conceptual schemas (or ontologies) use very similar strings to denote the same concepts

● Yield a low similarity if schemas use synonyms with different spellings

● Yield many false positives if pairs of strings with low similarity are selected

Page 35: Ontology matching

String-based methods

String distance functions:

Page 36: Ontology matching

String-based methods

Levenshtein (edit) distance

● Measures the dissimilarity between two strings as the minimum number of insertions, deletions, and substitutions of characters required to transform one string into the other

● Example: distance(“Gaming”, “Games”) = 2 substitutions [“i” by “e” and “n” by “s”] + 1 deletion [“g”] = 3
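
The edit distance can be computed with the classic dynamic-programming recurrence. The sketch below is one common way to write it (keeping only two rows of the table); the function name `levenshtein` is illustrative.

```python
def levenshtein(s: str, t: str) -> int:
    """Minimum number of insertions, deletions and substitutions turning s into t."""
    previous = list(range(len(t) + 1))  # distances from "" to each prefix of t
    for i, cs in enumerate(s, start=1):
        current = [i]  # distance from s[:i] to ""
        for j, ct in enumerate(t, start=1):
            cost = 0 if cs == ct else 1
            current.append(min(previous[j] + 1,          # delete cs
                               current[j - 1] + 1,       # insert ct
                               previous[j - 1] + cost))  # substitute or keep
        previous = current
    return previous[-1]


print(levenshtein("Gaming", "Games"))  # 3
```

To use it as a similarity in [0, 1], a common choice is to divide by the length of the longer string and subtract the result from 1.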

Page 37: Ontology matching

String-based methods

Token-based distance

● Usually applied to the complete description of a concept
● Treats strings as a bag of words (multisets of substrings)
● May split strings into independent tokens
● Example: "InProceedings" is represented by
  ○ the bag of words {In, Proceedings}
  ○ or a bag of substrings of length 3 {InP, roc, eed, ing, s}
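
Both tokenizations of "InProceedings" can be reproduced in a few lines. This is a sketch, assuming labels are written in camelCase and that the length-3 substrings are consecutive and non-overlapping, as in the example; `split_words` and `chunks` are illustrative names.

```python
import re


def split_words(label: str) -> list[str]:
    """Split a camelCase label into words, e.g. for a bag of words."""
    return re.findall(r"[A-Z]?[a-z]+|[A-Z]+(?![a-z])|\d+", label)


def chunks(label: str, n: int = 3) -> list[str]:
    """Cut a label into consecutive, non-overlapping substrings of length n."""
    return [label[i:i + n] for i in range(0, len(label), n)]


print(split_words("InProceedings"))  # ['In', 'Proceedings']
print(chunks("InProceedings"))       # ['InP', 'roc', 'eed', 'ing', 's']
```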

Page 38: Ontology matching

String-based methods

Bag of words represented as a vector
● Each dimension corresponds to a token
● Each position of the vector is the number of occurrences of the token

Page 39: Ontology matching

Cosine Similarity

V = {"Ontology", "Mapping"}

"Ontology mapping" = (1, 1)

"Mapping, ontology mapping" = (1, 2)

[Figure: the two vectors plotted on the axes "Ontology" and "Mapping"]
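
The cosine similarity of the two vectors in this example is the dot product divided by the product of their norms, i.e. 3 / (√2 · √5) ≈ 0.95. A minimal sketch, assuming tokens are lowercase alphabetic words; `bag_of_words` and `cosine_similarity` are illustrative names.

```python
import math
import re
from collections import Counter


def bag_of_words(text: str) -> Counter:
    """Count lowercase word tokens in a string."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Dot product of the two count vectors divided by the product of their norms."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # convention chosen here for empty strings
    return dot / (norm_a * norm_b)


u = bag_of_words("Ontology mapping")           # ontology: 1, mapping: 1
v = bag_of_words("Mapping, ontology mapping")  # ontology: 1, mapping: 2
similarity = cosine_similarity(u, v)           # 3 / sqrt(10)
```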

Page 40: Ontology matching

Language-based methods

Intrinsic methods

● reduce each term to a normal form to facilitate matching
● use traditional natural language processing techniques
  ○ stopword elimination
  ○ tokenization: segment strings into sequences of tokens
  ○ lemmatization: reduce words to normal forms
    ■ suppress tense, gender and number
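
The intrinsic steps can be chained into a small normalization function. The sketch below is deliberately crude and rests on assumptions: the stopword list is a tiny illustrative one, and `naive_lemma` only strips a plural "s" as a stand-in for real lemmatization, which would normally rely on an NLP library.

```python
import re

# Assumption: a tiny illustrative stopword list, not a standard one.
STOPWORDS = {"a", "an", "the", "of", "in", "on", "for"}


def naive_lemma(token: str) -> str:
    """Very rough stand-in for lemmatization: strips a plural 's' only."""
    if len(token) > 3 and token.endswith("s") and not token.endswith("ss"):
        return token[:-1]
    return token


def normalize(term: str) -> list[str]:
    """Tokenize, lowercase, drop stopwords, and reduce tokens to a crude normal form."""
    tokens = [t.lower() for t in re.findall(r"[A-Za-z]+", term)]
    return [naive_lemma(t) for t in tokens if t not in STOPWORDS]


print(normalize("Theory papers"))     # ['theory', 'paper']
print(normalize("the theory_paper"))  # ['theory', 'paper']
```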

Page 41: Ontology matching

Language-based methods

Example – Variants of the term “theory paper”

Page 42: Ontology matching

Language-based methods

Extrinsic methods

Use dictionaries, lexicons and terminologies to help match terms from different schemas or ontologies

● e.g. a terminology: a thesaurus which very often contains phrases rather than single words
● deal with synonyms
● word sense disambiguation

Page 43: Ontology matching

Language-based methods

● WordNet: an example of an external resource
  ○ an electronic lexical database for English
  ○ based on the notion of synsets (sets of synonyms)
  ○ a synset denotes a concept or a sense of a group of terms
● WordNet also provides:
  ○ a hypernym structure (superconcept / subconcept)
  ○ a meronym relation (part of)
  ○ textual descriptions of the concepts (glossary)

Page 44: Ontology matching

Language-based methods

● Example: WordNet 2.0 entry for the word "author"

author1 noun: Someone who originates or causes or initiates something; Example ‘he was the generator of several complaints’. Synonym generator, source. Hypernym maker. Hyponym coiner.

author2 noun: Writes (books or stories or articles or the like) professionally (for pay). Synonym writer2. Hypernym communicator. Hyponym abstractor, alliterator, authoress, biographer, coauthor, commentator, contributor, cyberpunk, drafter, dramatist, encyclopedist, essayist, folk writer, framer, gagman, ghostwriter, Gothic romancer, hack, journalist, librettist, lyricist, novelist, pamphleteer, paragrapher, poet, polemist, rhymer, scriptwriter, space writer, speechwriter, tragedian, wordmonger, word-painter, wordsmith, Andersen, Asimov...

author3 verb: Be the author of; Example ‘She authored this play’. Hypernym write. Hyponym co-author, ghost.

Page 45: Ontology matching

Language-based methods

● Example: fragment of the WordNet hierarchy (limited to nouns) for “illustrator”, “author”, “creator”, “person”, “writer”

$\Sigma(\text{“author”}) = \{A1, A2W2\}$

$\Sigma(\text{“writer”}) = \{W1, A2W2, W3\}$

Page 46: Ontology matching

Language-based methods

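● Example: Synonymy similarity

$\sigma(s,t) = 1$ iff $\Sigma(s) \cap \Sigma(t) \neq \emptyset$ (the terms have a synset in common)

$\sigma(s,t) = 0$ otherwise

$\Sigma(\text{“author”}) = \{A1, A2W2\}$, $\Sigma(\text{“writer”}) = \{W1, A2W2, W3\}$

$\Sigma(\text{“author”}) \cap \Sigma(\text{“writer”}) \neq \emptyset$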

Page 47: Ontology matching

Language-based methods

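● Example: Co-synonymy similarity

$\sigma'(s,t) = \dfrac{|\Sigma(s) \cap \Sigma(t)|}{|\Sigma(s) \cup \Sigma(t)|}$

$\Sigma(\text{“author”}) = \{A1, A2W2\}$

$\Sigma(\text{“writer”}) = \{W1, A2W2, W3\}$

$|\Sigma(\text{“author”}) \cap \Sigma(\text{“writer”})| = 1$

$|\Sigma(\text{“author”}) \cup \Sigma(\text{“writer”})| = 4$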
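
Both measures reduce to set operations on the synsets of each term. A minimal sketch using the synsets from the example as a hard-coded dictionary (a real matcher would look them up in WordNet); the function names are illustrative.

```python
# Synsets taken from the example; "A2W2" is the synset shared by both terms.
SYNSETS = {
    "author": {"A1", "A2W2"},
    "writer": {"W1", "A2W2", "W3"},
}


def synonymy_similarity(s: str, t: str) -> int:
    """1 if the two terms share at least one synset, 0 otherwise."""
    return 1 if SYNSETS[s] & SYNSETS[t] else 0


def cosynonymy_similarity(s: str, t: str) -> float:
    """Number of shared synsets divided by the number of synsets of either term."""
    union = SYNSETS[s] | SYNSETS[t]
    return len(SYNSETS[s] & SYNSETS[t]) / len(union) if union else 0.0


print(synonymy_similarity("author", "writer"))    # 1
print(cosynonymy_similarity("author", "writer"))  # 0.25
```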

Page 48: Ontology matching

Structure-based techniques

Internal structure (constraint-based approaches)

● based on the internal structure of classes

● calculate the similarity between two classes based on
  ○ the set of their properties, including keys
  ○ the range of their properties (attributes and relations)
  ○ the cardinality of their properties
  ○ the transitivity or symmetry of their properties

Page 49: Ontology matching

Structure-based techniques

Internal structure (constraint-based approaches)

Page 50: Ontology matching

Structure-based techniques

Internal structure (constraint-based approaches)

● positive point:
  ○ can be used to eliminate incompatible matches
● negative points:
  ○ does not provide much information about the classes to compare
  ○ different classes may have properties with the same datatypes
  ○ different models of a concept use different, and incompatible, types
● approach suggested:
  ○ use the method in combination with other methods

Page 51: Ontology matching

Structure-based techniques

Relational Structure

● similarity between two concepts
● based on the relations of the concepts with other concepts
  ○ similar concepts should have similar related concepts
● given a relation r, a pair of concepts may be:
  ○ directly related through r
  ○ inversely related through r
  ○ transitively related through r
  ○ related through the maximal elements of r⁺

Page 52: Ontology matching

Structure-based techniques

Example

subclass(Book) = {Science, Pocket, Children}

subclass⁻¹(Book) = {Product}

subclass⁺(Book) = {Science, Pocket, Textbook, Popular, Children}

subclass↑(Book) = {Textbook, Popular, Pocket, Children}
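
The four sets above can be derived from a single table of direct subclass links. In the sketch below the links are an assumption, reconstructed so that they reproduce the sets listed in the example (the original figure is not available); the function names are illustrative.

```python
# Assumption: edges reconstructed from the sets listed in the example.
SUBCLASSES = {
    "Product": {"Book"},
    "Book": {"Science", "Pocket", "Children"},
    "Science": {"Textbook", "Popular"},
}


def direct(c):
    """subclass(c): direct subclasses."""
    return set(SUBCLASSES.get(c, set()))


def inverse(c):
    """Inverse of subclass: direct superclasses of c."""
    return {parent for parent, children in SUBCLASSES.items() if c in children}


def transitive(c):
    """Transitive closure of subclass: all descendants of c."""
    found, stack = set(), list(direct(c))
    while stack:
        node = stack.pop()
        if node not in found:
            found.add(node)
            stack.extend(direct(node))
    return found


def maximal(c):
    """Maximal elements: descendants of c that have no subclasses of their own."""
    return {d for d in transitive(c) if not direct(d)}
```

Two concepts from different ontologies can then be compared by measuring how similar these related sets are.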

Page 53: Ontology matching

Structure-based techniques

Taxonomic Structure

● Similarity between two concepts
  ○ Based on the graph of the subClassOf relation
  ○ Example
    ■ $\delta(e,e')$ = number of edges of the taxonomy between e and e', normalized by dividing by the longest path
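
The edge-counting measure can be sketched with a breadth-first search. Two assumptions are made here: the taxonomy is a small illustrative one given as (subclass, superclass) pairs, and "the longest path" is taken to be the largest shortest-path length between any two concepts, which is one possible reading of the normalization.

```python
from collections import deque

# Assumption: a small illustrative taxonomy given as (subclass, superclass) edges.
EDGES = [("Book", "Product"), ("Science", "Book"), ("Pocket", "Book"),
         ("Children", "Book"), ("Textbook", "Science"), ("Popular", "Science")]


def neighbours():
    """Undirected adjacency sets built from the subclass edges."""
    graph = {}
    for child, parent in EDGES:
        graph.setdefault(child, set()).add(parent)
        graph.setdefault(parent, set()).add(child)
    return graph


def edge_counts(graph, start):
    """Breadth-first search: number of edges from start to every node."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in graph[node]:
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return dist


def taxonomic_distance(e1, e2):
    """Edges between e1 and e2, divided by the longest shortest path in the taxonomy."""
    graph = neighbours()
    longest = max(max(edge_counts(graph, n).values()) for n in graph)
    return edge_counts(graph, e1)[e2] / longest
```

With this taxonomy, `taxonomic_distance("Textbook", "Pocket")` is 1.0 because the three edges between them form one of the longest paths.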

Page 54: Ontology matching

Structure-based techniques

Bounded path matchers

● use anchors relating paths from two distinct taxonomies

● take two paths with links from two distinct taxonomies
● compare terms and their positions along these paths
● identify similar terms

Page 55: Ontology matching

Structure-based techniques

Example

The anchors “Book -> Volume” and “Popular -> Autobiography” imply that possibly “Science -> Biography” or “Science -> Essay”

Page 56: Ontology matching

Structure-based techniques

Summary of relational structure methods

● Powerful methods to match conceptual schemas and ontologies

○ Allow relations between concepts to be taken into account

● Often used in combination with internal structural and terminological methods

Page 57: Ontology matching

Extensional techniques

When two ontologies share the same set of individuals, matching is highly facilitated.

Page 58: Ontology matching

Extensional techniques

● Jaccard similarity: given two sets A and B, let P(X) be the probability that a random instance belongs to the set X. Then

$\sigma(A, B) = \dfrac{P(A \cap B)}{P(A \cup B)}$

● Note that the Jaccard similarity reaches 1 when A = B and 0 when A and B are disjoint.
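
When the instances of both classes are available as sets, the probabilities can be estimated by counting. A minimal sketch, assuming instances are identified by comparable keys such as URIs; the instance identifiers are hypothetical, and returning 0 for two empty classes is a convention chosen here.

```python
def jaccard_similarity(a: set, b: set) -> float:
    """|A ∩ B| / |A ∪ B|, estimating P(A ∩ B) / P(A ∪ B) from observed instances."""
    union = a | b
    if not union:
        return 0.0  # convention chosen here for two empty classes
    return len(a & b) / len(union)


# Hypothetical instance sets of two classes from different ontologies.
books = {"isbn:1", "isbn:2", "isbn:3"}
volumes = {"isbn:2", "isbn:3", "isbn:4"}
print(jaccard_similarity(books, volumes))  # 0.5
```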

Page 59: Ontology matching

Semantic-based techniques

● Semantic-based techniques rely on the axioms of the ontologies and on deductive methods.

● For an inductive task like ontology matching, however, they do not perform well alone, so a preprocessing step is needed.

● This step must first overcome the lack of a common ground between the ontologies.

● For these reasons, authors propose the use of semantic techniques in two steps: the so-called anchoring step and the deriving relations step.

Page 60: Ontology matching

Semantic-based techniques

● Anchoring: matching ontologies o' and o'' to the background ontology o. This can be done using any method described so far.

● Deriving relations: the (indirect) matching of ontologies o' and o'' using the correspondences discovered during the anchoring step.

● Example:
  ○ Micro-company: has at most 5 employees.
  ○ SME: has at most 10 associates.
  ○ Anchoring: employees ---> EMPLOYEE <--- associates; Micro-company ---> FIRM <--- SME
  ○ Deriving relations: Micro-company is a subclass of SME.
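
The example can be mimicked with a toy check: once both properties are anchored to the same background concept, "at most 5" implies "at most 10". This is only a sketch under heavy assumptions (one "at most N" restriction per class, anchors hard-coded in a dictionary); a real system would hand the anchored axioms to a reasoner.

```python
# Assumption: each class is described by one "at most N" restriction on one property.
O1 = {"Micro-company": ("employees", 5)}
O2 = {"SME": ("associates", 10)}

# Anchoring step: terms of both ontologies mapped to the background ontology.
ANCHORS = {"employees": "EMPLOYEE", "associates": "EMPLOYEE",
           "Micro-company": "FIRM", "SME": "FIRM"}


def is_subclass(c1, restriction1, c2, restriction2):
    """Derive c1 ⊑ c2 when both anchor to the same concept and c1's bound is tighter."""
    prop1, max1 = restriction1
    prop2, max2 = restriction2
    same_concept = ANCHORS[c1] == ANCHORS[c2]
    same_property = ANCHORS[prop1] == ANCHORS[prop2]
    return same_concept and same_property and max1 <= max2


derived = is_subclass("Micro-company", O1["Micro-company"], "SME", O2["SME"])
```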

Page 62: Ontology matching

Matching strategies - Global Methods

● Aggregating the results of the basic methods
● Developing a strategy for computing these similarities
● Learning from data the best method and the best parameters for matching
● Using probabilistic methods to combine matchers or to derive missing correspondences
● Involving users in the loop
● Extracting the alignments from the resulting (dis)similarity

Page 63: Ontology matching

Matcher composition

● Sequential composition of matchers

Page 64: Ontology matching

Matcher composition

● Using matrices to represent a similarity or distance measure between the entities to be matched

Page 65: Ontology matching

● Parallel composition of matchers

Matcher composition

Page 66: Ontology matching

Similarity aggregation

Compound similarity is concerned with the aggregation of heterogeneous similarities

○ e.g. a single similarity measure for two classes composed of the similarity of their names, the similarity of their superclasses, the similarity of their instances and that of their properties
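
One simple way to aggregate heterogeneous similarities is a weighted average of the matrices produced by several matchers, followed by a threshold to extract the alignment. The sketch below assumes this particular aggregation; the matcher outputs, weights and threshold are hypothetical values chosen for illustration.

```python
def aggregate(matrices, weights):
    """Weighted average of similarity matrices given as {(e1, e2): similarity} dicts."""
    total = sum(weights)
    pairs = set().union(*(m.keys() for m in matrices))
    return {pair: sum(w * m.get(pair, 0.0) for m, w in zip(matrices, weights)) / total
            for pair in pairs}


def extract_alignment(similarity, threshold):
    """Keep the pairs whose aggregated similarity reaches the threshold."""
    return {pair for pair, value in similarity.items() if value >= threshold}


# Hypothetical outputs of two basic matchers run in parallel.
name_sim = {("Book", "Volume"): 0.2, ("Person", "Human"): 0.1, ("Author", "Writer"): 0.3}
structure_sim = {("Book", "Volume"): 0.9, ("Person", "Human"): 0.8, ("Author", "Writer"): 0.2}

combined = aggregate([name_sim, structure_sim], weights=[1, 3])
alignment = extract_alignment(combined, threshold=0.6)
```

Here the structural matcher is weighted three times as much as the name matcher, so ("Book", "Volume") and ("Person", "Human") are kept even though their names differ.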