Transcript
Page 1: Ontology Alignment/Matching Prafulla Palwe. Agenda ► Introduction  Being serious about the semantic web  Living with heterogeneity  Heterogeneity problem

Ontology Alignment/Matching

Prafulla Palwe

Page 2:

Agenda
► Introduction
    Being serious about the semantic web
    Living with heterogeneity
    Heterogeneity problem
    I have a plan for you

► Matching Problem
    Matching Operation
    Motivation
    Schema Matching vs. Ontology Matching
    Correspondence
    Alignment

► Matching Process
    Sequential composition
    Parallel composition

► Application Domains
    Traditional
    Emergent

► Classification
    Matching Dimensions

► Basic Techniques
    Element Level
    Structure Level

► Summary and Challenges

Page 3:

Introduction
► Being serious about the semantic web

    It is not one guy's ontology
    It is not several guys' common ontology
    It is many guys and girls' many ontologies
    So it is a mess, but a meaningful mess

Page 4:

Introduction

► Living with heterogeneity
    The semantic web will be:
      ► Huge
      ► Dynamic
      ► Heterogeneous

These are not bugs, they are features. We must learn to live with them.

Page 5:

Introduction

► Heterogeneity problem
    Resources expressed in different ways must be reconciled before being used. Mismatches between formalized knowledge can occur when:
      ► different languages are used;
      ► different terminologies are used;
      ► different modeling is used.

Page 6:

Introduction

► I have a plan for you: Reconciliation

Page 7:

Matching Problem

► Matching Operation
    Definition: The matching operation takes as input ontologies, each consisting of a set of discrete entities (e.g., tables, XML elements, classes, properties), and determines as output the relationships (e.g., equivalence, subsumption) holding between these entities.

Page 8:

Matching Problem

► Motivation
    2 XML Schemas
    2 Ontologies

Page 9:

Matching Problem

Page 10:

Matching Problem

Page 11:

Matching Problem

Page 12:

Matching Problem

Page 13:

Matching Problem

Page 14:

Matching Problem

Page 15:

Matching Problem

Page 16:

Matching Problem

Page 17:

Matching Problem
► Schema mapping vs. ontology mapping

    Differences:
      ► Schemas often do not provide explicit semantics for their data
          Relational schemas provide no generalization
      ► Ontologies are logical systems that constrain the meaning
          An ontology is defined as a set of logical axioms

    Commonalities:
      ► Schemas and ontologies provide a vocabulary of terms that describes the domain of interest
      ► Schemas and ontologies constrain the meaning of terms used in the vocabulary.

Page 18:

Matching Problem

► Correspondence
    Definition: Given two ontologies O and O', a correspondence M between O and O' is a 5-tuple <id, e, e', R, n> such that:
      ► id is a unique identifier of the correspondence.
      ► e and e' are entities of O and O' respectively (e.g., XML elements, classes).
      ► R is a relation (e.g., equivalence (=), disjointness (_|_)).
      ► n is a confidence measure in some mathematical structure (typically in the [0,1] range).
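The 5-tuple above maps naturally onto a small record type. The following is only a sketch of one possible representation: the field names, the use of plain strings for entities and relations, and the example entities "Book" and "Volume" are assumptions made for illustration, not something the slides prescribe.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Correspondence:
    """One correspondence <id, e, e', R, n> between entities of O and O'."""
    id: str            # unique identifier of the correspondence
    e: str             # entity of ontology O (represented here by its name)
    e_prime: str       # entity of ontology O'
    relation: str      # R, e.g. "=" (equivalence) or "_|_" (disjointness)
    confidence: float  # n, assumed here to lie in the [0, 1] range

    def __post_init__(self):
        # Reject confidence values outside the assumed [0, 1] range.
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError("confidence must be in [0, 1]")


# Hypothetical example: "Book" in O is judged equivalent to "Volume" in O'.
c = Correspondence("c1", "Book", "Volume", "=", 0.8)
```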

Page 19:

Matching Problem

► Alignment
    Definition: Given two ontologies O and O', an alignment A between O and O':
      ► is a set of correspondences on O and O'
      ► has some cardinality: 1-1, 1-*, etc.
      ► carries some additional metadata (method, date, properties, etc.)
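As a rough illustration of this definition, an alignment can be held as a list of correspondences plus a metadata dictionary, with the cardinality derived from how often each entity occurs. The class layout, the `cardinality` helper, and the example entities and metadata values are all hypothetical.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Correspondence:
    id: str
    e: str            # entity of O
    e_prime: str      # entity of O'
    relation: str
    confidence: float


@dataclass
class Alignment:
    correspondences: list = field(default_factory=list)
    metadata: dict = field(default_factory=dict)  # e.g. method, date

    def cardinality(self) -> str:
        """Return "1-1", "1-*", "*-1" or "*-*" for this alignment."""
        left = Counter(c.e for c in self.correspondences)
        right = Counter(c.e_prime for c in self.correspondences)
        # An entity of O mapped to several entities of O' makes the right side "*".
        right_side = "*" if any(n > 1 for n in left.values()) else "1"
        # An entity of O' reached from several entities of O makes the left side "*".
        left_side = "*" if any(n > 1 for n in right.values()) else "1"
        return f"{left_side}-{right_side}"


# Hypothetical alignment: "Person" in O corresponds to two entities of O'.
a = Alignment(
    correspondences=[
        Correspondence("c1", "Person", "Human", "=", 0.9),
        Correspondence("c2", "Person", "Individual", "=", 0.6),
    ],
    metadata={"method": "string-based"},
)
```

With these two correspondences, `a.cardinality()` reports a 1-* alignment, because a single entity of O is related to more than one entity of O'.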

Page 20:

Matching Process

Page 21:

Matching Process

Page 22:

Matching Process

Page 23:

Matching Process

Page 24:

Matching Process

Page 25:

Matching Process

Page 26:

Matching Process

► General Basic Matching Process

Page 27:

Matching Process

► Sequential Composition

Page 28:

Matching Process

► Parallel Composition

Page 29:

Matching Process

► Similarity filter, alignment extractor and alignment filter

Page 30:

Matching Process

► Aggregation Operations
    There are many different ways to aggregate matcher results, usually depending on confidence/similarity:
      ► Triangular norms (min, weighted products), useful for selecting only the best results
      ► Multidimensional distances (Euclidean distance, weighted sum), useful for taking into account all dimensions
      ► Fuzzy aggregation (min, weighted average), useful for aggregating competing algorithms and averaging their results
      ► Other specific measures (e.g., ordered weighted average)
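The aggregation operators listed above can be sketched as small functions over the scores that several matchers return for the same pair of entities. The function names, the choice of plain lists as input, and the sample numbers are assumptions for illustration; the Euclidean variant is written over distances (dissimilarities) rather than similarities.

```python
import math


def aggregate_min(scores):
    """Triangular norm: keep only the most pessimistic matcher score."""
    return min(scores)


def weighted_product(scores, weights):
    """Triangular-norm style aggregation: product of scores raised to their weights."""
    return math.prod(s ** w for s, w in zip(scores, weights))


def euclidean_distance(distances):
    """Multidimensional distance: combine per-matcher distances over all dimensions."""
    return math.sqrt(sum(d * d for d in distances))


def weighted_sum(scores, weights):
    """Linear combination of the matcher scores."""
    return sum(w * s for s, w in zip(scores, weights))


def weighted_average(scores, weights):
    """Fuzzy aggregation: weighted sum normalized by the total weight."""
    return weighted_sum(scores, weights) / sum(weights)


# Hypothetical similarity scores from three matchers for one entity pair.
scores = [0.9, 0.4, 0.7]
weights = [1, 1, 2]
lowest = aggregate_min(scores)
average = weighted_average(scores, weights)
```

Taking the minimum is conservative, since a pair is only kept if every matcher agrees, while the weighted average lets a strong matcher compensate for a weak one.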

Page 31:

Application Domains

► Traditional
    Ontology evolution
    Schema integration
    Catalog integration
    Data integration

Page 32:

Application Domains

► Ontology Evolution

Page 33:

Application Domains

► Catalog Integration

Page 34:

Application Domains

► Emergent
    P2P information sharing
    Agent communication
    Web service composition
    Query answering on the web

Page 35:

Application Domains

► P2P information sharing

Page 36:

Application Domains

► Web Service Composition

Page 37:

Application Domains

► Agent communication

Page 38:

Classifications

► Matching Dimensions
    Input Dimensions
      ► Underlying models (e.g., XML, OWL)
      ► Schema level vs. instance level

    Process Dimensions
      ► Approximate vs. exact
      ► Interpretation of the input

    Output Dimensions
      ► Cardinality
      ► Equivalence vs. diverse relations
      ► Graded vs. absolute confidence

Page 39:

Classifications

► Three Layers
    Upper Layer
      ► Granularity of match
      ► Interpretation of the input information

    Middle Layer
      ► Represents classes of elementary (basic) matching techniques

    Lower Layer
      ► Based on the kind of input which is used by elementary matching techniques

Page 40:

Classifications
► Classification of schema-based techniques

Page 41:

Basic Techniques
► Element Level Techniques

    String based
      Prefix
        ► Takes as input two strings and checks whether one of them starts with the other
        ► e.g., net = network, but also hot = hotel
      Suffix
        ► Takes as input two strings and checks whether one of them ends with the other
        ► e.g., ID = PID, but also word = sword

      Edit Distance
        ► Takes as input two strings and calculates the number of edit operations (insertion, deletion, substitution) of characters required to transform one string into the other, normalized by the length of the longer string.
        ► editDistance(NKN, Nikon) = 0.4
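The three string-based tests can be sketched in a few lines. This is an illustrative version, not the implementation behind the slides: the function names are made up, and comparison is assumed to ignore case, which is what makes the NKN/Nikon example come out at 0.4 (two insertions over a maximum length of five).

```python
def is_prefix(s, t):
    """True if one string starts with the other (case-insensitive)."""
    s, t = s.lower(), t.lower()
    return s.startswith(t) or t.startswith(s)


def is_suffix(s, t):
    """True if one string ends with the other (case-insensitive)."""
    s, t = s.lower(), t.lower()
    return s.endswith(t) or t.endswith(s)


def edit_distance(s, t):
    """Levenshtein distance normalized by the length of the longer string."""
    s, t = s.lower(), t.lower()
    if not s and not t:
        return 0.0
    # prev[j] holds the distance between the processed prefix of s and t[:j].
    prev = list(range(len(t) + 1))
    for i, cs in enumerate(s, 1):
        curr = [i]
        for j, ct in enumerate(t, 1):
            cost = 0 if cs == ct else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1] / max(len(s), len(t))


distance = edit_distance("NKN", "Nikon")
```

As the hot/hotel and word/sword examples show, prefix and suffix tests produce false positives, so in practice their result is usually treated as one signal among several rather than as a final answer.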

Page 42:

Basic Techniques
    Language based
      Tokenization
        ► Parses names into tokens by recognizing punctuation and case changes
        ► Hands-Free_Kits → <hands, free, kits>

      Lemmatization
        ► Analyzes tokens morphologically in order to find all their possible basic forms
        ► Kits → Kit

      Elimination
        ► Discards "empty" tokens that are articles, prepositions, conjunctions
        ► a, the, by, type of, their, from
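A minimal sketch of the three language-based steps follows. The regular expression, the stop-word set, and especially the "strip a trailing s" rule are simplifying assumptions: real lemmatization needs a morphological analyzer or dictionary, and the multi-word entry "type of" from the slide would need phrase handling that is not shown here.

```python
import re

# Single-word "empty" tokens taken from the slide; an assumption, not a full list.
STOP_WORDS = {"a", "the", "by", "of", "their", "from"}


def tokenize(name):
    """Split a name on punctuation and case changes, then lowercase the tokens."""
    parts = re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+", name)
    return [p.lower() for p in parts]


def lemmatize(token):
    """Naive stand-in for lemmatization: drop a plural "s" from longer tokens."""
    if len(token) > 3 and token.endswith("s") and not token.endswith("ss"):
        return token[:-1]
    return token


def eliminate(tokens):
    """Discard tokens that appear in the stop-word set."""
    return [t for t in tokens if t not in STOP_WORDS]


def normalize(name):
    """Run tokenization, lemmatization and elimination in sequence."""
    return eliminate([lemmatize(t) for t in tokenize(name)])


tokens = tokenize("Hands-Free_Kits")
normalized = normalize("Hands-Free_Kits")
```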

Page 43:

Basic Techniques

► Structure Level Techniques
    Ontologies are viewed as graph-like structures containing terms and their inter-relationships.

    Taxonomy based
      ► Bounded path matching
          These take two paths with links between classes defined by the hierarchical relations, compare terms and their positions along these paths, and identify similar terms.

      ► Super(sub)-concept rules
          If super-concepts are the same, the actual concepts are similar to each other.
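The super-concept rule can be illustrated with a toy sketch in which each ontology is reduced to a mapping from a concept to its direct super-concept. The function name, the dictionary representation, and the example concepts are hypothetical; "the same" super-concept is interpreted here as "already matched by an earlier step".

```python
def super_concept_candidates(parents_o, parents_o2, matched):
    """Return concept pairs whose direct super-concepts are already matched.

    parents_o / parents_o2: dict mapping concept -> direct super-concept.
    matched: set of (super-concept in O, super-concept in O') pairs.
    """
    pairs = []
    for concept, parent in parents_o.items():
        for concept2, parent2 in parents_o2.items():
            if (parent, parent2) in matched:
                pairs.append((concept, concept2))
    return pairs


# Hypothetical ontologies: "Book" and "Volume" are assumed to be matched already.
parents_o = {"Novel": "Book", "Car": "Vehicle"}
parents_o2 = {"Fiction": "Volume"}
candidates = super_concept_candidates(parents_o, parents_o2, {("Book", "Volume")})
```

The pairs returned are only candidates for similarity; the rule on its own says nothing about how strong the resulting correspondence should be.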

Page 44:

Basic Techniques

    Tree based
      Children
        ► Two non-leaf schema elements are structurally similar if their immediate children sets are highly similar.

      Leaves
        ► Two non-leaf schema elements are structurally similar if their leaf sets are highly similar, even if their immediate children are not.
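The difference between the two tree-based rules shows up when one schema inserts an intermediate element. The sketch below assumes a tree stored as a dictionary from node to list of children, treats nodes with equal names as the same element, and uses the Jaccard coefficient as the measure of "highly similar"; all of these are illustrative choices, as are the example schemas.

```python
def children(tree, node):
    """Immediate children of a node."""
    return set(tree.get(node, []))


def leaves(tree, node):
    """All leaf elements reachable from a node."""
    kids = tree.get(node, [])
    if not kids:
        return {node}
    result = set()
    for kid in kids:
        result |= leaves(tree, kid)
    return result


def jaccard(a, b):
    """Size of the intersection divided by size of the union."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


# Hypothetical schemas: the second one nests the same leaves one level deeper.
tree1 = {"Address": ["Street", "City"]}
tree2 = {"Address": ["Location"], "Location": ["Street", "City"]}

children_sim = jaccard(children(tree1, "Address"), children(tree2, "Address"))
leaves_sim = jaccard(leaves(tree1, "Address"), leaves(tree2, "Address"))
```

Here the children rule finds no overlap between the two Address elements, while the leaves rule finds them identical, which is the situation the "even if their immediate children are not" clause describes.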

Page 45:

Basic Techniques

Page 46:

Basic Techniques

Page 47:

Basic Techniques

Page 48:

Summary and Challenges

► Summary
    Ontology matching and alignment is the process of deriving the common (or most common) structural and semantic terms from two or more different ontologies, structures, or schemas.

    Various efficient and complex algorithms for matching and alignment generation can be built from the basic techniques of the matching process.

► Challenges
    Developing generic and highly efficient matching and alignment generation algorithms.

Page 49:

Thank You

