Post on 11-Jan-2016
Ontology Alignment/Matching
Prafulla Palwe
Agenda
Introduction
- Being serious about the semantic web
- Living with heterogeneity
- Heterogeneity problem
- I have a plan for you
Matching Problem
- Matching Operation
- Motivation
- Schema Matching vs. Ontology Matching
- Correspondence
- Alignment
Matching Process
- Sequential composition
- Parallel composition
Basic Techniques
- Element Level
- Structure Level
Summary and Challenges
Introduction: Being serious about the semantic web
- It is not one guy's ontology
- It is not several guys' common ontology
- It is many guys and girls' many ontologies
- So it is a mess, but a meaningful mess
Introduction: Living with heterogeneity
The semantic web will be:
- Huge
- Dynamic
- Heterogeneous
These are not bugs, they are features. We must learn to live with them.
Introduction: Heterogeneity problem
Resources expressed in different ways must be reconciled before being used.
Mismatch between formalized knowledge can occur when:
- different languages are used;
- different terminologies are used;
- different modeling is used.
Introduction: I have a plan for you
- Reconciliation
Matching Problem: Matching Operation
Definition: The matching operation takes as input ontologies, each consisting of a set of discrete entities (e.g., tables, XML elements, classes, properties), and determines as output the relationships (e.g., equivalence, subsumption) holding between these entities.
Matching Problem: Motivation
[Figure: 2 XML schemas; 2 ontologies]
Matching Problem: Schema mapping vs. ontology mapping
Differences
- Schemas often do not provide explicit semantics for their data
- Relational schemas provide no generalization
- Ontologies are logical systems that constrain the meaning
- An ontology is defined as a set of logical axioms
Commonalities
- Schemas and ontologies provide a vocabulary of terms that describes the domain of interest
- Schemas and ontologies constrain the meaning of terms used in the vocabulary
Matching Problem: Correspondence
Definition: Given 2 ontologies O1 and O2, a correspondence M between O1 and O2 is a 5-tuple <id, e1, e2, R, n> such that:
- id is a unique identifier of the correspondence
- e1 and e2 are entities of O1 and O2 respectively (e.g. XML elements, classes)
- R is a relation (e.g. equivalence (=), disjointness (⊥))
- n is a confidence measure in some mathematical structure (typically in the [0,1] range)
Matching Problem: Alignment
Definition: Given 2 ontologies O1 and O2, an alignment A between O1 and O2:
- is a set of correspondences on O1 and O2
- has some cardinality: 1-1, 1-*, etc.
- carries some additional metadata (method, date, properties, etc.)
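The correspondence and alignment definitions can be sketched as a small data structure. This is only an illustration under stated assumptions: the class names `Correspondence` and `Alignment`, the field names, the `is_one_to_one` helper, and the example entities are all hypothetical and do not come from the slides or from any particular matching library.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Correspondence:
    """One correspondence (id, e1, e2, R, n); field names are illustrative."""
    id: str            # unique identifier of the correspondence
    e1: str            # entity from the first ontology
    e2: str            # entity from the second ontology
    relation: str      # e.g. "=" for equivalence
    confidence: float  # n, assumed here to lie in [0, 1]

    def __post_init__(self):
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError("confidence must be in [0, 1]")


@dataclass
class Alignment:
    """A set of correspondences plus free-form metadata."""
    correspondences: set = field(default_factory=set)
    metadata: dict = field(default_factory=dict)

    def is_one_to_one(self) -> bool:
        # 1-1 cardinality: no entity appears twice on either side.
        left = [c.e1 for c in self.correspondences]
        right = [c.e2 for c in self.correspondences]
        return len(left) == len(set(left)) and len(right) == len(set(right))


alignment = Alignment(metadata={"method": "manual"})
alignment.correspondences.add(Correspondence("c1", "Book", "Volume", "=", 0.9))
alignment.correspondences.add(Correspondence("c2", "Author", "Writer", "=", 0.8))
```

Making `Correspondence` frozen keeps it hashable, so an alignment can literally be a set of correspondences, as in the definition.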
Matching Process: General Basic Matching Process
Matching Process: Sequential composition
Matching Process: Parallel composition
Matching Process: Similarity filter, alignment extractor and alignment filter
Matching Process: Aggregation Operations
There are many different ways to aggregate matcher results, usually depending on confidence/similarity:
- Triangular norms (min, weighted products): useful for selecting only the best results
- Multidimensional distances (Euclidean distance, weighted sum): useful for taking into account all dimensions
- Fuzzy aggregation (min, weighted average): useful for aggregating competing algorithms and averaging their results
- Other specific measures (e.g., ordered weighted average)
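A few of the aggregation operations listed above can be sketched as plain functions over the scores that several matchers assign to the same pair of entities. This is a minimal sketch: the function names, the example scores, and the weights are assumptions made for illustration.

```python
import math


def agg_min(scores):
    """Triangular norm: keep only the most pessimistic score."""
    return min(scores)


def agg_weighted_product(scores, weights):
    """Triangular norm: product of scores, each raised to its weight."""
    return math.prod(s ** w for s, w in zip(scores, weights))


def agg_weighted_sum(scores, weights):
    """Linear aggregation: every matcher contributes in proportion to its weight."""
    return sum(w * s for s, w in zip(scores, weights))


def agg_weighted_average(scores, weights):
    """Fuzzy aggregation: weighted sum normalized by the total weight."""
    return agg_weighted_sum(scores, weights) / sum(weights)


def agg_euclidean(distances):
    """Multidimensional distance over per-matcher dissimilarities."""
    return math.sqrt(sum(d * d for d in distances))


# Hypothetical scores from three matchers for one pair of entities.
scores = [0.8, 0.6, 0.9]
weights = [1, 1, 2]
combined = agg_weighted_average(scores, weights)
```

The min and product operators let a single low score drag the result down, which suits selecting only the best results, while the sum and average let strong matchers compensate for weak ones.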
Application Domains: Traditional
- Ontology evolution
- Schema integration
- Catalog integration
- Data integration
Application Domains: Ontology evolution
Application Domains: Catalog integration
Application Domains: Emergent
- P2P information sharing
- Agent communication
- Web service composition
- Query answering on the web
Application Domains: P2P information sharing
Application Domains: Web service composition
Application Domains: Agent communication
Classifications: Matching Dimensions
Input dimensions
- Underlying models (e.g. XML, OWL)
- Schema level vs. instance level
Process dimensions
- Approximate vs. exact
- Interpretation of the input
Output dimensions
- Cardinality
- Equivalence vs. diverse relations
- Graded vs. absolute confidence
Classifications: Three Layers
Upper layer
- Granularity of match
- Interpretation of the input information
Middle layer
- Represents classes of elementary (basic) matching techniques
Lower layer
- Based on the kind of input used by elementary matching techniques
Classifications: Classification of schema-based techniques
Basic Techniques: Element Level Techniques
String based
- Prefix: takes as input 2 strings and checks whether the first string starts with the second, e.g. net = network, but also hot = hotel
- Suffix: takes as input 2 strings and checks whether the first string ends with the second, e.g. ID = PID, but also word = sword
- Edit distance: takes as input 2 strings and calculates the number of edit operations (insertion, deletion, substitution) of characters required to transform one string into the other, normalized by the length of the longer string, e.g. editDistance(NKN, Nikon) = 0.4
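The three string-based techniques can be sketched as follows. The function names are illustrative, and the comparison is assumed to be case-insensitive, which is what makes the NKN/Nikon example come out at 0.4 (two insertions over a maximum length of five).

```python
def is_prefix(s, t):
    """True if s starts with t (case-insensitive by assumption)."""
    return s.lower().startswith(t.lower())


def is_suffix(s, t):
    """True if s ends with t (case-insensitive by assumption)."""
    return s.lower().endswith(t.lower())


def levenshtein(s, t):
    """Number of insertions, deletions and substitutions turning s into t."""
    prev = list(range(len(t) + 1))
    for i, cs in enumerate(s, 1):
        curr = [i]
        for j, ct in enumerate(t, 1):
            cost = 0 if cs == ct else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def normalized_edit_distance(s, t):
    """Edit distance divided by the length of the longer string."""
    s, t = s.lower(), t.lower()
    longest = max(len(s), len(t))
    if longest == 0:
        return 0.0
    return levenshtein(s, t) / longest


print(is_prefix("network", "net"), is_prefix("hotel", "hot"))
print(is_suffix("PID", "ID"), is_suffix("sword", "word"))
print(normalized_edit_distance("NKN", "Nikon"))
```

The hot/hotel and word/sword pairs pass the same tests as the intended matches, which is why these checks are usually combined with other evidence rather than used alone.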
Basic Techniques: Language based
- Tokenization: parses names into tokens by recognizing punctuation and case changes, e.g. Hands-Free_Kits → hands, free, kits
- Lemmatization: analyses tokens morphologically in order to find all their possible basic forms, e.g. Kits → Kit
- Elimination: discards empty tokens such as articles, prepositions and conjunctions, e.g. a, the, by, type of, their, from
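The language-based steps can be chained into a small normalization pipeline. This is a rough sketch under explicit assumptions: the regular expression, the stop-word set, and especially the plural-stripping rule in `lemmatize` are crude stand-ins chosen for illustration, not a real morphological analyser.

```python
import re

# Assumed stop-word list, loosely based on the examples in the slide.
STOP_WORDS = {"a", "the", "by", "of", "their", "from"}


def tokenize(name):
    """Split a name on punctuation, digits and case changes; lowercase the tokens."""
    parts = re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+", name)
    return [p.lower() for p in parts]


def lemmatize(token):
    """Very naive stand-in for lemmatization: strip a plain plural 's'."""
    if len(token) > 3 and token.endswith("s") and not token.endswith("ss"):
        return token[:-1]
    return token


def eliminate(tokens):
    """Drop tokens that carry no meaning for matching."""
    return [t for t in tokens if t not in STOP_WORDS]


def normalize(name):
    """Tokenize, eliminate stop words, then lemmatize what remains."""
    return [lemmatize(t) for t in eliminate(tokenize(name))]


print(tokenize("Hands-Free_Kits"))
print(normalize("Hands-Free_Kits"))
print(normalize("typeOfTheBook"))
```

Normalized token lists like these are what string-based measures would then compare, instead of the raw names.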
Basic Techniques: Structure Level Techniques
Ontologies are viewed as graph-like structures containing terms and their inter-relationships.
Taxonomy based
- Bounded path matching: takes 2 paths with links between classes defined by the hierarchical relations, compares terms and their positions along these paths, and identifies similar terms
- Super(sub)-concept rules: if the super-concepts are the same, the actual concepts are similar to each other
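The super-concept rule can be illustrated with two toy taxonomies stored as child-to-parent maps. This sketch makes two assumptions that are not in the slides: super-concepts count as "the same" when their labels are identical, and the degree of similarity is measured as the Jaccard overlap of the two ancestor sets. All names and example concepts are hypothetical.

```python
def super_concepts(concept, parent_of):
    """Collect all ancestors of a concept in a child -> parent map."""
    ancestors = set()
    current = parent_of.get(concept)
    while current is not None and current not in ancestors:
        ancestors.add(current)
        current = parent_of.get(current)
    return ancestors


def super_concept_similarity(c1, parents1, c2, parents2):
    """Jaccard overlap of the ancestor sets (an assumed scoring choice)."""
    a1 = super_concepts(c1, parents1)
    a2 = super_concepts(c2, parents2)
    if not a1 and not a2:
        return 0.0  # no structural evidence either way
    return len(a1 & a2) / len(a1 | a2)


# Two toy taxonomies: Novel and Fiction share the same super-concepts.
onto1 = {"Novel": "Book", "Book": "Product"}
onto2 = {"Fiction": "Book", "Book": "Product"}
onto3 = {"Fiction": "Media"}

same_parents = super_concept_similarity("Novel", onto1, "Fiction", onto2)
different_parents = super_concept_similarity("Novel", onto1, "Fiction", onto3)
```

In practice the super-concepts would first be matched by an element-level technique rather than by exact label equality.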
Basic Techniques: Tree based
- Children: 2 non-leaf schema elements are structurally similar if their immediate children sets are highly similar
- Leaves: 2 non-leaf schema elements are structurally similar if their leaf sets are highly similar, even if their immediate children are not
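The difference between the children rule and the leaves rule shows up on a small example. The sketch below assumes trees are given as node-to-children maps and that "highly similar" is measured by Jaccard overlap of labels; both choices, and the example schemas, are illustrative assumptions.

```python
def jaccard(a, b):
    """Overlap of two label collections, in [0, 1]."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def leaves(node, tree):
    """All leaf labels below a node in a node -> children map."""
    kids = tree.get(node, [])
    if not kids:
        return {node}
    result = set()
    for kid in kids:
        result |= leaves(kid, tree)
    return result


def children_similarity(n1, tree1, n2, tree2):
    """Compare the immediate children of two non-leaf elements."""
    return jaccard(tree1.get(n1, []), tree2.get(n2, []))


def leaf_similarity(n1, tree1, n2, tree2):
    """Compare the leaf sets of two non-leaf elements."""
    return jaccard(leaves(n1, tree1), leaves(n2, tree2))


# The second schema nests the same leaves one level deeper.
schema1 = {"Address": ["Street", "City"]}
schema2 = {"Address": ["Location"], "Location": ["Street", "City"]}

by_children = children_similarity("Address", schema1, "Address", schema2)
by_leaves = leaf_similarity("Address", schema1, "Address", schema2)
```

Here the children rule finds no overlap while the leaves rule finds a full match, which is the case the slide describes as "even if their immediate children are not".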
Summary and Challenges
Summary
- Ontology matching and alignment is the process of deriving the common (or most common) structural and semantic terms from 2 or more different ontologies, structures or schemas.
- Efficient and complex algorithms for matching and alignment generation can be developed from the basic matching techniques.
Challenges
- Developing generic and highly efficient matching and alignment generation algorithms.