Ontology Mapping and link discovery

Post on 28-Jan-2016

DESCRIPTION

Ontology Mapping and link discovery. Kunal Narsinghani, Ashwini Lahane. Agenda: Introduction; Levels of heterogeneity; Previous work in the field; PROMPT suite of tools; PROMPT on Protégé; The Web of Data; CRS: managing co-references; Silk: a link discovery framework.

TRANSCRIPT

  • Kunal Narsinghani, Ashwini Lahane

    Ontology Mapping and link discovery

  • Agenda

    Introduction

    Levels of heterogeneity

    Previous work in the field

    PROMPT suite of tools

    PROMPT on Protégé

    The Web of Data

    CRS: managing co-references

    Silk: a link discovery framework

  • Introduction

    Can a single ontology suffice for various applications?

    Definition: the task of relating the vocabularies of two ontologies that share the same domain of discourse

    It is a morphism that consists of a collection of functions assigning the symbols used in one vocabulary to the symbols in the other [1]

    This provides a common layer through which ontologies can be accessed and can exchange information.

    Translation is different from mapping

  • Introduction

    An analogy to the problem: clocks

    Levels of heterogeneity in ontologies:

    Syntactic

    Structural

    Semantic

  • Mapping discovery

    The first approach is to use a reference ontology

    Example: the upper ontologies SUMO and DOLCE

    What if a shared ontology is not available?

    Structural and definitional information can be used to discover mappings

    Example tools: IF-Map, QOM, MAFRA and PROMPT

  • IF-MAP architecture

    Fig: The steps in IF-MAP

  • PROMPT Suite of Tools

    Interactive tools for ontology merging and mapping

    Ontology: a formal specification of domain information that facilitates knowledge sharing and reuse

    Different ontologies may overlap and need to be reconciled

    Determine correlation

    Find all concepts

    Determine similarities

    Change source ontologies or remove overlap

    Record mapping for future reference

  • Ontology Management

    Tasks:

    Finding correlations

    Merging ontologies

    Version management

    Factoring ontologies

    Tools:

    Benefit from being tightly integrated into a single framework

    Uniform user interface

    Same interaction paradigms

    Easy access from one tool to another

  • PROMPT Knowledge Model

    Based on the knowledge model of Protégé

    Frame based

    Types of frames:

    Class: a set of entities specifying a concept

    Slots: attributes of a class; each has a domain and a range and must have a unique name

    Instances: elements of a class

  • PROMPT Framework

    Tools for multiple-ontology management

    Extension to the Protégé ontology-editing environment

    Open architecture allows easy extension with plugins

    Tools in PROMPT:

    IPROMPT: an interactive ontology merging tool

    ANCHORPROMPT: a graph-based tool for finding similarities between ontologies

    PROMPTDIFF: a tool for finding a diff between two versions of the same ontology

    PROMPTFACTOR: a tool for extracting a part of an ontology

  • PROMPT Framework

  • IPROMPT

    Interactive ontology merging tool

    Leads the user through the merging process

    Makes suggestions for merging

    Identifies inconsistencies and potential problems

    Suggests strategies for resolving them

    Uses the structure of concepts and their relations along with user input

    Decisions are based on local context

    Iterative

  • IPROMPT Algorithm

  • IPROMPT Algorithm

    Creates initial suggestions based on the lexical similarity of names

    The merged ontology contains frames that are similar to frames in the input ontologies

    Two ontologies, O1 and O2, are merged to form Om

    Merging decisions are designer and task dependent

    A set of knowledge-based operations is defined

    For each operation, the tool reports:

    Changes performed automatically

    New merging suggestions

    Inconsistencies and potential problems
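The first step above, proposing merge candidates from the lexical similarity of frame names, can be sketched as follows. This is only an illustration, not PROMPT's actual matcher: the use of `difflib.SequenceMatcher`, the `normalize` helper and the 0.8 threshold are all assumptions made for the example. The frame names are borrowed from the ANCHORPROMPT slide later in the deck.

```python
from difflib import SequenceMatcher


def normalize(name):
    """Lower-case a frame name and drop separators (assumed normalization)."""
    return "".join(ch for ch in name.lower() if ch.isalnum())


def initial_suggestions(names_o1, names_o2, threshold=0.8):
    """Return (name1, name2, score) merge candidates, best score first.

    The threshold is a hypothetical cut-off, not a value taken from PROMPT.
    """
    suggestions = []
    for n1 in names_o1:
        for n2 in names_o2:
            score = SequenceMatcher(None, normalize(n1), normalize(n2)).ratio()
            if score >= threshold:
                suggestions.append((n1, n2, round(score, 2)))
    return sorted(suggestions, key=lambda s: s[2], reverse=True)


# Frame names of the two source ontologies O1 and O2.
o1 = ["TRIAL", "PERSON", "PROTOCOL", "STUDY-SITE"]
o2 = ["Trial", "Person", "Design", "Blinding"]
suggestions = initial_suggestions(o1, o2)
for name1, name2, score in suggestions:
    print(f"suggest merging {name1} with {name2} (score {score})")
```

In an interactive tool these candidates would only be proposals; the designer accepts or rejects each one, and every accepted merge can trigger new suggestions.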

  • Class hierarchies

  • Suggestion for merging

  • IPROMPT Operations

    Merge classes

    Merge slots

    Merge instances

    Shallow copy of a class: copies the class from the source ontology to the merged ontology

    Deep copy of a class: also copies all the parents of the class, up to the root of the hierarchy
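The difference between the two copy operations can be illustrated with a small sketch. The representation is an assumption made for the example: the source hierarchy is a dictionary from class name to its list of parents, the merged ontology is a plain set of class names, and the class names themselves are hypothetical.

```python
def shallow_copy(cls, parents, merged):
    """Copy only the class itself into the merged ontology."""
    merged.add(cls)


def deep_copy(cls, parents, merged):
    """Copy the class and all of its ancestors up to the root."""
    stack = [cls]
    while stack:
        current = stack.pop()
        if current not in merged:
            merged.add(current)
            stack.extend(parents.get(current, []))


# Hypothetical source hierarchy: class -> list of direct parents.
parents = {"ClinicalTrial": ["Study"], "Study": ["Thing"], "Thing": []}

merged_shallow = set()
shallow_copy("ClinicalTrial", parents, merged_shallow)

merged_deep = set()
deep_copy("ClinicalTrial", parents, merged_deep)

print(sorted(merged_shallow))
print(sorted(merged_deep))
```

After a shallow copy the class arrives without its parents, which is one way the dangling references listed on the next slide can arise.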

  • Inconsistencies & Potential Problems

    Name conflicts

    Dangling references

    Redundancy in the class hierarchy

    Slot values violating slot-value restrictions

  • Additional features

    Setting up preferred ontology

    Maintaining user focus

    Providing feedback to user

    Logging of ontology merging and editing operations

  • ANCHORPROMPT

    Graph-based tool for finding similarities

    Compares larger portions of the ontologies

    Goal: augment IPROMPT by determining additional points of similarity

    Input: anchors, a set of pairs of related terms

    Anchor identification: manual or automatic

    Each ontology is viewed as a directed labeled graph

  • ANCHORPROMPT representation

  • ANCHORPROMPT algorithm

  • Algorithm

    Begins with the anchor pairs (TRIAL, Trial) and (PERSON, Person)

    Path 1: TRIAL -> PROTOCOL -> STUDY-SITE -> PERSON

    Path 2: Trial -> Design -> Blinding -> Person

    Determines a similarity score for each pair of related terms

    If two pairs of terms from the source ontologies are similar and there are paths connecting the terms, then the elements in those paths are often similar as well
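A minimal sketch of this path-based scoring, using the two paths from the slide. It is a simplification under stated assumptions: each ontology is a dictionary of directed edges, only paths of equal length are compared, only the interior nodes at matching positions are scored, and the function names and the `max_len` limit are hypothetical. The published algorithm has further details that are not modeled here.

```python
from collections import Counter
from itertools import permutations


def simple_paths(graph, start, end, max_len):
    """Yield cycle-free paths from start to end with at most max_len nodes."""
    stack = [[start]]
    while stack:
        path = stack.pop()
        if path[-1] == end:
            yield path
            continue
        if len(path) < max_len:
            for nxt in graph.get(path[-1], []):
                if nxt not in path:
                    stack.append(path + [nxt])


def anchor_scores(g1, g2, anchors, max_len=5):
    """Score term pairs that sit at the same position on equal-length paths."""
    scores = Counter()
    for (a1, a2), (b1, b2) in permutations(anchors, 2):
        for p1 in simple_paths(g1, a1, b1, max_len):
            for p2 in simple_paths(g2, a2, b2, max_len):
                if len(p1) == len(p2):
                    # Endpoints are the anchors themselves, so skip them.
                    for x, y in zip(p1[1:-1], p2[1:-1]):
                        scores[(x, y)] += 1
    return scores


g1 = {"TRIAL": ["PROTOCOL"], "PROTOCOL": ["STUDY-SITE"], "STUDY-SITE": ["PERSON"]}
g2 = {"Trial": ["Design"], "Design": ["Blinding"], "Blinding": ["Person"]}
anchors = [("TRIAL", "Trial"), ("PERSON", "Person")]

scores = anchor_scores(g1, g2, anchors)
for pair, score in scores.items():
    print(pair, score)
```

Pairs that accumulate high scores over many paths are the additional points of similarity that would be handed back to IPROMPT as suggestions.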

  • PROMPTDIFF

    Tool for comparing ontology versions

    Version comparison for software code is based on comparing text files

    The same ontology can have different text representations

    Heuristic algorithm that produces a structural diff between two versions

    Compares the structure of the two ontology versions

    Identifies which frames changed and what changes were made

  • PromptDiff Algorithm

    An extensible set of heuristic matchers

    A fixed-point algorithm combines the results of the matchers to produce a structural diff between two versions
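The fixed-point idea can be sketched as a loop that keeps running every matcher until none of them finds a new match. The two matchers below are illustrative stand-ins, not PromptDiff's actual heuristics, and the representation (each version as a dictionary from frame name to its single parent) and the frame names are assumptions made for the example.

```python
def match_same_name(v1, v2, matches):
    """Match still-unmatched frames that have the same name in both versions."""
    taken = set(matches.values())
    return {n: n for n in v1 if n in v2 and n not in matches and n not in taken}


def match_single_unmatched_sibling(v1, v2, matches):
    """If matched parents each have exactly one unmatched child, match those."""
    taken = set(matches.values())
    new = {}
    for p1, p2 in matches.items():
        kids1 = [c for c, p in v1.items() if p == p1 and c not in matches]
        kids2 = [c for c, p in v2.items() if p == p2 and c not in taken]
        if len(kids1) == 1 and len(kids2) == 1:
            new[kids1[0]] = kids2[0]
    return new


def fixed_point_diff(v1, v2, matchers):
    """Run all matchers repeatedly until a full pass adds no new match."""
    matches = {}
    changed = True
    while changed:
        changed = False
        for matcher in matchers:
            found = matcher(v1, v2, matches)
            if found:
                matches.update(found)
                changed = True
    return matches


# Hypothetical versions: frame name -> parent frame (None for the root).
v1 = {"Wine": None, "Red wine": "Wine", "White wine": "Wine"}
v2 = {"Wine": None, "Red wine": "Wine", "Blanc": "Wine"}

matches = fixed_point_diff(v1, v2, [match_same_name, match_single_unmatched_sibling])
renamed = {old: new for old, new in matches.items() if old != new}
print(renamed)
```

The loop terminates because every matcher only returns frames that are not yet matched, so the set of matches grows until no matcher can add to it. New heuristics can be added to the list without changing the loop, which is what makes the set of matchers extensible.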

  • PROMPTFACTOR

    Tool for factoring out a semantically independent part of a large ontology into a new sub-ontology

    Ensures that severed links do not introduce ill-defined concepts in the sub-ontology

    The user can specify the concepts of interest

    Performs the transitive closure of the superclass relation and of all the relations defined by slots

    The target ontology works stand-alone

  • PromptFactor Algorithm

    The user specifies the concept of interest

    PromptFactor traverses the ontology starting from that term

    Determines the transitive closure of all relations, including the subclass-of relation

    Determines all the parents of the selected term in the hierarchy

    Interactive: the user guides the process

    Determines inconsistencies
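The traversal described above amounts to a reachability computation. The sketch below assumes a simplified model in which every frame maps to the set of frames it refers to, whether through superclass links or slot ranges; the `factor` function and the example frames are hypothetical and are not part of PROMPT.

```python
def factor(ontology, concepts_of_interest):
    """Return the sub-ontology reachable from the concepts of interest.

    ontology maps each frame to the set of frames it references
    (superclasses and slot ranges are treated alike in this sketch).
    """
    keep = set()
    stack = list(concepts_of_interest)
    while stack:
        frame = stack.pop()
        if frame in keep:
            continue
        keep.add(frame)
        stack.extend(ontology.get(frame, ()))
    return {frame: set(ontology.get(frame, ())) for frame in keep}


# Hypothetical ontology: frame -> frames it depends on.
ontology = {
    "Trial": {"Study", "Person"},
    "Study": {"Thing"},
    "Person": {"Thing"},
    "Drug": {"Thing"},
    "Thing": set(),
}

sub = factor(ontology, ["Trial"])
print(sorted(sub))
```

Because every referenced frame is pulled in, no reference in the result points outside it, which is the property that lets the target ontology work stand-alone.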

  • Prompt Demo

    It is available as a plug-in for Protégé 3.4

    Uses linguistic similarity matches between concepts

    Also matches slot names and slot value types

    In cases where automation is not possible, user intervention is needed; possible actions are suggested

    Alignment is followed by merging

    Alignment is establishing links between the ontologies

    Merging is the creation of a single coherent ontology

  • Prompt Demo

  • The Web of Data

    Data sources span a large range of domains

    RDF data model is used to publish structured data on the web

    Explicit RDF links exist between entities in different data sources

    However, there is a lack of tools to set RDF links to other data sources

  • Silk

    Provides a link specification language (Silk LSL)

    Allows specification of the links that should be discovered between data sources, as well as conditions to be fulfilled to be linked

    Link conditions are specified using similarity metrics; they can use aggregation functions to combine similarity scores

    Data access performed using SPARQL

  • Silk Features

    Support for owl:sameAs links and other types of RDF links

    Provides a declarative language to specify link conditions

    Datasets need not be replicated locally

    Caching, indexing and entity pre-selection are used to enhance performance

  • Silk LSL example

  • Silk LSL example..contd

  • Silk similarity metrics

    Similarity metrics can be combined using aggregation functions

    Sets of resources can be selected using the Silk RDF path selector language
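How several similarity metrics might be combined by an aggregation function can be shown with a short Python sketch. This is not Silk LSL syntax and does not use Silk's own metrics: the two similarity functions, the weighted average, the weights, the 0.9 threshold and the example entities are all assumptions chosen for illustration.

```python
from difflib import SequenceMatcher


def string_similarity(a, b):
    """Case-insensitive string similarity in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def numeric_similarity(a, b, tolerance):
    """1.0 for equal numbers, falling linearly to 0.0 at `tolerance` apart."""
    return max(0.0, 1.0 - abs(a - b) / tolerance)


def weighted_average(scored):
    """Aggregate a list of (similarity, weight) pairs into one score."""
    total_weight = sum(weight for _, weight in scored)
    return sum(sim * weight for sim, weight in scored) / total_weight


def should_link(source, target, threshold=0.9):
    """Return (link?, score) for a pair of entities (hypothetical condition)."""
    score = weighted_average([
        (string_similarity(source["label"], target["label"]), 2),
        (numeric_similarity(source["population"], target["population"], 10000), 1),
    ])
    return score >= threshold, score


# Made-up entities, as if taken from two different data sources.
city_a = {"label": "Berlin", "population": 3400000}
city_b = {"label": "berlin", "population": 3401000}
city_c = {"label": "Bern", "population": 130000}

print(should_link(city_a, city_b))
print(should_link(city_a, city_c))
```

A declarative link specification expresses the same kind of condition as configuration rather than code, so that the metrics, weights and threshold can be changed without reprogramming.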

  • Silk Pre-Matching

    Comparing all entities in source S and target T would need O(|S|*|T|) comparisons

    Pre-matching finds a limited set of target entities that are likely to match a given source entity

    Performed by indexing the target resources based on their property values

    Using this scheme reduces runtime to O(|S| + |T|)
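The indexing step can be sketched as an inverted index over the tokens of one property value. This is an illustration of the general idea rather than Silk's implementation; the whitespace tokenization, the function names and the example.org URIs are assumptions.

```python
from collections import defaultdict


def tokens(value):
    """Split a property value into lower-case tokens (assumed tokenization)."""
    return set(value.lower().split())


def build_index(targets, prop):
    """Index target URIs by the tokens of one property: a single pass over T."""
    index = defaultdict(set)
    for uri, entity in targets.items():
        for tok in tokens(entity[prop]):
            index[tok].add(uri)
    return index


def candidates(source_entity, index, prop):
    """Return target URIs that share at least one token with the source."""
    found = set()
    for tok in tokens(source_entity[prop]):
        found |= index.get(tok, set())
    return found


# Made-up target entities.
targets = {
    "http://example.org/t/1": {"label": "Berlin"},
    "http://example.org/t/2": {"label": "New York City"},
    "http://example.org/t/3": {"label": "York"},
}

index = build_index(targets, "label")
matches = candidates({"label": "New York"}, index, "label")
print(sorted(matches))
```

Only the returned candidates go through the full similarity comparison. The trade-off is that a true match sharing no token with the source entity is never considered.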

  • Silk Implementation

  • Managing coreferences

    Semantic Web vision: large quantities of information that are readily available, interlinked and machine readable

    Fragmented web

    Significant overlap

    Need to identify duplicates

    Co-reference resolution: determining equivalent URIs

  • Co-reference Resolution Service (CRS)

    Systematic analysis and heuristic-based approach to identifying, publishing, managing and using co-reference information

    Most prevalent way: owl:sameAs

    Equivalence is context dependent

  • CRSes

    Maintain sets of equivalent URIs

    Store co-reference data separately

    URI definitions and synonyms are kept separate

    Management techniques: history, rollback, annotation

    Applications can use multiple CRSes

    Core functionality in PHP for easy integration

    Backed by MySQL

  • Data representation in CRS

    Equivalent URIs are stored in bundles

    One URI in each bundle is considered the canon, the preferred URI

    Formation of bundles:

    Check whether the URI already exists in any bundle

    If not, create a singleton bundle for the new URI

    Merge bundles that contain equivalent URIs by taking their union

    The constituent bundles that were merged are marked inactive
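The bundle-formation steps above can be sketched with an in-memory store. The slides describe the real service as PHP backed by MySQL, so this Python class only illustrates the logic: the `CorefStore` name, its methods, the example URIs and the rule that the first bundle's canon survives a merge are all assumptions.

```python
class CorefStore:
    """Toy in-memory model of bundles of equivalent URIs."""

    def __init__(self):
        self.bundles = {}     # bundle id -> {"uris", "canon", "active"}
        self.bundle_of = {}   # uri -> id of its active bundle
        self._next_id = 0

    def _new_bundle(self, uris, canon):
        bid = self._next_id
        self._next_id += 1
        self.bundles[bid] = {"uris": set(uris), "canon": canon, "active": True}
        for uri in uris:
            self.bundle_of[uri] = bid
        return bid

    def add_uri(self, uri):
        """Create a singleton bundle unless the URI is already in a bundle."""
        if uri not in self.bundle_of:
            self._new_bundle({uri}, uri)
        return self.bundle_of[uri]

    def assert_equivalent(self, uri_a, uri_b):
        """Merge the bundles of two URIs; old bundles are kept but inactive."""
        a, b = self.add_uri(uri_a), self.add_uri(uri_b)
        if a == b:
            return a
        merged = self.bundles[a]["uris"] | self.bundles[b]["uris"]
        self.bundles[a]["active"] = False
        self.bundles[b]["active"] = False
        # Assumption: the canon of the first bundle becomes the new canon.
        return self._new_bundle(merged, self.bundles[a]["canon"])

    def canon(self, uri):
        return self.bundles[self.bundle_of[uri]]["canon"]

    def equivalents(self, uri):
        return set(self.bundles[self.bundle_of[uri]]["uris"])


store = CorefStore()
a, b, c = "http://example.org/a", "http://example.org/b", "http://example.org/c"
store.assert_equivalent(a, b)
store.assert_equivalent(b, c)
active = [bid for bid, bundle in store.bundles.items() if bundle["active"]]
print(store.canon(c), sorted(store.equivalents(a)), active)
```

Keeping the merged bundles around as inactive records, instead of deleting them, is what would make history and rollback possible.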

  • Examples of bundle formation

  • Data representation

    Data storage: indexed tables of hashed URIs

    Permits fast lookup to find:

    The canon of a given URI

    All URIs in a bundle

    URIs are deprecated by flags

    Finding all equivalences: coref:coreferenceData links to the bundle for a URI; the process is repeated recursively for each URI in that bundle

  • RDF description of equivalent URIs in a bundle

  • Ways to speed up

    Look up only one URI from each CRS

    Follow only the coref:canon predicate

    Lookup would then need O(log|S| + log|T|)

  • References

    [1] Natalya F. Noy and Mark A. Musen. The PROMPT Suite: Interactive Tools for Ontology Merging and Mapping. Stanford Medical Informatics, Stanford University.

    [2] Hugh Glaser, Afraz Jaffri and Ian C. Millard. Managing Co-reference on the Semantic Web. School of Electronics and Computer Science, University of Southampton, Southampton, Hampshire, UK.

    [3] Yannis Kalfoglou and Marco Schorlemmer. Ontology Mapping: The State of the Art.

    [4] Yannis Kalfoglou and Marco Schorlemmer (2003a). IF-Map: An Ontology Mapping Method Based on Information Flow Theory. Journal on Data Semantics, 1(1):98-127.

    [5] Julius Volz, Christian Bizer et al. Silk: A Link Discovery Framework for the Web of Data.