PROMPT: Algorithm and Tool for Automated Ontology Merging and Alignment

Post on 12-Jan-2016

TRANSCRIPT

  • PROMPT: Algorithm and Tool for Automated Ontology Merging and Alignment
    Natalya F. Noy
    Stanford Medical Informatics, Stanford University

  • Outline
      - Definitions and motivation
      - The PROMPT ontology-merging algorithm
          - Incremental algorithm (PROMPT)
          - Statistical algorithm (Anchor-PROMPT)
      - The tools
      - Evaluation
      - Future work

  • Ontologies
      - Characterize concepts and relationships in an application area, providing a domain of discourse
      - Enumerate concepts, attributes of concepts, and relationships among concepts
      - Define constraints on relationships among concepts

  • Why do we need ontologies?
      - An ontology provides a shared vocabulary for different applications in a domain
      - An ontology enables interoperation among applications using disparate data sources from the same domain

  • Ontologies Are Everywhere
      - Ontologies have been used in academic projects for a long time
          - Knowledge sharing and reuse
          - Reuse of problem-solving methods
      - Ontologies are becoming widely used outside of academia
          - Categorization of Web sites (e.g. Yahoo!)
          - Product catalogs

  • Need for Ontology Merging
      - There is significant overlap in existing ontologies
          - Yahoo! and DMOZ Open Directory
          - Product catalogs for similar domains

  • Need for Ontology Merging and Integration
      - Need to merge or align overlapping ontologies
          - Chemdex: a portal for accessing life-science supply catalogs
      - Workshop on Ontologies and Information Sharing at IJCAI 2001
          - 6 out of 18 papers (1/3) are about ontology merging and integration

  • What Is Ontology Merging

  • Existing Approaches
      - Ontology design and integration
          - term matching (Stanford SKC, ISI)
          - graph-based analysis (Stanford SKC)
          - transformation operators (Ontomorph at ISI)
          - merging tools (Chimaera at Stanford KSL)
      - Object-oriented programming
          - subject-oriented programming (IBM)
          - subjective views of classes
          - transformation operations
          - concentrates on methods rather than relations

  • Existing Approaches (II)
      - Databases
          - develop mediators and provide wrappers
          - define a common data model and mappings
          - define matching rules to translate directly

    Most of these approaches
      - do not provide any guidance to the user
      - do not use structural information

  • Outline
      - Definitions and motivation
      - The PROMPT ontology-merging algorithm
          - Incremental algorithm (PROMPT)
          - Statistical algorithm (Anchor-PROMPT)
      - The tools
      - Evaluation
      - Future work

  • PROMPT
      - Our approach is:
          - Partial automation
          - Algorithms based on
              - concept-representation structure
              - relations between concepts
              - user's actions
      - Our approach is not:
          - Complete automation
          - Algorithm for matching concept names

  • Knowledge Model
      - A generic knowledge model of OKBC (Open Knowledge-Base Connectivity protocol)
      - Classes
          - Collections of objects with similar properties
          - Arranged in a subclass-superclass hierarchy
      - Instances
      - Slots
          - First-class objects in a knowledge base
          - Binary relations describing properties of classes and instances
      - Facets
          - Constraints on slot values (cardinality, min, max)
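
    The knowledge model above can be pictured as a handful of plain data types. The following is a minimal sketch, not the OKBC API: the names Facet, Slot, OntClass, and Instance and all of their fields are assumptions made for illustration, and the sample class and slot names are borrowed from the merge-classes example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Facet:
    """Constraints on the values of one slot (hypothetical field names)."""
    min_cardinality: int = 0
    max_cardinality: Optional[int] = None  # None means unbounded
    minimum: Optional[float] = None
    maximum: Optional[float] = None

    def allows(self, values: list) -> bool:
        """Check a list of slot values against the cardinality and range limits."""
        if len(values) < self.min_cardinality:
            return False
        if self.max_cardinality is not None and len(values) > self.max_cardinality:
            return False
        for value in values:
            if self.minimum is not None and value < self.minimum:
                return False
            if self.maximum is not None and value > self.maximum:
                return False
        return True


@dataclass
class Slot:
    """A first-class binary relation; it exists independently of any class."""
    name: str


@dataclass
class OntClass:
    """A class with its direct superclasses and the slots attached to it."""
    name: str
    superclasses: List[str] = field(default_factory=list)
    template_slots: Dict[str, Facet] = field(default_factory=dict)  # slot name -> facet


@dataclass
class Instance:
    """An individual member of a class, with its own slot values."""
    name: str
    of_class: str
    own_slots: Dict[str, list] = field(default_factory=dict)  # slot name -> values


has_client = Slot("has client")
agent = OntClass("Agent", ["Employee"], {has_client.name: Facet(min_cardinality=1)})
```

    Slots are kept as separate objects here because the slide describes them as first-class: the same slot can be attached to several classes, each time with its own facet.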

  • The PROMPT Algorithm
      - Make initial suggestions
      - Select the next operation

  • Example: merge-classes
    [Figure labels: Agency employee, Agent, Customer, subclass of, agent for; Agent, Employee, Traveler, subclass of, has client]

  • Example: merge-classes (II)

  • Analyzing Global Properties Locally
      - Global properties
          - classes that have the same sets of slots
          - classes that refer to the same set of classes
          - slots that are attached to the same classes
      - Local context
          - incremental analysis
          - consider only the concepts that were affected by the last operation

  • The PROMPT Operation Set
      - Extends the OKBC operation set with ontology-merging operations
          - merge classes
          - merge slots
          - merge instances
          - copy of a class
              - deep or shallow
              - with or without subclasses
              - with or without instances

  • After a User Performs an Operation
      - For each operation
          - perform the operation
          - consider possible conflicts
              - identify conflicts
              - propose solutions
          - analyze local context
              - create new suggestions
              - reinforce or downgrade existing suggestions

  • Conflicts
      - Conflicts that PROMPT identifies
          - name conflicts
          - dangling references
          - redundancy in a class hierarchy
          - slot-value restrictions that violate class inheritance
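
    Of the conflicts listed above, a dangling reference is the easiest to illustrate. The sketch below assumes a deliberately simple representation that is not taken from the PROMPT tool: the merged ontology is a dictionary from class name to a dictionary of slot name to the class that the slot refers to. A reference dangles when the class it points to is missing from the merged ontology.

```python
from typing import Dict, List, Tuple


def find_dangling_references(
    classes: Dict[str, Dict[str, str]]
) -> List[Tuple[str, str, str]]:
    """Return (class, slot, missing target) triples, sorted for stable output."""
    dangling = []
    for class_name, slots in classes.items():
        for slot_name, target in slots.items():
            # The slot points at a class that was never copied or merged in.
            if target not in classes:
                dangling.append((class_name, slot_name, target))
    return sorted(dangling)


# Hypothetical merged ontology: Agent was merged, but Customer was not copied yet.
merged = {
    "Agent": {"agent for": "Customer", "has client": "Traveler"},
    "Traveler": {},
}
conflicts = find_dangling_references(merged)
```

    Each reported triple is a candidate for one of the proposed solutions, for example copying the missing class or removing the reference.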

  • Example: merge-classes
    [Figure labels: Agent, Agent]

  • Operation Steps: merge-classes
      - Own slots and their values for the new class
          - ask the user in case of conflicts or use preferences
      - Template slots for the new class
          - union of template slots of the original classes
      - Subclasses and superclasses for the new class
      - Conflicts
      - Suggestions
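
    The first three steps above can be sketched as a single function. This is an illustration under stated assumptions, not the tool's code: classes are plain dictionaries with the hypothetical keys own_slots, template_slots, and superclasses; the "documentation" own slot is invented for the example; and a conflict between own-slot values is resolved by a simple preference flag instead of asking the user.

```python
def merge_classes(first: dict, second: dict, name: str, prefer_first: bool = True) -> dict:
    """Merge two class descriptions into a new one and record own-slot conflicts."""
    own_slots = {}
    conflicts = []
    for slot in sorted(set(first["own_slots"]) | set(second["own_slots"])):
        in_first = slot in first["own_slots"]
        in_second = slot in second["own_slots"]
        if in_first and in_second and first["own_slots"][slot] != second["own_slots"][slot]:
            # Both sources define the slot with different values: use the preference.
            conflicts.append(slot)
            chosen = first if prefer_first else second
            own_slots[slot] = chosen["own_slots"][slot]
        elif in_first:
            own_slots[slot] = first["own_slots"][slot]
        else:
            own_slots[slot] = second["own_slots"][slot]
    return {
        "name": name,
        "own_slots": own_slots,
        # Template slots of the new class: union of the originals.
        "template_slots": set(first["template_slots"]) | set(second["template_slots"]),
        "superclasses": set(first["superclasses"]) | set(second["superclasses"]),
        "own_slot_conflicts": conflicts,
    }


agent_1 = {
    "own_slots": {"documentation": "Travel agent"},
    "template_slots": {"agent for"},
    "superclasses": {"Agency employee"},
}
agent_2 = {
    "own_slots": {"documentation": "An agent"},
    "template_slots": {"has client"},
    "superclasses": {"Employee"},
}
merged_agent = merge_classes(agent_1, agent_2, "Agent")
```

    The merged class ends up with both template slots and both superclasses, which is exactly the state in which the later checks for similar slot names and for cycles become useful.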

  • Template Slots
      - Copy template slots that don't exist in the merged ontology
    [Figure labels: Agent, Agent, Agent, agent for]

  • Template Slots
      - Attach the slots that have already been mapped
    [Figure label: has client]

  • Subclasses And Superclasses
      - If a superclass (subclass) exists, re-establish the links
    [Figure labels: Agent, Agent, Agent]

  • Dangling References
    [Figure labels: Agent, Agent, Agent, agent for]

  • Additional Suggestions: Merge Slots
      - If slot names at the merged class are similar, suggest to merge the slots
    [Figure labels: Agent, client, has client]
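
    The slide does not say how name similarity is measured. The sketch below substitutes a generic string-similarity ratio from Python's standard difflib module and an arbitrary threshold of 0.7; both the measure and the threshold are assumptions made for illustration, not PROMPT's heuristic.

```python
from difflib import SequenceMatcher
from itertools import combinations
from typing import Iterable, List, Tuple


def suggest_slot_merges(slot_names: Iterable[str], threshold: float = 0.7) -> List[Tuple[str, str]]:
    """Return pairs of slot names whose similarity ratio reaches the threshold."""
    suggestions = []
    for first, second in combinations(sorted(slot_names), 2):
        ratio = SequenceMatcher(None, first.lower(), second.lower()).ratio()
        if ratio >= threshold:
            suggestions.append((first, second))
    return suggestions


# Slots attached to the merged class Agent in the running example.
pairs = suggest_slot_merges(["agent for", "client", "has client"])
```

    With these inputs only "client" and "has client" are paired; "agent for" shares too few characters with either name to reach the threshold.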

  • Additional Suggestions: Merge Classes
      - If the set of classes referenced by the merged class is the same as the set of classes referenced by another class, suggest a merge
    [Figure labels: has clients, handles reservations]
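
    A minimal sketch of this heuristic, assuming each class is summarized by the set of class names its slots refer to. The function name and the sample classes (Travel agent, Airline, Reservation, Flight) are hypothetical.

```python
from typing import Dict, List, Set


def suggest_merges_by_references(references: Dict[str, Set[str]], merged: str) -> List[str]:
    """Return classes that refer to exactly the same set of classes as `merged`."""
    target = references[merged]
    if not target:
        # An empty reference set would match every slot-less class; skip it.
        return []
    return sorted(
        name for name, referenced in references.items()
        if name != merged and referenced == target
    )


references = {
    "Agent": {"Traveler", "Reservation"},
    "Travel agent": {"Reservation", "Traveler"},
    "Airline": {"Flight"},
    "Traveler": set(),
}
candidates = suggest_merges_by_references(references, "Agent")
```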

  • Additional Suggestions: Merge Classes
      - If names of superclasses (subclasses) of the merged class are similar, suggest to merge the classes
    [Figure labels: Employee, Agency employee, superclass, superclass]

  • Check for Cycles
      - If there is a cycle, suggest removing one of the parents
    [Figure labels: Person, Employee, Agency employee, superclass, superclass]
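
    One standard way to run this check is a depth-first search over the superclass links. The sketch below is a generic cycle finder, not PROMPT's implementation; the hierarchy is assumed to be a dictionary from class name to its direct superclasses, and the cyclic sample data is invented to exercise the function.

```python
from typing import Dict, List, Optional


def find_cycle(superclasses: Dict[str, List[str]]) -> Optional[List[str]]:
    """Return one cycle as a list of class names (first == last), or None."""
    unvisited, in_progress, done = 0, 1, 2
    state = {name: unvisited for name in superclasses}
    path: List[str] = []

    def visit(name: str) -> Optional[List[str]]:
        state[name] = in_progress
        path.append(name)
        for parent in superclasses.get(name, []):
            parent_state = state.get(parent, unvisited)
            if parent_state == in_progress:
                # The parent is already on the current path: a cycle is closed.
                return path[path.index(parent):] + [parent]
            if parent_state == unvisited:
                found = visit(parent)
                if found:
                    return found
        path.pop()
        state[name] = done
        return None

    for name in list(superclasses):
        if state[name] == unvisited:
            found = visit(name)
            if found:
                return found
    return None


# Hypothetical hierarchy left behind by a careless merge.
cyclic = {
    "Employee": ["Person"],
    "Person": ["Agency employee"],
    "Agency employee": ["Employee"],
}
acyclic = {"Agent": ["Employee"], "Employee": ["Person"], "Person": []}
cycle = find_cycle(cyclic)
```

    Every link on the returned path is a parent that could be removed to break the cycle, so the path itself can be shown to the user as the list of choices.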

  • To Summarize
      - Perform the actual operation
      - For the concepts (classes, slots, and instances) directly attached to the operation arguments
          - perform global analysis for new suggestions
          - perform global analysis for new conflicts

  • Context

  • Anchor-PROMPT: Using Non-Local Contexts
      - Input: A set of anchor pairs
      - Output: A set of related terms with similarity scores

    Where do anchors come from?
      - Lexical matching
      - Interactive tools
      - User-specified
    [Figure labels: Ontology 1, Ontology 2]

  • Generating Paths in the Graph

  • Similarity Score
      - Generate a set of all paths (of length < L)
      - Generate a set of all possible pairs of paths of equal length
      - For each pair of paths and for each pair of nodes in the identical positions in the paths, increment the similarity score
      - Combine the similarity score for all the paths
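
    The four steps above can be sketched as follows. Several details are assumptions, since the slide does not fix them: each ontology is a directed graph given as an adjacency dictionary, paths are simple paths that run from one anchor to another, length is counted in edges, every position on a path (anchors included) is scored, and scores are combined by summation. The edges in the sample graphs are invented; only the class names come from the results slide.

```python
from collections import defaultdict
from itertools import permutations
from typing import Dict, List, Sequence, Tuple

Graph = Dict[str, Sequence[str]]


def simple_paths(graph: Graph, start: str, end: str, max_len: int) -> List[List[str]]:
    """All simple paths from start to end with fewer than max_len edges."""
    paths: List[List[str]] = []

    def walk(path: List[str]) -> None:
        node = path[-1]
        if node == end and len(path) > 1:
            paths.append(path)
            return
        if len(path) >= max_len:  # one more edge would reach max_len edges
            return
        for successor in graph.get(node, ()):
            if successor not in path:
                walk(path + [successor])

    walk([start])
    return paths


def similarity_scores(
    graph_1: Graph, graph_2: Graph, anchors: List[Tuple[str, str]], max_len: int
) -> Dict[Tuple[str, str], int]:
    """Count how often two terms occupy the same position on paired paths."""
    scores: Dict[Tuple[str, str], int] = defaultdict(int)
    for (start_1, start_2), (end_1, end_2) in permutations(anchors, 2):
        for path_1 in simple_paths(graph_1, start_1, end_1, max_len):
            for path_2 in simple_paths(graph_2, start_2, end_2, max_len):
                if len(path_1) != len(path_2):
                    continue  # only paths of equal length are compared
                for term_1, term_2 in zip(path_1, path_2):
                    scores[(term_1, term_2)] += 1
    return dict(scores)


ontology_1 = {"TRIAL": ["PROTOCOL"], "PROTOCOL": ["PERSON"]}
ontology_2 = {"Trial": ["Design"], "Design": ["Person"]}
anchor_pairs = [("TRIAL", "Trial"), ("PERSON", "Person")]
scores = similarity_scores(ontology_1, ontology_2, anchor_pairs, max_len=3)
```

    In this toy case the only unanchored pair that receives a score is PROTOCOL and Design, because they sit in the same position on the single pair of equal-length paths between the anchors.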

  • Equivalence Groups

  • Anchor-PROMPT: Initial Results

    TRIAL                  Trial
    PERSON                 Person
    CROSSOVER              Crossover
    PROTOCOL               Design
    TRIAL-SUBJECT          Person
    INVESTIGATORS          Person
    POPULATION             Action_Spec
    PERSON                 Character
    TREATMENT-POPULATION   Crossover_arm

  • Knowledge Model Assumptions
      - The only assumption: an OKBC-compliant knowledge model

  • Outline
      - Definitions and motivation
      - The PROMPT ontology-merging algorithm
          - Incremental algorithm (PROMPT)
          - Statistical algorithm (Anchor-PROMPT)
      - The tools
      - Evaluation
      - Future work

  • Protégé-2000
      - An environment for
          - Ontology development
          - Knowledge acquisition
      - Intuitive direct-manipulation interface
      - Extensibility
          - Ability to plug in new components

  • Ontologies in Protégé-2000

  • Protégé-2000 plugins
      - Domain-specific user-interface plugins
      - Alternative back ends for archival storage
      - Utility programs for knowledge-acquisition tasks
      - End-user applications

  • Protégé-based PROMPT tool
      - Protégé-2000
          - has an OKBC-compatible knowledge model
          - allows building extensions through a plug-in mechanism
          - can work as a knowledge-base server for the plug-ins

  • The PROMPT tool

  • The PROMPT tool features
      - Setting a preferred ontology
      - Maintaining the user's focus
      - Providing feedback to the user
      - Preserving original relations
          - subclass-superclass relations
          - slot attachment
          - facet values
      - Linking to the direct-manipulation ontology editor
      - Logging operations

  • Outline
      - Definitions and motivation
      - The PROMPT ontology-merging algorithm
          - Incremental algorithm (PROMPT)
          - Statistical algorithm (Anchor-PROMPT)
      - The tools
      - Evaluation
      - Future work

  • Evaluation
      - Knowledge-based systems are rarely evaluated
      - We can use software-engineering approaches to empirical evaluation of tools
      - We need to develop additional knowledge-base measurements

  • Questions we asked
      - How good are PROMPT's suggestions and conflict-resolution strategies?
      - Does PROMPT provide any benefit when compared to a generic ontology-editing tool (Protégé-2000)?

  • What we were trying to find out
      - The benefit that the tool provides
          - Productivity benefit
          - Quality improvement in the resulting ontologies
          - User satisfaction
      - Precision and recall of the tool's suggestions

  • Source ontologies for the experiments
      - Two ontologies of problem-solving methods
          - the ontology for the Unified Problem-solving Method Development Language (UPML)
          - the ontology for the Method-Description Language (MDL)

  • Experiment 1: Evaluate the quality of PROMPT's suggestions
      - Metrics
          - Precision
          - Recall
      - Method
          - Automatic logging
          - Automatic data reporting
              - Suggestions that the tool produced
              - Operations that the user performed
              - Suggestions that the user followed
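
    Given the three logged quantities above, precision and recall can be computed as sketched below. The reading used here is an assumption: precision is the share of the tool's suggestions that the user followed, and recall is the share of the user's operations that the tool had suggested. The operation names are invented sample data, not results from the experiment.

```python
from typing import Set, Tuple


def precision_recall(suggested: Set[str], performed: Set[str]) -> Tuple[float, float]:
    """Compute (precision, recall) from logged suggestions and user operations."""
    followed = suggested & performed  # suggestions that the user followed
    precision = len(followed) / len(suggested) if suggested else 0.0
    recall = len(followed) / len(performed) if performed else 0.0
    return precision, recall


# Hypothetical log: four suggestions produced, five operations performed.
suggested_ops = {"merge Agent", "merge client", "copy Traveler", "merge Person"}
performed_ops = {"merge Agent", "merge client", "copy Traveler", "copy Flight", "merge Airline"}
precision, recall = precision_recall(suggested_ops, performed_ops)
```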

  • Results: the quality of PROMPT's suggestions
      - Suggestions that users followed: 90%
      - Conflict-resolution strategies that users followed: 75%
      - Knowledge-base operations generated automatically: 74%

  • Experiment 2: PROMPT versus generic Protégé-2000
      - Metrics
          - content of the resulting ontologies
          - number of explicit knowledge-base operations

  • Results: PROMPT versus generic Protégé-2000
      - The resulting ontologies had only one difference
      - Specifying operations explicitly: 16 with PROMPT, 60 with Protégé

  • Results
      - Experts followed most of PROMPT's suggestions
      - Using PROMPT has improved the efficiency of ontology merging

  • Anchor-PROMPT Evaluation
      - Experiment setup
          - Two ontologies from the DAML ontology library
          - Varying parameters
              - maximum path length
              - number of anchor pairs
      - Experiment results
          - Ratio of correct results above the median similarity score

  • Anchor-PROMPT: Evaluation Results

    Max path length   Number of anchors   Result precision
    4                 4                   67%
    4                 3                   67%
    4                 2                   61%
    3                 4                   67%
    3                 3                   61%
    3                 2                   56%
    2                 4                   100%
    2                 3                   100%
    2                 2                   100%

  • Anchor-PROMPT Evaluation Results
    Equivalence groups of size

  • Future work
      - Extend the set of heuristics that PROMPT uses for guiding the experts
      - Extend the techniques to ontology alignment and ontology refactoring
      - Develop protocols and metrics for a more detailed evaluation of the tools

  • http://protege.stanford.edu