Evaluating Ontology-Mapping Tools: Requirements and Experience


  • Evaluating Ontology-Mapping Tools: Requirements and Experience

    Natalya F. Noy, Mark A. Musen
    Stanford Medical Informatics, Stanford University

  • Types Of Ontology Tools

    There is not just ONE class of ontology tools:
    - Development tools: Protégé-2000, OntoEdit, OilEd, WebODE, Ontolingua
    - Mapping tools: PROMPT, ONION, OBSERVER, Chimaera, FCA-Merge, GLUE

  • Evaluation Parameters for Ontology-Development Tools

    - Interoperability with other tools: ability to import ontologies from
      other languages; ability to export ontologies to other languages
    - Expressiveness of the knowledge model
    - Scalability
    - Extensibility
    - Availability and capabilities of inference services
    - Usability of tools

  • Evaluation Parameters for Ontology-Mapping Tools

    We can try to reuse the evaluation parameters for development tools,
    but: development tools share similar tasks, inputs, and outputs,
    whereas mapping tools differ in their tasks, inputs, and outputs.

  • Development Tools

    Task: create an ontology. Output: a domain ontology. All development
    tools share essentially the same task, input, and output.

  • Mapping Tools: Tasks

    - Merge: C = Merge(A, B), producing a single ontology C from A and B
      (iPROMPT, Chimaera, FCA-Merge)
    - Map: Map(A, B), producing correspondences between A and B
      (Anchor-PROMPT, GLUE)
    - Articulation: an articulation ontology linking A and B (ONION)
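The three task shapes above can be sketched as function signatures. This is a toy illustration, not the tools' actual algorithms: the `Ontology` and `Mapping` types and the exact-name-overlap logic are placeholders we invented.

```python
from typing import NamedTuple

class Ontology(NamedTuple):
    """Toy stand-in for an ontology: just a name and a set of terms."""
    name: str
    terms: frozenset

class Mapping(NamedTuple):
    """A set of correspondences (term_in_A, term_in_B)."""
    pairs: frozenset

def merge(a: Ontology, b: Ontology) -> Ontology:
    # Merge(A, B) = C: one ontology combining A and B (iPROMPT, Chimaera).
    return Ontology(f"merge({a.name},{b.name})", a.terms | b.terms)

def map_ontologies(a: Ontology, b: Ontology) -> Mapping:
    # Map(A, B): correspondences between A and B; both sources stay intact
    # (Anchor-PROMPT, GLUE). Here: naive exact-name matching as a stand-in.
    return Mapping(frozenset((t, t) for t in a.terms & b.terms))

def articulate(a: Ontology, b: Ontology) -> Ontology:
    # Articulation: a third ontology that links A and B (ONION).
    return Ontology(f"articulation({a.name},{b.name})", a.terms & b.terms)
```

Each task returns a different kind of artifact, which is why the slides argue that tools from different groups cannot be compared head-to-head.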

  • Mapping Tools: Inputs

    Tool      | Inputs used for analysis
    ----------|---------------------------
    iPROMPT   | classes, slots and facets
    Chimaera  | classes, slots and facets
    GLUE      | classes, instance data
    FCA-Merge | classes, shared instances
    OBSERVER  | classes, DL definitions

  • Mapping Tools: Outputs and User Interaction

  • Can We Compare Mapping Tools?

    Yes, we can! We can compare tools in the same group.
    How do we define a group?

  • Architectural Comparison Criteria

    - Input requirements: which ontology elements are used for analysis;
      which are required for analysis
    - Modeling paradigm: frame-based; description logic
    - Level of user interaction: batch mode; interactive
    - User feedback: is it required? is it used?

  • Architectural Criteria (cont'd)

    - Type of output: set of rules; ontology of mappings; list of
      suggestions; set of pairs of related terms
    - Content of output: matching classes; matching instances;
      matching slots

  • From a Large Pool to Small Groups

    Architectural criteria partition the space of mapping tools into small
    groups; a performance criterion then compares tools within a single
    group.

  • Resources Required for Comparison Experiments

    - Source ontologies: pairs of ontologies covering similar domains;
      ontologies of different size, complexity, and level of overlap
    - Gold-standard results: human-generated correspondences between
      terms; pairs of terms, rules, explicit mappings

  • Resources Required (cont'd)

    - Metrics for comparing performance: precision (how many of the
      tool's suggestions are correct); recall (how many of the correct
      matches the tool found); distance between ontologies
    - Use of inference techniques: analysis of taxonomic relationships
      (à la OntoClean)
    - Experiment controls: design; protocol
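If both the tool's suggestions and the gold standard are represented as sets of term pairs, precision and recall reduce to set intersections. A minimal sketch, with made-up example pairs:

```python
def precision_recall(suggested, gold):
    """suggested, gold: sets of (term_from_A, term_from_B) pairs."""
    correct = suggested & gold
    precision = len(correct) / len(suggested) if suggested else 0.0
    recall = len(correct) / len(gold) if gold else 0.0
    return precision, recall

# Hypothetical run: the tool makes 3 suggestions, 2 of them correct;
# the gold standard contains 3 matches, 2 of which the tool found.
suggested = {("Professor", "Faculty"), ("Paper", "Article"),
             ("Dog", "Course")}
gold = {("Professor", "Faculty"), ("Paper", "Article"),
        ("Student", "GradStudent")}

p, r = precision_recall(suggested, gold)   # both 2/3 here
```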

  • Where Will the Resources Come From?

    Ideally, from researchers who do not belong to any of the evaluated
    projects. Realistically, as a by-product of stand-alone evaluation
    experiments.

  • Evaluation Experiment: iPROMPT

    iPROMPT is a plug-in to Protégé-2000 and an interactive
    ontology-merging tool.
    - iPROMPT uses for analysis: the class hierarchy; slots and facet
      values
    - iPROMPT matches: classes; slots; instances

  • Evaluation Experiment

    Four users merged the same two source ontologies. We measured:
    - Acceptability of iPROMPT's suggestions
    - Differences in the resulting ontologies

  • Sources

    Input: two ontologies from the DAML ontology library.
    - CMU ontology: employees of an academic organization; publications;
      relationships among research groups
    - UMD ontology: individuals; CS departments; activities

  • Experimental Design

    - Users' expertise: familiar with Protégé-2000; not familiar with
      PROMPT
    - Experiment materials: the iPROMPT software; a detailed tutorial;
      a tutorial example; evaluation files
    - Users performed the experiment on their own, with no questions or
      interaction with the developers.

  • Experiment Results

    Quality of iPROMPT suggestions: recall 96.9%; precision 88.6%.
    Resulting ontologies: the difference measure is the fraction of frames
    that have a different name and type; the ontologies differ by ~30%.
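One plausible reading of that difference measure (an assumption on our part, since the slide does not spell it out) treats each resulting ontology as a set of (frame name, frame type) pairs and takes the fraction of frames that appear in one result but not the other. The frame names below are hypothetical:

```python
def frame_difference(frames_a, frames_b):
    """frames_a, frames_b: sets of (name, type) pairs,
    e.g. ("Person", "class")."""
    differing = frames_a ^ frames_b      # frames in exactly one result
    return len(differing) / len(frames_a | frames_b)

# Hypothetical merged ontologies produced by two different users:
result_1 = {("Person", "class"), ("name", "slot"), ("Employee", "class")}
result_2 = {("Person", "class"), ("name", "slot"), ("Worker", "class")}

d = frame_difference(result_1, result_2)   # 2 of 4 frames differ -> 0.5
```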

  • Limitations of the Experiment

    - Only 4 participants
    - Variability in Protégé expertise
    - Recall and precision figures are not very meaningful without a
      comparison to other tools
    - Need better distance metrics

  • Research Questions

    - Which pragmatic criteria are most helpful in finding the best tool
      for a task?
    - How do we develop a gold-standard merged ontology? Does such an
      ontology exist?
    - How do we define a good distance metric to compare results to the
      gold standard?
    - Can we reuse tools and metrics developed for evaluating ontologies
      themselves?

