Aligning the Parasite Experiment Ontology and the Ontology for Biomedical Investigations Using AgreementMaker. Valerie Cross, Cosmin Stroe, Xueheng Hu, Pramit Silwal, Maryam Panahiazar, Isabel F. Cruz, Priti Parikh, Amit Sheth. July 29, 2011, ICBO @ Buffalo, NY.
1
Aligning the Parasite Experiment Ontology
and the Ontology for Biomedical Investigations
Using AgreementMaker
Valerie Cross, Cosmin Stroe, Xueheng Hu, Pramit Silwal,
Maryam Panahiazar, Isabel F. Cruz, Priti Parikh, Amit Sheth
July 29, 2011
ICBO @ Buffalo, NY
2
Outline
Task: Align PEO and OBI Ontologies
OAEI Investigation
AgreementMaker Overview
Enhancements to AgreementMaker
Experimental Results
Conclusions and Future Work
Parasite Experiment Ontology (PEO)
http://wiki.knoesis.org/index.php/Parasite_Experiment_ontology
Models provenance metadata associated with experiment protocols used in parasite research.
Extends the upper-level Provenir ontology (http://knoesis.wright.edu/provenir/provenir.owl).
PEO (v 1.0) includes Proteome, Microarray, Gene Knockout, and Strain Creation experiment terms, along with other terms that are used in pathway.
110 classes and 27 properties; uses concepts in the Parasite Life Cycle ontology.
3
Snapshot of PEO
Ontology for Biomedical Investigations (OBI)
http://purl.obolibrary.org/obo/obi
Describes biological and clinical investigations.
Includes a set of 'universal' terms applicable across various biological and technological domains, and domain-specific terms relevant only to a given domain.
Supports the consistent annotation of biomedical investigations, regardless of the particular field of study.
Represents the design of an investigation, the protocols and instrumentation used, the material used, the data generated, and the type of analysis performed on it.
Is being built under the Basic Formal Ontology (BFO).
4
Ontology Alignment Evaluation Initiative (OAEI)
http://oaei.ontologymatching.org
Annual international competition to evaluate ontology alignment techniques, with multiple tracks:
  Benchmark tests
  Biomedical track (Mouse and NCI Human Anatomies)
  Conference track (15 ontologies)
A "side effect" of the competition is the published ontology sets, each consisting of two ontologies and the correct mappings as determined by experts.
Results measured by:
  Recall, precision, and F-measure (combines recall and precision)
  Runtime
  Other
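The three quality measures can be computed from a produced alignment and an expert reference alignment. The sketch below is illustrative only: it assumes the balanced F-measure (harmonic mean of precision and recall), and the function name and toy mappings are hypothetical, not taken from any OAEI track.

```python
def evaluate_alignment(found, reference):
    """Return (precision, recall, f_measure) for a set of mappings.

    Each mapping is a (source_entity, target_entity) pair; `reference`
    is the expert-determined set of correct mappings.
    """
    found, reference = set(found), set(reference)
    correct = found & reference
    precision = len(correct) / len(found) if found else 0.0
    recall = len(correct) / len(reference) if reference else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # Balanced F-measure: harmonic mean of precision and recall (an assumption).
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


# Hypothetical toy data, not taken from any OAEI track.
reference = {("a1", "b1"), ("a2", "b2"), ("a3", "b3"), ("a4", "b4")}
found = {("a1", "b1"), ("a2", "b2"), ("a5", "b9")}

precision, recall, f_measure = evaluate_alignment(found, reference)
print(f"precision={precision:.3f} recall={recall:.3f} f={f_measure:.3f}")
```

Here two of the three found mappings are correct and two of the four reference mappings are recovered, so precision is higher than recall.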
5
OAEI 2010
http://oaei.ontologymatching.org/2010/results/anatomy/index.html#corrections
8
OAEI Anatomy Track
#1 The matcher has to be applied with its standard settings.
#2 An alignment has to be generated that favors precision over recall.
#3 An alignment has to be generated that favors recall over precision.
#4 A partial reference alignment has to be used as additional input.
10
AgreementMaker - OA System
Univ. of Illinois Chicago, ADVIS Lab, Dr. Isabel F. Cruz and Cosmin Stroe
11
Motivation
Automatic methods are required to match large ontologies
Several features of the ontologies have to be considered
Users need to trust the mappings and to be directly involved in the loop
System's capabilities:
  Wide range of matching methods
  Capability to smartly combine multiple strategies
  Multi-purpose user interface to allow evaluation and manual interaction with the matchings
  Extensible architecture to allow reuse and composition of the matching modules
Architecture of a Matcher
12
Existing Matchers
First layer (conceptual)
  BSM (Basic Similarity Matcher)
  PSM (Parametric String-Based Matcher)
  ASM (Advanced Similarity Matcher)
  VMM (Vector-based Multi-term Matcher)
Second layer (structural)
  DSI (Descendant Similarity Inheritance)
  SSC (Sibling Similarity Contribution)
Third layer (aggregation)
  LWC (Linear Weighted Combination)
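The third-layer idea of a linear weighted combination can be sketched as a weighted average of the similarity values produced by the lower-layer matchers. This is a minimal illustration, not AgreementMaker's implementation: the slides do not say how the weights are chosen, so the weights, the threshold, the function names, and the similarity values below are all hypothetical.

```python
def linear_weighted_combination(matrices, weights):
    """Combine similarity matrices cell by cell using normalized weights.

    `matrices` maps a matcher name to a dict {(source, target): similarity};
    `weights` maps the same matcher names to non-negative weights.
    """
    total = sum(weights[name] for name in matrices)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    combined = {}
    for name, matrix in matrices.items():
        share = weights[name] / total
        for pair, similarity in matrix.items():
            combined[pair] = combined.get(pair, 0.0) + share * similarity
    return combined


def select_mappings(combined, threshold):
    """Keep the pairs whose combined similarity reaches the threshold."""
    return {pair for pair, score in combined.items() if score >= threshold}


# Hypothetical similarity values and hand-picked weights for two matchers.
matrices = {
    "BSM": {("s1", "t1"): 0.9, ("s2", "t2"): 0.2},
    "VMM": {("s1", "t1"): 0.7, ("s2", "t2"): 0.4},
}
weights = {"BSM": 3.0, "VMM": 1.0}

combined = linear_weighted_combination(matrices, weights)
mappings = select_mappings(combined, threshold=0.6)
print(combined, mappings)
```

Normalizing the weights keeps the combined score in the same range as the input similarities, so a single threshold can still be applied afterwards.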
14
LWC
17
Lexicon Extensions to Matchers
AgreementMaker version 0.22 extended these string-based matchers by integrating two lexicons (2010 OAEI):
  the Ontology Lexicon, built from synonym and definition annotations existing in the ontologies themselves, and
  the WordNet Lexicon, created by starting with the Ontology Lexicon and adding any non-duplicated synonyms/definitions found in WordNet.
Result: BSMlex, PSMlex, and VMMlex.
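The two-step lexicon construction can be sketched as follows. This is an assumption-laden illustration: a plain dictionary named `thesaurus` stands in for WordNet, only synonyms (not definitions) are handled, and the function names and class data are hypothetical.

```python
def build_ontology_lexicon(classes):
    """Map each class id to the synonyms annotated in the ontology itself."""
    return {cid: {s.lower() for s in info.get("synonyms", [])}
            for cid, info in classes.items()}


def extend_with_thesaurus(ontology_lexicon, labels, thesaurus):
    """Start from the ontology lexicon and add non-duplicated thesaurus synonyms."""
    extended = {cid: set(syns) for cid, syns in ontology_lexicon.items()}
    for cid, label in labels.items():
        for synonym in thesaurus.get(label.lower(), []):
            # Sets discard synonyms that are already present.
            extended.setdefault(cid, set()).add(synonym.lower())
    return extended


# Hypothetical classes and a hand-written stand-in for WordNet.
classes = {
    "C1": {"label": "organism", "synonyms": ["Living thing"]},
    "C2": {"label": "assay", "synonyms": []},
}
labels = {cid: info["label"] for cid, info in classes.items()}
thesaurus = {"organism": ["being", "living thing"], "assay": ["check", "test"]}

ontology_lexicon = build_ontology_lexicon(classes)
extended_lexicon = extend_with_thesaurus(ontology_lexicon, labels, thesaurus)
print(ontology_lexicon)
print(extended_lexicon)
```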
18
Initial Experiments
AgreementMaker (ver. 0.22) with the OAEI 2010 anatomy configuration resulted in only two mappings.
Found inconsistencies in the entity descriptions of PEO and OBI:
  Identifiers: PEO URIs use a textual fragment identifier (e.g., http://knoesis.wright.edu/ParasiteExperiment.owl#transfection), while OBI's entities use numerical identifiers (e.g., http://purl.obolibrary.org/obo/OBI_0600060).
  Labels: PEO's use of the rdfs:label field (on 19.1% of classes) does not follow the specification guidelines, since it contains a PLO identifier. OBI uses the rdfs:label field to hold a descriptive string on almost 100% of its classes.
  Comments: PEO uses the comment field on 99% of its classes to provide a definition. OBI uses the comment field on only about 4% of its classes.
Some common annotations exist between PEO and OBI, but for each of them either PEO or OBI has low coverage:
  OBI has high coverage for label annotations.
  PEO has high coverage for comment annotations.
This heterogeneity, combined with matchers that compare the same annotations to each other (i.e., class ID with class ID, label with label, etc.), resulted in almost no alignment.
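Coverage figures like the ones above can be obtained by counting, per annotation, how many classes carry a non-empty value. The sketch below shows the calculation on hypothetical toy classes; it does not read PEO or OBI, and the function name and data layout are assumptions.

```python
def annotation_coverage(classes, annotation):
    """Return the fraction of classes with a non-empty `annotation` value."""
    if not classes:
        return 0.0
    annotated = sum(1 for values in classes.values() if values.get(annotation))
    return annotated / len(classes)


# Hypothetical classes: one dict of annotation values per class.
source_classes = {
    "s1": {"comment": "first definition"},
    "s2": {"comment": "second definition", "label": "an identifier-like label"},
    "s3": {"comment": "third definition"},
    "s4": {},
}
target_classes = {
    "t1": {"label": "first name"},
    "t2": {"label": "second name"},
    "t3": {"label": "third name", "comment": "a note"},
    "t4": {"label": "fourth name"},
}

for name in ("label", "comment"):
    print(name,
          annotation_coverage(source_classes, name),
          annotation_coverage(target_classes, name))
```

In this toy data the source is rich in comments and the target is rich in labels, which mirrors the mismatch described on the slide.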
19
Annotation Profiling
Allows the user to select and combine different annotations of the source or target ontology to be used in the alignment process.
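The effect of annotation profiling can be illustrated with a small sketch in which each ontology gets its own list of annotations to compare. Everything here is hypothetical: the function names, the class data, and the token-overlap similarity, which is only a stand-in for AgreementMaker's string matchers.

```python
def profile_text(annotations, profile):
    """Collect the values of the selected annotations for one class."""
    return [annotations[name] for name in profile if annotations.get(name)]


def token_similarity(a, b):
    """Jaccard overlap of lowercase word tokens (a stand-in string matcher)."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / len(ta | tb) if ta | tb else 0.0


def profiled_similarity(source, target, source_profile, target_profile):
    """Best similarity over every pairing of the selected annotation values."""
    pairs = [(s, t)
             for s in profile_text(source, source_profile)
             for t in profile_text(target, target_profile)]
    return max((token_similarity(s, t) for s, t in pairs), default=0.0)


# Hypothetical classes: the source has a comment, the target has a label.
source = {"id": "sample_collection", "comment": "collection of a sample"}
target = {"id": "X_0000123", "label": "sample collection"}

same_ids = profiled_similarity(source, target, ["id"], ["id"])
same_labels = profiled_similarity(source, target, ["label"], ["label"])
cross = profiled_similarity(source, target, ["comment"], ["label"])
print(same_ids, same_labels, cross)
```

Comparing ID with ID or label with label finds nothing in this toy case, while pairing the source comment with the target label yields a non-zero similarity.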
21
Provenance Information Added
22
Customization of Lexicon Matchers
The lexicon builders for BSMlex, PSMlex, and VMMlex use fixed names for the synonym and definition annotations (hasSynonym and hasDefinition).
The lexicon builder was modified to exploit the synonym annotations in PEO and OBI by having the user choose the annotation names used to create the lexicons.
  OBI does not use hasSynonym but uses the IAO annotation properties IAO_0000111 ("editor preferred term") and IAO_0000118 ("alternative term"), which serve the same function as synonyms for OBI.
  PEO does not use synonyms but uses the comment annotation for a definition in most cases.
Result: BSMlex+, PSMlex+, and VMMlex+.
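A lexicon builder with user-selectable annotation names might look like the sketch below. The function, its parameters, and the class data are hypothetical; only the annotation names hasSynonym, hasDefinition, IAO_0000111, IAO_0000118, and the comment annotation come from the slide.

```python
DEFAULT_SYNONYMS = ("hasSynonym",)
DEFAULT_DEFINITIONS = ("hasDefinition",)


def build_lexicon(classes,
                  synonym_names=DEFAULT_SYNONYMS,
                  definition_names=DEFAULT_DEFINITIONS):
    """Build a lexicon from whichever annotation names the user selects."""
    lexicon = {}
    for cid, annotations in classes.items():
        synonyms = [value for name in synonym_names
                    for value in annotations.get(name, [])]
        definitions = [value for name in definition_names
                       for value in annotations.get(name, [])]
        lexicon[cid] = {"synonyms": synonyms, "definitions": definitions}
    return lexicon


# Hypothetical classes annotated in the OBI style and in the PEO style.
obi_like = {"T1": {"IAO_0000111": ["preferred name"],
                   "IAO_0000118": ["other name"]}}
peo_like = {"S1": {"comment": ["a definition sentence"]}}

fixed = build_lexicon(obi_like)  # fixed names find nothing
custom_obi = build_lexicon(obi_like,
                           synonym_names=("IAO_0000111", "IAO_0000118"))
custom_peo = build_lexicon(peo_like, definition_names=("comment",))
print(fixed, custom_obi, custom_peo)
```

With the fixed names the lexicon stays empty for both styles; choosing the annotation names per ontology is what fills it.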
23
BioPortal Mappings
http://bioportal.bioontology.org/mappings
http://wiki.knoesis.org/index.php/Parasite_Experiment_ontology
25
Experimental Results
27
Overlapping of Matchers
28
Conclusions and Future Work
Experimental results in the biomedical domain demonstrate the problem of heterogeneous annotations of ontologies.
Validated the past approach of extending matching algorithms using lexicons, showing that the best results are produced by matchers that use lexicons (BSMlex+).
  Investigate including more lexicons, such as UMLS, to achieve better results.
Heterogeneity was managed by increasing the flexibility of state-of-the-art matching algorithms, i.e., annotation profiling, mapping provenance information, and custom lexicons, which
  supports a domain expert in this process, and
  relies on the user to select relevant annotations to be used in the matching process.
More work needs to be done, specifically to automatically identify semantically compatible annotations by applying established ontology evaluation metrics.
A wide variety of semantic similarity measures has already been added to AgreementMaker for future use in semantic matching, not just lexical matching of concepts between ontologies.
29
THANK YOU!
QUESTIONS?
30