Aligning the Parasite Experiment Ontology and the Ontology for Biomedical Investigations Using AgreementMaker

Valerie Cross, Cosmin Stroe, Xueheng Hu, Pramit Silwal, Maryam Panahiazar, Isabel F. Cruz, Priti Parikh, Amit Sheth

[email protected]

July 29, 2011, ICBO @ Buffalo NY



Outline

Task: Align PEO and OBI Ontologies

OAEI Investigation

AgreementMaker Overview

Enhancements to AgreementMaker

Experimental Results

Conclusions and Future Work

Parasite Experiment Ontology (PEO)

http://wiki.knoesis.org/index.php/Parasite_Experiment_ontology

Models provenance metadata associated with experiment protocols used in parasite research.

Extends the upper-level Provenir ontology (http://knoesis.wright.edu/provenir/provenir.owl).

PEO (v 1.0) includes Proteome, Microarray, Gene Knockout, and Strain Creation experiment terms, along with other terms that are used in pathway.

110 classes and 27 properties; uses concepts from the Parasite Life Cycle ontology.

Snapshot of PEO

Ontology for Biomedical Investigations (OBI)

http://purl.obolibrary.org/obo/obi

Describes biological and clinical investigations.

Includes a set of 'universal' terms applicable across various biological and technological domains, and domain-specific terms relevant only to a given domain.

Supports the consistent annotation of biomedical investigations, regardless of the particular field of study.

Represents the design of an investigation, the protocols and instrumentation used, the material used, the data generated, and the type of analysis performed on it.

Is being built under the Basic Formal Ontology (BFO).

Ontology Alignment Evaluation Initiative (OAEI)

http://oaei.ontologymatching.org

Annual international competition to evaluate ontology alignment techniques, with multiple tracks:

Benchmark tests

Biomedical track (Mouse and NCI Human Anatomies)

Conference track (15 ontologies)

A “side effect” of the competition is a collection of published ontology sets, each consisting of two ontologies and the correct mappings as determined by experts.

Results are measured by:

Recall, precision, and F-measure (combines recall and precision)

Runtime

Other
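As a rough illustration of how recall, precision, and F-measure score an alignment against an expert reference, the sketch below treats each alignment as a set of (source, target) pairs. It assumes the balanced F-measure (harmonic mean of precision and recall); the function name and the sample mappings are invented for this example and are not taken from OAEI tooling.

```python
def evaluate_alignment(found, reference):
    """Return (precision, recall, f_measure) for a set of found mappings.

    Assumption: F-measure is the balanced harmonic mean of precision and recall.
    """
    found, reference = set(found), set(reference)
    correct = len(found & reference)  # mappings that the experts also list
    precision = correct / len(found) if found else 0.0
    recall = correct / len(reference) if reference else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


# Hypothetical mappings: 4 reference pairs, 3 found pairs, 2 of them correct.
reference = {("a1", "b1"), ("a2", "b2"), ("a3", "b3"), ("a4", "b4")}
found = {("a1", "b1"), ("a2", "b2"), ("a5", "b9")}
precision, recall, f_measure = evaluate_alignment(found, reference)
print(f"precision={precision:.3f} recall={recall:.3f} f={f_measure:.3f}")
```

Here two of the three found mappings are correct (precision 2/3) and two of the four reference mappings are recovered (recall 1/2), so a matcher tuned for precision and one tuned for recall can be compared on a single combined number.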

OAEI 2010

http://oaei.ontologymatching.org/2010/results/anatomy/index.html#corrections

OAEI Anatomy Track

#1 The matcher has to be applied with its standard settings.

#2 An alignment has to be generated that favors precision over recall.

#3 An alignment has to be generated that favors recall over precision.

#4 A partial reference alignment has to be used as additional input.

AgreementMaker - OA System

Univ. of Illinois Chicago, ADVIS Lab, Dr. Isabel F. Cruz and Cosmin Stroe

Motivation

Automatic methods are required to match large ontologies

Several features of the ontologies have to be considered

Users need to trust the mappings and to be directly involved in the loop

System’s capabilities

Wide range of matching methods

Capability to smartly combine multiple strategies

Multi-purpose user interface to allow evaluation and manual interaction with the matchings

Extensible architecture to allow reuse and composition of the matching modules

Architecture of a Matcher

Existing Matchers

First layer (conceptual):

BSM (Basic Similarity Matcher)

PSM (Parametric String-Based Matcher)

ASM (Advanced Similarity Matcher)

VMM (Vector-based Multi-term Matcher)

Second layer (structural):

DSI (Descendent Similarity Inheritance)

SSC (Sibling Similarity Contribution)

Third layer (aggregation):

LWC (Linear Weighted Combination)

LWC
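The idea of a linear weighted combination can be sketched as a weighted average of the similarity matrices produced by the lower-layer matchers. This is a minimal illustration only: the function name, the hand-picked weights, and the tiny 2x2 matrices are assumptions, and the slides do not say how AgreementMaker itself chooses the weights.

```python
def linear_weighted_combination(matrices, weights):
    """Combine similarity matrices (lists of rows) using normalized weights."""
    if not matrices or len(matrices) != len(weights):
        raise ValueError("need at least one matrix and one weight per matrix")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    rows, cols = len(matrices[0]), len(matrices[0][0])
    combined = [[0.0] * cols for _ in range(rows)]
    for matrix, weight in zip(matrices, weights):
        for i in range(rows):
            for j in range(cols):
                # Each matcher contributes in proportion to its weight.
                combined[i][j] += (weight / total) * matrix[i][j]
    return combined


# Hypothetical similarity scores: rows are source classes, columns are target classes.
bsm = [[1.0, 0.0], [0.2, 0.6]]
psm = [[0.8, 0.1], [0.0, 0.9]]
vmm = [[0.6, 0.2], [0.1, 0.3]]
combined = linear_weighted_combination([bsm, psm, vmm], [2.0, 1.0, 1.0])
print(combined)
```

Normalizing the weights keeps every combined score in the same 0 to 1 range as the inputs, so a single selection threshold can still be applied afterwards.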

Lexicon Extensions to Matchers

AgreementMaker version 0.22 extended these string-based matchers by integrating two lexicons (2010 OAEI):

the Ontology Lexicon, built from synonym and definition annotations existing in the ontologies themselves, and

the WordNet Lexicon, created by starting with the Ontology Lexicon and adding any non-duplicated synonyms/definitions found in WordNet.

Result: BSMlex, PSMlex, and VMMlex.

Initial Experiments

AgreementMaker (ver. 0.22) with the OAEI 2010 anatomy configuration resulted in only two mappings.

Found inconsistencies in the entity descriptions of PEO and OBI:

Identifiers: PEO URIs use a textual fragment identifier (e.g., http://knoesis.wright.edu/ParasiteExperiment.owl#transfection), while OBI's entities use numerical identifiers (e.g., http://purl.obolibrary.org/obo/OBI_0600060).

Labels: PEO's use of the rdfs:label field (on 19.1% of classes) does not follow the specification guidelines, since it contains a PLO identifier. OBI uses the rdfs:label field to contain a descriptive string on almost 100% of its classes.

Comments: PEO uses the comment field on 99% of its classes to provide a definition. OBI uses the comment field on only about 4% of its classes.

Some common annotations exist between PEO and OBI, but in each case either PEO or OBI has low coverage:

OBI has high coverage for label annotations.

PEO has high coverage for comment annotations.

This heterogeneity, combined with matchers that compare the same annotations to each other (i.e., class ID with class ID, label with label, etc.), resulted in almost no alignment.

Annotation Profiling

Allows the user to select and combine different annotations of the source or target ontology to be used in the alignment process.
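One way to picture annotation profiling is as a per-ontology list of annotation names whose values are pooled before comparison, so that a PEO identifier can be compared with an OBI label. The sketch below is an assumption-laden illustration, not AgreementMaker code: entities are plain dictionaries, similarity is a simple token overlap (Jaccard), and the label and comment strings are invented rather than copied from either ontology.

```python
import re


def tokens(text):
    """Lowercase alphanumeric tokens of a string."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a, b):
    """Token-overlap similarity between two strings (0.0 to 1.0)."""
    ta, tb = tokens(a), tokens(b)
    return len(ta & tb) / len(ta | tb) if ta | tb else 0.0


def profile_strings(entity, annotation_names):
    """Pool the values of every annotation selected in the profile."""
    strings = []
    for name in annotation_names:
        strings.extend(entity.get(name, []))
    return strings


def profile_similarity(source, target, source_profile, target_profile):
    """Best similarity over all pairs of strings selected by the two profiles."""
    pairs = [(s, t)
             for s in profile_strings(source, source_profile)
             for t in profile_strings(target, target_profile)]
    return max((jaccard(s, t) for s, t in pairs), default=0.0)


# Hypothetical entities; annotation values are illustrative only.
peo_class = {"local_name": ["transfection"],
             "comment": ["introduction of nucleic acid into a cell"]}
obi_class = {"local_name": ["OBI_0600060"],
             "label": ["transfection"]}

# Same annotation on both sides (ID with ID): nothing in common.
same_field = profile_similarity(peo_class, obi_class, ["local_name"], ["local_name"])
# User-selected profile: PEO name and comment compared against the OBI label.
cross_field = profile_similarity(peo_class, obi_class, ["local_name", "comment"], ["label"])
print(same_field, cross_field)
```

With identical annotation names on both sides the score is zero, while the cross-annotation profile finds the match, which mirrors the heterogeneity problem described for PEO and OBI.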

Provenance Information Added

Customization of Lexicon Matchers

The lexicon builders for BSMlex, PSMlex, and VMMlex use fixed names for the synonym and definition annotations (hasSynonym and hasDefinition).

The lexicon builder was modified to exploit the synonym annotations in PEO and OBI by having the user choose the annotation names used to create the lexicons.

OBI does not use hasSynonym but uses the IAO annotation properties IAO_0000111 (“editor preferred term”) and IAO_0000118 (“alternative term”), which serve the same function as synonyms for OBI.

PEO does not use synonyms but uses the comment annotation for a definition in most cases.

Result: BSMlex+, PSMlex+, and VMMlex+.
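The customization described above can be sketched as a lexicon builder that takes the synonym and definition annotation names as parameters instead of hard-coding hasSynonym and hasDefinition. Everything here is a hypothetical illustration: the function name, the dictionary layout, and the annotation values are assumptions and do not reproduce AgreementMaker's implementation.

```python
def build_lexicon(entities, synonym_props, definition_props):
    """Collect synonyms and definitions per entity from user-chosen annotations."""
    lexicon = {}
    for entity_id, annotations in entities.items():
        entry = {"synonyms": [], "definitions": []}
        for prop in synonym_props:
            for value in annotations.get(prop, []):
                if value not in entry["synonyms"]:  # skip duplicated synonyms
                    entry["synonyms"].append(value)
        for prop in definition_props:
            for value in annotations.get(prop, []):
                if value not in entry["definitions"]:
                    entry["definitions"].append(value)
        lexicon[entity_id] = entry
    return lexicon


# Hypothetical annotation values, for illustration only.
obi_entities = {
    "OBI_0600060": {
        "label": ["transfection"],
        "IAO_0000111": ["transfection"],       # editor preferred term
        "IAO_0000118": ["cell transfection"],  # alternative term
    }
}
peo_entities = {
    "transfection": {"comment": ["introduction of nucleic acid into a cell"]}
}

# Fixed names find nothing in OBI; user-chosen names fill the lexicon.
fixed = build_lexicon(obi_entities, ["hasSynonym"], ["hasDefinition"])
custom_obi = build_lexicon(obi_entities, ["IAO_0000111", "IAO_0000118"], [])
custom_peo = build_lexicon(peo_entities, [], ["comment"])
print(fixed["OBI_0600060"], custom_obi["OBI_0600060"], custom_peo["transfection"])
```

Passing the annotation names in as arguments is what lets one builder serve both ontologies, at the cost of relying on the user to know which annotations play the synonym and definition roles.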

BioPortal Mappings

http://bioportal.bioontology.org/mappings

http://wiki.knoesis.org/index.php/Parasite_Experiment_ontology

Experimental Results

Overlapping of Matchers

Conclusions and Future Work

Experimental results in the biomedical domain demonstrate the problem of heterogeneous annotations of ontologies.

Validated the past approach of extending matching algorithms using lexicons: the best results were produced by matchers that use lexicons (BSMlex+).

Investigate including more lexicons, such as UMLS, to achieve better results.

Heterogeneity was managed by increasing the flexibility of state-of-the-art matching algorithms, i.e., annotation profiling, mapping provenance information, and custom lexicons. This supports a domain expert in the process but relies on the user to select relevant annotations to be used in the matching process.

More work needs to be done, specifically to automatically identify semantically compatible annotations by applying established ontology evaluation metrics.

A wide variety of semantic similarity measures have already been added to AgreementMaker for future use in semantic matching, not just lexical matching of concepts between ontologies.

THANK YOU!

QUESTIONS?
