Mapping phenotype ontologies for obesity and diabetes
Mapping Phenotype Ontologies
Chris Mungall
Monarch Initiative, http://monarchinitiative.org
Lawrence Berkeley National Laboratory
PhenoBridges Workshop 2013
Outline
• Problem: multiple ontologies of relevance to the obesity/diabetes domain
– By species
– By category
– How can we bring these together?
• Bridging ontologies using OWL axioms
– Enables cross-domain semantic queries
• Integrated ontology-data views in Monarch
• Challenges
– Modeling strategies
– Tools
Ontologies for phenotype and disease
Tools:
• OWLSim
• BOQA
• PhenoDigm / MouseFinder
• Phenomizer
• Phenomenet
We want to bridge species
Washington, N. L., Haendel, M. A., Mungall, C. J., Ashburner, M., Westerfield, M., & Lewis, S. E. (2009). Linking Human Diseases to Animal Models Using Ontology-Based Phenotype Annotation. PLoS Biol, 7(11). doi:10.1371/journal.pbio.1000247
Bridging across species requires bridging across ontologies
MP: MP:0009165 abnormal endocrine pancreas morphology
HP: HP:0012093 Abnormality of endocrine pancreas physiology
??
Mammalian Phenotype Ontology
Smith, C. L., Goldsmith, C.-A. W., & Eppig, J. T. (2005). The Mammalian Phenotype Ontology as a tool for annotating, analyzing and comparing phenotypic information. Genome Biol, 6(1). doi:10.1186/gb-2004-6-1-r7
Used to annotate and query, in mice:
• Genotypes
• Alleles
• Genes
Human Phenotype Ontology
Robinson, P. N., Köhler, S., Bauer, S., Seelow, D., Horn, D., & Mundlos, S. (2008). The Human Phenotype Ontology: a tool for annotating and analyzing human hereditary disease. American Journal of Human Genetics, 83(5), 610-615. doi:10.1016/j.ajhg.2008.09.017
Used to annotate, in human:
• Patients
• Disorders
• Genotypes
• Genes
• Sequence variants
Issue: bridging across categories/perspectives
MP:0005217 abnormal pancreatic beta-cell morphology
GO:0003309 pancreatic B cell differentiation
?
MP: abnormal phenotypes
GO: ‘normal’ molecular, cellular and physiological processes
Phenotypes require more than “phenotype ontologies”
• Gene/protein function data (GO): glucose metabolism (GO:0006006)
• Metabolomics, toxicogenomics data (CHEBI): glucose (CHEBI:17234), pyruvate (CHEBI:15361)
• Disease & phenotype data (DISEASE): type II diabetes mellitus (DOID:9352)
• Transcriptomic data (CL): pancreatic beta cell (CL:0000169)
Bridging via lexical methods
• Approach
– Create pairwise mappings between ontologies
– Use lexical methods and/or curation
• Advantages:
– Large body of tools and willing text miners
• Disadvantages:
– Semantics-free
– Machine doesn’t understand text
– Inexact matches
MP, HP
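The lexical approach above can be sketched in a few lines of Python. This is a minimal illustration, not any of the tools named in this talk: the stopword list, the Jaccard score, and the 0.4 threshold are assumptions chosen for the example; only the term IDs and labels come from the slides.

```python
import re

# Assumed stopword list: words that carry no anatomical or functional meaning.
STOPWORDS = {"of", "the", "abnormal", "abnormality"}

def tokens(label):
    """Lowercase a label and return its content words as a set."""
    words = re.findall(r"[a-z0-9]+", label.lower())
    return frozenset(w for w in words if w not in STOPWORDS)

def jaccard(a, b):
    """Token-set overlap between two labels (0.0 to 1.0)."""
    ta, tb = tokens(a), tokens(b)
    if not ta and not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)

def best_matches(source, target, threshold=0.4):
    """Return (source_id, target_id, score) for every pair at or above threshold."""
    hits = []
    for sid, slabel in source.items():
        for tid, tlabel in target.items():
            score = jaccard(slabel, tlabel)
            if score >= threshold:
                hits.append((sid, tid, score))
    return hits

mp = {
    "MP:0009165": "abnormal endocrine pancreas morphology",
    "MP:0005217": "abnormal pancreatic beta-cell morphology",
}
hp = {"HP:0012093": "Abnormality of endocrine pancreas physiology"}

print(best_matches(mp, hp))
```

The only pair reported is MP:0009165 with HP:0012093, even though one term is about morphology and the other about physiology. That is the "inexact matches" problem: the score reflects shared words, not shared meaning.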
Enhance lexical approach with OWL bridging axioms
• Key idea:
– Describe the phenotype in a machine-interpretable way
  • Break it down into digestible chunks!
  • Logical definition
– The machine will then be able to help you
  • Match phenotypes
  • Automate ontology checking and addition of new terms
• Approach:
– Use Web Ontology Language (OWL), a description logic, to describe phenotypes
– Use OWL reasoning to find connections
Mungall, C. J., Gkoutos, G., Washington, N., & Lewis, S. (2007). Representing Phenotypes in OWL. In C. Golbreich, A. Kalyanpur, & B. Parsia (Eds.), Proceedings of the OWLED 2007 Workshop on OWL: Experience and Directions. Innsbruck, Austria. http://www.webont.org/owled/2007/PapersPDF/paper_40.pdf
MP
Uberon (anatomy)
CL (cell types)
PATO (qualities)
Class: ‘abnormal pancreatic beta cell mass’
EquivalentTo: ‘abnormal phenotype’ and has_entity some ‘type B pancreatic cell’ and has_quality some mass
MP, HP
‘abnormal phenotype’ and has_entity some ‘type B pancreatic cell’ and has_quality some amount
‘abnormal phenotype’ and has_entity some ‘type B pancreatic cell’ and has_quality some ‘reduced amount’
Mungall, C. J., Gkoutos, G., Smith, C., Haendel, M., Lewis, S., & Ashburner, M. (2010). Integrating phenotype ontologies across multiple species. Genome Biology, 11(1), R2. doi:10.1186/gb-2010-11-1-r2
Köhler, S., Doelken, S. C., Ruef, B. J., Bauer, S., Washington, N., Westerfield, M., Gkoutos, G., et al. (2013). Construction and accessibility of a cross-species phenotype ontology along with gene annotations for biomedical research. F1000Research, 1–12. doi:10.3410/f1000research.2-30.v1
7181 / 9022 MP terms are described
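To make the idea of matching phenotypes through logical definitions concrete, here is a toy Python sketch. It is not an OWL reasoner and not how Elk or OWLSim work internally: each definition is reduced to an (entity, quality) pair, and the small is_a table for qualities is an assumed stand-in for PATO, written by hand for this example.

```python
from dataclasses import dataclass

# Assumed toy quality hierarchy (child -> parent); a stand-in for PATO.
IS_A = {
    "reduced amount": "amount",
    "amount": "quality",
    "mass": "quality",
}

def ancestors(term):
    """Collect all is_a ancestors of a term in the toy hierarchy."""
    found = set()
    while term in IS_A:
        term = IS_A[term]
        found.add(term)
    return found

def is_subclass(child, parent):
    """True if child equals parent or has parent as an ancestor."""
    return child == parent or parent in ancestors(child)

@dataclass(frozen=True)
class EQ:
    """'abnormal phenotype' and has_entity some <entity> and has_quality some <quality>."""
    entity: str
    quality: str

def subsumed_by(a, b):
    """a is at least as specific as b on both the entity and the quality."""
    return is_subclass(a.entity, b.entity) and is_subclass(a.quality, b.quality)

beta_cell = "type B pancreatic cell"
mass_expr = EQ(beta_cell, "mass")
amount_expr = EQ(beta_cell, "amount")
reduced_expr = EQ(beta_cell, "reduced amount")

print(subsumed_by(reduced_expr, amount_expr))  # the more specific expression
print(subsumed_by(mass_expr, amount_expr))     # sibling qualities
```

Under these assumptions the "reduced amount" expression is classified under the "amount" expression, while the "mass" expression is not. A real reasoner does the same kind of classification over the full ontologies, including the entity hierarchy.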
Phenotypes to metabolites
glucose homeostasis (GO:0042593) ≡ homeostasis (GO:0042592) ⊓ ∃has_participant.glucose (CHEBI:17234)
http://wiki.geneontology.org/index.php/Ontology_extensions
abnormal glucose homeostasis (MP:0002078)
Linking cell types to proteins via GO
insulin secretion (GO:0030073) ≡ secretion (GO:0046903) ⊓ ∃has_output.insulin (PR:000009054)
pancreatic beta cell (CL:0000169) ⊑ ∃capable_of.insulin secretion (GO:0030073)
INS_HUMAN - P01308
Meehan, T., Masci, A. M., Abdulla, A., Cowell, L., Blake, J., Mungall, C. J., & Diehl, A. (2011). Logical Development of the Cell Ontology. BMC Bioinformatics, 12(1), 6. doi:10.1186/1471-2105-12-6
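The two preceding slides link a phenotype to a metabolite and a cell type to a protein by chaining axioms across ontologies. The Python sketch below flattens those axioms into simple labeled edges and follows them with a breadth-first search. The flat edge list is a deliberate simplification of the OWL axioms, and the `has_entity` link from the MP term to the GO process is an assumption made for illustration; the other relations are taken from the slides.

```python
from collections import deque

# (subject, relation, object) edges; a simplified encoding of the bridging axioms.
EDGES = [
    # Assumed link from the phenotype to the process it describes.
    ("abnormal glucose homeostasis", "has_entity", "glucose homeostasis"),
    ("glucose homeostasis", "has_participant", "glucose"),
    ("pancreatic beta cell", "capable_of", "insulin secretion"),
    ("insulin secretion", "has_output", "insulin"),
]

def find_path(start, goal):
    """Breadth-first search; returns a list of (relation, node) steps or None."""
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        node, path = queue.popleft()
        if node == goal:
            return path
        for subj, rel, obj in EDGES:
            if subj == node and obj not in seen:
                seen.add(obj)
                queue.append((obj, path + [(rel, obj)]))
    return None

print(find_path("pancreatic beta cell", "insulin"))
print(find_path("abnormal glucose homeostasis", "glucose"))
```

Each returned path names the relations crossed on the way, which is the kind of cross-domain query (cell type to protein, phenotype to chemical) that the bridging axioms are meant to enable.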
Uberon bridges single species anatomy ontologies
Mungall, C. J., Torniai, C., Gkoutos, G. V., Lewis, S. E., & Haendel, M. A. (2012). Uberon, an integrative multi-species anatomy ontology. Genome Biology, 13(1), R5. doi:10.1186/gb-2012-13-1-r5
Lexical methods
• Obol: grammar approach
• Entity matching
Curation
• Edit bridge files
• Edit source ontologies
OWL Reasoning
• Elk
• GULO
• Jenkins
Iterative development and deployment
Köhler, S., Bauer, S., Mungall, C. J., Carletti, G., Smith, C. L., Schofield, P., Gkoutos, G. V., et al. (2011). Improving ontologies by automatic reasoning and evaluation of logical definitions. BMC Bioinformatics, 12(1), 418. doi:10.1186/1471-2105-12-418
Mungall, C. J., Dietze, H., Carbon, S. J., Ireland, A., Bauer, S., & Lewis, S. (2012). Continuous Integration of Open Biological Ontology Libraries. Bio-Ontologies 2012. http://bio-ontologies.knowledgeblog.org/405
Integrated views in Monarch
http://monarchinitiative.org
Linking model systems to human diseases
• Gene/protein function data (GO): glucose metabolism (GO:0006006)
• Metabolomics, toxicogenomics data (CHEBI): glucose (CHEBI:17234), pyruvate (CHEBI:15361)
• Disease & phenotype data (DISEASE/PHENOTYPE): type II diabetes mellitus (DOID:9352)
• Transcriptomic data (CL): pancreatic beta cell (CL:0000169)
Roadblocks and pitfalls
• Lack of tool support for ontology development
• Many tools for ‘mapping after the fact’
– Mapping should not be retrospective
– Must be integrated into ontology development lifecycle
• OWL Modeling pitfalls
– Over-modeling
– Under-modeling
Overcoming ontology development bottlenecks with TermGenie
• Developed for GO
– Instant compositional terms for curators
– OWL axioms are added at time of term creation
• We are rolling out pheno-ontology instances
– Trial run on FYPO and plant traits
http://termgenie.org
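The idea of adding the OWL axiom at the moment a term is created can be illustrated with a small template function. This is a hypothetical sketch, not TermGenie's code or API: the `abnormal <entity> <quality>` template, the `Term` class, and the `compose_term` helper are all invented for this example, and the output is just a Manchester-syntax-style string.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Term:
    label: str
    equivalent_to: str  # logical definition, Manchester-syntax style

def quote(label):
    """Quote multi-word labels so they read as a single class name."""
    return f"'{label}'" if " " in label else label

def compose_term(entity, quality):
    """Hypothetical template: build the label and its logical definition together."""
    label = f"abnormal {entity} {quality}"
    definition = (
        f"'abnormal phenotype' and has_entity some {quote(entity)} "
        f"and has_quality some {quote(quality)}"
    )
    return Term(label=label, equivalent_to=definition)

term = compose_term("pancreatic beta cell", "mass")
print(term.label)
print(term.equivalent_to)
```

Because the label and the definition come from the same template inputs, a new term can never enter the ontology without its bridging axiom, which is the point of moving mapping upstream into term creation.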
Modeling confusion and analysis paralysis
• absent pancreatic beta cells (MP:0009174)
– Tempting to use OWL cardinality
  • Does this represent the biology?
• decreased pancreatic beta cell number (MP:0003339)
– Can’t do this with OWL cardinality!
• Lesson: don’t over-model in OWL
Modeling temporal progression
• How did there come to be absence of beta cells in the pancreas?
• What are the downstream effects?
• Changes with age
– Hyperglycemic → hypoglycemic
• Existing phenotype ontologies steer clear of causality
– Next frontier
What I haven’t talked about
• Quantitative phenotypes
• Assay vs phenotype
• Behavioral phenotypes
• Environments
• Mining disease phenotypes from the literature
• Clinical vocabularies (see Nathalie’s talk)
• Modeling other model systems
• The data!
• Making use of the data and OWL axioms for analysis (see Damian’s talk)
• …a lot more
Questions/Summary
• Approaches to mapping
– OWL bridging axioms
• Roadblocks and pitfalls
– OWL modeling analysis paralysis
– Lack of tool support
– Need to push upstream in ontology engineering lifecycle
– Modeling complex phenomena
  • From observation to temporal progression and models of causality
• Tools
– CrossSpeciesPheno
– Available:
  • GULO, TermGenie, OBO-Edit, Protégé 4, OWL Reasoners, Onto-Jenkins
– Required: integration upstream
Acknowledgments
• Charité
– Sebastian Köhler
– Sandra Doelken
– Sebastian Bauer
– Peter Robinson
• U of Oregon
– Barbara Ruef
– Monte Westerfield
• OHSU
– Carlo Torniai
– Nicole Vasilevsky
– Shahim Essaid
– Matt Brush
– Melissa Haendel
• Sanger
– Anika Oellrich
– Damian Smedley
• University of Cambridge
– George Gkoutos
– Rob Hoehndorf
– Paul Schofield
• Lawrence Berkeley
– Nicole Washington
– Suzanna Lewis
• UCSD
– Amarnath Gupta
– Jeff Grethe
– Anita Bandrowski
– Maryann Martone
• U of Pitt
– Chuck Borromeo
– Harry Hochheiser
• JAX
– Terry Meehan
– Cynthia Smith