ga4gh phenotype ontologies task team update

Melissa Haendel @ontowonka @monarchinit Phenotype Ontologies (and a bit of G2P) task team

Upload: mhaendel

Post on 23-Jan-2018


TRANSCRIPT

Page 1: GA4GH Phenotype Ontologies Task team update

Melissa Haendel @ontowonka @monarchinit

Phenotype Ontologies (and a bit of G2P) task team

Page 2: GA4GH Phenotype Ontologies Task team update

Genes Environment Phenotypes

Determinants of Health are Diverse

Physical environment Chemical exposures Treatments Smoking, alcohol

Education Health services Income Social status Stress Employment Working conditions Microbiome Pathogens

Clinical observations Laboratory tests Patient reported outcomes Child development Biometrics Behaviors

Sleep Exercise Screen time Diet

Genomic endowment Epigenetics Gene expression Gene regulation

Ontologies can help make (some of) this computable

Page 3: GA4GH Phenotype Ontologies Task team update

Genes + Environment = Phenotypes

But it's not just the types of things

Page 4: GA4GH Phenotype Ontologies Task team update

…the relationships and their evidence must also be captured

G-P or D (disease)
• causes
• contributes to
• is risk factor for
• protects against
• correlates with
• is marker for
• modulates
• involved in
• increases susceptibility to

G-G (kind of)
• regulates
• negatively regulates (inhibits)
• positively regulates (activates)
• directly regulates
• interacts with
• co-localizes with
• co-expressed with

P/D - P/D
• part of
• results in
• co-occurs with
• correlates with
• hallmark of (P->D)

E-P
• contributes to (E->P)
• influences (E->P)
• exacerbates (E->P)
• manifest in (P->E)

G-E (kind of)
• expressed in
• expressed during
• contains
• inactivated by
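A minimal sketch of how typed relationships and their evidence might be held in one record and checked against the predicate lists above. The `Association` class, the one-letter category codes, the lookup table, and the placeholder identifiers `gene:X` and `disease:Y` are all assumptions made for illustration; only the predicate names and the evidence code ECO:0000033 come from the slides, and only part of the relationship lists is encoded.

```python
from dataclasses import dataclass

# Predicates copied from the slide; the table layout itself is hypothetical.
GENE_TO_PHENOTYPE_OR_DISEASE = frozenset({
    "causes", "contributes to", "is risk factor for", "protects against",
    "correlates with", "is marker for", "modulates", "involved in",
    "increases susceptibility to",
})

# Keys are (subject category, object category):
# G = gene, P = phenotype, D = disease, E = environment.
ALLOWED_PREDICATES = {
    ("G", "P"): GENE_TO_PHENOTYPE_OR_DISEASE,
    ("G", "D"): GENE_TO_PHENOTYPE_OR_DISEASE,
    ("E", "P"): frozenset({"contributes to", "influences", "exacerbates"}),
    ("P", "E"): frozenset({"manifest in"}),
    ("P", "D"): frozenset({"part of", "results in", "co-occurs with",
                           "correlates with", "hallmark of"}),
}


@dataclass(frozen=True)
class Association:
    """One typed relationship plus the evidence that supports it."""
    subject: str
    subject_category: str
    predicate: str
    target: str
    target_category: str
    evidence: tuple = ()

    def is_valid(self) -> bool:
        # A predicate is valid only for the category pair it is listed under.
        key = (self.subject_category, self.target_category)
        return self.predicate in ALLOWED_PREDICATES.get(key, frozenset())

    def has_evidence(self) -> bool:
        return len(self.evidence) > 0


# Placeholder identifiers, not real database entries.
example = Association("gene:X", "G", "causes", "disease:Y", "D",
                      evidence=("ECO:0000033",))
print(example.is_valid(), example.has_evidence())
```

Keeping the direction in the lookup key is one way to respect the arrows on the slide, for example "manifest in" being listed only for P->E.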

Page 5: GA4GH Phenotype Ontologies Task team update

The Human Phenotype Ontology

11,813 phenotype terms

127,125 rare disease-phenotype annotations

136,268 common disease-phenotype annotations

bit.ly/hpo-paper

Peter Robinson, Sebastian Koehler, Chris Mungall

Page 6: GA4GH Phenotype Ontologies Task team update

Other clinical vocabularies don’t adequately cover phenotypic descriptions

Winnenburg and Bodenreider, 2014

[Figure: bar chart; y-axis "Percent coverage" (0 to 100); x-axis labels: HPO, UMLS, SNOMED CT, CHV, MedDRA, MeSH, NCIT, ICD10, OMIM]

=> HPO is now in the UMLS

Page 7: GA4GH Phenotype Ontologies Task team update

Precision fuzzy phenotype matching

DOI: 10.1126/scitranslmed.3009262
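One common way to make phenotype matching "fuzzy" is to compare profiles through their shared ancestors in the ontology rather than by exact term identity. The sketch below uses Jaccard similarity over ancestor closures on a tiny made-up hierarchy; the term names, the `PARENTS` table, and the choice of Jaccard are assumptions for illustration and are not taken from the cited paper, which may use a different measure.

```python
# Toy is-a hierarchy (child -> parents). Hypothetical terms, not real HPO classes.
PARENTS = {
    "phenotypic abnormality": [],
    "ear abnormality": ["phenotypic abnormality"],
    "enlarged ears": ["ear abnormality"],
    "pointy ears": ["ear abnormality"],
    "skin abnormality": ["phenotypic abnormality"],
    "blue skin": ["skin abnormality"],
}


def ancestors(term):
    """Return the term together with all of its ancestors."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(PARENTS[current])
    return seen


def closure(profile):
    """Union of the ancestor sets of every term in a profile."""
    result = set()
    for term in profile:
        result |= ancestors(term)
    return result


def jaccard(profile_a, profile_b):
    """Similarity in [0, 1]: shared ancestors over all ancestors."""
    a, b = closure(profile_a), closure(profile_b)
    return len(a & b) / len(a | b)


print(jaccard(["enlarged ears"], ["pointy ears"]))  # share "ear abnormality"
print(jaccard(["enlarged ears"], ["blue skin"]))    # share only the root
```

Two profiles with no identical terms still score above zero when their terms sit close together in the hierarchy, which is the behavior exact string matching cannot provide.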

Page 8: GA4GH Phenotype Ontologies Task team update

How much phenotyping is enough?

Enlarged ears (2)
Dark hair (6)
Female (4)
Male (4)
Blue skin (1)
Pointy ears (1)
Hair absent on head (1)
Horns present (1)
Hair present on head (7)
Enlarged lip (2)
Increased skin pigmentation (3)

bit.ly/annotationsufficiency
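The counts on this slide suggest a simple way to reason about "enough": rare features carry more information than common ones. The sketch below scores each feature by its information content, IC = -log2(count / N), taking N = 8 on the assumption that Female (4) + Male (4) covers everyone pictured. The function names are hypothetical, and the linked annotation-sufficiency work may use a different metric.

```python
import math

# Feature counts as shown on the slide.
COUNTS = {
    "Enlarged ears": 2,
    "Dark hair": 6,
    "Female": 4,
    "Male": 4,
    "Blue skin": 1,
    "Pointy ears": 1,
    "Hair absent on head": 1,
    "Horns present": 1,
    "Hair present on head": 7,
    "Enlarged lip": 2,
    "Increased skin pigmentation": 3,
}

# Assumption: the population is the 4 females plus the 4 males.
POPULATION = 8


def information_content(feature):
    """Bits of information gained by observing this feature."""
    return -math.log2(COUNTS[feature] / POPULATION)


def profile_information(features):
    """Naive total for a profile; treats features as independent."""
    return sum(information_content(f) for f in features)


for name in sorted(COUNTS, key=information_content, reverse=True):
    print(f"{name}: {information_content(name):.2f} bits")
```

Under these assumptions, "Blue skin" identifies one individual out of eight (3 bits), while "Hair present on head" narrows the field very little, so a profile built only from common features is unlikely to be sufficient.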

Page 9: GA4GH Phenotype Ontologies Task team update

Matchmaker Exchange for patients, diseases, and model organisms

Computational matching of rare disease patients across clinical & public sources

Find models and experts for functional validation

bit.ly/mme-matchbox
patientarchive.org
bit.ly/exomiser-2017

Page 10: GA4GH Phenotype Ontologies Task team update

Plain language synonyms for computable phenotypes

Page 11: GA4GH Phenotype Ontologies Task team update

Layperson-HPO driven phenotyping tool

https://www.pcori.org/research-results/2017/realization-standard-care-rare-diseases-using-patient-engaged-phenotyping

Catherine Brownstein, Ingrid Holm

Page 12: GA4GH Phenotype Ontologies Task team update

NCI Thesaurus is the de facto cancer vocabulary standard

Required for drug trials by FDA, but not interoperable with other vocabulary standards

Page 13: GA4GH Phenotype Ontologies Task team update

[Figure: ontologies and resources shown: Sequence Ontology, Uberon Anatomy Ontology, Genotype Ontology, MONDO Disease Ontology, Human Phenotype Ontology, NCBI Gene, Reactome, NCBITaxon, Protein Ontology, ChEBI chemical entities ontology, UNII chemical substance registry, Cell Ontology, Ontology of Biomedical Investigations, Gene Ontology (GO-BP), Gene Ontology (GO-CC), UniProt]

Tailoring the NCIt for computational interoperability

https://github.com/NCI-Thesaurus

ICD-O and Oncotree slims available too: https://github.com/NCI-Thesaurus/thesaurus-obo-edition/wiki/Downloads

Page 14: GA4GH Phenotype Ontologies Task team update

[Figure: the ontology and resource diagram from Page 13, repeated]

Lobular Breast Carcinoma = 'Breast Adenocarcinoma'
  and (Disease_Has_Normal_Tissue_Origin some 'Terminal Ductal Lobular Unit')
  and (Disease_Has_Normal_Cell_Origin some 'Terminal Ductal Lobular Unit Cell')
  and (Disease_Has_Abnormal_Cell some 'Lobular Carcinoma Cell')
  and (Disease_May_Have_Cytogenetic_Abnormality some 'Loss of Chromosome 16q')
  and (Disease_Excludes_Abnormal_Cell some 'Ductal Carcinoma Cell')
  and (Disease_Excludes_Finding some 'Mixed Cellular Population')
  and (Disease_Mapped_To_Gene some 'CDH1 Gene')
  and (Disease_May_Have_Molecular_Abnormality some 'Loss of E-cadherin Expression')
  and (Disease_May_Have_Molecular_Abnormality some 'CDH1 Gene Inactivation')

Tailoring the NCIt for computational interoperability

Jim Balhoff, Sherri de Coronado, Gilberto Fragoso, Nicole Vasilevsky, Paula Carrio Caro, Matt Brush, Chris Mungall

Page 15: GA4GH Phenotype Ontologies Task team update

Variant Pathogenicity Interpretations

Pathogenic?

Benign?

DSC2:c.631-2A>G

Right Ventricular Cardiomyopathy

Complications to variant interpretation:

Pathogenicity evidence is complex, diverse, indirect, conflicting

Siloed curation guidelines

High stakes (Applied directly to care)

Page 16: GA4GH Phenotype Ontologies Task team update

Improving Rigor and Consistency of Variant Interpretation

2015 ACMG-AMP Variant Interpretation Guidelines
• 28 ‘criteria’ re: evidence types, strength
• Framework for combining criteria outcomes

ClinGen Variant Curation Interface (VCI) and DMWG
• Data model and curation for variant evidence and provenance

SEPIO: Scientific Evidence and Provenance Information Ontology
• Computable model for representation and analysis of evidence and provenance

Merged Disease Classification
• Harmonized disease classification for algorithmic use and pathogenicity assignment

Matt Brush, Selina Dwight, Larry Babb, Chris Bizon, Bradford Powell, Tristan Nelson, Bob Freimuth, Chris Mungall

Page 17: GA4GH Phenotype Ontologies Task team update

Evidence-Based Computational Evaluation of Claims

[Figure: evidence items :e1 to :e5 (co-localization evidence, functional complementation evidence, microscopy evidence, imaging evidence, co-immunoprecipitation evidence) shown with :claim1 “pathogenic” and :claim2 “benign”]

Algorithms can leverage semantics of SEPIO models to compute quantitative metrics of evidence quality, quantity, diversity, and concordance, supporting automated evaluation of claims.

https://github.com/monarch-initiative/SEPIO-ontology/wiki
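As a rough illustration of the kind of metric the slide describes, the sketch below counts evidence quantity, diversity, and concordance for two competing claims. The assignment of :e1 to :e5 to claims and to evidence types is invented for the example (the figure does not survive in this transcript), the dictionary layout and function names are hypothetical, and nothing here uses the actual SEPIO model or any SEPIO tooling.

```python
# Hypothetical evidence records; which item supports which claim is assumed.
EVIDENCE = {
    ":e1": {"type": "co-localization evidence", "supports": ":claim1"},
    ":e2": {"type": "functional complementation evidence", "supports": ":claim1"},
    ":e3": {"type": "co-localization evidence", "supports": ":claim1"},
    ":e4": {"type": "imaging evidence", "supports": ":claim2"},
    ":e5": {"type": "co-immunoprecipitation evidence", "supports": ":claim2"},
}


def evidence_for(claim):
    """All evidence records that support the given claim."""
    return [e for e in EVIDENCE.values() if e["supports"] == claim]


def quantity(claim):
    """How many evidence items back the claim."""
    return len(evidence_for(claim))


def diversity(claim):
    """How many distinct evidence types back the claim."""
    return len({e["type"] for e in evidence_for(claim)})


def concordance(claim):
    """Share of all evidence items that agree with the claim."""
    return quantity(claim) / len(EVIDENCE)


for claim in (":claim1", ":claim2"):
    print(claim, quantity(claim), diversity(claim), round(concordance(claim), 2))
```

Separating quantity from diversity matters: three items of the same evidence type are weaker support than three independent lines of evidence, and a typed model is what lets an algorithm tell the difference.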

Page 18: GA4GH Phenotype Ontologies Task team update

Disease 1 Disease 2

Data Standards Ontologies Data Standards Ontologies Data Standards Ontologies

Genes Environment Phenotypes

How do all these ontologies fit into our notion of disease?

FHIR

Page 19: GA4GH Phenotype Ontologies Task team update

Disease 1 Disease 2

Data Standards Ontologies Data Standards Ontologies Data Standards Ontologies

Genes Environment Phenotypes

FHIR

METADATA, EVIDENCE

Page 20: GA4GH Phenotype Ontologies Task team update

Defining disease and clinical pathogenicity:

A lumping and splitting problem

source IDs

split/merge

manage resolution & provenance

MONDO Unified Disease Ontology

SEPIO Scientific Evidence and Provenance Information

One disease or two? What does the evidence favor?

One disease or two? How do we manage identifiers, hierarchy?

Page 21: GA4GH Phenotype Ontologies Task team update

OMIM (brown)

MESH (grey)

ORDO/Orphanet (yellow)

SubClassOf (solid line)

Xref (dashed grey line)

Hemolytic anemia mappings across resources

Each nosology is different, and they map to each other inconsistently, leading to poor interoperability and computability
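To see why inconsistent cross-references are a problem, consider the naive approach of treating every xref as an equivalence and merging identifiers transitively. The sketch below does this with a small union-find; the identifiers are placeholders rather than real OMIM, MESH, or Orphanet entries, and this is not a description of how MONDO is actually built.

```python
def merge_by_xref(xrefs):
    """Group identifiers that are connected by any chain of xrefs."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            # Keep the lexicographically smaller root for stable output.
            parent[max(root_a, root_b)] = min(root_a, root_b)

    for a, b in xrefs:
        union(a, b)

    groups = {}
    for identifier in list(parent):
        groups.setdefault(find(identifier), set()).add(identifier)
    return sorted(sorted(group) for group in groups.values())


# Placeholder identifiers for illustration only.
XREFS = [
    ("OMIM:A", "ORDO:1"),
    ("ORDO:1", "MESH:X"),
    ("OMIM:B", "MESH:Y"),
]
print(merge_by_xref(XREFS))
```

A single loose xref (say, one that really means "related to" or "broader than") would collapse two distinct diseases into one group, which is why a merged nosology needs managed resolution and provenance rather than blind transitive closure.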

Page 22: GA4GH Phenotype Ontologies Task team update

New integrated nosology

http://bit.ly/Monarch-Disease

http://purl.obolibrary.org/obo/mondo/pre/mondo.owl

Page 23: GA4GH Phenotype Ontologies Task team update

Genes Environment Phenotypes

VCF PXF GFF BED

Standard exchange formats exist for genes … but for phenotypes? Environment?

http://phenopackets.org

New Funding: Forums for Phenomics!

Page 24: GA4GH Phenotype Ontologies Task team update

What does a phenopacket look like?

Alacrima

Sleep Apnea

Microcephaly

phenotype_profile:
  - entity: "patient16"
    phenotype:
      types:
        - id: "HP:0000522"
          label: "Alacrima"
      onset:
        description: "at birth"
        types:
          - id: "HP:0003577"
            label: "Congenital onset"
    evidence:
      - types:
          - id: "ECO:0000033"
            label: "Traceable Author Statement"
        source:
          - id: "PMID:"

Clinical labs

Public databases

Journals
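For readers who want to handle a record like this programmatically, here is a minimal sketch that holds the same structure as a plain Python dictionary and pulls out the phenotype identifiers. The helper names are hypothetical, the nesting is an assumption based on the slide's layout, and no phenopackets library is used; the source identifier is left incomplete exactly as on the slide.

```python
import json

# The slide's example as nested Python data.
phenopacket = {
    "phenotype_profile": [
        {
            "entity": "patient16",
            "phenotype": {
                "types": [{"id": "HP:0000522", "label": "Alacrima"}],
                "onset": {
                    "description": "at birth",
                    "types": [{"id": "HP:0003577", "label": "Congenital onset"}],
                },
            },
            "evidence": [
                {
                    "types": [{"id": "ECO:0000033",
                               "label": "Traceable Author Statement"}],
                    # Identifier is incomplete on the slide; kept as-is.
                    "source": [{"id": "PMID:"}],
                }
            ],
        }
    ]
}


def phenotype_ids(packet):
    """Collect every phenotype term id in the profile."""
    ids = []
    for association in packet.get("phenotype_profile", []):
        for term in association.get("phenotype", {}).get("types", []):
            ids.append(term["id"])
    return ids


def entities_without_evidence(packet):
    """Entities whose phenotype assertion carries no evidence block."""
    return [a.get("entity") for a in packet.get("phenotype_profile", [])
            if not a.get("evidence")]


print(phenotype_ids(phenopacket))
print(json.dumps(phenopacket, indent=2)[:60])
```

Because the record is ordinary nested data, the same structure can be serialized to JSON or YAML for exchange between clinical labs, public databases, and journals.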

Page 25: GA4GH Phenotype Ontologies Task team update

Layperson HPO + Phenopackets

Dry eyes

Stops breathing during sleep

Small head

phenotype_profile:
  - entity: "Grace"
    phenotype:
      types:
        - id: "HP:0000522"
          label: "Alacrima"
      onset:
        description: "at birth"
        types:
          - id: "HP:0003577"
            label: "Congenital onset"
    evidence:
      - types:
          - id: "ECO:0000033"
            label: "Traceable Author Statement"
        source:
          - id: "https://twitter.com/examplepatient/status/123456789"

• Patient registries

• Social media
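The layperson layer can be pictured as a synonym lookup in front of the clinical terms. The sketch below pairs the three plain-language phrases on this slide with the three clinical labels on the previous one, assuming they correspond in the order listed; the table, the normalization rule, and the function names are hypothetical and do not reflect how the HPO stores its layperson synonyms.

```python
# Assumed pairing of plain-language phrases with clinical labels,
# based on the order in which the two slides list them.
LAY_TO_CLINICAL = {
    "dry eyes": "Alacrima",
    "stops breathing during sleep": "Sleep Apnea",
    "small head": "Microcephaly",
}


def normalize(text):
    """Lowercase and collapse whitespace so lookups are forgiving."""
    return " ".join(text.lower().split())


def translate(lay_terms):
    """Split input into matched clinical labels and unmatched phrases."""
    matched, unmatched = [], []
    for term in lay_terms:
        label = LAY_TO_CLINICAL.get(normalize(term))
        if label is None:
            unmatched.append(term)
        else:
            matched.append(label)
    return matched, unmatched


print(translate(["Dry eyes", "  small   HEAD ", "itchy nose"]))
```

Returning the unmatched phrases separately keeps patient-reported text from being silently dropped when no synonym exists yet.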

Page 26: GA4GH Phenotype Ontologies Task team update

What’s next? Challenges for this workstream

• Figure out how ontologies, metadata, eHealth and exchange standards all fit together in this workstream

• Further harmonize existing disease and phenotype ontologies and standards

• Define exchange of structured phenotype data in different contexts: clinical, basic research, patients, databases, journals

• Get structured G2P data (that is, data about the biology of the patient) into and out of the EHR

• Demonstrate standardization success across the driver projects

Discuss!