GA4GH Phenotype Ontologies Task Team Update

Post on 23-Jan-2018


Melissa Haendel, @ontowonka, @monarchinit

Phenotype Ontologies (and a bit of G2P) task team

Genes Environment Phenotypes

Determinants of Health are Diverse

Physical environment Chemical exposures Treatments Smoking, alcohol

Education Health services Income Social status Stress Employment Working conditions Microbiome Pathogens

Clinical observations Laboratory tests Patient reported outcomes Child development Biometrics Behaviors

Sleep Exercise Screen time Diet

Genomic endowment Epigenetics Gene expression Gene regulation

Ontologies can help make (some of) this computable

Genes + Environment = Phenotypes

But it's not just the types of things … the relationships and their evidence must also be captured

G-P or D (disease)
• causes
• contributes to
• is risk factor for
• protects against
• correlates with
• is marker for
• modulates
• involved in
• increases susceptibility to

G-G (kind of)
• regulates
• negatively regulates (inhibits)
• positively regulates (activates)
• directly regulates
• interacts with
• co-localizes with
• co-expressed with

P/D - P/D
• part of
• results in
• co-occurs with
• correlates with
• hallmark of (P->D)

E-P
• contributes to (E->P)
• influences (E->P)
• exacerbates (E->P)
• manifest in (P->E)

G-E (kind of)
• expressed in
• expressed during
• contains
• inactivated by
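One way to make "relationships and their evidence" concrete is to store each statement as a typed association that carries its evidence along with it. The following is only a minimal sketch: the `Association` class, the `by_relation` helper, and every identifier such as `GENE:1` or `SOURCE:a` are hypothetical placeholders, and only a handful of the relationship types listed above are included.

```python
from dataclasses import dataclass
from enum import Enum


class Relation(Enum):
    # A few of the relationship types named on the slide
    CAUSES = "causes"
    IS_RISK_FACTOR_FOR = "is risk factor for"
    PROTECTS_AGAINST = "protects against"
    CORRELATES_WITH = "correlates with"
    EXACERBATES = "exacerbates"


@dataclass(frozen=True)
class Association:
    subject: str          # hypothetical identifier, e.g. a gene or an exposure
    relation: Relation    # the typed relationship
    object: str           # hypothetical phenotype or disease identifier
    evidence: tuple = ()  # hypothetical identifiers of supporting sources


def by_relation(associations, relation):
    """Return the associations that use the given relation."""
    return [a for a in associations if a.relation is relation]


associations = [
    Association("GENE:1", Relation.CAUSES, "DISEASE:1", ("SOURCE:a",)),
    Association("EXPOSURE:1", Relation.EXACERBATES, "PHENOTYPE:1",
                ("SOURCE:b", "SOURCE:c")),
    Association("GENE:2", Relation.PROTECTS_AGAINST, "DISEASE:1"),
]

causal = by_relation(associations, Relation.CAUSES)
# Statements with no recorded evidence are easy to flag once evidence is explicit
unsupported = [a for a in associations if not a.evidence]
print(len(causal), len(unsupported))
```

Keeping the relation as an enumerated type rather than free text is what lets software distinguish "causes" from "correlates with" when filtering or reasoning.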

The Human Phenotype Ontology

11,813 phenotype terms

127,125 rare disease-phenotype annotations

136,268 common disease-phenotype annotations

bit.ly/hpo-paper

Peter Robinson, Sebastian Koehler, Chris Mungall

Other clinical vocabularies don’t adequately cover phenotypic descriptions

Winnenburg and Bodenreider, 2014

[Figure: bar chart; y-axis "Percent coverage" (0 to 100); x-axis labels: HPO, UMLS, SNOMED CT, CHV, MedDRA, MeSH, NCIT, ICD10, OMIM]

=> HPO is now in the UMLS

Precision fuzzy phenotype matching

DOI: 10.1126/scitranslmed.3009262
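To illustrate what "fuzzy" matching means here, the sketch below compares two phenotype profiles by the ontology ancestors they share, so that two different but related terms still contribute to the score. This is a simple Jaccard similarity over ancestor sets, chosen for brevity; it is not the method of the cited paper. The toy hierarchy and its labels are hypothetical and are not HPO terms.

```python
# Toy is-a hierarchy (child -> parents); labels are hypothetical, not HPO terms
PARENTS = {
    "abnormal head": [],
    "abnormal skin": [],
    "abnormal ear": ["abnormal head"],
    "enlarged ear": ["abnormal ear"],
    "pointy ear": ["abnormal ear"],
    "blue skin": ["abnormal skin"],
}


def ancestors(term):
    """Return the term together with all of its ancestors."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(PARENTS.get(current, []))
    return seen


def profile_closure(profile):
    """Union of the ancestor sets of every term in a profile."""
    closure = set()
    for term in profile:
        closure |= ancestors(term)
    return closure


def jaccard(profile_a, profile_b):
    """Shared ancestors divided by all ancestors of either profile."""
    a, b = profile_closure(profile_a), profile_closure(profile_b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


patient = ["enlarged ear", "blue skin"]
candidate = ["pointy ear", "blue skin"]
print(round(jaccard(patient, candidate), 3))
```

An exact comparison of the two term lists would count only "blue skin" as shared; the ancestor-based score also credits the common "abnormal ear" parent.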

How much phenotyping is enough?

Enlarged ears (2), Dark hair (6), Female (4), Male (4), Blue skin (1), Pointy ears (1),

Hair absent on head (1), Horns present (1), Hair present on head (7), Enlarged lip (2), Increased skin pigmentation (3)
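The counts on this slide suggest why some phenotypes say more than others: a feature seen in one individual discriminates better than one seen in nearly all of them. A rough way to express that is information content, IC = -log2(frequency). The sketch below is an illustration only, not the annotation sufficiency metric itself, and it assumes a cohort of 8 individuals (inferred from Female 4 + Male 4, which the slide does not state).

```python
import math

# Counts as shown on the slide
COUNTS = {
    "Enlarged ears": 2,
    "Dark hair": 6,
    "Female": 4,
    "Male": 4,
    "Blue skin": 1,
    "Pointy ears": 1,
    "Hair absent on head": 1,
    "Horns present": 1,
    "Hair present on head": 7,
    "Enlarged lip": 2,
    "Increased skin pigmentation": 3,
}
TOTAL = 8  # assumption: cohort size inferred from Female (4) + Male (4)


def information_content(label, counts=COUNTS, total=TOTAL):
    """Rarer phenotypes get a higher score: -log2(count / total)."""
    return -math.log2(counts[label] / total)


def profile_score(labels):
    """Sum of information content over the phenotypes in a profile."""
    return sum(information_content(label) for label in labels)


# Rank phenotypes from most to least informative
ranked = sorted(COUNTS, key=information_content, reverse=True)

specific = profile_score(["Blue skin", "Horns present"])
generic = profile_score(["Dark hair", "Hair present on head"])
print(specific > generic)
```

Under these assumptions, a profile of two rare features scores well above a profile of two common ones, which is the intuition behind asking how much phenotyping is enough.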

bit.ly/annotationsufficiency

Matchmaker Exchange for patients, diseases, and model organisms

Computational matching of rare disease patients across clinical & public sources

Find models and experts for functional validation

bit.ly/mme-matchbox

patientarchive.org

bit.ly/exomiser-2017

Plain language synonyms for computable phenotypes

Layperson-HPO driven phenotyping tool

https://www.pcori.org/research-results/2017/realization-standard-care-rare-diseases-using-patient-engaged-phenotyping

Catherine Brownstein, Ingrid Holm

NCI Thesaurus is the de facto cancer vocabulary standard

Required for drug trials by FDA, but not interoperable with other vocabulary standards

[Figure labels: Sequence Ontology, Uberon Anatomy Ontology, Genotype Ontology, MONDO Disease Ontology, Human Phenotype Ontology, NCBI Gene, Reactome, NCBITaxon, Protein Ontology, ChEBI chemical entities ontology, UNII chemical substance registry, Cell Ontology, Ontology of Biomedical Investigations, Gene Ontology (GO-BP), Gene Ontology (GO-CC), UniProt]

Tailoring the NCIt for computational interoperability

https://github.com/NCI-Thesaurus

ICD-O and Oncotree slims available too: https://github.com/NCI-Thesaurus/thesaurus-obo-edition/wiki/Downloads


Lobular Breast Carcinoma = 'Breast Adenocarcinoma'
  and (Disease_Has_Normal_Tissue_Origin some 'Terminal Ductal Lobular Unit')
  and (Disease_Has_Normal_Cell_Origin some 'Terminal Ductal Lobular Unit Cell')
  and (Disease_Has_Abnormal_Cell some 'Lobular Carcinoma Cell')
  and (Disease_May_Have_Cytogenetic_Abnormality some 'Loss of Chromosome 16q')
  and (Disease_Excludes_Abnormal_Cell some 'Ductal Carcinoma Cell')
  and (Disease_Excludes_Finding some 'Mixed Cellular Population')
  and (Disease_Mapped_To_Gene some 'CDH1 Gene')
  and (Disease_May_Have_Molecular_Abnormality some 'Loss of E-cadherin Expression')
  and (Disease_May_Have_Molecular_Abnormality some 'CDH1 Gene Inactivation')

Tailoring the NCIt for computational interoperability

Jim Balhoff, Sherri de Coronado, Gilberto Fragoso, Nicole Vasilevsky, Paula Carrio Caro, Matt Brush, Chris Mungall

Variant Pathogenicity Interpretations

Pathogenic? Benign?

DSC2:c.631-2A>G

Right Ventricular Cardiomyopathy

Complications to variant interpretation:

Pathogenicity evidence is complex, diverse, indirect, conflicting

Siloed curation guidelines

High stakes (Applied directly to care)

Improving Rigor and Consistency of Variant Interpretation

2015 ACMG-AMP Variant Interpretation Guidelines
• 28 ‘criteria’ re: evidence types, strength
• Framework for combining criteria outcomes

ClinGen Variant Curation Interface (VCI) and DMWG
• Data model and curation for variant evidence and provenance

SEPIO: Scientific Evidence and Provenance Information Ontology
• Computable model for representation and analysis of evidence and provenance

Merged Disease Classification
• Harmonized disease classification for algorithmic use and pathogenicity assignment

SEPIO: Scientific Evidence and Provenance Information

Matt Brush, Selina Dwight, Larry Babb, Chris Bizon, Bradford Powell, Tristan Nelson, Bob Freimuth, Chris Mungall

[Figure: evidence items :e1 to :e5 linked to :claim1 “pathogenic” and :claim2 “benign”; evidence types shown: co-localization evidence, functional complementation evidence, microscopy evidence, imaging evidence, co-immunoprecipitation evidence]

Algorithms can leverage the semantics of SEPIO models to compute quantitative metrics of evidence quality, quantity, diversity, and concordance, supporting automated evaluation of claims.

Evidence-Based Computational Evaluation of Claims

https://github.com/monarch-initiative/SEPIO-ontology/wiki
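As a rough illustration of the kind of metrics mentioned on this slide, the sketch below counts evidence per claim (quantity), counts distinct evidence types per claim (diversity), and reports the share of all evidence that supports each claim (a crude stand-in for concordance). It does not use the SEPIO model or any SEPIO tooling; the `Evidence` class is hypothetical, and the pairing of evidence items with types and claims is invented for the example, since the slide does not specify it.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Evidence:
    id: str        # evidence item label, as in the figure
    kind: str      # evidence type, as named on the slide
    supports: str  # claim the item supports (pairing is hypothetical)


EVIDENCE = [
    Evidence(":e1", "co-localization evidence", ":claim1"),
    Evidence(":e2", "functional complementation evidence", ":claim1"),
    Evidence(":e3", "microscopy evidence", ":claim1"),
    Evidence(":e4", "imaging evidence", ":claim2"),
    Evidence(":e5", "co-immunoprecipitation evidence", ":claim1"),
]


def claim_metrics(evidence):
    """Compute simple per-claim quantity, diversity, and concordance."""
    quantity = Counter(item.supports for item in evidence)
    kinds = {}
    for item in evidence:
        kinds.setdefault(item.supports, set()).add(item.kind)
    total = len(evidence)
    return {
        claim: {
            "quantity": count,                # number of evidence items
            "diversity": len(kinds[claim]),   # number of distinct evidence types
            "concordance": count / total,     # share of all evidence for this claim
        }
        for claim, count in quantity.items()
    }


metrics = claim_metrics(EVIDENCE)
for claim, values in sorted(metrics.items()):
    print(claim, values)
```

A real evaluation would also weight evidence by quality and strength, which this sketch deliberately leaves out.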

How do all these ontologies fit into our notion of disease?

[Figure: Disease 1 and Disease 2 shown over data, standards, and ontologies for genes, environment, and phenotypes; FHIR]

[Figure: Disease 1 and Disease 2 shown over data, standards, and ontologies for genes, environment, and phenotypes; FHIR; with an added layer labeled "METADATA, EVIDENCE"]

Defining disease and clinical pathogenicity: A lumping and splitting problem

source IDs

split/merge

manage resolution & provenance

MONDO: Unified Disease Ontology

SEPIO: Scientific Evidence and Provenance Information

One disease or two? What does the evidence favor?

One disease or two? How do we manage identifiers, hierarchy?

OMIM (brown)

MESH (grey)

ORDO/Orphanet (yellow)

SubClassOf (solid line)

Xref (dashed grey line)

Hemolytic anemia mappings across resources

Each nosology is different, and they map to each other inconsistently, leading to poor interoperability and computability

New integrated nosology

http://bit.ly/Monarch-Disease

http://purl.obolibrary.org/obo/mondo/pre/mondo.owl
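The lumping and splitting problem can be seen in a few lines of code. The sketch below naively treats every cross-reference as an equivalence and groups source identifiers transitively with a union-find structure. It is not how the integrated nosology is actually built, which the slides do not describe; the `merge_by_xref` function and all identifiers are hypothetical placeholders.

```python
def merge_by_xref(xrefs):
    """Group identifiers into cliques, treating each xref pair as equivalent."""
    parent = {}

    def find(x):
        # Find the representative of x, shortening the path as we go
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            # Keep the lexicographically smaller root for stable output
            parent[max(root_a, root_b)] = min(root_a, root_b)

    for a, b in xrefs:
        union(a, b)

    groups = {}
    for identifier in list(parent):
        groups.setdefault(find(identifier), set()).add(identifier)
    return sorted(sorted(group) for group in groups.values())


# Placeholder identifiers, not real OMIM, Orphanet, or MeSH entries
XREFS = [
    ("OMIM:0000001", "ORDO:0000001"),
    ("ORDO:0000001", "MESH:0000001"),
    ("OMIM:0000002", "MESH:0000002"),
]

cliques = merge_by_xref(XREFS)
for clique in cliques:
    print(clique)
```

Because the merge is transitive, a single imprecise cross-reference is enough to lump two distinct diseases into one clique, which is why resolution and provenance have to be managed explicitly rather than inferred from xrefs alone.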

Standard exchange formats exist for genes … but for phenotypes? Environment?

Genes: VCF, GFF, BED

Phenotypes: PXF

http://phenopackets.org New Funding: Forums for Phenomics!

What does a phenopacket look like?

Alacrima

Sleep Apnea

Microcephaly

phenotype_profile:
  - entity: "patient16"
    phenotype:
      types:
        - id: "HP:0000522"
          label: "Alacrima"
      onset:
        description: "at birth"
        types:
          - id: "HP:0003577"
            label: "Congenital onset"
    evidence:
      - types:
          - id: "ECO:0000033"
            label: "Traceable Author Statement"
        source:
          - id: "PMID:"
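The same record can be built and inspected programmatically. The sketch below assembles the profile as nested Python dictionaries, serializes it to JSON with the standard library, and collects the ontology prefixes it uses. The nesting is one reading of the flattened slide and has not been checked against the phenopackets schema; the `term` and `curies` helpers are hypothetical, and the `source` entry is omitted because its identifier is not given.

```python
import json


def term(curie, label):
    """Build an ontology term reference with an id and a label."""
    return {"id": curie, "label": label}


profile = {
    "phenotype_profile": [
        {
            "entity": "patient16",
            "phenotype": {
                "types": [term("HP:0000522", "Alacrima")],
                "onset": {
                    "description": "at birth",
                    "types": [term("HP:0003577", "Congenital onset")],
                },
            },
            "evidence": [
                {"types": [term("ECO:0000033", "Traceable Author Statement")]}
            ],
        }
    ]
}


def curies(node):
    """Yield every "id" value found anywhere in the nested structure."""
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "id":
                yield value
            else:
                yield from curies(value)
    elif isinstance(node, list):
        for item in node:
            yield from curies(item)


# Which ontologies does this record depend on?
prefixes = sorted({curie.split(":", 1)[0] for curie in curies(profile)})
serialized = json.dumps(profile, indent=2)
print(prefixes)
```

Listing the prefixes is a cheap first check that a record only references the ontologies a receiving system expects.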

Clinical labs

Public databases

Journals

Layperson HPO + Phenopackets

Dry eyes

Stops breathing during sleep

Small head

phenotype_profile:
  - entity: "Grace"
    phenotype:
      types:
        - id: "HP:0000522"
          label: "Alacrima"
      onset:
        description: "at birth"
        types:
          - id: "HP:0003577"
            label: "Congenital onset"
    evidence:
      - types:
          - id: "ECO:0000033"
            label: "Traceable Author Statement"
        source:
          - id: "https://twitter.com/examplepatient/status/123456789"

• Patient registries

• Social media

What’s next? Challenges for this workstream

Figure out how ontologies, metadata, eHealth, and exchange standards all fit together in this workstream

Further harmonize existing disease and phenotype ontologies and standards

Define exchange of structured phenotype data in different contexts: clinical, basic research, patients, databases, journals

Get structured G2P data (data about the biology of the patient) into and out of the EHR

Demonstrate standardization success across the driver projects

Discuss!
