biobanks in the us interoperability, database and ontologybiobanks in the us interoperability,...

41
Biobanks in the US Interoperability, Database and Ontology Olivier Bodenreider, MD, PhD Lister Hill National Center for Biomedical Communications Bethesda, Maryland - USA Biobanks:Interoperability, Database and Ontology November 23, 2010

Upload: truongtuyen

Post on 22-Feb-2018

225 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Biobanks in the US Interoperability, Database and OntologyBiobanks in the US Interoperability, Database and Ontology ... Interoperability, Database and Ontology ... Integration through

Biobanks in the USInteroperability, Database and Ontology

Olivier Bodenreider, MD, PhD

Lister Hill National Centerfor Biomedical Communications

Bethesda, Maryland - USA

Biobanks:Interoperability, Database and OntologyNovember 23, 2010

Page 2: Biobanks in the US Interoperability, Database and OntologyBiobanks in the US Interoperability, Database and Ontology ... Interoperability, Database and Ontology ... Integration through

Lister Hill National Center for Biomedical Communications 2

Outline

Context in the US Translational research Biobanking

Informatics infrastructure for translational research and biobanking

Ontological aspects of biobankingApplication – eMERGE Network

Page 3: Biobanks in the US Interoperability, Database and OntologyBiobanks in the US Interoperability, Database and Ontology ... Interoperability, Database and Ontology ... Integration through

Translational researchin the US

Page 4: Biobanks in the US Interoperability, Database and OntologyBiobanks in the US Interoperability, Database and Ontology ... Interoperability, Database and Ontology ... Integration through

Lister Hill National Center for Biomedical Communications 4

Translational medicine/research

Definition Effective transformation of information gained from

biomedical research into knowledge that can improve the state of human health and disease

Goals Turn basic discoveries into clinical applications more

rapidly (“bench to bedside”) Provide clinical feedback to basic researchers

[Butte, JAMIA 2008]

Page 5: Biobanks in the US Interoperability, Database and OntologyBiobanks in the US Interoperability, Database and Ontology ... Interoperability, Database and Ontology ... Integration through

Lister Hill National Center for Biomedical Communications 5

Flavors of translational research

T1 Move a basic discovery into a candidate health application “bench to bedside”

T2 Assess the value of T1 application for health practice leading

to the development of evidence-based guidelines “clinical research”

T3 Move evidence-based guidelines into health practice,

through delivery, dissemination, and diffusion research T4

Evaluate the "real world" health outcomes of a T1 application in practice

http://medicalcenter.osu.edu/research/translational_research/Pages/index.aspx

Page 6: Biobanks in the US Interoperability, Database and OntologyBiobanks in the US Interoperability, Database and Ontology ... Interoperability, Database and Ontology ... Integration through

Lister Hill National Center for Biomedical Communications 6

Translational bioinformatics

“… the development of storage, analytic, and interpretive methods to optimize the transformation of increasingly voluminous biomedical data into proactive, predictive, preventative, and participatory health.Translational bioinformatics includes research on the development of novel techniques for the integration of biological and clinical data and the evolution of clinical informatics methodology to encompass biological observations.The end product of translational bioinformatics is newly found knowledge from these integrative efforts that can be disseminated to a variety of stakeholders, including biomedical scientists, clinicians, and patients.” AMIA strategic plan

http://www.amia.org/inside/stratplan

Page 7: Biobanks in the US Interoperability, Database and OntologyBiobanks in the US Interoperability, Database and Ontology ... Interoperability, Database and Ontology ... Integration through

Enabling translational research

Clinical Translational Research Awards(CTSA)

Page 8: Biobanks in the US Interoperability, Database and OntologyBiobanks in the US Interoperability, Database and Ontology ... Interoperability, Database and Ontology ... Integration through

Lister Hill National Center for Biomedical Communications 8

Translational research NIH Common Fund

http://nihroadmap.nih.gov/

Page 9: Biobanks in the US Interoperability, Database and OntologyBiobanks in the US Interoperability, Database and Ontology ... Interoperability, Database and Ontology ... Integration through

Lister Hill National Center for Biomedical Communications 9

Clinical and Translational Science Awards

http://nihroadmap.nih.gov/

Page 10: Biobanks in the US Interoperability, Database and OntologyBiobanks in the US Interoperability, Database and Ontology ... Interoperability, Database and Ontology ... Integration through

Lister Hill National Center for Biomedical Communications 10

Clinical and Translational Science Awards

Strategic goals Build National Clinical and Translational Research

Capability Provide Training and Improving the Career

Development of Clinical and Translational Scientists Enhance Consortium-Wide Collaborations Improve the Health of our Communities and the Nation Advance T1 Translational Research

http://www.ctsaweb.org/

Page 11: Biobanks in the US Interoperability, Database and OntologyBiobanks in the US Interoperability, Database and Ontology ... Interoperability, Database and Ontology ... Integration through

Lister Hill National Center for Biomedical Communications 11

CTSA program

National Center for Research Resources (NCRR) 55 academic health centers in 28 states (2010)

60 centers upon completion Funding provided for 5 yearsTotal annual cost: $500 MAnnual funding per center: $4-23 M

Depending on previous funding

http://www.ncrr.nih.gov/clinical_research_resources/clinical_and_translational_science_awards/

Page 12: Biobanks in the US Interoperability, Database and OntologyBiobanks in the US Interoperability, Database and Ontology ... Interoperability, Database and Ontology ... Integration through

Lister Hill National Center for Biomedical Communications 12

Clinical and Translational Science Awards

http://www.ctsaweb.org/

Page 13: Biobanks in the US Interoperability, Database and OntologyBiobanks in the US Interoperability, Database and Ontology ... Interoperability, Database and Ontology ... Integration through

Lister Hill National Center for Biomedical Communications 13

Other related programs

National Centers for Biomedical Computing

“networked national effort to

build the computational

infrastructure for biomedical

computing in the nation”

http://www.ncbcs.org/

Page 14: Biobanks in the US Interoperability, Database and OntologyBiobanks in the US Interoperability, Database and Ontology ... Interoperability, Database and Ontology ... Integration through

Lister Hill National Center for Biomedical Communications 14

Other related programs

Cancer Biomedical Informatics Grid (caBIG)

Key elements Bioinformatics and Biomedical Informatics Community Standards for Semantic Interoperability Grid Computing

1000 participants from 200 organizations Funding: $60 M in the first 3 years (pilot)

“an information network enabling all constituencies in the cancer community – researchers, physicians, and patients –

to share data and knowledge.”

https://cabig.nci.nih.gov/

Page 15: Biobanks in the US Interoperability, Database and OntologyBiobanks in the US Interoperability, Database and Ontology ... Interoperability, Database and Ontology ... Integration through

Biobankingin the US

Page 16: Biobanks in the US Interoperability, Database and OntologyBiobanks in the US Interoperability, Database and Ontology ... Interoperability, Database and Ontology ... Integration through

Lister Hill National Center for Biomedical Communications 16

Biobank Definition

Collection of human tissue samples and medical information about donors, which are stored for long periods of times and are used for research studies

Bioregistry vs. Biorepository Registry – collection of contact information Biorepository – physical collection of samples

http://www.biobank.org/

Page 17: Biobanks in the US Interoperability, Database and OntologyBiobanks in the US Interoperability, Database and Ontology ... Interoperability, Database and Ontology ... Integration through

Lister Hill National Center for Biomedical Communications 17http://cabigtrainingdocs.nci.nih.gov/caTissue/

Page 18: Biobanks in the US Interoperability, Database and OntologyBiobanks in the US Interoperability, Database and Ontology ... Interoperability, Database and Ontology ... Integration through

Lister Hill National Center for Biomedical Communications 18

Examples of biobanks at large institutions

RENEW BiobankStanford University

Page 19: Biobanks in the US Interoperability, Database and OntologyBiobanks in the US Interoperability, Database and Ontology ... Interoperability, Database and Ontology ... Integration through

Lister Hill National Center for Biomedical Communications 19

Genetic Alliance Biobank

http://www.biobank.org/

Page 20: Biobanks in the US Interoperability, Database and OntologyBiobanks in the US Interoperability, Database and Ontology ... Interoperability, Database and Ontology ... Integration through

Lister Hill National Center for Biomedical Communications 20

CTSA Biobank Consortium

http://www.ncrr.nih.gov/ctsa/newsletter/December2009/

Page 21: Biobanks in the US Interoperability, Database and OntologyBiobanks in the US Interoperability, Database and Ontology ... Interoperability, Database and Ontology ... Integration through

Lister Hill National Center for Biomedical Communications 21

caHUB

http://cahub.cancer.gov/Office of Biorepositories and Biospecimen Research (OBBR)

Page 22: Biobanks in the US Interoperability, Database and OntologyBiobanks in the US Interoperability, Database and Ontology ... Interoperability, Database and Ontology ... Integration through

Lister Hill National Center for Biomedical Communications 22

caHUB Biospecimen Research Network

Missions Bridging the gap between existing clinical practice for

biospecimens and emerging technologies for personalized diagnostics and therapies

Defining the most significant variables for prospective collection of tissues, blood, and bodily fluids

Developing evidence-based biospecimen quality indicators for specific analytical platforms

http://biospecimens.cancer.gov/researchnetwork/

Page 23: Biobanks in the US Interoperability, Database and OntologyBiobanks in the US Interoperability, Database and Ontology ... Interoperability, Database and Ontology ... Integration through

Lister Hill National Center for Biomedical Communications 23

caHUB Biospecimen Research Database

Page 24: Biobanks in the US Interoperability, Database and OntologyBiobanks in the US Interoperability, Database and Ontology ... Interoperability, Database and Ontology ... Integration through

Informatics infrastructure

i2b2 software

Page 25: Biobanks in the US Interoperability, Database and OntologyBiobanks in the US Interoperability, Database and Ontology ... Interoperability, Database and Ontology ... Integration through

Lister Hill National Center for Biomedical Communications 25

i2b2 software

https://www.i2b2.org/software/

Page 26: Biobanks in the US Interoperability, Database and OntologyBiobanks in the US Interoperability, Database and Ontology ... Interoperability, Database and Ontology ... Integration through

Lister Hill National Center for Biomedical Communications 26

i2b2 software

Developed by the i2b2 NCBC (Boston)Adopted/adapted by most CTSA sitesClinical research data warehouseNo specific features for biospecimensOpen source, customizable Selected by the CTSA Biobank Consortium

https://www.i2b2.org/software/

Page 27: Biobanks in the US Interoperability, Database and OntologyBiobanks in the US Interoperability, Database and Ontology ... Interoperability, Database and Ontology ... Integration through

Lister Hill National Center for Biomedical Communications 27

caTissue

Biorepository management tool designed for biospecimen inventory, tracking, and annotation

Developed by Washington University in St. Louis Part of caBIG Integration through caGridCoupled with caTIES – cancer Text Information

Extraction SystemAdopted by several academic centers

Page 28: Biobanks in the US Interoperability, Database and OntologyBiobanks in the US Interoperability, Database and Ontology ... Interoperability, Database and Ontology ... Integration through

Ontological aspects of biobanking

Page 29: Biobanks in the US Interoperability, Database and OntologyBiobanks in the US Interoperability, Database and Ontology ... Interoperability, Database and Ontology ... Integration through

Lister Hill National Center for Biomedical Communications 29

Genetic Alliance Biobank TRIMS TRIMS is a centralized data repository for annotation data (clinical and

experimental), sample source and handling information, processing and quality assurance (QA) information, as well as inventory and process flow data.

All clinical, demographic, and experimental data management and sample registration/accessioning occur in TRIMS. TRIMS also provides tracking, data query, report generation, and process management functions.

[…] TRIMS uses SNOMED, from the College of American Pathologists, as

the controlled vocabulary for classification of disease and sample morphology.

TRIMS also uses internal standards for controlling the quality and consistency of entered data including use of drop-down pick lists, flexible units translation, and field data type restrictions among others.

[…]http://www.biobank.org/english/View.asp?x=1413

Page 30: Biobanks in the US Interoperability, Database and OntologyBiobanks in the US Interoperability, Database and Ontology ... Interoperability, Database and Ontology ... Integration through

Lister Hill National Center for Biomedical Communications 30

Ontologies

Provide the vocabulary for the annotation of clinical datasets and biospecimens

Clinical annotations Diseases Pathological conditions

Biospecimens Localization (anatomical, subcellular) Collection procedure

Page 31: Biobanks in the US Interoperability, Database and OntologyBiobanks in the US Interoperability, Database and Ontology ... Interoperability, Database and Ontology ... Integration through

Lister Hill National Center for Biomedical Communications 31

Ontologies Clinical annotations

EMRs SNOMED CT

Billing data ICD-9-CM

Clinical research repositories NCI Thesaurus

Registries ICD-O

Page 32: Biobanks in the US Interoperability, Database and OntologyBiobanks in the US Interoperability, Database and Ontology ... Interoperability, Database and Ontology ... Integration through

Lister Hill National Center for Biomedical Communications 32

Ontologies Biospecimen annotations

Anatomy Foundational Model of Anatomy (FMA) Anatomical component of general ontologies

(SNOMED CT, NCI Thesaurus)Tissue/cell ontologies

BRENDA (OBO) Cell type ontology (OBO)

Morphology (cancer) ICD-O

http://www.brenda-enzymes.info/

Page 33: Biobanks in the US Interoperability, Database and OntologyBiobanks in the US Interoperability, Database and Ontology ... Interoperability, Database and Ontology ... Integration through

Lister Hill National Center for Biomedical Communications 33

Ontology for Biomedical Investigations

Domain poorly covered by most ontologies International collaborative effort

(OBI Consortium) – OBO Family

In the spirit of the MGED ontology for microarray experiments Scope

biological material – blood plasma instrument (and parts of an instrument therein) – DNA microarray,

centrifuge information content (e.g., image)

or digital information entity (e.g., electronic medical record ) design and execution of an investigation (and individual experiments

therein) – study design, electrophoresis material separaition data transformation – principal components analysis, dimensionality

reduction, mean calculation

http://obi-ontology.org/

Page 34: Biobanks in the US Interoperability, Database and OntologyBiobanks in the US Interoperability, Database and Ontology ... Interoperability, Database and Ontology ... Integration through

Lister Hill National Center for Biomedical Communications 34

Ontology repositories

Unified Medical Language System National Library of

Medicine Free (license

agreement required) ~ 100 ontologies 2.2 M concepts Annotation tool

(MetaMap)

BioPortal

National Center for Biomedical Ontology

Free

~ 202 ontologies 1.4 M concepts Annotation tool

(Annotator)http://umlsks.nlm.nih.gov/ http://bioportal.bioontology.org/

Page 35: Biobanks in the US Interoperability, Database and OntologyBiobanks in the US Interoperability, Database and Ontology ... Interoperability, Database and Ontology ... Integration through

Lister Hill National Center for Biomedical Communications 35

Terminology integration systems

UMLS and BioPortalEnable mappings across terminologiesUseful for reconciling annotations to disparate

terminologies

Page 36: Biobanks in the US Interoperability, Database and OntologyBiobanks in the US Interoperability, Database and Ontology ... Interoperability, Database and Ontology ... Integration through

Lister Hill National Center for Biomedical Communications 36

Some issues with ontologies

Information model required Ontologies provide vocabulary for data elements and

categorical valuesLimited coverage

Biobanking proceduresAvailability

SNOMED CT – available only to the members of the IHTSDO

Limited availability of coded EMR data Need for natural language processing of EMR data

Page 37: Biobanks in the US Interoperability, Database and OntologyBiobanks in the US Interoperability, Database and Ontology ... Interoperability, Database and Ontology ... Integration through

Application

eMERGE Network

Page 38: Biobanks in the US Interoperability, Database and OntologyBiobanks in the US Interoperability, Database and Ontology ... Interoperability, Database and Ontology ... Integration through

Lister Hill National Center for Biomedical Communications 38

eMERGE Network Overview

Funded by NIH 5 member sites

Group Health Cooperative / University of Washington Marshfield Clinic Mayo Clinic Northwestern University Vanderbilt University

Local biorepositories Explore whether EMR systems can serve as resources

for complex genomic analysis of disease susceptibility and therapeutic outcomes, across diverse patient populations

https://www.mc.vanderbilt.edu/victr/dcc/projects/acc/index.php/Main_Page

Page 39: Biobanks in the US Interoperability, Database and OntologyBiobanks in the US Interoperability, Database and Ontology ... Interoperability, Database and Ontology ... Integration through

Lister Hill National Center for Biomedical Communications 39

eMERGE Network Methods

Primary phenotypes studied Dementia - GHC/University of Washington Cataract - Marshfield Clinic Peripheral Arterial Disease - Mayo Clinic Type 2 Diabetes - Northwestern University Cardiac conduction - Vanderbilt University

Tooling developed eleMAP – facilitate mapping of local phenotype data

dictionaries to standard terminologies PheWAS

Page 40: Biobanks in the US Interoperability, Database and OntologyBiobanks in the US Interoperability, Database and Ontology ... Interoperability, Database and Ontology ... Integration through

Lister Hill National Center for Biomedical Communications 40

eMERGE Network Results

Demonstrated the feasibility of using EMR data for identifying genotype-phenotype associations (e.g., “rediscovery” of association between genes and red blood cell traits)

Demonstrated the feasibility of pooling clinical data from various institutions, increasing statistical power

Confirmed the need for NLP analysis of EMR data (clinical notes, reports)

Page 41: Biobanks in the US Interoperability, Database and OntologyBiobanks in the US Interoperability, Database and Ontology ... Interoperability, Database and Ontology ... Integration through

MedicalOntologyResearch

Olivier Bodenreider

Lister Hill National Centerfor Biomedical CommunicationsBethesda, Maryland - USA

Contact:Web:

[email protected]