caBIG: Building a Biomedical e-Ecosystem · sagecongress.org/presentations/buetow.pdf

caBIG: Building a Biomedical e-Ecosystem Ken Buetow, Ph.D. Associate Director for Biomedical Informatics and Information Technology National Cancer Institute Sage Congress


Page 1

caBIG: Building a Biomedical e-Ecosystem

Ken Buetow, Ph.D.
Associate Director for Biomedical Informatics and Information Technology
National Cancer Institute

Sage Congress

Page 2

caBIG®: Biomedical Information Highway

The cancer Biomedical Informatics Grid® (caBIG®) is a virtual network of interconnected data, individuals, and organizations that redefines how research is conducted, care is provided, and patients/participants interact with the biomedical research enterprise.

Page 3

21st Century Biomedicine requires connection of diverse information

•  Clinical Research
•  Pathology
•  Molecular Biology
•  Imaging
•  Molecular Medicine

Page 4

Boundaries and Interfaces

•  Focus on boundaries and interfaces (how things fit together), not on the internal details
•  Acknowledge the inherent heterogeneity of biomedical data and applications
•  Assume that the diverse landscape is ever changing

Page 5

IT-enabled ecosystem

[Diagram: Research Centers, Medical Centers, a Research Unit, and a Medical Unit connected through shared grid services]

•  Shared resources: Imaging, Microarray Data, Biospecimens, Analytical Tools
•  Security: Dorian, GTS
•  Advertisement / Discovery: Index Service
•  Federated Query: Federated Query Service
•  Workflow: Workflow Management Service
•  Metadata Management: Vocabularies & Ontologies, GME Schema Management, Common Data Elements

Page 6

Classifying Lymphoma

•  Scientific value
    •  Use gene-expression patterns associated with two lymphoma types to predict the type of an unknown sample.
    •  Connect caGrid data service (caArray) with analytical services (PreProcess, SVM and KNN from GenePattern).
•  Major steps
    •  Querying training data from experiments stored in caArray.
    •  Preprocessing, i.e., normalizing the microarray data.
    •  Predicting lymphoma type using SVM & KNN services.
•  Extension
    •  Generalized the workflow into a cancer type prediction routine that can be used on other caArray data sets.

*Fig. from MA Shipp. Nature Medicine, 2002

Ravi Madduri Univ. Chicago

Carole Goble U. Manchester, UK

Wei Tan Univ. Chicago

Dinanath Sulakhe Univ. Chicago

Stian Soiland-Reyes U. Manchester, UK
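The three major steps on the Classifying Lymphoma slide (query training data, normalize it, predict the type of an unknown sample) can be illustrated with a small local sketch. This is not the caGrid workflow itself and it calls no caArray or GenePattern service: the in-memory `TRAINING` table stands in for the caArray query, a per-gene z-score stands in for the PreProcess service, and a plain k-nearest-neighbour vote stands in for the KNN service (the SVM step is left out). All names, labels, and expression values below are hypothetical.

```python
import math
from collections import Counter

# Step 1 (stand-in for the caArray query): toy training data.
# Each entry is (expression values for three genes, lymphoma type label).
# The values and the labels "type_A"/"type_B" are invented for illustration.
TRAINING = [
    ([8.1, 2.0, 5.5], "type_A"),
    ([7.9, 2.3, 5.1], "type_A"),
    ([8.4, 1.8, 5.8], "type_A"),
    ([3.2, 6.9, 5.3], "type_B"),
    ([2.8, 7.4, 5.0], "type_B"),
    ([3.5, 7.1, 5.6], "type_B"),
]


def fit_scaler(rows):
    """Return a (mean, std) pair per gene, computed from the training rows."""
    n = len(rows)
    stats = []
    for column in zip(*rows):
        mean = sum(column) / n
        variance = sum((x - mean) ** 2 for x in column) / n
        # Fall back to 1.0 when a gene has no spread, to avoid dividing by zero.
        stats.append((mean, math.sqrt(variance) or 1.0))
    return stats


def scale(row, stats):
    """Step 2 (stand-in for PreProcess): z-score each gene of one sample."""
    return [(x - mean) / std for x, (mean, std) in zip(row, stats)]


def knn_predict(train_rows, train_labels, query, k=3):
    """Step 3 (stand-in for the KNN service): majority vote of k nearest samples."""
    distances = sorted(
        (math.dist(row, query), label)
        for row, label in zip(train_rows, train_labels)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


def classify(training, sample, k=3):
    """Run normalize-then-predict on one unknown sample."""
    rows = [row for row, _ in training]
    labels = [label for _, label in training]
    stats = fit_scaler(rows)
    scaled_rows = [scale(row, stats) for row in rows]
    return knn_predict(scaled_rows, labels, scale(sample, stats), k)


if __name__ == "__main__":
    print(classify(TRAINING, [8.0, 2.1, 5.4]))
    print(classify(TRAINING, [3.0, 7.0, 5.2]))
```

Because the scaler is fitted on the training rows only and then reused for the unknown sample, the same `classify` function could in principle be pointed at a different labelled data set, which mirrors the "Extension" bullet about generalizing the workflow.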

Page 7

Lymphoma Prediction Workflow

Microarray from tumor tissue → Microarray preprocessing → Lymphoma classification

Ack.: Juli Klemm, Xiaopeng Bian, Rashmi Srinivasa (NCI); Jared Nedzel (MIT)

Page 8

The I-SPY trial (Investigation of Serial studies to Predict Your Therapeutic Response with Imaging And moLecular analysis): a national study to identify biomarkers predictive of response to therapy throughout the treatment cycle for women with Stage 3 breast cancer.

(Laura Esserman, UCSF)

Page 9

I-SPY Trial: Identify biomarkers predictive of therapeutic response in Stage 3 breast cancer

Multiple Morphologic Patterns of Breast Cancer

Multiple Data Types
•  Clinical diagnosis
•  Treatment history
•  Histologic diagnosis
•  Pathologic status
•  Tissue anatomic site
•  Surgical history
•  Gene expression
•  Chromosomal copy number
•  Loss of heterozygosity
•  Methylation patterns
•  miRNA expression
•  DNA sequence

Multiple Sites/Organizations
•  Specialized Programs of Research Excellence (SPOREs)
•  Cancer and Leukemia Group B (CALGB)
•  American College of Radiology Imaging Network (ACRIN)
•  University of California at San Francisco (UCSF)

Page 10

Page 11

I-SPY Adaptive Trial Outline

Accrual: Anticipate 800 patients over 3–4 years

Enroll ~20 patients per month

Participating Sites: 15–20 across US and Canada
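As a quick consistency check on the accrual figures above, the enrollment rate and the target can be compared directly. This is only arithmetic on the two numbers quoted on the slide; the variable names are illustrative.

```python
# Figures quoted on the slide.
target_patients = 800
patients_per_month = 20  # the slide says "~20", so this is approximate

months = target_patients / patients_per_month
years = months / 12
print(f"{months:.0f} months, about {years:.1f} years")
```

At roughly 20 patients per month, 800 patients take about 40 months, which falls inside the anticipated 3–4 year window.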

[Schema: On Study → Taxol +/– New Drug (12 weekly cycles) → AC (4 cycles) → Surgery. Assessments shown along the timeline: MRI, Biopsy, Blood, Tissue.]

Page 12

[Schema: On Study → HER 2 status → Randomize → treatment arm → AC → Surgery]

HER 2 (+) arms:
•  Taxol + Trastuzumab*
•  Taxol + Trastuzumab* + New Agent A
•  Taxol + Trastuzumab* + New Agent B
•  Taxol + Trastuzumab* + New Agent C

HER 2 (–) arms:
•  Taxol
•  Taxol + New Agent C
•  Taxol + New Agent D

*Or Equivalent

Learn, Adapt from each patient

Page 13

[Schema: On Study → HER 2 status → Randomize → treatment arm → AC → Surgery]

HER 2 (+) arms:
•  Taxol + Trastuzumab*
•  Taxol + Trastuzumab* + New Agent A
•  Taxol + Trastuzumab* + New Agent B
•  Taxol + Trastuzumab* + New Agent C
•  Taxol + Trastuzumab* + New Agent F

HER 2 (–) arms:
•  Taxol
•  Taxol + New Agent C
•  Taxol + New Agent D
•  Taxol + New Agent F
•  Taxol + New Agent G

*Or Equivalent

Learn, Adapt from each patient

Page 14

I-SPY Trial IT Infrastructure

•  Data sources: Expression Array Data, SNP Array Data, Radiological Data, Clinical Data, Patient Samples
•  Diagram components: Local System, Data Mart, caExchange Hub, Tolven, caBIG® Applications (caTissue, caArray), with API connectors between them

Page 15

Perspectives for building a Community where disease data and models are shared

Lessons Learned:
•  Culture eats Strategy for Lunch: Most of what prevents data-sharing is not technical, but reflects the segmentation of stakeholders into silos (by discipline, by geography, by sector), or concerns about intellectual capital and getting credit
•  Begin with a “coalition of the willing”
•  There are no free lunches
•  Failure to collect appropriate structured data can result in loss of information or inability to connect
•  Context is critical for clinical data
•  Exchange AND use of information requires more than the ability to electronically move it from point to point.
•  Pay me now or pay me later

Challenges Ahead:
•  The culture of science is driven towards doing “new”, not leveraging the work of others, and incentives are generally aligned that way – “not invented here”. We don’t move forward because we are always starting over.