BIG DATA: A Life Sciences Perspective

Scott Novogoratz, CIO, College of Veterinary Medicine & Biomedical Sciences


Post on 24-Feb-2016


TRANSCRIPT

Page 1: BIG DATA A Life Sciences Perspective

BIG DATA
A Life Sciences Perspective

Scott Novogoratz, CIO
College of Veterinary Medicine & Biomedical Sciences

Page 2: BIG DATA A Life Sciences Perspective

Infectious Disease Research Center

Among the world's leaders in researching West Nile Virus, drug-resistant Tuberculosis, Yellow Fever, Dengue, Hantavirus, Plague, Tularemia and other zoonotic and human diseases

Page 3: BIG DATA A Life Sciences Perspective

Radiological Cancer Treatment

Heavy Ion Therapy

Page 4: BIG DATA A Life Sciences Perspective
Page 5: BIG DATA A Life Sciences Perspective

“BIG DATA is data that exceeds the processing capacity of conventional database systems.”

Edd Dumbill, Editor in Chief, Big Data

Page 6: BIG DATA A Life Sciences Perspective

Omics & Ologies - Life Sciences BIG DATA

Omics
– Genomics
– Transcriptomics
– Proteomics
– Metabolomics
– Metagenomics

BIG DATA Devices
– Gene sequencing
– Mass spectrometry
– Imaging
– Microarrays
– Liquid chromatography

Ology(ies)
– Radiology
– Gastroenterology
– Cardiology
– Pathology

Page 7: BIG DATA A Life Sciences Perspective

Medical Imaging BIG DATA Demands

Increases due to:
• Avg. Size/Study
• More Digitized Data
– Pathology
– Endoscopy
– Pictures
• More Imaging Procedures

Page 8: BIG DATA A Life Sciences Perspective

Radiographs

Canine Hip Dysplasia

Page 10: BIG DATA A Life Sciences Perspective

Pathology

Page 11: BIG DATA A Life Sciences Perspective

How Big is a Genome?

Paris japonica: 152 Billion Base Pairs

H: 3 Billion Base Pairs

E. coli: 4 Million Base Pairs
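
To give a feel for what these base-pair counts mean for storage, the figures on this slide can be turned into a rough raw size. The sketch below assumes a fixed 2 bits per base (A, C, G, T only) with no quality scores, metadata, or compression. That encoding and the helper name `packed_size_bytes` are illustrative assumptions, not something taken from the presentation.

```python
# Rough raw-storage estimate for the genome sizes quoted on the slide.
# Assumption (not from the presentation): 2 bits per base, one strand,
# no quality scores, no metadata, no compression.
GENOME_BASE_PAIRS = {
    "Paris japonica": 152_000_000_000,
    "Human": 3_000_000_000,
    "E. coli": 4_000_000,
}

BITS_PER_BASE = 2  # enough to distinguish A, C, G and T


def packed_size_bytes(base_pairs: int, bits_per_base: int = BITS_PER_BASE) -> int:
    """Return the bytes needed to store a sequence, rounded up to whole bytes."""
    return (base_pairs * bits_per_base + 7) // 8


for organism, base_pairs in GENOME_BASE_PAIRS.items():
    size_mb = packed_size_bytes(base_pairs) / 1e6
    print(f"{organism}: {size_mb:,.0f} MB")
```

Under these assumptions a single human genome packs into roughly 750 MB. Real sequencing output is much larger, since instruments produce many overlapping reads along with per-base quality data.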

Page 12: BIG DATA A Life Sciences Perspective

E. coli

Page 13: BIG DATA A Life Sciences Perspective
Page 14: BIG DATA A Life Sciences Perspective

The scale of biological data is exponentially increasing, with sequencing technologies now producing data at a rate exceeding the growth in computing power predicted by Moore’s Law (10,000-fold improvement in sequencing vs. 16-fold improvement in computing).

From the Big Data article Unraveling the Complexities, Higdon et al
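
One way to read the two fold-improvement figures is to convert each into a number of doublings, since Moore's Law is usually stated in terms of doubling. The sketch below does only that arithmetic. The slide does not state the time period the figures cover, so no per-year rate is derived, and the function name `doublings` is an illustrative choice.

```python
import math

# Fold improvements quoted on the slide (attributed to Higdon et al.).
SEQUENCING_FOLD = 10_000
COMPUTING_FOLD = 16


def doublings(fold_improvement: float) -> float:
    """Number of successive doublings that yields the given fold improvement."""
    return math.log2(fold_improvement)


sequencing_doublings = doublings(SEQUENCING_FOLD)
computing_doublings = doublings(COMPUTING_FOLD)
relative_gap = SEQUENCING_FOLD / COMPUTING_FOLD

print(f"Sequencing: {sequencing_doublings:.1f} doublings")
print(f"Computing:  {computing_doublings:.1f} doublings")
print(f"Sequencing outpaced computing by a factor of {relative_gap:.0f}")
```

Over the same unstated period, sequencing went through about 13 doublings while computing went through 4, a 625-fold gap in the data produced relative to the compute available to process it.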

The scale of biological data is exponentially increasing, with sequencing technologies now producing data at a rate exceeding the growth in computing power predicted by Moore’s Law [8–10] (10,000-fold improvement in sequencing vs. 16-fold improvement in computing over Moore’s Law) [8,9]. In addition, the majority of research is generated in isolation and demonstrates only an 11% rate of reproducibility according to a recent study [11]. Moreover, 27% (+/-9%) of cancer cell lines are misidentified, one out of three proteins is unannotated, and according to one report, up to 85% of research efforts are wasted due to inadequate production and reporting practices [2].
Page 15: BIG DATA A Life Sciences Perspective

Velocity – Genome Studies Will Increase

Page 16: BIG DATA A Life Sciences Perspective

What Do Life Science Researchers Want?

1. Reliable Data
2. Statistically Valid Results
3. Analysis Tools with User-Friendly I/F
4. Transparent Reporting of Results
5. Ability to Share Data

From U of Washington study to assess data & analysis needs for Life Scientists

Page 17: BIG DATA A Life Sciences Perspective

Relative Importance for the Life Sciences

1. Volume
2. Veracity
3. Velocity
4. Variety
5. Value

Page 18: BIG DATA A Life Sciences Perspective

Conclusions

• Recognize that BIG DATA storage issues differ based on the purpose and use of data

• Maximize the value of biological research by improving the capability to store, catalog, share and compare research through:
– Low cost and shared storage mechanisms
– Universal and easy-to-use tools that provide researchers with the capability to compare their findings with libraries of information