building an efficient infrastructure, standards and data flow for metabolomics

68
Developing an Efficient Infrastructure, Standards and Data-Flow for Metabolomics Christoph Steinbeck European Bioinformatics Institute (EMBL-EBI)

Upload: christoph-steinbeck

Post on 22-Jan-2017

42 views

Category:

Data & Analytics


0 download

TRANSCRIPT

Page 1: Building an efficient infrastructure, standards and data flow for metabolomics

Developing an Efficient Infrastructure, Standards and Data-Flow for Metabolomics

Christoph Steinbeck

European Bioinformatics Institute(EMBL-EBI)

Page 2: Building an efficient infrastructure, standards and data flow for metabolomics

Developing an Efficient Infrastructure, Standards and Data-Flow for Metabolomics

Christoph Steinbeck

European Bioinformatics Institute(EMBL-EBI)

Slides on http://www.slideshare.net/csteinbeck

Page 3: Building an efficient infrastructure, standards and data flow for metabolomics

The European Bioinformatics Institute

(EBI)

Page 4: Building an efficient infrastructure, standards and data flow for metabolomics

The European Bioinformatics Institute

(EBI)

Page 5: Building an efficient infrastructure, standards and data flow for metabolomics

The European Bioinformatics Institute

(EBI)

Page 6: Building an efficient infrastructure, standards and data flow for metabolomics

The European Bioinformatics Institute

(EBI)

Page 7: Building an efficient infrastructure, standards and data flow for metabolomics

The European Molecular Biology Laboratory

(EMBL)

A basic research institute funded by public research monies from 20 member states.

Page 8: Building an efficient infrastructure, standards and data flow for metabolomics

Post Genomic Era

Christoph Steinbeck

European Bioinformatics Institute(EMBL-EBI)

[O]ur understanding of the human genome has changed in the most fundamental ways. The small number of genes -- some 30,000 -- supports the notion that we are not hard wired. We now know the notion that one gene leads to one protein, and perhaps one disease, is false.

Craig Venter, June 2001

Page 9: Building an efficient infrastructure, standards and data flow for metabolomics

Understanding Phenotypes

Page 10: Building an efficient infrastructure, standards and data flow for metabolomics

Nutrition

Exercise

Disease

AgeDrugs

Environment

Phenome/Exposome

Page 11: Building an efficient infrastructure, standards and data flow for metabolomics

Reaction times following external change

• Genetics (decades, centuries…)

• Epigenetics (days, month, years,…)

• Gene Expression (hours)

• Metabolism (seconds)

Page 12: Building an efficient infrastructure, standards and data flow for metabolomics

The Metabolome is the most accessible and

dynamically changing Molecular Phenotype

Page 13: Building an efficient infrastructure, standards and data flow for metabolomics

Metabolites:

Small molecules in biological organisms

Page 14: Building an efficient infrastructure, standards and data flow for metabolomics

Metabolomics

Measures occurrence and concentrations of many small molecules (metabolites) in an organism at once.

Page 15: Building an efficient infrastructure, standards and data flow for metabolomics

Nuclear Magnetic Resonance (NMR)

Mass Spec

Metabolomics uses a wide-range of analytical techniques

Page 16: Building an efficient infrastructure, standards and data flow for metabolomics

Phenome Centres popping up all over the world

• London

• Birmingham

• Shanghai

• NIH RCMRCs

• …

Page 17: Building an efficient infrastructure, standards and data flow for metabolomics
Page 18: Building an efficient infrastructure, standards and data flow for metabolomics

> 100,000 patient samples / year> Several PetaBytes/year

=> ExaBytes of human data at moderate scale-up

Page 19: Building an efficient infrastructure, standards and data flow for metabolomics

How do you make sense of all that data?

Page 20: Building an efficient infrastructure, standards and data flow for metabolomics

Share them-

Free and Open

Page 21: Building an efficient infrastructure, standards and data flow for metabolomics

MetaboLights

http://www.ebi.ac.uk/metabolights

open-access, cross-species, cross-application,long-term supported

Salek, R.M., Haug, K. and Steinbeck, C. (2013) Dissemination of metabolomics results: role of MetaboLights and COSMOS. Gigascience, 2:8.

Page 22: Building an efficient infrastructure, standards and data flow for metabolomics

MetaboLights Database

Experimental Repository

Reference Layer

Chemistry Spectroscopy Biology

Ana

lysi

s To

ols

Primary Literature

Primary data and Meta-Data, Spectra, Protocols, Synopses, ...

Page 23: Building an efficient infrastructure, standards and data flow for metabolomics
Page 24: Building an efficient infrastructure, standards and data flow for metabolomics

www.ebi.ac.uk/metabolights (metabolights.org, metabolights.eu)

Page 25: Building an efficient infrastructure, standards and data flow for metabolomics

www.ebi.ac.uk/metabolights (metabolights.org, metabolights.eu)

Page 26: Building an efficient infrastructure, standards and data flow for metabolomics

Data growth in EBI data repositories

Page 27: Building an efficient infrastructure, standards and data flow for metabolomics

Data growth in EBI data repositories

3-month doubling time

for Metabolomics

Page 28: Building an efficient infrastructure, standards and data flow for metabolomics

Data growth in EBI data repositories

3-month doubling time

for Metabolomics

MetaboLights is now the recommended

repositoryfor the Nature journals,

EMBO journal, PLOS journals, Metabolomics

Journal and others

Page 29: Building an efficient infrastructure, standards and data flow for metabolomics

Sansone,… Steinbeck et al. (2012) Toward interoperable bioscience data.

Nature Genetics, 44, 121–126.

Page 30: Building an efficient infrastructure, standards and data flow for metabolomics

Sansone,… Steinbeck et al. (2012) Toward interoperable bioscience data.

Nature Genetics, 44, 121–126.

Controlled VocabulariesOntologies

Page 31: Building an efficient infrastructure, standards and data flow for metabolomics

Sansone,… Steinbeck et al. (2012) Toward interoperable bioscience data.

Nature Genetics, 44, 121–126.

Controlled VocabulariesOntologies

Minimum Information Standards

Page 32: Building an efficient infrastructure, standards and data flow for metabolomics
Page 33: Building an efficient infrastructure, standards and data flow for metabolomics

Controlled VocabulariesOntologies

Page 34: Building an efficient infrastructure, standards and data flow for metabolomics

Controlled VocabulariesOntologies

Page 35: Building an efficient infrastructure, standards and data flow for metabolomics
Page 36: Building an efficient infrastructure, standards and data flow for metabolomics
Page 37: Building an efficient infrastructure, standards and data flow for metabolomics
Page 38: Building an efficient infrastructure, standards and data flow for metabolomics
Page 39: Building an efficient infrastructure, standards and data flow for metabolomics

Global Standards and

Data Exchange in

Metabolomics

Page 40: Building an efficient infrastructure, standards and data flow for metabolomics

COSMOS COrdination of Standards in MetabolOmicS

European FP7 coordination action coordinated by us at

EMBL-EBI, Hinxton, Cambridge

• Create missing standards & formats

• Define workflows for dissemination

• Create world-wide data network

Page 41: Building an efficient infrastructure, standards and data flow for metabolomics

MetabolomeXchange 2014

• Global network for exchange and discoverability of metabolomics data

• Includes study as well as reference data

Page 42: Building an efficient infrastructure, standards and data flow for metabolomics
Page 43: Building an efficient infrastructure, standards and data flow for metabolomics
Page 44: Building an efficient infrastructure, standards and data flow for metabolomics

The MetaboLights Reference Layer

Page 45: Building an efficient infrastructure, standards and data flow for metabolomics
Page 46: Building an efficient infrastructure, standards and data flow for metabolomics

•8.7 mio eukaryotic species on earth (+- 1.3mio)

Page 47: Building an efficient infrastructure, standards and data flow for metabolomics

•8.7 mio eukaryotic species on earth (+- 1.3mio)•1.2 mio species identified and classified

Page 48: Building an efficient infrastructure, standards and data flow for metabolomics

•8.7 mio eukaryotic species on earth (+- 1.3mio)•1.2 mio species identified and classified•3000 - 4000 complete species genomes sequenced

Page 49: Building an efficient infrastructure, standards and data flow for metabolomics

•8.7 mio eukaryotic species on earth (+- 1.3mio)•1.2 mio species identified and classified•3000 - 4000 complete species genomes sequenced

Page 50: Building an efficient infrastructure, standards and data flow for metabolomics

•8.7 mio eukaryotic species on earth (+- 1.3mio)•1.2 mio species identified and classified•3000 - 4000 complete species genomes sequenced

What about completed metabolomes?

Page 51: Building an efficient infrastructure, standards and data flow for metabolomics

•8.7 mio eukaryotic species on earth (+- 1.3mio)•1.2 mio species identified and classified•3000 - 4000 complete species genomes sequenced

What about completed metabolomes?

Page 52: Building an efficient infrastructure, standards and data flow for metabolomics

Typical total ion chromatogram of serum from a healthy subject.

Psychogios N, Hau DD, Peng J, Guo AC, Mandal R, et al. (2011) The Human Serum Metabolome. PLoS ONE 6(2): e16957. doi:10.1371/journal.pone.0016957 http://journals.plos.org/plosone/article?id=info:doi/10.1371/journal.pone.0016957

Page 53: Building an efficient infrastructure, standards and data flow for metabolomics

There are known knowns; there are things we know we know.We also know there are known unknowns; that is to say, we know there are some things we do not know.But there are also unknown unknowns – the ones we don’t know we don’t know.

—United States Secretary of Defense,

Donald Rumsfeld

Page 54: Building an efficient infrastructure, standards and data flow for metabolomics

Building upon extensive genomics research, we argue that the time is now right to focus intensively on model organism metabolomes. We propose a grand challenge for metabolomics studies of model organisms: to identify and map all metabolites onto metabolic pathways, to develop quantitative metabolic models for model organisms, and to relate organism metabolic pathways within the context of evolutionary metabolomics, i.e., phylometabolomics. These efforts should focus on a series of established model organisms in microbial, animal and plant research.

Metabolites. 2016 Feb 15;6(1)

Page 55: Building an efficient infrastructure, standards and data flow for metabolomics

Species Metabolomes are being assembled on the fly

right now through data sharing in Metabolomics

Page 56: Building an efficient infrastructure, standards and data flow for metabolomics

Repository Entry

Page 57: Building an efficient infrastructure, standards and data flow for metabolomics

Repository Entry

Page 58: Building an efficient infrastructure, standards and data flow for metabolomics

Reference Layer

Page 59: Building an efficient infrastructure, standards and data flow for metabolomics
Page 60: Building an efficient infrastructure, standards and data flow for metabolomics
Page 61: Building an efficient infrastructure, standards and data flow for metabolomics
Page 62: Building an efficient infrastructure, standards and data flow for metabolomics

7 most annotated metabolomes in MetaboLights

Page 63: Building an efficient infrastructure, standards and data flow for metabolomics

2013201420152016

Page 64: Building an efficient infrastructure, standards and data flow for metabolomics
Page 65: Building an efficient infrastructure, standards and data flow for metabolomics

Funding and CollaboratorsUK Research Councils (BBSRC, MRC) European Commission

Page 66: Building an efficient infrastructure, standards and data flow for metabolomics
Page 67: Building an efficient infrastructure, standards and data flow for metabolomics

Slides on http://www.slideshare.net/csteinbeck

Page 68: Building an efficient infrastructure, standards and data flow for metabolomics

[email protected]

Thank you!