falk schreiber - from big data to smart knowledge ‐ integrating multimodal biological data and...
Post on 10-May-2015
Martin Luther University Halle-Wittenberg
Falk Schreiber
From Big Data to Smart Knowledge
Integrating Multimodal Biological Data and Modelling Metabolism
14/07/2014
Leibniz Institute IPK Gatersleben
Observations
1. A tidal wave of scientific data
Year    Time            Costs (Mio. US$)
2003    13 years        2700
2007    a few months    1
2009    a few weeks     0.05
2014    a few days      0.001
~2017   cheaper to reproduce data than to store it
Observations
1. A tidal wave of scientific data
2. From building blocks to complex systems
   genes, transcripts, proteins, metabolites
   reductionist approach; integrative approach
3. Multi-domain data
From Data to Knowledge – Outline of the Talk
1. Understanding metabolism via modelling
2. Integrating and exploring multimodal biological data
Metabolism
• Network of thousands of biochemical reactions
  - Enzyme-catalysed
  - Transporter-mediated
• Supports all biological activity
• Metabolic model = List of reactions + associated information
Source: http://www.genome.jp/kegg/
Source: Michael 1993
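To make "list of reactions + associated information" concrete, here is a minimal sketch of one way such a model could be stored and turned into a stoichiometric matrix. The reaction ids, metabolite names and the `enzyme` field are hypothetical placeholders, not taken from the talk or from KEGG.

```python
# Hypothetical toy model: each reaction carries its stoichiometry
# (negative = consumed, positive = produced) plus associated information.
reactions = [
    {"id": "uptake", "stoich": {"A": 1}, "enzyme": None},
    {"id": "r1", "stoich": {"A": -1, "B": 1}, "enzyme": "E1"},
    {"id": "export", "stoich": {"B": -1}, "enzyme": None},
]


def stoichiometric_matrix(reactions):
    """Return (metabolites, S) with one row per metabolite, one column per reaction."""
    metabolites = sorted({m for r in reactions for m in r["stoich"]})
    matrix = [[r["stoich"].get(m, 0) for r in reactions] for m in metabolites]
    return metabolites, matrix


metabolites, S = stoichiometric_matrix(reactions)
for name, row in zip(metabolites, S):
    print(name, row)
```

The matrix form is what stoichiometric methods such as flux balance analysis operate on; the remaining fields stay available for annotation.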
Metabolic Models
[Figure: modelling approaches arranged by size of model versus level of detail]
• Topological analysis, Petri net (P/T) analysis: network structure
• Flux balance analysis (FBA): + stoichiometry, + thermodynamics, + mass balance, + capacity constraints
• Kinetic modelling, Petri net (SPN) analysis: + kinetic rate laws, + kinetic parameters, + stochastic rate laws, + metabolite concentrations
Flux Balance Analysis
• Constraint-based stoichiometric modelling approach to predict and analyse the metabolic steady-state conversion rates (fluxes)
• Advantages
  - No kinetic parameters required
  - Quantitative predictions
  - Applicable to large systems
• Applications
  - Prediction of optimal metabolic yields and flux distributions
  - Prediction of phenotype/viability of knockout mutants
  - Prediction of pathway redundancies
  - And more
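The idea can be sketched on a toy network: maximise one flux subject to the steady-state condition S·v = 0 and capacity bounds. Everything below is an assumption made for illustration: the four-reaction network, the bound of 10 on uptake, irreversible reactions only (v ≥ 0), and a small textbook simplex routine standing in for the dedicated LP solvers that real FBA tools use.

```python
from fractions import Fraction


def simplex_max(c, A, b):
    """Maximise c.x subject to A.x <= b and x >= 0, assuming every b[i] >= 0."""
    m, n = len(A), len(c)
    # Tableau rows: constraint coefficients, slack identity, right-hand side.
    T = [[Fraction(v) for v in A[i]]
         + [Fraction(int(i == j)) for j in range(m)]
         + [Fraction(b[i])] for i in range(m)]
    T.append([Fraction(-v) for v in c] + [Fraction(0)] * (m + 1))
    basis = [n + i for i in range(m)]
    while True:
        # Bland's rule: enter the lowest-index column with a negative cost.
        col = next((j for j in range(n + m) if T[m][j] < 0), None)
        if col is None:
            break
        rows = [i for i in range(m) if T[i][col] > 0]
        if not rows:
            raise ValueError("objective is unbounded")
        row = min(rows, key=lambda i: (T[i][-1] / T[i][col], basis[i]))
        pivot = T[row][col]
        T[row] = [v / pivot for v in T[row]]
        for i in range(m + 1):
            if i != row and T[i][col] != 0:
                f = T[i][col]
                T[i] = [a - f * r for a, r in zip(T[i], T[row])]
        basis[row] = col
    x = [Fraction(0)] * n
    for i, j in enumerate(basis):
        if j < n:
            x[j] = T[i][-1]
    return x


def fba(S, upper, objective):
    """Maximise objective.v subject to S.v = 0 and 0 <= v <= upper."""
    n = len(objective)
    A, b = [], []
    for row in S:  # mass balance S.v = 0, written as two inequalities
        A.append(list(row)); b.append(0)
        A.append([-v for v in row]); b.append(0)
    for j, ub in enumerate(upper):  # capacity constraints v_j <= ub
        if ub is not None:
            A.append([int(j == k) for k in range(n)]); b.append(ub)
    return simplex_max(objective, A, b)


# Hypothetical network: v0 uptake -> A, v1 A -> B, v2 B -> biomass, v3 A -> by-product
S = [[1, -1, 0, -1],   # metabolite A
     [0, 1, -1, 0]]    # metabolite B
objective = [0, 0, 1, 0]                           # maximise the biomass flux v2
wild_type = fba(S, [10, None, None, None], objective)
knockout = fba(S, [10, 0, None, None], objective)  # knock out v1 by forcing it to 0
print("wild type:", [float(v) for v in wild_type])
print("knockout biomass flux:", float(knockout[2]))
```

Setting a reaction's upper bound to zero is the usual way to simulate a knockout in this kind of model; comparing the optimal objective before and after gives a viability prediction.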
Principles of Flux Balance Analysis
Simulation
Oxygen level
Objective Function
How to identify plausible physiological states?
Question: What are the biochemical production capabilities?
Objective: Maximise metabolite product
Question: What is the maximal growth rate and biomass yield?
Objective: Maximise growth rate
Question: How efficiently can metabolism channel metabolites through the network?
Objective: Minimise the Euclidean norm
Question: What is the tradeoff between biomass production and metabolite overproduction?
Objective: Maximise biomass production for a given metabolite production
Question: How energetically efficient can metabolism operate?
Objective: Minimise ATP production or minimise nutrient uptake
History of FBA
Software Tools and Pipelines for FBA
• CellNetAnalyzer (CNA) http://www.mpi-magdeburg.mpg.de/projects/cna/cna.html
• COBRA Toolbox http://gcrg.ucsd.edu/downloads/COBRAToolbox
• FBA-SimVis http://fbasimvis.ipk-gatersleben.de
• Thiele et al. A protocol for generating a high-quality genome-scale metabolic reconstruction. Nature Protocols, 5(1): 93–121, 2010.
• Grafahrend-Belau et al. Plant metabolic pathways: databases and pipeline for stoichiometric analysis. In Agrawal and Rakwal (Eds.), Seed development: omics technologies toward improvement of seed quality and crop yield, Springer, 345–366, 2012.
FBA Model of Seed Metabolism in Hordeum vulgare
Grafahrend-Belau et al. Plant Physiology, 2009
Size: 257 reactions, 234 metabolites
Pathways: Glyc, TCA, PPP, oxP, Ferm, Rubisco, AA, Starch, CW, and others
Example of Model Application
• Imaging uncovers metabolic compartmentation
• Alanine synthesis mainly in central endosperm; alanine gradient reflects the local oxygen state
• Modelling purpose: elucidate the role of alanine metabolism
Source of images: L. Borisjuk and H. Rolletschek, IPK
Melkus et al. Plant Biotechnology Journal, 2011
Rolletschek et al. Plant Cell, 2011
Simulation of Region-specific Metabolism
A: Central endosperm (hypoxic)
B: Peripheral endosperm (aerobic)
Melkus et al. Plant Biotechnology Journal, 2011
Rolletschek et al. Plant Cell, 2011
Obtaining Parameters
• Influx
  - Quantification from video data
• Relation of substances in the same area
  - Multimodal alignment (Scharfe et al. BMC Bioinformatics, 2010; Fester et al. GCB, 2009)
• Biomass accumulation
  - Quantification from image series (Hartmann et al. BMC Bioinformatics, 2011)
Scaling up - Multi* and High Throughput Modelling
Coupling of Organ-specific FBA Models
Coupling of FBA and FSA Models
Müller et al. IEEE PMA, 2012
Grafahrend-Belau et al. Plant Physiology, 2013
High Throughput Modelling
• Path2Models: A pipeline to compute draft models
• >140,000 kinetic, logical and constraint-based models
Le Novère et al. BMC Systems Biology, 2013
From Data to Knowledge – Outline of the Talk
1. Understanding metabolism via modelling
2. Integrating and exploring multimodal biological data
Multi-domain Biological Data
Data Domains
Available Tools
Data Integration – A Major Problem (Example: Networks)
• Bridge the abyss!
• Many information resources can be utilized as IDMappers:
  - Web services, web sites (e.g. PICR, CRONOS, …)
  - Relational databases (e.g. STRING, PDD, …)
  - Flat files (e.g. KEGG, UniProt, …)
• Unified using the BridgeDB framework
Overview: Mehlhorn et al. TransID – the flexible identifier mapping service. Internat. Symp. Integrative Bioinformatics, 112–121, 2013.
The Data Linkage Graph
• Comprises a set of identifiers (nodes) and a set of identifier mappings (edges)
• Used to explore identifier interconnections
• Basis of the integration of biological networks
• Example: (Tair) (UniProt) (EC number)
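A data linkage graph of this kind can be sketched as an adjacency structure over identifiers, with a breadth-first search to explore which identifiers are connected. The identifiers below are hypothetical placeholders in the TAIR, UniProt and EC namespaces, and treating mappings as undirected is an assumption of this sketch rather than something stated in the talk.

```python
from collections import deque

# Hypothetical identifier mappings (edges); the identifiers are placeholders.
mappings = [
    ("TAIR:gene1", "UniProt:protA"),
    ("UniProt:protA", "EC:1.1.1.1"),
    ("TAIR:gene2", "UniProt:protB"),
]


def build_linkage_graph(mappings):
    """Nodes are identifiers; every mapping becomes an undirected edge."""
    graph = {}
    for a, b in mappings:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def reachable_identifiers(graph, start):
    """Breadth-first search for all identifiers linked to `start`."""
    seen, queue = {start}, deque([start])
    while queue:
        node = queue.popleft()
        for nxt in graph.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen - {start}


graph = build_linkage_graph(mappings)
linked = reachable_identifiers(graph, "TAIR:gene1")
print(sorted(linked))
```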
The Integrated Graph
• Composed of biological networks and the inferred identifier mappings as mapping edges
• Mapping edges represent identifier connections in the data linkage graph
• Example
[Figure: data linkage graph and integrated graph]
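As a rough illustration of the integrated graph, the sketch below merges two networks that use different identifier namespaces and adds a mapping edge wherever a known identifier mapping connects nodes that are both present. The networks, identifiers and the `integrate` helper are hypothetical and are not the implementation behind the talk.

```python
# Two hypothetical networks using different identifier namespaces.
networks = {
    "metabolic": [("EC:1.1.1.1", "EC:1.2.1.3")],   # consecutive pathway steps
    "regulatory": [("TAIR:tf1", "TAIR:gene1")],    # regulator -> target
}
# Hypothetical identifier mappings, e.g. as derived from a data linkage graph.
id_mappings = {"TAIR:gene1": ["EC:1.1.1.1"], "TAIR:gene9": ["EC:9.9.9.9"]}


def integrate(networks, id_mappings):
    """Return labelled edges: original network edges plus mapping edges."""
    edges, nodes = [], set()
    for name, net in networks.items():
        for u, v in net:
            edges.append((u, v, name))
            nodes.update((u, v))
    for src, targets in id_mappings.items():
        for tgt in targets:
            # Only link identifiers that actually occur in the networks.
            if src in nodes and tgt in nodes:
                edges.append((src, tgt, "mapping"))
    return edges


integrated = integrate(networks, id_mappings)
for edge in integrated:
    print(edge)
```

Labelling each edge with its origin keeps the mapping edges distinguishable from the biological interactions.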
Example
• Metabolic pathways: Glycolysis, Pyruvate metabolism from KEGG
• Gene regulatory network: Arabidopsis thaliana from Regulogs
Available Tools
Data, Mappings and Mapping Function
• Set of measurements
• Mappings with object path functions, which derive the relevant metadata and any set of graph element attributes
• Basis: ID Mappers
[Figure: mapping function 𝑚]
Rohn et al. Bioinformatics, 2011
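One simple reading of mapping measurements onto graph elements is sketched below: each measurement carries an identifier, an ID mapper translates it into the namespace of the graph, and the value is attached to the matching node as an attribute. The data, identifiers and the `map_measurements` helper are hypothetical, and this is not the VANTED API.

```python
# Hypothetical measurements, graph nodes and identifier mapper.
measurements = [
    {"id": "UniProt:protA", "value": 2.5},
    {"id": "UniProt:protX", "value": 0.7},
]
graph_nodes = {"TAIR:gene1": {}, "TAIR:gene2": {}}
id_mapper = {"UniProt:protA": "TAIR:gene1"}


def map_measurements(measurements, graph_nodes, id_mapper):
    """Attach values to graph nodes; return identifiers that could not be mapped."""
    unmapped = []
    for m in measurements:
        # Translate the identifier if a mapping exists, else try it unchanged.
        node = id_mapper.get(m["id"], m["id"])
        if node in graph_nodes:
            graph_nodes[node].setdefault("values", []).append(m["value"])
        else:
            unmapped.append(m["id"])
    return unmapped


unmapped = map_measurements(measurements, graph_nodes, id_mapper)
print(graph_nodes)
print("unmapped:", unmapped)
```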
Example of Integrated Data http://www.vanted.org
• The ABC(DE)-model of Arabidopsis thaliana floral organ specification
• Determination of floral organ identity depends on the combinatorial expression of floral homeotic genes from different classes
• Integration of color-coded images, representing floral homeotic gene expression patterns, into the context of a regulatory network
Junker et al. Frontiers in Plant Science, 2012.
Standards for Modelling and Simulation in SysBio
Can You Understand This?
Stimulates? But ... what exactly?
Associates into?
Translocates?
Reciprocal stimulation?
Is degraded?
Stimulates gene transcription?
Ambiguity in Conventional Representation
Standardised Symbols are Important
Most English-speaking countries, Quebec, Iran, China, Israel, Singapore, Norway, Poland, USA and Canada
What is SBGN?
• A way to unambiguously describe biochemical and cellular events in graphs
• Limited number of symbols (~30) → smooth learning curve
• Can graphically represent quantitative models and biochemical pathways at different levels of granularity
• Developed since 2006 by an interdisciplinary community, part of COMBINE
• Three languages
  - Process Descriptions → one state = one glyph
  - Entity Relationships → one entity = one glyph
  - Activity Flow → conceptual level
Graph Trinity: Three Languages in One http://sbgn.org
Process Description maps
• Unambiguous
• Mechanistic
• Sequential
• Combinatorial explosion
Entity Relationships maps
• Unambiguous
• Mechanistic
• Non-Sequential
Activity Flow maps
• Ambiguous
• Conceptual
• Sequential
Le Novère et al. Nature Biotechnology, 2009
Systems Biology Graphical Notation (SBGN)
Working with SBGN http://www.sbgn-ed.org
• Verification: Czauderna et al. Bioinformatics, 2010
• Synthesis / bricks: Junker et al. Trends in Biotechnology, 2012
• Translation: Czauderna et al. BMC Bioinformatics, 2013
• Layout: Schreiber et al. BMC Bioinformatics, 2009; Dwyer et al. IEEE Transactions Visualization & Computer Graphics, 2008
• Data integration: Junker et al. Nature Protocols, 2012
Modelling, Visual Analytics, Standards, Network Analysis
Optimise
Predict
visualise, explore, integrate, analyse, model; present, understand; simulate, predict
Thank You
“We now have unprecedented ability to collect data about nature but there is now a crisis developing in biology, in that completely unstructured information does not enhance understanding. We need a framework to put all of this knowledge and data into - that is going to be the problem in biology. […] Driving toward that framework is really the big challenge.” Sydney Brenner