Post on 12-Jan-2017
Model management tools for improved reproducibility in systems biology
Dagmar Waltemath, on behalf of the SEMS team
University of Rostock, Germany
10th International CellML Workshop Auckland, June 2016
2
On models and simulations
Model Simulation
Figs: BioModels (top) and DOI: 10.1073/pnas.88.16.7328 (bottom)
3
Most scientific discoveries rely on previous findings.
Model
Fig.: Tyson 2001 (BIOM195)
Fig.: Tyson 1991 (BIOM005)
Successor
Fig.: History of Cell Cycle models in BioModels
4
Can we rely on findings that we ourselves cannot evaluate? (Probably not!)
“only in ~20–25% of the projects were the relevant published data completely in line with our in-house findings (Fig. 1c). In almost two-thirds of the projects, there were inconsistencies [..] that either considerably prolonged the duration of the target validation process or, in most cases, resulted in termination of the projects because the evidence [..] was insufficient to justify further investments into these projects.” Prinz et al (2011)
5
We identified key challenges of reproducibility in systems biology and systems medicine.
Lack of data standards – Lack of data quality and quantity – Lack of data availability – Lack of transparency
6
A lack of data availability makes it impossible for researchers to reproduce results.
● Model code in BioModels, including supplementary material with a how-to for reproducing the figures given in the original paper
● Online tool makes data available and browseable
TriplexRNA
Recon 2
● Publication backed up with a website containing the supplemental material
● Model code in (non-curated) BioModels
● Visualisation of the model can easily be explored
● References to original works
How can we support scientists who wish to share model-based results?
Issues
– Simulation studies comprise several files
– Data is heterogeneous, distributed, complex
– Data changes over time
– Documentation of how the study was performed is often missing
7
Our solutions
– Tool support for the COMBINE Archive: lowering the effort to share reproducible models
– Graph-based storage of model-related files: integrated & searchable virtual experiments
– Model version control: towards a provenance of models
8
The COMBINE archive bundles all files necessary to reproduce a simulation study.
COMBINE archive toolkit
● manage COMBINE archives
– Explore
– Edit
– Share
– Publish
● Used in: PMR 2, JWS Online, SED-ML Web Tools, OpenCOR …
WebCAT, Scharm et al 2014
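To make the bundling idea concrete: a COMBINE archive (OMEX file) is a ZIP container with a `manifest.xml` at its root that lists every file together with a format identifier. The following is a minimal sketch using only the Python standard library, not the CombineArchive Toolkit's own API. The file names, the placeholder file contents, and the helper names `build_manifest` and `write_archive` are assumptions made for illustration.

```python
import io
import zipfile
from xml.etree import ElementTree as ET

OMEX_NS = "http://identifiers.org/combine.specifications/omex-manifest"
FORMAT_PREFIX = "http://identifiers.org/combine.specifications/"


def build_manifest(formats):
    """Return manifest.xml as bytes.

    `formats` maps a location inside the archive (e.g. "model.xml")
    to a short format key such as "sbml" or "sed-ml".
    """
    ET.register_namespace("", OMEX_NS)
    root = ET.Element(f"{{{OMEX_NS}}}omexManifest")
    # The archive itself and the manifest are listed alongside the payload files.
    listed = {".": "omex", "./manifest.xml": "omex-manifest"}
    listed.update({f"./{loc}": fmt for loc, fmt in formats.items()})
    for location, fmt in listed.items():
        ET.SubElement(
            root,
            f"{{{OMEX_NS}}}content",
            {"location": location, "format": FORMAT_PREFIX + fmt},
        )
    return ET.tostring(root, encoding="utf-8", xml_declaration=True)


def write_archive(target, files):
    """Write a ZIP with a manifest; `files` maps location -> (format key, content)."""
    formats = {loc: fmt for loc, (fmt, _) in files.items()}
    with zipfile.ZipFile(target, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr("manifest.xml", build_manifest(formats))
        for location, (_, content) in files.items():
            archive.writestr(location, content)


if __name__ == "__main__":
    # Hypothetical study: one model file and one simulation description.
    study = {
        "model.xml": ("sbml", "<sbml/>"),
        "experiment.sedml": ("sed-ml", "<sedML/>"),
    }
    buffer = io.BytesIO()  # pass a path such as "study.omex" to write to disk instead
    write_archive(buffer, study)
    with zipfile.ZipFile(buffer) as archive:
        print(archive.namelist())
```

A real archive would typically also carry metadata and figures; the sketch only shows the container and manifest structure.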
9
STON, SED-ML DB & MASYMOS
Integrated storage & retrieval system (MASYMOS)
doi: 10.1093/database/bau130
doi: 10.1186/s13326-015-0014-4
Search across heterogeneous data, ontologies, and structures (→ poster)
Tailor-made storage systems (STON, SED-ML DB)
Using graph databases to integrate standardised model-based data
https://dx.doi.org/10.6084/m9.figshare.3382993.v1
SED-ML DB in JWS Online
BioModels, Physiome Model Repository
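The idea of integrating models, simulation descriptions, and ontology terms in one graph can be illustrated with a toy example. The sketch below is a small in-memory graph in plain Python, not the MASYMOS schema or its query interface. All node identifiers, relation names, and the `ModelGraph` class are assumptions made for illustration.

```python
from collections import defaultdict


class ModelGraph:
    """Toy in-memory property graph; node ids and relation names are made up."""

    def __init__(self):
        self.nodes = {}               # node id -> {"kind": ..., other properties}
        self.out = defaultdict(set)   # source id -> {(relation, target id)}
        self.inc = defaultdict(set)   # target id -> {(relation, source id)}

    def add(self, node_id, kind, **props):
        self.nodes[node_id] = {"kind": kind, **props}

    def link(self, source, relation, target):
        self.out[source].add((relation, target))
        self.inc[target].add((relation, source))

    def sources(self, target, relation):
        """All nodes that point at `target` via `relation`."""
        return {s for r, s in self.inc[target] if r == relation}

    def studies_for_term(self, term):
        """Return {model id: [simulation ids]} for models annotated with `term`."""
        hits = {}
        for species in self.sources(term, "annotated_with"):
            for model in self.sources(species, "has_species"):
                hits[model] = sorted(self.sources(model, "simulates"))
        return hits


if __name__ == "__main__":
    g = ModelGraph()
    g.add("model:cell_cycle_v1", "model")
    g.add("model:other", "model")
    g.add("species:cyclin", "species")
    g.add("species:glucose", "species")
    g.add("term:cyclin", "ontology_term")
    g.add("sim:timecourse", "simulation")

    g.link("model:cell_cycle_v1", "has_species", "species:cyclin")
    g.link("model:other", "has_species", "species:glucose")
    g.link("species:cyclin", "annotated_with", "term:cyclin")
    g.link("sim:timecourse", "simulates", "model:cell_cycle_v1")

    # One traversal answers "which models and simulations involve this term?"
    print(g.studies_for_term("term:cyclin"))
```

Storing the links explicitly is what lets a single traversal cross from an ontology term to a model and on to its simulation description, which separate per-format stores would need joins or custom glue code to answer.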
10
BiVeS & COMODI
Model version control (BiVeS, COMODI)
Provenance-to-be (COMODI)
Tracking the evolution of a CellML/SBML model over time
doi: 10.1093/bioinformatics/btv484
Tracking the evolution of simulation studies and biological systems.
https://dx.doi.org/10.6084/m9.figshare.2543059.v5
Physiome Model repository
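What tracking a model's evolution means at the file level can be shown with a deliberately naive comparison of two versions of an SBML-like document. This sketch matches species by their `id` attribute only and is not the BiVeS algorithm, which handles full XML trees. The two model snippets, the species ids, and the function names are assumptions made for illustration.

```python
from xml.etree import ElementTree as ET

# Two hypothetical versions of the same model (ids and values are made up).
OLD = """<sbml xmlns="http://www.sbml.org/sbml/level2/version4"><model id="m">
  <listOfSpecies>
    <species id="cyclin" initialConcentration="0.1"/>
    <species id="cdc2" initialConcentration="1.0"/>
  </listOfSpecies>
</model></sbml>"""

NEW = """<sbml xmlns="http://www.sbml.org/sbml/level2/version4"><model id="m">
  <listOfSpecies>
    <species id="cyclin" initialConcentration="0.2"/>
    <species id="wee1" initialConcentration="0.5"/>
  </listOfSpecies>
</model></sbml>"""


def species_by_id(xml_text):
    """Map species id -> attribute dict, ignoring the XML namespace."""
    root = ET.fromstring(xml_text)
    return {
        el.get("id"): dict(el.attrib)
        for el in root.iter()
        if el.tag.rsplit("}", 1)[-1] == "species"
    }


def diff_versions(old_xml, new_xml):
    """Classify species as inserted, deleted, or updated between two versions."""
    old, new = species_by_id(old_xml), species_by_id(new_xml)
    return {
        "inserted": sorted(new.keys() - old.keys()),
        "deleted": sorted(old.keys() - new.keys()),
        "updated": sorted(k for k in old.keys() & new.keys() if old[k] != new[k]),
    }


if __name__ == "__main__":
    print(diff_versions(OLD, NEW))
```

Matching by id breaks as soon as an entity is renamed, which is one reason a dedicated model-aware diff tool is useful.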
11
What's next? Models for the clinic, or: Bridging the gap between standards for systems biology & systems medicine
Fig. courtesy Atalag et al (2015) http://hdl.handle.net/2292/27911
Thank you for your attention.
@SemsProject
Martin Scharm: BiVeS, COMODI, COMBINE Archive Video master
Fabienne Lambusch: Pattern & structure search in SBML models
Mariam Nassar: Rank aggregation
Tom Gebhardt: SBGN-compliant diffs
Martin Peters: M2CAT, COMBINE Archive, SED-ML database
Vasundra Toure: STON, SBGN-ED, SBGN symbol of the month
Ron Henkel: MASYMOS, MORRE
www.sems.uni-rostock.de
References
Atalag et al. (2015): http://hdl.handle.net/2292/27911
Bergmann et al. (2014): F.T. Bergmann, R. Adams, S. Moodie, J. Cooper, M. Glont et al. COMBINE archive and OMEX format: one file to share all information to reproduce a modeling project. BMC Bioinformatics (2014).
Prinz et al. (2011): F. Prinz, T. Schlange, and K. Asadullah. Believe it or not: how much can we rely on published data on potential drug targets? Nature Reviews Drug Discovery 10(9) (2011): 712.
Schmitz et al. (2014): U. Schmitz et al. Cooperative gene regulation by microRNA pairs and their identification using a computational workflow. Nucleic Acids Research (2014): gku465.
Thiele et al. (2013): I. Thiele et al. A community-driven global reconstruction of human metabolism. Nature Biotechnology 31(5) (2013): 419-425.
Waltemath & Scharm (2014): D. Waltemath and M. Scharm. Extracting reproducible simulation studies from model repositories using the CombineArchive Toolkit. Workshop on Data Management for the Life Sciences, BTW 2014, Hamburg (2014).
Waltemath & Wolkenhauer (2016): D. Waltemath and O. Wolkenhauer. How modeling standards, software, and initiatives support reproducibility in systems biology and systems medicine. IEEE Transactions on Biomedical Engineering (2016), in press.