Model management tools for improved reproducibility in systems biology

Dagmar Waltemath, on behalf of the SEMS team, University of Rostock, Germany. 10th International CellML Workshop, Auckland, June 2016.


TRANSCRIPT

Page 1: Model management tools for improved reproducibility in systems biology

Model management tools for improved reproducibility in systems biology

Dagmar Waltemath, on behalf of the SEMS team

University of Rostock, Germany

10th International CellML Workshop Auckland, June 2016

Page 2: Model management tools for improved reproducibility in systems biology

On models and simulations

Model Simulation

Figs: BioModels (top) and DOI: 10.1073/pnas.88.16.7328 (bottom)

Page 3: Model management tools for improved reproducibility in systems biology

Most scientific discoveries rely on previous findings.

Model

Fig.: Tyson 2001 (BIOM195)

Fig.: Tyson 1991 (BIOM005)

Successor

Fig.: History of Cell Cycle models in BioModels

Page 4: Model management tools for improved reproducibility in systems biology

Can we rely on findings that we ourselves cannot evaluate? (Probably not!)

“only in ~20–25% of the projects were the relevant published data completely in line with our in-house findings (Fig. 1c). In almost two-thirds of the projects, there were inconsistencies [..] that either considerably prolonged the duration of the target validation process or, in most cases, resulted in termination of the projects because the evidence [..] was insufficient to justify further investments into these projects.” Prinz et al. (2011)

Page 5: Model management tools for improved reproducibility in systems biology

We identified key challenges of reproducibility in systems biology and systems medicine.

– Lack of data standards

– Lack of data quality and quantity

– Lack of data availability

– Lack of transparency

Page 6: Model management tools for improved reproducibility in systems biology

A lack of data availability makes it impossible for researchers to reproduce results.

● Model code in BioModels, including supplemental with a how-to reproduce the figures given in the original paper

● Online tool makes data available and browseable

TriplexRNA

Recon 2

● Publication backed up with a website containing the supplemental material

● Model code in (non-curated) BioModels

● Visualisation of the model can easily be explored

● References to original works

How can we support scientists who wish to share model-based results?

Issues

– Simulation studies comprise several files

– Data is heterogeneous, distributed, complex

– Data changes over time

– Documentation of how the study was performed is often missing

Page 7: Model management tools for improved reproducibility in systems biology

A lack of data availability makes it impossible for researchers to reproduce results.

Our solutions

– Tool support for the COMBINE Archive: lowering the effort to share reproducible models

– Graph-based storage of model-related files: integrated & searchable virtual experiments

– Model version control: towards a provenance of models

Page 8: Model management tools for improved reproducibility in systems biology

The COMBINE archive bundles all files necessary to reproduce a simulation study.

COMBINE archive toolkit

● manage COMBINE archives

– Explore

– Edit

– Share

– Publish

● Used in: PMR2, JWS Online, SED-ML Web Tools, OpenCOR …

WebCAT, Scharm et al 2014
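
To make the bundling idea concrete, the following is a minimal sketch of how a COMBINE archive could be assembled by hand: a ZIP file containing the study files plus a `manifest.xml` that lists each entry with a format URI. It uses only the Python standard library and is not the CombineArchive Toolkit's own code. The file names and the placeholder file contents are hypothetical, and the format URIs are written as we understand them from the OMEX specification, so they should be checked against it before real use.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Namespace and format URIs as we understand them from the OMEX specification
# (an assumption of this sketch; verify against the specification).
OMEX_NS = "http://identifiers.org/combine.specifications/omex-manifest"
FORMATS = {
    "archive": "http://identifiers.org/combine.specifications/omex",
    "manifest": OMEX_NS,
    "sbml": "http://identifiers.org/combine.specifications/sbml",
    "sedml": "http://identifiers.org/combine.specifications/sed-ml",
}


def build_manifest(entries):
    """Return manifest.xml as bytes, with one <content> element per (location, format) pair."""
    ET.register_namespace("", OMEX_NS)
    root = ET.Element(f"{{{OMEX_NS}}}omexManifest")
    for location, fmt in entries:
        ET.SubElement(root, f"{{{OMEX_NS}}}content", {"location": location, "format": fmt})
    return ET.tostring(root, encoding="utf-8", xml_declaration=True)


def write_archive(files):
    """Pack {name: (data, format_uri)} and a generated manifest into an in-memory ZIP."""
    entries = [(".", FORMATS["archive"]), ("./manifest.xml", FORMATS["manifest"])]
    entries += [(f"./{name}", fmt) for name, (_, fmt) in files.items()]
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("manifest.xml", build_manifest(entries))
        for name, (data, _) in files.items():
            archive.writestr(name, data)
    return buffer.getvalue()


# Hypothetical study: the file contents are placeholders, not valid SBML or SED-ML.
omex = write_archive({
    "model.xml": (b"<sbml/>", FORMATS["sbml"]),
    "experiment.sedml": (b"<sedML/>", FORMATS["sedml"]),
})
```

Writing the returned bytes to a file with the `.omex` extension would give a single file that carries the model, the simulation description, and the manifest describing both.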

Page 9: Model management tools for improved reproducibility in systems biology

STON, SED-ML DB & MASYMOS

Integrated storage & retrieval system (MASYMOS)

doi: 10.1093/database/bau130

doi: 10.1186/s13326-015-0014-4

Search across heterogeneous data, ontologies, and structures → poster

Tailor-made storage systems (STON, SED-ML DB)

Using graph databases to integrate standardised model-based data

https://dx.doi.org/10.6084/m9.figshare.3382993.v1

SED-ML DB in JWS Online

BioModels

Physiome Model repository
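
The idea of linking models, simulation descriptions, and ontology terms in one graph can be sketched with a small in-memory structure. This is an illustration only: it does not reproduce the MASYMOS schema or a graph database query language, and all node identifiers and relation names (`ANNOTATED_WITH`, `SIMULATES`) are hypothetical.

```python
from collections import defaultdict


class Graph:
    """A tiny typed graph: nodes carry a label, edges carry a relation name."""

    def __init__(self):
        self.labels = {}                  # node id -> node type
        self.reverse = defaultdict(set)   # (target id, relation) -> source ids

    def add_node(self, node_id, label):
        self.labels[node_id] = label

    def add_edge(self, source, relation, target):
        if source not in self.labels or target not in self.labels:
            raise KeyError("both endpoints must be added as nodes first")
        self.reverse[(target, relation)].add(source)

    def incoming(self, target, relation):
        """Return the sorted ids of all nodes pointing at target via relation."""
        return sorted(self.reverse.get((target, relation), ()))


def experiments_for_term(graph, term):
    """Find models annotated with an ontology term and the simulations that use them."""
    return {
        model: graph.incoming(model, "SIMULATES")
        for model in graph.incoming(term, "ANNOTATED_WITH")
    }


# Hypothetical content: three models, one simulation description, two ontology terms.
g = Graph()
for node_id, label in [
    ("model:a", "Model"), ("model:b", "Model"), ("model:c", "Model"),
    ("sim:1", "SimulationDescription"),
    ("term:cell_cycle", "OntologyTerm"), ("term:glycolysis", "OntologyTerm"),
]:
    g.add_node(node_id, label)

g.add_edge("model:a", "ANNOTATED_WITH", "term:cell_cycle")
g.add_edge("model:b", "ANNOTATED_WITH", "term:cell_cycle")
g.add_edge("model:c", "ANNOTATED_WITH", "term:glycolysis")
g.add_edge("sim:1", "SIMULATES", "model:a")

hits = experiments_for_term(g, "term:cell_cycle")
```

Because annotations and simulation links live in the same structure, one traversal answers a question that would otherwise require searching several separate repositories and file formats.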

Page 10: Model management tools for improved reproducibility in systems biology

BiVeS & COMODI

Model version control (BiVeS, COMODI)

Provenance-to-be (COMODI)

Tracking the evolution of a CellML/SBML model over time

doi: 10.1093/bioinformatics/btv484

Tracking the evolution of simulation studies and biological systems.

https://dx.doi.org/10.6084/m9.figshare.2543059.v5

Physiome Model repository

doi: 10.1093/bioinformatics/btv484
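
What tracking the evolution of a model means at the file level can be illustrated with a deliberately simplified comparison of two versions. The sketch below matches XML elements by their `id` attribute and reports insertions, deletions, and attribute updates. It is not the BiVeS algorithm, which maps whole XML trees and also detects moves, and the two model snippets are hypothetical, simplified XML rather than valid SBML or CellML.

```python
import xml.etree.ElementTree as ET


def index_by_id(xml_text):
    """Map each element's id attribute to (tag, attributes) for one model version."""
    root = ET.fromstring(xml_text)
    return {el.get("id"): (el.tag, dict(el.attrib)) for el in root.iter() if el.get("id")}


def diff_versions(old_xml, new_xml):
    """Classify elements as deleted, inserted, or updated between two versions."""
    old, new = index_by_id(old_xml), index_by_id(new_xml)
    return {
        "deleted": sorted(old.keys() - new.keys()),
        "inserted": sorted(new.keys() - old.keys()),
        "updated": sorted(k for k in old.keys() & new.keys() if old[k] != new[k]),
    }


# Hypothetical versions of one model; ids and values are illustrative only.
OLD = """<model id="m">
  <species id="S1" initialAmount="0.1"/>
  <species id="S2" initialAmount="1.0"/>
  <parameter id="k1" value="0.015"/>
</model>"""

NEW = """<model id="m">
  <species id="S1" initialAmount="0.2"/>
  <parameter id="k1" value="0.015"/>
  <parameter id="k2" value="0.5"/>
</model>"""

changes = diff_versions(OLD, NEW)
```

Here `changes` reports `S2` as deleted, `k2` as inserted, and `S1` as updated. Matching by `id` alone breaks down when identifiers are renamed between versions, which is one reason a structure-aware mapping is needed in practice.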

Page 11: Model management tools for improved reproducibility in systems biology

What's next? Models for the clinic, or: Bridging the gap between standards for systems biology & systems medicine

Fig. courtesy Atalag et al (2015) http://hdl.handle.net/2292/27911

Page 12: Model management tools for improved reproducibility in systems biology

Thank you for your attention.

@SemsProject

Martin Scharm: BiVeS, COMODI, COMBINE Archive, Video master

Fabienne Lambusch: Pattern & structure search in SBML models

Mariam Nassar: Rank aggregation

Tom Gebhardt: SBGN-compliant diffs

Martin Peters: M2CAT, COMBINE Archive, SED-ML database

Vasundra Toure: STON, SBGN-ED, SBGN symbol of the month

Ron Henkel: MASYMOS, MORRE

www.sems.uni-rostock.de

Page 13: Model management tools for improved reproducibility in systems biology

References

Atalag et al (2015) http://hdl.handle.net/2292/27911

Bergmann et al. (2014) F.T. Bergmann, R. Adams, S. Moodie, J. Cooper, M. Glont et al.: COMBINE archive and OMEX format: one file to share all information to reproduce a modeling project. BMC Bioinformatics (2014)

Prinz et al. (2011) Prinz, Florian, Thomas Schlange, and Khusru Asadullah. "Believe it or not: how much can we rely on published data on potential drug targets?." Nature reviews Drug discovery 10.9 (2011): 712-712.

Schmitz et al. (2014) Schmitz, Ulf, et al. "Cooperative gene regulation by microRNA pairs and their identification using a computational workflow." Nucleic acids research (2014): gku465.

Thiele et al. (2013) Thiele, Ines, et al. "A community-driven global reconstruction of human metabolism." Nature biotechnology 31.5 (2013): 419-425.

Waltemath & Scharm (2014) D. Waltemath and M. Scharm: Extracting reproducible simulation studies from model repositories using the CombineArchive Toolkit. Workshop on Data Management for the Life Sciences (2014), Hamburg, BTW 2014.

Waltemath & Wolkenhauer (2016) D. Waltemath and O. Wolkenhauer: How modeling standards, software, and initiatives support reproducibility in systems biology and systems medicine. IEEE Transactions on Biomedical Engineering (2016), in press.