The beauty of workflows and models Workflows for research. Reproducible research. Professor Carole Goble The University of Manchester, UK The Software Sustainability Institute [email protected] @caroleannegoble RDMF Meeting, Westminster, 20 June 2014


The beauty of workflows and models

Workflows for research.

Reproducible research.

Professor Carole Goble

The University of Manchester, UK

The Software Sustainability Institute

[email protected]

@caroleannegoble

RDMF Meeting, Westminster, 20 June 2014

Scientific publications have at least two goals: (i) to announce a result and (ii) to convince readers that the result is correct … papers in experimental science should describe the results and provide a clear enough protocol to allow successful repetition and extension

Jill Mesirov, Accessible Reproducible Research, Science 22 Jan 2010: 327(5964): 415-416 DOI: 10.1126/science.1179653

Virtual Witnessing*

*Leviathan and the Air-Pump: Hobbes, Boyle, and the Experimental Life (1985) Shapin and Schaffer.


“An article about computational science in a scientific publication is not the scholarship itself, it is merely advertising of the scholarship. The actual scholarship is the complete software development environment, [the complete data] and the complete set of instructions which generated the figures.”

David Donoho, “Wavelab and Reproducible Research,” 1995

datasets, data collections, standard operating procedures, software, algorithms, configurations, tools and apps, codes, workflows, scripts, code libraries, services, system software, infrastructure, compilers, hardware

Morin et al Shining Light into Black Boxes Science 13 April 2012: 336(6078) 159-160

Ince et al The case for open computer programs, Nature 482, 2012


“Executable Data”


Biodiversity marine monitoring and health assessment

ecological niche modelling

Data Intensive Science

Collaborative Science

Pilumnus hirtellus Enclosed sea problem (Ready et al., 2010)

Sarah Bourlat http://www.biovel.eu

Analytical cycle

Data collection

Data discovery

Data assembly, cleaning, and refinement

Ecological Niche Modeling

Statistical analysis

Insights

Scholarly Communication & Reporting


BioSTIF

method

instruments and laboratory

Workflows:

• capture the steps

• assembly & interoperability

• shielding & optimising

• flexible variant reuse

• pipelines & exploration

• repetition & comparison

• record & set-up

• provenance collection

• report & embed

• multi-code and multi-resource experiments in-house and external

• workflow management systems
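To make "capture the steps" and "provenance collection" concrete, here is a minimal sketch of a workflow runner that logs each step as it executes. It is not how Taverna or any other workflow system works internally; `run_workflow`, `fingerprint` and the two toy steps are hypothetical names chosen for illustration.

```python
import hashlib
import json
from datetime import datetime, timezone


def fingerprint(value):
    """Return a short, stable hash of a JSON-serialisable value."""
    encoded = json.dumps(value, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()[:12]


def run_workflow(steps, data):
    """Run each (name, function) step in order and log what happened."""
    provenance = []
    for name, func in steps:
        result = func(data)
        provenance.append({
            "step": name,
            "input": fingerprint(data),
            "output": fingerprint(result),
            "finished": datetime.now(timezone.utc).isoformat(),
        })
        data = result  # the output of one step feeds the next
    return data, provenance


# Hypothetical steps standing in for real analysis codes.
def drop_missing(records):
    return [r for r in records if r is not None]


def mean(records):
    return sum(records) / len(records)


STEPS = [("drop_missing", drop_missing), ("mean", mean)]

if __name__ == "__main__":
    result, log = run_workflow(STEPS, [1.0, None, 3.0])
    print(result)
    print(json.dumps(log, indent=2))
```

Because every log entry carries an input and an output fingerprint, two runs can be compared step by step, which is the kind of repetition and comparison the slide lists.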

materials

http://www.taverna.org.uk

Scientific Workflow Management Systems

Application

Infrastructure

Generalist

Specialist


Systems Biology

Modelling Cycle


Virtual Physiological Human Morphology

Microbiology Metabolic Pathways

http://www.vph-share.eu/

standards, standards, standards


Data

Models

Articles

External

Databases

http://www.seek4science.org

Metadata

http://www.isatools.org

Aggregated Content Infrastructure

sharing and interlinking multi-stewarded, mixed methods, models, data, samples…


Preservation Planning & Watch continuous preservation management

Environment and users

Repository

access ingest harvest

Monitored environment and users Watch

Planning Operations

create/ reevaluate plans

deploy plan

monitored actions

execute action plan

policies

SCOUT

c3po

PLATO

Taverna workflows

RODA

Long term preservation of digital data. Maintaining scans of newspapers, books, records of data; Metadata maintenance; large and automated.

Preservation Policy: Collection level

Control Policy: Low level actions & constraints

http://www.scape-project.eu/


Merge a Preservation Action Plan….

… with an Access Workflow

Execution Workflow

Preservation Planning & Watch

Publish

Use

Components

Workflows (and Scripts and Models) are….

…provenance of data

…general technique for describing and enacting a process

…precise, unambiguous, transparent protocols and records.

…often complex, so they need explaining.

…often challenging and expensive to develop.

…know-how and best practice.

…collaborations.

…first class citizens of research

…support the process of research


Workflow publishing

[Scott Edmunds]

Publishing

Journals

Portals

Integrative Frameworks

galaxyproject.org/‎


Reproducibility = Hard Work

Code in sourceforge under GPLv3:

http://soapdenovo2.sourceforge.net/

>5000 downloads

http://homolog.us/wiki/index.php?title=SOAPdenovo2

Data sets

Analyses

Open-Paper

Open-Review

DOI:10.1186/2047-217X-1-18

>11000 accesses

Open-Code

8 reviewers tested data in ftp server & named reports published

DOI:10.5524/100044

Open-Pipelines

Open-Workflows

DOI:10.5524/100038

Open-Data

78GB CC0 data

Enabled code to be picked apart by bloggers in wiki

[Scott Edmunds]


http://ropensci.org

Repositories Libraries

Registries


Archiving Publishing Component Libraries Preserving

Recording Storing Exchanging Versioning Sharing PACKS

Repositories


Data Operations in Workflows in the Wild

Analysis of 260 publicly available workflows in Taverna, WINGS, Galaxy and Vistrails

Garijo et al Common Motifs in Scientific Workflows: An Empirical Analysis, FGCS, 36, July 2014, 338–351


Research Method Stewardship Management, Publishing, Preservation

Workflows & Scripts

Services & Codes

Standard Operating Procedures

Descriptions Standards

Portal

Different systems Formats

Web Services Code Libraries Executables

Models & Algorithms

Mark-up Languages, Mathematical descriptions Standards

BioModels


Reuse

Organised Groups

Trust

Reciprocity Visibility

Roll your own from standard parts

Complementarity

Design and Instruction


Curation

Non-intrusive, Non-invasive, Not invisible

Enclaves

Specialist Flirts

Blue collar

Incremental JIJIT not JIC


Victoria Stodden, AMP 2011 http://www.stodden.net/AMP2011/,

Special Issue Reproducible Research Computing in Science and Engineering July/August 2012, 14(4)

Howison and Herbsleb (2013) "Incentives and Integration In Scientific Software Production" CSCW 2013.

Data Stewardship

Making practices, Sustainability, Management planning, Deposition, Long term access, Credit, Journals, Licensing, Open source / access

Best Practices for Scientific Computing http://arxiv.org/abs/1210.0530

1st Workshop on Maintainable Software Practices in e-Science – e-Science 2012

Stodden, Reproducible Research Standard, Intl J Comm Law & Policy, 13 2009


http://sciencecodemanifesto.org/ http://matt.might.net/articles/crapl/


Jennifer Schopf, Treating Data Like Software: A Case for Production Quality Data, JCDL 2012

Software release paradigm

Not a static document paradigm

Some of your data isn’t data

• Release research

• Methods in motion.

• Versioning

• Forks & merges

• F1000, PeerJ, GitHub….


Pivot around method / software / data rather than paper

Citation semantics: software as was? software as is?

The multi-dimensional paper

methods, reproducibility

what does it mean for content managers and the research workflow?


Replication Gap

1. Ioannidis et al., 2009. Repeatability of published microarray gene expression analyses. Nature Genetics 41: 14

2. Science publishing: The trouble with retractions http://www.nature.com/news/2011/111005/full/478026a.html

3. Bjorn Brembs: Open Access and the looming crisis in science https://theconversation.com/open-access-and-the-looming-crisis-in-science-14950

Out of 18 microarray papers, results from 10 could not be reproduced



re-compute

replicate

rerun repeat

re-examine

repurpose

recreate

reuse

restore

reconstruct review

regenerate revise

recycle

regenerate the figure

redo

“When I use a word," Humpty Dumpty said in rather a scornful tone, "it means just what I choose it to mean - neither more nor less.”*

*Lewis Carroll, Through the Looking-Glass, and What Alice Found There (1871)


reuse reproduce

repeat replicate

Drummond C, Replicability is not Reproducibility: Nor is it Good Science, online

Peng RD, Reproducible Research in Computational Science, Science 2 Dec 2011: 1226-1227.

Experiment

Methods (techniques, algorithms, spec. of the steps)

Materials (datasets, parameters, algorithm seeds)

Setup

Instruments (codes, services, scripts, underlying libraries)

Laboratory (sw and hw infrastructure, systems software, integrative platforms)

same experiment, same set up, same lab

same experiment, same set up, different lab

same experiment, different set up

different experiment, some of same

validate

Drummond C, Replicability is not Reproducibility: Nor is it Good Science, online

Peng RD, Reproducible Research in Computational Science, Science 2 Dec 2011: 1226-1227.

reuse reproduce

repeat replicate
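The four conditions and the four terms on this slide can be read as a small decision rule. The sketch below assumes one pairing, in order of increasing change: repeat, then replicate, then reproduce, then reuse. The transcript lists the terms and the conditions separately, so that pairing is an assumption, and `Rerun` and `classify` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rerun:
    """What a second run held fixed relative to the original."""
    same_experiment: bool
    same_setup: bool
    same_lab: bool


def classify(rerun):
    """Map what was held fixed to one of the four terms (assumed pairing)."""
    if not rerun.same_experiment:
        return "reuse"       # different experiment, some of same
    if not rerun.same_setup:
        return "reproduce"   # same experiment, different set up
    if not rerun.same_lab:
        return "replicate"   # same experiment, same set up, different lab
    return "repeat"          # same experiment, same set up, same lab


if __name__ == "__main__":
    print(classify(Rerun(same_experiment=True, same_setup=True, same_lab=False)))
```

Checking the experiment first, then the set up, then the lab mirrors the way the conditions are ordered on the slide.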

Design

Execution

Result Analysis

Collection

Publish / Report

Peer Review

Peer Reuse

Modelling

Prediction

Monitoring

Cleaning

Research Report

Can I repeat & defend my method?

Can I review / reproduce and compare my results / method with your results / method?

Can I review / replicate and certify your method?

Can I transfer your results into my research and reuse this method?

* Adapted from Mesirov, J. Accessible Reproducible Research Science 327(5964), 415-416 (2010)


Record Everything Automate Everything

recomputation.org

sciencecodemanifesto.org

[Adapted Freire, 2013]

Authoring

Exec. Papers Link docs to experiment

Sweave

Provenance Tracking, Versioning Replay, Record, Repair

Workflows, makefiles

ProvStore

open accessible available

description intelligible machine-readable

provenance gather dependencies

capture steps track & keep results
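"Gather dependencies" and "track & keep results" can start very small. The sketch below records the interpreter, the platform, the versions of named packages and a hash of each input file, using only the Python standard library. It is an illustration rather than what ProvStore, Sweave or any tool named on the slide does; `snapshot` and `file_digest` are hypothetical names.

```python
import hashlib
import json
import platform
import sys
from importlib import metadata


def file_digest(path):
    """SHA-256 of a file, read in chunks so large data files are fine."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def snapshot(input_paths, packages):
    """Record interpreter, platform, package versions and input hashes."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None  # not installed; recorded explicitly
    return {
        "python": sys.version.split()[0],
        "platform": platform.platform(),
        "packages": versions,
        "inputs": {path: file_digest(path) for path in input_paths},
    }


if __name__ == "__main__":
    # No input files here; pass real paths to hash them as well.
    print(json.dumps(snapshot([], ["pip"]), indent=2))
```

Writing this record next to every result file gives a later reader something to compare against when a rerun produces a different answer.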

http://nbviewer.ipython.org/github/myGrid/DataHackLeiden/blob/alan/Player_example.ipynb

https://www.youtube.com/watch?v=QVQwSOX5S08 ?

notebooks

Build into the workflows of research….


RDataTracker and DDG Explorer

Build into the workflows of research….

[Barbara S. Lerner and Emery R. Boose]

Components Dependencies Change

• 35 kinds of annotations

• 5 Main Workflows

• 14 Nested Workflows

• 25 Scripts

• 11 Configuration files

• 10 Software dependencies

• 1 Web Service

• Dataset: 90 galaxies observed in 3 bands

• Multiple platforms

• Multiple systems

José Enrique Ruiz (IAA-CSIC)

Galaxy Luminosity Profiling

specialist codes

libraries, platforms, tools

services

(cloud) hosted services

commodity platforms

data collections

catalogues

software repositories

my data

my process

my codes

integrative frameworks

gateways


Document vs Instrument

Reproducibility by Inspection Read It

Reproducibility by Invocation Run It


Instrument Entropy all experiments become less reproducible

Zhao, Gomez-Perez, Belhajjame, Klyne, Garcia-Cuesta, Garrido, Hettne, Roos, De Roure and Goble. Why workflows break - Understanding and combating decay in Taverna workflows, 8th Intl Conf e-Science 2012

Mitigate Detect, Repair Preserve Partial replication Approx reproduce Verification Benchmarks


Environmental Ecosystem

Joppa et al SCIENCE 340 May 2013; Morin et al Science 336 2012

Black boxes

Mixed systems, mixed stewardship

Distributed, hosted systems


Workflow Planning & Watch of Workflows

Watch

Operations Planning

Env & Users

Repository

plan

deploy

monitor monitor

monitor

access ingest, harvest

Decay, Service Deprecation, Data source monitoring, Checklists, Minimal Models

Workflows, myExperiment

Workflows for managing workflows
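A checklist for watching workflow decay can be sketched as a function that probes every external dependency and reports the ones that no longer respond. The probe is injected, so the same check can run against live services or a test double. `check_workflow`, the dependency names and the example.org URLs are hypothetical, and this is not the monitoring used by myExperiment or Taverna.

```python
def check_workflow(dependencies, probe):
    """Return the names of dependencies that the probe reports as unavailable."""
    broken = []
    for name, location in dependencies.items():
        try:
            ok = probe(location)
        except Exception:
            ok = False  # a probe that crashes counts as a failed check
        if not ok:
            broken.append(name)
    return sorted(broken)


# Hypothetical dependencies of one workflow.
DEPENDENCIES = {
    "alignment_service": "https://example.org/align",
    "reference_data": "https://example.org/data.csv",
}


def fake_probe(location):
    """Stand-in for a real availability check such as an HTTP request."""
    return location.endswith(".csv")


if __name__ == "__main__":
    print(check_workflow(DEPENDENCIES, fake_probe))
```

Run on a schedule, a check like this turns service deprecation from a surprise at rerun time into a routine report.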


portability

variability tolerance

[Adapted Freire, 2013]

preservation packaging

provenance gather dependencies

capture steps track & keep results

versioning

host

service

Open Source/Store

Sci as a Service

Integrative fws

Virtual Machines Recompute, limited installation, Black Box Byte execution, copies

Descriptive read, White Box Archived record

Read & Run, Co-location No installation

Portable Package

White Box, Installation Archived record


portability

[Adapted Freire, 2013]

preservation packaging

provenance gather dependencies

capture steps track & keep results

host

service

ReproZip

variability tolerance

versioning


Levels of Reproducibility

Coverage: how much of an experiment is reproducible

Original Experiment

Similar Experiment

Different Experiment

Portability

Depth: how much of an experiment is available

Binaries + Data

Source Code / Workflow + Data

Binaries + Data + Dependencies

Source Code / Workflow + Data + Dependencies

Virtual Machine Binaries + Data + Dependencies

Virtual Machine Source Code / Workflow + Data + Dependencies

Figures + Data

[Freire, 2014]

Findable

Accessible

Interoperable

Reusable

http://datafairport.org/

Packs

(Dynamic) Research Objects

• Bundle and relate multi-hosted digital resources of a scientific experiment or investigation using standard mechanisms, Currency of exchange

• Exchange, Releasing paradigm for publishing

http://www.researchobject.org/
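The idea of bundling and relating multi-hosted resources can be illustrated with a small manifest builder. The field names (`title`, `aggregates`, `uri`, `role`, `sha256`) are chosen for illustration and are not taken from the Research Object specifications; `build_manifest` and the example.org URL are likewise hypothetical.

```python
import hashlib
import json


def build_manifest(title, resources):
    """Aggregate (uri, role, content) triples into one manifest.

    content is bytes for resources held locally, or None for resources
    that are only referenced by URI and hosted elsewhere.
    """
    aggregates = []
    for uri, role, content in resources:
        entry = {"uri": uri, "role": role}
        if content is not None:
            entry["sha256"] = hashlib.sha256(content).hexdigest()
        aggregates.append(entry)
    return {"title": title, "aggregates": aggregates}


if __name__ == "__main__":
    manifest = build_manifest(
        "Example investigation",
        [
            ("data/observations.csv", "dataset", b"id,value\n1,2\n"),
            ("https://example.org/workflows/42", "workflow", None),
        ],
    )
    print(json.dumps(manifest, indent=2))
```

Local resources get a checksum so that a copy can be verified, while remote ones are related by reference only, which is one way to handle resources under mixed stewardship.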

OAI-ORE

W3C OAM

H. Van de Sompel et al. Persistent Identifiers for Scholarly Assets and the Web: The Need for an Unambiguous Mapping, 9th International Digital Curation Conference; Trusty URLs


Machine readable metadata* Machine actionable systems**

* Especially Linked Data and RDF ** Especially REST APIs


Sys Bio Research Object

Adobe UCF

Research Object Bundle

ORE PROV ODF

• Aggregation

• Annotations/provenance

• Ad-hoc domain-specific specification

OMEX archive

Systems Biology: Needed a common archive format for reuse across tools


The research lifecycle

IDEAS – HYPOTHESES – EXPERIMENTS – DATA - ANALYSIS - COMPREHENSION - DISSEMINATION

Authoring Tools

Lab Notebooks

Data Capture

Software Repositories

Analysis Tools

Visualization

Scholarly Communication

Commercial & Public Tools

Git-like Resources By Discipline

Data Journals

Discipline-Based Metadata Standards

Community Portals

Institutional Repositories

New Reward Systems

Commercial Repositories

Training

[Phil Bourne]


productivity

reproducibility

personal side effect

public side effect

The Cameron Neylon Equation

Steps towards born reproducible


“may all your problems be technical” ...Jim Gray

Social

Matters

Organisation

Metrics Culture

Process

[Adapted, Daron Green]

Summary

• Workflow & modelling models in Science

• Software-style Stewardship

• Born reproducible

• Collective cost & responsibility

• Social factors dominate

http://www.force11.org

Force2015 12 - 13 January, 2015 Oxford University


• myGrid – http://www.mygrid.org.uk

• Taverna – http://www.taverna.org.uk

• myExperiment – http://www.myexperiment.org

• BioCatalogue – http://www.biocatalogue.org

• Biodiversity Catalogue – http://www.biodiversitycatalogue.org

• Seek – http://www.seek4science.org

• Rightfield – http://www.rightfield.org.uk

• VPH-Share – http://www.vph-share.eu/

• Wf4ever – http://www.wf4ever-project.org

• Software Sustainability Institute – http://www.software.ac.uk

• BioVeL – http://www.biovel.eu

• Force11 – http://www.force11.org

• SCAPE – http://www.scape-project.eu/


Acknowledgements

• David De Roure • Tim Clark • Sean Bechhofer • Robert Stevens • Christine Borgman • Victoria Stodden • Marco Roos • Jose Enrique Ruiz del Mazo • Oscar Corcho • Ian Cottam • Steve Pettifer • Magnus Rattray • Chris Evelo • Katy Wolstencroft • Robin Williams • Pinar Alper • C. Titus Brown • Greg Wilson • Kristian Garza • Donal Fellows

• Wf4ever, SysMO, BioVel, UTOPIA and myGrid teams

• Juliana Freire • Jill Mesirov • Simon Cockell • Paolo Missier • Paul Watson • Gerhard Klimeck • Matthias Obst • Jun Zhao • Pinar Alper • Daniel Garijo • Yolanda Gil • James Taylor • Alex Pico • Sean Eddy • Cameron Neylon • Barend Mons • Kristina Hettne • Stian Soiland-Reyes • Rebecca Lawrence • Alan Williams