tools for reproducible research in an increasingly digital world


Page 1: tools for reproducible research in an increasingly digital world

brian m. bot | sage bionetworks | @BrianMBot

mayo clinic - 2015 jan 28

tools for reproducible research

in an increasingly digital world

Page 2: tools for reproducible research in an increasingly digital world

sage bionetworks

~40 FTEs

1/2 research - 1/3 platform - 1/6 leadership/support

Page 3: tools for reproducible research in an increasingly digital world

sage bionetworks

focused on a world where biomedical research will fundamentally change to be more open and collaborative

Page 4: tools for reproducible research in an increasingly digital world
Page 5: tools for reproducible research in an increasingly digital world

production

Page 6: tools for reproducible research in an increasingly digital world

distribution

Page 7: tools for reproducible research in an increasingly digital world

aggregation

Page 8: tools for reproducible research in an increasingly digital world
Page 9: tools for reproducible research in an increasingly digital world
Page 10: tools for reproducible research in an increasingly digital world
Page 11: tools for reproducible research in an increasingly digital world
Page 12: tools for reproducible research in an increasingly digital world
Page 13: tools for reproducible research in an increasingly digital world

6%

21%

8%

11%

54% cannot reproduce

can reproduce in principle

can reproduce w/discrepancies

can reproduce from processed data w/discrepancies

can reproduce partially

the status quo tolerates poor communication of findings

Ioannidis J.P.A. et al. Nature Genetics 2009

Page 14: tools for reproducible research in an increasingly digital world

208,294,724 datapoints

124 pages supplemental material

?? lines unobtainable source code

?? version or architecture of statistical analysis program (R)

innumerable R packages and package dependencies

key R package “ClaNC” no longer available

1231 citations

what is reproducible in principle is often not reproducible in practice

unidentified publication
‣ from journal with 5 year impact factor of 27
‣ article freely available for download
‣ data freely available for download
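The missing "version or architecture" and the unknown package dependencies above are details that can be written down at analysis time. The following is a minimal sketch in Python, not the workflow from the talk: the function name `capture_environment` and the example package name are assumptions made for illustration.

```python
import json
import platform
from importlib import metadata


def capture_environment(packages):
    """Return interpreter, platform, and package versions as a plain dict."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Record the gap explicitly instead of silently skipping it.
            versions[name] = None
    return {
        "python": platform.python_version(),
        "implementation": platform.python_implementation(),
        "system": platform.system(),
        "machine": platform.machine(),
        "packages": versions,
    }


if __name__ == "__main__":
    # "pip" is only a placeholder; list the packages the analysis imports.
    print(json.dumps(capture_environment(["pip"]), indent=2, sort_keys=True))
```

Saving this output next to the results gives a later reader the interpreter version, architecture, and package versions that the publication above leaves as "??".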

Page 15: tools for reproducible research in an increasingly digital world

“Scientists often study the past as obsessively as historians because few

other professions depend so acutely on it. Every experiment is a conversation with

a prior experiment, every new theory a refutation of the old”

-Siddhartha Mukherjee, The Emperor of All Maladies

Page 16: tools for reproducible research in an increasingly digital world

scientific method

1. define a question

2. gather information and resources (background research)

3. form a hypothesis

4. test hypothesis experimentally

5. analyze experimental data

6. draw conclusions based on data

7. publish results

8. retest (frequently done by other scientists)

Page 17: tools for reproducible research in an increasingly digital world

7. publish results

Page 18: tools for reproducible research in an increasingly digital world

finitein

∞...

Page 19: tools for reproducible research in an increasingly digital world

conducting research for others to consume

(even if the ‘other’ is future you)

reproducible research

Page 20: tools for reproducible research in an increasingly digital world

tools for reproducible research

code

data

analysis

Page 21: tools for reproducible research in an increasingly digital world

tools for reproducible research

code

version control

Page 22: tools for reproducible research in an increasingly digital world

tools for reproducible research

code

version control

client-server

distributed

Page 23: tools for reproducible research in an increasingly digital world

tools for reproducible research

code

version control

client-server (e.g. svn, cvs)

Page 24: tools for reproducible research in an increasingly digital world

tools for reproducible research

code

version control

client-server

distributed

Page 25: tools for reproducible research in an increasingly digital world

tools for reproducible research

code

version control

distributed (e.g. git, mercurial)
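One property that makes a distributed tool such as git useful for reproducibility is that file content is identified by a hash, so any clone can check that it holds exactly the same code. As a rough sketch (SHA-1 object format assumed; the function name `git_blob_id` is made up for this example), the id git gives a file's content can be recomputed in a few lines of Python:

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """Compute the id git assigns to file content stored as a blob (SHA-1 format)."""
    # git hashes a small header ("blob <size>\0") followed by the raw bytes.
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


if __name__ == "__main__":
    # Should match what `git hash-object` reports for an empty file.
    print(git_blob_id(b""))
```

Two collaborators who compute the same id for a script are looking at byte-identical code, regardless of which copy of the repository they cloned.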

Page 26: tools for reproducible research in an increasingly digital world

tools for reproducible research

code

version control

distributed

Page 27: tools for reproducible research in an increasingly digital world

tools for reproducible research

code

data

analysis

Page 28: tools for reproducible research in an increasingly digital world

tools for reproducible research

data

generic

domain repositories

results

Page 29: tools for reproducible research in an increasingly digital world

tools for reproducible research

data

digital object identifier (doi)

a unique identifier which remains fixed over the lifetime of a web-accessible object

metadata, including the object’s location, is stored in association with the doi and may change over time

referring to an online document by its doi provides more stable linking than simply referring to a url
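To make the doi-versus-url point concrete, here is a small Python sketch that turns a doi into a link through the public resolver at https://doi.org/, which redirects to wherever the object currently lives. The helper name `doi_to_url` and the doi in the usage line are hypothetical and chosen only for illustration.

```python
from urllib.parse import quote

DOI_RESOLVER = "https://doi.org/"


def doi_to_url(doi):
    """Build the resolver URL for a DOI; the resolver redirects to the current location."""
    cleaned = doi.strip()
    # Accept common prefixes people paste alongside the identifier.
    for prefix in ("https://doi.org/", "http://dx.doi.org/", "doi:"):
        if cleaned.lower().startswith(prefix):
            cleaned = cleaned[len(prefix):]
            break
    if not cleaned.startswith("10.") or "/" not in cleaned:
        raise ValueError(f"not a DOI: {doi!r}")
    # Keep "/" literal; percent-encode other reserved characters.
    return DOI_RESOLVER + quote(cleaned, safe="/")


if __name__ == "__main__":
    # Made-up identifier, used only to show the shape of the result.
    print(doi_to_url("doi:10.1234/example.5678"))
```

The citation stores only the identifier; if the publisher moves the object, the metadata behind the doi changes and the same link keeps working.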

Page 30: tools for reproducible research in an increasingly digital world

tools for reproducible research

code

data

analysis

Page 31: tools for reproducible research in an increasingly digital world

tools for reproducible research

analysis

R Sweave knitr

great if you know LaTeX

Page 32: tools for reproducible research in an increasingly digital world

tools for reproducible research

analysis

R Sweave knitr

great if you are lazy

(like me)

Page 33: tools for reproducible research in an increasingly digital world

tools for reproducible research

analysis

knitr

# Hello World Title
### Author: Brian M. Bot

This is a narrative with inline code execution to tell me that pi is equal to `r pi`. And a plot to show a simple function.

```{r}
x <- 1:100
y <- log(x)/x
plot(x,y)
```

Page 34: tools for reproducible research in an increasingly digital world

tools for reproducible research

analysis

knitr

Page 35: tools for reproducible research in an increasingly digital world

tools for reproducible research

analysis

ipython notebook
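For comparison with the knitr chunk earlier in the deck, a notebook code cell might look roughly like the Python below. This is an assumed counterpart, not content from the slides; it reports the peak of the function instead of plotting so that it needs only the standard library.

```python
import math

# Same function as the knitr chunk: y = log(x) / x for x = 1..100.
xs = list(range(1, 101))
ys = [math.log(x) / x for x in xs]

# Find where the function peaks over the integers instead of drawing it.
peak_x = xs[ys.index(max(ys))]

print(f"pi is equal to {math.pi}")
print(f"log(x)/x is largest at x = {peak_x}")
```

In a notebook, the narrative would sit in a markdown cell above this code and the printed output would be stored with the document, much as knitr weaves text and results together.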

Page 36: tools for reproducible research in an increasingly digital world

tools for reproducible research

other tools

galaxy

docker

packrat

shiny

Page 37: tools for reproducible research in an increasingly digital world

tools for reproducible research

other tools

enables sharing of all resources (data, code, results) and their relationships to one another

Page 38: tools for reproducible research in an increasingly digital world

tools for reproducible research

Page 39: tools for reproducible research in an increasingly digital world

tools for reproducible research

Page 40: tools for reproducible research in an increasingly digital world

tools for reproducible research

Page 41: tools for reproducible research in an increasingly digital world

Go Hawks!

Page 42: tools for reproducible research in an increasingly digital world

tools for reproducible research

in an increasingly digital world

brian m. bot | sage bionetworks

[email protected] @BrianMBot

mayo clinic - 2015 jan 28