Five selfish reasons to work reproducibly

Post on 07-Aug-2015

Category: Science

TRANSCRIPT

  1. 1. Florian Markowetz, CRUK Cambridge Institute, www.markowetzlab.org. 5 selfish reasons to work reproducibly: More publications, more grants, more awesome!
  2. 2. Systems Genetics of Cancer. Genetic variation: in people, in tumours, in clones. Phenotypic variation: tumour subtypes, aggressiveness, survival.
  3. 3. Cancer genome: Evolution; Cancer tissue: Context; Cancer genome: Function. Ines, Wei, Edith, Geo, Ke, Anne, Joe, Leon, Andy, Amanda
  4. 4. Science miracles
  5. 5. "How Bright Promise in Cancer Testing Fell Apart", New York Times, July 7, 2011
  6. 6. Slides by Keith Baggerly
  7. 7. Slides by Keith Baggerly
  8. 8. Slides by Keith Baggerly
  9. 9. Baggerly & Coombes, AOAS 2009
  10. 10. Slides by Keith Baggerly
  11. 11. http://videolectures.net/cancerbioinformatics2010_baggerly_irrh/
  12. 12. Reproducible Research: It's the right thing to do! The world would be a better place if everyone did it! It's the foundation of Science! It's the honourable thing to do!
  13. 13. Reproducibility helps to avoid disaster
  14. 14. [Figure: Anatomy of the NFκB pathway. Labels: Phenotype (weak to strong); Step 1; Step 2; Hits; Knock-down; Known pathway members; New RNAi hits; Compare expression phenotypes by NEMs.]
  15. 15. What a nice result!
  16. 16. What a nice result!
  17. 17. A project is more than a beautiful result!
  18. 18. Starting with reproducibility early helps save time later
  19. 19. Reproducibility helps with writing papers
  20. 20. Why is well-documented and easily accessible code+data useful? Easy to look up numbers and put them in the manuscript. Be confident your figures and tables are up to date. Numbers and results automatically update when data change. It is engaging and more eyes can look over the analysis. Easier to spot mistakes.
  21. 21. Why is well-documented and easily accessible code+data useful? Easy to look up numbers and put them in the manuscript. Be confident your figures and tables are up to date. Numbers and results automatically update when data change. It is engaging and more eyes can look over the analysis. Easier to spot mistakes.
  22. 22. Reproducibility helps when arguing with reviewers
  23. 23. A very engaged reviewer. Reviewer: "I downloaded the authors' data and tried out a variation of their analysis, which gave an insignificant result." We: "Thank you, the reason is XXX and if you do YYY everything is fine."
  24. 24. Reproducibility enables continuity
  25. 25. I am so busy, I can't remember all the details of all my projects
  26. 26. I did this analysis 6 months ago. Of course I can't remember all the details any more
  27. 27. My PI said I should continue the project of a previous postdoc. But that postdoc is long gone and hasn't saved any scripts or data.
  28. 28. Reproducibility helps to build your reputation
  29. 29. http://www.sciencemag.org/content/348/6242/1422/F1.large.jpg
  30. 30. Mind your own business! I document my data the way I want!
  31. 31. Excel works just fine. I don't need any fancy R or Python or whatever.
  32. 32. Sounds alright, but my code and data are spread over so many hard drives and directories that it would just be too much work to collect them all in one place
  33. 33. My field is very competitive and I can't risk wasting time
  34. 34. We can always sort out the code and data after submission
  35. 35. It's only the result that matters!
  36. 36. I'd rather do real science than tidy up my data
  37. 37. 5 selfish reasons to work reproducibly: 1. Avoid disaster; 2. Easier to write papers; 3. Easier to talk to reviewers; 4. Continuity of your work/in the lab; 5. Reputation
  38. 38. When do you need to worry about reproducibility? Before you start the project. While you do the analysis. When you write the paper. When you co-author a paper. When you review a paper.
  39. 39. When do you need to worry about reproducibility? Before you start the project. While you do the analysis. When you write the paper. When you co-author a paper. When you review a paper.
  40. 40. Scientific SOFT SKILLS: Organization of project; Tidy data; Tidy code; Control over tools; Documentation; Reproducibility. Project, data, code, analysis, paper. Less clicking and pasting, more scripting and coding.
  41. 41. Reproducibility is important for PhD students, postdocs, PIs. Learn tools and apply them in daily work! Create a culture of reproducibility in your lab!
  42. 42. 5 selfish reasons to work reproducibly: 1. Avoid disaster; 2. Easier to write papers; 3. Easier to talk to reviewers; 4. Continuity of your work/in the lab; 5. Reputation