Towards open and reproducible neuroscience in the age of big data
Chris Gorgolewski, Center for Reproducible Neuroscience, Stanford University
ON THE IMPORTANCE OF DATA
ROSALIND FRANKLIN AND PHOTOGRAPH 51
NEUROVAULT.ORG DATA REUSE
Sochat et al. 2015
OPENFMRI DATA REUSE
Gorgolewski et al. 2015
DATA SHARING SAVES MONEY
$878,988: COST OF REACQUIRING DATA FOR EACH OF THE REUSES OF OPENFMRI DATASETS (2015)
STUDIES SHARING DATA HAVE HIGHER STATISTICAL QUALITY
Wicherts JM, Bakker M, Molenaar D (2011) Willingness to Share Research Data Is Related to the Strength of the Evidence and the Quality of Reporting of Statistical Results. PLoS ONE 6(11): e26828. doi: 10.1371/journal.pone.0026828
SHARING DATA IS RELATED TO HIGHER CITATION RATE
PIWOWAR, DAY & FRIDSMA (2007)
Piwowar & Vision (2013)
MAKING MORE DATA ACCESSIBLE TO MORE RESEARCHERS
MEET PROF. SMITH
BIDS.NEUROIMAGING.IO
MEET MIKE
GETTING LOST IN YOUR DATA
HETEROGENEITY IN DATA DESCRIPTION PRACTICES CAUSES:
• PROBLEMS IN SHARING DATA,
• UNNECESSARY MANUAL METADATA INPUT,
• NO WAY TO AUTOMATICALLY VALIDATE DATASETS.
GETTING LOST IN YOUR DATA
• MRI HAS BEEN USED TO STUDY THE HUMAN BRAIN FOR OVER 20 YEARS.
• DESPITE SIMILARITIES IN EXPERIMENTAL DESIGNS AND DATA TYPES EACH RESEARCHER TENDS TO ORGANIZE AND DESCRIBE THEIR DATA IN THEIR OWN WAY.
http://www.nature.com/news/brain-imaging-fmri-2-0-1.10365
BRAIN IMAGING DATA STRUCTURE
A NEW STANDARD FOR ORGANIZING HUMAN NEUROIMAGING DATASETS
WHO IS IT FOR?
1. LAB PIS. IT WILL MAKE HANDING OVER ONE DATASET FROM ONE STUDENT/POSTDOC TO ANOTHER EASY.
2. WORKFLOW DEVELOPERS. IT’S EASIER TO WRITE PIPELINES EXPECTING A PARTICULAR FILE ORGANIZATION.
3. DATABASE CURATORS. ACCEPTING ONE DATASET FORMAT WILL MAKE CURATION EASIER.
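The point about workflow developers can be illustrated with a small sketch. It assumes BIDS-style file names built from `key-value` pairs joined by underscores and ending in a suffix; the helper name `parse_bids_name` and the example file name are hypothetical, chosen only for illustration.

```python
def parse_bids_name(filename):
    """Split a BIDS-style file name into (entities, suffix, extension).

    Hypothetical helper: assumes names such as
    'sub-001_task-rest_bold.nii.gz' (key-value pairs, then a suffix).
    """
    # Everything before the first dot is the stem; the rest is the extension.
    stem, _, extension = filename.partition(".")
    # The last underscore-separated chunk is the suffix (e.g. 'bold').
    *pairs, suffix = stem.split("_")
    # Each remaining chunk is a 'key-value' pair such as 'sub-001'.
    entities = dict(pair.split("-", 1) for pair in pairs)
    return entities, suffix, extension


entities, suffix, extension = parse_bids_name("sub-001_task-rest_bold.nii.gz")
print(entities, suffix, extension)
```

Because every dataset follows the same naming scheme, a pipeline can rely on one parser like this instead of per-lab path logic.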
PRINCIPLES BEHIND BIDS
1. ADOPTION IS CRUCIAL.
2. DON’T REINVENT THE WHEEL.
3. 80/20 RULE.
EVOLUTION OF BIDS
1. KICKOFF MEETING AT STANFORD IN SPRING 2015
2. MEETING AT OHBM 2015 (JUNE)
3. INTRODUCED TO NEUROINFORMATICS COMMUNITY AT NEUROINFORMATICS CONGRESS 2015 (AUGUST)
4. FIRST RELEASE CANDIDATE AND PUBLIC CALL FOR COMMENTS (SEPTEMBER)
5. VERSION 1.0.0 PUBLISHED ALONG WITH AN INTRODUCTORY PAPER
COMMUNITY OUTREACH
• REACHED OVER 5000 RESEARCHERS
• EXCHANGED HUNDREDS OF EMAIL COMMENTS
• PRODUCED ~40 EXAMPLE DATASETS
• 27 COAUTHORS ON THE FINAL MANUSCRIPT
Gorgolewski et al. (2016) Scientific Data
FOLDER ORGANIZATION
participant_id  age  sex
sub-001         34   M
sub-002         12   F
sub-003         33   F
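A minimal sketch of how such a participants table could be read, assuming it is stored as a tab-separated file with a header row. The table content is copied from the slide; the function name `load_participants` is hypothetical.

```python
import csv
import io

# The table from the slide, written as tab-separated text (assumed layout).
PARTICIPANTS_TSV = (
    "participant_id\tage\tsex\n"
    "sub-001\t34\tM\n"
    "sub-002\t12\tF\n"
    "sub-003\t33\tF\n"
)


def load_participants(text):
    """Parse tab-separated participant rows and convert age to an integer."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    return [dict(row, age=int(row["age"])) for row in reader]


participants = load_participants(PARTICIPANTS_TSV)
mean_age = sum(p["age"] for p in participants) / len(participants)
print(len(participants), round(mean_age, 2))
```

Keeping demographics in a plain tabular file means standard-library tools are enough to query them.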
FOLDER ORGANIZATION
NIfTI
FOLDER ORGANIZATION
{
  "RepetitionTime": 3.0,
  "EchoTime": 0.03,
  "FlipAngle": 78,
  "SliceTiming": [0.0, 0.2, 0.4, …],
  "MultibandAccelerationFactor": 4,
  "PhaseEncodingDirection": "j-"
}
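Because the acquisition metadata is plain JSON, simple sanity checks can be automated. The sketch below is illustrative only: the checks are assumptions made for this example, not the rules of the BIDS validator, and the `SliceTiming` list is cut down to the three values visible on the slide rather than a full acquisition.

```python
import json

# Shortened sidecar for illustration; SliceTiming holds only the three
# values shown on the slide, not a complete list of slices.
SIDECAR = """
{
  "RepetitionTime": 3.0,
  "EchoTime": 0.03,
  "FlipAngle": 78,
  "SliceTiming": [0.0, 0.2, 0.4],
  "PhaseEncodingDirection": "j-"
}
"""


def check_sidecar(metadata):
    """Return a list of human-readable problems (empty if none are found)."""
    problems = []
    tr = metadata.get("RepetitionTime")
    if tr is None or tr <= 0:
        problems.append("RepetitionTime must be a positive number of seconds")
        return problems
    # Assumed check: every slice is acquired within one repetition.
    if any(t < 0 or t >= tr for t in metadata.get("SliceTiming", [])):
        problems.append("SliceTiming values must fall within one RepetitionTime")
    # Assumed check: the echo time cannot exceed the repetition time.
    if metadata.get("EchoTime", 0) >= tr:
        problems.append("EchoTime must be shorter than RepetitionTime")
    return problems


problems = check_sidecar(json.loads(SIDECAR))
print(problems)
```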
THE VALIDATOR
incf.github.io/bids-validator/
PROF. SMITH (2030)
SOFTWARE SUPPORTING BIDS
• QAP (QUALITY ASSESSMENT)
• MRIQC (QUALITY ASSESSMENT)
• FMRIPREP (PREPROCESSING WORKFLOW)
• AUTOMATIC ANALYSIS (FMRI PROCESSING TOOLBOX)
• OPENFMRI2BIDS (CONVERTER)
• BIDS2ISATAB (CONVERTER)
• DCM2NIIX (CONVERTER)
• DICM2NII (CONVERTER)
• OPENFMRI (REPOSITORY)
• BIDSTO3COL (CONVERTER)
• BIDS2NDA (CONVERTER)
• AFNI BIDS-TOOLS (SET OF TOOLS FOR CONVERTING TO AND ANALYZING BIDS DATASETS IN AFNI)
• HEUDICONV (CONVERTER)
• DCM2BIDS (CONVERTER)
• C-PAC (CONFIGURABLE PIPELINE FOR THE ANALYSIS OF CONNECTOMES)
• BRAINSTORM (MEG/EEG ANALYSIS PACKAGE)
BIDS APPS
BIDS-APPS.NEUROIMAGING.IO
PORTABLE NEUROIMAGING PIPELINES THAT SUPPORT BIDS DATASETS
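A minimal sketch of what the entry point of such a pipeline might look like. It assumes the command-line convention commonly associated with BIDS Apps (dataset folder, output folder, analysis level, optional participant labels); the slides do not spell this out, and the paths and labels below are placeholders.

```python
import argparse


def build_parser():
    """Command-line interface for a hypothetical BIDS App."""
    parser = argparse.ArgumentParser(description="Hypothetical BIDS App entry point")
    parser.add_argument("bids_dir", help="folder holding the input BIDS dataset")
    parser.add_argument("output_dir", help="folder where results are written")
    parser.add_argument("analysis_level", choices=["participant", "group"],
                        help="run per participant or across the whole group")
    parser.add_argument("--participant_label", nargs="+", default=[],
                        help="labels of the participants to process")
    return parser


# Placeholder arguments standing in for a real command line.
args = build_parser().parse_args(
    ["/data/ds001", "/data/out", "participant", "--participant_label", "001", "002"]
)
print(args.analysis_level, args.participant_label)
```

A shared interface of this kind is what lets different pipelines be swapped in without changing how they are called.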
BIDS APPS - CONTAINERS
• BATTERIES INCLUDED – NO NEED TO INSTALL ANY EXTRA SOFTWARE
• NO NEED TO WORRY ABOUT INCOMPATIBLE THIRD PARTY SOFTWARE UPDATES
• EASY TO SWITCH BETWEEN VERSIONS
• WORKS ON WINDOWS, MAC, LINUX AND MULTI USER HPC
WHAT IS DOCKER?
• DOCKER IS THE MOST POPULAR CONTAINER IMPLEMENTATION
• IT CONSISTS OF:
  • DOCKER ENGINE (RUNS CONTAINERS)
  • DOCKER HUB (CENTRALIZED WEB SERVICE FOR STORING AND SHARING CONTAINER IMAGES)
RUNNING CONTAINERS ON CLUSTERS/HPCS
• DOCKER HAS LIMITATIONS:
  • IT WAS DESIGNED FOR THE CLOUD, WHERE YOU ARE IN TOTAL CONTROL
  • REQUIRES MODERN KERNEL VERSION
  • ALLOWS USERS TO GAIN ROOT ACCESS
RUNNING CONTAINERS ON CLUSTERS/HPCS
• THE ADVANCED DOCKER FEATURES ARE USEFUL FOR:
  • NETWORKING MANAGEMENT
  • SANDBOXING RESOURCES
  • MAPPING USERNAMES
• ALL SCIENTISTS CARE ABOUT IS PORTABILITY (CAPTURING BINARY DEPENDENCIES)
RUNNING CONTAINERS ON CLUSTERS/HPCS
• SINGULARITY IS A CONTAINER FRAMEWORK THAT
  • WAS BUILT FROM THE GROUND UP TO SUPPORT CLUSTERS/HPCS
  • HAS MINIMAL REQUIREMENTS
  • RUNS ON LEGACY KERNELS
  • DOES NOT ELEVATE PERMISSIONS
  • ALLOWS IMPORTING DOCKER IMAGES
BIDS APPS - PARALLELIZATION
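One way to picture the parallelization idea, as a sketch under stated assumptions: participant-level runs are taken to be independent, so each chunk of participants can become its own job. The image name `bids/example`, the mount points, and the paths are placeholders; the function only builds `docker run` argument lists and does not execute anything.

```python
def participant_level_jobs(image, bids_dir, output_dir, labels, per_job=1):
    """Build one `docker run` argument list per chunk of participant labels."""
    jobs = []
    for start in range(0, len(labels), per_job):
        chunk = labels[start:start + per_job]
        jobs.append([
            "docker", "run", "--rm",
            "-v", f"{bids_dir}:/bids_dataset:ro",   # dataset mounted read-only
            "-v", f"{output_dir}:/outputs",         # results written here
            image, "/bids_dataset", "/outputs", "participant",
            "--participant_label", *chunk,
        ])
    return jobs


# Placeholder image name, paths, and labels, for illustration only.
jobs = participant_level_jobs(
    "bids/example", "/data/ds001", "/data/out", ["001", "002", "003"], per_job=2
)
for job in jobs:
    print(" ".join(job))
```

Each argument list could then be handed to a scheduler, with a single group-level run started once all participant-level jobs have finished.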
BIDS APPS - COMMUNITY
GORGOLEWSKI ET AL. 2017
22 AVAILABLE BIDS APPS
BIDS APPS - VERSIONING
EXAMPLE BIDS APPS: MRIQC AND FMRIPREP
MRIQC – QUALITY CONTROL FOR STRUCTURAL AND FUNCTIONAL IMAGES
MRIQC.ORG
SPIKES
CROWDSOURCING ARTEFACTS
WHAT HAPPENS WHEN YOU ASK TWITTER FOR HELP...
FMRIPREP: ROBUST, EASY, TRANSPARENT
FMRIPREP.READTHEDOCS.IO
WHAT IS IT?
FMRI DATA PREPROCESSING TOOL
PREPROCESSING?
DENOISING AND NORMALIZATION
WHAT IT IS NOT
GLM, DCM, CONNECTIVITY, DYNAMICS, ETC.
FMRIPREP: PRINCIPLES
• EASY TO INSTALL AND USE
• ROBUST – WORKS ON ANY* DATA
• TRANSPARENT – “GLASS BOX” RATHER THAN “BLACK BOX”
FMRIPREP: T1W PREPROCESSING
• N4 BIAS FIELD CORRECTION (ANTS)
• SKULL STRIPPING (ANTS)
• 3 CLASS TISSUE SEGMENTATION (FSL FAST)
• ROBUST MNI COREGISTRATION (ANTS)
FMRIPREP: EPI PREPROCESSING
• MOTION CORRECTION (FSL MCFLIRT)
• SKULL STRIPPING (NILEARN)
• COREGISTRATION TO T1 (FSL FLIRT WITH BBR)
REPORTS
BIDS DERIVATIVES
MAKING MORE DATA ACCESSIBLE TO MORE RESEARCHERS
Poldrack and Gorgolewski, 2014
MAKING MORE DATA ACCESSIBLE TO MORE RESEARCHERS
OPENNEURO*
*BORN OUT OF A TWEET
FEATURES
• DATA ORGANIZATION – BIDS
• DATA MANAGEMENT PLATFORM
  • UPLOADING
  • VALIDATION
  • SNAPSHOTTING
  • DOWNLOADING
• DATA ANALYSIS – BIDS APPS
UPLOADING
BROWSING AND VALIDATION
ANALYSIS RESULTS
FUTURE DIRECTIONS
• HYPOTHESIS GENERATION MACHINE
  • SAME MODEL, SAME DATA, NEW BRAIN-BEHAVIOR RELATIONSHIPS
• VIBRATION RATIO ESTIMATION
  • HOW MUCH YOUR RESULTS DEPEND ON PREPROCESSING DECISIONS (CARP 2012)
• EXPOSING NEUROIMAGING DATASETS TO ML COMMUNITY
  • FROM NIFTIS TO CSV FILES
• BIDS EXTENSIONS:
  • PET, MEG, MODELS, DERIVATIVES
ACKNOWLEDGMENTS
The Poldrack Lab @ Stanford
Data Sharing Task Force
ACKNOWLEDGMENTS
• TIBOR AUER
• VINCE D. CALHOUN
• R. CAMERON CRADDOCK
• SAMIR DAS
• EUGENE P. DUFF
• GUILLAUME FLANDIN
• SATRAJIT S. GHOSH
• TRISTAN GLATARD
• YAROSLAV O. HALCHENKO
• DANIEL A. HANDWERKER
• MICHAEL HANKE
• DAVID KEATOR
• XIANGRUI LI
• DAN MARCUS
• ZACHARY MICHAEL
• CAMILLE MAUMET
• B. NOLAN NICHOLS
• THOMAS E. NICHOLS
• JOHN PELLMAN
• JEAN-BAPTISTE POLINE
• ARIEL ROKEM
• CHRIS RORDEN
• GUNNAR SCHAEFER
• VANESSA SOCHAT
• WILLIAM TRIPLETT
• JESSICA A. TURNER
• GAËL VAROQUAUX
• RUSSELL A. POLDRACK
ACKNOWLEDGMENTS
• FIDEL ALFARO-ALMAGRO
• PIERRE BELLEC
• MIHAI CAPOTĂ
• M. MALLAR CHAKRAVARTY
• NATHAN W. CHURCHILL
• ALEXANDER LI COHEN
• GABRIEL A. DEVENYI
• ANDERS EKLUND
• OSCAR ESTEBAN
• J. SWAROOP GUNTUPALLI
• MARK JENKINSON
• ANISHA KESHAVAN
• GREGORY KIAR
• FRANZISKUS LIEM
• PRADEEP REDDY RAAMANA
• DAVID RAFFELT
• CHRISTOPHER J. STEELE
• PIERRE-OLIVIER QUIRION
• ROBERT E. SMITH
• STEPHEN C. STROTHER
• GAËL VAROQUAUX
• TAL YARKONI
• YIDA WANG
• ROSS BLAIR
• SHOSHANA BERLEANT
• SUYASH BHOGOWAR
• JOSEPH WEXLER
• CHRIS MARKIEWICZ
OHBM REPLICATION AWARD
BEST REPLICATION IN NEUROIMAGING
PEER REVIEWED PAPER OR PREPRINT
PUBLISHED/POSTED ANYTIME BEFORE FEBRUARY 2017
SUBMISSION DEADLINE: FEBRUARY 22 2017
THE AWARD ($2000) WILL BE PRESENTED AT A PLENARY SESSION OF OHBM 2017
HTTP://WWW.HUMANBRAINMAPPING.ORG/I4A/PAGES/INDEX.CFM?PAGEID=3731
HTTP://WWW.OHBMBRAINMAPPINGBLOG.COM/BLOG/OHBM-REPLICATION-AWARD-QA-WITH-CHRIS-GORGOLEWSKI
HOW TO GET INVOLVED (WE NEED YOUR HELP!)
• SHARE YOUR DATA!
  • OPENFMRI.ORG (RAW)
  • NEUROVAULT.ORG (STATISTICAL MAPS)
• LEARN MORE ABOUT BIDS AND JOIN THE WORK ON ITS EXTENSIONS:
  • BIDS.NEUROIMAGING.IO
• CHECK OUT BIDS APPS, DEVELOP YOUR OWN:
  • BIDS-APPS.NEUROIMAGING.IO
• TRY MRIQC, LET US KNOW HOW TO MAKE IT BETTER:
  • MRIQC.ORG
• TRY FMRIPREP AND TELL US HOW IT WORKS WITH YOUR DATA:
  • FMRIPREP.READTHEDOCS.ORG
• APPLY FOR THE OHBM REPLICATION AWARD