

The BDDS Platform for Big Biomedical Data Management and Analysis: Discovering the role of Amyloid Deposition in Neurodegenerative Diseases

Ravi K Madduri, Michael D’Arcy, Kyle Chard, Alexis Rodriguez, Judy Pa, Naveen Ashish, Ben Heavner, Gustavo Glusman, Ivo Dinov, John Van Horn, Eric Deutsch, Nathan Price, Leroy Hood, Carl Kesselman, Ian Foster, Joseph Ames, Arthur Toga

University of Chicago, University of Southern California, Institute for Systems Biology, University of Michigan

Abstract

Approach

The BDDS Platform

Analyzing role of Amyloid Burden

Discussion & Conclusions

Bibliography

Objective: Investigate whether there are commonalities in patterns of amyloid deposition in individuals with Alzheimer’s Disease or Parkinson’s Disease that identify those individuals with, or at risk for, cognitive dysfunction.

Approach: We created a platform to integrate multi-omic data, facilitate data discovery and cohort creation, enable rapid, scalable, and reproducible analysis, and finally publish results.

We employed a systematic, reproducible approach that leveraged existing capabilities for understanding commonalities in amyloid deposition. It included the following stages and challenges:

1. Data Ingest - ERMRest
• Phenotypic, genotypic, and imaging datasets
• Multiple repositories
• Data usage agreements
• Automated data cleanup and ingest

2. Data Query and Exchange - BagIt
• ERMRest provides a high-level API for query
• Adopted the BagIt specification to create databags
• Enhanced BagIt to support big biomedical data

3. Data Analysis - Pipeline and Globus Genomics
• LONI Pipeline runs the analysis on the amyloid PET and MRI images, using computational resources at USC, and outputs amyloid index values (SUVR: standardized uptake value ratio) for each brain region, for each subject
• Genomics analysis using Globus Genomics on the Amazon cloud
• Leverages an elastic provisioner that optimizes performance and cost

4. Data Publication and Integration - Globus Publication and ERMRest
• Results from analysis are integrated with ERMRest
• Results are published in the Globus Publication service with appropriate metadata
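The data exchange stage above relies on BagIt bags. As a rough illustration of what a bag contains, the following Python sketch writes a minimal bag (a `bagit.txt` declaration, a `data/` payload directory, and a checksum manifest) and then re-checks the payload against the manifest. It uses only the standard library and is not the BDDS tooling; the function names `make_bag` and `validate_bag`, the choice of SHA-256, and the file name `subject_001_suvr.csv` are assumptions made for the example. The poster's extensions to BagIt for big biomedical data are not shown.

```python
import hashlib
import tempfile
from pathlib import Path


def make_bag(bag_dir: Path, payload: dict[str, bytes]) -> Path:
    """Write a minimal BagIt-style bag: bagit.txt, data/, and a manifest."""
    data_dir = bag_dir / "data"
    data_dir.mkdir(parents=True, exist_ok=True)

    # Bag declaration, using the version cited in the poster's bibliography.
    (bag_dir / "bagit.txt").write_text(
        "BagIt-Version: 0.97\nTag-File-Character-Encoding: UTF-8\n",
        encoding="utf-8",
    )

    manifest_lines = []
    for name in sorted(payload):
        content = payload[name]
        target = data_dir / name
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_bytes(content)
        # One manifest line per payload file: "<checksum>  <path in bag>".
        digest = hashlib.sha256(content).hexdigest()
        manifest_lines.append(f"{digest}  data/{name}")

    (bag_dir / "manifest-sha256.txt").write_text(
        "\n".join(manifest_lines) + "\n", encoding="utf-8"
    )
    return bag_dir


def validate_bag(bag_dir: Path) -> bool:
    """Return True if every payload file matches its manifest checksum."""
    manifest = (bag_dir / "manifest-sha256.txt").read_text(encoding="utf-8")
    for line in manifest.splitlines():
        digest, rel_path = line.split(maxsplit=1)
        actual = hashlib.sha256((bag_dir / rel_path).read_bytes()).hexdigest()
        if actual != digest:
            return False
    return True


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # Hypothetical payload file standing in for an analysis result.
        bag = make_bag(Path(tmp) / "example_bag", {"subject_001_suvr.csv": b"region,suvr\n"})
        print(validate_bag(bag))
```

Because the manifest travels with the payload, a recipient can verify a bag after transfer without contacting the sender, which is what makes the format convenient for moving data between the query and analysis stages.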

Data Ingest
• Multi-omic data
• Multiple distributed repositories
• Data usage agreements
• Data cleanup and integration

Data Query and Exchange
• Easy to find data
• Enable creation of patient cohorts
• Send data for analysis

Data Analysis
• Multiple analysis platforms
• Ease of use
• Scale

Data Publication and Integration
• Enable discovery
• End-to-end data view

Figure labels: Images from Alzheimer's and Parkinson's; exomes and whole genomes from Alzheimer's and Parkinson's; phenotypic data

1. The BagIt File Packaging Format (V0.97), available at: http://tools.ietf.org/html/draft-kunze-bagit-11

2. Minids: http://minid.bd2k.org

Figure caption: SUVR values for cortical gray matter regions of interest overlaid on a structural MRI for a single subject
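The SUVR values produced in the analysis stage are ratios: the mean tracer uptake in a region of interest divided by the mean uptake in a reference region. The Python sketch below shows that arithmetic for one subject. It is not the LONI Pipeline workflow; the function name `suvr_by_region`, the region names, the uptake numbers, and the use of the cerebellum as the reference region are all assumptions chosen for illustration, since the poster does not state which reference region was used.

```python
from statistics import mean


def suvr_by_region(
    region_uptake: dict[str, list[float]], reference_region: str
) -> dict[str, float]:
    """Compute SUVR per region as mean regional uptake / mean reference uptake."""
    reference_mean = mean(region_uptake[reference_region])
    if reference_mean <= 0:
        raise ValueError("reference region uptake must be positive")
    return {
        region: mean(values) / reference_mean
        for region, values in region_uptake.items()
        if region != reference_region
    }


if __name__ == "__main__":
    # Hypothetical voxel uptake samples for a single subject.
    uptake = {
        "cerebellum": [1.0, 1.0],  # assumed reference region
        "precuneus": [1.5, 2.5],
        "frontal": [1.0, 1.0],
    }
    print(suvr_by_region(uptake, reference_region="cerebellum"))
```

Normalizing by a reference region makes values comparable across subjects and scans, which is what allows per-region SUVR tables from many subjects to be integrated and queried together.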

• We created a powerful platform by integrating several existing services and capabilities
• Platform development resulted in adopting and extending the BagIt format for biomedical big data
• Future work includes applying the platform to analyze Alzheimer’s data
• Reference implementation of the NIH Commons

ark:/88120/r83w2c