Appistry WGDAS Presentation
DESCRIPTION
Presentation given by Appistry's Vice President of Product Strategy, Sultan Meghji, at the World Genome Data Analysis Summit. Meghji spoke about the big data challenges facing labs as they strive to manage the flow of genetic data from sequencer to the clinic.
TRANSCRIPT
From Sequencer to Clinic: Managing Science and Scale
Sultan Meghji, Vice President of Product Strategy
World Genome Data Analysis Summit
November 28, 2012
AGENDA
Challenges Along the Path from Genomics Research to Personalized Medicine
Implementing technology
Implementing science
Scaling from research to clinic
The Problem Restated…
What’s the most efficient, reliable and robust way to capture my genetic data, analyze it and secure it for re-analysis and deeper interpretation in a clinical setting?
Enabling Science at Scale
Platform for big data
Analytics framework for implementing science
Flexible deployment
Overview
CUSTOMER NEEDS
Mega-scale data management and data analysis.
Complex Pipeline Development, Test & Deployment.
Infrastructure costs, complexity, security and compliance.
The Path from Genomics Research to Personalized Medicine
Accelerating the Science of Genetic Discovery for Researchers, Bioinformatics Specialists & Tool Development.
Target: Clinicians and Patients leveraging a dynamically expanding field of science.
Government Funding
3rd Party Payers
THE GENOMICS DATA PROBLEM
Source: WSJ, NYT, Genome Medicine
Overpowering need for genomic data analysis tools
“We can sequence the genome for dirt cheap, but we don’t know how to deal with the data.”
“How do we avoid the pitfall of having cheap human genome sequencing but complex and expensive manual analysis to make clinical sense out of the data?”
Eric Green, M.D., Ph.D., Director, NHGRI
Elaine Mardis, Ph.D., Director of Technology Development
THE BIG DATA CHALLENGE IN GENOMICS
“Big Data” is essentially large amounts of data
Multiple sources or data formats
Unstructured or semi-structured
Difficult to put into databases and analyze
Seen in other industry areas:
Finance
Logistics
Geospatial
Defense
Telecom
Genomics data is “Big Data”
THE BIG DATA CHALLENGE IN GENOMICS
“Big Data” challenges in genomics
Source: Appistry proprietary market research by CBT Advisors
“Moving data around and storing the data is painful. It’s a huge problem for us. We’re looking at the cloud for processing options.” - Carol Rohl, Ph.D., Director, Merck Research Labs
“Bioinformatics tools and reference datasets change monthly, weekly and in some cases daily. This requires easy to manage application and data management platforms to keep up to date with all the changes.”
- Sultan Meghji, Appistry, GigaOM 2012
“Datasets are so large, you have to analyze them at the same site where the data is or using mirrors. You do not want to be writing it onto a remote hard drive and move the data each time you want to analyze it.” - John Monahan, Novartis Institutes for Biomedical Research
STORAGE
APPLICATIONS
COMPUTATION
APPROACHES
Current solutions address challenges individually…
CLOUD STORAGE
ANALYTICS
USER-FOCUSED TOOLS
STORAGE
APPLICATIONS
COMPUTATION
Bewildering array of partial or overlapping solutions that may or may not get you there…
WHY APPISTRY?
Private Cloud for Genomics enables Storage, Computation, Your Workflow & Science, Security & Compliance
CLOUD STORAGE
ANALYTICS
USER-FOCUSED TOOLS
APPISTRY’S GENOMIC SOLUTION
Capabilities needed
Private Cloud Genomics Services (HIPAA Compliance)
Automated Data Management and Storage Tightly Coupled to Analysis
Massively Scalable/Reliable Fabric for Algorithms, Tools and Applications
Analytics Layer Simplifies the Build, Test and Deployment of Analytic Pipelines
Industry Tools, Data Sets and YOUR Science
Step 1: Download Data From Sequencer
Step 2: Send Data to Storage via FTP or FedEx = 5+ Days
Step 3: Access Stored Data + Open-Source Algorithms
Step 4: Reprogram Algorithms for Infrastructure = Months
Step 5: Upload Algorithms + Data to Storage
Step 6: Download Public Gene Databases
Step 7: Reorganize Gene Database Info for Infrastructure
Step 8: Upload Gene Database to Storage
Step 9: Download All Stored Data
Step 10: Run Algorithms on Sequence Data
AYRRIS PRODUCT
Current genomic data analysis is cumbersome and costly
Source: Appistry survey
Data from Sequencers
Public Cloud
Public Gene Databases
Open-Source Algorithms
Costly Data Storage $$
HIPAA Compliance?
User
For Each New Data Set: Repeat steps 1, 2, 9, 10
For Each Open-Source Algorithm Update: Repeat steps 3-8
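The repeat rules on this slide can be tallied with a short sketch. It is illustrative only: the helper names `steps_repeated` and `describe` are hypothetical and not part of any Appistry tooling, and the step titles are copied from the slide.

```python
# The ten steps listed on the slide, keyed by step number.
STEPS = {
    1: "Download data from sequencer",
    2: "Send data to storage via FTP or FedEx",
    3: "Access stored data + open-source algorithms",
    4: "Reprogram algorithms for infrastructure",
    5: "Upload algorithms + data to storage",
    6: "Download public gene databases",
    7: "Reorganize gene database info for infrastructure",
    8: "Upload gene database to storage",
    9: "Download all stored data",
    10: "Run algorithms on sequence data",
}

# Repeat rules exactly as stated on the slide.
REPEAT_ON_NEW_DATA_SET = [1, 2, 9, 10]
REPEAT_ON_ALGORITHM_UPDATE = list(range(3, 9))  # steps 3-8


def steps_repeated(new_data_sets: int, algorithm_updates: int) -> int:
    """Count repeated step executions (hypothetical helper for illustration)."""
    return (new_data_sets * len(REPEAT_ON_NEW_DATA_SET)
            + algorithm_updates * len(REPEAT_ON_ALGORITHM_UPDATE))


def describe(step_numbers):
    """Return readable labels for the given step numbers."""
    return [f"Step {n}: {STEPS[n]}" for n in step_numbers]
```

For example, `steps_repeated(3, 2)` counts the repeated steps for three new data sets and two algorithm updates, and `describe(REPEAT_ON_NEW_DATA_SET)` lists the steps redone for every new data set.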
CLOUD WORKFLOW
Appistry Private Cloud for Genomics Workflow
Data from Sequencers
Appistry Courier Over HTTPS
HIPAA Compliant Genomics Cloud
Annotated Results & Visualizations
Ayrris Pipelines
Your Science
Appistry Private Cloud
SNPs, Indels, Rare Variants, etc.
Data Center and Researcher
Consumption of Results by internal Bioinformaticians and Clinicians
SFTP Transfer
BUSINESS MODEL
Cloud-based genomic data analysis and storage
Subscription to Appistry’s secure, HIPAA compliant cloud storage
Design and Implement the Solution for Cloud-based Use, Within Your Own Data Center or Both (‘Cloud-Bursting’) – Same Science, Same Workflow
On-site modular turn-key hardware and software
Enterprise-level implementation of private network HIPAA-enabled storage
APPISTRY CLOUD
APPISTRY APPLIANCE
via INTERNET
INSTITUTION
Same access to pipeline analysis algorithms & annotations (Same Science)
Same underlying technology and efficiency
CLOUD WORKFLOW
Integrate within broader Healthcare Perspective
Data from Sequencers
Regulatory Compliant Genomics Cloud
Annotated Results & Visualizations
Ayrris Pipelines
Your Science
Appistry Private Cloud
Data from other instruments
Integrated with Medical Data – EMR, Biller/Payer
Integrated with Research Data Systems (Genomics, Pharma)
Secured, Integrated Workflows, Data Management and Analysis
APPROACHES
Democratization is the solution for “big data” – Appistry is the solution for genetic data
iTunes
“DEMOCRATIZATION” OF DATA
Internet Data
Music
Digital Video
Books
Genomic Data
And in the future, everyone will be able to make decisions from genomic information
7 BILLION
Breast cancer
Osteoporosis
Lung cancer
Heart disease
Autism
Leukemia
ADHD
Genetic disorders
Decisions for prevention or early treatment
Genomic Information
Thanks for Your Attention
main: 314.450.5720
fax: [email protected]
appistry.com
1141 South 7th St., Suite 300
St. Louis, MO 63141