AWS Life Sciences

Developing a Clinical Genome Interpretation Pipeline at AWS

Reece Hart, Ph.D. [email protected]
InVitae, Inc. invitae.com
Copyright 2013 InVitae, Inc

Upload: reece-hart

Post on 11-May-2015

DESCRIPTION

Talk at AWS Life Sciences event in San Francisco, April 4, 2013.

TRANSCRIPT

Page 1: AWS Life Sciences

1/21 Copyright 2013 InVitae, Inc

Reece Hart, [email protected]

InVitae, Inc. invitae.com

Developing a Clinical Genome Interpretation Pipeline at AWS

Page 2: AWS Life Sciences

The Mission

To provide comprehensive, clinically-relevant information from genomic variation data in a single test.

Page 3: AWS Life Sciences

one sample
one requisition
one report
up to 264 conditions
two weeks
one lab, one price

InVitae's process features online requisitioning and reporting, CLIA-certified sequencing, and HIPAA-compliant information management.

intake
Requisitioning and Laboratory Information Management System
interpretation
sequencing review

Page 4: AWS Life Sciences

Where does InVitae fit?

photos: Baylor College of Medicine, Univ. Utah, learningradiology.com, sciencephotos.com

Patient presents with symptoms

If genomic interpretation might influence diagnosis or treatment, doctor refers patient to genetic counselor

GC takes history; sample is sent to an internal lab or to one of hundreds of labs that provide specific genomic tests

Sequencing and other lab data are processed into a preliminary interpretation

Report is returned to GC and/or physician, who verify interpretation and consult with patient

Page 5: AWS Life Sciences

http://www.ncbi.nlm.nih.gov/sites/GeneTests/

Page 6: AWS Life Sciences

Examples of published clinical variants

➢ Inherited conditions
● NM_000136.2:c.355_360delATGAGAinsT
‒ at risk for Fanconi Anemia

➢ Carrier conditions
● NM_000520.4:c.1277_1278insGATA
‒ carrier for Tay-Sachs

➢ Pharmacogenetic conditions
● HLAB*1502 & HLAB*5701 haplotypes (23 snps)
‒ Abacavir drug-induced hypersensitivity
‒ Flucloxacillin drug-induced liver injury
‒ Carbamazepine drug-induced cutaneous adverse events
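
The two sequence variants above are written in HGVS-style notation (transcript accession, coordinate type, position range, edit). As a rough illustration of how such a string breaks down, here is a minimal sketch that handles only the simple `c.` del/ins/delins forms shown on this slide. The names `SimpleVariant` and `parse_simple_hgvs` are hypothetical, not part of the talk or of any library, and a production pipeline would need a full HGVS parser.

```python
import re
from typing import NamedTuple


class SimpleVariant(NamedTuple):
    accession: str   # versioned transcript, e.g. "NM_000136.2"
    start: int       # first affected position (coding coordinates)
    end: int         # last affected position
    edit: str        # "del", "ins", or "delins"
    deleted: str     # deleted bases, if spelled out
    inserted: str    # inserted bases, if any


# Assumption: only c. coordinates with plain integer positions and
# del / ins / delins edits are supported (no introns, UTRs, dups, etc.).
_SIMPLE_HGVS = re.compile(
    r"^(?P<ac>[A-Z]{2}_\d+\.\d+):c\."
    r"(?P<start>\d+)(?:_(?P<end>\d+))?"
    r"(?P<del>del[ACGT]*)?(?P<ins>ins[ACGT]+)?$"
)


def parse_simple_hgvs(text: str) -> SimpleVariant:
    """Split a simple HGVS-style coding variant into its parts."""
    m = _SIMPLE_HGVS.match(text)
    if m is None or not (m.group("del") or m.group("ins")):
        raise ValueError(f"unsupported variant string: {text!r}")
    start = int(m.group("start"))
    end = int(m.group("end") or start)
    deleted = (m.group("del") or "")[3:]    # strip the "del" keyword
    inserted = (m.group("ins") or "")[3:]   # strip the "ins" keyword
    edit = ("del" if m.group("del") else "") + ("ins" if m.group("ins") else "")
    return SimpleVariant(m.group("ac"), start, end, edit, deleted, inserted)


if __name__ == "__main__":
    for s in ("NM_000136.2:c.355_360delATGAGAinsT",
              "NM_000520.4:c.1277_1278insGATA"):
        print(parse_simple_hgvs(s))
```

In the first example the position range 355 to 360 spans six bases, which matches the six deleted bases spelled out in the string; a check like that is one cheap consistency test a curation database could apply.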

Page 7: AWS Life Sciences

Reported Variants

Page 8: AWS Life Sciences

Our Report

Page 9: AWS Life Sciences

Page 10: AWS Life Sciences

NOW

Early Access Commercial Program (CLIA Certified)

264 GENETIC TESTS

$1,500

2014

Goal to update offering every six months

>1000 GENETIC TESTS

<$1,000

50x minimum, ~425x average
100% coverage of curated variants
>90% of targeted regions
SNVs, indels (<100nt), VUS
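
The depth and coverage targets above can be read as a simple pass/fail check on a sequenced sample. The following is a minimal sketch under assumptions that the slide does not state: that 50x is a per-base depth threshold, that "coverage" means bases at or above that threshold, and that every curated variant position must meet it. The names `coverage_qc`, `depth_by_pos`, and `curated_positions` are hypothetical.

```python
def coverage_qc(depth_by_pos, curated_positions,
                min_depth=50, min_target_fraction=0.90):
    """Summarize read depth over targeted bases and apply slide-style thresholds.

    depth_by_pos: dict mapping targeted position -> read depth
    curated_positions: positions of curated variants that must be covered
    """
    if not depth_by_pos:
        raise ValueError("no targeted positions supplied")
    depths = list(depth_by_pos.values())
    # Fraction of targeted bases sequenced at or above the minimum depth.
    n_ok = sum(1 for d in depths if d >= min_depth)
    target_fraction = n_ok / len(depths)
    # Every curated position must reach the minimum depth (100% requirement).
    curated_ok = all(depth_by_pos.get(p, 0) >= min_depth
                     for p in curated_positions)
    return {
        "mean_depth": sum(depths) / len(depths),
        "target_fraction": target_fraction,
        "curated_ok": curated_ok,
        "passes": curated_ok and target_fraction > min_target_fraction,
    }


if __name__ == "__main__":
    # Toy data: 20 targeted bases at 400x, one of which dropped to 12x.
    depth = {pos: 400 for pos in range(1, 21)}
    depth[7] = 12
    print(coverage_qc(depth, curated_positions={3, 15}))
```

With the toy data, 19 of 20 bases meet the threshold, so the sample passes as long as the low-depth base is not a curated position.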

Page 11: AWS Life Sciences

How?

By deeply integrating custom genetic curation, sequencing assay design, and analytical pipelines.

Page 12: AWS Life Sciences

Curation + Assay Design + Pipeline

curation

analysis pipeline

A>T

sequence assay

curation

Page 13: AWS Life Sciences

Architectural Overview

Protected Health Information: on-site, encrypted, restricted access

IPSec Tunnel

4.5 GB up, 4 MB down

Page 14: AWS Life Sciences

Pipeline Overview

reads → aligned reads
● bwa, samtools, gatk, picard

aligned reads → variants
● gatk, freebayes, custom variant calling

variants → report
● call haplotypes, match variants, VUS pipeline, render

Additional inputs and outputs shown on the diagram:
● + assay regions
● + GRCh37 ref
● + 1kg known indels
● + qualities
● + metrics
● + quality metrics
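
The three-stage flow above (reads to aligned reads to variants to report) can be modelled as a chain of stages where each stage consumes what the previous one produced. The following is only a sketch of that structure, not InVitae's implementation: the stage names, the `Stage` class, and `check_chain` are hypothetical, and attaching the GRCh37 reference and 1kg known indels to the alignment stage is an assumption about the diagram. No tool is actually invoked.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    name: str                # hypothetical label for the stage
    consumes: str            # data type taken as input
    produces: str            # data type emitted
    steps: tuple             # tools or steps named on the slide
    extra_inputs: tuple = () # assumed auxiliary inputs


PIPELINE = (
    Stage("align", "reads", "aligned reads",
          ("bwa", "samtools", "gatk", "picard"),
          ("GRCh37 ref", "1kg known indels")),  # assumed placement
    Stage("call", "aligned reads", "variants",
          ("gatk", "freebayes", "custom variant calling")),
    Stage("report", "variants", "report",
          ("call haplotypes", "match variants", "VUS pipeline", "render")),
)


def check_chain(stages):
    """Verify each stage consumes the previous stage's output.

    Returns the sequence of data types flowing through the pipeline.
    """
    for prev, nxt in zip(stages, stages[1:]):
        if prev.produces != nxt.consumes:
            raise ValueError(
                f"{nxt.name!r} expects {nxt.consumes!r} "
                f"but {prev.name!r} produces {prev.produces!r}"
            )
    return [stages[0].consumes] + [s.produces for s in stages]


if __name__ == "__main__":
    print(" -> ".join(check_chain(PIPELINE)))
```

Declaring stages as data makes it cheap to validate the wiring before launching any compute, which matters when worker nodes are provisioned on demand.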

Page 15: AWS Life Sciences

AWS Topology

1 VPC, 3 subnets

NFS

interactive hosts

web services

scaled dynamically

build/test

Page 16: AWS Life Sciences

What's worked well at AWS?

➢ Performance and Capacity Scaling
● on-demand

➢ Security and VPG
● Simple, comprehensive rules for VPCs

➢ Service ecosystem: EC2, EBS, S3, R53, IAM, VPC
● boto!

➢ Future:
● Archive: S3 and Glacier?
● Investigate workflows
● Reanalysis

Page 17: AWS Life Sciences

What challenges have we experienced?

➢ Large shared filesystems
● poor and variable performance (better now)
● familiarity, flexibility, transparency
● existing software expectations

➢ Node failures
● Once upon a time... 0-3 times per day, no warning
● moved zones; no longer an issue

➢ Devops
● Built our own, but moving to puppet

Page 18: AWS Life Sciences

Page 19: AWS Life Sciences

One source of variation...

attribution unknown

Page 20: AWS Life Sciences
