Whole Genome, Exome, or Custom Targeted Sequencing: How do I choose?
Aaron Thorner, PhD
Clinical Genomics Group Leader
Center for Cancer Genome Discovery (CCGD)
Dana-Farber Cancer Institute
Outline
Center for Cancer Genome Discovery (CCGD)
Power of massively parallel sequencing (MPS)
MPS workflow
Genome, Exome, or Custom Targeted Sequencing
Sequencing analysis and reporting
CCGD Mission
To advance precision cancer medicine by developing new technologies for the analysis of cancer genomes and to provide basic, translational, and clinical investigators with access to these technologies.
• Technology development: To develop new technologies for the analysis of cancer genomes.
• Collaborations: To provide access to these genomic technologies to basic, translational, and clinical investigators at Dana-Farber and beyond.
• Translation: To translate technologies to the clinical setting.
CCGD Structure
CCGD is the research and development group within the Precision Cancer Medicine effort at Dana-Farber Cancer Institute, Brigham and Women's Hospital, and Boston Children's Hospital.
Massively Parallel Sequencing (MPS)
• Enables comprehensive genome analysis quickly, accurately, and economically
• However, cost of analysis and storage has not followed the same trend!
The Power of MPS
• Whole-genome, whole-exome, transcriptome, and targeted sequencing
• Detect rare alleles/mutations
• Discover indels, translocations, and copy number variations
• Determine potential driver mutations
• Identify mechanisms of drug resistance
• Measure expression changes
Integrative Genomics Viewer (IGV)
Thorvaldsdóttir, Robinson et al., 2012
CCGD Instrumentation
© Illumina, Inc. All rights reserved. Images courtesy of Illumina, Inc.
Instrument | HiSeq 3000 (patterned flow cell) | HiSeq 2500 Rapid Run Mode | MiSeq
Read Length | 2 x 100 Paired End (PE) | 2 x 100 PE | 2 x 100 PE
Lanes per Flow Cell | 8 | 2 | 1
PF Reads per Lane | > 600 million | 300 million | 28 million
Target Coverage | 80% > 30x | 80% > 30x | 80% > 30x
Time | 40 hours | 40 hours | 17 hours
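The "80% > 30x" target in the table presumably means that at least 80% of targeted bases are covered by more than 30 reads. The sketch below shows how such a breadth-of-coverage figure could be computed from per-base depths. The function name and the toy depth values are hypothetical and not taken from the slides; a real pipeline would derive depths from aligned BAM files with dedicated tools.

```python
def fraction_covered_above(depths, threshold=30):
    """Return the fraction of target bases with depth strictly above threshold."""
    if not depths:
        raise ValueError("depths must not be empty")
    covered = sum(1 for d in depths if d > threshold)
    return covered / len(depths)


# Toy per-base depths for a 10-base target (made-up numbers for illustration).
toy_depths = [45, 52, 31, 30, 12, 80, 64, 33, 29, 41]

breadth = fraction_covered_above(toy_depths, threshold=30)
meets_target = breadth >= 0.80  # the "80% > 30x" criterion
print(f"{breadth:.0%} of bases above 30x; meets target: {meets_target}")
```

Note that the comparison is strict, so a base at exactly 30x does not count toward the covered fraction.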
Workflow
gDNA samples received (50-200 ng FFPE, FF, etc.)
Quantify DNA (PicoGreen dsDNA quantification)
DNA shearing (Covaris)
Agilent Bioanalyzer or Agilent TapeStation QC analysis, Clean-up
Manual or automated (Beckman-Coulter Biomek FXp)
Library construction (Illumina, Beckman, Kapa); QC: Bioanalyzer and MiSeq quant
Agilent SureSelect Hybrid Capture
Sequencing
Genome, Exome, or Targeted Sequencing
Should I sequence the whole genome, the whole exome, or a targeted set of exons?
• Currently, the functional genome is more clinically relevant/actionable
• More cost-efficient to sequence portions of the genome
• Targeted Sequencing:
  • Higher multiplexing of samples in flow cell lanes reduces cost
  • Greater sequencing depth
  • Simultaneous detection of mutations, copy number alterations and translocations
• Targeted panels (Agilent Technologies)
  • Whole Exome v5: 50.4 Mb
  • OncoPanel v3: 550 genes + translocations, 2.8 Mb + 0.8 Mb
  • Custom targeted panels: sizes vary
Whole Genome Sequencing (3000 Mb)
Pros:
• Unbiased: sequence all genes, regulatory regions, etc.
• Detect structural variants and copy number aberrations
• Coverage uniformity
• Longer reads
• De novo genome assembly
Cons:
• High cost
• Data storage
• Difficult clinical research interpretation
• Coverage lower (somatic mutation detection more difficult)
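To illustrate why lower coverage makes somatic mutation detection harder, here is a rough binomial sketch comparing a 10x genome with a 200x panel (the mean coverages quoted later in this deck). The 5% variant allele fraction and the requirement of at least 3 supporting reads are assumptions chosen for illustration, and the model ignores sequencing error, mapping bias, and uneven coverage.

```python
from math import comb


def prob_at_least_k_alt_reads(depth, allele_fraction, k=3):
    """Binomial probability of observing at least k variant-supporting reads."""
    p_fewer = sum(
        comb(depth, i) * allele_fraction**i * (1 - allele_fraction) ** (depth - i)
        for i in range(k)
    )
    return 1.0 - p_fewer


# Assumed scenario: a subclonal variant present in 5% of reads at the site.
p_wgs = prob_at_least_k_alt_reads(depth=10, allele_fraction=0.05)
p_panel = prob_at_least_k_alt_reads(depth=200, allele_fraction=0.05)
print(f"10x:  {p_wgs:.3f}")
print(f"200x: {p_panel:.3f}")
```

Under these assumptions the variant is very unlikely to reach three supporting reads at 10x, while at 200x it almost always does.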
Whole Exome (50.4 Mb; < 2% of genome)
Pros:
• Reduced cost
• Higher coverage
• Lower data storage costs
• Exome more highly characterized than whole genome
• Faster turnaround
• More samples sequenced per lane
• Copy number and structural variants (w/ limitations)
• Quicker/cheaper analysis
Cons:
• Copy number and structural variants limitations
• Difficult clinical research interpretation (variants of unknown significance, VUS)
• Genes with unknown function
• Virtually no coverage of regulatory regions
• Less uniform coverage versus WGS
Custom Targeted Sequencing (OncoPanel + translocations) (3.6 Mb; ~0.12% of genome)
Pros:
• Choose your genes/regions of interest
• Reduced cost
• Higher coverage
• Lower data storage costs
• Less material needed for capture
• Easier clinical research interpretation
• Exome more highly characterized than whole genome
• Faster turnaround
• More samples sequenced per lane
• Some copy number and structural variant analysis
• Quicker/cheaper analysis
Cons:
• Copy number and structural variants highly limited
• False negatives (variant of significance is missed)
• VUS
• Virtually no coverage of regulatory regions, unless targeted
• Less uniform coverage
Q: Should I sequence the whole genome, whole exome, or a targeted set of exons?
A: It depends on your scientific question!
- Do you need deeper coverage?
- Are potential causative genes known or unknown?
- Regulatory regions?
- Structural and copy number variations?
- How much money and material do you have?
- What is the starting quality of your gDNA?
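As a rough illustration, a few of these questions can be folded into a toy decision helper based on the pros and cons listed on the earlier slides. The function, its parameter names, and its rules are a hypothetical simplification, not a CCGD policy. It deliberately ignores budget, input material, and gDNA quality, all of which can override the suggestion.

```python
def suggest_approach(genes_known, need_regulatory_regions, need_genome_wide_sv):
    """Suggest a sequencing strategy from three yes/no answers (toy rules)."""
    if need_genome_wide_sv:
        # Structural variants and copy number are limited in exome/targeted data.
        return "whole genome"
    if need_regulatory_regions:
        # Regulatory regions are only covered by a panel if explicitly targeted.
        return "custom targeted" if genes_known else "whole genome"
    # Coding regions only: known genes favor a deep, cheap panel.
    return "custom targeted" if genes_known else "whole exome"


print(suggest_approach(genes_known=True, need_regulatory_regions=False,
                       need_genome_wide_sv=False))
print(suggest_approach(genes_known=False, need_regulatory_regions=False,
                       need_genome_wide_sv=False))
```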
 | Whole Genome | Exome v5 | OncoPanel
Instrument | HiSeq 2500 | HiSeq 2500 | HiSeq 2500
Read Length | 2 x 100 PE | 2 x 100 PE | 2 x 100 PE
# Lanes | 1 | 1 | 1
# Samples | 1 | 3 | 15
# Reads per Lane | 300 million | 300 million | 300 million
Mean Target Coverage* | 10 x | 150 x | 200 x
*Estimated. Results vary based on sample quality.
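The coverage estimates above can be sanity-checked with simple arithmetic: bases sequenced per sample divided by target size. The sketch below assumes that "300 million" counts individual 100 bp reads, an interpretation that reproduces the 10x whole-genome figure, and uses the target sizes quoted earlier (3000 Mb, 50.4 Mb, 3.6 Mb). The "implied efficiency" is simply the ratio of the quoted coverage to this raw upper bound. Attributing the gap to off-target reads, duplicates, and similar losses is an assumption, not something stated on the slides.

```python
def raw_mean_coverage(reads_per_lane, read_length_bp, target_mb, samples_per_lane):
    """Upper-bound mean coverage if every sequenced base landed on target."""
    bases_per_sample = reads_per_lane * read_length_bp / samples_per_lane
    return bases_per_sample / (target_mb * 1e6)


# Target sizes, samples per lane, and quoted coverage from the slides.
designs = {
    "Whole Genome": {"target_mb": 3000.0, "samples": 1, "quoted": 10},
    "Exome v5": {"target_mb": 50.4, "samples": 3, "quoted": 150},
    "OncoPanel": {"target_mb": 3.6, "samples": 15, "quoted": 200},
}

for name, d in designs.items():
    raw = raw_mean_coverage(300e6, 100, d["target_mb"], d["samples"])
    implied_efficiency = d["quoted"] / raw
    print(f"{name}: raw {raw:.0f}x, quoted {d['quoted']}x, "
          f"implied efficiency {implied_efficiency:.0%}")
```

The smaller the target, the larger the gap between the raw bound and the quoted figure, which is consistent with capture-based methods losing a share of reads that whole-genome sequencing does not.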
Bioinformatics Analysis
Demultiplex Reads In Lane
Lane Level Demultiplexed BAM files
Alignment To Reference Genome
Lane Level Aligned BAM file
Merge Aligned & Unaligned BAM
Duplicate Marking
Realignment Around Known Indels
Quality Score Recalibration
• BAM files ready for aggregation
• Lane level quality metrics
• SNP site genotype calls
• Variant and Indel detection
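To make the duplicate-marking step concrete, here is a simplified sketch of the idea: reads that map to the same chromosome, start position, and strand are treated as copies of one original fragment, and all but the best-quality copy are flagged. The `Read` record and its fields are hypothetical stand-ins for BAM records. Production tools also use the mate position of paired-end reads and more careful scoring, so this is an illustration of the concept rather than the pipeline's actual implementation.

```python
from collections import namedtuple

# Hypothetical minimal read record (real pipelines operate on BAM records).
Read = namedtuple("Read", ["name", "chrom", "start", "strand", "mean_qual"])


def mark_duplicates(reads):
    """Return the names of reads flagged as duplicates.

    Reads sharing (chrom, start, strand) form one group; the read with the
    highest mean quality is kept (first one wins ties), the rest are flagged.
    """
    best = {}
    for r in reads:
        key = (r.chrom, r.start, r.strand)
        if key not in best or r.mean_qual > best[key].mean_qual:
            best[key] = r
    kept = {r.name for r in best.values()}
    return {r.name for r in reads if r.name not in kept}


toy_reads = [
    Read("r1", "chr1", 100, "+", 30),
    Read("r2", "chr1", 100, "+", 35),  # same position and strand as r1
    Read("r3", "chr1", 100, "-", 20),  # opposite strand: not a duplicate
    Read("r4", "chr2", 500, "+", 25),
]
duplicates = mark_duplicates(toy_reads)
print(sorted(duplicates))
```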
MuTect: Cibulskis, K. et al. Sensitive detection of somatic point mutations in impure and heterogeneous cancer samples. Nat Biotechnology (2013). doi:10.1038/nbt.2514
GATK Indel Locator: http://www.broadinstitute.org/cancer/cga/indelocator
Reporting of Targeted Seq
• In-depth pre- and post-project discussions
• Comprehensive reporting with QCs and data
[Figure panels: Post-fragmentation; Post-library construction; Sequencing Metrics; Variant Detection; Copy Number Analysis]
www.dana-farber.org/CCGD [email protected]
[email protected] (Project Initiation)
[email protected] (Informatics)
Directors
Matthew Meyerson, MD, PhD
Laura MacConaill, PhD
Paul Van Hummelen, PhD
Faculty
William Hahn, MD, PhD
Adam Bass, MD
Rameen Beroukhim, MD, PhD
Matthew Freedman, MD
Levi Garraway, MD
Todd Golub, MD
Massimo Loda, MD
Kornelia Polyak, MD, PhD
Charles Roberts, MD, PhD
Kimberly Stegmaier, MD
Genomics Team
Aaron Thorner, PhD
Andrea Clapp
Haley Coleman
Samantha Drinian
Angelica Laing
Suzanne McShane
Edwin Thai
Liuda Ziaugra
Bioinformatics Team
Matthew Ducar
Joshua Bohannon
Robert Burns
Johann Hoeftberger
Phani Kishore
Monica Manam
Neil Patel
Paul Rapoza
Priyanka Shivdasani
Bruce Wollison
Thank you!
Agilent products described are For Research Use Only. Not for use in diagnostic procedures.