1
© 2016 Illumina, Inc. All rights reserved. Illumina, 24sure, BaseSpace, BeadArray, BlueFish, BlueFuse, BlueGnome, cBot, CSPro, CytoChip, DesignStudio, Epicentre, ForenSeq, Genetic Energy, GenomeStudio, GoldenGate, HiScan, HiSeq, HiSeq X, Infinium, iScan, iSelect, MiniSeq, MiSeq, MiSeqDx, MiSeq FGx, NeoPrep, NextBio, Nextera, NextSeq, Powered by Illumina, SureMDA, TruGenome, TruSeq, TruSight, Understand Your Genome, UYG, VeraCode, verifi, VeriSeq, the pumpkin orange color, and the streaming bases design are trademarks of Illumina, Inc. and/or its affiliate(s) in the US and/or other countries. All other names, logos, and other trademarks are the property of their respective owners.
AppliedGenomics
Exploring Illumina’s Newest Sequencing Solutions
Dan Gheba
Sr. Sequencing Specialist
September 14th, 2016
2
HiSeq HD Systems
3
2014: HiSeq X, Population-Scale Human Genome Sequencing
1.8 Tb | 6B READS | PE150 | <3 DAYS
4
Sequencing Power for Every Scale
[Chart: x-axis, decreasing price per GB; y-axis, increasing system output & price]
HiSeq 2500: 1000 Gb | 4B reads | 2x125
NextSeq: 120 Gb | 400M reads | 2x150
MiSeq: 15 Gb | 25M reads | 2x300
HiSeq 3000: 750 Gb | 2.5B reads | 2x150
HiSeq 4000: 1500 Gb | 5B reads | 2x150
HiSeq X Ten: 1800 Gb | 6B reads | 2x150
HiSeq X Five: 1800 Gb | 6B reads | 2x150
MiniSeq: 7.5 Gb | 25M reads | 2x150
Applications: RNA-seq, Exomes, WGS, Methylation
Applications: Prokaryote, Targeted, QC
Applications: WGS – Large Scale
5
HiSeq 4000: Over 2x more data in 1/3 the time!
V3: 600 GB | 3B READS | PE100 | 11 DAYS
V4: 1 TB | 4B READS | PE125 | 6 DAYS
HD: 1.5 TB | 5B READS | PE150 | 3.5 DAYS
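The headline claim on this slide can be checked with a little arithmetic. The sketch below assumes the three runs map to the labels as V3 = 600 GB in 11 days, V4 = 1 TB in 6 days, and HD = 1.5 TB in 3.5 days, and that the "2x more data in 1/3 the time" comparison is HD against V3; both are readings of the slide, not something it states outright.

```python
# Output and run time per chemistry version, as read from the slide.
# The label-to-row mapping is an assumption (the slide layout was flattened).
runs = {
    "V3": {"output_gb": 600, "days": 11},
    "V4": {"output_gb": 1000, "days": 6},
    "HD": {"output_gb": 1500, "days": 3.5},
}


def gb_per_day(run):
    """Average daily output over one full run."""
    return run["output_gb"] / run["days"]


# Compare HD against V3 (assumed to be the baseline for the headline).
data_ratio = runs["HD"]["output_gb"] / runs["V3"]["output_gb"]
time_ratio = runs["HD"]["days"] / runs["V3"]["days"]

for name, run in runs.items():
    print(f"{name}: {gb_per_day(run):.0f} Gb/day")
print(f"HD vs V3: {data_ratio:.1f}x the data in {time_ratio:.2f} of the time")
```

Under these assumptions HD delivers 2.5 times the data of V3 in a bit under one third of the run time, which is consistent with the slide's wording.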
6
What can you do with just ONE LANE of HiSeq 4000 data?
Drosophila: 130 Mb Genome | 715X Coverage
Mouse: 2.7 Gb Genome | 34X Coverage
Arabidopsis: 157 Mb Genome | 590X Coverage
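The coverage figures follow from dividing the yield of one lane by the genome size. The sketch below is a rough reconstruction, assuming the 1500 Gb run output quoted elsewhere in the deck is spread evenly over 16 lanes (two 8-lane flow cells); the lane count is an assumption, not a number given on this slide.

```python
# Run output from the deck; the lane count is an assumption
# (two flow cells x eight lanes), used only for illustration.
HISEQ_4000_OUTPUT_GB = 1500
LANES_PER_RUN = 16

# Genome sizes as listed on the slide, in megabases.
GENOME_SIZES_MB = {"Drosophila": 130, "Mouse": 2700, "Arabidopsis": 157}

# Coverage values printed on the slide, for comparison.
SLIDE_COVERAGE = {"Drosophila": 715, "Mouse": 34, "Arabidopsis": 590}


def lane_yield_gb(total_gb=HISEQ_4000_OUTPUT_GB, lanes=LANES_PER_RUN):
    """Yield of a single lane, assuming output is split evenly."""
    return total_gb / lanes


def coverage(yield_gb, genome_mb):
    """Mean depth = sequenced bases / genome size."""
    return yield_gb * 1000 / genome_mb


coverages = {
    name: coverage(lane_yield_gb(), size_mb)
    for name, size_mb in GENOME_SIZES_MB.items()
}

for name, depth in coverages.items():
    print(f"{name}: ~{depth:.0f}X (slide: {SLIDE_COVERAGE[name]}X)")
```

The computed depths land within a few percent of the slide's numbers, which suggests the slide used a slightly lower per-lane yield (roughly 93 Gb) than an even split gives.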
7
Innovative Patterned Flow Cell Technology
Non-patterned FC: undefined feature, random spacing
Patterned FC: defined feature, ordered spacing
Nanowell Substrate: Billions of Ordered Wells
Defined feature size
Optimal cluster spacing
Increased cluster density
Faster, Simplified imaging
8
Supported Library Prep Kits on HiSeq 3000 and 4000 Systems
DNA
TruSeq® Nano
TruSeq® PCR-free
Nextera® XT
Nextera® Mate-Pair
TruSeq® Synthetic Long Read
Targeted DNA
Nextera® Rapid Capture Exome
Nextera® Rapid Capture Custom
TruSeq® Exome
TruSeq® Rapid Exome
RNA / Regulation
TruSeq® Stranded mRNA
TruSeq® Stranded Total RNA
TruSeq® RNA Access
TruSeq® Small RNA
TruSeq® ChIP
TruSeq® DNA Methylation
9
10
ATAC-seq on the HiSeq 4000
Data from Eric Chow, Ph.D., UCSF
11
ATAC-seq on the HiSeq 4000
Data from Eric Chow, Ph.D., UCSF
12
Robust performance of patterned flow cells over a broad range of input concentrations with quality libraries
Specification is 312 M reads per lane
75% PF yields 368 M reads per lane
13
NextSeq 500
14
The NextSeq® 500 Delivers on Three Key Aspects
Flexibility Speed Simplicity
15
System enhancements: Optics
Power of high-throughput apps. Size and affordability of a desktop sequencer.
6x MiSeq imaging capability
1/3 size of a HiSeq
1/3 capital cost of a HiSeq
6 parallel miniaturized, solid-state optics modules
16
Two Channel SBS – NextSeq 500
Two-channel SBS uses 2 images
Builds template over 5 cycles
Clusters appearing in green only are T
Clusters appearing in red only are C
Clusters appearing in both images are A
Clusters present in neither the green nor the red image are G
Cluster intensities are plotted and bases are called accordingly
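The two-image encoding above amounts to a small lookup: each cluster is either present or absent in the green and red images, and that pair picks the base. The sketch below only illustrates that mapping. The fixed `threshold` cutoff and the function names are hypothetical; the slide says intensities are plotted and bases called from that plot, which is more involved than a single cutoff.

```python
from typing import Iterable, Tuple


def call_base(green: bool, red: bool) -> str:
    """Map presence in the green and red images to a base, per the slide."""
    if green and red:
        return "A"  # signal in both images
    if green:
        return "T"  # green only
    if red:
        return "C"  # red only
    return "G"      # no signal in either image


def call_read(intensities: Iterable[Tuple[float, float]],
              threshold: float = 0.5) -> str:
    """Call one base per cycle from (green, red) intensity pairs.

    `threshold` is a hypothetical fixed cutoff used for illustration only.
    """
    return "".join(
        call_base(green > threshold, red > threshold)
        for green, red in intensities
    )


# One (green, red) pair per sequencing cycle for a single cluster.
example = [(0.9, 0.1), (0.1, 0.8), (0.9, 0.9), (0.0, 0.1)]
read = call_read(example)
print(read)  # TCAG
```

Because G is defined by the absence of signal, a cluster must first be located while it is emitting in at least one channel, which fits the slide's note that the template is built over several cycles.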
17
Fast 2-channel SBS
FASTER CHEMISTRY | HALF THE IMAGES
2x75 bp run: MiSeq v3 24 h | NextSeq 500 18 h
25% faster
18
Human Genome: 2 x 150 bp | 30 HOURS
Exome | Transcriptome: 2 x 75 bp | 18 HOURS
Gene Expression Profile: 1 x 75 bp | 12 HOURS
19
One System, Two Output Modes
High: 120 GIGABASES | PE150 | 400M CLUSTERS
Mid: 40 GIGABASES | PE150 | 130M CLUSTERS
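The output figures for the two modes follow from the cluster counts and read length: each cluster contributes two 150-base reads in a PE150 run. A minimal sketch of that arithmetic, with a hypothetical helper name:

```python
def output_gb(clusters_millions: float, read_length: int,
              paired: bool = True) -> float:
    """Total bases in gigabases: clusters x reads per cluster x read length."""
    reads_per_cluster = 2 if paired else 1
    return clusters_millions * 1e6 * read_length * reads_per_cluster / 1e9


# Cluster counts as quoted on the slide, PE150 in both modes.
high_output = output_gb(400, 150)
mid_output = output_gb(130, 150)

print(f"High output: {high_output:.0f} Gb")
print(f"Mid output: {mid_output:.0f} Gb")
```

High output comes to 120 Gb, matching the slide. Mid output comes to 39 Gb, which the slide rounds to 40.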
20
Flexible applications
High-Output: Up to 120 Gb | 400M clusters PF | 1 x 75 bp, 2 x 75 bp, 2 x 150 bp
20 GEX profiles
30x genome
6-12 exomes
RNA-Seq
Mid-Output: Up to 40 Gb | 130M clusters PF | 2 x 75 bp to 2 x 150 bp
6-36 panels
2-3 exomes
2-4 samples RNA-Seq
21
What about Panels?
System | RNA-seq/Exome (25M PE75) | ChIP-seq (10M SR50) | Amplicon (2M PE150)
MiniSeq | $935 | $400 | $120
MiSeq | $875 | $350 | $125
NextSeq | $170 | $35 | $25
$1,500 run : 25M reads
$1,530 run : 25M reads
$1,650 run : 130M reads
5X
Enabling a lower price point for large amplicon panels
22
SIMPLE LOAD AND GO REAGENTS
SETUP TIME ON PAR WITH MISEQ
Waste
Buffer
Chemistry
Flow Cell
23
CARTRIDGE FORMAT | AUTOMATED WASH | SELF-CLEANING
Designed for Simplicity and Efficiency
24
When do you Wash?
Cartridge reagents contain wash solution
Post-run, there will be an automated wash
Use within 2 weeks and NO wash is required
Focus less on washing, more on sequencing!
25
What is the potential?
Specification is 400 M reads
In this dataset - 600 million PF reads
(Add the 4 lanes circled)
https://basespace.illumina.com/s/DvjfrjGRZh6E
26
What is the potential? https://basespace.illumina.com/s/DvjfrjGRZh6E
Specification is >80% Q30
In this example - >90% Q30
27
What are researchers doing with their NextSeqs?
For Research Use Only. Not for use in diagnostic procedures.
28
Performance
29
Side by side data quality - Error rate
HiSeq 2500 vs NextSeq 500
Data trimmed to 2x75
HiSeq 2500 NextSeq 500
0.5%
30
TruSeq Amplicon on NextSeq and MiSeq Sequencers
Excellent somatic variant calling performance
R2: NextSeq 98.6% | MiSeq 97.3%
• 9 replicates HorizonDx Quantitative Multiplex Reference Standard. FFPE samples prepared with TruSeq Custom Amplicon, 2x151bp
• Contact an Illumina representative for access to the NextSeq data set
• TruSeq Amplicon Cancer Panel, Analysis with TruSeq Amplicon BaseSpace App v1.1. Stated limit of detection = 5% variant frequency
Metric | MiSeq® v3 | NextSeq v2
Yield (NextSeq data downsampled to 8Gb) | 8.1Gb | 39.4Gb
% >Q30 | 90.3 | 92.2
Passing SNVs (expected = 8, VF >5%) | 8 | 8
v2 Chemistry
31
TruSeq Whole-Genome Sequencing
High precision and recall across platforms
Platform/Application | Depth | % Aligned | SNV Precision | SNV Recall | Indel Precision | Indel Recall
HiSeq 2500 v4 WGS (NA12878) – 350bp | 39.8x | 94.0% | 99.9% | 89.8% | 96.6% | 83.6%
HiSeq X WGS (NA12878) – 350bp | 39.6x | 90.6% | 99.9% | 90.7% | 95.8% | 83.0%
HiSeq 4000 WGS (NA12878) – 350bp | 38.2x | 92.8% | 99.8% | 90.9% | 95.4% | 81.0%
NextSeq 500 v2 WGS (NA12878) – 350bp | 32.0x | 90.6% | 99.8% | 90.9% | 95.2% | 80.7%
NextSeq 500 v2 WGS (NA12878) – 550bp | 33.3x | 89.4% | 99.8% | 90.0% | 94.8% | 82.0%
Data sources: Most Projects below available in BaseSpace Public Data.
• HiSeq 2500 v4: TruSeq PCR Free (4 replicates of NA12878)
• HiSeq X Ten: TruSeq Nano (4 replicates of NA12878)
• HiSeq 4000: TruSeq Nano 350 (NA12878, 6plex)
• NextSeq 500 v2: TruSeq Nano 350 (NA12878)
• NextSeq 500 v2: TruSeq Nano 550 (NA12878)
Data above analyzed using BWA Whole-Genome Sequencing v1.0 and VCAT 2.0 BaseSpace apps
32
Gene-level FPKM Comparisons
33
[Figure: gene-level FPKM for GAPDH and CALR, compared across HiSeq 4000, NextSeq 500, and HiSeq 2500]
34
Variant Calling Assessment
Unique to HiSeq 4000 Unique to HiSeq 2500
Common on both platforms
Consistent results between HiSeq 4000 and HiSeq 2500 platforms
35
HiSeq 4000 vs HiSeq 2500 Callability Across AT / GC Regions
[Figure: Callability in AT / GC Regions, scale 0 to 1; series HiSeq 2500 2x126 and HiSeq 4000 2x126]
Comparison of AT/GC rich regions of the genome
Both platforms offer excellent coverage
Improved coverage in C rich and G rich regions on HiSeq 4000 system
36
Callability in Genomic Regions
Consistent performance across both platforms
[Figure: Callability in Genomic Regions, scale 0 to 1; series HiSeq 2500 2x126 and HiSeq 4000 2x126]
37
SNP Variant Frequencies (dbSNP)
[Figure: two histograms, HiSeq 4000 and HiSeq 2500; x-axis Variant Frequency, y-axis Frequency]
• Similar distributions for both platforms
• Peaks at 50% and 100% variant frequency as expected
38
SNP Variant Frequencies (not in dbSNP)
[Figure: two histograms, HiSeq 4000 and HiSeq 2500; x-axis Variant Frequency, y-axis Frequency]
• Similar distributions for both platforms
• Mostly low-frequency variants
39
Economics
40
Comparison Matrix – Paired End - Per GB
System | PE100/125/150 Genomes | PE50/75 Exomes or RNA-seq
HD | $21 | $29
2500 v4 | $32 | $53
2000 v3 | $48 | $71
2500 RM | $55 | $82
NextSeq500 | $35 | $44
1.5X 1.5X
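The two "1.5X" callouts are not labeled in this transcript, so it is unclear which pairs of systems they compare. One way to explore this is to express every per-GB price as a multiple of the HD price and see which ratios fall near 1.5. The sketch below does that with the prices from the matrix; treating HD as the baseline is an assumption.

```python
# Per-GB list prices from the comparison matrix:
# (PE100/125/150 genomes, PE50/75 exomes or RNA-seq).
PRICE_PER_GB = {
    "HD": (21, 29),
    "2500 v4": (32, 53),
    "2000 v3": (48, 71),
    "2500 RM": (55, 82),
    "NextSeq500": (35, 44),
}

# Assumption: HD is the baseline the slide's callouts are measured against.
BASELINE = "HD"


def ratios_to_baseline(prices, baseline=BASELINE):
    """Express each system's per-GB price as a multiple of the baseline's."""
    base_long, base_short = prices[baseline]
    return {
        system: (long_price / base_long, short_price / base_short)
        for system, (long_price, short_price) in prices.items()
    }


ratios = ratios_to_baseline(PRICE_PER_GB)
for system, (long_ratio, short_ratio) in ratios.items():
    print(f"{system}: {long_ratio:.2f}x genomes, {short_ratio:.2f}x exomes/RNA-seq")
```

With HD as the baseline, the ratios closest to 1.5 are the HiSeq 2500 v4 for long paired-end genomes and the NextSeq 500 for short paired-end exomes or RNA-seq, at about 1.52 each. Whether those are the comparisons the slide intended cannot be confirmed from the transcript.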
41
RNA-seq Throughput
Assumes 50M reads per sample requirement
5x Days a week $330 per sample
160 Monthly
3x per Month $480 per sample
~180 monthlyOr
4x per month $400 per sample
~370 monthly
3x Days a week $220 per sample
1200 Monthly
$900,000*
Shorter run times = smaller hit on throughput due to service
*all based on list price
$250,000*
42
Sample Prep Automation
43
NeoPrep: Automation for the everyday lab
Up to 16 samples per run
• Sequencing-ready libraries in ~7–11 hrs
• 30 minutes of hands-on time
Run time: ~10.5 hrs
Hands-on time: 30 min.
NeoPrep Stranded mRNA
Recover sequencing-ready libraries
Collect libraries
NeoPrep system
Prepare samples for loading
Total RNA
44
45