For Research Use Only. Not for use in diagnostic procedures. © Copyright 2021 by Pacific Biosciences of California, Inc. All rights reserved. Pacific Biosciences, the Pacific Biosciences logo, PacBio, SMRT, SMRTbell, Iso-Seq, and Sequel are trademarks of Pacific Biosciences. BluePippin and SageELF are trademarks of Sage Science. NGS-go and NGSengine are trademarks of GenDx. FEMTO Pulse and Fragment Analyzer are trademarks of Advanced Analytical Technologies. All other trademarks are the sole property of their respective owners.
Towards Isoform Resolution Single-Cell Transcriptomics for Clinical Applications Using Highly Accurate Long-Read Sequencing
Abstract #: 1873
Elizabeth Tseng¹, Jason G. Underwood¹, Arjun Scott Nanda², Vijay Ramani², Scott N. Furlan³
¹PacBio, 1305 O’Brien Drive, Menlo Park, CA 94025
²UCSF, San Francisco, CA
³Fred Hutchinson Cancer Research Center, Seattle, WA
Improving scIso-Seq Throughput on PacBio Systems
PacBio Sequencing & Deconcatenation
Single-Cell Deconvolution With Short or Long Reads
• PacBio Iso-Seq method generates full-length transcript sequences up to ~15kb with high accuracy (>99.9%)
• 10X single-cell systems produce ~50% TSO-TSO artifact cDNA
• Using TSO artifact depletion and cDNA concatenation, we achieve ~6X throughput, or 8-9 million full-length cDNA molecules per SMRT Cell 8M for the 10X single-cell platform
• We applied this throughput-improvement method to 10X single-cell libraries sequenced on PacBio Sequel II systems
• Demonstrated cell BC concordance with matching short read libraries
• Full-length isoform information revealed distinct expression levels in T cells not observable through 3’ tagging methods
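As a rough illustration of the deconcatenation idea in the bullets above, the sketch below splits one concatenated read into its component cDNAs at each occurrence of a linker sequence. The linker sequence, the helper name `deconcatenate`, and the use of exact string matching are all assumptions made for illustration. They are not the poster's actual linkers or software, and a real pipeline would need error-tolerant matching and strand handling.

```python
def deconcatenate(read: str, linker: str) -> list[str]:
    """Split a concatenated read at every exact occurrence of `linker`.

    Hypothetical helper: real deconcatenation would tolerate sequencing
    errors and handle both strands; this sketch does neither.
    """
    if not linker:
        raise ValueError("linker must be a non-empty sequence")
    # str.split removes the linker itself; drop empty pieces that arise
    # when a linker sits at the very start or end of the read.
    return [segment for segment in read.split(linker) if segment]


# Toy read: three made-up cDNA inserts joined by a made-up linker.
LINKER = "GGTTCCAA"  # placeholder, not a real PacBio or 10X sequence
toy_read = "ACGTACGT" + LINKER + "TTTTCCCC" + LINKER + "GATTACA"
segments = deconcatenate(toy_read, LINKER)
print(len(segments), segments)
```

Each recovered segment would then be treated as an independent cDNA molecule for barcode and UMI extraction.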
scIso-Seq Throughput Improvement Methodology
Metric                                                      Sample A     Sample B
HiFi Reads                                                 2,557,092    3,174,724
Reads with cDNA primers                                    2,151,948    2,726,226
Deconcatenated cDNAs                                       7,853,190    8,519,673
Hypothetical cDNAs w/out TSO depletion and concatenation   1,075,974    1,363,113
Effective Throughput Increase                                  ~7.2X        ~6.2X
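The throughput figures in the table can be checked with a few lines of arithmetic. The sketch below assumes that the hypothetical baseline is half of the reads containing cDNA primers, which is consistent both with the ~50% TSO-TSO artifact rate stated earlier and with the table's own numbers (2,151,948 / 2 = 1,075,974). The function name and the `USABLE_FRACTION` constant are illustrative, not taken from the poster.

```python
# Counts copied from the table above (Samples A and B).
samples = {
    "A": {"reads_with_primers": 2_151_948, "deconcatenated_cdnas": 7_853_190},
    "B": {"reads_with_primers": 2_726_226, "deconcatenated_cdnas": 8_519_673},
}

# Assumption: without TSO depletion and concatenation, each read yields at
# most one cDNA and about half of those are TSO-TSO artifacts.
USABLE_FRACTION = 0.5


def throughput_gain(reads_with_primers: int, deconcatenated_cdnas: int) -> float:
    """Ratio of recovered cDNAs to the hypothetical baseline yield."""
    baseline = reads_with_primers * USABLE_FRACTION
    return deconcatenated_cdnas / baseline


gains = {name: throughput_gain(**counts) for name, counts in samples.items()}
for name, gain in gains.items():
    print(f"Sample {name}: {gain:.2f}X")
```

Under this assumption the plain ratio comes out at about 7.30X for Sample A and 6.25X for Sample B, close to the approximate values reported in the table.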
Distribution of Concatemers per Long Read (Sample A)
Transcript Classification using SQANTI3
Cell BC concordance, PacBio vs. Illumina
[Figure: UMAP of single cells from long reads (lrUMAP_1 vs. lrUMAP_2) and from short reads (srUMAP_1 vs. srUMAP_2), colored by cell type: B Memory, B Naive, Basophils, CD14 Mono, CD16 Mono, CD4 Memory, CD4 Naive, CD8 Effector, CD8 Memory, CD8 Naive, CD8 TRB−V9, cDC, ISG15_High Treg, MAIT, Multiplets, Neutrophil, NK, pDC, Proliferating, RBC, Treg]
15,388 short reads/cell; 936 cDNAs/cell
Knee plot, all BC (PacBio)
Short-read calls: 8,386 single cells (10X Cell Ranger)
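One simple way to quantify cell barcode concordance between the two platforms is to compare the sets of barcodes each one reports. The sketch below is an assumption about how such a comparison could be done, not the poster's actual analysis. The function name, the returned metrics, and the toy barcodes are all made up for illustration.

```python
def barcode_concordance(long_read_bcs: set[str], short_read_bcs: set[str]) -> dict:
    """Summarize overlap between two sets of cell barcodes."""
    shared = long_read_bcs & short_read_bcs
    union = long_read_bcs | short_read_bcs
    return {
        "shared": len(shared),
        # Fraction of short-read cell calls also seen in the long reads.
        "fraction_of_short_read_cells": (
            len(shared) / len(short_read_bcs) if short_read_bcs else 0.0
        ),
        # Jaccard index: shared barcodes over all barcodes seen anywhere.
        "jaccard": len(shared) / len(union) if union else 0.0,
    }


# Toy barcodes (not real 10X barcodes).
long_read_bcs = {"AAAC", "AAAG", "AAAT", "CCCC"}
short_read_bcs = {"AAAC", "AAAG", "AAAT", "GGGG"}
summary = barcode_concordance(long_read_bcs, short_read_bcs)
print(summary)
```

In practice the short-read set would be the cell calls from the short-read pipeline, and the long-read set would come from barcodes extracted from the deconcatenated cDNAs.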
ISG20 Isoforms Assigned to Single Cells
[Figure: long-read UMAP panels (lrUMAP_1 vs. lrUMAP_2) with cells colored by assigned isoform (iso30915): Multiple, None, PB.30915.37, PB.30915.4, PB.30915.5, PB.30915.7]
T cell lineages
T cells express 4 common isoforms
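The legend categories in the isoform panels (a single isoform ID, "Multiple", or "None") suggest a per-cell labeling rule along the following lines. This is a sketch under the assumption that "Multiple" means more than one isoform of the gene was detected in a cell and "None" means no isoform was detected; the poster does not spell out the rule. The function name and cell barcodes are hypothetical, while the isoform IDs are taken from the legend.

```python
from collections import defaultdict


def label_cells(observations, all_cells):
    """Assign one isoform label per cell.

    observations: iterable of (cell_barcode, isoform_id) pairs, one per
    detected molecule. all_cells: every cell barcode to be labeled.
    """
    isoforms_by_cell = defaultdict(set)
    for cell, isoform in observations:
        isoforms_by_cell[cell].add(isoform)

    labels = {}
    for cell in all_cells:
        found = isoforms_by_cell.get(cell, set())
        if not found:
            labels[cell] = "None"
        elif len(found) == 1:
            labels[cell] = next(iter(found))
        else:
            labels[cell] = "Multiple"
    return labels


# Toy input: barcodes are invented, isoform IDs come from the legend.
observations = [
    ("cell1", "PB.30915.4"),
    ("cell1", "PB.30915.4"),
    ("cell2", "PB.30915.5"),
    ("cell2", "PB.30915.37"),
]
labels = label_cells(observations, ["cell1", "cell2", "cell3"])
print(labels)
```

The resulting labels could then be used to color cells on the long-read UMAP, as in the panels above.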
Sample A (5’ library) read schema
Assigning Isoforms to Single Cells
ISG20: Interferon Stimulated Exonuclease Gene 20
The Complete Diversity of ISG20 Isoforms Expressed in CD4 Naïve Cells
CD8 Naïve Cells Prefer Isoforms from the Downstream TSS
GENCODE Reference