target enrichment sequencing - pacbio

For Research Use Only. Not for use in diagnostics procedures. © Copyright 2017 by Pacific Biosciences of California, Inc. All rights reserved.

June.27.2017 / http://programs.pacificbiosciences.com/l/1652/2017-03-25/3sn5p2

PacBio Americas User Group Meeting Sample Prep Workshop Breakout Session: Target Enrichment Sequencing

AGENDA

Introduction

-Overview of Target Enrichment Applications and Methods

-Barcoding Options for Target Enrichment

-PCR-Based Target Enrichment

-Probe-Based Target Capture Enrichment (With PCR)

-Target Capture Enrichment Without Amplification

-Technical Resources for Target Enrichment Sequencing

PacBio Scientific Conference Poster Presentations

Q&A and Open Discussion

Overview of Target Enrichment Applications and Methods

EXAMPLES OF TARGET ENRICHMENT USE CASES (HUMAN BIOMEDICAL RESEARCH)

- 5 Mb contiguous region of chromosome 6
- 5 entire genes on 5 different chromosomes
- All exons in one gene on chromosome 5
- Also applies to targeting the respective full-length cDNAs

Image source: http://worms.zoology.wisc.edu/zooweb/Phelps/ZWK99004k.jpeg 4/6/15

VARIANT DISCOVERY AND DETECTION FOR ANY GENOME, ANY REGION

Cost-effective target enrichment workflows providing accurate, unbiased results

- Complete and uniform coverage for your targets, even in low-complexity regions
  - Repeat expansions
  - Promoters
  - Flanking regions of transposable elements
- Characterize the full spectrum of genetic variation
  - SNPs
  - Structural variation
  - Indels
  - Alternative transcripts
  - Low-frequency variants
  - Haplotypes

Application areas: Human Biomedical Research; Plant and Animal Sciences; Microbiology and Infectious Disease

COMMON TARGET ENRICHMENT METHODS

- PCR-based (no capture)
- Isothermal amplification
- Probe-based capture (w/PCR)
- No amplification: make SMRTbell templates from gDNA, then select SMRTbell templates of interest

Status legend: Solutions Available; Evaluating Partners; Late-Stage Development

TARGET ENRICHMENT METHODS HAVE STRENGTHS & WEAKNESSES

PCR-BASED AMPLIFICATION (NO CAPTURE)
- Strengths: less complex; fast (<1 day); low input DNA; low upfront cost; quickly iterate optimizations
- Weaknesses: difficult to amplify repeats; lose ability to detect DNA mods; limited total target size; designing multiple primers
- Key apps / targets: fewer, longer amplicons; many, shorter amplicons
- Technologies / kits: GenDx (HLA), Amplicon Panels (Multiplicom)
- Application examples: HLA, 16S, CYP2D6, variant confirmation

PROBE-BASED CAPTURE (w/PCR)
- Strengths: probe-design flexibility; large target size (Mb); large fragments (≥6 kb); ability to phase larger regions
- Weaknesses: more complex workflow; longer workflow; higher upfront costs; lose ability to detect DNA mods
- Key apps / targets: larger contiguous regions (10s of kb to multiple Mb)
- Technologies / kits: NimbleGen, IDT, SureSelect
- Application examples: cancer panel, MHC region, several genes

NO AMPLIFICATION
- Strengths: maintains DNA mods; maintains repeat length; no PCR artifacts
- Weaknesses: more complex workflow; higher input DNA
- Key apps / targets: maintains DNA mods & repetitive regions
- Technologies / kits: CRISPR-Cas9
- Application examples: nucleotide repeat expansions, methylation

Barcoding Options for Target Enrichment

BARCODES ALLOW TARGETED SEQUENCING TO BE MORE COST-EFFECTIVE

TARGETED SEQUENCING WITHOUT BARCODES
- Multiple samples
- Multiple library preps
- Multiple SMRT Cells
- Cost: $$$

TARGETED SEQUENCING WITH BARCODES
- Multiple samples
- One library prep
- One SMRT Cell
- Cost: $

OPTIONS TO INCORPORATE BARCODES IN WORKFLOW

Adding the barcodes earlier in the template prep process allows earlier sample pooling, saving time and reagents.

Starting from genomic DNA, followed by amplification or fragmentation:
- Add barcodes BEFORE or DURING amplification
- Add barcodes AFTER fragmentation or amplification

PACBIO BARCODING OPTIONS FOR TARGETED SEQUENCING

INCORPORATION THROUGH PCR AMPLIFICATION
- Barcoded universal primers
- Locus-specific primers tailed with barcodes

INCORPORATION BY LIGATION
- Linear barcoded adapters (fragment, ligate, then PCR)
- Barcoded hairpin adapters

Availability labels shown on the slide: Order from Third Party (12 barcodes); Order from Third Party (384 barcodes); Product (96 barcodes); Product (96 barcodes)

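PAIRING TARGET ENRICHMENT & BARCODING METHODS

Barcoded Universal Primers
- PCR-based amplification (no capture): RECOMMENDED; requires 2-step PCR; less expensive upfront primer development
- Probe-based capture (w/ PCR): Not applicable
- Cas9 capture (no amplification): Not applicable

Locus-Specific Primers Tailed with Barcodes
- PCR-based amplification (no capture): RECOMMENDED; good for high-volume assays; more expensive upfront to make primers
- Probe-based capture (w/ PCR): Not applicable
- Cas9 capture (no amplification): Not applicable

Linear Barcoded Adapters
- PCR-based amplification (no capture): Not applicable
- Probe-based capture (w/ PCR): RECOMMENDED
- Cas9 capture (no amplification): Not applicable

Barcoded Adapters
- PCR-based amplification (no capture): Not cost-effective
- Probe-based capture (w/ PCR): Not cost-effective
- Cas9 capture (no amplification): LIKELY TO BE RECOMMENDED (still in development)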

PCR-Based Target Enrichment


PACBIO BARCODES

- Set of 384 barcodes (16 bp length), optimized for SMRT Sequencing

Workflow:
1. Primer design
2. Pool barcoded amplicons
3. 1 SMRTbell library prep
4. Sequencing on 1 SMRT Cell
5. Analysis for each barcode

Figure: each target of interest is flanked by the same barcode on both ends (Sample #1 with Barcode #1, Sample #2 with Barcode #2), across hundreds of targets & samples

Mandelker et al. (2016) Genet Med 18: 1282-1289

SHORT READ SEQUENCING PLATFORMS HAVE A DIFFICULT TIME DIFFERENTIATING GENES FROM THEIR PSEUDOGENES

PMS2 EXAMPLE

- The PMS2 gene is associated with autosomal dominant Lynch syndrome (also called hereditary nonpolyposis colorectal cancer, or HNPCC)
- Identifying variants in PMS2 is hampered by the presence of a pseudogene, PMS2CL, which is nearly identical in sequence to PMS2 across the final four exons of the gene (exons 12–15)
  - 99.2% identical (exons) vs. 98.2% identical (gene)
- Sequence reads derived from hybridization capture and short-read sequencing methods cannot be unambiguously aligned to PMS2 or PMS2CL

Figure: PMS2 (exons 10–15) aligned with PMS2CL (exons 1–6); 98.2% identical

TARGETING APPROACH – LONG RANGE PCR

- Design primers so that only PMS2 will amplify
- This generates a ~17 kb amplicon from PMS2 that can be turned into a SMRTbell template and sequenced on a PacBio sequencing system

Figure: PMS2 (exons 10–15) and PMS2CL (exons 1–6), with the 17 kb amplicon from PMS2

PMS2 RESULTS

- After making a library and sequencing, data is run through Long Amplicon Analysis
- Detection of all variants (exonic & intronic)
- Results in fully phased haplotypes

Figure: wt and mut haplotypes (scale bar: 5 kb)

Mandelker et al. (2016) Genet Med 18: 1282-1289

SIMILAR GENE EXAMPLES

Gene Disease

PMS2 Lynch syndrome/hereditary nonpolyposis colorectal cancer

PKD1 Polycystic kidney disease

SMN1 Spinal muscular atrophy

SDH Hereditary paraganglioma-pheochromocytoma syndrome

VHL von Hippel-Lindau (VHL) disease

MECP2 Rett syndrome or other MECP2-related disorders

FLCN Birt-Hogg-Dube syndrome

CFTR Cystic fibrosis

BRCA Hereditary breast and ovarian cancer

GRHPR Primary hyperoxaluria type 2

HHT, ENG, ACVRL1 Hereditary hemorrhagic telangiectasia, types 1 and 2

NPC Niemann-Pick type C

MLYCD Malonyl-CoA decarboxylase deficiency

AGXT Primary hyperoxaluria type 1

CYP21A2 21-hydroxylase deficient congenital adrenal hyperplasia

Probe-Based Target Capture Enrichment (With PCR)


PROBE-BASED TARGET CAPTURE ENRICHMENT PROTOCOLS FOR MULTIPLEXED SAMPLES

- http://www.pacb.com/wp-content/uploads/Procedure-Checklist-Target-Sequence-Capture-Roche-NimbleGen-SeqCapEZ-Library-PacBioBarcodedAdapters.pdf
- http://www.pacb.com/wp-content/uploads/Unsupported-Protocol-Target-Sequence-Capture-Using-IDT-Library-PacBio-Barcoded-Adapters.pdf

PACBIO BARCODED LINEAR ADAPTERS ENABLE MULTIPLEXING WITH NIMBLEGEN SEQCAP® EZ AND IDT TARGET CAPTURE LIBRARIES

Figure: gDNA fragments flanked by universal sequence (Univ. Seq) and barcode after adapter ligation and PCR (pre-SMRTbell prep), shown for Illumina barcodes and for PacBio barcodes

Illumina Barcodes
• 6 bp Illumina barcode
• Asymmetric design: barcode on only one end (not accurately identified)
• Not amenable to PacBio data workflow

PacBio Barcodes
• 16 bp PacBio barcodes
• Symmetric design: barcode on both ends enables accurate identification
• Compatible with PacBio data workflow
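The benefit of the symmetric design can be illustrated with a minimal Python sketch: a read is assigned to a sample only when both ends agree on the same barcode. This is not PacBio's demultiplexing software; the two barcode sequences, the mismatch tolerance, and all function names are hypothetical and chosen for illustration only.

```python
from typing import Dict, Optional

# Hypothetical 16-bp barcodes for illustration only (not actual PacBio sequences).
BARCODES: Dict[str, str] = {
    "bc_A": "AACCGTGATCTGACGA",
    "bc_B": "TGTCAGGCTTACAGCC",
}
BARCODE_LEN = 16
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def hamming(a: str, b: str) -> int:
    """Count mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def best_barcode(window: str, max_mismatches: int) -> Optional[str]:
    """Name of the closest barcode within the mismatch limit, else None."""
    best_name, best_dist = None, max_mismatches + 1
    for name, barcode in BARCODES.items():
        dist = hamming(window, barcode)
        if dist < best_dist:
            best_name, best_dist = name, dist
    return best_name


def assign_symmetric(read: str, max_mismatches: int = 2) -> Optional[str]:
    """Assign a read to a barcode only if both ends carry the same barcode.

    Assumes the read looks like: barcode + insert + reverse_complement(barcode).
    """
    if len(read) < 2 * BARCODE_LEN:
        return None
    left = best_barcode(read[:BARCODE_LEN], max_mismatches)
    right = best_barcode(reverse_complement(read[-BARCODE_LEN:]), max_mismatches)
    return left if left is not None and left == right else None
```

A read whose two ends disagree is left unassigned, which is the extra check that a barcode on only one end cannot provide.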

PACBIO MULTIPLEXED TARGETED PROBE-BASED CAPTURE WORKFLOW

EXPERIMENTAL PIPELINE (starting from genomic DNA)
1. Shear to 7 kb
2. Ligate barcoded adapters
3. Size selection (5-9 kb)
4. Amplification
5. Probe hybridization, bead capture, wash
6. Amplification and SMRTbell prep + size selection (5-9 kb)
7. Sequencing
8. Analysis

INFORMATICS PIPELINE
9. Map Reads of Insert to hg19
10. Phasing with SAMtools
11. Bin reads by haplotype
12. Quiver haplotypes
13. Tertiary analysis

https://github.com/PacificBiosciences/targeted-phasing-consensus
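The "bin reads by haplotype" step can be sketched in a few lines of Python. This is not the code from the repository above (which relies on SAMtools and Quiver); it is a simplified illustration that assumes the heterozygous SNPs have already been phased and that each read has been reduced to the bases it shows at those positions. The SNP positions, alleles, and function names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Hypothetical phased heterozygous SNPs:
# reference position -> (allele on haplotype 1, allele on haplotype 2).
PHASED_SNPS: Dict[int, Tuple[str, str]] = {
    1200: ("A", "G"),
    3400: ("C", "T"),
    5100: ("G", "A"),
}


def bin_read(read_alleles: Dict[int, str]) -> str:
    """Vote across the SNPs a read covers and return its haplotype bin."""
    votes = [0, 0]
    for pos, (hap1_base, hap2_base) in PHASED_SNPS.items():
        base = read_alleles.get(pos)  # None if the read does not cover pos
        if base == hap1_base:
            votes[0] += 1
        elif base == hap2_base:
            votes[1] += 1
    if votes[0] > votes[1]:
        return "hap1"
    if votes[1] > votes[0]:
        return "hap2"
    return "unassigned"  # no informative SNPs, or a tie


def bin_reads(reads: Dict[str, Dict[int, str]]) -> Dict[str, List[str]]:
    """Group read names by haplotype bin."""
    bins: Dict[str, List[str]] = defaultdict(list)
    for name, alleles in reads.items():
        bins[bin_read(alleles)].append(name)
    return dict(bins)
```

Multi-kilobase reads matter here because a read must span at least one heterozygous SNP to be binned; each bin would then be polished separately into a consensus sequence.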

OVERVIEW: PACBIO PROBE-BASED CAPTURE OPTIONS

DNA probes DNA probes RNA probes

Complete Probe Tile Flexible Probe Design Under Evaluation

Flat rate up to 7 Mb Cost per probe

Predesigned & Custom

Panels

Predesigned & Custom

Panels

Faster Turnaround

TRUE FULL-GENE ANALYSIS
- Example: MUTYH
Figure: PacBio (~5 kb fragments) vs. MiSeq (200 bp fragments)

PHASING VARIANTS OVER LARGE DISTANCES
- BRCA1, exon 10
Figure: Allele 1 and Allele 2 reads, PacBio (~5 kb fragments) vs. MiSeq (200 bp fragments)

RESOLUTION OF STRUCTURAL VARIATION
- PDE4DIP
Figure: Allele 1 and Allele 2 reads, PacBio (~5 kb fragments) vs. MiSeq (200 bp fragments)
- PacBio resolves a heterozygous ~740 bp deletion containing an entire exon, missed by Illumina

Sequence data for these three examples were generated on PacBio RS II and MiSeq from cell line NA12762 captured with the standard NimbleGen Oncology Panel.

APPLYING LONG-READ SMRT SEQUENCING FOR VARIATION SCREENING IN ALZHEIMER’S DISEASE

- Combining gDNA and cDNA sequencing data gives better insight into how DNA variants affect gene expression
- Captured and sequenced 35 AD candidate genes in two AD patients
  • Average gDNA fragment size: ~6 kb
  • Full-length transcripts ranging from <1 kb to ~10 kb

SNPs AND LARGER SVs DISCOVERED IN THE AD SAMPLES

Results:
- Detected a broad range of genomic variants
  - In addition to SNPs, 31 unique SVs were observed, ranging from 65 bp to several kb in size
- 515 and 507 total isoforms found for patients 1 and 2, respectively
  - Only 39 were shared among the samples and the Gencode v25 database

Figure values: 67 2, 339, 319, 154, 312

RIN3 GENE: ~50 bp INSERTION DETECTED

APP GENE: ~550 bp INVERSION DETECTED

MAPT GENE: 26 TRANSCRIPT ISOFORMS DETECTED
- Allele 1 (5 isoforms); Allele 2 (21 isoforms)
- Novel exon (red arrows) found in 3 of the 5 isoforms in Allele 1; not observed in any isoforms from Allele 2

MAPT GENE RESULTS FOR SUBJECT 1
- Detected a heterozygous deletion
- One allele is transcribed into 21 isoforms while the other is transcribed into only 5
- Detected a novel exon and transcripts
- Heterozygous SNPs can be used to phase gDNA and transcripts into multi-kilobase-long haplotypes

Target Capture Enrichment Without Amplification

REPEAT EXPANSION DISEASES


CRISPR/CAS9 SYSTEM

Some in vivo applications:

- Gene silencing

- Homology-directed repair

- Transient gene silencing or transcriptional repression

- Transient activation of endogenous genes

- Transgenic animals and embryonic stem cells

• Bacterial Adaptive Immunity

• RNA-guided DNA Endonuclease

PCR-FREE TARGET ENRICHMENT VIA CAS9 DIGESTION: METHOD OVERVIEW (CURRENTLY IN DEVELOPMENT)

DETAILED CAS9 METHOD
1. Complexity reduction: digest gDNA with restriction enzymes to remove 80% of the genome
2. Cut with EcoRI and BamHI; make standard SMRTbell library; exonuclease digestion
3. Cut open specific SMRTbells with Cas9; ligate polyA hairpin
4. Capture polyA-containing SMRTbells on magnetic bead; elute off bead
5. Anneal primer; complex with polymerase; MagBead load onto SMRT Cell; sequence
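One practical check implied by step 2 is that the target region must survive the EcoRI/BamHI digest on a single fragment. The Python sketch below performs that check in silico. It is an illustration only, not part of the method as presented: the recognition sites (EcoRI G^AATTC, BamHI G^GATCC) are standard, while the function names and the toy sequence used to exercise them are hypothetical.

```python
import re
from typing import List, Optional, Tuple

# Recognition site and cut offset within the site (top strand).
# Both sites are palindromic, so scanning one strand finds every site.
SITES = {"EcoRI": ("GAATTC", 1), "BamHI": ("GGATCC", 1)}


def cut_positions(seq: str) -> List[int]:
    """Sorted 0-based positions at which the sequence is cut."""
    cuts = set()
    for site, offset in SITES.values():
        # Lookahead so that overlapping occurrences are also found.
        for match in re.finditer(f"(?={site})", seq):
            cuts.add(match.start() + offset)
    return sorted(cuts)


def digest(seq: str) -> List[Tuple[int, int]]:
    """Fragments as half-open (start, end) intervals after digestion."""
    bounds = [0] + cut_positions(seq) + [len(seq)]
    return [(s, e) for s, e in zip(bounds, bounds[1:]) if e > s]


def fragment_containing(
    seq: str, target_start: int, target_end: int
) -> Optional[Tuple[int, int]]:
    """Fragment that fully contains [target_start, target_end), else None."""
    for start, end in digest(seq):
        if start <= target_start and target_end <= end:
            return (start, end)
    return None
```

If `fragment_containing` returns `None`, one of the enzymes cuts inside the target, and a different enzyme combination would be needed for that locus.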

USING CRISPR/CAS9 TO ENRICH FOR REPEAT EXPANSION DISORDERS

- Improved on-target rate with complexity reduction:
  - Restriction enzymes are used to degrade unwanted SMRTbell templates
  - Additional starting DNA is required to maintain input into Cas9 digestion step
- Multiplexing:
  - Multiple regions can be targeted in the same reaction
  - Patient samples could be barcoded during initial SMRTbell library preparation

Figure: COVERAGE ACROSS THE GENOME (number of individual molecules sequenced; * = restriction enzyme; target marked)

Figure: CAS9 “ON TARGET” READS

HUNTINGTON’S DISEASE (HD)

- Autosomal dominant neurodegenerative genetic disorder
- Caused by an expansion of a CAG triplet repeat stretch in the Huntingtin (HTT) gene
  - Polyglutamine tract

CAG REPEAT COUNTS IN HD PATIENTS

- Widening repeat number distribution at the mutated allele is biological
- Obtained roughly equal numbers of sequenced molecules for normal and mutated alleles

Samples obtained from Vanessa Wheeler (Harvard Medical School)
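Counting repeats per sequenced molecule can be sketched as follows. This is a deliberately simple illustration, not the analysis used for the data above: it counts only the longest uninterrupted run of the repeat unit, so a sequencing error or an interrupting triplet inside the tract would split the run. Function names are hypothetical.

```python
import re
from collections import Counter
from typing import Iterable


def longest_repeat_run(seq: str, unit: str = "CAG") -> int:
    """Number of units in the longest uninterrupted tandem run of `unit`."""
    runs = re.findall(f"(?:{re.escape(unit)})+", seq)
    return max((len(run) // len(unit) for run in runs), default=0)


def repeat_histogram(reads: Iterable[str], unit: str = "CAG") -> Counter:
    """Histogram of repeat counts across reads (one count per molecule)."""
    return Counter(longest_repeat_run(read, unit) for read in reads)
```

Plotting such a histogram per sample would show one peak per allele, with the broader peak expected at the expanded allele.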

FRAGILE X SYNDROME

- Most common heritable form of cognitive impairment
- Caused by expansion of a CGG trinucleotide repeat in the 5’ UTR of the FMR1 gene

fraxa.org

>700 CGG REPEATS SEQUENCED FROM THE FMR1 GENE

• Difference in risk is greatest near 75-80 CGG repeats
• Having full sequence information is medically relevant

AGG “INTERRUPTIONS” REDUCE THE CHANCES OF PRE- TO FULL-MUTATION TRANSMISSION

Number of AGG interruptions and example repeat sequence:
2  …CGG CGG CGG CGG AGG CGG CGG CGG CGG CGG CGG CGG CGG CGG AGG CGG …
1  …CGG CGG CGG CGG AGG CGG CGG CGG CGG CGG CGG CGG CGG CGG CGG CGG …
0  …CGG CGG CGG CGG CGG CGG CGG CGG CGG CGG CGG CGG CGG CGG CGG CGG …

Yrigollen et al. (2012) Genet Med 14:729–736

Figure values: 80%, 60%, 15%
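Given a full-length read through the repeat, locating the AGG interruptions is a matter of walking the tract triplet by triplet. The Python sketch below does this for tracts written as on the slide; it assumes the tract has already been extracted in frame from the read, and the function names are hypothetical.

```python
from typing import List, Tuple


def parse_cgg_tract(tract: str) -> Tuple[int, List[int]]:
    """Return (total triplets, 1-based positions of AGG interruptions).

    Spaces are ignored so that tracts can be written as 'CGG CGG AGG ...'.
    """
    seq = tract.replace(" ", "").upper()
    if len(seq) % 3 != 0:
        raise ValueError("tract length is not a multiple of 3")
    triplets = [seq[i:i + 3] for i in range(0, len(seq), 3)]
    unexpected = set(triplets) - {"CGG", "AGG"}
    if unexpected:
        raise ValueError(f"unexpected triplets: {sorted(unexpected)}")
    agg_positions = [i + 1 for i, t in enumerate(triplets) if t == "AGG"]
    return len(triplets), agg_positions


def longest_pure_cgg_run(tract: str) -> int:
    """Length, in triplets, of the longest CGG run free of AGG."""
    total, agg_positions = parse_cgg_tract(tract)
    bounds = [0] + agg_positions + [total + 1]
    return max(b - a - 1 for a, b in zip(bounds, bounds[1:]))
```

For the two-interruption example shown above, this reports 16 triplets with AGG at positions 5 and 15, and a longest pure CGG run of 9.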

DIRECT DETECTION OF METHYLATION

METHYLATION DETECTION OF FMR1 SAMPLE

• CGG repeat region appears to be heavily methylated (5mC)

CONCLUSION

- Target any hard-to-amplify genomic region regardless of sequence context
- Avoid PCR bias and PCR errors
- Accurately sequence through long repetitive and low-complexity regions
  - Count repeats and identify sequence interruptions
- Detect and characterize epigenetic modification signals
  - Detect sample mosaicism

Amplification-free enrichment with CRISPR/Cas9 and SMRT Sequencing achieves the base-level resolution required to understand the underlying biology of repeat expansion disorders

Technical Resources for Target Enrichment Sequencing

User Bulletins

User Bulletin for PacBio RS II and Sequel Systems: Centrifuge Tube and Pipet Tip Recommendations (NEW!) (May 2017)
- PacBio advises against the use of Axygen MAXYMum Recovery™ tubes and pipet tips. Please discontinue use of these products immediately. PacBio recommends alternatives in the User Bulletin.
- http://www.pacb.com/wp-content/uploads/User-Bulletin-Centrifuge-Tube-and-Pipet-Tip-Recommendations.pdf

Field Advisory for Sequel System: Securing Sequel Pipet Tip Rack (NEW!) (May 2017)
- PacBio recommends a simple procedure to ensure that the Sequel Pipet Tip rack is firmly affixed to the tip box.
- http://www.pacb.com/wp-content/uploads/Field-Advisory-Notice-Securing-Sequel-Pipet-Tip-Rack.pdf

User Bulletin for Sequel System: Heat Seal Advisory (Adhesive Seal Warning) (NEW!) (May 2017)
- PacBio advises against the use of adhesive foils and recommends the use of Sequel Sample Plate Foil.
- http://www.pacb.com/wp-content/uploads/User-Bulletin-Heat-Seal-Advisory-Adhesive-Seal-Warning.pdf

User Bulletin for Sequel System: Barcode Scanning of Sequel Sequencing Kit 2.0 (NEW!) (May 2017)
- PacBio is providing clarity on which barcode to scan to ensure the Sequel System has the correct information and that all the consumables are compatible.
- http://www.pacb.com/wp-content/uploads/User-Bulletin-Barcode-Scanning-of-Sequel-Sequencing-Kit-2.0.pdf

Find all protocols at http://www.pacb.com/support/documentation/

TECHNICAL RESOURCES FOR TARGET CAPTURE ENRICHMENT

Genomic DNA Target Sequence Capture Protocols

Shared Protocol – Target Sequence Capture Using Roche NimbleGen SeqCap EZ Library (2015)
- http://www.pacb.com/wp-content/uploads/2015/09/Shared-Protocol-Target-Sequence-Capture-Using-NimbleGen-SeqCap-EZ-Library.pdf

Procedure & Checklist – Target Sequence Capture Using SeqCap EZ Libraries with PacBio Barcoded Adapters (NEW!) (May 2016)
- http://www.pacb.com/wp-content/uploads/Procedure-Checklist-Target-Sequence-Capture-Roche-NimbleGen-SeqCapEZ-Library-PacBioBarcodedAdapters.pdf

Unsupported Protocol – Target Sequence Capture Using IDT Library with PacBio Barcoded Adapters (NEW!) (January 2017)
- http://www.pacb.com/wp-content/uploads/Unsupported-Protocol-Target-Sequence-Capture-Using-IDT-Library-PacBio-Barcoded-Adapters.pdf

Full-length cDNA Target Sequence Capture Protocols

Shared Protocol – Full-length cDNA Target Sequence Capture Using SeqCap® EZ Libraries
- http://www.pacb.com/wp-content/uploads/2015/09/Shared-Protocol-Full-length-cDNA-Target-Sequence-Capture-Using-Roche-NimbleGen-SeqCap-EZ-Library.pdf

Unsupported Protocol – Full-length cDNA Target Sequence Capture Using IDT xGen Lockdown Probes
- http://www.pacb.com/wp-content/uploads/Unsupported-Protocol-Full-length-cDNA-Target-Sequence-Capture-IDT-xGen-Lockdown-Probes.pdf

Application Notes

Multiplex Target Enrichment Using Barcoded Multi-Kilobase Fragments and Probe-Based Capture Technologies (NEW!) (2016)
- http://www.pacb.com/wp-content/uploads/multiplex-target-enrichment-barcoded-multi-kilobase-fragments-probe-based-capture-technologies.pdf

Targeted Sequencing on the PacBio RS II Using the Roche NimbleGen SeqCap EZ System
- http://www.pacb.com/wp-content/uploads/2015/09/Application-Note-Targeted-Sequencing-on-the-PacBio-RS-II-Using-the-Roche-NimbleGen-SeqCap-EZ-System.pdf

Data Analysis

Data Analysis Workflow for Haplotype Phasing of Heterozygous SNPs (support for phasing and generating consensus sequence with SAMtools)
- https://github.com/PacificBiosciences/targeted-phasing-consensus

WHERE TO FIND SMRT RESOURCES

http://www.pacb.com/smrt-science/smrt-resources/

Explore our collection of resources and learn how scientists use SMRT Sequencing to advance their research.

Scientific publications

Explore our database of scientific publications featuring PacBio data.

Conference proceedings

Access conference posters and presentations our customers, collaborators, and internal scientists have presented at various scientific meetings.

PacBio literature

View case studies, brochures, application notes, and more.

Video gallery

Watch our collection of videos, webinars, customer testimonials, and more.

Blog

Read our blog featuring new research, publications, conference summaries, and SMRT Sequencing updates.

Product documentation and training

Visit user documentation for our entire documentation library and training for user training materials.

PacBio Scientific Conference Poster Presentations

AGBT 2017: http://www.pacb.com/wp-content/uploads/Kujawa-AGBT-2017-Alzheimers-Disease-Candidate-Genes-and-Transcripts-Using-Hybridization-Capture.pdf

AGBT 2017: http://www.pacb.com/wp-content/uploads/Ekholm-AGBT-2017-Screening-and-characterization-of-causative-structural-variants-for-bipolar-disorder.pdf

ESHG 2017: Poster PDF available upon request

AGBT 2017: http://www.pacb.com/wp-content/uploads/Clark-AGBT-2017-Targeted-SMRT-Sequencing-of-Difficult-Regions-of-the-Genome-Using-a-Cas9-Non-Amplification-Based-Method.pdf

Q&A and Open Discussion

Frequently Asked Questions – Target Enrichment

Target Capture Enrichment with Roche NimbleGen’s SeqCap EZ Technology

Q: How does the target enrichment protocol work?
A: Customers order an off-the-shelf exome, a human or non-human design, or a custom design, together with reagents, from Roche NimbleGen, perform the capture following the Target SeqCap EZ shared protocol, and then perform standard SMRTbell® template preparation.

Q: What target region can be used?
A: Human and non-human genomic regions and exomes are supported by the SeqCap EZ technology.

Q: What are the different types of enrichments that are available?
A: There are 4 types of SeqCap EZ libraries:
- SeqCap EZ Exome – variants of human exomes
- SeqCap EZ Choice – customer-specified subsets of human probes
- SeqCap EZ Design – off-the-shelf designs for human (e.g., MHC) and non-human (e.g., soybean exome) targets
- SeqCap EZ Developer – completely custom design

Q: What organisms are compatible with this target enrichment approach?
A: All organisms supported by the SeqCap EZ technology.

Q: Does Roche NimbleGen support this protocol?
A: Yes. Any design and target procedure questions should be directed to Roche NimbleGen. Any SMRTbell library preparation questions should be directed to PacBio.

Q: What results have been achieved with this target enrichment approach?
A: To date, we have evaluated the ‘off-the-shelf’ Major Histocompatibility Complex (MHC) and Comprehensive Cancer panel kits. Example results can be viewed in the Application Note: Multiplex target enrichment using barcoded multi-kilobase fragments and probe-based capture technologies (http://www.pacb.com/wp-content/uploads/multiplex-target-enrichment-barcoded-multi-kilobase-fragments-probe-based-capture-technologies.pdf) and PacBio’s AGBT 2015 Poster: Targeted SMRT Sequencing and phasing using Roche NimbleGen’s SeqCap EZ enrichment (http://www.pacb.com/wp-content/uploads/Poster_TargetedSMRTSequencingPhasing_RocheNimbleGenSeqCapEZ.pdf)

Q: How long are the enriched fragments?
A: Genomic DNA is fragmented to 10 kb. After size selection, capture, amplification, and SMRTbell® library preparation, the average insert size is approximately 6 kb.

Q: What is the typical length of fragments targeted with the SeqCap EZ technology?
A: In the SeqCap EZ protocols, the DNA is sheared to the appropriate length for the sequencing technology.

Q: What is the advantage of the longer 6 kb fragments?
A: The PacBio® System often delivers even coverage over multi-kilobase regions of the genome. With PacBio long reads, heterozygous SNPs can be used to phase the reads and generate accurate haplotypes. It also provides good coverage for intronic and exonic regions across the target of interest.

Q: For the MHC work described in PacBio’s AGBT 2015 Poster: Targeted SMRT Sequencing and phasing using Roche NimbleGen’s SeqCap EZ enrichment (http://www.pacb.com/wp-content/uploads/Poster_TargetedSMRTSequencingPhasing_RocheNimbleGenSeqCapEZ.pdf)
- How did PacBio confirm the SNPs and haplotypes?
- What was the level of enrichment achieved compared to gDNA?
- What was the percent of reads that were on target?
- How did you phase/haplotype the reads?
A: For the MHC work described in the AGBT 2015 Poster:
- Data from the MHC target enrichment experiment was compared to data generated from sequencing with both the PacBio System, using amplicons generated with the GenDx NGSgo®-AmpX amplification primers, and traditional Sanger methods.
- On average, the enrichment was >1500-fold for the comprehensive panel and >600-fold for the MHC panel.
- On average, the percentage of reads on target was >60%.
- For each targeted region, SAMtools was used to phase and bin reads by haplotype, and then Quiver was applied to polish each haplotype to high consensus accuracy. This entire workflow is summarized on GitHub (https://github.com/PacificBiosciences/targeted-phasing-consensus).
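The two metrics quoted above, percent on target and fold enrichment, are related through the size of the target relative to the genome. The Python sketch below uses one common read-count-based definition of fold enrichment; the exact calculation behind the poster's numbers is not stated in this document, and the panel size and genome size in the usage note are hypothetical round figures.

```python
def on_target_fraction(on_target_reads: int, total_reads: int) -> float:
    """Fraction of mapped reads that overlap the target regions."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return on_target_reads / total_reads


def fold_enrichment(
    on_target_reads: int, total_reads: int, target_bp: int, genome_bp: int
) -> float:
    """Observed on-target fraction divided by the fraction expected by chance.

    Without enrichment, the expected on-target fraction is roughly
    target_bp / genome_bp (assumption: reads sample the genome uniformly).
    """
    if target_bp <= 0 or genome_bp <= 0:
        raise ValueError("target_bp and genome_bp must be positive")
    expected = target_bp / genome_bp
    return on_target_fraction(on_target_reads, total_reads) / expected
```

For example, `fold_enrichment(600, 1000, 4_000_000, 3_100_000_000)` describes 60% of reads on a hypothetical 4 Mb panel in a 3.1 Gb genome. Under this definition, the same on-target percentage corresponds to a higher fold enrichment for a smaller panel.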

Q: Where can I find more information about Target Capture Enrichment?
A: Please visit the Targeted Sequencing section of PacBio’s website (http://www.pacb.com/applications/targeted-sequencing/)

Frequently Asked Questions - General

How long can I store my polymerase-bound sample?
- PacBio RS II: PacBio recommends that polymerase-bound samples be stored at 4 °C and used within 3 days.
- Sequel System: PacBio recommends that polymerase-bound samples be stored at 4 °C and used within 7 days.

How do I dissociate my polymerase-bound sample from MagBeads?
- Dissociating polymerase-bound sample from MagBeads may damage the sample and is not recommended. PacBio recommends binding sample to MagBeads immediately before sequencing and proceeding with sequencing as soon as possible. If a delay between MagBead binding and sequencing is unavoidable, Customers can store the sample in the dark at 4 °C, but delaying sequencing will be at the Customer’s own risk. If a MagBead sample has already been aliquoted into a sample plate, the sample plate should be sealed for storage at 4 °C. For Sequel samples, the sample plate should be heat-sealed with the Sequel Sample Plate Foil (P/N 100-667-400). For PacBio RS II samples, the sample plate should be temporarily sealed with an adhesive microplate sealing film, and then the sealing film should be replaced with the PacBio RS II Sample Plate Septum (P/N 000-882-901) before sequencing.

How long can I store my MagBead-bound sample?
- PacBio recommends that MagBead samples be stored at 4 °C in the dark and sequenced as soon as possible.

My MagBeads were accidentally left at room temperature for several hours. Can they still be used?
- In most cases, MagBeads should still be usable after first chilling them at 4 °C.

My MagBeads / AMPure beads were accidentally stored at -20 °C. Is it still okay to use the beads?
- PacBio does not recommend using AMPure PB beads or MagBeads that have been accidentally stored at -20 °C because the beads may become damaged and may leach after being frozen. However, Customers may use them at their own risk after bringing the MagBeads to 4 °C and AMPure PB beads to room temperature.

When preparing >30 kb SMRTbell libraries, can (AMPure-purified and concentrated) sheared gDNA be stored at 4 °C for longer than 24 hours?
- PacBio generally recommends that AMPure-purified and concentrated sheared gDNA be stored for up to 24 hours at 4 °C or at -20 °C for longer durations. However, if the gDNA is relatively pure (i.e., free of endonucleases), it should be acceptable to store the sheared gDNA sample for 2-3 days at 4 °C.

Conditions for shearing gDNA to a size that can support producing ≥30 kb libraries must be determined and verified empirically for each sample. When preparing ≥30 kb SMRTbell libraries using Megaruptor, what is the recommended target shear size if the desired size selection lower cutoff is, for example, 15-20 kb, 30 kb, or 40 kb?
- When preparing ≥30 kb SMRTbell libraries using Megaruptor, the recommended target shear size depends on the size selection lower cutoff to be employed. The table below may be considered a useful starting point, but empirical optimization and accurate size quantitation are essential:

Library Insert Size (kb)    Size Selection Lower Cut (kb)    Target gDNA Shear Size (kb)
30                          15 - 20                          30
30 - 40                     15 - 20                          50
40 - 50                     30                               60
50 - 60                     40                               75

Where can I find the Plate Map and sequences of all the primers in the Barcoded Universal F/R Primers Plate - 96 (P/N 100-466-100) product and Barcoded Adapter Plate - 96 (P/N 100-466-000) product?
- To obtain the sequences of the primers used in the Barcoded Universal F/R Primers Plate - 96 Kit, please contact your local Field Applications Scientist, submit your inquiry through the PacBio Customer Portal (http://www.pacbioportal.com/), or contact PacBio by email.
- The Barcode Plate Map Diagram can be downloaded from PacBio’s Documentation webpage (http://www.pacb.com/support/documentation/) here: http://www.pacb.com/wp-content/uploads/2015/09/User-Bulletin-Barcode-Plate-Mapping.pdf

There is a ‘Barcoding - RSII and SMRT Analysis 2.3.0 or older’ webpage on GitHub (https://github.com/PacificBiosciences/Bioinformatics-Training/wiki/Barcoding). Where can I find the latest guidance on PacBio Barcoding recommendations for multiplexed sample preparation for Sequel System / SMRT Link v4.0 (or later)?
- The most up-to-date information on PacBio multiplexing applicable to SMRT Link v4.0 (or later) can be found here: https://github.com/PacificBiosciences/SMRT-Link/wiki/SMRT-Analysis-Barcoding-Primer

Can I use Illumina 8-bp barcode index sequences for preparing multiplexed samples for PacBio sequencing?
- No; PacBio does not recommend using Illumina 8-bp barcode index sequences for preparing multiplexed samples for PacBio SMRT sequencing applications.

How are the 16-bp PacBio barcodes incorporated into the SMRTbell DNA template?
- PacBio uses two approaches:
  - Adding a barcode to the end of the standard SMRTbell adapter. The combined adapter is called a Barcoded Adapter.
  - Adding a barcode to the PCR amplicon. This approach involves a two-step PCR workflow. The internal primers for the first PCR are the target-specific primers with universal sequences added at the 5’ end. The external primers for the second PCR contain the universal sequences with the 16-bp barcode at the 5’ end. This approach is called Barcoded Universal Primers.
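The two-step primer layout described above can be made concrete with a short Python sketch that assembles the primer sequences (written 5' to 3') by concatenation. Every sequence here is a placeholder: the universal sequences, the example barcode, and the target-specific primers are hypothetical, and the use of the same barcode on both external primers is an assumption that follows the symmetric design.

```python
from dataclasses import dataclass

# Placeholder universal sequences (hypothetical, not the actual PacBio ones).
UNIVERSAL_F = "GGTTCCAAGGTTCCAAGGTT"
UNIVERSAL_R = "CCAATTGGCCAATTGGCCAA"


@dataclass(frozen=True)
class PrimerPair:
    forward: str  # written 5' -> 3'
    reverse: str  # written 5' -> 3'


def internal_primers(target_f: str, target_r: str) -> PrimerPair:
    """First-round primers: universal sequence 5' of the target-specific part."""
    return PrimerPair(UNIVERSAL_F + target_f, UNIVERSAL_R + target_r)


def external_primers(barcode: str) -> PrimerPair:
    """Second-round primers: 16-bp barcode 5' of the universal sequence.

    Assumption: the same barcode is used on both primers (symmetric design).
    """
    if len(barcode) != 16 or set(barcode) - set("ACGT"):
        raise ValueError("barcode must be 16 bases of A, C, G, or T")
    return PrimerPair(barcode + UNIVERSAL_F, barcode + UNIVERSAL_R)
```

Because only the external primers carry the barcode, one set of barcoded universal primers can be reused with any target whose internal primers carry the universal tails, which is why this option needs less upfront primer development than barcode-tailed locus-specific primers.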

What are the supported applications for using PacBio Barcoded Adapters and PacBio Barcoded Universal Primers with multiplexed samples? Which applications are not supported?
- Supported applications are those that sequence one species per sample or locus. Examples of supported applications include confirmation of SNPs, resequencing, most Long Amplicon Analysis (LAA) applications, and Sanger sequencing replacement. An exception is HLA typing, which may have 2 species per locus. Multiplexing of HLA has also been demonstrated with the use of additional custom analyses (see PacBio’s AGBT 2015 Poster: http://files.pacb.com/pdf/Poster_MultiplexingHumanHLAGenotyping_DNABarcodeAdapters_HighThroughputResearch.pdf)
- Note: The product specifications for the PacBio Barcoded Adapter Kit and PacBio Barcoded Universal Primer Kit are such that the level of barcode oligo contamination in the 96-plate wells should not exceed 5%. It is therefore possible, though unlikely, to have 1 other contaminant barcode primer/adapter sequence present at levels up to 5%. PacBio does not recommend using the PacBio Barcoded Adapter Kit and PacBio Barcoded Universal Primer Kit for detection of minor variants at frequencies below 10%.

Does PacBio have any specific DNA polymerase enzyme or kit recommendations for long-range PCR (LR PCR) for generating long DNA amplicon samples for sequencing?
- While PacBio does not recommend a specific enzyme, a high-fidelity enzyme is generally preferred. For example, PrimeSTAR GXL from Takara and ThermoFisher Phusion Hot Start II DNA Polymerase have given good results to our internal scientists.

Other Discussion Points

- Do PacBio’s target enrichment sample prep protocols/tools serve you well for your project needs?
- What other things would you like PacBio to add to our current solutions for targeted sequencing?
- What are your opinions on the current state of SMRT Sequencing for targeted applications?

For Research Use Only. Not for use in diagnostics procedures. © Copyright 2017 by Pacific Biosciences of California, Inc. All rights reserved. Pacific Biosciences, the Pacific Biosciences logo, PacBio, SMRT, SMRTbell, Iso-Seq, and Sequel are trademarks of Pacific Biosciences. BluePippin and SageELF are trademarks of Sage Science. NGS-go and NGSengine are trademarks of GenDx. All other trademarks are the sole property of their respective owners.

www.pacb.com