Transcript
Page 1: Jan2016 bio nano han cao

AJ Trio GIAB January 2016

BioNano Genomics

Irys: Genome Maps for Sequence Assembly and Structural Variation

Page 2: Jan2016 bio nano han cao

© 2015 BioNano Genomics

Irys® Overview

Start with non-amplified, LONG, native genomic DNA

Label sequence-specific sites (e.g. nickase motifs)

Linearize & Image

Convert images to digitized molecules:

• Convert label locations to distances between labels

• Create molecular barcodes (100 kb to >1 Mb)
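The conversion from label locations to inter-label distances can be sketched as below. This is a minimal illustration, not the Irys software's implementation; the function name `label_intervals` and the example positions are hypothetical.

```python
def label_intervals(label_positions_bp):
    """Return the distances between consecutive labels on one molecule.

    Positions are given in base pairs from the start of the molecule.
    The resulting list of distances is the molecule's "barcode".
    """
    ordered = sorted(label_positions_bp)
    return [b - a for a, b in zip(ordered, ordered[1:])]


# Hypothetical label positions on a single ~150 kb molecule.
molecule_labels = [12_500, 31_200, 48_900, 97_400, 151_000]
print(label_intervals(molecule_labels))
```

Working with distances rather than absolute positions makes the barcode independent of where the molecule happens to start.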

Assemble the molecular barcodes into consensus maps/contigs:

• Map lengths can be as long as 30 Mb

For SV discovery/detection, compare to a reference or gold standard, looking for changes in the patterns:

• Shifts in barcode patterns reveal insertion (addition), deletion (subtraction), inversion (re-orientation), and translocation of genome segments
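How a shift in the barcode pattern points to an insertion or deletion can be sketched as follows. This is a simplified illustration that assumes the sample map and the reference have already been aligned so that their label intervals pair up one-to-one; a real aligner also has to handle missing or extra labels, and inversions and translocations are not covered. The function names and the example intervals are hypothetical; the 1 kb threshold mirrors the ">1 kb" SV call cutoff used elsewhere in these slides.

```python
def classify_interval(ref_len_bp, map_len_bp, min_sv_bp=1_000):
    """Classify one pair of matched label intervals by their size difference."""
    delta = map_len_bp - ref_len_bp
    if delta >= min_sv_bp:
        return ("insertion", delta)
    if delta <= -min_sv_bp:
        return ("deletion", -delta)
    return ("match", 0)


def call_svs(ref_intervals, map_intervals, min_sv_bp=1_000):
    """Compare pre-aligned intervals pairwise; return (index, type, size_bp) tuples."""
    calls = []
    for i, (ref_len, map_len) in enumerate(zip(ref_intervals, map_intervals)):
        kind, size = classify_interval(ref_len, map_len, min_sv_bp)
        if kind != "match":
            calls.append((i, kind, size))
    return calls


# Hypothetical reference and sample barcodes (inter-label distances in bp).
reference = [18_700, 17_700, 48_500, 53_600]
sample = [18_650, 23_900, 48_500, 41_100]
print(call_svs(reference, sample))
```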

For Genome Finishing, the maps serve as a scaffold:

• Sequencing contigs are converted in silico into molecular barcodes by highlighting the same sequence motifs

• These sequencing-based barcodes are then aligned to the BioNano maps
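The in silico conversion of a sequence contig into a barcode amounts to finding every occurrence of the labeling motif on either strand. The sketch below assumes the BspQI recognition sequence GCTCTTC (check this against the enzyme documentation before relying on it); the function names and the toy contig are hypothetical, and this is not the vendor's tooling.

```python
MOTIF = "GCTCTTC"  # assumed BspQI recognition sequence; not stated on the slide


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def motif_sites(sequence, motif=MOTIF):
    """Return sorted 0-based start positions of the motif on either strand."""
    sequence = sequence.upper()
    sites = set()
    for pattern in {motif, reverse_complement(motif)}:
        start = sequence.find(pattern)
        while start != -1:
            sites.add(start)
            start = sequence.find(pattern, start + 1)
    return sorted(sites)


# Toy contig with one forward-strand and one reverse-strand site.
contig = "AAGCTCTTCAAAAGAAGAGCTT"
print(motif_sites(contig))
```

The distances between consecutive site positions then form a barcode in the same representation as the imaged molecules, which is what makes the two alignable.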

Workflow Applications

Page 3: Jan2016 bio nano han cao


Example of a Typical Irys Raw Data Generation and Next Generation De Novo Genome Map Assembly

Irys® Applied

Metric                            BspQI
Data input (molecules >150 kb)    256 Gb
Single molecule N50               235 kb
Genome map N50                    1.59 Mb
Number of genome maps             2494
Total length                      2.75 Gb
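The N50 figures quoted above follow the usual definition: the length L such that molecules (or maps) of length L or longer contain at least half of the total length. A minimal sketch, with a hypothetical function name and toy input:

```python
def n50(lengths):
    """Return the N50 of a collection of lengths.

    Sort lengths from longest to shortest and accumulate until at least
    half of the total length is covered; the length reached is the N50.
    """
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    raise ValueError("lengths must be non-empty")


# Toy example in kb: total is 1500, and 500 + 400 covers at least half.
print(n50([100, 200, 300, 400, 500]))  # prints 400
```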

0.75 Mb

Page 4: Jan2016 bio nano han cao


Haplotype Aware Assembly


Raw Data (molecules > 150 kb)      Father          Mother          Son
BNG* Data input                    268 Gb (87X)    289 Gb (93X)    340 Gb (110X)
Single molecule N50                304 kb          261 kb          265 kb

Assembly stats
Number of genome maps              2050            2119            2415
Genome map assembly size**         5.24 Gb         5.22 Gb         5.03 Gb
Genome map size N50                4.46 Mb         3.93 Mb         3.43 Mb
Number of maps aligned to hg19     1939            2079            2319
% Genome maps aligned to hg19      95%             98%             96%
% Overlap with hg19                90%             90%             89%

SV Calls (>1 kb)
Deletion                           1215            1192            1158
Insertion                          2468            2440            2417

* BNG: BioNano Genomics
** Diploid assembly
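The coverage and alignment percentages in this table can be reproduced approximately from the raw counts. The sketch below assumes a haploid genome size of about 3.1 Gb, which is not stated on the slide, so the coverage values only agree with the quoted 87X/93X/110X to within rounding. Function names are hypothetical.

```python
GENOME_SIZE_GB = 3.1  # assumed haploid human genome size; not from the slide

# Values transcribed from the table above.
samples = {
    "Father": {"input_gb": 268, "maps": 2050, "aligned": 1939},
    "Mother": {"input_gb": 289, "maps": 2119, "aligned": 2079},
    "Son": {"input_gb": 340, "maps": 2415, "aligned": 2319},
}


def coverage(input_gb, genome_gb=GENOME_SIZE_GB):
    """Fold coverage: total molecule length divided by genome size."""
    return input_gb / genome_gb


def percent_aligned(aligned, total):
    """Percentage of genome maps that aligned to the reference."""
    return 100 * aligned / total


for name, s in samples.items():
    print(
        name,
        f"coverage ~{coverage(s['input_gb']):.1f}X,",
        f"aligned {percent_aligned(s['aligned'], s['maps']):.1f}%",
    )
```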

Page 5: Jan2016 bio nano han cao


Structural Variation Heredity Venn Diagram

[Figure: Venn diagrams of SV heredity across the trio; panels: Insertion (> 1 kb), Deletion (> 1 kb)]
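The regions of a heredity Venn diagram correspond to simple set operations on the SV calls of each family member. The sketch below assumes calls have already been matched across samples and reduced to shared identifiers; that matching step, the function name, and the example identifiers are all hypothetical.

```python
def heredity_classes(son, father, mother):
    """Partition the son's SV calls by which parent(s) share them."""
    return {
        "both_parents": son & father & mother,
        "father_only": (son & father) - mother,
        "mother_only": (son & mother) - father,
        "neither_parent": son - father - mother,
    }


# Hypothetical SV identifiers after cross-sample matching.
son = {"sv1", "sv2", "sv3", "sv4"}
father = {"sv1", "sv2", "sv5"}
mother = {"sv1", "sv3", "sv6"}

for label, calls in heredity_classes(son, father, mother).items():
    print(label, sorted(calls))
```

Calls that land in the "neither_parent" class are candidates for de novo events or for calling errors in one of the three samples.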

Page 6: Jan2016 bio nano han cao


Cross-validation of Various SV Calls


• BioNano SV calls can validate a high proportion of NGS deletion calls from various methods and of insertion calls from “CSHL assembly”, but many BioNano SV calls, especially insertions, are not detectable by NGS.

[Figure: two bar charts (y-axis 0% to 100%), one for deletions and one for insertions. Bars: BNG 2-5 kb, NGS 2-5 kb, BNG >5 kb, NGS >5 kb. Series: PBHoney tails, PBHoney spots (+/-5 kb buffer in the insertion panel), CSHL assembly based, CSHL sniffles, Parliament pacbio, Parliament assembly.]

Page 7: Jan2016 bio nano han cao


Cross-validation of BioNano SV Calls (NIST GIAB-AJ Trio) Using a Compilation of SV Calls from Other NGS Methods


• Compared against a merged NGS call set, BNG can validate most (>2 kb) NGS deletion and insertion calls. NGS can validate 2-5 kb BNG SVs effectively but has low concordance/sensitivity for insertions of any size.

Only call sets with SV size information were included. Concordance is based on >= 1 bp overlap.

* 5 call sets based on PBHoney-tails, CSHL, and Parliament output. PBHoney-spots output was not included.

** 2 call sets based on CSHL output.
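The ">= 1 bp overlap" concordance rule can be sketched as an interval test. This is an illustration only: it assumes calls are (chromosome, start, end) tuples with half-open coordinates, and the function names and example calls are hypothetical.

```python
def overlaps(a, b, min_overlap_bp=1):
    """True if two (chrom, start, end) half-open intervals share enough bases."""
    if a[0] != b[0]:
        return False
    shared = min(a[2], b[2]) - max(a[1], b[1])
    return shared >= min_overlap_bp


def fraction_supported(query_calls, other_calls, min_overlap_bp=1):
    """Fraction of query calls overlapped by at least one call in the other set."""
    if not query_calls:
        return 0.0
    supported = sum(
        any(overlaps(q, o, min_overlap_bp) for o in other_calls)
        for q in query_calls
    )
    return supported / len(query_calls)


# Hypothetical call sets.
bng = [("chr1", 1_000, 4_000), ("chr1", 10_000, 16_000), ("chr2", 500, 3_000)]
ngs = [("chr1", 3_999, 4_500), ("chr2", 3_000, 3_500)]

print("BNG cross-validated by NGS:", fraction_supported(bng, ngs))
print("NGS cross-validated by BNG:", fraction_supported(ngs, bng))
```

The measure is asymmetric, which is why the slide reports "NGS cross-validated by BNG" and "BNG cross-validated by NGS" separately.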

[Figure: two bar charts (y-axis 0% to 100%), Insertion** and Deletion*. Bars: 2-5 kbp, > 5 kbp. Series: NGS cross-validated by BNG, BNG cross-validated by NGS.]

Page 8: Jan2016 bio nano han cao


Cross-validation of BioNano SV Calls (NIST GIAB-AJ Trio) Using a Compilation of SV Calls from Other NGS Methods

BNG SV Size    Overlap with NGS   # SVs (BNG)   % BNG supported   Overlap with NGS   # SVs (BNG)   % BNG supported
1 – 2 kb       204                243           84%               186                689           27%
2 – 5 kb       250                290           86%               219                625           35%
5 – 100 kb     203                312           65%               56                 313           18%
100 kb – Up    15                 29            52%               3                  11            27%
Total          669                869           77%               455                1570          29%

Insertion   Deletion

*Parliament and PBHoney tail calls don’t estimate size
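The "% BNG supported" column is the overlap count divided by the number of BNG SVs in each size bin. The sketch below recomputes it from the transcribed counts of both column groups; the function name is hypothetical, and the groups are referred to only by position because the transcript does not make clear which group is insertions and which is deletions.

```python
def percent_supported(overlap, total):
    """Percentage of BNG SVs in a bin that overlap an NGS call, rounded."""
    return round(100 * overlap / total)


# (size bin, overlap with NGS, number of BNG SVs), as transcribed above.
first_group = [
    ("1 - 2 kb", 204, 243),
    ("2 - 5 kb", 250, 290),
    ("5 - 100 kb", 203, 312),
    ("100 kb - Up", 15, 29),
    ("Total", 669, 869),
]
second_group = [
    ("1 - 2 kb", 186, 689),
    ("2 - 5 kb", 219, 625),
    ("5 - 100 kb", 56, 313),
    ("100 kb - Up", 3, 11),
    ("Total", 455, 1570),
]

for group in (first_group, second_group):
    for size_bin, overlap, total in group:
        print(f"{size_bin:12s} {percent_supported(overlap, total)}%")
    print()
```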

[Figure: scatter plot, BNG vs all 6 NGS-based methods. x-axis: BNG; y-axis: NGS (all methods); both axes run from 0 to 50,000. Series: Deletions, Insertions.]

Page 9: Jan2016 bio nano han cao


Published BioNano CEPH Trio SV Calls

Page 10: Jan2016 bio nano han cao


UGT2B17: Medically Relevant Deletion of a Large Gene Paralog

[Figure: genome maps for son, mother, and father aligned to hg38 chr4. Gene track: Homo sapiens UDP glucuronosyltransferase 2 family, polypeptide B17 (UGT2B17), mRNA]

Page 11: Jan2016 bio nano han cao


Detection of D4Z4 Chr10 CNV in Subtelomeric Region

[Figure: genome maps aligned to hg19; tracks: Mom hap1, Mom hap2, Son hap1, Son hap2, Dad hap1, Dad hap2]

• Each repeat unit contains the gene DUX4, which has two homeoboxes and encodes a transcriptional activator

• Paralog of the D4Z4 region on Chr4, deletion of which can cause Facioscapulohumeral Muscular Dystrophy: < 5 D4Z4 repeat units

• Labeling motif: nicking enzyme BspQI
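A repeat-array copy number can be estimated from a genome map by dividing the span of the array by the repeat unit length. The sketch below assumes a D4Z4 unit of roughly 3.3 kb, which is not stated on the slide, and applies the "< 5 repeat units" threshold quoted above for the Chr4 locus; the function names and example spans are hypothetical.

```python
D4Z4_UNIT_BP = 3_300  # assumed approximate repeat unit length; not from the slide


def estimate_repeat_units(array_span_bp, unit_bp=D4Z4_UNIT_BP):
    """Estimate the number of repeat units from the span of the repeat array."""
    return round(array_span_bp / unit_bp)


def is_contracted(units, threshold=5):
    """True if the array has fewer units than the threshold quoted on the slide."""
    return units < threshold


# Hypothetical array spans measured between flanking labels on two haplotypes.
for name, span_bp in [("hap1", 49_500), ("hap2", 13_200)]:
    units = estimate_repeat_units(span_bp)
    print(name, units, "units, contracted:", is_contracted(units))
```

Because each haplotype is assembled separately, the two alleles can carry different estimates, which is what the per-haplotype tracks in the figure display.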

Page 12: Jan2016 bio nano han cao


Son hap1 Inherited from Mom, Demonstrated by Long Single-Molecule Pileup Support

[Figure: single-molecule pileups; tracks: Son hap1, Mom hap1/2, Son hap2]

Page 13: Jan2016 bio nano han cao


Son hap2 Inherited from Dad hap1, Demonstrated by Long Single-Molecule Pileup Support

[Figure: single-molecule pileups; tracks: Son hap2, Dad hap1, Son hap1]

Page 14: Jan2016 bio nano han cao


Evaluation of Conflicting Alignments and Sequence Assembly Error Correction

*Bickhart and Rosen, USDA

[Figure: alignment view with panels PacBio Only and BioNano Only; tracks: Hybrid scaffold, Hybrid, NGS, Genome maps]

Weak sequence evidence and a conflicting RH map support a sequence chimera.

Page 15: Jan2016 bio nano han cao


Summary

• Fully de novo genome map assembly for genome structure

• Validation of sequence assembly by orthogonal verification

• Hybrid scaffolding of sequence assemblies

• Structural variation detection

• Benchmarking tool for genome assembly and structural variation by genome map and single molecule alignment

Page 16: Jan2016 bio nano han cao


Acknowledgments

• NIST-GIAB

− Justin Zook

− Marc Salit

• BioNano Genomics

− Han Cao

− Alex Hastie

− Zeljko Dzakula

− Ernest Lam

− Tiffany Liang

− Andy Pang

− Thomas Anantharaman

− Khoa Pham

− Will Stedman

• Mt. Sinai School of Medicine

− Ali Bashir

• Duke University

− Erich Jarvis

• USDA

− Derek Bickhart

− Ben Rosen

