the$breakage$fusion$bridge$and$ … · somatic rearrangements in bacs. (a) chromosome 17q21...

55
The Breakage Fusion Bridge and other complex SVs: Combinatorics & Cancer Genomics Vineet Bafna Computer Science & Engineering UCSD

Upload: others

Post on 15-Jul-2020

2 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: The$Breakage$Fusion$Bridge$and$ … · Somatic rearrangements in BACs. (A) Chromosome 17q21 amplicon in HCC1954 including ERBB2; (B) chimeric amplicon in HCC1954 including MYC; (C)

The  Breakage  Fusion  Bridge  and  other  complex  SVs:  Combinatorics  &  

Cancer  Genomics  Vineet  Bafna  

Computer  Science  &  Engineering  UCSD  

Page 2: The$Breakage$Fusion$Bridge$and$ … · Somatic rearrangements in BACs. (A) Chromosome 17q21 amplicon in HCC1954 including ERBB2; (B) chimeric amplicon in HCC1954 including MYC; (C)

Non  Allelic  Homologous  RecombinaFon  (NAHR)  

3/11/13   Bafna  

Page 3: The$Breakage$Fusion$Bridge$and$ … · Somatic rearrangements in BACs. (A) Chromosome 17q21 amplicon in HCC1954 including ERBB2; (B) chimeric amplicon in HCC1954 including MYC; (C)

Can  these  mechanisms  explain  complex  rearrangements?  

Kitada,  2008  Zhao,  Cancer  Res.  2004  

Page 4: The$Breakage$Fusion$Bridge$and$ … · Somatic rearrangements in BACs. (A) Chromosome 17q21 amplicon in HCC1954 including ERBB2; (B) chimeric amplicon in HCC1954 including MYC; (C)

Somatic rearrangements in BACs. (A) Chromosome 17q21 amplicon in HCC1954 including ERBB2; (B) chimeric amplicon in HCC1954 including MYC; (C) chimeric amplicon in NCI-

H2171 including MYC; (D) chromosome 2 amplicon in NCI-H1770 including MYCN. The color of t...

Bignell G R et al. Genome Res. 2007;17:1296-1303

©2007 by Cold Spring Harbor Laboratory Press

…The  architecture  of  rearrangements  in  this  amplicon  recapitulates  remarkably  well  the  structural  features  predicted  by  a  classical  breakage–fusion–bridge  cycle  involving  sister  chromaFds,  as  originally  proposed  by  Barbara  McClintock  (1941)  (see  Supplemental  Fig.  4)….    

….Therefore,  the  ERBB2  amplicon  in  HCC1954  bears  all  the  architectural  hallmarks  at  the  sequence  level  of  a  classical  sister  chromaFd  breakage–fusion–bridge  process,  the  first  Fme  that  this  has  been  demonstrated  in  human  cancer….    

Page 5: The$Breakage$Fusion$Bridge$and$ … · Somatic rearrangements in BACs. (A) Chromosome 17q21 amplicon in HCC1954 including ERBB2; (B) chimeric amplicon in HCC1954 including MYC; (C)

PNAS  1939  

Page 6: The$Breakage$Fusion$Bridge$and$ … · Somatic rearrangements in BACs. (A) Chromosome 17q21 amplicon in HCC1954 including ERBB2; (B) chimeric amplicon in HCC1954 including MYC; (C)

PNAS  1939  

Page 7: The$Breakage$Fusion$Bridge$and$ … · Somatic rearrangements in BACs. (A) Chromosome 17q21 amplicon in HCC1954 including ERBB2; (B) chimeric amplicon in HCC1954 including MYC; (C)

PNAS  1939  

Page 8: The$Breakage$Fusion$Bridge$and$ … · Somatic rearrangements in BACs. (A) Chromosome 17q21 amplicon in HCC1954 including ERBB2; (B) chimeric amplicon in HCC1954 including MYC; (C)

B    A      A    B    C    D    E    F    G  

B    A      A    B    C    D    E    F    G  D    C    B    A    A    B   B    A      A    B    C    D  

     C    C    C    C    D    D    C    B    A    A    B    B    A    A    B    C    D    E    F    G  

Page 9: The$Breakage$Fusion$Bridge$and$ … · Somatic rearrangements in BACs. (A) Chromosome 17q21 amplicon in HCC1954 including ERBB2; (B) chimeric amplicon in HCC1954 including MYC; (C)

Telomere  fusions  in  breast  carcinoma  

Tanaka,  PNAS  2012  

Page 10: The$Breakage$Fusion$Bridge$and$ … · Somatic rearrangements in BACs. (A) Chromosome 17q21 amplicon in HCC1954 including ERBB2; (B) chimeric amplicon in HCC1954 including MYC; (C)

B    A      A    B    C    D    E    F    G  D    C    B    A    A    B  B    A      A    B    C    D    E    F    G  

     C    C    C    C    D    D    C    B    A    A    B    B    A    A    B    C    D    E    F    G  

   A                  B                C                D                E                F                G  

The  BFB-­‐count-­‐vector  problem  

While  we  cannot  sequence  and  assemble  the  genome,  we  can  measure  copy  counts  of  individual  segments  

Page 11: The$Breakage$Fusion$Bridge$and$ … · Somatic rearrangements in BACs. (A) Chromosome 17q21 amplicon in HCC1954 including ERBB2; (B) chimeric amplicon in HCC1954 including MYC; (C)

The  BFB  reconstrucFon  problem  

A    B    C    D    E    F    G  

Given  a  segmentaFon  of  the  genome,  and  counts  of  copy  numbers  for  each  segment,  are  the  copy  numbers  consistent  with  a  BFB  sequence?  

A    B    C    D    E    F    G  

✔   ✗  

Page 12: The$Breakage$Fusion$Bridge$and$ … · Somatic rearrangements in BACs. (A) Chromosome 17q21 amplicon in HCC1954 including ERBB2; (B) chimeric amplicon in HCC1954 including MYC; (C)

Input:  12      10      2  

Page 13: The$Breakage$Fusion$Bridge$and$ … · Somatic rearrangements in BACs. (A) Chromosome 17q21 amplicon in HCC1954 including ERBB2; (B) chimeric amplicon in HCC1954 including MYC; (C)

BFB Trees

Page 14: The$Breakage$Fusion$Bridge$and$ … · Somatic rearrangements in BACs. (A) Chromosome 17q21 amplicon in HCC1954 including ERBB2; (B) chimeric amplicon in HCC1954 including MYC; (C)

ProperFes  of  BFB  trees  

•  ProperFes    –  Long  ends  –  Node  symmetry  –  Pair  Symmetry  

•  A  traversal  on  a  tree  with  these  3  properFes  is  a  BFB  sequence.  

Pairs  

Page 15: The$Breakage$Fusion$Bridge$and$ … · Somatic rearrangements in BACs. (A) Chromosome 17q21 amplicon in HCC1954 including ERBB2; (B) chimeric amplicon in HCC1954 including MYC; (C)

Finding Longest Palindromes

2 10 12

1

1·n1= 5

122

2·n1 + 2·n2 + 1·n3 = 6

2 4 2·n1 + 4·n2 = x

Page 16: The$Breakage$Fusion$Bridge$and$ … · Somatic rearrangements in BACs. (A) Chromosome 17q21 amplicon in HCC1954 including ERBB2; (B) chimeric amplicon in HCC1954 including MYC; (C)

BFB-Tree Has Better Worst-Case Performance

BFB-Tree for [20, 3, 19, 19, 19] took 12 seconds. BFB-Pivot for [20, 18, 20, 20, 6] took 22,251 seconds

Page 17: The$Breakage$Fusion$Bridge$and$ … · Somatic rearrangements in BACs. (A) Chromosome 17q21 amplicon in HCC1954 including ERBB2; (B) chimeric amplicon in HCC1954 including MYC; (C)

Most Count Vectors Do Not Admit BFB

•  Expanded analysis to 64,000,000 count vectors of length <= 6 and ni <= 20. Of those count vectors, only 7.2% can be achieved by BFB.

•  Finding a count vector achievable by BFB is strong evidence BFB occurred?

Page 18: The$Breakage$Fusion$Bridge$and$ … · Somatic rearrangements in BACs. (A) Chromosome 17q21 amplicon in HCC1954 including ERBB2; (B) chimeric amplicon in HCC1954 including MYC; (C)

Experimental Count Vectors Are Imprecise

Page 19: The$Breakage$Fusion$Bridge$and$ … · Somatic rearrangements in BACs. (A) Chromosome 17q21 amplicon in HCC1954 including ERBB2; (B) chimeric amplicon in HCC1954 including MYC; (C)

Lung cancer cell-line PTX250

[3  3  4  6  34]  

Page 20: The$Breakage$Fusion$Bridge$and$ … · Somatic rearrangements in BACs. (A) Chromosome 17q21 amplicon in HCC1954 including ERBB2; (B) chimeric amplicon in HCC1954 including MYC; (C)

Many CVs close to a BFB sequence

Page 21: The$Breakage$Fusion$Bridge$and$ … · Somatic rearrangements in BACs. (A) Chromosome 17q21 amplicon in HCC1954 including ERBB2; (B) chimeric amplicon in HCC1954 including MYC; (C)

A non-BFB count vector

[1, 2, 1, 2, 8, 1, 3, 1, 3] à [2, 2, 2, 2, 8, 3, 3, 3, 3]

Zhao,  Cancer  Res.  2004   (lung  cancer  cell-­‐line  NCI-­‐H2171)  

Page 22: The$Breakage$Fusion$Bridge$and$ … · Somatic rearrangements in BACs. (A) Chromosome 17q21 amplicon in HCC1954 including ERBB2; (B) chimeric amplicon in HCC1954 including MYC; (C)

A  linear  Fme  algorithm  for  BFB  

•  Once  you  allow  errors  in  counts,  many  count  vectors  can  conceivably  be  explained  by  BFB,  although  there  are  a  few  that  are  conclusivey  not  BFB  

•  What  if  we  choose  large  numbers  of  segments  (k=8,10,12)?  The  BFB-­‐tree  heurisFc  is  much  too  slow  to  deal  with  large  data  sets.  

Page 23: The$Breakage$Fusion$Bridge$and$ … · Somatic rearrangements in BACs. (A) Chromosome 17q21 amplicon in HCC1954 including ERBB2; (B) chimeric amplicon in HCC1954 including MYC; (C)

Count  vector  canonizaFon  

4   D   D   D   D   D   D   D   D  3   C   C   C   C   C   C   C   C   C   C   C   C  2   B   B   B   B   B   B   B   B   B   B   B   B   B   B   B   B   B   B  1   A   A   A   A   A   A   A   A   A   A   A   A   A   A  0   $   $  

2" ′=[2,  14,  18,  12,  8]  

Page 24: The$Breakage$Fusion$Bridge$and$ … · Somatic rearrangements in BACs. (A) Chromosome 17q21 amplicon in HCC1954 including ERBB2; (B) chimeric amplicon in HCC1954 including MYC; (C)

Layer  4  

2" ↓4 =[0,  0,  0,  0,  8]  

%↓4   

4   D   D   D   D   D   D   D   D  3   C   C   C   C   C   C   C   C   C   C   C   C  2   B   B   B   B   B   B   B   B   B   B   B   B   B   B   B   B   B   B  1   A   A   A   A   A   A   A   A   A   A   A   A   A   A  0   $   $  

•  4×  DD  

Page 25: The$Breakage$Fusion$Bridge$and$ … · Somatic rearrangements in BACs. (A) Chromosome 17q21 amplicon in HCC1954 including ERBB2; (B) chimeric amplicon in HCC1954 including MYC; (C)

Layer  3  

•  There  are  6  CC  at  level  3.  •  Soln:    

•  Add  2  ε  to  the  4  DDs  •  Wrap:  to  get  4XCDDC,  2XCC  

2" ↓3 =[0,  0,  0,  12,  8]  

%↓3   4   D   D   D   D   D   D   D   D  3   C   C   C   C   C   C   C   C   C   C   C   C  2   B   B   B   B   B   B   B   B   B   B   B   B   B   B   B   B   B   B  1   A   A   A   A   A   A   A   A   A   A   A   A   A   A  0   $   $  

Page 26: The$Breakage$Fusion$Bridge$and$ … · Somatic rearrangements in BACs. (A) Chromosome 17q21 amplicon in HCC1954 including ERBB2; (B) chimeric amplicon in HCC1954 including MYC; (C)

Layer  2  

•  We  have  6  blocks  (4XCDDC,  2XCC)  at  level  2,  but  9  at  level  2.  

•  Soln:  Add  3  ε  •  Wrap  to  get  4XBCDDCB,  2XBCCB,  3XBB  

2" ↓2 =[0,  0,  18,  12,  8]  4   D   D   D   D   D   D   D   D  3   C   C   C   C   C   C   C   C   C   C   C   C  2   B   B   B   B   B   B   B   B   B   B   B   B   B   B   B   B   B   B  1   A   A   A   A   A   A   A   A   A   A   A   A   A   A  0   $   $  

Page 27: The$Breakage$Fusion$Bridge$and$ … · Somatic rearrangements in BACs. (A) Chromosome 17q21 amplicon in HCC1954 including ERBB2; (B) chimeric amplicon in HCC1954 including MYC; (C)

Layer  1  

•  9  blocks  at  level  2,  7  at  Block  1  •  Fold  2X{BCDDCB-­‐BCCB-­‐BCDDCB}  to  get  5  blocks.  •  Add  2  ε  to  get  7  •  Wrap  to  get  2X{A-­‐BCDDCB-­‐BCCB-­‐BCDDCB-­‐A},  3XABBA,  2XAA.  

2" ↓1 =[0,  14,  18,  12,  8]  4   D   D   D   D   D   D   D   D  3   C   C   C   C   C   C   C   C   C   C   C   C  2   B   B   B   B   B   B   B   B   B   B   B   B   B   B   B   B   B   B  1   A   A   A   A   A   A   A   A   A   A   A   A   A   A  0   $   $  

Page 28: The$Breakage$Fusion$Bridge$and$ … · Somatic rearrangements in BACs. (A) Chromosome 17q21 amplicon in HCC1954 including ERBB2; (B) chimeric amplicon in HCC1954 including MYC; (C)

Layer  0  

•  Fold  all  of  the  7  blocks  at  level  1  to  get  a  single  block.  

•  Wrap  the  $  around  the  block  

2" ↓0 =[2,  14,  18,  12,  8]  4   D   D   D   D   D   D   D   D  3   C   C   C   C   C   C   C   C   C   C   C   C  2   B   B   B   B   B   B   B   B   B   B   B   B   B   B   B   B   B   B  1   A   A   A   A   A   A   A   A   A   A   A   A   A   A  0   $   $  

Page 29: The$Breakage$Fusion$Bridge$and$ … · Somatic rearrangements in BACs. (A) Chromosome 17q21 amplicon in HCC1954 including ERBB2; (B) chimeric amplicon in HCC1954 including MYC; (C)

Wrapping  and  folding  

•  The  algorithm  does  a  sequence  of  wrapping  and  folding.  

•  The  folding  must  be  done  in  a  special  way  so  that  a  greedy  algorithm  always  folds  correctly.  

Page 30: The$Breakage$Fusion$Bridge$and$ … · Somatic rearrangements in BACs. (A) Chromosome 17q21 amplicon in HCC1954 including ERBB2; (B) chimeric amplicon in HCC1954 including MYC; (C)

Folding  

•  Only  the  height  of  the  block  is  important.  Sort  the  blocks  by  their  heights.  

•  Consider  a  non-­‐increasing  (by  height)  collecFon  with  2i  copies  of  the  i-­‐th  height  block.  

•  We  call  such  a  collecFon,  an  l-­‐convex  collec:on.  •  An  l-­‐convex-­‐collecFon  can  be  rearranged  into  an  l  convex  plalindrome  

Page 31: The$Breakage$Fusion$Bridge$and$ … · Somatic rearrangements in BACs. (A) Chromosome 17q21 amplicon in HCC1954 including ERBB2; (B) chimeric amplicon in HCC1954 including MYC; (C)

Folding  

•  An  l-­‐convex-­‐collecFon  can  be  rearranged  into  an  l  convex  palindrome.  

•  An  l-­‐convex-­‐palindrome  may  not  be  a  BFB-­‐block.  •  However,  all  BFB-­‐palindromes  can  be  created  by  flanking  an  l-­‐BFB-­‐palindrome  by  two  blocks  of  max  (or  higher  height)  

Page 32: The$Breakage$Fusion$Bridge$and$ … · Somatic rearrangements in BACs. (A) Chromosome 17q21 amplicon in HCC1954 including ERBB2; (B) chimeric amplicon in HCC1954 including MYC; (C)

The  algorithm  

•  In  going  from  layer  I  to  layer  i-­‐1  –  If  the  number  of  blocks  in  layer  i-­‐1  is  larger,  just  add  extra  ε  

–  If  the  number  is  smaller,  do  a  series  of  l-­‐convex-­‐collecFon  folding  operaFons,  each  Fme  reducing  the  number  of  blocks  by  a  power  of  2.  If  you  jump  over,  add  extra  ε  

•  How  can  we  fold  in  a  way  that  will  not  corrupt  the  blocks  for  future  layers?  

Page 33: The$Breakage$Fusion$Bridge$and$ … · Somatic rearrangements in BACs. (A) Chromosome 17q21 amplicon in HCC1954 including ERBB2; (B) chimeric amplicon in HCC1954 including MYC; (C)

Recursive  Block  decomposiFons  

Page 34: The$Breakage$Fusion$Bridge$and$ … · Somatic rearrangements in BACs. (A) Chromosome 17q21 amplicon in HCC1954 including ERBB2; (B) chimeric amplicon in HCC1954 including MYC; (C)

Recursive  Block  decomposiFons  

•  Remove  the  odd  block(s)  B,  and  put  them  in  the  center.  •  Blocks  higher  than  B  can  be  put  around  B  •  Divide  remaining  blocks  into  half,  and  recurse.  

Page 35: The$Breakage$Fusion$Bridge$and$ … · Somatic rearrangements in BACs. (A) Chromosome 17q21 amplicon in HCC1954 including ERBB2; (B) chimeric amplicon in HCC1954 including MYC; (C)

Recursive  Block  decomposiFons  

d   Bd  (half  of  remaining)   Ld  (odd  blocks)   Hd  (higher  than  odd)   sd  

0   {4β10,  2β7,  2β9,  β8}   β8   2β7  

1   {2β10,  β9}   β9   Φ  

2   β10   β10   Φ  

3   Φ   Φ   Φ  

Page 36: The$Breakage$Fusion$Bridge$and$ … · Somatic rearrangements in BACs. (A) Chromosome 17q21 amplicon in HCC1954 including ERBB2; (B) chimeric amplicon in HCC1954 including MYC; (C)

Block  decomposiFons  and  signature  

d   Bd  (half  of  remaining)   Ld  (odd  blocks)   Hd  (higher  than  odd)   sd  

0   {4β10,  2β7,  2β9,  β8}   β8   2β7   1  

1   {2β10,  β9}   β9   Φ   0  

2   β10   β10   Φ   0  

3   Φ   Φ   Φ   -­‐1  

s0 = L0

sd = Ld ! Ld!1 !12Hd!1 +max sd!1, 0{ }

Page 37: The$Breakage$Fusion$Bridge$and$ … · Somatic rearrangements in BACs. (A) Chromosome 17q21 amplicon in HCC1954 including ERBB2; (B) chimeric amplicon in HCC1954 including MYC; (C)

Signature  

•  We  order  the  signature  using  lexicographic  sort.  

•  Let  B1,  B2  be  block  collecFons  of  the  same  size  with  s(B1)  <=  s(B2).    

•  Theorem  1:  if  B2  can  be  folded  to  size  n,  B1  can  be  folded  to  size  n  

•  Theorem  2:We  can  fold  in  a  way  so  as  to  always  achieve  the  minimum  signature  in  the  folded  block  

Page 38: The$Breakage$Fusion$Bridge$and$ … · Somatic rearrangements in BACs. (A) Chromosome 17q21 amplicon in HCC1954 including ERBB2; (B) chimeric amplicon in HCC1954 including MYC; (C)

Results  •  There  is  an  algorithm  following  the  described  sketch,  who  succeeds  to  produce  a  BFB-­‐string  corresponding  to  the  input  vector  iff  there  exists  such  a  string.  –  Time  complexity:  &(("),  where  "  is  the  maximum  count  in  the  vector  (i.e.  linear  in  the  length    of  the  output  string,  yet  exponenFal  in  the  length  of  the  input  representaFon,  which  is  &((log " )).  

–  Variants:  •  An  &((log " )  algorithm  for  the  decision  problem  •  An  &(("↑2 log " )  algorithm  for  the  problem,  in  the  case  where  it  is  assumed  that  there  are  measurement  errors  in  the  input  

Page 39: The$Breakage$Fusion$Bridge$and$ … · Somatic rearrangements in BACs. (A) Chromosome 17q21 amplicon in HCC1954 including ERBB2; (B) chimeric amplicon in HCC1954 including MYC; (C)

Using  BFB  in  pracFce  

•  For  each  window,  we  output  the  distance  between  the  example,  and  the  nearest  BFB  

•  We  apply  a  sliding  window.  At  each  posiFon  (and  window-­‐length),  we  idenFfy  a  count-­‐vector  with  minimum  distance  to  a  BFB.  

•  If  the  distance  meets  a  threshold,  we  accept,  else  reject  

Page 40: The$Breakage$Fusion$Bridge$and$ … · Somatic rearrangements in BACs. (A) Chromosome 17q21 amplicon in HCC1954 including ERBB2; (B) chimeric amplicon in HCC1954 including MYC; (C)

DetecFng  BFB  versus  other  rearrangements  

•  The  BFB  operaFons  take  place  among  other  rearrangements  that  change  the  copy  number  

•  The  count  of  copies  is  inexact  at  best  •  NegaFve  examples:  We  simulate  15000  diploid  chromosomes  undergoing  over  50  rearrangements  (inversions,  tandem  duplicaFons,  deleFons)  

•  PosiFve  examples:  5000  cases  with  50  rearrangements  on  diploid  chromosomes,  and  one  with  2-­‐10  BFB  operaFons  

•  In  each  case,  a  count-­‐vector  is  created  •  δ=  min.  distance  of  the  count  vector  from  a  true  BFB  

Page 41: The$Breakage$Fusion$Bridge$and$ … · Somatic rearrangements in BACs. (A) Chromosome 17q21 amplicon in HCC1954 including ERBB2; (B) chimeric amplicon in HCC1954 including MYC; (C)

63%  FPR  

Page 42: The$Breakage$Fusion$Bridge$and$ … · Somatic rearrangements in BACs. (A) Chromosome 17q21 amplicon in HCC1954 including ERBB2; (B) chimeric amplicon in HCC1954 including MYC; (C)

Fold-­‐backs  •  Long  count  vectors  certainly  help,  but  they  might  not  be  enough  either.  

•  If  10%  of  the  cases  are  BFB,  a  10%  FPR  rate  becomes  50%FP  in  naturally  occurring  samples.  

•  Working  with  1%  FPR  reduces  TP  to  18%.  •  Excess  of  fold-­‐backs  (e.g.  –A,  A)  might  provide  other  clues  

-­‐C    -­‐B    -­‐A    A    B    C    D    E    F  

Page 43: The$Breakage$Fusion$Bridge$and$ … · Somatic rearrangements in BACs. (A) Chromosome 17q21 amplicon in HCC1954 including ERBB2; (B) chimeric amplicon in HCC1954 including MYC; (C)

DetecFng  fold-­‐backs  

Donor  

Reference  

Page 44: The$Breakage$Fusion$Bridge$and$ … · Somatic rearrangements in BACs. (A) Chromosome 17q21 amplicon in HCC1954 including ERBB2; (B) chimeric amplicon in HCC1954 including MYC; (C)

A  simple  score  funcFon  

•  f  =  fracFon  of  breakpoints  that  are  fold-­‐backs  (higher  the  bewer)  

•  δ=  min.  distance  of  the  count  vector  from  a  true  BFB  

•  s=(1-­‐λ)δ+λ(1-­‐f)  is  a  generic  score  funcFon  (lower  the  bewer)  

Page 45: The$Breakage$Fusion$Bridge$and$ … · Somatic rearrangements in BACs. (A) Chromosome 17q21 amplicon in HCC1954 including ERBB2; (B) chimeric amplicon in HCC1954 including MYC; (C)

63%  FPR  

Page 46: The$Breakage$Fusion$Bridge$and$ … · Somatic rearrangements in BACs. (A) Chromosome 17q21 amplicon in HCC1954 including ERBB2; (B) chimeric amplicon in HCC1954 including MYC; (C)

DetecFng  BFBs  •  With  errors  in  segment  copy  counts,  it  is  easy  to  create  short,  false  BFB-­‐like  count  vectors.  

•  We  cannot  detect  short  BFB-­‐cycles.  •  With  8  or  more  BFB  cycles,  and  correspondingly  long  count-­‐vectors,  we  need  advanced  algorithms  to  idenFfy  BFBs.  

•  Long  count  vectors  allow  us  to  detect  BFB  operaFons  with  high  confidence  even  amidst  noise  of  other  rearrangements.  

•  Other  signals,  such  as  fold-­‐back  operaFons  help  in  further  improving  the  signal.  

Page 47: The$Breakage$Fusion$Bridge$and$ … · Somatic rearrangements in BACs. (A) Chromosome 17q21 amplicon in HCC1954 including ERBB2; (B) chimeric amplicon in HCC1954 including MYC; (C)

Campbell,  Nature  2010.  (Exemplar  of  BFB  in  paQent  with  pancreaQc  cancer)  •  Note  the  excess  of  Head-­‐head  fold-­‐backs  relaFve  to  others  rearrangements  •   We  idenFfy  a  count-­‐vector  of  length  10  with  a  very  low  distance  threshold  that  

allows  us  to  detect  this  as  BFB  with  some  confidence.  

Page 48: The$Breakage$Fusion$Bridge$and$ … · Somatic rearrangements in BACs. (A) Chromosome 17q21 amplicon in HCC1954 including ERBB2; (B) chimeric amplicon in HCC1954 including MYC; (C)

Chr  12:  BFB  in  primary  tumor  (pancreaQc)  sample  

Page 49: The$Breakage$Fusion$Bridge$and$ … · Somatic rearrangements in BACs. (A) Chromosome 17q21 amplicon in HCC1954 including ERBB2; (B) chimeric amplicon in HCC1954 including MYC; (C)

Chromothripsis  

Page 50: The$Breakage$Fusion$Bridge$and$ … · Somatic rearrangements in BACs. (A) Chromosome 17q21 amplicon in HCC1954 including ERBB2; (B) chimeric amplicon in HCC1954 including MYC; (C)
Page 51: The$Breakage$Fusion$Bridge$and$ … · Somatic rearrangements in BACs. (A) Chromosome 17q21 amplicon in HCC1954 including ERBB2; (B) chimeric amplicon in HCC1954 including MYC; (C)
Page 52: The$Breakage$Fusion$Bridge$and$ … · Somatic rearrangements in BACs. (A) Chromosome 17q21 amplicon in HCC1954 including ERBB2; (B) chimeric amplicon in HCC1954 including MYC; (C)
Page 53: The$Breakage$Fusion$Bridge$and$ … · Somatic rearrangements in BACs. (A) Chromosome 17q21 amplicon in HCC1954 including ERBB2; (B) chimeric amplicon in HCC1954 including MYC; (C)

ObservaFons    •  Massive  remodeling  of  a  single  chromosome  •  Tens  to  hundreds  of  genomic  rearrangements  •  Very  few  copy  number  states  (two  or  perhaps  three).  •  RetenFon  of  heterozygosity  in  regions  with  higher  copy  number.  •  Breakpoints  show  significantly  more  clustering  than  observed  by  

chance.  

Page 54: The$Breakage$Fusion$Bridge$and$ … · Somatic rearrangements in BACs. (A) Chromosome 17q21 amplicon in HCC1954 including ERBB2; (B) chimeric amplicon in HCC1954 including MYC; (C)
Page 55: The$Breakage$Fusion$Bridge$and$ … · Somatic rearrangements in BACs. (A) Chromosome 17q21 amplicon in HCC1954 including ERBB2; (B) chimeric amplicon in HCC1954 including MYC; (C)

Concluding  thoughts:  complex  structural  variaFons,  or  not?  

•  There  are  many  lines  of  enquiry  for  complex  structural  variaFon  –  Given  fragmentary  data,  can  you  use  it  to  reconstruct  the  architecture  of  the  sampled  genome  (AKA  sequence  assembly)  

–  Given  a  set  of  operaFons,  predict  a  plausible  sequence  of  events  (or  the  number)  that  takes  a  sampled  genome  to  a  reference  target  (AKA  genome  rearrangements)  

–  Given  a  sampled  genome,  what  kind  of  rearrangements  are  the  most  likely  explanaFon?  Is  there  evidence  for  more  complex  rearrangements  (e.g.,  this  talk)?