architecture and evolution of neochromosomes
Architecture and evolution of cancer neochromosomes
Tony Papenfuss
Bioinformatics Division, The Walter and Eliza Hall Institute of Medical Research
Bioinformatics and Cancer Genomics Lab, Peter MacCallum Cancer Centre
What are chromosome-scale mutations?
• Large-scale changes to chromosomes found in cancers (& congenitally)
• Includes structures and processes
• Often complex; require ways of thinking and analysing data to make sense of them
Kataegis
• Greek for thunderstorm
• Localized hypermutation
Figure: genomic distance plotted against single nucleotide variant index
Alexandrov et al, Nature 2013
Chromothripsis: chromosomal shattering
Stephens et al. Cell 2011
“Criteria” for chromothripsis
Korbel & Campbell, Cell 2013
Linear Breakage-Fusion-Bridge cycle
Gisselsson et al, Hum Genet 1999
McClintock, Genetics 1941
Figure labels: Telomere loss; Replication; Di-centric; Torn apart at cell division; Daughter cells inherit different chromosomes lacking telomeres
BFB generates inverted duplications
Giant cancer-associated neochromosomes
• Giant super-numerary chromosomes found in some cancers
• Little studied
• Linear or circular (rings)
• Gigantic
• NCs have centromeres & telomeres
• Harbour known oncogenes
Sandberg, Cancer Genet Cytogenet (2004)
Prevalence of neochromosomes in cancers
Class            Tumour type                        NC prevalence
Mesenchymal      Parosteal osteosarcoma             90%
                 Well-differentiated liposarcoma    85%
                 De-differentiated liposarcoma      82%
                 Dermatofibrosarcoma protuberans    67%
                 Overall                            14%
Haematological   Dendritic cell neoplasm            24%
                 Large B cell lymphoma              18%
                 Overall                            3%
Garsed, Holloway & Thomas, Bioessays (2009)
Hereditas, Volume 42, Issue 3-4, 1956
The remarkable case of four interlocked rings forming a chain (Fig.2v) was found in a cell with quadruple chromosome number, the rest of the chromosomes being arranged in quartets. This structure is unparalleled in chromosome experience, as far as we know…
Low resolution studies of neochromosomes in well-differentiated liposarcoma (WD-LPS)
• Neochromosomes are composed of material from multiple chromosomes
• Chr12 always present
• High level of amplification of known oncogenes: MDM2, CDK4, HMGA2
Figure: mFISH, 778 WD-LPS cell line
Pedeutour et al. Genes Chromosomes Cancer, 1999
Amplification of neochromosomal material may involve circular breakage-fusion-bridge cycles
Gisselsson et al, Hum Genet 1999
Sequencing of WD-LPS neochromosomes
• Flow enriched neochromosome isoforms from 5 cell lines: 449 (primary), 778 (recurrence), GOT3, T1000, LPS141
• Sequenced enriched neochromosomes to 5X-30X coverage per cell line
• Performed RNA-seq in all cell lines
• Sequenced 2 patient primary WD-LPS tumours to ~30X coverage
• FISH and ChIP-seq (CENP-A) studies of centromeres in cell lines
Figure (778 cell line): GC content vs. chromosome size, with 778 chrA and 778 chrB shown alongside the normal chromosomes
Initial analysis
• Align the reads back to the human reference genome (hg19)
• No recurrent single nucleotide variants or indels on the neochromosome
• Several fusion genes containing novel exons
• No recurrent fusion genes
Estimation of copy number
Figure (GC bias): number of reads vs. GC (%)
Figure (778 Chr12 - Read depth): number of reads (5 kb windows) vs. position (Mb)
Estimation of copy number: Background correction and calibration
Figures (778 Chr12 - Copy number; 778 Chr22 - Copy number): copy number vs. position (Mb)
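The read-depth steps on these slides (count reads in 5 kb windows, correct for GC bias, calibrate to copy number) might be sketched as below. This is a minimal illustration, not the pipeline used in the talk: the per-GC-bin median correction, the function names, and the assumption that the read depth of a known 2-copy baseline region is available are all assumptions made for the example.

```python
from collections import defaultdict
from statistics import median

def gc_correct(depths, gc_fracs, bin_width=0.05):
    """Rescale each window's depth by (global median / median depth of its GC bin).

    Assumed correction scheme; the slides do not say which method was used.
    """
    bins = defaultdict(list)
    for depth, gc in zip(depths, gc_fracs):
        bins[int(gc / bin_width)].append(depth)
    overall = median(depths)
    bin_median = {b: median(v) for b, v in bins.items()}
    corrected = []
    for depth, gc in zip(depths, gc_fracs):
        m = bin_median[int(gc / bin_width)]
        corrected.append(depth * overall / m if m > 0 else 0.0)
    return corrected

def to_copy_number(corrected, baseline_depth, baseline_cn=2):
    """Calibrate corrected depth against a region of known copy number (assumed diploid)."""
    return [baseline_cn * d / baseline_depth for d in corrected]

# Toy data: windows with higher GC content have artificially doubled depth.
depths = [100, 100, 200, 200]
gc_fracs = [0.41, 0.43, 0.61, 0.63]
corrected = gc_correct(depths, gc_fracs)
copy_number = to_copy_number(corrected, baseline_depth=75)
```

In a heavily amplified sample the per-bin medians would themselves be shifted by amplified windows, so a real analysis would probably fit the GC curve on regions believed to be at baseline copy number.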
Neochromosomes are composed of 100s of highly amplified genomic intervals derived focally from nearly every chromosome
Figures (778 Chr12 - Copy number; ST059 Chr12 - Copy number): copy number vs. position (Mb)
Amplified genomic intervals show high level of internal rearrangement
778 Chr12
How are these connected?
Detecting genomic fusions
(ii) Discordant reads
(iii) Split reads
Copy number profile
Discordant read counts
Discordantly aligned read pairs
Breakdancer
778 Chr12
DRClusterSegmentation
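As a rough illustration of how discordantly aligned read pairs can be grouped into candidate genomic fusions, here is a minimal greedy clustering sketch. It is not Breakdancer's algorithm or the talk's own method; the tuple layout `(chrom1, pos1, chrom2, pos2)`, the function name, and the `max_gap` and `min_support` defaults are assumptions made for the example.

```python
def cluster_discordant_pairs(pairs, max_gap=500, min_support=3):
    """Group discordant read pairs that point at the same fusion.

    A pair joins a cluster when both of its ends map to the same chromosomes as,
    and within max_gap bp of, the most recent member of that cluster.
    Clusters with fewer than min_support pairs are discarded as likely noise.
    """
    clusters = []
    for pair in sorted(pairs):
        c1, p1, c2, p2 = pair
        for cluster in clusters:
            k1, q1, k2, q2 = cluster[-1]
            same_chroms = (c1, c2) == (k1, k2)
            if same_chroms and abs(p1 - q1) <= max_gap and abs(p2 - q2) <= max_gap:
                cluster.append(pair)
                break
        else:
            # No existing cluster matched, so start a new one.
            clusters.append([pair])
    return [c for c in clusters if len(c) >= min_support]

# Toy input: three pairs supporting one chr12-chr1 fusion, plus one stray pair.
pairs = [
    ("chr12", 1000, "chr1", 5000),
    ("chr12", 1100, "chr1", 5100),
    ("chr12", 1200, "chr1", 5050),
    ("chr12", 90000, "chr7", 300),
]
fusions = cluster_discordant_pairs(pairs)
```

Split reads, the other evidence type listed above, would then be used to refine each cluster to a base-pair breakpoint; that step is not shown.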
778 neochromosome
• 260 genomic intervals
• >500 genomic fusions
• Nearly every chromosome involved
Figure panels: 778, ST059, ST079, LPS141, GOT3, T1000
Identifying material on the neochromosome
Initial segmentation by thresholding
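Segmentation by thresholding can be sketched as follows: consecutive windows whose copy number exceeds a cut-off are merged into one amplified interval. The function name, the window size and the threshold value are assumptions for illustration; the slide does not state the values that were used.

```python
def segment_by_threshold(copy_numbers, window_size, threshold):
    """Return (start, end) intervals in bp, end exclusive, where copy number > threshold."""
    intervals = []
    start = None
    for i, cn in enumerate(copy_numbers):
        if cn > threshold and start is None:
            start = i  # an amplified run begins at this window
        elif cn <= threshold and start is not None:
            intervals.append((start * window_size, i * window_size))
            start = None
    if start is not None:
        # The final run extends to the end of the chromosome.
        intervals.append((start * window_size, len(copy_numbers) * window_size))
    return intervals

# Toy copy number profile in 5 kb windows, with a hypothetical cut-off of 8 copies.
profile = [2, 2, 20, 25, 2, 30, 30]
amplified = segment_by_threshold(profile, window_size=5000, threshold=8)
```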
Scaffolding amplified genomic intervals
Large-scale structure of neochromosomes
• 778 NC contains a 221Mb core region
• Derived from 31Mb of the genome & includes 289 genes
• Low copy regions derived from Chr7 & 22 provide telomeric caps
• Similar structures in the other lines
• Only 1.4Mb of sequence, containing 24 genes, is recurrently amplified
How did the neochromosomes form?
Circular versus linear breakage fusion bridge
• Linear breakage fusion bridge:
• Expected to generate an excess of inversions—not observed
• Generates inverted duplications—not observed
• Additionally, for 778/449, ring chromosomes were observed in the patient primary
Fusions at the edges of amplified genomic regions on Chr12 frequently connect to other edges on Chr12
Edge to edge fusions
Figure labels: A B C; AB C; Assembly
We can construct walks across most of Chr12 via intra-chromosomal edge to edge fusions
Edge to edge fusions
Amplified genomic intervals
Chr12
ST059
778
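One way to picture an edge to edge walk: treat each amplified interval as having a left and a right edge, map fusion breakpoints onto nearby edges, then alternate between crossing an interval and following a fusion from its far edge. The sketch below is a simple greedy walk under those assumptions; the function names, the `tol` tolerance and the toy coordinates are hypothetical, and the talk's own construction (including its formal statistical tests) is not reproduced here.

```python
def nearest_edge(pos, intervals, tol):
    """Return (interval index, 'L' or 'R') if pos lies within tol bp of an interval edge."""
    for i, (start, end) in enumerate(intervals):
        if abs(pos - start) <= tol:
            return (i, "L")
        if abs(pos - end) <= tol:
            return (i, "R")
    return None

def edge_graph(intervals, fusions, tol=1000):
    """Keep only fusions whose two breakpoints both sit at edges of different intervals."""
    graph = {}
    for a, b in fusions:
        ea = nearest_edge(a, intervals, tol)
        eb = nearest_edge(b, intervals, tol)
        if ea is not None and eb is not None and ea[0] != eb[0]:
            graph.setdefault(ea, set()).add(eb)
            graph.setdefault(eb, set()).add(ea)
    return graph

def walk(graph, start_interval, exit_side="R"):
    """Greedy walk (no backtracking): leave by one edge, enter the next interval, cross it."""
    visited = [start_interval]
    edge = (start_interval, exit_side)
    while True:
        options = sorted(e for e in graph.get(edge, ()) if e[0] not in visited)
        if not options:
            return visited
        i, side = options[0]
        visited.append(i)
        edge = (i, "L" if side == "R" else "R")  # exit from the opposite edge

# Toy example: three intervals on one chromosome and two edge to edge fusions.
intervals = [(0, 100), (500, 600), (900, 1000)]
fusions = [(100, 900), (1000, 600)]
path = walk(edge_graph(intervals, fusions, tol=10), start_interval=0)
```

A walk that covers most intervals using only intra-chromosomal edge to edge fusions is the pattern the slides interpret as evidence that the fragments were once joined directly, before amplification.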
Edge to edge walks involving Chr12 suggest chromothripsis prior to amplification
and initially affecting only Chr12
T1000: initiated from a fusion chromosome?
Inferring chromothripsis
A. Non-uniform clustering of breakpoints (✔; qq-plot)
B. Regularity of oscillating copy number (½; not Chr12)
C. Interspersed LOH (½; mono-allelic amplification)
D. Rearrangements affect 1 haplotype (✗)
E. Random ordering of fragments and fusion types (✔)
F. Walks (✔; edge to edge walk & formal statistical tests)
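Criterion A is commonly assessed by comparing the distances between adjacent breakpoints with an exponential distribution, which is what randomly scattered breakpoints would give. The sketch below computes qq-plot points and a Kolmogorov-Smirnov style distance under that assumption. The function names are hypothetical, and this is not necessarily the test used in the talk.

```python
import math
import random

def gaps_between(breakpoints):
    """Sorted distances between adjacent breakpoints."""
    pts = sorted(breakpoints)
    return sorted(b - a for a, b in zip(pts, pts[1:]))

def exponential_qq(breakpoints):
    """Pairs of (expected, observed) gap quantiles under an exponential with the same mean."""
    gaps = gaps_between(breakpoints)
    n = len(gaps)
    mean_gap = sum(gaps) / n
    expected = [-mean_gap * math.log(1 - (i + 0.5) / n) for i in range(n)]
    return list(zip(expected, gaps))

def ks_distance(breakpoints):
    """Largest gap between the empirical CDF of the distances and the fitted exponential CDF."""
    gaps = gaps_between(breakpoints)
    n = len(gaps)
    mean_gap = sum(gaps) / n
    d = 0.0
    for i, g in enumerate(gaps):
        cdf = 1 - math.exp(-g / mean_gap)
        d = max(d, abs((i + 1) / n - cdf), abs(i / n - cdf))
    return d

# Toy comparison: scattered breakpoints versus a tight cluster plus a sparse region.
rng = random.Random(0)
scattered = [rng.uniform(0, 1_000_000) for _ in range(200)]
clustered = list(range(0, 1000, 10)) + list(range(100_000, 1_100_000, 10_000))
d_scattered = ks_distance(scattered)
d_clustered = ks_distance(clustered)
```

Points far from the diagonal of the qq-plot, or a large distance value, indicate that breakpoints are more clustered than random placement would predict.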
Captured telomeres of the 778 NC bear the classic signatures of chromothripsis
Figure panels: 778 Chr7 and 778 Chr22, plotted against position (Mb)
Other chromothripsis-like signatures on unenriched chromosomes
Figure panels: T1000 Chr4 and ST059 Chr3; axes: GC corrected read depth and copy number vs. position (Mb)
Can we test whether chromothripsis occurred prior to breakage fusion bridge?
Can the patterns be explained without resorting to chromothripsis?
Computational model of circular BFB
• Uses a uniform probability distribution to model breakpoints
• Run without selection, it erodes the neochromosome away
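A single cycle of circular breakage-fusion-bridge with uniformly distributed breakpoints might be sketched as below. The representation is an assumption made for illustration: the ring is a list of equal-sized bins, replication with one sister chromatid exchange is modelled as a double-size ring, and the two breaks at cell division fall at uniformly chosen bin boundaries. The talk's actual model may differ in all of these details.

```python
import random

def circular_bfb_cycle(ring, rng):
    """Return the two daughter rings produced by one simplified circular BFB cycle.

    The doubled ring is cut at two uniformly chosen positions; each of the two
    resulting arcs re-circularises and is inherited by one daughter cell.
    """
    doubled = ring + ring
    i, j = sorted(rng.sample(range(len(doubled)), 2))
    daughter_a = doubled[i:j]
    daughter_b = doubled[j:] + doubled[:i]
    return daughter_a, daughter_b

def random_lineage(ring, cycles, seed=0):
    """Follow one lineage with no selection: a daughter is picked at random each cycle."""
    rng = random.Random(seed)
    for _ in range(cycles):
        ring = rng.choice(circular_bfb_cycle(ring, rng))
    return ring

# Toy ring of eight labelled bins.
start = list("ABCDEFGH")
a, b = circular_bfb_cycle(start, random.Random(1))
final = random_lineage(start, cycles=50)
```

Between them the two daughters always carry exactly two copies of every bin, so whatever one daughter gains the other loses. In this toy version a ring that has shrunk to a single bin can never grow again, which is one way to see why a lineage followed without selection tends to erode.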
Simple model of fitness and selection
• Assume fitness is proportional to number of copies of key genes
• Require at least one copy of each selected gene to be preserved
• Scale the fitness so that it saturates at some CN
• Choose daughter cell randomly, weighted according to fitness
Figure labels: fitness=3, fitness=5, fitness=1
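The four selection rules above might be sketched as follows. The details are assumptions made for illustration: the ring is a list of labels, fitness is the summed copy count of the key genes, saturation is applied per gene, and the default `saturation_cn` is an arbitrary placeholder. MDM2 and CDK4 are used as example key genes because they are named earlier in the talk.

```python
import random

def fitness(ring, key_genes, saturation_cn=50):
    """Fitness grows with copies of each key gene, capped per gene at saturation_cn.

    A cell that has lost every copy of any key gene gets fitness 0.
    """
    total = 0
    for gene in key_genes:
        copies = ring.count(gene)
        if copies == 0:
            return 0.0
        total += min(copies, saturation_cn)
    return float(total)

def choose_cell(candidates, key_genes, rng, saturation_cn=50):
    """Pick one candidate at random, weighted by fitness; None if no candidate is viable."""
    weights = [fitness(c, key_genes, saturation_cn) for c in candidates]
    if sum(weights) == 0:
        return None
    return rng.choices(candidates, weights=weights, k=1)[0]

key_genes = ["MDM2", "CDK4"]
viable = ["MDM2", "MDM2", "CDK4", "other"]
lost_cdk4 = ["MDM2", "MDM2", "MDM2"]
picked = choose_cell([lost_cdk4, viable], key_genes, random.Random(0))
```

Because a zero-fitness daughter has zero weight, it is never selected, which enforces the requirement that at least one copy of each selected gene is preserved.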
Combining multiple model runs
• At each cycle, we select 1 daughter cell (or the parent) and discard others
• Each simulation traces a single lineage in the tree of possible cell fates
• Run this 500-1000 times and look at the distributions of summary statistics, e.g. number of genomic intervals, interval lengths, …
• We explored different models of fitness/selection & different spatial distributions for the breakpoint
• Run model with and without chromothripsis
• Used uniform, biased and empirical distributions of chromothripsis breakage
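To make the ensemble idea concrete, the sketch below traces many single lineages, with or without an initial chromothripsis step, and collects one summary statistic (number of genomic intervals) per run. Everything here is a toy stand-in under stated assumptions: the chromosome is a list of reference bin indices, chromothripsis is a uniform shatter followed by random rejoining, and all function names and parameter values (`n_breaks=20`, `saturation_cn=20`) are hypothetical. It is not the model from the talk and makes no claim about its results.

```python
import random

def count_intervals(ring):
    """Number of maximal runs of adjacent reference bins still present in the ring."""
    bins = sorted(set(ring))
    return sum(1 for k, b in enumerate(bins) if k == 0 or b != bins[k - 1] + 1)

def shatter(ring, n_breaks, rng):
    """Toy chromothripsis: cut at uniform breakpoints and rejoin the pieces in random order."""
    cuts = sorted(rng.sample(range(1, len(ring)), n_breaks))
    pieces = [ring[a:b] for a, b in zip([0] + cuts, cuts + [len(ring)])]
    rng.shuffle(pieces)
    return [x for piece in pieces for x in piece]

def bfb_cycle(ring, rng):
    """Simplified circular BFB: double the ring, cut twice, return both daughter rings."""
    doubled = ring + ring
    i, j = sorted(rng.sample(range(len(doubled)), 2))
    return doubled[i:j], doubled[j:] + doubled[:i]

def fitness(ring, key_bins, saturation_cn):
    counts = [ring.count(b) for b in key_bins]
    if min(counts) == 0:
        return 0.0  # a required bin was lost
    return float(sum(min(c, saturation_cn) for c in counts))

def run_lineage(n_bins, key_bins, cycles, chromothripsis, rng, saturation_cn=20):
    """Trace one lineage and return its summary statistic."""
    ring = list(range(n_bins))
    if chromothripsis:
        ring = shatter(ring, n_breaks=20, rng=rng)
    for _ in range(cycles):
        # Candidates are the parent and both daughters; one survives per cycle.
        candidates = [ring, *bfb_cycle(ring, rng)]
        weights = [fitness(c, key_bins, saturation_cn) for c in candidates]
        ring = rng.choices(candidates, weights=weights, k=1)[0]
    return count_intervals(ring)

def ensemble(n_runs, chromothripsis, seed=0, **kwargs):
    """Repeat the lineage simulation and return the list of summary statistics."""
    rng = random.Random(seed)
    return [run_lineage(chromothripsis=chromothripsis, rng=rng, **kwargs)
            for _ in range(n_runs)]

settings = dict(n_bins=100, key_bins=[10, 50], cycles=10)
bfb_only = ensemble(n_runs=5, chromothripsis=False, **settings)
bfb_with_ct = ensemble(n_runs=5, chromothripsis=True, **settings)
```

The two lists can then be compared as distributions, for example with histograms or quantiles, and set against the interval counts observed in the sequenced neochromosomes.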
Breakage-fusion-bridge only model
Figure: number of genomic intervals
Breakage-fusion-bridge with chromothripsis model
Figure: number of genomic intervals
The BFB-only model does not generate edge to edge walks; the BFB+chromothripsis model does
Model
Model of the life history of well-differentiated liposarcoma neochromosomes
Conclusions
• Sequencing flow isolated neochromosomes revealed the dynamic processes involved in their formation
• The approaches used likely have broader relevance
• The combination of mechanisms is also likely to be used in other scenarios
• For example, the emergence of resistance to targeted therapies is sometimes driven by extreme amplification of the target
Acknowledgements
Arthur Hsu, Vincent Corbin
Papenfuss lab, WEHI
David Thomas
Peter MacCallum Cancer Centre
Owen Marshall, MCRI
Andrew Holloway, Zhi-Ping Feng, Mark Lansdell, Ben Lansdell
Dale Garsed, Jan Schroeder, Leon di Stefano
The Lorenzo & Pamela Galli Melanoma Research Fellowship