Exome sequencing analysis of the mutational spectrum in carcinogen and genetic models of Kras-driven lung cancer
Peter Westcott, Kyle Halliwill, Minh To, David Quigley, Reyno Delrosario, Erik Fredlund, David Adams¹, and Allan Balmain
UCSF Helen Diller Family Comprehensive Cancer Center, 1450 3rd Street, San Francisco.
¹ Wellcome Trust Sanger Centre, Cambridge, England.
Why sequence tumors from mice?
Timing of initiation and collection
Initiating gene(s), carcinogen(s)
Can distinguish mutations involved in initiation from progression
Control!
Specific goals of this study
Part of the MMHCC TCGA Pilot Project
What is the effect of the causative carcinogen on mutation spectrum?
Characterize the utility of sequencing mouse tumors:
Clean genetic induction (GEM) vs. carcinogen induction?
What mutations arise after Kras initiation?
Exome sequencing
Urethane: 44 lung tumors from 17 mice
MNU: 26 lung tumors from 7 mice
KrasLA2 (GEM): 13 lung tumors from 4 mice
Kras+/- (FVB/Ola)
Kras+/- Kras+/+
KrasLA2 (FVB/Ola)
Control tail DNA: 2 Kras+/+ tails
Spontaneous lung tumors
Exome sequencing
Have a confident list of somatic variants
Reads aligned to the mouse genome, variants called against multiple controls, and extensive QC performed (Kyle Halliwill)
Illumina paired-end sequencing (Wellcome Trust Sanger Centre)
Carcinogen models of Kras-driven lung cancer
~90% of lung tumors harbor Kras mutations.
Urethane (ethyl carbamate)
Adenosine and cytidine DNA adducts lead to mispairing:
Kras Q61L (CAA->CTA), Q61R (CAA->CGA).
[Diagram: A->T change arising from mispairing at replication]
Carcinogen models of Kras-driven lung cancer
MNU (N-methyl-N-nitrosourea)
~90% of lung tumors harbor Kras mutations
Guanosine DNA adducts lead to G->A transitions
Kras G12D (GGT->GAT)
Genome-wide spectrum of these carcinogen mutations not known
[Diagram: G->A change arising from mispairing at replication]
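The codon changes named on the two carcinogen slides (Kras codon 61 CAA to CTA or CGA, codon 12 GGT to GAT) can be checked with a small lookup. This is only an illustrative sketch: the function name is hypothetical, and the codon table holds just the five codons mentioned here rather than the full genetic code.

```python
# Minimal codon lookup covering only the codons named on the slides.
CODON_TO_AA = {
    "CAA": "Q", "CTA": "L", "CGA": "R",  # Kras codon 61 and its two reported changes
    "GGT": "G", "GAT": "D",              # Kras codon 12 and its reported change
}


def describe_codon_change(position, ref_codon, alt_codon):
    """Return (protein change, base change) for a single-base codon substitution."""
    if len(ref_codon) != 3 or len(alt_codon) != 3:
        raise ValueError("codons must be three bases long")
    diffs = [(r, a) for r, a in zip(ref_codon, alt_codon) if r != a]
    if len(diffs) != 1:
        raise ValueError("expected exactly one changed base")
    ref_base, alt_base = diffs[0]
    protein = f"{CODON_TO_AA[ref_codon]}{position}{CODON_TO_AA[alt_codon]}"
    return protein, f"{ref_base}->{alt_base}"


for pos, ref, alt in [(61, "CAA", "CTA"), (61, "CAA", "CGA"), (12, "GGT", "GAT")]:
    print(describe_codon_change(pos, ref, alt))
```

The two codon 61 changes both hit the middle A of CAA, and the codon 12 change is a G->A transition, which matches the adduct chemistry described for each carcinogen.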
Mutation spectrum
Urethane
MNU
LA2
Light shade = Kras+/-
Mutation spectrum
Slight bias for mutations at G/C nucleotide
Strong bias for mutations at G nucleotide with flanking G or A
Strong bias for mutations at A/T nucleotide
Mutation spectrum
[Figure: average counts per tumor]
Purine bias at 5’ flanking base
5’ A 5’ G
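A spectrum with 5' flanking context, like the one summarized above, can be tallied along these lines. This is a sketch under assumptions, not the pipeline used in the study: it assumes each variant comes with a three-base context centered on the mutated base, and it collapses strands so the reference base is always a purine (inferred from class labels such as G->A and A->T used in this deck). The toy variants are made up.

```python
from collections import Counter

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Reverse-complement a DNA string."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def classify(context, alt):
    """Classify one substitution.

    context: three bases with the mutated base in the middle.
    alt: the new base at the middle position.
    Returns (5' flanking base, substitution class), expressed on the strand
    where the reference base is a purine (A or G).
    """
    ref = context[1]
    if ref in "CT":  # pyrimidine reference: switch to the opposite strand
        context = reverse_complement(context)
        alt = COMPLEMENT[alt]
        ref = context[1]
    return context[0], f"{ref}->{alt}"


def spectrum(variants):
    """Count (5' base, class) pairs over an iterable of (context, alt)."""
    return Counter(classify(context, alt) for context, alt in variants)


# Hypothetical variants, for illustration only.
toy = [("GGA", "A"), ("TCC", "T"), ("CAG", "T"), ("CTG", "A")]
for (flank, change), count in sorted(spectrum(toy).items()):
    print(f"5' {flank}  {change}  {count}")
```

Strand collapsing matters for the flanking base: a C->T call in context TCC is the same event as G->A in context GGA on the other strand, so its 5' neighbor is G, not T.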
Mutation spectrum
Are non-carcinogen mutations separable?
For the most part
[Figure: average counts per tumor by mutation class (NCG->T, Other G->A, A->T, A->G, A->C, G->C, G->T) for Urethane, MNU, and LA2]
ARE CARCINOGEN MUTATIONS RELEVANT?
Other driver mutations?
Analysis complicated:
High mutation rates: MNU 21.2/Mb, Urethane 6.4/Mb, LA2 1.9/Mb
Correlation between gene length and mutations
Start with variants within Vogelstein’s 2013 list of drivers:
Selected only consequential mutations at highly conserved sites in expressed genes
Other driver mutations?
GENE     EXON_LENGTH  NONSYN_MUT
Mll2     19827        16
Sf3b1    6191         5
Crebbp   7507         4
Asxl1    6674         3
Pdgfra   6553         3
Met      6652         3
Cic      6099         3
Atm      11964        3
Arid1b   11325        3
Alk      5918         3
Gnas     3717         2
Notch2   10506        2
Arid1a   8175         2
Fgfr3    4222         2
Hnf1a    3186         2
Flt3     3656         2
Brca2    10540        2
Akt1     2640         2
Rb1      4625         2
None of these mutations occur in LA2 tumors
Slight enrichment for longer genes
Modest increase in NS mutation ratio
One S367-to-F substitution (residue required for autophosphorylation and activity)
Subclonal Myc T58P?
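One rough way to look at the gene-length effect is to normalize the counts in the gene table by exon length and compute a correlation. The sketch below uses only the 19 rows listed on this slide. Because that list already excludes genes with fewer than two nonsynonymous mutations, the result describes this subset only and is not a test of enrichment. The per-kb normalization and the choice of Pearson correlation are assumptions for illustration.

```python
from math import sqrt

# (gene, exon length in bp, nonsynonymous mutations), copied from the slide table.
GENES = [
    ("Mll2", 19827, 16), ("Sf3b1", 6191, 5), ("Crebbp", 7507, 4),
    ("Asxl1", 6674, 3), ("Pdgfra", 6553, 3), ("Met", 6652, 3),
    ("Cic", 6099, 3), ("Atm", 11964, 3), ("Arid1b", 11325, 3),
    ("Alk", 5918, 3), ("Gnas", 3717, 2), ("Notch2", 10506, 2),
    ("Arid1a", 8175, 2), ("Fgfr3", 4222, 2), ("Hnf1a", 3186, 2),
    ("Flt3", 3656, 2), ("Brca2", 10540, 2), ("Akt1", 2640, 2),
    ("Rb1", 4625, 2),
]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


lengths = [length for _, length, _ in GENES]
counts = [muts for _, _, muts in GENES]
r = pearson(lengths, counts)

# Length-normalized rate: nonsynonymous mutations per kb of exon.
per_kb = {name: muts / (length / 1000) for name, length, muts in GENES}

print(f"Pearson r (exon length vs. mutation count): {r:.2f}")
for name, rate in sorted(per_kb.items(), key=lambda kv: kv[1], reverse=True)[:5]:
    print(f"{name}\t{rate:.2f} nonsynonymous mutations per kb")
```

Ranking by the per-kb rate instead of the raw count is one simple way to keep long genes such as Mll2 from dominating the list purely because of their size.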
Conclusions
Clear recapitulation of expected carcinogen mutations
Mutation Spectrum
GEM shows few mutations
Mutations highly specific and distinguishable
Driver Mutations
Kras
Interesting candidates in carcinogen-induced tumors
Future work
InDel analysis.
Optimize list of potential driver mutations (relevant sites?).
Validate top 1000 interesting variants by Sequenom (Wellcome Trust Sanger Centre).
Array CGH (copy number analysis). Inverse correlation of point mutational burden and copy number changes?
Acknowledgments
$: NSF
Kyle Halliwill, Minh To, David Quigley, Reyno Del Rosario, Erik Fredlund
ALLAN BALMAIN
DAVID ADAMS (WELLCOME TRUST SANGER CENTRE)
$: NIH Training Grant T32 GM007175
$: MMHCC
Supplemental (Kyle’s Pipeline)
• Capture using Agilent mouse whole exome kit
• Sequenced on Illumina HiSeq
– Paired end, 75 bp each, average read span of 180 bp
• Converted back to FASTQ, then followed QC pipeline (next slide)
Supplemental (Kyle’s Pipeline)
Align to mm10 with BWA
Mark duplicates and fix mate information with Picard
Base recalibration and realignment with GATK
Alignment and coverage information with Picard
Variant calling with MuTect
Filter for depth and previously observed variants with vcftools
QC and Variant Calling Strategy
Supplemental (Kyle’s Pipeline)
Sample .bam + Control1 .bam -> Variant Calling via MuTect -> Variant List1 .vcf
Sample .bam + Control2 .bam -> Variant Calling via MuTect -> Variant List2 .vcf
Intersect -> Candidate Variant List .vcf
Filter, Annotate -> Candidate Variants
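The intersect step in this strategy, keeping only variants called against both controls, can be sketched as a set intersection over the two variant lists. This is a hypothetical illustration rather than the pipeline's actual code: it reads only the first five VCF columns, keys each variant on chromosome, position, reference, and alternate allele, and uses made-up coordinates.

```python
def read_variant_keys(vcf_lines):
    """Collect (chrom, pos, ref, alt) keys from VCF-formatted text lines."""
    keys = set()
    for line in vcf_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines, meta lines, and the column header
        chrom, pos, _id, ref, alt = line.rstrip("\n").split("\t")[:5]
        keys.add((chrom, int(pos), ref, alt))
    return keys


def intersect_calls(vcf_a, vcf_b):
    """Return the variants present in both call sets, sorted."""
    return sorted(read_variant_keys(vcf_a) & read_variant_keys(vcf_b))


# Hypothetical call sets: the same tumor called against two different controls.
header = ["##fileformat=VCFv4.1", "#CHROM\tPOS\tID\tREF\tALT"]
list1 = header + ["6\t1000\t.\tA\tT", "6\t2000\t.\tG\tA", "11\t500\t.\tG\tA"]
list2 = header + ["6\t1000\t.\tA\tT", "11\t500\t.\tG\tA", "X\t42\t.\tC\tT"]

for variant in intersect_calls(list1, list2):
    print(variant)
```

With real files, the same functions could be fed open file handles, since iterating a text file yields lines in the same form. Requiring agreement across both controls removes calls that appear only because one control lacked coverage at that site, at the cost of dropping true variants missed in either run.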
Variant Calling Details