Mutation detection by massively parallel sequencing of solution captured human genomic loci
Frances Smith
DNA Laboratory
Guy’s Hospital
CMGS 12th April 2010
Aims of the project
• Comprehensive diagnostic service to sequence all genes involved in Glycogen Storage Disease (GSD)
• New technologies
– Agilent SureSelect
– Illumina sequencing
Clinical Need
• GSD
– Defects in glycogen synthesis or breakdown in liver or muscle
– Broad overlapping clinical phenotype – 18 genes
• No comprehensive test
• Reduce cost
• Speed up diagnosis
• Reduce invasive tests
Solution capture
• Submit genomic intervals to eArray
– 18 GSD genes
– 29 NMD genes
– Total of 4 Mbp and 1200 exons
• 120 bp RNA probes
• 55 thousand probes per library
• 5x probe tiling (85 bp overlap)
• Repeat masking
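The tiling parameters above can be illustrated with a short Python sketch. It is not the eArray design algorithm: the function name `tile_probes` and the choice to express tiling as a fixed step between probe starts are assumptions made for illustration. With 120 bp probes, a 24 bp step (120 / 5) puts five probes over each interior base, while an 85 bp overlap between neighbouring probes corresponds to a 35 bp step, so the step is left as a parameter.

```python
def tile_probes(start, end, probe_len=120, step=24):
    """Tile fixed-length probes across the half-open interval [start, end).

    Hypothetical helper: probes begin at `start`, advance by `step`, and
    stop once a probe reaches or passes `end`. Returns (start, end) tuples.
    """
    if probe_len <= 0 or step <= 0:
        raise ValueError("probe_len and step must be positive")
    probes = []
    pos = start
    while True:
        probes.append((pos, pos + probe_len))
        if pos + probe_len >= end:
            break  # the last probe already covers the end of the target
        pos += step
    return probes


def depth_at(probes, base):
    """Count how many probes cover a single base position."""
    return sum(1 for p_start, p_end in probes if p_start <= base < p_end)


if __name__ == "__main__":
    exon = (0, 240)  # toy exon coordinates, not taken from the GSD library
    probes = tile_probes(*exon)
    print(len(probes), "probes; depth at base 100 =", depth_at(probes, 100))
```

Bases near the ends of a target are covered by fewer probes than interior bases in this simple scheme; padding the interval on each side is one way to even that out.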
Probe Design Parameters
[Figure: probes tiled across an exon]
Solution capture
[Figure: solution capture workflow]
• Pool prepped library with biotinylated RNA ‘probes’
• Hybridise 24h at 65°C to form DNA:RNA hybrids
• Select hybrids using streptavidin
• Wash
• PCR
• Target enriched sequencing library
Results
• 8 lanes of sequencing
– 17 Gbp
• 80% (13 Gbp) maps to human genome
– 1.6 Gbp per lane
– Equivalent to 66 whole DMD genes
• Sensitivity (% target bases giving reads) = 99.5% @ >30x coverage
• Specificity (% reads mapping to targets) = 63%
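The two capture metrics defined above can be sketched in Python. This is a minimal illustration, assuming per-base read depths across the target and read counts are already available; the function names, the toy inputs and the use of mapped reads as the specificity denominator are assumptions, not a description of the analysis actually run.

```python
def sensitivity(target_depths, min_depth=30):
    """Percentage of target bases with read depth strictly above min_depth."""
    if not target_depths:
        raise ValueError("no target bases supplied")
    covered = sum(1 for d in target_depths if d > min_depth)
    return 100.0 * covered / len(target_depths)


def specificity(on_target_reads, mapped_reads):
    """Percentage of mapped reads that fall on the targeted regions.

    Assumption: the denominator is all reads mapped to the genome.
    """
    if mapped_reads <= 0:
        raise ValueError("mapped_reads must be positive")
    return 100.0 * on_target_reads / mapped_reads


if __name__ == "__main__":
    toy_depths = [0, 10, 31, 45, 100]  # invented depths for five target bases
    print("sensitivity:", sensitivity(toy_depths))
    print("specificity:", specificity(63, 100))
    # Per-lane yield quoted on the slide: 13 Gbp mapped over 8 lanes
    print("Gbp per lane:", round(13 / 8, 1))
```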
Why do some probes not capture well?
• GC content
– Extremes of GC% not captured well
• Secondary structure
– Self-complementarity
• Sequence context
– Close to repeats
[Figure: good probe vs. poor probe]
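Two of the factors listed above, GC content and self-complementarity, can be screened for with a few lines of Python. This is only a rough sketch: the function names, the k-mer length of 8 and the reverse-complement k-mer count used as a proxy for self-complementarity are illustrative assumptions, and the sequences are written in the DNA alphabet of the target rather than as RNA.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def gc_percent(seq):
    """Percentage of G and C bases in a sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return 100.0 * sum(seq.count(b) for b in "GC") / len(seq)


def reverse_complement(seq):
    """Reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def self_complementary_kmers(seq, k=8):
    """Count distinct k-mers whose reverse complement also occurs in seq.

    A crude proxy for hairpin or self-dimer potential; it is not a
    thermodynamic folding prediction.
    """
    seq = seq.upper()
    kmers = {seq[i:i + k] for i in range(len(seq) - k + 1)}
    return sum(1 for kmer in kmers if reverse_complement(kmer) in kmers)


if __name__ == "__main__":
    probe = "ATGCGCGCATTTAAATGCGCGCAT"  # invented sequence, not a real probe
    print(gc_percent(probe), self_complementary_kmers(probe, k=6))
```

Probes flagged by a screen like this would still need to be checked against observed coverage before deciding whether to redesign them.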
What do we do about it?
• Re-design the library
• Increase sequencing output
• Sanger sequence persistent gaps
Validation
• Known GSD mutations captured and sequenced blind
– 2 compound heterozygote substitutions
– Homozygous frameshift
– Compound heterozygote substitution and nonsense mutation
• Deletions captured and sequenced
– 5bp
– 7bp
– 13bp
– 38bp ….. Testing more
Problems and Challenges
• Bioinformatics
– Huge amounts of data
– Storage and analysis issues
• Cost
– Set up and run costs high
• Time
• Technically challenging
• Variation
– Large number of genes, therefore large number of unclassified variants (UVs)
– How do we investigate/report these?
Summary
• GSDv1 probe library designed and validated
• Solution capture and Illumina sequencing carried out for point mutations and deletions up to 38bp
• Alignment software
• Ongoing
– New versions of GSD library designed
– Multiplexing
– Other heterogeneous disorders
Acknowledgments
Guy’s DNA Lab
– Steve Abbs
– Michael Yau
– Tom Cullup
Sanger Institute
– Dan Turner
– Alison Coffey
– Eleanor Howard
Biomedical Research Centre, Guy’s & St Thomas’ NHS Foundation Trust and KCL
– Pete Green
– Effie Papouli
– Muddassar Mirza
Clinical colleagues
– Mike Champion
– Charu Deshpande