Solanum lycopersicum Chromosome 4
Sequencing Update
SOL Germany – October 2008
Summary of Project Status at WTSI
WTSI will be leaving the project on 31st Oct 2008
BAC selection, contig building and QC checking are being transferred to Imperial College London
Introduction
WTSI Tomato Clone Pipeline

Pipeline Stage        Number of BACs 2007   Number of BACs 2008
Subcloning            34                    0
Shotgun               21                    1
Assembly Start        7                     0
Auto-prefinishing     3                     0
Finishing             11                    33
QC Checking           4                     2
Finished              63                    147
Total                 143                   183

HTGS phases marked on the table: Phase 3 and Phase 1+2
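As a quick cross-check of the pipeline table, the sketch below re-adds the stage counts and works out what share of BACs had reached the Finished stage in each year. The dictionary layout and function names are illustrative only; the numbers are copied from the table.

```python
# BAC counts per pipeline stage as (2007, 2008), copied from the table.
PIPELINE = {
    "Subcloning": (34, 0),
    "Shotgun": (21, 1),
    "Assembly Start": (7, 0),
    "Auto-prefinishing": (3, 0),
    "Finishing": (11, 33),
    "QC Checking": (4, 2),
    "Finished": (63, 147),
}


def column_totals(pipeline):
    """Sum the 2007 and 2008 columns over all stages."""
    total_2007 = sum(counts[0] for counts in pipeline.values())
    total_2008 = sum(counts[1] for counts in pipeline.values())
    return total_2007, total_2008


def finished_fraction(pipeline, year_index):
    """Fraction of BACs in the Finished stage (0 = 2007 column, 1 = 2008 column)."""
    total = sum(counts[year_index] for counts in pipeline.values())
    return pipeline["Finished"][year_index] / total


print("Totals (2007, 2008):", column_totals(PIPELINE))
print("Finished fraction 2007: %.2f" % finished_fraction(PIPELINE, 0))
print("Finished fraction 2008: %.2f" % finished_fraction(PIPELINE, 1))
```

The recomputed totals agree with the Total row of the table (143 and 183).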
Chromosome 4 Sequence Generated
Total Sequence Available 19,815,026 bp
Total Unique Sequence 19,527,597 bp
Total amount of Finished Sequence = 15,244,914 bp
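The three totals above can be related with simple arithmetic. The sketch below derives the redundant (overlapping) sequence and the finished share; the variable names are illustrative, and treating "total minus unique" as overlap between clones is an assumption rather than something the slide states.

```python
# Figures copied from the slide (base pairs).
TOTAL_SEQUENCE_BP = 19_815_026
UNIQUE_SEQUENCE_BP = 19_527_597
FINISHED_SEQUENCE_BP = 15_244_914

# Assumption: the difference between total and unique sequence is overlap between clones.
redundant_bp = TOTAL_SEQUENCE_BP - UNIQUE_SEQUENCE_BP

# Finished sequence as a share of the unique and of the total sequence.
finished_vs_unique = FINISHED_SEQUENCE_BP / UNIQUE_SEQUENCE_BP
finished_vs_total = FINISHED_SEQUENCE_BP / TOTAL_SEQUENCE_BP

print("Redundant sequence (bp):", redundant_bp)
print("Finished / unique: %.1f%%" % (100 * finished_vs_unique))
print("Finished / total:  %.1f%%" % (100 * finished_vs_total))
```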
Some facts and figures
We have 81 contigs on chr4 (80 contigs with sequence available).
Average contig length is just under 250 kb.
The average number of BACs per contig is 2.3.
The largest sequence contigs are in the range of 450kb-500kb with 5 or 6 BACs.
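The averages quoted above can be reproduced from figures given elsewhere in this update. The sketch below assumes the averages are taken over the 80 contigs with sequence and that the BAC count is the 2008 pipeline total of 183; the slide does not say exactly how they were derived.

```python
# Figures copied from the slides.
TOTAL_SEQUENCE_BP = 19_815_026   # total sequence available
CONTIGS_WITH_SEQUENCE = 80       # contigs with sequence available
TOTAL_BACS_2008 = 183            # 2008 total from the pipeline table (assumed denominator input)

# Mean sequence per contig, in kilobases.
mean_contig_kb = TOTAL_SEQUENCE_BP / CONTIGS_WITH_SEQUENCE / 1000

# Mean number of BACs per contig.
mean_bacs_per_contig = TOTAL_BACS_2008 / CONTIGS_WITH_SEQUENCE

print("Mean contig length: %.1f kb" % mean_contig_kb)
print("Mean BACs per contig: %.1f" % mean_bacs_per_contig)
```

Under these assumptions the results are consistent with "just under 250 kb" and 2.3 BACs per contig.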
Distribution of tomato Chromosome BACs and sequence content
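[Figure: chromosome 4 diagram with the centromere marked and a key distinguishing euchromatin from heterochromatin. Counts shown for the three regions of the diagram, in the order drawn:]

Region   Markers   Contigs   BACs
1        62        30        56
2        41        11        59
3        124       36        63

SSR markers: 4 contigs, 5 BACs
Unordered: 27 markers, 16 markers, 61 markers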
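The per-region counts in the figure can be checked against the chromosome-wide totals reported elsewhere in this update (81 contigs, 183 BACs). The sketch below assumes the three columns of numbers are three regions of the chromosome and that the SSR-marker group is counted separately; the region labels are placeholders.

```python
# (markers, contigs, BACs) for the three regions in the figure, in the order shown.
REGIONS = {
    "region_1": (62, 30, 56),
    "region_2": (41, 11, 59),
    "region_3": (124, 36, 63),
}
# Group identified via SSR markers (contigs, BACs).
SSR_GROUP = (4, 5)

contig_sum = sum(r[1] for r in REGIONS.values()) + SSR_GROUP[0]
bac_sum = sum(r[2] for r in REGIONS.values()) + SSR_GROUP[1]

print("Contigs across regions plus SSR group:", contig_sum)
print("BACs across regions plus SSR group:", bac_sum)
```

Both sums match the totals quoted in the rest of the update, which supports reading the SSR-marker group as additional to the three regions.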
Summary of Progress on Chromosome 4
81 map contigs have been built on chromosome 4 (AGP files)
119 BACs in 44 contigs are definitely on chr4 (FISH or IL mapped).
58 BACs are awaiting confirmation but contain chr4 marker sequence.
~60 markers for which BACs have not been identified.
~13 BACs have been sequenced to HTGS phase 3 and placed on chr0; they are definitely not on chr4. (Others that were initiated, e.g. in the same contig, were stopped in the pipeline.)
22 missing markers (missing sequence?)
Summary of what we will do next
1) Confirm, by IL mapping, the chr4 location of BACs that lack chr4 marker sequence and/or have a conflicting map location.
2) Use missing marker sequences to identify further BACs (3D pools) and confirm their chr4 location using IL mapping.
3) Use 3D BAC pools to identify BACs that extend current contigs.
4) Analyse output from 2x GS-FLX and 2x Illumina sequencing runs on cDNA from chr4 IL and parental lines to identify SNPs and further chr4 markers.
5) Use any markers from (4) to isolate further BACs for sequencing.
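Steps 2 and 3 rely on screening 3D BAC pools. As a rough illustration of how such a screen resolves to a clone address, the sketch below assumes the common plate/row/column pooling layout, where a BAC carrying the marker gives a positive signal in one pool per dimension. The pooling design actually used in the project is not described here, and the function names and example addresses are hypothetical.

```python
from itertools import product


def candidate_addresses(plate_hits, row_hits, column_hits):
    """Return every (plate, row, column) address consistent with the positive pools."""
    return list(product(sorted(plate_hits), sorted(row_hits), sorted(column_hits)))


def resolve(plate_hits, row_hits, column_hits):
    """Return the unique clone address, or None if the screen is ambiguous or negative."""
    candidates = candidate_addresses(plate_hits, row_hits, column_hits)
    return candidates[0] if len(candidates) == 1 else None


# One positive pool per dimension gives a single clone address.
print(resolve({12}, {"C"}, {7}))

# Two positive plates and two positive rows leave four candidates,
# which would need individual re-testing (for example by PCR on each clone).
print(candidate_addresses({12, 30}, {"C", "F"}, {7}))
```

Any BAC picked this way would still need its chr4 location confirmed by IL mapping, as the steps above state.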
Acknowledgements
Wellcome Trust Sanger Institute: Carol Churcher, Jane Rogers, Sean Humphray, Clare Riddle and Mapping Core Group, Karen McLaren and Finishing Team 46, Stuart McLaren and Pre-finishing Team 58, Christine Lloyd and QC Team 57, Karen Oliver, Matt Jones, Carol Scott
Imperial College London: Gerard Bishop, Daniel Buchan, James Abbott, Sarah Butcher, Rosa Lopez-Cobollo
University of Nottingham: Graham Seymour
Scottish Crop Research Institute: Glenn Bryan
Cornell University: Lukas Mueller, Jim Giovannoni
MIPS/IBI Institute for Bioinformatics: Klaus Mayer, Remy Bruggmann
FISH Resources: Stephen Stack Group (Colorado), Hans de Jong (Wageningen), Dora Szinay (Wageningen)
FUNDING