Tomato Genome Build: SL2.5 to SL3.0
Surya Saha, Jeremy Edwards, Prashant Hosmani, Mirella Flores and Lukas Mueller
Sol Genomics Network (SGN)
Boyce Thompson Institute, Ithaca, NY
@SahaSurya
Slides: http://bit.ly/SOL15bld3
Tomato Genome
SGN Workshop, SOL 2015
Whole-genome shotgun and full-length BAC sequences
• 454 reads (filtered, 22X)
• BAC end Sanger reads (0.18X)
• Fosmid end Sanger reads (0.087X)
• Selected BACs
• Sanger reads (5.2X).
Tomato Build SL2.40 to SL2.50
Lindsay Shearer
Stephen Stack
SL2.50 Availability
• JBrowse
• Locus pages
• Gene pages
• FTP site
Also at NCBI: http://www.ncbi.nlm.nih.gov/assembly/GCF_000188115.3/
Gene to Genome – The BIG picture
[Figure: chromosomes, scaffolds, contigs and genes, with contig gaps and scaffold gaps marked]
Example genes:
• MAP (chr 1)
• Ovate (chr 1)
• TM2 (chr 9)
• L2 (chr 10)
State of the SL2.50 Build
[Figure: for each of chromosomes 0-12, percentage (0-100%) of sequence, scaffold gap length and component gap length]
Length 823Mb
Sequence 737Mb
Contig gaps 43Mb (5.30%)
Scaffold gaps 42Mb (5.17%)
Total gaps 86Mb (10.47%)
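The totals above can be reproduced in spirit from an AGP file, which lists every sequence component and gap of a build. The following is a minimal sketch, assuming standard tab-separated AGP 2.0 input; the function name `summarize_agp` and the toy records are illustrative, and the tally simply groups gaps by whatever `gap_type` the file records, which may not match the slide's "contig gaps" and "scaffold gaps" labels one-to-one.

```python
from collections import defaultdict

def summarize_agp(agp_text):
    """Tally sequence bases and gap bases (per gap_type) from AGP text.

    Hypothetical helper for illustration; assumes tab-separated AGP 2.0.
    """
    seq_bases = 0
    gap_bases = defaultdict(int)
    for line in agp_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment lines
        cols = line.split("\t")
        beg, end, comp_type = int(cols[1]), int(cols[2]), cols[4]
        span = end - beg + 1  # AGP coordinates are 1-based, inclusive
        if comp_type in ("N", "U"):
            gap_bases[cols[6]] += span  # column 7 holds the gap_type
        else:
            seq_bases += span
    total = seq_bases + sum(gap_bases.values())
    return total, seq_bases, dict(gap_bases)

# Toy AGP records (not real tomato data).
toy_agp = "\n".join([
    "chr1\t1\t1000\t1\tW\tctg1\t1\t1000\t+",
    "chr1\t1001\t1100\t2\tN\t100\tscaffold\tyes\tpaired-ends",
    "chr1\t1101\t2000\t3\tW\tctg2\t1\t900\t+",
    "chr1\t2001\t2500\t4\tN\t500\tcontig\tno\tna",
    "chr1\t2501\t3000\t5\tW\tctg3\t1\t500\t+",
])

total, seq, gaps = summarize_agp(toy_agp)
for gap_type, bases in gaps.items():
    print(f"{gap_type} gaps: {bases} bp ({100 * bases / total:.2f}%)")
```

Dividing each tally by the total object length gives percentages of the kind shown on the slide.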
Tomato BAC Resources
Bruce Roe
HTGS Phase 1: 332
HTGS Phase 2: 520
HTGS Phase 3: 2764
http://www.ncbi.nlm.nih.gov/genbank/htgs/faq
ftp://ftp.solgenomics.net/tomato_genome/bacs/
BAC libraries
• HindIII (129,024 clones)
• EcoRI (72,264 clones)
• MboI (52,992 clones)
Tomato BAC Resources…
2764 Phase 3 BACs
340,000 high-quality reads
20X of entire genome
Mueller et al. 2009
[Figure: ITAG 2.4 genes and Phase 3 BACs (20kb or more) shown against the SL2.50 genome (whole-genome shotgun assembly); callout: "No BACs mapped here!!"]
Workflow
• Automatic integration of BACs
• Manual validation
• NCBI validation
https://github.com/solgenomics/Bio-GenomeUpdate
• BAC assemblies
• Align to SL2.50 (500bp BAC ends, 100% identity)
• Place BACs
Jeremy Edwards
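The placement idea above (500bp BAC ends aligned at 100% identity) can be illustrated with plain exact string matching. This is a minimal sketch under stated assumptions, not the Bio-GenomeUpdate implementation: the names `unique_exact_hit` and `place_bac` are hypothetical, only the forward strand is searched, and a real pipeline would use an aligner rather than `str.find`.

```python
def unique_exact_hit(ref_seq, query):
    """Return the 0-based position of query in ref_seq if it occurs exactly once."""
    first = ref_seq.find(query)
    if first == -1:
        return None  # no 100% identity match
    if ref_seq.find(query, first + 1) != -1:
        return None  # ambiguous: more than one exact match
    return first

def place_bac(bac_seq, ref_seq, end_len=500):
    """Return (start, end) 0-based half-open reference coordinates spanned
    by a BAC whose two ends both match the reference exactly, else None."""
    if len(bac_seq) < 2 * end_len:
        return None
    left_pos = unique_exact_hit(ref_seq, bac_seq[:end_len])
    right_pos = unique_exact_hit(ref_seq, bac_seq[-end_len:])
    if left_pos is None or right_pos is None:
        return None  # single-end or unmapped BAC
    if right_pos < left_pos + end_len:
        return None  # ends overlap or are in the wrong order
    return left_pos, right_pos + end_len

# Toy example with 5bp ends: the BAC carries sequence that spans a run of Ns.
ref = "TTGACCTAGG" + "N" * 10 + "CATGGTCAAT"
bac = "ACCTAGG" + "ACGTACGT" + "CATGGTC"
placement = place_bac(bac, ref, end_len=5)
print(placement)
```

A BAC placed this way brackets a reference interval whose gap can then be replaced by the BAC's own sequence.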
Phrap Assembly (HTGS Phase 3 BACs)
[Figure: percentage (0-100%) of assembled BACs versus singleton BACs for chromosomes 1-12]
Chr10 Contig68 10 BACs (242Kb!!)
Chr2 Contig185 7 BACs (566Kb!!)
Workflow
• Automatic integration of BACs
• Manual validation
• NCBI validation
Prashant Hosmani
Mirella Flores
1041/2764 BACs Integrated
[Figure: per-chromosome values for chromosomes 1-12, y-axis 0 to 450]
• No contained BACs
• No single-end mapping BACs
Improved Contiguity
[Figure: SL2.50 components versus SL3.0 components for chromosomes 1-12, y-axis 0 to 3000]
Gap Reduction
[Figure: SL2.50 Ns versus SL3.0 Ns for chromosomes 1-12, y-axis 0 to 4,500,000]
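A comparison of N counts between two builds can be sketched as follows. This is an illustration only: the function names `count_ns` and `n_reduction` are hypothetical, the FASTA records are toy data, and the sketch assumes the two builds use matching sequence identifiers, which real SL2.50 and SL3.0 files may not.

```python
def count_ns(fasta_text):
    """Return {sequence_id: number of N/n bases} for FASTA-formatted text."""
    counts = {}
    seq_id = None
    for line in fasta_text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            seq_id = line[1:].split()[0]  # identifier is the first word of the header
            counts[seq_id] = 0
        elif seq_id is not None:
            counts[seq_id] += line.upper().count("N")
    return counts

def n_reduction(old_fasta, new_fasta):
    """Return {sequence_id: Ns removed} for identifiers present in both builds."""
    old, new = count_ns(old_fasta), count_ns(new_fasta)
    return {sid: old[sid] - new[sid] for sid in old if sid in new}

# Toy builds (not real tomato data).
old_build = ">ch01\nACGTNNNNNN\nNNACGT\n>ch02\nACNNGT\n"
new_build = ">ch01\nACGTACGTNN\nNNACGT\n>ch02\nACNNGT\n"
reduction = n_reduction(old_build, new_build)
print(reduction)
```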
Future
BioNano optical maps
• Integrate sequences from chr 0 into chrs 1-12
• Validation
• Improved gap sizing
Gabino Sanchez
Future…
Annotation of SL3.0
• Lift over ITAG 2.4 annotations from SL2.50
• Maker annotations
• ITAG 3.0 annotations
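Lifting annotation coordinates between builds can be pictured as re-expressing each position relative to the component it sits on. The sketch below is hypothetical and is not the method used for the ITAG lift over: the name `lift_position` and the layout dictionaries are assumptions, coordinates are 1-based inclusive, and each component is assumed to be placed whole in both builds.

```python
def lift_position(pos, old_layout, new_layout):
    """Map a 1-based position from the old build to the new build.

    Each layout maps component_id -> (begin, end, strand) on the same
    chromosome. Returns None if pos falls in a gap or the component is
    absent from the new build. Hypothetical helper for illustration.
    """
    for comp_id, (beg, end, strand) in old_layout.items():
        if beg <= pos <= end:
            if comp_id not in new_layout:
                return None
            # Offset from the component's own first base.
            offset = pos - beg if strand == "+" else end - pos
            new_beg, new_end, new_strand = new_layout[comp_id]
            return new_beg + offset if new_strand == "+" else new_end - offset
    return None  # position lies in a gap

# Toy layouts: ctg2 moves and flips orientation in the new build.
old_layout = {"ctg1": (1, 1000, "+"), "ctg2": (1101, 2000, "+")}
new_layout = {"ctg1": (1, 1000, "+"), "ctg2": (1501, 2400, "-")}

print(lift_position(500, old_layout, new_layout))
print(lift_position(1101, old_layout, new_layout))
print(lift_position(1050, old_layout, new_layout))
```

Positions that fall in a gap of the old build have no component to anchor them, so the sketch returns None for those rather than guessing.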
Thank you!!
Questions??
Slides: http://bit.ly/SOL15bld3