
PAA / Solanaceae 2006 - July 23-27, 2006 - Madison, Wisconsin, USA

Sequencing the gene-rich space of tomato chromosome 7

Current status of the French effort

Farid Regad

Genomic and Biotechnology of Fruit

UMR990 INRA/INP-Toulouse, France


Chromosome 7 project

Sequencing of gene-rich euchromatic region of chromosome 7

– Genetic length: 112 cM

– Number of linked markers: 237

– Estimated number of BACs to be sequenced: 277
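The three figures above can be combined into rough averages. The sketch below only divides the numbers quoted on the slide; it assumes markers and BACs are spread evenly along the genetic map, which the slide does not claim, so the results are orders of magnitude rather than project targets.

```python
# Figures quoted on the slide.
GENETIC_LENGTH_CM = 112
LINKED_MARKERS = 237
ESTIMATED_BACS = 277

# Simple averages (assumption: uniform distribution along the map).
cm_per_marker = GENETIC_LENGTH_CM / LINKED_MARKERS
bacs_per_cm = ESTIMATED_BACS / GENETIC_LENGTH_CM
bacs_per_marker = ESTIMATED_BACS / LINKED_MARKERS

print(f"{cm_per_marker:.2f} cM per marker")    # 0.47 cM per marker
print(f"{bacs_per_cm:.2f} BACs per cM")        # 2.47 BACs per cM
print(f"{bacs_per_marker:.2f} BACs per marker")  # 1.17 BACs per marker
```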


Chromosome 7 project

Funding:

– INRA, allowing the start of the project

– French National Research Agency, starting January 2006

– EU-SOL, starting September 2006


Status of the project

Toulouse sequencing team

Sequencing pipeline

Validation of seed BACs

– FISH localisation

– IL validation

Sequencing status

– Shotgun coverage optimized by clone selection (DACS)


INRA Toulouse sequencing team

Mondher Bouzayen Project leader

Farid Regad Project management

Seed BAC validation

Corinne Delalande BAC selection

Pierre Frasse Minimum Tiling Path

Physical mapping

Mohamed Zouine Bioinformatics

http://gbf.ensat.fr


Main investigators

Farid Regad, Mohamed Zouine, Corinne Delalande, Pierre Frasse, Mondher Bouzayen

Bioinformatics: C. Gaspin (Genopole Toulouse Bioinformatics Platform)

FISH: O. Coriton (DGAP Rennes), S. Stack (Colorado, USA), Z. Cheng (Beijing, China)

Seed BACs, BAC libraries, genetic maps, IL lines: Jim Giovannoni, Steve Tanksley (USA), Syngenta, Dani Zamir (Israel)

BAC libraries management, BAC filters, hybridizations: Hélène Bergès (CNRGV Toulouse)


Sequencing pipeline

Sequence analyses: GBF

Sequencing: Genome express

Annotation: INRA Toulouse Bioinformatics Platform

Seed BACs selection

Data exchange and storage in local database

NCBI, SGN, EUSOL

Location validation on chromosome 7: IL (GBF), FISH (China / France / USA)

Overlapping BAC selection: GBF


Physical mapping of tomato BACs on chromosome 7

FISH: Fluorescence In-Situ Hybridization

– BAC probes hybridised on pachytene chromosomes or on mitotic chromosomes

IL (D. Zamir)

– BAC mapping on introgression lines (ILs)


FISH mapping of tomato BACs

[Figure: diagram of chromosome 7 showing telomeres, euchromatin, pericentric heterochromatin, centromere and a chromomere. Markers shown: TG216, TG438, T1112, T1355, T1328, T1428, T1962, T1414, T1497, T0676, TM18, CT54, T0966, T0731, TM15, T1347, T1257, T0848. BACs shown: 309B15, 059P18, 130B18, 167K07, 213E05, 230E07, 232G04, 241F16, 308M01, 309F18, 215P04, 195N01.]

Song-Bin Chang, Steve Stack (Colorado, USA)

Olivier Coriton (INRA Rennes)


Courtesy of Song-Bin Chang and Steve Stack (Colorado)


LeHba 0027M11+Le-Hba0215P04


FISH mapping status

BACs currently being mapped by FISH:

– 10 BACs: collaboration with Steve Stack, Colorado, USA (4 assigned to chr. 7)

– 20 BACs: collaboration with Zhukuan Cheng, China

– Other BACs will be mapped by FISH at the HIS platform, INRA Rennes, France


Location confirmation - IL

Hba0002M15


Status of the project

– 9 BACs: shotgun libraries underway

– 17 BACs: PHASE 1 done

– 3 BACs: PHASE 2 done

– 1 BAC finished

– Shotgun coverage will be optimized by clone selection (DACS, patented by Genome express)


Sizing

[Figure: scatter plot with axes labelled "Phrap" and "Estimations"; axis ticks run from 40000 to 160000 and from 40000 to 120000. One labelled point per BAC: LE_HBa0325D07, LE_HBa0002D20, LE_HBa0095C18, LE_HBa0188B22, LE_HBa0215P04, LE_HBa0309B15, LE_HBa0309F18, LE_HBa0033O01, LE_HBa0130B18, LE_HBa0023C09, LE_HBa0166A09, LE_HBa0163O04, SL_MboI0031B19, LE_HBa0308M01, LE_HBa0037F23, LE_HBa0059P18, LE_HBa0230E07, SL_MboI0119A22, LE_HBa0002M15, LE_HBa0241F16, SL_MboI0017L19, LE_HBa0001N06. Annotation on the plot: 7.5 kb.]


Assembly

Validation of Waterman hypothesis

– identification of 2 problematic BACs

[Figure: scatter plot of "Number of contigs normalised for a 100 kbp BAC" against "Coverage (x)"; axis ticks run from 4 to 14 and from 15 to 30. One labelled point per BAC: LE_HBa0002D20, LE_HBa0095C18, LE_HBa0188B22, LE_HBa0325D07, LE_HBa0215P04, LE_HBa0309B15, SL_MboI0031B19, LE_HBa0309F18, LE_HBa0308M01, LE_HBa0033O01, LE_HBa0037F23, LE_HBa0059P18, LE_HBa0230E07, LE_HBa0130B18, SL_MboI0119A22, LE_HBa0002M15, LE_HBa0023C09, LE_HBa0241F16, SL_MboI0017L19, LE_HBa0001N06, LE_HBa0163O04.]
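For readers who want to reproduce the kind of curve such a plot is compared against, here is a minimal sketch. It assumes that "Waterman hypothesis" refers to the classical Lander-Waterman model of shotgun sequencing, in which the expected number of contigs is N·exp(−c·σ), with N reads, coverage c = N·L/G and σ = 1 − T/L. The read length (700 bp) and minimum detectable overlap (40 bp) are illustrative values chosen here, not figures from the slides.

```python
import math


def expected_contigs(n_reads, read_len, target_len, min_overlap=0):
    """Lander-Waterman expected number of contigs (islands).

    n_reads     -- number of shotgun reads (N)
    read_len    -- read length in bp (L)
    target_len  -- length of the sequenced clone in bp (G)
    min_overlap -- minimum overlap needed to join two reads, in bp (T)
    """
    coverage = n_reads * read_len / target_len
    sigma = 1.0 - min_overlap / read_len
    return n_reads * math.exp(-coverage * sigma)


def contigs_per_100kb(coverage, read_len=700, min_overlap=40):
    """Expected contigs for a 100 kbp clone sequenced at a given coverage.

    read_len and min_overlap defaults are illustrative assumptions.
    """
    target_len = 100_000
    n_reads = coverage * target_len / read_len
    return expected_contigs(n_reads, read_len, target_len, min_overlap)


if __name__ == "__main__":
    for cov in (4, 6, 8, 10, 12, 14):
        print(f"{cov:>2}x coverage: {contigs_per_100kb(cov):6.2f} expected contigs")
```

A BAC whose observed contig count lies far above the model value for its coverage would stand out as a candidate problem clone, for example because of repeats or an uneven library.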


GE-DACS™ Technology

Single terminator, dye primer sequencing reaction

Pooling of 4 reaction products per capillary

[Figure label: multiplexed signatures]

GENOME express


GE-DACS™ Technology

Signature-to-sequence comparison

– Alignment of pseudo-sequence against expected identity

[Figure labels: adaptive thresholding, expected identity, pseudo-sequence]


GE-DACS™ Technology

Signature-to-signature comparison

– Segment comparisons by cross-correlation

– ‘Correlogram’-based correspondence detection

[Figure label: correlogram]
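The slides do not describe how GE-DACS computes its correlograms, so the following is only a generic illustration of cross-correlation between two one-dimensional traces, not the patented method. It computes a normalised (Pearson) correlation at each lag and reports the lag with the highest value; the function names and the toy traces are invented for this sketch.

```python
import math


def _pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if undefined)."""
    n = len(xs)
    if n < 2:
        return 0.0
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def correlogram(a, b, max_lag):
    """Map each lag in [-max_lag, max_lag] to the correlation of a[i] with b[i + lag]."""
    result = {}
    for lag in range(-max_lag, max_lag + 1):
        if lag >= 0:
            xs, ys = a, b[lag:]
        else:
            xs, ys = a[-lag:], b
        n = min(len(xs), len(ys))  # compare only the overlapping part
        result[lag] = _pearson(xs[:n], ys[:n])
    return result


def best_lag(a, b, max_lag):
    """Lag at which the two traces correlate best."""
    cg = correlogram(a, b, max_lag)
    return max(cg, key=cg.get)


# Toy example: trace_b is trace_a delayed by three samples.
trace_a = [0, 0, 1, 3, 1, 0, 0, 2, 5, 2, 0, 0]
trace_b = [0, 0, 0] + trace_a[:-3]
print(best_lag(trace_a, trace_b, max_lag=5))  # 3
```

A sharp peak in the correlogram suggests that two segments share the same underlying signal, while a flat correlogram suggests they do not correspond.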


BAC data summary

Columns: seed BAC, overlap, overgo marker, size, ctg#, sequencing status, PCR validation, FISH location

LE_HBa0002D20 TG418 ND DACS Yes US telomere
LE_HBa0003E13 ND Yes China
LE_HBa0023C09 C2_At2g42810 1130 PHASE I Yes
LE_HBa0030F21 ND Yes China
LE_HBa0033O01 ND COMPLETED Yes
LE_HBa0034H08 ND Yes China
LE_HBa0037F23 ND PHASE I Yes
LE_HBa0040A11 ND Yes China
LE_HBa0049C13 ND Yes China
LE_HBa0049P16 LE_HBa0166A09 ND PHASE I Yes
LE_HBa0059P18 T0966 4361 PHASE I Yes US
LE_HBa0095C18 C2_At4g30950 3594 DACS Yes
LE_HBa0130B18 TG438 981 PHASE I Yes US
LE_HBa0138C17 ND Yes China
LE_HBa0163O04 T0028 1175 PHASE I Yes US 41% of the long arm from the centromere
LE_HBa0166A09 ND PHASE I Yes
LE_HBa0167K07 T0731 ND Yes US 93.0 cM, sub-telomeric end
LE_HBa0188B22 CT135 2000 PHASE II Yes Rennes
LE_HBa0213E05 T1355 748 Yes China
LE_HBa0215P04 ND PHASE I Yes US subtelomeric long arm
LE_HBa0230E07 T1328 ND PHASE I Yes US
LE_HBa0232G04 T1112 4640 Yes Rennes
LE_HBa0241F16 TG216 ND PHASE I Yes US
LE_HBa0242B17 ND PHASE I Yes Rennes
LE_HBa0308M01 CT54 ND PHASE I Yes US
LE_HBa0309B15 T048 2612 PHASE I Yes US subtelomeric long arm
LE_HBa0309F18 TM15 ND PHASE II Yes Rennes
LE_HBa0325D07 T1414 977 DACS Yes US
LE_HBa0001N06 LE_HBa0309B15 ND PHASE II Yes
LE_HBa0002M15 C2_At1g19140 2136 PHASE I Yes
LE_HBa0079F09 ND ND Shotgun Yes
LE_HBa0175E07 ND Shotgun Yes
LE_HBa0226J04 ND ND Shotgun Yes

SL_MboI0017L19 LE_HBa0215P04 ND ND Shotgun Yes
SL_MboI0031B19 LE_HBa0325D07 ND ND Shotgun Yes
SL_MboI0119A22 LE_HBa0309B15 ND ND Shotgun Yes
SL_MboI0046H06 ND ND Shotgun Yes
LE_HBa0226J04 ND ND Shotgun Yes

SL_EcoRI0019G22 ND ND Shotgun Yes
SL_EcoRI0099J13 LE_HBa0325D07 ND ND Shotgun Yes
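BAC names appear in several spellings across these slides: full names such as LE_HBa0309B15 in the summary, and shortened labels such as 309B15 or Hba0002M15 on figures. The sketch below normalises them to one form so that FISH results and sequencing status can be matched by name. It rests on two assumptions that the slides do not state: that a name reads as library prefix, plate number, well row (A to P) and well column, and that a label without a library prefix belongs to the LE_HBa library. `BacName`, `parse_bac_name` and the alias table are hypothetical helpers written for this example.

```python
import re
from typing import NamedTuple


class BacName(NamedTuple):
    library: str
    plate: int
    row: str
    column: int

    def canonical(self) -> str:
        """Full name in the form used in the summary, e.g. LE_HBa0309B15."""
        return f"{self.library}{self.plate:04d}{self.row}{self.column:02d}"


# Assumption: labels with no prefix, or only "Hba", refer to the LE_HBa library.
_LIBRARY_ALIASES = {
    "": "LE_HBa",
    "hba": "LE_HBa",
    "lehba": "LE_HBa",
    "slmboi": "SL_MboI",
    "slecori": "SL_EcoRI",
}

# Optional prefix, then plate digits, one row letter, two column digits.
_WELL_RE = re.compile(
    r"^(?P<prefix>.*?)(?P<plate>\d{1,4})(?P<row>[A-P])(?P<column>\d{2})$"
)


def parse_bac_name(raw: str) -> BacName:
    """Parse a BAC label in any of the spellings seen on the slides."""
    match = _WELL_RE.match(raw.strip())
    if match is None:
        raise ValueError(f"unrecognised BAC name: {raw!r}")
    key = re.sub(r"[^A-Za-z]", "", match.group("prefix")).lower()
    if key not in _LIBRARY_ALIASES:
        raise ValueError(f"unknown library prefix in {raw!r}")
    return BacName(
        library=_LIBRARY_ALIASES[key],
        plate=int(match.group("plate")),
        row=match.group("row"),
        column=int(match.group("column")),
    )


for label in ("309B15", "Hba0002M15", "Le-Hba0215P04", "SL_MboI0031B19"):
    print(label, "->", parse_bac_name(label).canonical())
```

Raising an error on unknown prefixes, rather than guessing, keeps a mistyped label from being silently matched to the wrong clone.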


Expected agenda

January 2006
– Official start of the project

June 2006
– Optimisation and validation of the sequencing pipeline

September 2006
– 21 BACs sequenced

January 2007
– 70 BACs completed

June 2007
– 150 BACs completed

March 2008
– 277 BACs completed

December 2008
– Chromosome 7 assembly and finishing completed

2009
– Chromosome 7 annotation
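The milestones above imply a sequencing throughput that can be worked out directly. The sketch below treats the four BAC counts as cumulative totals reached at the start of each stated month; that reading is an assumption (the September 2006 milestone says "sequenced" while the later ones say "completed"), and the rates are simple averages between milestones.

```python
# (year, month) and cumulative BAC count, taken from the agenda.
MILESTONES = [
    ((2006, 9), 21),
    ((2007, 1), 70),
    ((2007, 6), 150),
    ((2008, 3), 277),
]


def months_between(start, end):
    """Whole months from one (year, month) pair to another."""
    return (end[0] - start[0]) * 12 + (end[1] - start[1])


def required_rates(milestones):
    """Average BACs per month needed between consecutive milestones."""
    rates = []
    for (d0, n0), (d1, n1) in zip(milestones, milestones[1:]):
        rates.append((d0, d1, (n1 - n0) / months_between(d0, d1)))
    return rates


for start, end, rate in required_rates(MILESTONES):
    print(f"{start[0]}-{start[1]:02d} to {end[0]}-{end[1]:02d}: {rate:.2f} BACs/month")
# 2006-09 to 2007-01: 12.25 BACs/month
# 2007-01 to 2007-06: 16.00 BACs/month
# 2007-06 to 2008-03: 14.11 BACs/month
```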


Acknowledgements

Jim Giovannoni, Steve Tanksley, Joyce Van Eck
– Mapping data, seed BACs and BAC libraries

Dani Zamir
– IL lines

Syngenta
– New markers on chromosome 7

Steve Stack, Zhukuan Cheng, Olivier Coriton
– BAC FISHing

INRA, French NRA, EU-SOL
– Funding support

CNRGV INRA-Toulouse
– BAC libraries storage and handling

SGN Consortium (Lukas Mueller, …)
– Bioinformatics and online access to all relevant data for the sequencing