cotton genomics @sid
COTTON GENOMICS
Siddhartha Swarup Jena
RAD/10-30
Ph.D. Mol. Bio & Biotech
S S Jena
Phylum: Magnoliophyta
Class: Magnoliopsida
Order: Malvales
Family: Malvaceae
Tribe: Gossypieae
Cotton: King of Fibres
Diploid: 2n = 26, tetraploid: 2n = 52
2.7 billion nucleotides.
Gossypium
Annual, biennial or perennial
Herbaceous, short shrub or small tree
Primary axis, alternate
Leaves have varying texture, shape and hairiness
Showy cream, yellow, red or purple flowers; axillary, terminal or solitary; typically with 5 petals
Diversity
African-Asian diploids: G. herbaceum, G. arboreum
New World tetraploids: G. barbadense, G. hirsutum
Four Independently Domesticated Species!!
G. barbadense
G. hirsutum
Gossypium hirsutum, also known as Upland cotton, Long Staple Cotton, or Mexican Cotton, produces over 90% of the world’s cotton;
G. barbadense, also known as Sea Island Cotton, Extra Long Staple Cotton, American Pima, or Egyptian Cotton, contributes 8% of the world’s cotton;
G. herbaceum, also known as Levant Cotton, and G. arboreum, also known as Tree Cotton, together provide 2% of the world’s cotton.
Concerning diploid parentage
Cytogenetic studies indicate G. raimondii as the closest living relative of the D-genome parental donor
Hutchinson et al., 1947
– used 5 D-genome species in crosses with G. hirsutum or G. barbadense.
– Indicated G. raimondii as closer to the D-genome than other species tested.
– Innovative approach involving comparative analysis of diverse synthetic allohexaploids.
Liu et al., 2001
– G. raimondii is the sister group to the clade of all 5 allopolyploid species
A-genome perspectives
A-genome of allopolyploid cotton is more similar to the A-genome diploids than the D-genome of the allopolyploid is to that of the D-genome diploids!
G. arboreum and G. herbaceum are better models of the progenitor A-genome diploid than G. raimondii is of the D-genome diploid
G. herbaceum is more likely the A-genome donor than G. arboreum.
Biogeographical Theories
Theories, based on cytogenetic data, suggested that polyploidization occurred after a Trans-Atlantic dispersal of a species similar to G. herbaceum.
Wendel and Albert, 1992:
Suggest pre-Pleistocene A-genome radiation into Asia, followed by trans-Pacific dispersal to the Americas
Supported by biogeography of D-genome species
Recent arrival of G. raimondii in Peru.
Allopolyploidization of Cotton Occurred Only Once
All New World tetraploid cottons contain Old World Cytoplasm.
There must have been a single seed plant in the initial hybridization event.
Long distance dispersals occurred by Transoceanic Voyages.
Phylogeny and evolution of Gossypium species
CottonDB
CottonDB is a product of the Agricultural Research Service of the US Department of Agriculture and is maintained as a public resource to serve the cotton research community.
CottonDB is a database that contains genomic, genetic and taxonomic information for cotton (Gossypium spp.).
It serves both as an archival database and as a dynamic database which incorporates new data and user resources.
CottonDB was initiated in 1995. It is the first and most extensively used cotton database worldwide.
CottonDB.org contains:
– Genomic markers and nucleotide sequences
– Genes, alleles and gene products
– BAC clones and clone libraries
– TM-1 fingerprint contigs developed by USDA-ARS/Texas A&M University
– Taxonomy of the Gossypium genus
– Genetic maps
– Contact information and research interests of colleagues
– Relevant bibliographic citations
Linkage map
The first molecular linkage map of the Gossypium species was constructed from an interspecific G. hirsutum × G. barbadense F2 population based on RFLPs.
The map comprises 2,584 loci at 1.74 cM intervals and covers all 13 homeologous chromosomes of the allotetraploid cottons, making it the most complete genetic map of Gossypium.
At least six BAC and BIBAC libraries have been developed and made available to the public.
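As a rough cross-check of the map figures quoted above, the total genetic length implied by 2,584 loci at a mean spacing of 1.74 cM can be estimated. This is only a sketch under assumptions that the slides do not state: that 1.74 cM is the mean interval between adjacent loci, and that the loci fall into 26 linkage groups (one per chromosome of the allotetraploid). The function name is hypothetical.

```python
# Rough estimate of total map length from locus count and mean spacing.
# Assumptions (not from the slides): 1.74 cM is the mean interval between
# adjacent loci, and the loci are spread over 26 linkage groups.

N_LOCI = 2584
MEAN_INTERVAL_CM = 1.74
N_LINKAGE_GROUPS = 26  # assumed: one linkage group per chromosome


def estimated_map_length_cm(n_loci, mean_interval_cm, n_groups):
    """Return total map length as (number of intervals) x (mean interval).

    A linkage group with k loci has k - 1 intervals, so the whole map has
    n_loci - n_groups intervals.
    """
    n_intervals = n_loci - n_groups
    return n_intervals * mean_interval_cm


length_cm = estimated_map_length_cm(N_LOCI, MEAN_INTERVAL_CM, N_LINKAGE_GROUPS)
print(f"Estimated map length: {length_cm:.0f} cM")
```

If the number of linkage groups differs from the assumed 26, only `N_LINKAGE_GROUPS` needs to change; the estimate shifts by 1.74 cM per group.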
ESTs
A total of 281,233 ESTs are available for Gossypium species in GenBank.
Of these ESTs:
178,177 are from the polyploid cultivated cottons, with 177,154 (63.0%) from G. hirsutum and 1,023 (<1.0%) from G. barbadense;
103,056 are from the related diploid species, with 39,232 (13.9%) from G. arboreum (A2), 63,577 (22.6%) from G. raimondii (D5), and 247 (<1.0%) from G. herbaceum (A1).
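The subtotals and percentages above can be recomputed directly from the per-species counts. The short sketch below uses only the numbers quoted on this slide; the variable and function names are illustrative.

```python
# EST counts per species, as quoted on the slide (GenBank, Gossypium).
EST_COUNTS = {
    "G. hirsutum": 177_154,
    "G. barbadense": 1_023,
    "G. arboreum (A2)": 39_232,
    "G. raimondii (D5)": 63_577,
    "G. herbaceum (A1)": 247,
}
POLYPLOIDS = ("G. hirsutum", "G. barbadense")

total = sum(EST_COUNTS.values())
polyploid_total = sum(EST_COUNTS[name] for name in POLYPLOIDS)
diploid_total = total - polyploid_total


def share(count, whole):
    """Return count as a percentage of whole."""
    return 100.0 * count / whole


# Print one row per species, then the two subtotals and the grand total.
for name, count in EST_COUNTS.items():
    print(f"{name:18s} {count:>8,d}  {share(count, total):5.1f}%")
print(f"{'Polyploid subtotal':18s} {polyploid_total:>8,d}")
print(f"{'Diploid subtotal':18s} {diploid_total:>8,d}")
print(f"{'Total':18s} {total:>8,d}")
```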
ESTs cont.
These ESTs were collectively generated from 32 cDNA libraries constructed from mRNA isolated from 18 genotypes of three species, G. hirsutum, G. arboreum, and G. raimondii, by one-pass sequencing of cDNA clones from one (3′ or 5′ end) or both ends.
Generated from 12 different organs: developing fibers, seedlings, buds, bolls, ovules, roots, hypocotyls, immature embryos, leaves, stems, and cotyledons.
Some of the ESTs were generated from plants growing under biotic or abiotic stress conditions such as drought, chilling, and pathogens.
A predominant feature of the cotton EST set is the strong bias of its tissue sources toward fiber or fiber-bearing ovules over other organs.
Physical mapping
To date the database is limited to information concerning our ongoing physical mapping effort in three species of cotton, including the two cultivated 'AADD' tetraploid species Gossypium barbadense and G. hirsutum, and the wild DD genome species G. raimondii.
BAC libraries for all three species are currently being assayed using genomic and cDNA clones derived from linkage maps, and also dispersed repetitive DNA clones.
The physical mapping database has been constructed using the BACMan data management application
QTLs
QTLs for fiber quality properties in two Upland cottons, compared with those of ELS (extra long staple) cotton, with regard to their locations and gene effects.
A total of thirteen QTLs have been identified: four for fiber strength, three for fiber length, and six for fiber fineness.
They are located on different chromosomes or linkage groups of our molecular maps, which are composed of 355 DNA markers covering 4,766 cM (Haldane function) of the cotton genome in 50 linkage groups.
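The map distances above are given under the Haldane mapping function, which converts a recombination fraction r into an additive distance d = -50 ln(1 - 2r) cM, assuming no crossover interference. A minimal sketch of that conversion follows, together with the mean marker spacing implied by the quoted figures. The function names are illustrative, and the spacing estimate assumes each linkage group with k markers contributes k - 1 intervals.

```python
import math


def haldane_cm(r):
    """Convert a recombination fraction r (0 <= r < 0.5) to centimorgans."""
    if not 0.0 <= r < 0.5:
        raise ValueError("recombination fraction must be in [0, 0.5)")
    return -50.0 * math.log(1.0 - 2.0 * r)


def haldane_r(d_cm):
    """Inverse of haldane_cm: convert centimorgans to a recombination fraction."""
    if d_cm < 0:
        raise ValueError("map distance must be non-negative")
    return 0.5 * (1.0 - math.exp(-2.0 * d_cm / 100.0))


# Figures quoted on the slide.
N_MARKERS = 355
N_LINKAGE_GROUPS = 50
MAP_LENGTH_CM = 4766.0

# Assumption: intervals = markers - linkage groups.
mean_spacing_cm = MAP_LENGTH_CM / (N_MARKERS - N_LINKAGE_GROUPS)

print(f"r = 0.10 corresponds to {haldane_cm(0.10):.2f} cM")
print(f"Mean marker spacing: {mean_spacing_cm:.1f} cM")
print(f"Recombination fraction at that spacing: {haldane_r(mean_spacing_cm):.3f}")
```

For small r the Haldane distance is close to 100 r cM; the two diverge as r approaches 0.5, where the distance grows without bound.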
Cotton Vs Arabidopsis
Although the cotton genome is large and complex, the physical size of a cM in cotton is only 50% larger than in Arabidopsis.
A high level of homology between the Arabidopsis and Gossypium genomes, and abundant polymorphism among Gossypium germplasm, were detected using conserved cDNAs from the Arabidopsis genome.
The upland cotton genome consists of approximately 61% unique sequences and low copy number DNA.
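The "physical size of a cM" is the genome size divided by the genetic map length. The sketch below shows the calculation using two figures quoted elsewhere in these slides (a genome size of 2,246 Mbp and a map length of 4,766 cM). Combining them is an assumption made here for illustration, since the slides do not say that the map covers the whole genome. No Arabidopsis values are given in the slides, so the comparison function simply takes them as inputs. Function names are hypothetical.

```python
def kb_per_cm(genome_size_mbp, map_length_cm):
    """Return the average physical distance per centimorgan, in kilobases."""
    if map_length_cm <= 0:
        raise ValueError("map length must be positive")
    return genome_size_mbp * 1000.0 / map_length_cm


def percent_larger(value, reference):
    """Return how much larger value is than reference, as a percentage."""
    return 100.0 * (value - reference) / reference


# Figures taken from other slides in this deck; pairing them is an assumption.
COTTON_GENOME_MBP = 2246.0
COTTON_MAP_CM = 4766.0

cotton_kb_per_cm = kb_per_cm(COTTON_GENOME_MBP, COTTON_MAP_CM)
print(f"Cotton: about {cotton_kb_per_cm:.0f} kb per cM")
```

Given a genome size and map length for Arabidopsis from another source, `percent_larger(cotton_kb_per_cm, kb_per_cm(...))` would give the figure to compare against the 50% quoted above.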
Chloroplast genome
The chloroplast genome of cotton is 160,317 base pairs (bp) in length, and is composed of a large single copy (LSC) of 88,841 bp, a small single copy (SSC) of 20,294 bp, and two identical inverted repeat (IR) regions of 25,591 bp each.
The genome contains 114 unique genes, of which 17 genes are duplicated in the IRs. In addition, many open reading frames (ORFs) and hypothetical chloroplast reading frames (ycfs) with unknown functions were deduced.
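The region sizes quoted above can be checked against the stated total, since a chloroplast genome of this structure is the sum of the LSC, the SSC and the two IR copies. The sketch below uses only the numbers on this slide. The count of gene copies simply adds the 17 IR-duplicated genes to the 114 unique genes, which assumes each duplicated gene has exactly one extra copy.

```python
# Region sizes in base pairs, as quoted on the slide.
LSC_BP = 88_841
SSC_BP = 20_294
IR_BP = 25_591          # each of the two inverted repeats
REPORTED_TOTAL_BP = 160_317

total_bp = LSC_BP + SSC_BP + 2 * IR_BP
print(f"Sum of regions: {total_bp:,d} bp (reported: {REPORTED_TOTAL_BP:,d} bp)")

# Share of the genome taken up by each region.
for name, size in (("LSC", LSC_BP), ("SSC", SSC_BP), ("IR (x2)", 2 * IR_BP)):
    print(f"{name:8s} {size:>7,d} bp  {100.0 * size / total_bp:5.1f}%")

# Gene copies, assuming one extra copy per IR-duplicated gene.
UNIQUE_GENES = 114
DUPLICATED_IN_IR = 17
gene_copies = UNIQUE_GENES + DUPLICATED_IN_IR
print(f"Gene copies including IR duplicates: {gene_copies}")
```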
Chloroplast genome cont.
Compared to the chloroplast genomes from 8 other dicot plants, the cotton chloroplast genome showed a high degree of similarity of the overall structure, gene organization, and gene content.
The cotton chloroplast genome was somewhat longer than the chloroplast genomes of most of the other dicot plants compared here.
However, this elongation of the cotton chloroplast genome was found to be due mainly to expansions of the intergenic regions and introns (non-coding DNA).
Moreover, these expansions occurred predominantly in the LSC and SSC regions.
Cotton sequencing will greatly help molecular breeding
Increases our understanding of cotton physiology and evolution.
Model of polyploid and comparative genome evolution
Maintains the competitive advantage of cotton fiber relative to other fibers
Creates new values for cotton on the farm and beyond the farm gate
Factors slowing down the cotton sequence progress
Its large genome, with a relatively large physical size of 2,246 Mbp, and small chromosomes
Allotetraploid AD genome, n = 2x = AD = 26
A large fraction of the genome is composed of repetitive DNA sequences
THANK YOU