Samir V. Sawant, Principal Scientist, CSIR-National Botanical Research Institute, Rana Pratap Marg...


Post on 25-Feb-2016



Genomics of Gossypium spp. for Development of Genetic Markers and Discovery of Genes Related to Fiber and Drought Traits

Samir V. Sawant
Principal Scientist
CSIR-National Botanical Research Institute
Rana Pratap Marg, Lucknow-226001

Synopsis of Presentation:

I. Large Scale Genomic Resource Development in Cotton
II. Genes Underlying Drought Tolerance & Fiber Quality Traits

I. Large Scale Genomic Resource Development in Cotton through Sequencing of HMPR Libraries

Selection of Six Diverse Germplasms of G. hirsutum (based on AFLP genetic diversity)

Genotype    Species       Source of Collection
JKC 703     G. hirsutum   JK Agri., Hyderabad, Andhra Pradesh
JKC 725     G. hirsutum   JK Agri., Hyderabad, Andhra Pradesh
JKC 737     G. hirsutum   JK Agri., Hyderabad, Andhra Pradesh
JKC 770     G. hirsutum   JK Agri., Hyderabad, Andhra Pradesh
LRA 5166    G. hirsutum   TNAU, Coimbatore, Tamilnadu
MCU5        G. hirsutum   UASD, Dharwad, Karnataka

Jena et al. (2011) Crop & Pasture Science 62:859-75

Genic-Enrichment by Methylation Sensitive Restriction Digestion

[Figure: Digestion of genomic DNA with different enzymes for methylation pattern. Lanes: M, BstBI, HpaII, ClaI, HindIII, EcoRV; uncut DNA marked.]

[Figure: Enriched DNA after digestion with methylation-sensitive and insensitive enzymes.]

Rai et al. (2013) Plant Biotechnology J.

Reads Generated for Various Genic-enriched Cotton Genotypes

Germplasm   Enzyme used   Total Reads (millions)   Total Bases (Mb)
JKC 703     HpaII         1.50                     429.1
            ClaI          1.64                     474.6
JKC 725     HpaII         1.36                     372.9
            ClaI          1.87                     533.5
JKC 737     HpaII         1.45                     407.7
            ClaI          1.69                     542.9
JKC 770     HpaII         1.30                     376.1
            ClaI          1.76                     481.1
MCU5        HpaII         1.15                     316.8
            ClaI          1.47                     416.6
LRA 5166    HpaII         1.45                     433.9
            ClaI          1.70                     513.4
Total                     18.47                    5298.8

Genotype wise Comparison of Genic-enriched reads using gsMapper v2.5.3
(each cell: % of reads mapped / % of bases mapped)

Germplasm   JKC 725   JKC 737   JKC 770   MCU5      LRA 5166
JKC 703     89 / 70   87 / 68   87 / 66   75 / 60   87 / 68
JKC 725               90 / 68   89 / 69   76 / 61   89 / 69
JKC 737                         89 / 69   77 / 60   88 / 67
JKC 770                                   75 / 59   89 / 69
MCU5                                                79 / 63

Enzyme wise comparison of Genic-enriched (HMPR enriched) reads using gsMapper v2.5.3

Germplasm   % reads mapped   % bases mapped
JKC 703     78.1             61.1
JKC 725     83.2             62.3
JKC 737     83.8             66.4
JKC 770     82.3             63.8
MCU5        80.5             64.4
LRA 5166    77.3             56.5

De novo Assembly using Newbler v2.5.3 Assembler

Parameters                   JKC 703   JKC 725   JKC 737   JKC 770   MCU5     LRA 5166   Super assembly
All contigs (>100 bp)        58,142    61,862    54,731    53,419    27,952   63,002     533,271
Singletons (millions)        1.23      1.43      1.33      1.29      0.80     1.24       3.56
Total bases (Mb)             377.9     428.1     427.3     378.4     233.4    398.9      1272.6
Large contigs (>500 bp)      21,920    20,255    17,960    17,657    8,663    25,084     215,504
Largest contig size (Kb)     29.7      30.9      30.4      31.0      36.0     29.6       24.3
Avg. contig size (bases)     826       808       809       826       815      868        900
N50 contig size (bases)      808       785       787       802       771      861        894
Q40 plus bases               92.78     92.6      92.25     92.63     94.25    92.99      93.8

Gene Prediction and Annotations

Gene Prediction

Tool / category                                                 Gene models
AUGUSTUS                                                        90294
GenScan                                                         125422
GlimmerHMM                                                      97533
Common gene models (present in two or more prediction tools)    93363
Full length genes                                               21399

Annotation

Database      Total hits   Unique
NCBI nr       63950        38645
TAIR 10       52838        16956
Cotton EST    45054        19513

Workflow labels on the slide: Gene Prediction, Reciprocal Blast, Annotation
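The "common gene models" count keeps models reported by at least two of the three predictors. A minimal Python sketch of that tally follows. It assumes each tool's output has already been reduced to comparable locus keys; the `contig:start-end` strings, the function name and the toy inputs are hypothetical, and the slides do not say how models were matched across tools.

```python
from collections import Counter


def common_gene_models(predictions, min_tools=2):
    """Return locus keys reported by at least `min_tools` prediction tools.

    `predictions` maps a tool name to an iterable of locus keys.
    Exact key matching is an assumption made for illustration only.
    """
    support = Counter()
    for loci in predictions.values():
        # set() so a tool that lists the same locus twice is counted once
        support.update(set(loci))
    return {locus for locus, n_tools in support.items() if n_tools >= min_tools}


# Hypothetical toy input, not data from the presentation
toy = {
    "AUGUSTUS": ["c1:100-900", "c2:50-700", "c3:10-400"],
    "GenScan": ["c1:100-900", "c4:5-300"],
    "GlimmerHMM": ["c2:50-700", "c1:100-900", "c5:1-200"],
}
print(sorted(common_gene_models(toy)))
```

In practice predicted models from different tools rarely share identical coordinates, so a real comparison would likely match on overlap rather than on exact keys.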

Similarity of Predicted Gene Models with Other Plant Genomes

[Figure: comparison with V. vinifera, G. raimondii, A. thaliana and R. communis.]

qRT PCR Validation of 12 Randomly Selected Predicted Gene Models in G. hirsutum

[Figure: Y-axis: Fold Expression in Fiber and Root as compared to Leaf tissues.]

Identification of Transcription Factor Encoding Genes

qRT PCR Validation of 12 Randomly Selected Predicted Transcription factor encoding Gene Models in G. hirsutum

[Figure: Y-axis: Fold Expression in Fiber and Root as compared to Leaf tissues.]

Genome-wide SNP Discovery in G. hirsutum

Strategy for SNP Discovery in G. hirsutum

SNP discovery using Newbler v2.5.3 Assembler:

1. Assembly of individual germplasms (Newbler v2.5.3).
2. Pooling contigs from each germplasm and assembly (Newbler v2.5.3) into super contigs.
3. AutoSNP run on the super contigs.
4. Output of AutoSNP processed using a customized program to filter out false SNPs.
5. Detected SNP contigs.

Settings shown on the slide: assembly using gsAssembler v2.5.3 (40 bp overlap with 97% identity); autoSNP v2.0 for contigs with minimum six reads (3 reads from each genotype).

[Figure: aligned reads of G. hirsutum genotype-1 and G. hirsutum genotype-2. Allelic SNP (taken); non-allelic SNP (discarded).]
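The read-depth rule and the allelic versus non-allelic distinction can be sketched in a few lines of Python. This is an illustrative reading of the slide, not the customized program itself: the function name, the category labels and the rule "all reads within a genotype agree, and the two genotypes differ" are assumptions.

```python
def classify_snp(bases_g1, bases_g2, min_reads_per_genotype=3):
    """Classify one candidate SNP position from the read bases of two genotypes.

    Returns "low_coverage", "monomorphic", "allelic" or "non_allelic".
    Assumed rule: a site is allelic when every read within a genotype
    carries the same base and the two genotypes carry different bases.
    """
    if len(bases_g1) < min_reads_per_genotype or len(bases_g2) < min_reads_per_genotype:
        return "low_coverage"  # fewer than 3 reads from one genotype
    alleles_g1, alleles_g2 = set(bases_g1), set(bases_g2)
    if len(alleles_g1) == 1 and len(alleles_g2) == 1:
        # Each genotype is internally consistent
        return "monomorphic" if alleles_g1 == alleles_g2 else "allelic"
    # Mixed bases inside a genotype: treated as non-allelic and discarded
    return "non_allelic"


print(classify_snp("AAA", "TTT"))  # allelic
print(classify_snp("AAT", "ATT"))  # non_allelic
print(classify_snp("AA", "TTT"))   # low_coverage
```

Requiring agreement within each genotype is one plausible way to separate true allelic differences from variation between homeologous copies in a tetraploid genome.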

Genome-wide SNP Discovery in G. hirsutum

[Figure: SNP summary and sequence alignment across cultivars JKC 703, JKC 725, JKC 770 and LRA 5166.]

Details of SNP discovery

Total identified SNPs     4,22,617
True SNPs called          75,714
Non-redundant SNPs        66,444
Novel SNPs                66,364

Distribution of identified SNPs

UTRs                                 4446
Intronic                             4518
Annotated exonic, synonymous         2604
Annotated exonic, non-synonymous     6506

Validation of Identified SNPs in G. hirsutum

[Figure: T/C SNP shown for JKC 703, JKC 725, JKC 770, LRA 5166, JKC 737 and MCU5.]

SNPs used for Validation: 30
Germplasms used: 6
SNPs Detected: 30

SSRs identification and Novelty comparison against Cotton Marker Database

Total SSRs identified                    1,48,930
SSRs successful in designing primers     56,142
Novel SSRs developed                     47,093

[Figure: SSRs distribution on the basis of motifs (number of SSRs by unit size of different repeat type).]

[Figure: SSR novelty analysis: unmatched whole sequence wise, unmatched primer sequence wise, unmatched flanking sequence wise; 47,093 novel SSRs.]

Validation of Identified SSRs in G. hirsutum

Genotype    Fragment sizes (bp)
JKC 703     291/297/300
JKC 770     291/300
JKC 725     291/297
JKC 737     291/297
LRA 5166    291/300
MCU 5       291/297

SSRs used for Validation: 40
Germplasms used: 12
Polymorphic SSRs: 6
% Polymorphism: 15
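The polymorphism figure is the share of tested SSRs that are polymorphic, 6 of 40. A short Python sketch shows the arithmetic together with one simple way to flag a marker as polymorphic from fragment sizes. Calling a marker polymorphic when genotypes show different sets of fragment sizes is an assumption made here, and both function names are hypothetical.

```python
def is_polymorphic(sizes_by_genotype):
    """True if the genotypes do not all show the same set of fragment sizes (bp)."""
    patterns = {frozenset(sizes) for sizes in sizes_by_genotype.values()}
    return len(patterns) > 1


def percent_polymorphism(n_polymorphic, n_tested):
    """Polymorphic markers as a percentage of all markers tested."""
    return 100.0 * n_polymorphic / n_tested


# Fragment sizes (bp) listed on the validation slide for one SSR
fragments = {
    "JKC 703": (291, 297, 300),
    "JKC 770": (291, 300),
    "JKC 725": (291, 297),
    "JKC 737": (291, 297),
    "LRA 5166": (291, 300),
    "MCU 5": (291, 297),
}
print(is_polymorphic(fragments))     # True
print(percent_polymorphism(6, 40))   # 15.0
```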

Distribution of G. hirsutum SSRs and SNPs containing sequences on G. raimondii reference genome

[Figure: SSRs and SNPs placed on G. raimondii (JGI) and G. raimondii (Chinese draft).]

miRNAs in Gossypium (on the basis of miRBase)

Total miRNAs identified        78
miRNA families identified      42
miRNA novel to G. hirsutum     17

miRNA Novel to Gossypium: miR-1713, miR-2112, miR-2675, miR-3522, miR-3696, miR-165, miR-437, miR-477, miR-536, miR-950, miR-1070, miR-4343, miR-4371, miR-5023, miR-5065, miR-5555, miR-3963

Promoters and Cis Regulatory Elements

Promoters identified     24839
1000 bp                  826
500 bp                   3135

Fiber developmental stage specific Promoters

Initiation                           184
Elongation                           28
Secondary cell wall (synthesis)      110

[Figure: Size Distribution of Identified Promoters (size in bases vs. no. of promoters).]

Genomic Resources Developed for G. hirsutum: An Overview

Assembled Sequences     4095128
Assembled Bases         1272 Mb
GC Content              37.76 %
Repetitive content      12.16 %
Gene Models             93363
Full length genes       21399
TFs                     1093
Promoters               3135
Novel SNPs              66364
Novel SSRs              47093

Rai et al. (2013) Plant Biotechnology J.

Development & Characterization of gSSRs and eSSRs in Diploid Cotton (G. herbaceum)

Jena et al. (2012) Theoretical Applied Genetics 124 (3):565-76

Cross-species transferability of G. herbaceum derived gSSRs and eSSRs

[Figure: UPGMA tree of 15 genotypes of G. herbaceum based on Nei's genetic distance using 200 SSRs.]

Total SSRs from G. herbaceum: 263 gSSRs (Repeat Enriched Genomic Libraries) and 1970 eSSRs (Drought Transcriptome Sequencing)

[Figure: SSR NBRI_gB010 among four species of cotton.]

Development of molecular markers from Indian genotypes of two Gossypium L. species

G. hirsutum: JKC 703 (superior in fibre quality), JKC 777 (inferior in fibre quality)
G. herbaceum: Vagad (Drought tolerant), RAHS-14 (Drought Sensitive), GujCot (Drought tolerant), RAHSIPS-187 (Drought Sensitive)

Markers shown on the slide: 10,947 SNPs; 1440 SSRs; 2608 SNPs
Novel markers: 334 Novel SSRs; 1,847 Novel SNPs; 10,947 Novel SNPs

[Figure: Venn diagrams comparing NBRI SSRs (SSR sequence, 50 bp flanking, primers) and NBRI SNPs with public domain SNPs.]

Srivastava et al., Journal Plant Breeding doi:10.1111/pbr.12087 (In Press)

Microarray Based Single Feature Polymorphisms (SFPs) in Gossypium hirsutum

1. Genotypes JKC 725, JKC 770, JKC 703, JKC 737 and JKC 783 (grouped on the slide as inferior and superior fiber quality), biological replicates 1, 2, 3 each.
2. RNA extraction / microarray hybridization.
3. RMA background correction of 770-1,2,3.CEL, 725-1,2,3.CEL, 703-1,2,3.CEL, 737-1,2,3.CEL and 783-1,2,3.CEL.
4. Further analysis for SFPs.
5. In silico analysis of 37,473 SFPs in six crosses.
6. Validation of SFPs in two germplasm (JKC 703 x JKC 770).

No. of Selected SFPs     224
No. of SNPs found        122
No. of indels found      10

Srivastava et al. (2012) Communicated

SSRs/SNPs/SFPs development from Cotton at NBRI

NBRI COTTON MARKERS

A-genome derived SSRs (genomic & expressed)           2,233
AD-genome derived SSRs (Genic enrichment)             56,142
AD-genome derived SSRs (Transcriptome sequencing)     1440
A-genome derived SNPs (Transcriptome sequencing)      592
AD-genome derived SNPs (Genic enrichment)             66,444
AD-genome derived SNPs (Transcriptome sequencing)     2,600
AD-genome derived SFPs (Microarray based)             132

Total SSRs              59,805
Total SNPs              69,768
Total Novel Markers     1,29,573

COTTON SNP CHIP (Affymetrix's Axiom myDesign Cotton Array) (CSIR-NBRI)

Targeting 50,000 SNPs for Genotyping with Mapping Population

Axiom myDesign Array: Targeted genotyping, tailored for our study

Axiom myDesign TG Array Plates enable us to:

a. Easily select relevant SNPs from our SNP database
b. Create panels of 500,000 markers per sample

A streamlined assay:

Total genomic DNA (200 ng) is amplified and randomly fragmented into 25 to 125 base pair (bp) fragments.

These fragments are purified, re-suspended, and hybridized to Axiom Genome-Wide and myDesign Array Plates.

Following hybridization, the bound target is washed under stringent conditions to remove non-specific background to minimize background noise caused by random ligation events.

Each polymorphic nucleotide is queried via a multi-color ligation event carried out on the array surface.

After ligation, the arrays are stained and imaged on the GeneTitan MC Instrument.

Deployment of COTTON SNP CHIP on Mapping Populations

1. CICR, Nagpur:

a. H X H RIL population (Fiber Traits)
b. A X He RIL population (Mapping and Fiber Traits)

2. UAS, Dharwad:

a. H X B RIL population (Fiber Traits)
b. Core Collection (Association Mapping)

3. TNAU, Coimbatore:

a. H X H RIL population (Fiber Traits)
b. H X H RIL population (Sap sucking pests)

NBRI's Cotton Database: A Webpage for Cotton Resources

II. Genes Underlying Drought Tolerance & Fiber Quality Traits

Screening of G. herbaceum genotypes on different concentrations of mannitol

                 Mannitol percentage
Accessions       Control   2%     4%     6%     8%
Vagad            100       100    100    100    86
Guj cot-21       100       100    100    82     66
RAHS-14          100       76     12     0      0
RAHS-IPS-187     100       100    16     0      0
H-17             100       100    84     62     14
AH-7GP           100       100    100    14     0
AH-127           100       100    100    22     4
AH-41            100       100    100    18     0
RAS-45           100       100    100    18     0
DB-3-12          100       100    100    64     30
RAHS-131         100       100    100    16     0
JYLEHAR          100       100    100    14     2
GH-18-2LC        100       100    100    86     22
RAHS-132         100       100    34     16     10

Effect of drought on tolerant and sensitive genotype

[Figure: tolerant genotype (Vagad) and sensitive genotype (RAHS-14).]

Screening of Cotton Genotypes for Drought Tolerance and Sensitivity

[Figure: drought sensitive and drought tolerant plants under continuous watering and under 1 week alternate watering.]

Ranjan et al., BMC Genomics (2012) 13:94

Physiological Parameters in Response to Drought in Vagad and RAHS-14

[Figure: RAHS-14 and Vagad.]

Properties of Vagad

a. Reduced stomatal conductance (gs)
b. Decreased transpiration rate (E)
c. Reduced water potential (WP)
d. Higher relative water content (RWC)

Leading to better water use efficiency (WUE).

Vagad has an inherent ability to sense drought at a much earlier stage and to respond to it much more efficiently.

Ranjan A et al., BMC Genomics 2012 March

Transcriptional profiling during drought and water condition in Leaf tissue of Vagad and RAHS-14

Pyrosequencing data

Parameters                                                          Vagad library   RAHS-14 library
Total reads (overlap size of 100 bp and 96% identity)               85368           56354
Total contigs (100 bp or greater)                                   11439           6313
Singleton                                                           24087           20780
Exemplar                                                            31244           23155
Average length of contigs                                           350 bp          180 bp
Number of contigs with greater than 500 bp                          946             705
Number of genes with significant hit in NCBI NR database            10772           10408
Number of genes with significant hit in cotton EST database         16301           13822

Microarray data

Genotypes           Differentially up regulated genes (Fold change 2)
Vagad water         656
RAHS-14 water       535
Vagad drought       430
RAHS-14 drought     411

Ranjan A et al., BMC Genomics 2012 March

Genome wide gene expression profiling of leaf tissue of Vagad and RAHS-14

[Figure: categories shown for Vagad and RAHS-14: Propanoid pathway, Pigment biosynthesis, Polyketide biosynthesis, Responses to various abiotic stresses, Secondary metabolite pathways, Ethylene responsive factor, WRKY, Programmed cell death, Senescence, Lipid metabolism.]

Ranjan A et al., BMC Genomics 2012 March

Comparative root Transcriptome Analysis of Drought Tolerant and Sensitive Genotypes of G. herbaceum

Pyrosequencing data

Genotype                            Reads      Bases        Contigs   Singleton   Av. contig length   Av. singleton length
GujCot-21 (drought tolerant)        55,620     13,020,140   1,281     30,501      481.7 bp            237.8 bp
RAHS-IPS 187 (drought sensitive)    49,308     11,199,207   858       30,776      532.9 bp            228.6 bp
Supercontigs                        1,04,928   24,219,347   2,664     50,531      508.7 bp            231.7 bp

[Figure: Root architectures of the drought tolerant and drought sensitive genotypes.]

Microarray data

Genotypes           Differentially up regulated genes (Fold change 2)
Vagad water         165
RAHS-14 water       156
Vagad drought       256
RAHS-14 drought     538

Ranjan A et al., BMC Genomics 2012 November

Functional enrichment of genes of root tissue in drought tolerant and sensitive genotypes

[Figure: tolerant genotype and sensitive genotype.]

Regulation of Transcription factors (TFs) under drought stress

Ranjan A et al., BMC Genomics 2012 November

Differentially expressed genes analyzed by Genevestigator in mapping the specific expression of genes in different root zones
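The microarray tables count genes as differentially up-regulated at a fold change of 2. A minimal Python sketch of such a count is given here for illustration. The gene IDs and expression values are hypothetical, the function name is made up, and treating the cut-off as "at least 2-fold on a linear scale" is an assumption, since the transcript does not preserve the comparison sign or the normalization used.

```python
def count_upregulated(treatment, control, min_fold=2.0):
    """Count genes whose treatment/control expression ratio is >= min_fold.

    Both arguments map gene IDs to linear-scale expression values.
    Genes missing from the control, or with a non-positive control value,
    are skipped rather than guessed at.
    """
    n_up = 0
    for gene, value in treatment.items():
        base = control.get(gene)
        if base is None or base <= 0:
            continue
        if value / base >= min_fold:
            n_up += 1
    return n_up


# Hypothetical expression values, not data from the presentation
drought = {"g1": 40.0, "g2": 10.0, "g3": 9.0, "g4": 5.0}
watered = {"g1": 10.0, "g2": 10.0, "g3": 4.5, "g5": 1.0}
print(count_upregulated(drought, watered))  # 2
```

A real analysis would normally pair the fold-change cut-off with a statistical test across the biological replicates.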

Ranjan et al., BMC Genomics (2012) 13:680

Selection of Candidate Gene for Studying the Abiotic Response

Identification of Transcription Activator (TA) from Cotton Transcriptome of root tissue (TL: tolerant leaf, TR: tolerant root, SL: sensitive leaf, SR: sensitive root)

Library name                    TL TR SL SR
Number of Transcript (tpm)      07200

Expression of TA

Increased tolerance to drought and salt stress and better root development in tobacco over expressing GheTA

[Figure: WT and GheTA plants on control, 50 mM, 150 mM, 200 mM and 250 mM Mannitol, and on control, 50 mM, 75 mM, 100 mM and 150 mM NaCl.]

Abiotic stress tolerance of the GheTA over-expressing tobacco transgenic plants by leaf disk assay

[Figure: WT and GheTA leaf disks on control, 5 % PEG, 10 % PEG, 100 mM NaCl, 150 mM NaCl, 250 mM Mannitol and 500 mM Mannitol.]

Over-Expression of GheTA leads to increased root biomass and better WUE in cotton transgenic plants

[Figure: wild type and cotton transgenic plants.]

Carbon Isotope Discrimination ratio shows higher water use efficiency (WUE) of cotton TA transgenic plants

Expressional Reprogramming During Fiber Development In Contrasting Genotypes of G. hirsutum

Fiber quality of genotypes (columns grouped on the slide as superior genotypes and inferior genotypes)

Fiber Quality Parameters     JKC 725     JKC 777     JKC 703     JKC 737     JKC 783
2.5% span length (mm)        30.5-32.5   30.5-32.5   21.5-23.5   25.5-26.5   23.5-25.5
Fiber strength (g/tex)       24.5-26.0   24.5-26.0   21.5-22.5   21.5-22.5   22.5-23.5
Fineness (micronaire)        3.7-4.0     3.7-4.0     4.0-4.3     3.2-3.5     4.0-4.2

[Figure: Fiber cellulose content and fiber lignin content.]

Nigam et al. (2013) Communicated

Method used for Analyzing Microarray Data from Contrasting Cotton Genotypes

Microarray data analysis

Two way ANOVA study:

a. Genotype significant genes
b. DPA significant genes
c. Interaction significant genes

Downstream analyses: Singular Enrichment Analysis (SEA), MapMan Bins, Cluster Analysis

[Figure: Cluster analysis, Cluster-1 to Cluster-6, expression across 0, 6, 9, 12, 19 and 25 DPA.]

Enrichment of Transcription factors

[Figure: panels A and B, superior genotypes and inferior genotypes.]

25 DPA Fiber Transcriptome: Assembly and Annotations

De novo and merged assembly

Parameters                              JKC 703   JKC 777   Merged assembly of both genotypes
Total reads generated                   547939    488128    1036067
Total bases generated (Mb)              168.4     160.9     329.3
Average read size (bp)                  307       330       318
High Quality reads used in assembly     529496    473666    972907
All Contigs (>100 bp)                   17900     16457     21308
Singletons                              53983     45023     81120
Total bases after assembly (Mb)         24.7      22.3      37.2
Large contigs (>500 bp)                 8752      8153      12947
Largest contig size (bp)                3560      4991      5008
Average contig size (bp)                860       823       936
N50 contig size (bp)                    893       838       987
Aligned Reads (%)                       86.8      88.0      86.9
Aligned Bases (%)                       86.4      87.4      85.6
Inferred read error (%)                 2.0       2.0       1.8
Q40 plus bases (%)                      94.0      94.1      95.5

Annotation of unigenes

Parameters                       JKC 703   JKC 777   Merged assembly of both genotypes
Total unigenes                   71883     61480     102428
Hits in 'NCBI nr' database       40134     36844     56390
Hits in 'tair9' database         33,359    37,017    51120
Hits in ESTScan                  44592     40570     20,852
Differential unigenes            2156      2076      -

Correlation between Cotton Fiber Microarray and Transcriptome Sequencing

R2 = 0.68, p-value = 0.001

Hypothetical regulatory model showing over-represented genes and pathways in Superior and Inferior genotypes

[Figure: model for the superior and inferior genotypes across 0, 6, 9, 12, 19 and 25 DPA (initiation, elongation, SCW). Labels on the figure include:
Hormones and signalling: Brassinosteroid signalling (BR, BES1, DET2, BIN2), Brassinosteroid targeted gene expression, JA, SA (?), Auxin, GA, ABA, Calcium signalling (Ca++/CAM1), induction of stress hormone signals like ethylene and ABA.
Genes and gene groups: Phospholipases (Pat5, Pat6), cell wall enzymes (PG, PAE, PME, PMEI), mRNA and protein degradation (ubiquitin ligase, proteasome, spliceosome), transporter machinery (ABC transporter), HSPs, PRF5, SPL3, SPL5, MEE 59, Oxylipins, ALDH, APY2, ABI3/VP1, EXL5, AGPs, LIM domain, S3, Asp family, Ascorbate peroxidase, Glutathione-S-Transferase, POX.
Transcription factors: C2C2-GATA, NAC, WRKY, AP2-EREBP, HSF.
Processes: efficient energy source for fast elongating fiber (oxidative phosphorylation, e.g. TCA, glycolysis); large number of ribosomal subunits (better protein synthesizing machinery); increased stress tolerance to facilitate elongation process up to its completion; continued energy providing machinery; fibre cells continuative elongation; pectin modification and cell wall loosening; flavonoid biosynthesis; increased stress environment within fibre cell and complete end of cell elongation process; decreased high elongation rate of fibre cells during elongation period; energy exhaust; ROS, H2O2, oxidative stress, lipid peroxidation; SCW formation starts; end of cell elongation; cell death.]

Positions of mapped differentially expressed genes on QTL

Column key: "% chr" is % covered by QTL in a chr; "Genes in QTL" is total no. of genes in the QTL region; "Mapped in QTL" is completely mapped genes in QTL region (our data); "Mapped on chr" is total mapped genes on chromosome (our data).

QTL location   QTL start   QTL end    QTL size   QTL size (Mb)   Chr size   Chr size (Mb)   % chr   Genes in QTL   Mapped in QTL   Mapped on chr
Chr 1          17773028    45685292   27912264   27.91           55868233   55.86           49.9    2757           50              138
Chr 2          1390547     20704735   19314188   19.31           62769430   62.76           30.7    2148           49              130
Chr 3.1        3185870     35433985   32248115   32.24           45765648   45.76           70.4    2986           29              92
Chr 3.2        42011485    45719191   3707706    3.707           45765648   45.76           8.1     884            18              92
Chr 4          50443620    58823186   8379566    8.37            62178258   62.17           13.47   1008           32              133
Chr 5          4002919     62380583   58377664   58.37           64140413   64.14           91.0    11474          77              109
Chr 6          14491832    56506108   42014276   42.01           51074515   51.07           82.2    1334           45              114
Chr 7.1        3099961     5211929    2111968    2.11            60982465   60.98           3.4     295            13              172
Chr 7.2        13773747    57194167   43420420   43.42           60982465   60.98           71.2    4441           68              172
Chr 08         37014370    55710457   18696087   18.69           57128820   57.12           32.7    4592           79              147
Chr 09         4942098     29728227   24786129   24.78           70713020   70.71           35.0    3146           138             245
Chr 11         2519220     3161683    642463     0.64            62681010   62.68           1.0     95             2               140
Total                                                                                                              600             1684

Percentage of genes mapped in QTL: 600/1684 = ~35%

[Figure: Circos plot of all the differential genes mapped on the cotton JGI genome with all the QTL.]
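Both the "% covered by QTL in a chr" column and the mapped-gene percentage are plain ratios. The Python sketch below recomputes a few of them from values read off the QTL table (sizes in bp); the helper name `percent` is hypothetical. The mapped-gene ratio comes to roughly 35.6%, in line with the ~35% on the slide.

```python
def percent(part, whole):
    """Return `part` as a percentage of `whole`."""
    return 100.0 * part / whole


# (QTL location, QTL size in bp, chromosome size in bp), three rows of the table
qtl_rows = [
    ("Chr 1", 27912264, 55868233),
    ("Chr 2", 19314188, 62769430),
    ("Chr 11", 642463, 62681010),
]
for name, qtl_bp, chr_bp in qtl_rows:
    print(f"{name}: {percent(qtl_bp, chr_bp):.2f}% of the chromosome covered by QTL")

# 600 differentially expressed genes fall inside QTL out of 1684 mapped genes
print(f"Mapped genes inside QTL: {percent(600, 1684):.1f}%")
```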

microRNA156 Targeted Transcription Factor (TA5) Governs the Boll Number, Size and Lint Yield in G. hirsutum

[Figure: Heat map of 9 TA5 transcription factors and q-RT PCR measurement; probe sets Ghi8127.1.S1_s_at (GhSPL5), Gra1201.1.A1_at (GhSPL13) and Gra1300.1.A1_at (GhSPL14).]

GhTA5 produces transcripts that are targeted by miR156

[Figure: Northern blot of miR156 & miR172 in cotton fiber.]

Phenotypic evaluation of overexpression and knockdown lines

[Figure: overexpression lines, wild type and knockdown lines; panels show number of cotton bolls, average of cotton boll per plant, seed cotton weight (gm)/plant and lint cotton weight (gm)/plant.]

Identification and characterization of fiber specific promoter in G. hirsutum (NBRI_2800)

[Figure: Expression pattern of NBRI 2800 gene in different fiber developmental stages (fold change by fiber developmental stage).]

[Figure: Histochemical localization of GUS expression in NBRI_2800 transgenic cotton plant.]

Acknowledgments

Bioinformatics: Dr. Mehar Asif, Dr. Sumit K. Bag, Dipti Nigam, Archana Bhardwaj, Ridhi Goel, Pooman Pant
Functional Genomics: Neha Pandey, Rajiv Tripathi, Vrijesh Yadav, Anshulika Sable
HMPR Sequencing & Epigenetic regulation: Sunil K. Singh, Krishan M. Rai, Verandra Kumar
Cotton Marker: Dr. S N Jena, Anukool Srivastava, Ravi P. Shukla

Collaborators: J K Agrigenetics; Tierra Seed Science; TNAU, Coimbatore; CICR, Nagpur; UAS, Dharwad

Thank You