CHARACTERIZATION OF MICROSTRUCTURAL MUTATION EVENTS IN PLASTOMES OF CHLORIDOID GRASSES (CHLORIDOIDEAE; POACEAE)
Thomas J. Hajek III, M.S.
Department of Biological Sciences
Northern Illinois University, 2014
Melvin R. Duvall, Director
Overview
• Introduction
• Hypotheses
• Research methods
• Results
• Discussion of key findings
• Conclusions
Dr. M. R. Duvall Laboratory published results (2009 - present)
• Next-generation sequencing (NGS) has increased the rate of data collection
• 2009: 1 complete plastome and a 70% complete draft using Sanger methods
• 2010: 1 plastome (all Sanger)
• 2012: 2 plastomes (all Sanger)
• 2013-2015: ≈64 complete plastomes published using NGS, averaging 20/year for the past 3 years (1000% production increase)
• ...but there are MANY more in the pipeline
WHY GRASS?
Grasses are BIG BUSINESS
• Cereals (rice, corn, wheat) supply ≥ 50% of human calorie intake.
• Grasses make up over 70% of all crops grown for human and livestock consumption.
Knowledge
• Knowing, with a high degree of certainty, the evolutionary relationships among these extant species.
• Complete CDS could allow for integration of genes of interest into existing commercial crops or forage graminoids.
• It is important that we understand the evolutionary relationships of grasses at a molecular level in order to manage ecosystems, bio-engineer species resistant to plant pathogens, and produce high-yielding commercial crops.
A brief background
• Fossil records suggest that some ancestors of the grass family (rice and bamboo) began to diversify as early as 107 – 129 Mya (Prasad et al., 2011).
• The family has radiated into 11K accepted species.
• It is the fifth largest plant family on earth (Stevens, 2007).
• It includes 12 subgroups, or subfamilies, of grasses (GPWG II, 2012).
• Grasses dominate over 40% of the land area on earth (Gibson, 2009).
Why subfamily Chloridoideae?
• Well-defined plant lineage: a monophyletic subfamily with 1420 known species of the 11K described grasses (~13%).
• Used for both human and livestock consumption.
• May have a role in bioengineering of drought-resistant crops and livestock grazing.
• Members share specific evolutionary adaptations (Peterson et al., 2010), notably C4 photosynthesis (as opposed to C3 and CAM), a more efficient form of photosynthetic carbon fixation that is effective in arid regions.
• Climate change could affect the ability of closely related species to thrive in changing environments (e.g., regions that currently produce commercial and grazing crops could become more arid).
• This knowledge could be used to produce GMOs, via genetic manipulation with genes from closely related species, that are better adapted to a changing environment.
Peterson et al. (2010)
• The Peterson study included only 6 partial gene sequences (6,789 bp) and 814 bp of ITS.
• Advances in sequencing methods have provided larger amounts of data for analysis.
• My study includes sequence for the entire chloroplast genome (plastome) (≈140 kbp x 10 spp).
Leseberg and Duvall (2009) on the complete plastome of Coix lacryma-jobi
• Plastome-scale MMEs are a potentially valuable, underutilized resource that can be used for supporting relationships.
THIS STUDY analyzed types of mutations besides substitution mutations
• These may be able to predict and define genomic relationships among species.
Microstructural Mutation Events (MMEs)
• Slipped-strand mispairing (SSM) insertions/deletions (indels)
• Non-tandem repeat (NTR) indels
• Inversions
Hypotheses
1. Of the two types of MMEs, indels occur more frequently than inversions.
2. Tandem repeat indels, i.e. those indels occurring in regions of tandemly repeated sequences, occur with greater frequency than indels not associated with such repeats.
3. MMEs that affect fewer nucleotides (shorter indels, smaller inversions) occur with greater frequency than larger MMEs.
4. Plastome-scale MMEs are an effective source of data for the inference of high resolution, highly supported phylogenies consistent with the inference from nucleotide substitutions.
Research Methods
• DNA sampling
• Sanger sequencing (E. tef)
• NextGen sequencing (NGS)
• Identification of MMEs
• Phylogenomic analyses
Research methods: DNA sampling
[Figure: sampled taxa: Hilaria, Zoysia, Neyraudia, Eragrostis tef, Bouteloua, Spartina, Distichlis, Sporobolus, E. minor, Centropodia]
Sanger Method & E. tef: DNA extraction and amplification
• Eragrostis tef seedlings were provided by Amanda Ingram of Wabash College, Crawfordsville, IN.
DNA extraction
• Leaf tissues of all four species were ground in liquid nitrogen.
• Extraction was performed using Qiagen DNeasy Plant Mini Kits (Qiagen Inc., Valencia, CA) following the manufacturer's protocol.
Amplification (PCR)
• The plastome was arbitrarily divided into 119 regions (range = 500-1,200 bp) with ~250 primer sites.
• IR primer set from Dhingra and Folta (2005); most other primers from Leseberg and Duvall (2009).
• Each primed target region is amplified by FideliTaq (Affymetrix) or Pfu (Stratagene Inc.) polymerase.
Electrophoresis
• Electrophoresis methods were used to verify the size and number of amplified DNA fragments.
• Expected size of amplicons ≈ 1200 bp.
• Ladders (ThermoFisher, Hanover Park, IL) were used in conjunction with negative controls to assure the legitimacy and size of the DNA fragments.
• DNA fragments were cleaned and purified (Wizard kit method, Promega Corp., Madison).
• PCR products were sent to Macrogen, Inc. (Seoul, Korea) for capillary Sanger sequencing.
Problems:
• Not all primers yielded amplicons of the desired size.
• Some amplicons yielded unusable sequence.
• Not all available primers actually work (primer sequence not conserved in the target sequence), so species-specific primers were designed.
Sanger Sequencing and Assembly
• Macrogen files were imported into Geneious Pro software.
• Check signal strength and distinctness of peaks in the electropherogram.
• Trim ambiguous regions of sequence with weak signals.
• Concatenate forward and reverse sequence for the specific regions that were amplified.
• Assemble contiguous sequence with ≥15 bp overlap between regions.
Also
• Design primers for regions that failed to amplify with the standard primer set.
• Annotate the complete genome for GenBank submission.
Eragrostis tef plastome: 134,435 bp
Research methods: NGS
One chloridoid plastome, from Neyraudia reynaudiana (Wysocki et al., 2014), was previously published.
Species; voucher (herbarium); institution
Bouteloua curtipendula (Michx.) Torr.; S. Burke 27 (DEK); NIU
Distichlis spicata var. stricta (Torr.) Scribn.; Saarela 677 (CAN)
Centropodia glauca (Nees) T. A. Cope; Linder 5410 (BOL); University of Cape Town, South Africa, Western Cape Province
Eragrostis minor Host; L. Clark 1333 (ISC); Iowa State University
Spartina pectinata Bosc ex Link; P. Peterson 20865 (CAN); Canadian Museum of Nature, Ontario
Sporobolus heterolepis (Gray) A. Gray; M. Duvall s. n. (DEK); NIU
Hilaria cenchroides Kunth; J. T. Columbus 5049 (RSA); Rancho Santa Ana Botanic Garden, CA
Zoysia macrantha Desv.; J. T. Columbus 5049 (RSA); Rancho Santa Ana Botanic Garden, CA
NextGen Sequencing Methods & Materials
Library preparation & NGS sequencing
• D. spicata and H. cenchroides: diluted to 2 ng/μl; DNA sonicated using the Bioruptor sonicator at the University of Missouri; libraries prepared using the TruSeq (Illumina) kit.
• B. curtipendula, S. pectinata, S. heterolepis, E. minor, C. glauca, Z. macrantha: diluted to 2.5 ng/μl; tagmentation vs. sonication; libraries prepared and purified using the Nextera Illumina library preparation kit and the DNA Clean and Concentrator kit.
• Both library types were submitted to the DNA core facility (Iowa State University, Ames, IA) for bio-analysis and HiSeq 2000 next-generation sequence determination.
NGS quality control
• Illumina reads (1 - 32 Mbp @ 100 bp each)
• DynamicTrim = (FASTQ) quality score filter
• LengthSort = retain reads ≥ 25 bp
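The two quality-control steps can be sketched as below. This is a minimal illustration, assuming that the trimming step keeps the longest contiguous stretch of bases whose Phred score meets a cutoff and that the length step then drops reads shorter than 25 bp. The function names, the cutoff of 20, and the Phred+33 encoding are assumptions made for the example, not the settings used in the study.

```python
MIN_LENGTH = 25  # from the slide: retain reads >= 25 bp


def trim_read(seq, qual, min_q=20, offset=33):
    """Return the longest contiguous run of bases with Phred quality >= min_q.

    min_q and offset (Phred+33) are assumed values for illustration only.
    """
    best_start, best_len = 0, 0
    run_start, run_len = 0, 0
    for i, ch in enumerate(qual):
        if ord(ch) - offset >= min_q:
            if run_len == 0:
                run_start = i
            run_len += 1
            if run_len > best_len:
                best_start, best_len = run_start, run_len
        else:
            run_len = 0
    end = best_start + best_len
    return seq[best_start:end], qual[best_start:end]


def quality_filter(reads, min_length=MIN_LENGTH):
    """Trim each (seq, qual) pair, then keep only reads of at least min_length."""
    trimmed = (trim_read(seq, qual) for seq, qual in reads)
    return [(s, q) for s, q in trimmed if len(s) >= min_length]
```

A read whose only low-quality base falls near the middle can end up shorter than 25 bp after trimming and is then discarded by the length filter.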
Plastome Assembly
• Velvet (de novo) assembly
• Contig assembly via anchored conserved region extension, ACRE (Wysocki, 2014)
• Gaps in the plastomes that were not resolved using ACRE were closed by extracting sequences from the flanking contigs and matching them to the reads produced by NGS to complete the plastid genome.
[Figure labels: gap between 104-108; gap between 112-117]
• NGS assembly verified against Sanger contigs for N. reynaudiana: Sanger reads aligned to the NGS assembly confirmed sequence identity between both methods.
Examples of identifying MMEs
• Inversions: ≥ 2 bp with stem ≥ 3 bp
• Indels: ≥ 3 bp
• SSM: with unambiguous tandem repeats
Scored events with binary matrix

SSM indels
pos     type  D  B  H  S  Sp Z  E  e  N  C  #BP
7147    SSM   0  0  0  1  1  1  0  0  0  0  3
14466   SSM   0  0  0  0  0  0  0  0  1  0  3
14549   SSM   0  0  0  0  0  0  0  1  0  0  3
33041   SSM   0  0  1  0  0  0  0  0  0  0  3
36425   SSM   1  ?  ?  ?  1  1  1  1  1  0  3
45802   SSM   0  1  0  0  0  0  0  0  0  0  3
46936   SSM   0  1  0  0  0  0  0  0  0  0  3
59287   SSM   0  0  0  0  0  0  1  0  0  0  3

NTR indels
pos     type  D  B  H  S  Sp Z  E  e  N  C  #BP
9364    NTR   0  0  0  1  1  ?  0  ?  1  0  3
16559   NTR   1  1  1  1  1  1  1  1  1  0  3
19603   NTR   0  1  0  0  0  0  0  0  0  0  3
22008   NTR   1  0  0  0  0  0  0  0  0  0  3
27774   NTR   1  1  1  1  1  1  1  1  1  0  3
62266   NTR   0  0  0  1  1  0  0  0  0  0  3
68674   NTR   0  0  0  0  0  0  1  1  0  0  3
72573   NTR   0  0  1  0  0  0  0  0  0  0  3

Inversions
POS     OG SEQ     D  B  H  S  Sp Z  E  e  N  C  #BP  CDS
22      CC         0  0  0  0  0  0  0  1  1  0  2
2390    TC         1  1  1  1  0  1  0  0  0  0  2    matK1
52294   GA         0  0  0  1  1  1  0  0  0  0  2
109211  CA         0  1  0  0  0  0  1  0  0  0  2
110074  AA         0  1  0  1  1  1  0  0  0  0  2    ndhF
112304  GA         1  0  0  0  0  0  0  0  0  0  2
2667    TTG (TTC)  1  1  1  1  0  0  1  1  0  0  3    matK2
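How such a matrix feeds a parsimony analysis can be sketched as follows. Each row is one MME coded 1/0 per taxon, with ? for an unscored taxon. The helpers below are hypothetical illustrations, not part of the study's pipeline; they apply the usual criterion that a binary character is parsimony-informative only when at least two taxa share each state.

```python
# Taxon column order as in the matrix header.
TAXA = ["D", "B", "H", "S", "Sp", "Z", "E", "e", "N", "C"]

# Four SSM rows copied from the matrix: alignment position -> states per taxon.
SSM_ROWS = {
    7147: "0001110000",
    14466: "0000000010",
    36425: "1???111110",
    45802: "0100000000",
}


def is_parsimony_informative(states):
    """True if at least two taxa score 0 and at least two score 1 ('?' ignored)."""
    return states.count("0") >= 2 and states.count("1") >= 2


def taxa_with_event(states):
    """Column labels of the taxa scored 1 for this event."""
    return [taxon for taxon, state in zip(TAXA, states) if state == "1"]
```

Under this criterion the event at position 7147 (shared by S, Sp and Z) is informative, while events scored in a single taxon, such as 14466, are not.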
Phylogenomic Analysis
Phylogenomic analyses were performed using a series of five datasets for ML, MP and BI:
• [1] complete plastome sequences
• [2] the binary matrix of characterized MMEs
• [1-2] plastome sequence + binary matrix
• [3] a matrix of CDS: 78 protein CDS, four rRNA sequences, 32 tRNA sequences
• [4] all non-coding sequences: introns and intergenic regions
Alignment
• Ten species aligned using the Geneious Pro MAFFT plugin
• Gaps removed (eliminates ambiguities)
• One inverted repeat (IRa) removed (prevents overrepresentation of sequence)
• MMEs added 605 characters to the sequence matrix: 581 indels + 24 inversions
• Centropodia glauca specified as outgroup (OG) for all phylogenomic (ML, MP and BI) analyses
Five maximum-likelihood (ML) analyses
• jModelTest 2
• RAxML-HPC2 on XSEDE (CIPRES): GTRCAT for plastome sequences, BINCAT for the MME binary matrix
• 1000 bootstrap (BS) iterations
• ML bootstrap values (MLBVs) via the Consense tool (Phylip software package on CIPRES)
• Phylogenomic trees were visualized and edited using FigTree v1.4.0
Five branch-and-bound maximum parsimony (MP) analyses
• PAUP* v4.0b10
• MP branch-and-bound bootstrap analyses were performed using 1,000 replicates in each case
Five Bayesian inference (BI) analyses
• MrBayes 3.2.2 on XSEDE (CIPRES)
• Two Markov chain Monte Carlo (MCMC) analyses of 20,000,000 generations each
• Model for among-site rate variation set to invariant gamma
• Burn-in fraction set at 0.25 (sampled values discarded) to generate 50% majority-rule consensus trees
RESULTS
Plastome Assembly, Annotation, and Alignment
• 1,216,882 bases of new plastid sequence added to the GenBank database
• The plastomes share the highly conserved gene content and gene order typical of the grass plastome
Plastome characterization (region lengths in bp)
Species          LSC    IRb    IRa    SSC    Total   % AT
B. curtipendula  79309  20975  20975  12606  133865  61.8
E. tef           79802  21026  21026  12581  134435  61.6
C. glauca        80074  21012  21012  12467  134565  61.5
H. cenchroides   80238  21082  21082  12419  134821  61.7
E. minor         80316  21065  21065  12577  135023  61.8
S. heterolepis   80614  21028  21028  12692  135097  61.6
N. reynaudiana   81213  20570  20570  12744  135362  61.7
S. pectinata     80922  20985  20985  12720  135612  62.6
Z. macrantha     81351  20961  20961  12572  135845  61.6
D. spicata       82488  21226  21226  12679  137619  61.7
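The 1,216,882-base figure reported for new GenBank sequence can be reproduced from the Total column. The sketch below assumes that the figure counts the nine newly sequenced plastomes and leaves out N. reynaudiana, which was already published; the names `TOTAL_BP` and `new_sequence_bp` are illustrative.

```python
# Total plastome lengths (bp), copied from the Total column of the table.
TOTAL_BP = {
    "B. curtipendula": 133865,
    "E. tef": 134435,
    "C. glauca": 134565,
    "H. cenchroides": 134821,
    "E. minor": 135023,
    "S. heterolepis": 135097,
    "N. reynaudiana": 135362,
    "S. pectinata": 135612,
    "Z. macrantha": 135845,
    "D. spicata": 137619,
}

# Assumption: the previously published plastome (Wysocki et al., 2014) is excluded.
PREVIOUSLY_PUBLISHED = {"N. reynaudiana"}


def new_sequence_bp(totals, exclude=PREVIOUSLY_PUBLISHED):
    """Sum the plastome lengths of all species not listed in exclude."""
    return sum(bp for species, bp in totals.items() if species not in exclude)
```

Calling `new_sequence_bp(TOTAL_BP)` gives the sum for the nine new plastomes, which can be compared directly with the figure on the slide.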
Microstructural mutation scoring and analysis
Number of bases in slipped-strand mispairing (SSM) event; one row per species, last column Σ
Size (bp) 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 27 28 29 31 32 39 40 120 Σ
D. spicata 5 6 22 5 5 2 4 2 1 1 0 0 1 0 0 1 0 1 1 2 0 1 1 1 1 0 1 0 0 0 0 64
B. curtipendula 6 10 30 11 11 6 4 5 2 1 1 0 2 0 1 0 0 1 1 2 0 0 0 0 0 0 1 0 0 0 0 95
H. cenchroides 4 7 39 13 5 4 4 2 1 1 1 1 1 1 0 2 1 0 1 3 0 1 0 0 0 0 0 0 0 1 0 93
S. heterolepis 5 11 33 3 5 3 3 1 1 1 0 2 1 0 0 0 0 0 1 2 1 0 1 0 0 0 0 0 0 0 0 74
S. pectinata 6 11 31 3 4 2 4 0 2 1 0 2 1 0 0 0 0 0 0 2 0 0 0 0 0 0 0 0 0 0 0 69
Z. macrantha 7 10 32 2 2 2 4 0 1 1 0 2 1 0 0 0 0 0 0 2 0 0 0 0 0 0 0 0 0 0 0 66
E. tef 4 12 27 6 3 0 5 0 1 1 0 1 1 0 1 0 0 1 0 2 0 0 0 0 0 1 0 0 0 0 1 67
E. minor 4 10 24 7 3 0 4 1 1 2 0 1 1 0 0 0 1 2 1 2 0 0 0 0 0 1 0 0 1 0 0 66
N. reynaudiana 4 8 26 5 3 0 3 1 1 1 0 0 2 0 1 0 0 0 0 2 0 0 0 0 0 0 0 1 0 0 0 58
Microstructural mutation scoring and analysis
Number of bases in non-tandem repeat (NTR) indel; one row per species, last column Σ
Size (bp) 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 28 29 30 31 34 35 36 37 39 44 45 46 48 52 55 59 63 67 75 78 84 86 88 94 117 119 121 145 159 182 391 433 Σ
D. spicata 7 9 18 13 3 3 9 6 1 0 3 1 0 2 1 3 2 1 1 0 1 1 0 2 0 0 0 1 1 0 0 0 1 1 2 1 2 0 1 0 0 2 0 1 1 1 0 0 1 1 1 1 1 1 0 1 109
B. curtipendula 5 12 16 19 6 1 8 5 2 0 3 2 0 1 1 1 3 1 1 1 0 1 0 1 0 0 1 1 0 0 0 0 1 1 2 0 1 0 0 1 1 1 1 0 1 0 1 0 0 1 0 0 0 0 0 1 105
H. cenchroides 6 11 23 15 4 2 8 9 2 1 4 1 1 1 1 2 2 2 1 1 0 0 0 1 0 0 1 1 0 1 0 0 1 1 1 0 2 0 0 0 0 1 0 0 1 0 0 0 0 1 0 0 0 0 0 1 110
S. heterolepis 7 11 22 14 3 1 5 6 0 0 6 1 0 1 0 1 2 1 0 1 1 0 0 1 0 0 0 1 0 0 0 0 1 1 2 1 1 0 0 1 0 1 0 0 1 0 0 0 0 1 0 0 0 0 1 1 97
S. pectinata 6 11 22 15 5 2 5 5 1 0 6 1 0 0 0 1 2 1 0 1 1 0 0 2 0 1 0 1 0 0 1 0 1 1 2 1 1 0 0 1 0 1 0 0 1 0 0 0 0 1 0 0 0 0 0 1 101
Z. macrantha 4 10 15 12 3 2 5 5 0 0 5 1 0 0 0 1 2 1 0 0 0 0 0 1 0 1 0 1 0 0 0 0 1 1 2 1 1 0 0 1 1 1 0 0 1 0 0 0 0 1 0 0 0 0 0 1 81
E. tef 5 16 23 10 4 4 8 3 2 0 3 2 0 2 0 1 2 1 0 0 0 0 1 0 1 0 0 1 0 0 0 1 2 1 2 0 0 1 0 0 0 0 0 0 1 1 0 0 0 1 0 0 0 0 0 1 100
E. minor 5 15 23 10 4 4 8 4 2 0 3 2 0 2 0 1 2 1 0 0 1 0 1 0 1 0 0 1 0 0 0 1 2 1 2 0 0 0 0 0 0 0 0 0 1 1 0 1 0 1 0 0 0 0 0 1 101
N. reynaudiana 5 9 15 6 2 3 7 4 0 1 2 2 0 1 0 3 2 2 0 1 0 0 0 0 0 0 0 1 0 0 0 0 1 1 1 0 1 0 0 0 1 1 0 0 0 0 0 0 0 1 0 0 0 0 0 1 74
Inversion scoring and analysis
Inversion size frequency (24 inversions identified)
Size (bp)        2  3  4  5  6  7  9  Σ
D. spicata       2  6  0  2  0  1  1  12
B. curtipendula  3  6  1  2  1  1  2  16
H. cenchroides   1  7  1  2  1  1  1  14
S. heterolepis   3  5  0  2  1  1  1  13
S. pectinata     2  4  0  2  1  1  1  11
Z. macrantha     3  2  0  2  1  1  0  9
E. tef           1  4  0  2  0  1  1  9
E. minor         1  4  0  2  0  1  1  9
N. reynaudiana   1  2  0  1  0  1  1  6
Indels in CDS
• A total of 581 indels were identified (plastome alignment)
• 28 in CDS: rpoB, rps14, rps18, clpP, rpoC1, rpoC2, matK, ycf68, ndhF and ccsA
• Range: 1-78 bp
• CDS indels = 4.8% of the total
Size (bp)        1  3  5  6  9  15  21  30  63  78  Σ
D. spicata       0  3  0  1  2  0   1   0   ?   1   8
B. curtipendula  0  1  0  2  1  1   2   0   ?   0   7
H. cenchroides   0  1  0  1  1  0   0   1   ?   0   4
S. heterolepis   0  1  0  0  1  0   0   0   0   0   2
S. pectinata     0  2  0  0  1  0   0   0   0   0   3
Z. macrantha     0  1  0  1  1  0   1   0   1   0   5
E. tef           3  2  1  2  2  0   0   0   0   0   10
E. minor         0  1  1  1  2  0   1   0   0   0   6
N. reynaudiana   0  2  0  2  0  0   1   0   ?   0   5
CDS-specific inversions (4 of 24)
Inv2 matK
Taxa             position     nucleotide sequence       AA sequence  Δ AA properties
D. spicata       2617 - 2640  ATTTTCTTTTGAAAAAAGAAAAAT  NEKSFLFI     P,A
B. curtipendula  2570 - 2593  ATTTTCTTTTGAAAATAGAAAAAT  NEKSFLFI     P,A
H. cenchroides   2605 - 2628  ATTTTCTTTTGAAAAAAGAAAAAT  NEKSFLFI     P,A
S. heterolepis   2589 - 2612  ATTTTCTTTTGAAAAAAGAAAAAT  NEKSFLFI     P,A
S. pectinata     2597 - 2620  ATTTTCTTTTTTCAAAAGAAAAAT  NEKKLLFI     (+), NP
Z. macrantha     2596 - 2619  ATTTTCTTTTTTCAAAAGAAAAAT  NEKKLLFI     (+), NP
E. tef           2585 - 2608  ATTTTCTTTTGAAAAAAGAAAAAT  NEKSFLFI     P,A
E. minor         2580 - 2603  ATTTTCTTTTGAAAAAAGAAAAAT  NEKSFLFI     P,A
N. reynaudiana   2559 - 2582  ATTTTCTTTTTTCAAAAGAAAAAT  NEKKLLFI     (+), NP
C. glauca        2604 - 2627  ATTTTCTTTTTTGAAAAGAAAAAT  NEKKFLFI     (+), A
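One way to see why a site like Inv2 is scored as an inversion rather than as a run of substitutions is to compare the mismatching window of two aligned sequences with its reverse complement. The sketch below is an illustration only, with hypothetical function names; it recognizes an inversion only when both ends of the inverted segment differ between the two sequences, so it would miss inversions whose end bases happen to match.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def differing_window(a, b):
    """Return (start, end) of the smallest slice covering all mismatches, or None."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to equal length")
    diffs = [i for i, (x, y) in enumerate(zip(a, b)) if x != y]
    if not diffs:
        return None
    return diffs[0], diffs[-1] + 1


def is_inversion(a, b):
    """True if the mismatching window of a is the reverse complement of that of b."""
    window = differing_window(a, b)
    if window is None:
        return False
    start, end = window
    return a[start:end] == reverse_complement(b[start:end])


# Inv2 matK sequences copied from the table.
D_SPICATA = "ATTTTCTTTTGAAAAAAGAAAAAT"
S_PECTINATA = "ATTTTCTTTTTTCAAAAGAAAAAT"
C_GLAUCA = "ATTTTCTTTTTTGAAAAGAAAAAT"
```

For D. spicata against S. pectinata the mismatching window is GAA versus TTC, which are reverse complements of each other. For D. spicata against C. glauca the window is GAA versus TTG, which are not, so that pair alone would not be read as a simple inversion by this check.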
Inv1 matK
Taxa             position     nucleotide sequence  AA sequence  Δ AA properties
D. spicata       2342 - 2357  TTTCTTTTGAAAAAGAAG   KKQFLL       P,A
B. curtipendula  2295 - 2310  TTTCTTTTGAAAAAGAAG   KKQFLL       P,A
H. cenchroides   2330 - 2345  TTTCTTTTGAAAAAGAGG   KKQFLP       P,A
S. heterolepis   2314 - 2329  TTTCTTTTGAAAAAGAAG   KKQFLL       P,A
S. pectinata     2322 - 2337  TTTCTTTTTCAAAAGAAG   KKKLLL       (+), NP
Z. macrantha     2321 - 2336  TTTCTTTTGAAAAAGAAG   KKQFLL       P,A
E. tef           2310 - 2325  TTTCTTCTTCAAAAGAAG   KKKLLL       (+), NP
E. minor         2305 - 2320  TTTCTTCTTCAAAAGAAG   KKKLLL       (+), NP
N. reynaudiana   2284 - 2299  TTTCTTCTTCAAAAGAAG   KKKLLL       (+), NP
C. glauca        2329 - 2344  TTTCTTCTTCAAAAGAGG   KKKLLP       (+), NP
CDS-specific inversions
ndhF
Taxa             position         nucleotide sequence    AA sequence  Δ AA properties
D. spicata       103962 - 103979  ATCCAAAAAGAACTTTTGGGG  DLFFKQP      A
B. curtipendula  100534 - 100551  ATCAAAAAAGTTCTTTTTTGA  DFFNKKS      P
H. cenchroides   101573 - 101590  ATCCAAAAATAACTTTTTTTG  DLFLKKQ      A
S. heterolepis   102038 - 102055  ATGCAAAAAGTTCTTTTGGGG  HLFNKQP      P
S. pectinata     102162 - 102179  ATGCAAAAAGTTCTTTTTGGA  HLFNKKS      P
Z. macrantha     102588 - 102605  ATGCAAAAAGTTCTTTTGGGG  HLFNKQP      P
E. tef           101078 - 101095  ATCCAAAAAGAACTTTTTGGG  DLFFKKP      A
E. minor         101632 - 101649  ATCCAAAAAGAACTTTTTGGG  DLFFKKP      A
N. reynaudiana   101895 - 101912  ATCCAAAAAGAACTTTTTTGG  DLFFKKP      A
C. glauca        101331 - 101348  ATCCAAAAAGAACTTTTTTGG  DLFFKKP      A
ccsA
Taxa             position         nucleotide sequence  AA sequence  Δ AA properties
D. spicata       108168 - 108182  TTTCGAAATTCTTTCGAT   FRNSFD       P,P
B. curtipendula  104715 - 104729  TTTCGAAAGAATTTCGAT   FRKNFD       (+), P
H. cenchroides   105580 - 105594  TTTCGAAAGAATTTTGAT   FRKNFD       (+), P
S. heterolepis   106265 - 106279  TTTCGAAAGAATTTCTAT   FRKNFY       (+), P
S. pectinata     106402 - 106416  TTTCGAAAGAATTTCTAT   FRKNFY       (+), P
Z. macrantha     106690 - 106704  TTTCGAAAGAATTTCTAT   FRKNFY       (+), P
E. tef           105125 - 105139  TTTCGAAAGAATTTAGAT   FRKNLD       (+), P
E. minor         105687 - 105701  TTTCGAAAGAATTTAGAT   FRKNLD       (+), P
N. reynaudiana   106098 - 106112  TTTCGAAAGAATTTCGAT   FRKNFD       (+), P
C. glauca        105314 - 105328  TTTCGAAAAAATTTCGAT   FRKNFD       (+), P
Phylogenomic Analysis: Dataset [1]
• ML, MP and BI have identical topology
• All BV = 100 for ML and MP except where indicated with (*), where MPBV = 58
[Figure: phylogram of the ten taxa for dataset [1]; branch labels given as SPS | MPC; scale bar 0.003]
Phylogenomic Analysis: Dataset [2]
• ML and MP have identical topology
• BI was not able to resolve B. curtipendula, H. cenchroides and D. spicata (polytomy)
• MLBV = 100 on all internal nodes except where indicated with (**), where MLBV = 92
• MPBV = 100 on all internal nodes except (*) MPBV = 75, (**) MPBV = 99, (***) MPBV = 63
[Figure: phylogram of the ten taxa for dataset [2]; scale bar 0.8]
Phylogenomic Analysis: Dataset [1-2]
• ML dataset [1-2]: BV = 100 on all internal nodes except (*) MLBV = 85
[Figure: ML phylogram of the ten taxa for dataset [1-2]; scale bar 0.004]
• MP dataset [1-2]: BV = 100 for all internal nodes except (*) MPBV = 56
[Figure: MP phylogram of the ten taxa for dataset [1-2]; scale bar 500 changes]
Phylogenomic Analysis: Dataset [3]
• ML, MP and BI have identical topology
• All BV = 100 except (*) MLBV = 59, (*) MPBV = 79
[Figure: phylogram of the ten taxa for dataset [3]; scale bar 0.003]
Phylogenomic Analysis: Dataset [4]
• ML, MP and BI have identical topology
• All BV = 100 except (*) MPBV = 85
[Figure: phylogram of the ten taxa for dataset [4]; scale bar 0.005]
DISCUSSION & Key Findings
Indel analysis
Hypothesis: indels occur more frequently than inversions
• 581 indels vs. 24 inversions
• CONFIRMS the hypothesis
Hypothesis: tandem repeat indels, i.e. those indels occurring in regions of tandemly repeated sequences, occur with greater frequency than indels not associated with such repeats
• NTR indels = 308 occurrences
• SSM indels = 275 occurrences
• REFUTES the hypothesis
• Orton (2015) had a contrary result: the taxa in this study belong to a more ancient lineage than the congeneric species in Orton’s (2015) study, so Orton’s species have had less time to accumulate subsequent mutations that obscure tandem repeat patterns
Hypothesis: MMEs that affect fewer nucleotides (shorter indels, smaller inversions) occur with greater frequency than larger MMEs
• Smaller MMEs require a lower input of energy and so would be expected to occur with frequencies inversely proportional to their size (Wu et al. 1991)
• 5 bp indels show a 1.8 to 3.4 fold increase in frequency over 4 bp indels
• Orton (2015) had a similar result: 5 bp indels ≈1.6 fold increase over 4 bp
• REFUTES the hypothesis
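The 1.8 to 3.4 fold range can be checked against the indel size tables in the results. The sketch below assumes the range was computed per species on SSM and NTR indels combined; the slides do not say so explicitly, but the combined counts reproduce the reported range. The dictionary and function names are illustrative.

```python
# (4 bp count, 5 bp count) per species, copied from the SSM and NTR size tables.
SSM_4_5 = {
    "D. spicata": (6, 22), "B. curtipendula": (10, 30), "H. cenchroides": (7, 39),
    "S. heterolepis": (11, 33), "S. pectinata": (11, 31), "Z. macrantha": (10, 32),
    "E. tef": (12, 27), "E. minor": (10, 24), "N. reynaudiana": (8, 26),
}
NTR_4_5 = {
    "D. spicata": (9, 18), "B. curtipendula": (12, 16), "H. cenchroides": (11, 23),
    "S. heterolepis": (11, 22), "S. pectinata": (11, 22), "Z. macrantha": (10, 15),
    "E. tef": (16, 23), "E. minor": (15, 23), "N. reynaudiana": (9, 15),
}


def fold_increase(species):
    """Ratio of 5 bp to 4 bp indels for one species, SSM and NTR combined."""
    ssm4, ssm5 = SSM_4_5[species]
    ntr4, ntr5 = NTR_4_5[species]
    return (ssm5 + ntr5) / (ssm4 + ntr4)


FOLDS = {species: fold_increase(species) for species in SSM_4_5}
LOWEST = min(FOLDS, key=FOLDS.get)
HIGHEST = max(FOLDS, key=FOLDS.get)
```

Under this assumption the lowest ratio belongs to E. tef and the highest to H. cenchroides, and every species shows more 5 bp than 4 bp indels.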
Small inversions
Kim and Lee (2005) postulate that small inversions are more common than large inversions
• 3 bp occurrences = 10
• 2 bp occurrences = 6
• Refutes this hypothesis
Possible result of:
• steric limitations of loop-forming regions
• errors of inversion size interpretation: the loop was absorbed by the stem regions (e.g., TACCCAATATCCTGTTGGAACAAGATATTGGGTA)
MME phylogenomics
Hypothesis: Plastome-scale MMEs are an effective source of data for the inference of high resolution, highly supported phylogenies consistent with the inference from nucleotide substitutions. Refuted
• Characterized MMEs weakened MLBV ([1] = 100 to [1-2] = 85) on nodes supporting the internal relationships of the Cynodonteae (B. curtipendula sister to D. spicata)
• MMEs changed the topology of the MP analysis for the relationships within the Cynodonteae (B. curtipendula sister to H. cenchroides) with LOW MPBVs ([1] = 58 to [1-2] = 56)
Phylogenomic analyses
• Topologies were largely stable
• Largely congruent with the conclusions of Peterson (2010; 2014) EXCEPT within Cynodonteae: the placement of B. curtipendula, D. spicata, and H. cenchroides changed depending on dataset and method
• Note that the terminal branches ARE LONG, which could produce faulty phylogenomic inferences through long-branch attraction (Felsenstein, 1978)
• “homoplasious character state changes on different long terminal branches could be a source of error when conducting phylogenetic analyses”.
[Figure: MP phylogram for dataset [1-2]; scale bar 500 changes; BV = 100 for all internal nodes except (*) MPBV = 56]
Phylogenomic analyses: Dataset [1]
• Plastome-scale datasets include a larger number of informative characters compared to previous studies.
• Recent findings (Duvall et al., in review) show that the sister relationship between B. curtipendula and D. spicata is more strongly supported under ML, MP and BI when additional plastome sequences from congeneric species are added to the matrix.
[Figure: phylograms for datasets [1] and [2]. Dataset [2]: (*) MLBV = 100, (*) MPBV = 75]
Phylogenomic analyses: Dataset [2]
• Only 605 characters, of which 212 are parsimony-informative
• B. curtipendula and H. cenchroides share more homoplasious MMEs
[Figure: phylograms for dataset [1], ML dataset [1-2] and MP dataset [1-2]. Dataset [1]: (*) MLBV = 100, (*) MPBV = 58. ML [1-2]: (*) MLBV = 85. MP [1-2]: (*) MPBV = 56]
[Figure: phylograms for datasets [1] and [3]. Dataset [1]: (*) MLBV = 100, (*) MPBV = 58. Dataset [3]: (*) MLBV = 59, (*) MPBV = 79]
• B. curtipendula and H. cenchroides share homoplasious sequence identity in CDS. Note: low BVs
[Figure: phylograms for datasets [1] and [4]. Dataset [1]: (*) MLBV = 100, (*) MPBV = 58. Dataset [4]: (*) MLBV = 100, (*) MPBV = 85]
• B. curtipendula and D. spicata share homologous sequence identity in non-coding regions
Conclusions
Conventional phylogenetic analyses that utilize CDS only
• CDS alone no longer appears to be a reliable means of defining lineages
• Topology of dataset [3]: Cynodonteae NOT congruent with previous work; ML, MP and BI produced a tree with B. curtipendula sister to H. cenchroides
• Produces phylogenomic trees with low BVs: BVs for B. curtipendula sister to H. cenchroides are low (MLBV = 59 and MPBV = 79)
• Recent studies are showing that B. curtipendula is sister to D. spicata when more congeneric species are added to the matrix (Duvall, unpublished).
Conclusions: plastome-scale analysis (dataset [1])
• Most informative type of dataset for drawing inferences
INCREASED BVs for:
• divergence of Eragrostideae before Zoysieae and Cynodonteae: from MLBV = 90 to MLBV|MPBV = 100|100
• the relationship between the subtribes Zoysiinae (Z. macrantha) and Sporobolinae (S. heterolepis and S. pectinata): from MLBV = 81 to MLBV|MPBV = 100|100
• the relationship between sister tribes Zoysieae (Z. macrantha, S. pectinata and S. heterolepis) and Cynodonteae (B. curtipendula, D. spicata and H. cenchroides): from MLBV = 90 to MLBV|MPBV = 100|100
• the Zoysieae as sister to the Hilariinae (H. cenchroides), Monanthochloinae (D. spicata) and Boutelouinae (B. curtipendula) clade: from MLBV = 85 to MLBV|MPBV = 100|100
• the sister relationship of B. curtipendula with D. spicata: from MLBV = 77 to MLBV = 100. NOTE: MPBV = 58 (LBA artifact)
Conclusions: indel analysis
• The 5 bp size class of indels occurs with the highest frequency.
• It is unknown whether this trend is a result of some uncharacterized facet of the energetics of slippage, a limitation on mutation recognition systems, some feature of DNA repair mechanisms in the plastid, or an artifact of indel scoring.
Future applications
• The way in which microstructural mutations arise in plastomes is not well understood.
• The exact way in which cpDNA repair mechanisms function remains elusive.
• Further investigation into identifying the gene products that are responsible for cpDNA damage repair is paramount for a better understanding of the mechanisms responsible for indels and inversions and for improving our knowledge of chloroplast genome evolution.
Questions?
Acknowledgments
Dr. Mel Duvall, Dr. Joel Stafstrom, Dr. Thomas Sims, Bill Wysocki, Sean Burke, Lauren Orton, Joseph Cotton
Extra slides
[Figure: Bouteloua curtipendula, Spartina pectinata, Distichlis spicata, Centropodia glauca]
Human
• Eragrostis tef (Africa): millet/quinoa
• Bouteloua curtipendula: ornamental, drought-tolerant gardens / erosion control
Livestock
• Zoysia macrantha (AU): thrives in highly acidic to alkaline soils.
Note: some members of this subfamily (such as Z. macrantha) may have unknown evolutionary adaptations that may benefit bioengineering of drought-tolerant crops
Conclusions: hypotheses revisited
1) Of the two types of MMEs, indels occur more frequently than inversions. Confirmed
• 581 indels vs. 24 inversions
2) Tandem repeat indels (SSM) occur with greater frequency than indels not associated with such repeats (NTR). Refuted
• Tandem repeats could have been obscured by subsequent substitution events
• [Figure: SSM in replicating DNA] Tandem repeats can either be excised or duplicated depending on the +/- strands (3’→5’ (insertion) or 5’→3’ (deletion))
3) Smaller MMEs occur with greater frequency than larger MMEs. Refuted
• Increase of 1.8 – 3.4 fold of 5 bp over 4 bp indels
• Consistent with Orton’s recent MS findings (1.6 fold increase)
• Unknown if result of: an uncharacterized facet of the energetics of slippage; a limitation of mutation recognition systems; some feature of the plastid DNA repair mechanism; or just an artifact of indel scoring
Primer design
Conserved sequences that flanked the incomplete region were selected from the existing sequences so that each newly designed primer satisfied the following criteria:
at least 25 bp
3’ G or C anchor
minimum GC content of 50%
minimum melting temperature (Tm) of 50ºC
hairpin ΔG > -6.0
self-dimer ΔG > -6.0
heterodimer ΔG > -6.0
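The sequence-only criteria above (length, 3’ anchor, GC content) can be screened with a few lines of Python before a candidate goes to a thermodynamic tool. This is a minimal sketch, not the workflow used in this study: the function names and the example primers are hypothetical, and the Tm and ΔG criteria are deliberately left out because they depend on the thermodynamic model chosen.

```python
def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in a primer sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def passes_basic_criteria(primer: str, min_len: int = 25, min_gc: float = 0.50) -> bool:
    """Check length, 3' G/C anchor and minimum GC content only.

    Tm and hairpin/dimer delta-G are NOT evaluated here (assumption:
    those are checked separately with a dedicated tool).
    """
    primer = primer.upper()
    return (
        len(primer) >= min_len
        and primer[-1] in "GC"          # 3' anchor is the last base written 5'->3'
        and gc_fraction(primer) >= min_gc
    )


# Hypothetical candidates, written 5'->3'
good = "ATGCGTACCGGATTCGCAGGTCCAG"   # 25 nt, 60% GC, ends in G
bad = "ATGCGTACCGGATTCGCAGGTCCAA"    # same length, but ends in A
print(passes_basic_criteria(good), passes_basic_criteria(bad))
```

Candidates that pass this screen would still need the Tm and ΔG checks listed above.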
~80 bp hole
Primer design (cont’d)
Geneious Pro 5.5.6 (Biomatters Ltd, Auckland, NZ) software was initially used to generate a list of potential primer sequences.
Potential primer sequences were analyzed with a web tool (Oligoanalyzer) from www.idtdna.com/site.
The Grass Phylogeny Working Group II (GPWG II)
This laboratory is involved in a worldwide collaboration of plant systematists and plant biologists, the Grass Phylogeny Working Group II (GPWG II), who pool their research to work out a well-supported evolutionary history of the entire family.
The data obtained from the work of this laboratory will aid in determining on a fine scale the exact relationships between all ten of the representative grasses.
Greater support values for determining these relationships.
Polymerase chain reactions (PCR) (ASAP01 program)
For primers designed by Dhingra and Folta (2005) and Leseberg and Duvall (2009):
50 μl mixture consisting of 1.5 μl forward primer, 1.5 μl reverse primer (each diluted 1:40 with HOH), 1.5 μl DNA template, 0.4 μl dNTPs (1:1:1:1), 5.0 μl 10x TBE buffer, 39.6 μl HOH and 0.5 μl PFU Turbo Polymerase (Stratagene Inc, Carlsbad, CA).
Fidelitaq® was also used when PFU failed to produce amplicons.
GeneAmp® PCR System 2700 was used for DNA amplification using program ASAP01 with the following parameters:
94ºC for 4.0 min with 10 cycles PCR touchdown (55ºC to 50ºC) at 40 seconds each to assure primer specificity would not preclude DNA amplification.
72ºC for 3.0 min; 35 cycles at 94ºC for 40 sec each, 50ºC for 40 sec, then 72ºC for 3.0 min with a final extension time of 7.0 min at 72ºC.
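As a quick arithmetic check, the per-reaction volumes listed for the 50 μl mixture can be summed and scaled to a batch of reactions. The sketch below is illustrative only: the dictionary keys simply restate the slide, and the `overage` parameter is an assumed convenience for pipetting loss, not something described in this presentation.

```python
# Per-reaction volumes in microlitres, as listed for the 50 ul mixture.
RECIPE_UL = {
    "forward primer (1:40)": 1.5,
    "reverse primer (1:40)": 1.5,
    "DNA template": 1.5,
    "dNTPs": 0.4,
    "10x TBE buffer": 5.0,
    "HOH": 39.6,
    "PFU Turbo polymerase": 0.5,
}


def total_volume(recipe: dict) -> float:
    """Sum of all component volumes, rounded to avoid float noise."""
    return round(sum(recipe.values()), 2)


def scale_recipe(recipe: dict, n_reactions: int, overage: float = 0.0) -> dict:
    """Scale every component to n_reactions.

    overage is a hypothetical extra fraction (e.g. 0.1 for 10%) to cover
    pipetting loss; it is an assumption, not part of the original protocol.
    """
    factor = n_reactions * (1.0 + overage)
    return {name: round(vol * factor, 2) for name, vol in recipe.items()}


print(total_volume(RECIPE_UL))       # components should add up to the stated 50 ul
print(scale_recipe(RECIPE_UL, 8))
```

The listed components do add up to the stated 50 μl reaction volume.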
Electrophoresis
Electrophoresis methods were used to verify the size and number of amplified DNA fragments.
Expected size of amplicons ≈ 1200 bp
PCR products were run on a 0.8-1.0% agarose gel in TBE buffer for 50 min at 100 V.
High and low ladders (ThermoFisher, Hanover Park, IL) were used in conjunction with negative controls to assure the legitimacy and size of the DNA fragments.
DNA fragments were cleaned and purified (Wizard kit method, Promega Corp., Madison).
PCR products were sent to Macrogen, Inc. (Seoul, Korea) for capillary Sanger sequencing.
Not all primers amplified…
An alternate PCR program (ASAPCL) was created to be used in conjunction with the new primers that were designed.
Parameters for this program: 94ºC for 4.0 min; 40 cycles at 94ºC for 40 sec each, 50ºC for 40 sec, then 72ºC for 3.0 min with a final extension time of 7.0 min at 72ºC.
NO TOUCHDOWN
Primer sequences identical to template; primer specificity should not preclude DNA amplification
Macrogen result example: check and trim
Forward and reverse sequences were pairwise aligned to produce a small consensus sequence
≥15 bp overlap
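The overlap step can be illustrated with a simplified sketch. It assumes an exact suffix/prefix match, whereas a real pairwise alignment (as performed in the assembly software) tolerates mismatches and uses base quality; the function names and the toy reads are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Reverse complement, needed to bring the reverse read onto the forward strand."""
    return seq.translate(COMPLEMENT)[::-1]


def merge_by_overlap(left: str, right: str, min_overlap: int = 15):
    """Join two reads if a suffix of `left` exactly equals a prefix of `right`.

    Returns the merged sequence, or None when no overlap of at least
    `min_overlap` bases exists. Exact matching is a simplifying assumption.
    """
    longest = min(len(left), len(right))
    for k in range(longest, min_overlap - 1, -1):   # prefer the longest overlap
        if left[-k:] == right[:k]:
            return left + right[k:]
    return None


# Toy example: 16 bp shared between the forward read and the reverse read
shared = "ACGTACGTTGCAGGCT"
forward_read = "GGGG" + shared
reverse_read = reverse_complement(shared + "CCCC")   # as it would come off the sequencer
consensus = merge_by_overlap(forward_read, reverse_complement(reverse_read))
print(consensus)
```

The same function could be applied again to adjacent consensus sequences, with a larger `min_overlap`, to mimic contig building.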
Adjacent region consensus sequences were assembled to make contigs
~200 bp overlap
Continued until
Annotation of CDS
Completed plastomes were pairwise aligned to an already annotated genome and annotations were transferred with ≥ 70% identity.
CDS extracted and checked for proper reading frames and manually adjusted when necessary
CDS sequences were extracted and translated into AA sequence to determine proper reading frames.
Annotations manually adjusted to give proper reading frames
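The reading-frame check can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the procedure built into the annotation software: it uses the standard genetic code, treats ATG as the only start codon, and the function names are hypothetical.

```python
BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(cds: str) -> str:
    """Translate complete codons; unknown codons (e.g. with N) become X."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    return "".join(CODON_TABLE.get(cds[i:i + 3], "X") for i in range(0, usable, 3))


def frame_problems(cds: str) -> list:
    """List reasons why an extracted CDS may need manual adjustment."""
    problems = []
    if len(cds) % 3:
        problems.append("length not a multiple of 3")
    protein = translate(cds)
    if not protein.startswith("M"):
        problems.append("no ATG start codon")   # assumption: ATG start only
    if not protein.endswith("*"):
        problems.append("no terminal stop codon")
    if "*" in protein[:-1]:
        problems.append("internal stop codon")
    return problems


print(translate("ATGAAATAA"), frame_problems("ATGAAATAA"))
print(frame_problems("ATGTAAAAATAA"))
```

An empty list means the extracted CDS translates cleanly; any entry flags an annotation to inspect by hand.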
Extracted flanking sequence from area around hole was aligned to NextGen sequence reads.
MME Scoring and Analyses
Insertions/deletions (Indels)
• These events were scored if they were ≥3 bp in length
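The ≥3 bp indel threshold can be applied to a gapped pairwise alignment as in the sketch below. It is a simplified illustration: the function name and toy alignments are hypothetical, and "insertion"/"deletion" are only relative to the reference row, since true polarity requires an outgroup.

```python
import re


def score_indels(aligned_ref: str, aligned_query: str, min_len: int = 3) -> list:
    """Return (start_column, length, kind) for each gap run of at least min_len.

    A gap run in the query row is reported as a deletion and a gap run in
    the reference row as an insertion, both relative to the reference only.
    """
    if len(aligned_ref) != len(aligned_query):
        raise ValueError("aligned sequences must have equal length")
    events = []
    for row, kind in ((aligned_query, "deletion"), (aligned_ref, "insertion")):
        for gap in re.finditer(r"-+", row):
            length = gap.end() - gap.start()
            if length >= min_len:            # shorter gaps are not scored
                events.append((gap.start(), length, kind))
    return sorted(events)


# Toy alignment: a 4 bp gap in the query is scored, a 2 bp gap in the reference is not
ref = "ACGTTTGACCA--GT"
qry = "ACG----ACCATTGT"
print(score_indels(ref, qry))
```

Events scored this way could then be encoded as presence/absence characters per taxon.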
Inversions: reverse complement base pairing
• Sequence was manually searched for inversions and annotated with base-complement loop-forming regions.
• Scored if ≥2 bp with stem ≥3 bp
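The scoring rule can be made concrete with two matK sequences from Table 12-a (D. spicata and S. pectinata). The sketch below reads the criterion as "inverted loop of at least 2 bp flanked by a stem of at least 3 complementary bp", which is an interpretation of the slide; the function names are hypothetical and the inversions in this study were found by manual search, not by this code.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    return seq.translate(COMPLEMENT)[::-1]


def stem_length(left_flank: str, right_flank: str) -> int:
    """Count consecutive bases, moving outward from the loop, that can pair."""
    n = 0
    for a, b in zip(reversed(left_flank), right_flank):
        if a.translate(COMPLEMENT) != b:
            break
        n += 1
    return n


def is_scored_inversion(seq_a: str, seq_b: str, start: int, end: int,
                        min_loop: int = 2, min_stem: int = 3) -> bool:
    """True if seq_b[start:end] is the reverse complement of seq_a[start:end]
    and the flanks of seq_a form a stem of at least min_stem bp."""
    loop_a, loop_b = seq_a[start:end], seq_b[start:end]
    if len(loop_a) < min_loop or loop_a == loop_b:
        return False
    if loop_b != reverse_complement(loop_a):
        return False
    return stem_length(seq_a[:start], seq_a[end:]) >= min_stem


d_spicata = "TTTCTTTTGAAAAAGAAG"     # loop GA at positions 8-9 (0-based)
s_pectinata = "TTTCTTTTTCAAAAGAAG"   # loop TC, the reverse complement of GA
print(stem_length(d_spicata[:8], d_spicata[10:]))
print(is_scored_inversion(d_spicata, s_pectinata, 8, 10))
```

In this pair the 2 bp loop GA/TC sits between flanks that can pair over 7 bp, so it meets both thresholds.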
Each event type scored separately
#BP  2 2 2 2 2 2  Σ   3 3 3 3 3 3 3 3 3 3  Σ   4 4  Σ   5 5  Σ   6  Σ   7  Σ   9 9  Σ
D    0 1 0 0 0 1  2   1 1 1 1 0 0 0 0 1 1  6   0 0  0   1 1  2   0  0   1  1   0 1  1
B    0 1 0 1 1 0  3   1 1 1 1 0 0 1 0 0 1  6   0 1  1   1 1  2   1  1   1  1   1 1  2
H    0 1 0 0 0 0  1   1 1 1 1 1 0 1 0 0 1  7   1 0  1   1 1  2   1  1   1  1   0 1  1
S    0 1 1 0 1 0  3   1 0 1 1 0 0 0 1 0 1  5   0 0  0   1 1  2   1  1   1  1   0 1  1
Sp   0 0 1 0 1 0  2   0 1 1 1 0 0 0 0 0 1  4   0 0  0   1 1  2   1  1   1  1   0 1  1
Z    0 1 1 0 1 0  3   0 0 1 1 0 0 0 0 0 ?  2   0 0  0   1 1  2   1  1   1  1   0 ?  0
E    0 0 0 1 0 0  1   1 0 0 1 0 1 1 0 0 0  4   0 0  0   1 1  2   0  0   1  1   0 1  1
e    1 0 0 0 0 0  1   1 0 0 1 0 1 0 0 0 1  4   0 0  0   1 1  2   0  0   1  1   0 1  1
N    1 0 0 0 0 0  1   0 0 1 1 0 0 0 0 0 0  2   0 0  0   ? 1  1   0  0   1  1   0 1  1
C    0 0 0 0 0 0  0   0 0 0 0 0 0 0 0 0 0  0   0 0  0   0 0  0   0  0   0  0   0 0  0

Inversion Size Frequency
Size (bp)        2  3  4  5  6  7  9  Σ
D. spicata       2  6  0  2  0  1  1  12
B. curtipendula  3  6  1  2  1  1  2  16
H. cenchroides   1  7  1  2  1  1  1  14
S. heterolepis   3  5  0  2  1  1  1  13
S. pectinata     2  4  0  2  1  1  1  11
Z. macrantha     3  2  0  2  1  1  0  9
E. tef           1  4  0  2  0  1  1  9
E. minor         1  4  0  2  0  1  1  9
N. reynaudiana   1  2  0  1  0  1  1  6
Phylogenomic Analysis
Maximum Parsimony (MP) results from all datasets
Dataset used   Total number of characters   Number of parsimony informative characters   Tree length   CI excluding uninformative characters   RI
[1]            104,248                      3143                                         11647         0.7463                                  0.7597
[2]            605                          212                                          674           0.7544                                  0.7971
[1-2]          104,853                      3355                                         12328         0.746                                   0.7611
[3]            62,486                       1437                                         5191          0.7205                                  0.7311
[4]            41,012                       1688                                         6356          0.7722                                  0.7852
Indels in CDS
Only 5.2% of indels occur in CDS. This supports the assumption that noncoding sequences are more likely to retain mutations since they do not directly affect gene function.
Indels in CDS can cause frameshift mutations, alter AA sequences, and introduce internal stop codons = deleterious
Purifying selection acts against deleterious mutations
CDS-specific inversions
Inversions found in CDS of matK, ndhF and ccsA
Changed physical properties of AA at these loci from the ancestral condition.
All are essential for cell metabolism
Infer that these mutations do not affect protein function
Reversion to ancestral condition has been observed
Dynamic process
Table 12-a
Inv1 matK
Taxa             position     nucleotide sequence  AA sequence  Δ AA properties
D. spicata       2342 - 2357  TTTCTTTTGAAAAAGAAG   KKQFLL       P,A
B. curtipendula  2295 - 2310  TTTCTTTTGAAAAAGAAG   KKQFLL       P,A
H. cenchroides   2330 - 2345  TTTCTTTTGAAAAAGAGG   KKQFLP       P,A
S. heterolepis   2314 - 2329  TTTCTTTTGAAAAAGAAG   KKQFLL       P,A
S. pectinata     2322 - 2337  TTTCTTTTTCAAAAGAAG   KKKLLL       (+), NP
Z. macrantha     2321 - 2336  TTTCTTTTGAAAAAGAAG   KKQFLL       P,A
E. tef           2310 - 2325  TTTCTTCTTCAAAAGAAG   KKKLLL       (+), NP
E. minor         2305 - 2320  TTTCTTCTTCAAAAGAAG   KKKLLL       (+), NP
N. reynaudiana   2284 - 2299  TTTCTTCTTCAAAAGAAG   KKKLLL       (+), NP
C. glauca        2329 - 2344  TTTCTTCTTCAAAAGAGG   KKKLLP       (+), NP
Predictive power?
Hypothetical sequence with potential to form loop structures